One of my colleagues, Yash Rode, developed a very nice improvement for Media Migration about a month and a half ago: he added a YouTube field → media migration to the module. After I committed it, a question immediately arose in my mind: what will happen during incremental migrations? I definitely cannot migrate file IDs to media entity IDs anymore!

So I started getting rid of the mid destination properties in the media entity migrations d7_file_entity and d7_file_plain, and also updating all the migrate process plugins which assumed that the source file ID will be the ID of the migrated media entity, like the plugins which process text fields [1].

The task wasn’t that difficult, not counting Migrate Upgrade… Well, that was what I thought before I ran into a core bug (I think it is a bug).

The situation

The first thing I did was to remove the mid: fid process pipelines from d7_file_entity and d7_file_plain. Then I took a look at the process plugins which parse text fields [2] and updated them accordingly. The latter was required because I could no longer assume that the destination media entity’s ID is the same as the source file ID. By the way, I already knew that this was a wrong assumption: see issues #3214791 and #3267064.

Slowly, I refactored all the components I knew of to handle the new situation properly, and then I ran into a wholly unexpected EntityStorageException while running the refresh test:

SQLSTATE[23502]: Not null violation: 7 ERROR:  null value in column "uuid" violates not-null constraint
DETAIL:  Failing row contains (4, 4, image, null, en).: UPDATE "test68160068media" SET "vid"=:db_update_placeholder_0, "bundle"=:db_update_placeholder_1, "uuid"=:db_update_placeholder_2, "langcode"=:db_update_placeholder_3
WHERE "mid" = :db_condition_placeholder_0; Array
(
  "db_update_placeholder_0" => 4,
  "db_update_placeholder_1" => "image",
  "db_update_placeholder_2" => NULL,
  "db_update_placeholder_3" => "en"
)
(/Users/zoli/projects/media_migration/zdev/public_html/core/lib/Drupal/Core/Entity/Sql/SqlContentEntityStorage.php:811)

Debugging

Obviously, I started debugging by setting a breakpoint in SqlContentEntityStorage at line 811 and then checking the backtrace. There was no surprise: the uuid property of the updated entity in EntityContentBase::save() [3] really was empty!

I checked the next trace (EntityContentBase::import()), and found a very weird thing in the $row variable’s empty destination properties:

Why is uuid listed twice? I flagged it as an empty destination in the MediaMigrateUuid process plugin if there is no prophecy for the given file ID (managed files are the data sources of the migrated media entities). So I set up another breakpoint there, and then followed the processing of the actual migration row.

Change between 9.2.x and 9.3.x

It turned out that uuid is marked as empty a second time in MigrateExecutable, more precisely at the end of its processRow method:

// Ensure all values, including nulls, are migrated.
if ($plugins) {
  if (isset($value)) {
    $row->setDestinationProperty($destination, $value);
  }
  else {
    $row->setEmptyDestinationProperty($destination);
  }
}

I was absolutely sure that this was not how destination property values were handled until recently, so I started checking the contents of the MigrateExecutable class, switching between core development branches.

It turned out I remembered well! There was a bigger change between 9.2.x and 9.3.x. A commit (with a not very expressive commit message) introduced the MigrateExecutable::processRow method itself and, with it, the current way of handling empty destination properties.
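To illustrate how the duplicate entry can arise, here is a minimal sketch in plain PHP. SketchRow is a hypothetical stand-in for the migration row, not the core class, and it assumes that flagging a property as empty simply appends its name to a list without deduplication, which would match the duplicated uuid I saw in the debugger.

```php
<?php

/**
 * Hypothetical, simplified stand-in for a migration row.
 *
 * Assumption: flagging appends to a list and does not deduplicate.
 */
final class SketchRow {

  /** @var string[] */
  private array $emptyDestinationProperties = [];

  public function setEmptyDestinationProperty(string $property): void {
    $this->emptyDestinationProperties[] = $property;
  }

  /** @return string[] */
  public function getEmptyDestinationProperties(): array {
    return $this->emptyDestinationProperties;
  }

}

$row = new SketchRow();

// First flag: the process plugin found no UUID for the source file ID.
$row->setEmptyDestinationProperty('uuid');

// Second flag: the executable sees that the pipeline produced no value.
$value = NULL;
if (!isset($value)) {
  $row->setEmptyDestinationProperty('uuid');
}

print_r($row->getEmptyDestinationProperties());
```

Under that assumption, the list ends up containing uuid twice: once from the plugin and once from the executable.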

I went back to my MediaMigrateUuid process plugin, and made the empty property flagging conditional: I do it only if the actual core version is lower than 9.3.0:

// No UUID was found – let's set the destination property to empty before
// throwing a skip process exception (this is only required for 9.2.x and
// below).
if (version_compare(\Drupal::VERSION, '9.3.0', 'lt')) {
  $row->setEmptyDestinationProperty($destination_property);
}

I hoped that this would fix the issue, but unfortunately it wasn’t enough: my change tracking test was still failing with the same error message that I saw at first.

Still wrong! But why?…

I needed to know what my fully processed migration row looks like, and what entity the migration destination plugin attempts to save. In situations like this I usually set a breakpoint in MigrateExecutable, at the line where the destination plugin’s import method is invoked.

I tracked down what happened: before the import, my row contained the calculated destination properties and values I expected. But in EntityContentBase::import, the entity instance returned by the ::getEntity() method (inherited from the parent Entity class) contained weird things: at the entity’s values key, I could see the current uuid, but at the fields key (this is where the processed destination property values are pushed), the uuid property was present, and it was an empty array!

Let’s see what happens in Entity::getEntity()! This method basically starts with a condition: if old destination IDs are available, then it tries to load the preexisting entity, then invokes ::updateEntity(); and at the end of the EntityContentBase::updateEntity() method we can see why the value of the uuid property will be NULL:

protected function updateEntity(EntityInterface $entity, Row $row) {
  $empty_destinations = $row->getEmptyDestinationProperties();
  
  (...)
  
  foreach ($empty_destinations as $field_name) {
    $entity->$field_name = NULL;
  }

  $this->setRollbackAction($row->getIdMap(), $rollback_action);

  // We might have a different (translated) entity, so return it.
  return $entity;
}

I’ll be honest: I have no idea why this is needed. I would definitely unset the destination property values of the current $row object instead of setting the fields of the entity to NULL. NULL is a value; it does not mean emptiness by default. Whether a field is “empty” is determined by the field’s FieldType class (see ComplexDataInterface::isEmpty()) or item list class (see ListInterface::isEmpty()).

Maybe I totally misunderstood what empty destination properties are used for…
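The effect of that loop can be reproduced with a minimal sketch in plain PHP. The $existing array is a hypothetical stand-in for the loaded media entity, and apply_empty_destinations() is an illustrative helper that mirrors only the foreach quoted above; neither is part of Drupal core, and the UUID value is made up.

```php
<?php

/**
 * Hypothetical helper mirroring the loop at the end of updateEntity():
 * every property flagged as empty is overwritten with NULL.
 */
function apply_empty_destinations(array $entity, array $empty_destinations): array {
  foreach ($empty_destinations as $field_name) {
    $entity[$field_name] = NULL;
  }
  return $entity;
}

// Stand-in for a media entity that was saved by a previous migration run.
$existing = [
  'mid' => 4,
  'bundle' => 'image',
  'uuid' => '11111111-2222-3333-4444-555555555555',
  'langcode' => 'en',
];

// On a refresh run no new UUID was computed, so 'uuid' is flagged as empty.
$updated = apply_empty_destinations($existing, ['uuid']);

// The key still exists, but its value is NULL, so the stored UUID is lost.
var_dump(array_key_exists('uuid', $updated));
var_dump($updated['uuid']);
```

An UPDATE built from $updated would try to write NULL into a column that does not allow it, which is the kind of failure the exception above reports.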

Workaround

I worked around this with a complex process pipeline. Luckily, I had already written MigMagGetEntityProperty to work around menu link migration weirdnesses of Drupal core, and this process plugin was a big help this time too.

So the actual solution:

  • track_changes_uuid: I do a lookup for the ID of an already migrated entity, and if it exists, then I get the entity’s UUID. This destination property only has value if the same entity has been migrated previously.
  • oracle_uuid: This is the process pipeline that computed the uuid property previously, by only taking the UUID prophecy table into account – but it was renamed to oracle_uuid.
  • uuid: Here is the magic! One of my favorite process plugins: null_coalesce! If track_changes_uuid is not null, then the computed uuid will be the value of track_changes_uuid. If it is NULL, but oracle_uuid has a non-null value, then that will be set.

    If both are empty, then this destination property will be set to NULL (and will be marked as being empty like before), but this happens only if we migrate the entity for the first time. And fortunately, the database exception is only triggered during entity updates.

Here is the code:

id: d7_file_entity
# Label, source config, deriver class configuration...
process:
  track_changes_uuid:
    -
      plugin: migration_lookup
      source: fid
      migration: d7_file_entity
      no_stub: true
    -
      plugin: skip_on_empty
      method: process
    -
      plugin: migmag_get_entity_property
      entity_type_id: 'media'
      property: 'uuid'
  oracle_uuid:
    plugin: media_migrate_uuid
    source: fid
  uuid:
    plugin: 'null_coalesce'
    source:
      - '@track_changes_uuid'
      - '@oracle_uuid'
# The rest of the migration YAML remains unchanged.
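
The fallback order that the uuid pipeline relies on can be sketched in plain PHP. first_non_null() is a hypothetical helper written for this illustration; it approximates what I expect the pipeline to do and is not the null_coalesce plugin's actual code. The UUID strings are placeholders.

```php
<?php

/**
 * Returns the first value that is not NULL, or NULL if there is none.
 */
function first_non_null(array $values) {
  foreach ($values as $value) {
    if ($value !== NULL) {
      return $value;
    }
  }
  return NULL;
}

// Refresh run: the entity was migrated before, so its current UUID wins.
$refresh = first_non_null(['uuid-of-existing-media', 'uuid-from-prophecy']);

// First run with a prophecy: no existing entity, the predicted UUID is used.
$first_run = first_non_null([NULL, 'uuid-from-prophecy']);

// First run without a prophecy: nothing to use, the property stays NULL.
$no_source = first_non_null([NULL, NULL]);

var_dump($refresh, $first_run, $no_source);
```

The order of the sources matters: putting the track_changes_uuid value first is what keeps an already migrated entity's UUID stable on refresh runs.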

Footnotes:

  1. These plugins parse text fields, transform the old media embed codes to their Drupal 9+ equivalents, and also replace inline <img> tags with embed code.

  2. These are the process plugins which change Drupal 7 Media embed JSONs to Drupal 9 embed tags, transform image and other file links to linkit tags, and convert directly used images (<img src="...">) to Drupal 9 equivalent embed HTML tags.

  3. EntityContentBase is the destination plugin class of Drupal content entities. It is derived by MigrateEntity, which decides what plugin class should be used per entity type.