Issue
- We use the Data Migration tool to migrate our document library from FileSystemStore to S3 (although the issue might be relevant for other migrations too)
- While it runs, it fails with NoSuchFileVersionException (see attached file for full error stack trace)
Environment
- 7.4
Resolution
- The migration process probably fails while it is processing the Adaptive Media entries, most likely because it encounters an entry in the amimageentry table whose fileversionid does not exist in the dlfileversion table.
A query like the following shows whether such orphan records exist and how many there are:
select count(*) from amimageentry where fileversionid not in (select fileversionid from dlfileversion);
Or, to see them in detail:
select * from amimageentry where fileversionid not in (select fileversionid from dlfileversion);
You might want to check what these return in your pre-migration database.
If the results confirm that this is what is happening, see the background information below.
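To get a feel for what the two queries above return, here is a minimal sketch that runs the same NOT IN check against an in-memory SQLite database. The two tables are simplified stand-ins with only the columns the check needs, and the sample IDs and the find_orphans helper are hypothetical; run the real queries against your own pre-migration database.

```python
import sqlite3


def find_orphans(conn):
    """Return amimageentry rows whose fileversionid is missing from dlfileversion."""
    return conn.execute(
        "SELECT amimageentryid, fileversionid FROM amimageentry "
        "WHERE fileversionid NOT IN (SELECT fileversionid FROM dlfileversion) "
        "ORDER BY amimageentryid"
    ).fetchall()


# Simplified stand-in schema and made-up sample data (assumption, not the real Liferay schema).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dlfileversion (fileversionid INTEGER PRIMARY KEY);
    CREATE TABLE amimageentry (
        amimageentryid INTEGER PRIMARY KEY,
        fileversionid INTEGER
    );
    INSERT INTO dlfileversion VALUES (100), (101);
    INSERT INTO amimageentry VALUES (1, 100), (2, 101), (3, 999), (4, 998);
    """
)

orphans = find_orphans(conn)
print("orphan count:", len(orphans))
for entry_id, file_version_id in orphans:
    print("amimageentryid", entry_id, "points to missing fileversionid", file_version_id)
```

In this sample, entries 3 and 4 are the orphans because file versions 999 and 998 do not exist in dlfileversion.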
- These orphan records (and folders) were probably created by a bug in earlier versions of the product: https://issues.liferay.com/browse/LPS-114817.
In summary, before the fix was introduced, when an end user deleted an older version of a document, the Adaptive Media previews and thumbnails for that document were not deleted, neither from the database (amimageentry table) nor from the file system. The orphaned entries were therefore left behind.
- You can find attached a Groovy script that was made to clean this up. Please read it through carefully before using it, because it contains important explanations and instructions. I will also try to summarize them here:
- by default the script runs in so-called safe mode (see the _safeMode = true variable towards the end of the script). In this mode it only prints the orphaned database entries, so you can cross-check them against the corresponding folders in the file system. Once everything is prepared and thought through, set the _safeMode variable to false and run the script again. Only then does it actually delete the orphan entries from the database.
- the script can only remove the orphaned database entries from the amimageentry table, not the corresponding files and folders in the file system. The beginning of the script explains how to delete those manually. However, deleting the folders manually is not necessarily needed: once the orphan entries are gone from the database (after running the cleanup script), the data migration process will not encounter them, so it will probably not touch the corresponding orphan files in the file system either and will not transfer them to your S3 bucket. In practice, the data migration process does the file system part of the cleanup for you by ignoring those orphaned files and not transferring them to your new "file system" in the S3 bucket. I therefore believe it is enough to run the script (which deletes the orphan database entries) and then run the data migration again.
- the script was written for 7.2, but it should work for 7.4 as well.
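The safe-mode idea can be illustrated with a short sketch. This is not the attached Groovy script and it does not use any Liferay API. It only shows the report-first, delete-later pattern on simplified stand-in tables in an in-memory SQLite database. The clean_orphans function, its safe_mode parameter, and the sample data are hypothetical names chosen for the illustration.

```python
import sqlite3

# Same orphan condition as the SQL queries in this article.
ORPHAN_FILTER = "fileversionid NOT IN (SELECT fileversionid FROM dlfileversion)"


def clean_orphans(conn, safe_mode=True):
    """Print orphan entries; delete them only when safe_mode is False.

    Returns the number of rows deleted (always 0 in safe mode).
    """
    orphans = conn.execute(
        "SELECT amimageentryid, fileversionid FROM amimageentry "
        f"WHERE {ORPHAN_FILTER} ORDER BY amimageentryid"
    ).fetchall()
    for entry_id, file_version_id in orphans:
        print(f"orphan amimageentry {entry_id}: missing fileversionid {file_version_id}")
    if safe_mode:
        return 0  # report only, leave the table untouched
    cursor = conn.execute(f"DELETE FROM amimageentry WHERE {ORPHAN_FILTER}")
    conn.commit()
    return cursor.rowcount


# Simplified stand-in schema and made-up sample data (assumption, not the real Liferay schema).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dlfileversion (fileversionid INTEGER PRIMARY KEY);
    CREATE TABLE amimageentry (
        amimageentryid INTEGER PRIMARY KEY,
        fileversionid INTEGER
    );
    INSERT INTO dlfileversion VALUES (100), (101);
    INSERT INTO amimageentry VALUES (1, 100), (2, 101), (3, 999);
    """
)

count_sql = "SELECT COUNT(*) FROM amimageentry"

deleted_in_safe_mode = clean_orphans(conn)  # first pass: report only
rows_after_safe_run = conn.execute(count_sql).fetchone()[0]

deleted_for_real = clean_orphans(conn, safe_mode=False)  # second pass: delete
rows_after_delete = conn.execute(count_sql).fetchone()[0]
```

Defaulting to the non-destructive mode means an accidental run only produces a report, which mirrors why the attached script ships with _safeMode = true.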