Removed all the Extractors that now exist in the T-Engines:
JodConverterMetadataExtracter
TikaPoweredMetadataExtracter – the abstract base class used by other extractors
-- MailMetadataExtracter
-- PoiMetadataExtracter
-- TikaAutoMetadataExtracter
-- MP3MetadataExtracter
-- TikaSpringConfiguredMetadataExtracter - removed as it required Spring config and would run in process
-- PdfBoxMetadataExtracter
-- OpenDocumentMetadataExtracter
-- OfficeMetadataExtracter
-- DWGMetadataExtracter
HtmlMetadataExtracter
RFC822MetadataExtracter
XmlMetadataExtracter and XPathMetadataExtracter still exist but don't provide any extraction out of the box. The reason they still exist is to support custom transforms (in AMPs) to extract from XML. There are no XML extractors in the T-Engines at the moment, but that is where the custom transformer code really should be moved.
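As an illustration of the kind of XML extraction a custom extractor performs, here is a minimal sketch using only the JDK's `javax.xml.xpath` API. It is not the XPathMetadataExtracter implementation: the class name `XPathExtractSketch`, the method `extract` and the property names in the usage example are hypothetical.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical sketch: maps property names to XPath expressions and
// evaluates each one against an XML document held in a String.
final class XPathExtractSketch {

    static Map<String, String> extract(String xml, Map<String, String> xpathByProperty) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : xpathByProperty.entrySet()) {
            try {
                // A fresh InputSource per expression, since a Reader can only be consumed once.
                String value = xpath.evaluate(entry.getValue(),
                        new InputSource(new StringReader(xml)));
                // evaluate() returns an empty string when nothing matches; skip those.
                if (!value.isEmpty()) {
                    result.put(entry.getKey(), value);
                }
            } catch (XPathExpressionException e) {
                throw new IllegalArgumentException(
                        "Could not evaluate XPath for " + entry.getKey(), e);
            }
        }
        return result;
    }
}
```

Calling `extract` with `<doc><title>Hello</title></doc>` and a mapping from a property name to `/doc/title` returns a map containing that property with the value `Hello`; expressions that match nothing are left out of the result.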
There are new tests to ensure the async transforms take place as expected.
Additionally, many of the existing tests remain (those not related to a specific extractor). Some of these have been modified to reflect that the extract is now async, and they no longer check that the modified value is unchanged (it is now expected to change).
There are also a number of new metadata extract smoke tests that ensure that a selected subset of extracts are supported by the OOTB T-Engines.
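Because the extract is now async, a test cannot assert on extracted properties immediately after triggering it. The sketch below shows one common way to wait for an async result, polling with a timeout. It is an assumption about the general approach, not the code used in these tests: `AsyncAssertSketch` and `waitFor` are hypothetical names.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper: polls a condition until it holds or a timeout expires.
final class AsyncAssertSketch {

    // Returns true if the condition became true within timeoutMs, false otherwise.
    static boolean waitFor(BooleanSupplier condition, long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

A test would trigger the extract and then call something like `waitFor(() -> propertyIsSet(node), 5000, 100)` before asserting on the value, where `propertyIsSet` stands in for whatever check the test needs.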
* REPO-5208 Addition of extra async metadata extract tests for overriding policy, tag extraction and carryAspectProperties
Main author: Adina Ababei <adina.ababei@ness.com>
Testing of tagging modified by Alan Davis to pass when Solr is not running.
Addition of support for async metadata extraction via T-Engines.
Still needs support for RM to control what is extracted in emails.
Still includes OOTB metadata extractors. To be removed.
Still needs removal of legacy transformers and 3rd party libraries they use.
* MNT-22009 - When setting permissions async, nodes with the aspect applied cannot be deleted
* Added unit test that deletes a node with the sys:pendingFixAcl aspect
before the job runs
* Added a check to the job: if the node is in the archive store,
the job does not process it and removes the sys:pendingFixAcl aspect and properties
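The archive store check can be sketched as follows. This is a simplified model, not the actual job code: `PendingNode` and `FixAclJobSketch` are hypothetical stand-ins for the repository's node and job classes, and it assumes that for archived nodes the aspect is removed without any ACL processing.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a repository node; not the Alfresco NodeRef API.
final class PendingNode {
    final String id;
    final String store; // e.g. "workspace" or "archive"
    final Set<String> aspects = new HashSet<>();

    PendingNode(String id, String store) {
        this.id = id;
        this.store = store;
    }
}

// Hypothetical model of the job that fixes ACLs on nodes marked as pending.
final class FixAclJobSketch {
    static final String PENDING_ASPECT = "sys:pendingFixAcl";
    static final String ARCHIVE_STORE = "archive";

    // Ids of nodes whose ACLs were actually processed.
    final List<String> processed = new ArrayList<>();

    void run(List<PendingNode> nodes) {
        for (PendingNode node : nodes) {
            if (!node.aspects.contains(PENDING_ASPECT)) {
                continue; // nothing pending on this node
            }
            if (ARCHIVE_STORE.equals(node.store)) {
                // Assumption: archived nodes are not processed; only the marker is cleared.
                node.aspects.remove(PENDING_ASPECT);
                continue;
            }
            processed.add(node.id); // stands in for the real ACL fix
            node.aspects.remove(PENDING_ASPECT);
        }
    }
}
```

In this model, a node deleted before the job runs ends up in the archive store still carrying the aspect; the job clears the aspect and never adds the node to `processed`.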