A new constructor has been added to the TikaController to provide
the new Spring config.
Creation of the TikaExecutor has been moved to a lazily initialised
singleton, because the injection of the @Value happens after the
TikaJavaExecutor is instantiated, so the value was not passed
correctly. The instantiation is now done once, on the first
transform request.
The param has also been added to the AIO beans.
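
The ordering problem described above can be sketched without Spring. This is only an illustration: `SketchExecutor`, `SketchController`, `injectConfigValue` and the value "tika-config.xml" are hypothetical stand-ins, and the setter simulates what @Value injection does after the object has been constructed.

```java
// Hypothetical stand-in for TikaJavaExecutor; the real constructor is not shown in this log.
class SketchExecutor {
    final String configValue;

    SketchExecutor(String configValue) {
        this.configValue = configValue;
    }
}

// Hypothetical stand-in for the controller that owns the executor.
class SketchController {
    // In Spring this would be an @Value field; it is still null while field initialisers run.
    private String configValue;

    // Eager creation: runs during construction, before any injection, so it captures null.
    private final SketchExecutor eagerExecutor = new SketchExecutor(configValue);

    // Lazy creation: built on first use, after injection has completed.
    private SketchExecutor lazyExecutor;

    // Simulates the container injecting the value after construction.
    void injectConfigValue(String value) {
        this.configValue = value;
    }

    String eagerValue() {
        return eagerExecutor.configValue;
    }

    // Creates the executor once, on the first call, and reuses it afterwards.
    synchronized SketchExecutor getExecutor() {
        if (lazyExecutor == null) {
            lazyExecutor = new SketchExecutor(configValue);
        }
        return lazyExecutor;
    }
}
```

After `injectConfigValue("tika-config.xml")`, `eagerValue()` is still null, whereas `getExecutor()` returns a single shared instance that holds the injected value.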
Bug found while reviewing documentation on how to create a custom metadata extractor. The original refactor had left the repo doing the mapping. It should instead have been passing the fully qualified repo properties to the T-Engine, so that the T-Engine does the mapping.
Linked to:
Alfresco/alfresco-community-repo#227
Alfresco/acs-packaging#1826
ATS-829: Release T-Core (T-Engines) 2.3.6 [trigger release]
Linked to REPO-5219 Allow AGS AMP to specify metadata extract mapping
Added an extractMapping transform option to all metadata extractors to override the default one.
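
A rough sketch of the override behaviour, under stated assumptions: the mapping is modelled here as a map from an extracted key to a set of fully qualified property names, and `ExtractMappingSketch` and `mapProperties` are hypothetical names, not the T-Engine API. The real wire format of the extractMapping option is not shown in this log.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; illustrates "override replaces default", not the actual T-Engine code.
class ExtractMappingSketch {

    // Maps extracted values to fully qualified property names.
    // If overrideMapping is non-null it is used instead of defaultMapping.
    static Map<String, String> mapProperties(Map<String, String> extracted,
                                             Map<String, Set<String>> defaultMapping,
                                             Map<String, Set<String>> overrideMapping) {
        Map<String, Set<String>> mapping =
                overrideMapping != null ? overrideMapping : defaultMapping;

        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : extracted.entrySet()) {
            Set<String> targets = mapping.get(entry.getKey());
            if (targets == null) {
                continue; // extracted key has no mapping, so it is dropped
            }
            for (String qualifiedName : targets) {
                result.put(qualifiedName, entry.getValue());
            }
        }
        return result;
    }
}
```

In this sketch the override replaces the default mapping as a whole rather than being merged with it; whether the real option merges or replaces is an assumption here.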
Upgrade 3rd party libraries to get a green build.
* Upgrade cxf-rt-transports-http and woodstox-core to avoid issues
* Upgrade to org.springframework.boot:spring-boot-starter-parent:2.3.5.RELEASE to avoid problem in org.springframework:spring-web
* Upgrade to activemq 5.15.13 to avoid problem in activemq-broker 5.15.12
* MNT-21869 libreoffice timeout set too high
Reduce the default libreoffice timeout from 2000min to 20min.
Add an option to configure the libreoffice timeout externally.
Allow the port on which the app starts to be configured externally.
Add external-engine-configuration.md
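
A minimal sketch of an externally configurable timeout with the new 20min default. The key `LIBREOFFICE_TIMEOUT_MS` and the class `TimeoutConfigSketch` are hypothetical; the actual property names are the ones documented in external-engine-configuration.md.

```java
import java.time.Duration;
import java.util.Map;

// Hypothetical helper; the real engine reads its settings through Spring configuration.
class TimeoutConfigSketch {
    // New default from this change: 20 minutes.
    static final Duration DEFAULT_TIMEOUT = Duration.ofMinutes(20);

    // Hypothetical key name, used only for illustration.
    static final String TIMEOUT_KEY = "LIBREOFFICE_TIMEOUT_MS";

    // Returns the externally supplied timeout, or the default if the value
    // is missing, empty, non-numeric or not positive.
    static Duration resolveTimeout(Map<String, String> env) {
        String raw = env.get(TIMEOUT_KEY);
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT_TIMEOUT;
        }
        try {
            long millis = Long.parseLong(raw.trim());
            return millis > 0 ? Duration.ofMillis(millis) : DEFAULT_TIMEOUT;
        } catch (NumberFormatException e) {
            return DEFAULT_TIMEOUT;
        }
    }
}
```

Passing `System.getenv()` as the map would read the value from the process environment; falling back to the default on bad input is a design choice of this sketch, not documented behaviour.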
* ATS-762: Add Tika unit test for pdf to csv
* ATS-762: Fix indentation
* ATS-762: Added 3 tests for simple pipeline: msg > txt, txt > doc, txt > odt, txt > rtf
* ATS-762: Added tests for libreofficeToPdf pipeline
* ATS-762: Addressed Jan's comment about not using asterisk when importing modules
* ATS-762: Changed comment to pdf-->csv to address Jan's comment on the PR
* task/ATS-762_T: noticed the txt mime type was wrong so fixed it
Co-authored-by: kristian <kristian.dimitrov@alfresco.com>
* Metadata extract code added to T-Engines
* Required a refactor of duplicate code to avoid 3x more duplication:
- try/catch blocks used to return exit codes
- calls to java libraries or commands to external processes
- building of transform options in controllers, adaptors
* integration tests based on current extracts performed in the repo
* included extract code for libreoffice, and embed code, even though not used out of the box any more, as there may well be custom extracts using them that move to T-Engines
* removal of unused imports
* minor autoOrient / allowEnlargement bug fixes that were not included in Paddington on the T-Engine side.