Bug found while reviewing documents on how to create a custom metadata extractor. The original refactor had left the repo doing the mapping. It should have been passing the fully qualified repo properties to the T-Engine to do the mapping.
Linked to:
Alfresco/alfresco-community-repo#227
Alfresco/acs-packaging#1826
ATS-829: Release T-Core (T-Engines) 2.3.6 [trigger release]
Linked to REPO-5219 Allow AGS AMP to specify metadata extract mapping
Added an extractMapping transform option to all metadata extractors to override the default one.
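
To illustrate the idea of an overridable mapping, here is a minimal sketch in Java. It assumes the mapping is a lookup from an extracted key to one or more fully qualified repo property names, and that a supplied extractMapping replaces the default rather than merging with it. The class name, method name, map shape and sample keys are all hypothetical and are not taken from the T-Engine code.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only: ExtractMappingSketch and mapProperties are
// hypothetical names, not the T-Engine API.
class ExtractMappingSketch {
    // Assumed default mapping: extracted key -> fully qualified repo property names.
    static final Map<String, List<String>> DEFAULT_MAPPING = Map.of(
        "author", List.of("{http://www.alfresco.org/model/content/1.0}author"),
        "title", List.of("{http://www.alfresco.org/model/content/1.0}title"));

    // Maps extracted values to fully qualified property names.
    // A non-null extractMapping replaces the default mapping entirely (assumption).
    static Map<String, String> mapProperties(Map<String, String> extracted,
                                             Map<String, List<String>> extractMapping) {
        Map<String, List<String>> mapping =
            extractMapping != null ? extractMapping : DEFAULT_MAPPING;
        Map<String, String> result = new TreeMap<>();
        for (Map.Entry<String, String> entry : extracted.entrySet()) {
            // Keys with no mapping entry are dropped.
            for (String qualifiedName : mapping.getOrDefault(entry.getKey(), List.of())) {
                result.put(qualifiedName, entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> extracted = Map.of("author", "Jane", "title", "Report");
        // Default mapping.
        System.out.println(mapProperties(extracted, null));
        // Override: map only the author, to a description property.
        Map<String, List<String>> override = Map.of(
            "author", List.of("{http://www.alfresco.org/model/content/1.0}description"));
        System.out.println(mapProperties(extracted, override));
    }
}
```

In this sketch the caller sends fully qualified names and the extractor side performs the lookup, which matches the intent described above of having the T-Engine rather than the repo do the mapping.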
Upgraded 3rd party libraries to get a green build.
* Upgrade cxf-rt-transports-http and woodstox-core to avoid issues
* Upgrade to org.springframework.boot:spring-boot-starter-parent:2.3.5.RELEASE to avoid problem in org.springframework:spring-web
* Upgrade to activemq 5.15.13 to avoid problem in activemq-broker 5.15.12
* ATS-816: Fix tika apple keynote
The application/vnd.apple.keynote -> text/plain transformation has been found to fail after switching the version of tika in ATS-801
The previous version of tika would use the org.apache.tika.parser.pkg.PackageParser but the new version uses an empty parser, producing an empty target file.
* Re-enable test for application/vnd.apple.keynote to text
* ATS-801: Tika Update - part 1 (T-Core) sanity check
- initially switch to 1.21 (to see if any unit/quick tests fail in T-Core)
* ATS-801: Tika Update - part 1 (T-Core) sanity check
- 1st attempt to bump to Tika 1.24.1 / Poi 4.1.2
* ATS-762: Add Tika unit test for pdf to csv
* ATS-762: Fix indentation
* ATS-762: Added 3 tests for simple pipeline. msg > txt, txt > doc, txt > odt, txt > rtf
* ATS-762: Added tests for libreofficeToPdf pipeline
* ATS-762: Addressed Jan's comment about not using asterisk when importing modules
* ATS-762: Changed comment to pdf-->csv to address Jan's comment on the PR
* task/ATS-762_T: noticed the txt mime type was wrong so fixed it
Co-authored-by: kristian <kristian.dimitrov@alfresco.com>
* Metadata extract code added to T-Engines
* Required a refactor of duplicate code to avoid 3x more duplication:
- try catches used to return exit codes
- calls to java libraries or commands to external processes
- building of transform options in controllers, adaptors
* integration tests based on current extracts performed in the repo
* included extract code for libreoffice, and embed code even though not used out of the box any more. There may well be custom extracts using them that move to T-Engines
* removal of unused imports
* minor autoOrient / allowEnlargement bug fixes that were not included in Paddington on the T-Engine side.
* ATS-763: Added missing tests in Tika
* ATS-763: Added the missing transform tests for Libre Office and replaced quick files in Tika
* ATS-763: Replaced newly added quick.xml and quick.msg with preexisting files.
* ATS-763: Added targets to tests in Libre Office - see Jan's comment in PR
* ATS-763: Added test files to Image Magick, and uncommented the PSD source file
* ATS-763: put back a comment in Image Magick how it was before my previous commit
* ATS-763: Resolved Jan's comment about separating out mimetypes into their correct section such as SPREADSHEET or PRESENTATION
* ATS-763: Fixed failing test (ppsm and ppsx)
* ATS-763: Removed unnecessary source files in Image Magick
* ATS-763: Fix failing LibreOffice unit tests
* ATS-763: Fix indentation in LibreOfficeTransformationIT
* ATS-763: fixed failing image magick tests and removed failing transform from config
* ATS-763: Added missing priority for pages -> txt
Co-authored-by: kristian <kristian.dimitrov@alfresco.com>