mirror of
https://github.com/Alfresco/alfresco-transform-core.git
synced 2025-05-12 17:04:48 +00:00
* REPO-4639: Split tika engine_config.json into separate transformers.
* WIP: REPO-4639 Content conversion failed using Tika
  The Tika T-Engine "transform" option does not exist when called via the Transform Service or Local transforms, which resulted in no transforms taking place. However, this value is not really needed, as the T-Engine should be able to read its own engine_config.xml to work out which sub transform to use. Transforms only worked via Legacy transforms, which used a T-Engine.
  This code is based on tried and tested ACS repository code. It has been further simplified.
  TODO:
  - replace the ConfigFileFinder class just added with something that uses Spring to read the JSON, i.e. simplify it.
  - replace the CombinedConfig class just added with something that does not need the InLineTransformer, i.e. simplify it.
  - create tests based on the repo tests.
  - remove the source and target mimetype checks in Tika, as a check against engine_config.xml is cleaner.
  - repeat the process for the Misc T-Engine, as it has similar code checking source and target mimetypes.
  - remove the transform option passed by the legacy transforms.
* Removed CombinedConfig and ConfigFileFinder classes.
* Extracted AbstractTransformRegistry so that it may be used in the ACS repository too.
  TODO: AbstractTransformRegistry and AbstractTransformRegistry need to be moved to the alfresco-transform-model package.
* Tidy up only
* REPO-4639: Add priority to duplicate transforms.
* REPO-4639: Refactor TikaTransformationIT to use the new Tika /transform specifications
  Changes AbstractTransformerControllerTest, as the engine_config is now loaded in TransformRegistryImpl instead of AbstractTransformerController.
* Rename to TransformServiceRegistry, so we don't have to change the repo code.
* Added the baseUrl parameter to the register method and fixed the missed rename in the last commit.
* Javadoc change only
* Moved common classes (with repo) AbstractTransformRegistry and TransformServiceRegistry to alfresco-transform-model
* Replace (simplify) all the isTransformable calls with a check against the JSON.
  - Tests now only pass targetEncoding to the 'string' transformer.
* Fix failing tests.
* Revert port change
* REPO-4639: Add priorities to misc engine_config
* REPO-4639: Add priorities to pdf-renderer and imagemagick engine_config
* Remove test that is @Ignored
* Pick up alfresco-transformer-model 1.0.2.7-REPO-4639-1
* REPO-4639: Add priorities to libreoffice engine_config
* REPO-4639: Add priorities to tika engine_config
* REPO-4639: Remove all priorities with value equal to 50 (default) from engine_config
* Switch over to using TransformServiceRegistry in org.alfresco.transform.client.registry. Reintroduce the noExtensionSourceFilenameTest having removed @Ignore.
* New whitesource issue on commons-compress 1.18. Upgrading to 1.19.
* Removed the text/javascript -> text/plain test, as this is not supported
* Modifications as a result of changes to method names in alfresco-transform-model
* Pick up alfresco-transform-model 1.0.2.7-ATS545-2
* Remove unused imports
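The change notes above mention replacing the isTransformable calls with a check against the engine config, and adding a priority to duplicate transforms. A minimal sketch of that idea follows. It is not the actual alfresco-transform-model code: the class PrioritySketch, its methods, and the transformer names are hypothetical, and the rule that the lowest priority value wins among duplicates is an assumption. Only the default value of 50 is taken from the notes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch only: these are NOT the alfresco-transform-model classes.
class PrioritySketch {
    // Default priority, as stated in the change notes.
    static final int DEFAULT_PRIORITY = 50;

    // One supported source -> target entry, as an engine config might declare it.
    static final class Supported {
        final String transformer;
        final String source;
        final String target;
        final int priority;

        Supported(String transformer, String source, String target, int priority) {
            this.transformer = transformer;
            this.source = source;
            this.target = target;
            this.priority = priority;
        }
    }

    private final List<Supported> entries = new ArrayList<>();

    // A null priority stands for "not present in the config", so the default applies.
    void register(String transformer, String source, String target, Integer priority) {
        int effective = (priority == null) ? DEFAULT_PRIORITY : priority;
        entries.add(new Supported(transformer, source, target, effective));
    }

    // Looks up the registered entries instead of asking each transformer.
    // Assumption: among duplicates, the lowest priority value is preferred.
    Optional<String> findTransformer(String source, String target) {
        return entries.stream()
                .filter(e -> e.source.equals(source) && e.target.equals(target))
                .min(Comparator.comparingInt((Supported e) -> e.priority))
                .map(e -> e.transformer);
    }

    public static void main(String[] args) {
        PrioritySketch registry = new PrioritySketch();
        registry.register("transformerA", "application/pdf", "text/plain", null);
        registry.register("transformerB", "application/pdf", "text/plain", 45);

        // Two transformers are registered for this pair; the one with the lower value is chosen.
        System.out.println(registry.findTransformer("application/pdf", "text/plain"));
        // Nothing is registered for this pair, so the result is empty.
        System.out.println(registry.findTransformer("text/javascript", "text/plain"));
    }
}
```

Leaving the priority out of an entry keeps the config short, which matches the note about removing all priorities equal to the default.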
Alfresco Transform Core
Contains the common transformer (T-Engine) code, plus a few actual implementations.
Sub-projects
- alfresco-transformer-base: library packaged as a jar file which contains code that is common to all the transformers; see the sub-project's README
- alfresco-docker-<name>: multiple T-Engines; each one of them builds both a SpringBoot fat jar and a Docker image
Building and testing
The project can be built by running the Maven command:
mvn clean install -Plocal,docker-it-setup
The local Maven profile builds local Docker images for each T-Engine.
Artifacts
Maven
The artifacts can be obtained by:
- downloading them from the Alfresco repository
- adding them as a Maven dependency in your pom file:
<dependency>
    <groupId>org.alfresco</groupId>
    <artifactId>alfresco-transformer-base</artifactId>
    <version>version</version>
</dependency>
and the Alfresco Maven repository:
<repository>
    <id>alfresco-maven-repo</id>
    <url>https://artifacts.alfresco.com/nexus/content/groups/public</url>
</repository>
Docker
The core T-Engine images are available on Docker Hub:
- alfresco/alfresco-imagemagick
- alfresco/alfresco-pdf-renderer
- alfresco/alfresco-libreoffice
- alfresco/alfresco-tika
Release Process
For a complete walk-through, check out build-and-release.MD under the docs folder.
Contributing guide
Please use this guide to make a contribution to the project.