Alan Davis 59325bc38a Repeat Bump dependency.tika.version from 2.1.0 to 2.2.1 (#516)
* Repeat Bump dependency.tika.version from 2.1.0 to 2.2.1

Original PR https://github.com/Alfresco/alfresco-transform-core/pull/506 was merged to master where it failed. There had been no build of the PR before the merge, which is why this branch has been created.

* Use the non-deprecated TikaCoreProperties.SUBJECT with Tika 2.2.1.

The deprecated OfficeOpenXMLCore.SUBJECT value worked in 2.2.0 but not 2.2.1

* With the upgrade of Tika from 2.2.0 to 2.2.1, the deprecated OfficeOpenXMLCore.SUBJECT metadata value became null and the replacement TikaCoreProperties.SUBJECT became multi-valued in a few of our test cases. For backward compatibility with very old versions of Alfresco, we have historically added a number of extra values, including "subject" and "description", back into the raw metadata before mapping them onto Alfresco properties. These values existed in the original version of Tika used by Alfresco, so it is possible there are custom mappings out there that use them.

To complicate matters a little, our standard mappings for some types put the raw "subject" value into the cm:description property. What makes it interesting is that the extra "description" value is not used but has the value originally in our expected metadata extract data. That is why the quick_*_json files have been modified.
2022-01-13 17:25:56 +00:00

Alfresco Transform Core

Contains the common transformer (T-Engine) code, plus a few actual implementations.

Sub-projects

  • alfresco-transformer-base - library packaged as a jar file which contains code that is common to all the transformers; see the sub-project's README
  • alfresco-transform-<name> - multiple T-Engines, each of which builds both a Spring Boot fat jar and a Docker image

Documentation

In addition to the sub-project READMEs (such as the alfresco-transformer-base README above), some additional documentation can be found in:

Note: if you're interested in the Alfresco Transform Service (ATS) that is part of the enterprise Alfresco Content Services (ACS) please see:

Building and testing

The project can be built by running the Maven command:

mvn clean install -Plocal,docker-it-setup

The local Maven profile builds local Docker images for each T-Engine.

Artifacts

Maven

The artifacts can be obtained by:

  • downloading them from the Alfresco repository
  • adding them as a Maven dependency in your pom file:
<dependency>
    <groupId>org.alfresco</groupId>
    <artifactId>alfresco-transformer-base</artifactId>
    <version>version</version>
</dependency>

and the Alfresco Maven repository:

<repository>
  <id>alfresco-maven-repo</id>
  <url>https://artifacts.alfresco.com/nexus/content/groups/public</url>
</repository>
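
As a rough sketch of how the two fragments above fit together, assuming a standard Maven project layout: the repository entry goes inside `<repositories>` and the dependency inside `<dependencies>`. The surrounding `<project>` content is illustrative only, and the `version` placeholder is kept from the text above and must be replaced with a real release.

```xml
<project>
  <!-- ... your existing groupId, artifactId, version, etc. ... -->

  <repositories>
    <!-- Alfresco public repository, as listed above -->
    <repository>
      <id>alfresco-maven-repo</id>
      <url>https://artifacts.alfresco.com/nexus/content/groups/public</url>
    </repository>
  </repositories>

  <dependencies>
    <dependency>
      <groupId>org.alfresco</groupId>
      <artifactId>alfresco-transformer-base</artifactId>
      <!-- placeholder: replace with the release you need -->
      <version>version</version>
    </dependency>
  </dependencies>
</project>
```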

Docker

The core T-Engine images are available on Docker Hub.

Either as a single Core AIO (All-In-One) T-Engine:

Or as a set of individual T-Engines:

You can find examples of using Core AIO in the reference ACS Deployment for Docker Compose:

You can find examples of using the individual T-Engines in the reference ACS Deployment for Helm / Kubernetes:

Release Process

For a complete walk-through check out the build-and-release.MD under the docs folder.

Contributing guide

Please use this guide to make a contribution to the project.
