* Repeat Bump dependency.tika.version from 2.1.0 to 2.2.1 Original PR https://github.com/Alfresco/alfresco-transform-core/pull/506 was merged to master where it failed. There had been no build of the PR before the merge, which is why this branch has been created. * Use non deprecated TikaCoreProperties.SUBJECT with tika 2.2.1. The deprecated OfficeOpenXMLCore.SUBJECT value worked in 2.2.0 but not 2.2.1 * With the upgrade of Tika from 2.2.0 to 2.2.1, the deprecated OfficeOpenXMLCore.SUBJECT metadata value became being null and the replacement TikaCoreProperties.SUBJECT became a multi value in a few of our test cases. For backward compatibility with very old versions of Alfresco, we have historically been added a number of extra values including "subject" and "description" back into the raw metadata, before mapping them onto Alfresco properties. These values existed in the original version of Tika used by Alfresco, so it is possible there are custom mappings out there that using them. To complicate matters a little, out standard mappings for some types put the raw "subject" value into cm:description property. What makes it interesting is that the extra "description" value is not used but has the value originally in our expected metadata extarct data. That is why the quick_*_json files have been modified.
Common code for Transform Engines
This project contains code that is common between all the ACS T-Engine transformers that run as Spring Boot process (optionally within their own Docker containers). It performs common actions such as logging, throttling requests and handling the streaming of content to and from the container.
For more details on build a custom T-Engine, please refer to the current docs in ACS Packaging, including:
Overview
A transformer project is expected to provide the following files:
src/main/resources/templates/transformForm.html
src/main/java/org/alfresco/transformer/<TransformerName>Controller.java
src/main/java/org/alfresco/transformer/Application.java
- transformForm.html - A simple test page using thymeleaf that gathers request parameters so they may be used to test the transformer.
<html xmlns:th="http://www.thymeleaf.org">
<body>
<div>
<h2>Test Transformation</h2>
<form method="POST" enctype="multipart/form-data" action="/transform">
<table>
<tr><td><div style="text-align:right">file *</div></td><td><input type="file" name="file" /></td></tr>
<tr><td><div style="text-align:right">file *</div></td><td><input type="file" name="file" /></td></tr>
<tr><td><div style="text-align:right">sourceExtension *</div></td><td><input type="text" name="sourceExtension" value="" /></td></tr>
<tr><td><div style="text-align:right">targetExtension *</div></td><td><input type="text" name="targetExtension" value="" /></td></tr>
<tr><td><div style="text-align:right">sourceMimetype *</div></td><td><input type="text" name="sourceMimetype" value="" /></td></tr>
<tr><td><div style="text-align:right">targetMimetype *</div></td><td><input type="text" name="targetMimetype" value="" /></td></tr>
<tr><td><div style="text-align:right">abc:width</div></td><td><input type="text" name="width" value="" /></td></tr>
<tr><td><div style="text-align:right">abc:height</div></td><td><input type="text" name="height" value="" /></td></tr>
<tr><td><div style="text-align:right">timeout</div></td><td><input type="text" name="timeout" value="" /></td></tr>
<tr><td></td><td><input type="submit" value="Transform" /></td></tr>
</table>
</form>
</div>
<div>
<a href="/log">Log entries</a>
</div>
</body>
</html>
- TransformerNameController.java - A Spring Boot Controller that extends AbstractTransformerController to handel requests. It implements a few methods including transformImpl which is intended to perform the actual transform. Generally the transform is done in a sub class of JavaExecutor, when a Java library is being used or AbstractCommandExecutor, when an external process is used. Both are sub interfaces of Transformer.
...
@Controller
public class TransformerNameController extends AbstractTransformerController
{
private static final Logger logger = LoggerFactory.getLogger(TransformerNameController.class);
TransformerNameExecutor executor;
@PostConstruct
private void init()
{
executor = new TransformerNameExecutor();
}
@Override
public String getTransformerName()
{
return "Transformer Name";
}
@Override
public String version()
{
return commandExecutor.version();
}
@Override
public ProbeTestTransform getProbeTestTransform()
{
// See the Javadoc on this method and Probes.md for the choice of these values.
return new ProbeTestTransform(this, "quick.pdf", "quick.png",
7455, 1024, 150, 10240, 60 * 20 + 1, 60 * 15 - 15)
{
@Override
protected void executeTransformCommand(File sourceFile, File targetFile)
{
transformImpl(null, null, null, Collections.emptyMap(), sourceFile, targetFile);
}
};
}
@Override
public void transformImpl(String transformName, String sourceMimetype, String targetMimetype,
Map<String, String> transformOptions, File sourceFile, File targetFile)
{
executor.transform(sourceMimetype, targetMimetype, transformOptions, sourceFile, targetFile);
}
}
- TransformerNameExecuter.java - JavaExecutor and CommandExecutor sub classes need to extract values from transformOptions and use them in a call to an external process or as parameters to a library call.
...
public class TransformerNameExecutor extends AbstractCommandExecutor
{
...
@Override
public void transform(String transformName, String sourceMimetype, String targetMimetype,
Map<String, String> transformOptions,
File sourceFile, File targetFile) throws TransformException
{
final String options = TransformerNameOptionsBuilder
.builder()
.withWidth(transformOptions.get(WIDTH_REQUEST_PARAM))
.withHeight(transformOptions.get(HEIGHT_REQUEST_PARAM))
.build();
Long timeout = stringToLong(transformOptions.get(TIMEOUT));
run(options, sourceFile, targetFile, timeout);
}
}
- Application.java - Spring Boot expects to find an Application in a project's source files. The following may be used:
package org.alfresco.transformer;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
@SpringBootApplication
public class Application
{
public static void main(String[] args)
{
SpringApplication.run(Application.class, args);
}
}
Transform requests are handled by the AbstractTransformerController, but are either:
- POST requests (a direct http request from a client) where the transform options are passed as parameters, the source is supplied as a multipart file and the response is a file download.
- POST request (a request via a message queue) where the transform options are supplied as JSON and the response is also JSON. The source and target content is read from a location accessible to both the client and the transfomer.
Example JSON request body
var transformRequest = {
"requestId": "1",
"sourceReference": "2f9ed237-c734-4366-8c8b-6001819169a4",
"sourceMediaType": "application/pdf",
"sourceSize": 123456,
"sourceExtension": "pdf",
"targetMediaType": "text/plain",
"targetExtension": "txt",
"clientType": "ACS",
"clientData": "Yo No Soy Marinero, Soy Capitan, Soy Capitan!",
"schema": 1,
"transformRequestOptions": {
"targetMimetype": "text/plain",
"targetEncoding": "UTF-8",
"abc:width": "120",
"abc:height": "200"
}
}
Example JSON response body
var transformReply = {
"requestId": "1",
"status": 201,
"errorDetails": null,
"sourceReference": "2f9ed237-c734-4366-8c8b-6001819169a4",
"targetReference": "34d69ff0-7eaa-4741-8a9f-e1915e6995bf",
"clientType": "ACS",
"clientData": "Yo No Soy Marinero, Soy Capitan, Soy Capitan!",
"schema": 1
}
Building and testing
The project can be built by running the Maven command:
mvn clean install
Artifacts
The artifacts can be obtained by:
- downloading from the Alfresco repository
- Adding a Maven dependency to your pom file.
<dependency>
<groupId>org.alfresco</groupId>
<artifactId>alfresco-transformer-base</artifactId>
<version>1.0</version>
</dependency>
and the Alfresco Maven repository:
<repository>
<id>alfresco-maven-repo</id>
<url>https://artifacts.alfresco.com/nexus/content/groups/public</url>
</repository>
The build plan is available in TravisCI.
Contributing guide
Please use this guide to make a contribution to the project.