Adding extra error messages to the t-config checking, for the case where a pipeline specifies source and target mimetypes that cannot be provided by the step transformers, so that it will be clearer that the pipeline t-config is wrong. In the case of the AI rendition tests the AI-transform t-config has a pipeline that uses libreoffice as a step transformer, to transform some source to text/plain before asking AWS_AI to process it into the final mimetype. However libreoffice does not convert to text/plain. What is happening is that the request was still being sent to the all-in-one t-engine that contains libreoffice, and it workout that it should be using the tika transformer. As a result the pipeline works by accident. The size check that was commented out (uncommented now) was just finding out that libreoffice was unable to do the transform and was reporting it. officeToComprehendPiiEntityTypesViaText is the pipeline with the error.
Common base code for T-Engines
This project provides a common base for T-Engines and supersedes the original base.
This project provides a base Spring Boot application (as a jar) to which transform specific code may be added. It includes actions such as communication between components and logging.
For more details on build a custom T-Engine and T-Config, please refer to the docs in ACS Packaging, including:
Overview
A T-Engine project which extends this base is expected to provide the following:
- An implementation of the TransformEngine interface to describe the T-Engine.
- Implementations of the CustomTransformer interface with the actual transform code.
- An
application-default.yaml
file to define a unique name for the message queue to the T-Engine.
The TransformEngine
and CustomTransformer
implementations should have an
@Component
annotation and be in or below theorg.alfresco.transform
package, so
that they will be discovered by the base T-Engine.
The TransformEngine.getTransformConfig()
method typically reads a json
file.
The names in the config should match the names returned by the CustomTransformer
implementations.
Example TransformEngine
The TransformEngineName
is important if the config from multiple T-Engines is being
combined as they are sorted by name.
package org.alfresco.transform.example;
import com.google.common.collect.ImmutableMap;
import org.alfresco.transform.base.TransformEngine;
import org.alfresco.transform.base.probes.ProbeTransform;
import org.alfresco.transform.config.reader.TransformConfigResourceReader;
import org.alfresco.transform.config.TransformConfig;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;
@Component
public class HelloTransformEngine implements TransformEngine
{
@Autowired
private TransformConfigResourceReader transformConfigResourceReader;
@Override
public String getTransformEngineName()
{
return "0200_hello";
}
@Override
public String getStartupMessage()
{
return "Startup "+getTransformEngineName()+"\nNo 3rd party licenses";
}
@Override
public TransformConfig getTransformConfig()
{
return transformConfigResourceReader.read("classpath:hello_engine_config.json");
}
@Override
public ProbeTransform getProbeTransform()
{
return new ProbeTransform("probe.txt", "text/plain", "text/plain",
ImmutableMap.of("sourceEncoding", "UTF-8", "language", "English"),
11, 10, 150, 1024, 1, 60 * 2);
}
}
Example CustomTransformer
package org.alfresco.transform.example;
import org.alfresco.transform.base.CustomTransformer;
import org.alfresco.transform.base.TransformManager;
import org.springframework.stereotype.Component;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Map;
@Component
public class HelloTransformer implements CustomTransformer
{
@Override
public String getTransformerName()
{
return "hello";
}
@Override
public void transform(String sourceMimetype, InputStream inputStream, String targetMimetype,
OutputStream outputStream, Map<String, String> transformOptions, TransformManager transformManager)
throws Exception
{
String name = new String(inputStream.readAllBytes(), transformOptions.get("sourceEncoding"));
String greeting = String.format(getGreeting(transformOptions.get("language")), name);
byte[] bytes = greeting.getBytes(transformOptions.get("sourceEncoding"));
outputStream.write(bytes, 0, bytes.length);
}
private String getGreeting(String language)
{
return "Hello %s";
}
}
Example T-Config resources/hello_engine_config.json
{
"transformOptions": {
"helloOptions": [
{"value": {"name": "language"}},
{"value": {"name": "sourceEncoding"}}
]
},
"transformers": [
{
"transformerName": "hello",
"supportedSourceAndTargetList": [
{"sourceMediaType": "text/plain", "targetMediaType": "text/plain" }
],
"transformOptions": [
"helloOptions"
]
}
]
}
Example properties resources/application-default.yaml
As can be seen the following defines a default which can be overridden by an environment variable.
queue:
engineRequestQueue: ${TRANSFORM_ENGINE_REQUEST_QUEUE:org.alfresco.transform.engine.libreoffice.acs}
Example ProbeTransform test file resources/probe.txt
Jane