From ade4b95a499c0250b6ef508485e84b15cdd58fc0 Mon Sep 17 00:00:00 2001
From: alandavis <alan.davis@hyland.com>
Date: Wed, 17 Aug 2022 13:33:16 +0100
Subject: [PATCH] Doc changes only [skip ci]

---
 docs/engine_config.md                         | 161 ---------
 docs/t-engines.md                             |  60 ++++
 docs/transform-config.md                      | 308 ++++++++++++++++++
 docs/transform-specific-code.md               | 140 ++++++++
 docs/transformer-selection.md                 |  27 ++
 docs/transformerDebug.md                      |  46 +++
 .../base/registry/TransformConfigFiles.java   |   4 +-
 7 files changed, 583 insertions(+), 163 deletions(-)
 delete mode 100644 docs/engine_config.md
 create mode 100644 docs/t-engines.md
 create mode 100644 docs/transform-config.md
 create mode 100644 docs/transform-specific-code.md
 create mode 100644 docs/transformer-selection.md
 create mode 100644 docs/transformerDebug.md
diff --git a/docs/engine_config.md b/docs/engine_config.md
deleted file mode 100644
index 9d49ccde..00000000
--- a/docs/engine_config.md
+++ /dev/null
@@ -1,161 +0,0 @@
-## T-Engine configuration
-
-T-Engines provide a */transform/config* end point for clients (e.g. Transform-Router or 
-Repository) that indicate what is supported. T-Engines store this 
-configuration as a JSON resource file named *engine_config.json*.
-
-The config can be found under `alfresco-transform-core/engines/<t-engine-name>/src/main/resources
-/<t-engine-name>_engine_config.json`; current configuration files are:
-* [Pdf-Renderer T-Engine configuration](https://github.com/Alfresco/alfresco-transform-core/blob/master/engines/pdfrenderer/src/main/resources/pdfrenderer_engine_config.json).
-* [ImageMagick T-Engine configuration](https://github.com/Alfresco/alfresco-transform-core/blob/master/engines/imagemagick/src/main/resources/imagemagick_engine_config.json).
-* [Libreoffice T-Engine configuration](https://github.com/Alfresco/alfresco-transform-core/blob/master/engines/libreoffice/src/main/resources/libreoffice_engine_config.json).
-* [Tika T-Engine configuration](https://github.com/Alfresco/alfresco-transform-core/blob/master/engines/tika/src/main/resources/tika_engine_config.json).
-* [Misc T-Engine configuration](https://github.com/Alfresco/alfresco-transform-core/blob/master/engines/misc/src/main/resources/misc_engine_config.json).
-
-*Snippet from Tika T-engine configuration:*
-```json
-{
-  "transformOptions": {
-    "tikaOptions": [
-      {"value": {"name": "targetEncoding"}}
-    ],
-    "pdfboxOptions": [
-      {"value": {"name": "notExtractBookmarksText"}},
-      {"value": {"name": "targetEncoding"}}
-    ]
-  },
-  "transformers": [
-    {
-      "transformerName": "PdfBox",
-      "supportedSourceAndTargetList": [
-        {"sourceMediaType": "application/pdf",                                 "targetMediaType": "text/html"},
-        {"sourceMediaType": "application/pdf", "maxSourceSizeBytes": 26214400, "targetMediaType": "text/plain"}
-      ],
-      "transformOptions": [
-        "pdfboxOptions"
-      ]
-    },
-    {
-      "transformerName": "TikaAuto",
-      "supportedSourceAndTargetList": [
-        {"sourceMediaType": "application/msword",              "priority": 55, "targetMediaType": "text/xml"}
-      ],
-      "transformOptions": [
-        "tikaOptions"
-      ]
-    },
-    {
-      "transformerName": "TextMining",
-      "supportedSourceAndTargetList": [
-        {"sourceMediaType": "application/msword",                              "targetMediaType": "text/xml"}
-      ],
-      "transformOptions": [
-        "tikaOptions"
-      ]
-    }
-  ]
-}
-```
-
-### Transform Options
-*  **transformOptions** provides a list of transform options that may be
-  referenced for use in different transformers. This way common options
-  don't need to be repeated for each transformer, they can be shared between
-  T-Engines. In this example there are two groups of options called **tikaOptions**
-  and **pdfboxOptions** which has a group of options **targetEncoding** and
-  **notExtractBookmarksText**. Unless an option has a **"required": true** field it is
-  considered to be optional.
-  
-  *Snippet from ImageMagick T-engine configuration:*
-```json
-    "transformOptions": {
-      "imageMagickOptions": [
-        {"value": {"name": "alphaRemove"}},
-        {"group": {"transformOptions": [
-          {"value": {"name": "cropGravity"}},
-          {"value": {"name": "cropWidth"}},
-          {"value": {"name": "cropHeight"}},
-          {"value": {"name": "cropPercentage"}},
-          {"value": {"name": "cropXOffset"}},
-          {"value": {"name": "cropYOffset"}}
-        ]}},
-      ]
-    },
-```
-*  There are two types of transformOptions, *transformOptionsValue* and *transformOptionsGroup*:
-   *  _TransformOptionsValue_ is used to represent a single transformation option, it is defined 
-   by a **name** and an optional **required** field.
-   *  _TransformOptionGroup_ represents a group of one or more options, it is used to group 
-   options that define a
-   characteristic. In the above snippet all the options for crop are defined under a group, it is recommended to
-   use this approach as it is easier to read. A transformOptionsGroup can contain one or more transformOptionsValue 
-   and transformOptionsGroup. 
-  
-### Transformers
-* **transformers** - A list of transformer definitions.
-  Each transformer definition should have a unique **transformerName**,
-  specify a **supportedSourceAndTargetList** and indicate which
-  options it supports. As it is shown in the Tika snippet, an *engine_config*
-  can describe one or more transformers, as a T-engine can have
-  multiple transformers (e.g. Tika, Misc). A transformer configuration may 
-  specify references to 0 or more transformOptions.
-
-### Supported Source and Target List
-* **supportedSourceAndTargetList** is simply a list of source and target
-  Media Types that may be transformed, optionally specifying a
-  **maxSourceSizeBytes** and a **priority** value. 
-*  *maxSourceSizeBytes* is used to set the upper size limit of a transformation.
-   * If not specified, the default value for maxSourceSizeBytes is **unlimited**.
-*  *priority* it is used by clients to determine which transfomer to call or by T-engines
-    with multiple transformers to determine which one to use. In the above Tika snippet,
-    both *TikaAuto* and *TextMining* have the capability to transform *"application/msword"*
-    into *"text/xml"*, the transformer containing the source-target media type with higher priority will be chosen by the
-    T-engine as the one to execute the transformation, in this case it will be *TextMining*, because:
-   * If not specified, the default value for priority is **50**.
-   * Note: priority values are like the order in a queue, the **lower** the number the **higher the
-    priority** is.
-   
-## Transformer selection strategy
-The T-Engine configuration is used to choose which T-Engine will perform a transform.
-A transformer definition contains a supported list of source and target Media Types. This is used for the
-most basic selection. This is further refined by checking that the definition also supports transform options
-(parameters) that have been supplied in a transform request or a Rendition Definition used in a rendition request.
-Order for selection is:
-1. Source->Target Media Types
-2. transformOptions
-3. maxSourceSizeBytes
-4. priority
- 
-#### Case 1:
-```
-Transformer 1 defines options: Op1, Op2
-Transformer 2 defines options: Op1, Op2, Op3, Op4
-```
-```
-Rendition provides values for options: Op2, Op3
-```
-If we assume both transformers support the required source and target Media Types, Transformer 2 will be selected
-because it knows about all the supplied options. The definition may also specify that some options are required or grouped.
-
-#### Case 2:
-```
-Transformer 1 defines options: Op1, Op2, maxSize
-Transformer 2 defines options: Op1, Op2, Op3
-```
-```
-Rendition provides values for options: Op1, Op2
-```
-If we assume both transformers support the required source and target Media Types, and file size is greater than *maxSize*
-,Transformer 2 will be selected because if can handle *maxSourceSizeBytes* for this transformation.
-
-#### Case 3:
-```
-Transformer 1 defines options: Op1, Op2, priorty1
-Transformer 2 defines options: Op1, Op2, Op3, priority2
-```
-```
-Rendition provides values for options: Op1, Op2
-```
-If we assume both transformers support the required source and target Media Types and
- *priority1* < *priority2*, Transformer 1 will be selected because its priority is higher.
- 
\ No newline at end of file
diff --git a/docs/t-engines.md b/docs/t-engines.md
new file mode 100644
index 00000000..e06e0fe3
--- /dev/null
+++ b/docs/t-engines.md
@@ -0,0 +1,60 @@
+## T-Engines
+
+The t-engines provide the basic transform operations. The Transform Service
+provides a common base for the communication with other components. It is
+this base that is described in this section. The base is a Spring Boot
+application to which transform specific code is added and then wrapped
+in a Docker image with any programs that the transforms need. The base
+does not need to be used as long as there appears to be a process responding
+to the endpoints and messages.
+
+A t-engine groups together one or more Transformers. Each Transformer
+(provided by transform specific code) knows how to perform a set of
+transformations from one MIME Type to another with a common set of
+t-options.
+
+~~~
+0010 my-t-engine
+  Transformer 1
+    mimetype A -> mimetype B
+    mimetype A -> mimetype C
+    mimetype B -> mimetype C
+    option1
+    option2
+  Transformer 2
+    mimetype A -> mimetype B
+    mimetype D -> mimetype C
+    option2
+    option3
+0020 another-t-engine
+  ...
+0030 yet-another-t-engine
+  ...
+~~~
+
+### Endpoints
+
+* `POST /transform` to perform a transform. There are two forms:
+  * For asynchronous transforms: Perform a transform using a
+    `TransformRequest` received from the t-router via a message queue. The
+    `TransformReply` is sent back via the queue.
+  * For synchronous transforms: Performs a transform on content uploaded as
+    a Multipart File and provides the resulting content as a download.
+    Transform options are extracted from the request properties. The
+    following are not added as transform options, but are used to select the
+    transformer: `sourceMimetype` & `targetMimetype`.
+* `GET /transform/config` to obtain t-config about what the t-engine supports.
+  It has a parameter `configVersion` to allow a caller and the t-engine to
+  negotiate down to a common format. The value is an integer which indicates
+  which elements may be added to the config. These elements reflect
+  functionality supported by the base (such as pre-signed URLs). The
+  `CoreVersionDecorator` adds to the Config returned by the transform
+  specific code.
+* `GET /` provides an html test page to upload a source file, enter transform
+  options and issue a synchronous transform request. Useful in testing.
+* `GET /log` provides a page with basic log information. Useful in testing.
+* `GET /error` provides an error page when testing.
+* `GET /version` provides a String message to be included in client debug
+  messages.
+* `GET /ready` used by Kubernetes as a readiness probe.
+* `GET /live` used by Kubernetes as a liveness probe.
\ No newline at end of file
diff --git a/docs/transform-config.md b/docs/transform-config.md
new file mode 100644
index 00000000..0227bfbd
--- /dev/null
+++ b/docs/transform-config.md
@@ -0,0 +1,308 @@
+## T-Engine configuration
+
+Each t-engine provides an endpoint that returns t-config that defines what
+it supports. The t-router and t-engines may also have external t-config files.
+These are combined in name order. As sorting is alphanumeric, you may wish to
+consider using a fixed-length numeric prefix in filenames and t-engine names.
+As will be seen, t-config may reference elements from other components or
+modify elements from earlier t-config.
+
+Current configuration files are:
+* [Pdf-Renderer T-Engine configuration](https://github.com/Alfresco/alfresco-transform-core/blob/master/engines/pdfrenderer/src/main/resources/pdfrenderer_engine_config.json).
+* [ImageMagick T-Engine configuration](https://github.com/Alfresco/alfresco-transform-core/blob/master/engines/imagemagick/src/main/resources/imagemagick_engine_config.json).
+* [Libreoffice T-Engine configuration](https://github.com/Alfresco/alfresco-transform-core/blob/master/engines/libreoffice/src/main/resources/libreoffice_engine_config.json).
+* [Tika T-Engine configuration](https://github.com/Alfresco/alfresco-transform-core/blob/master/engines/tika/src/main/resources/tika_engine_config.json).
+* [Misc T-Engine configuration](https://github.com/Alfresco/alfresco-transform-core/blob/master/engines/misc/src/main/resources/misc_engine_config.json).
+
+Additional config files (which may be resources on the classpath or external
+files) are specified in Spring Boot properties such as
+`transform.config.file.<filename>` or environment variables like
+`TRANSFORM_CONFIG_FILE_<filename>`.
+
+The following is a simple t-config file from an example Hello World
+t-engine.
+
+~~~
+{
+  "transformOptions":
+  {
+    "helloWorldOptions":
+    [
+      {"value": {"name": "language"}}
+    ]
+  },
+  "transformers":
+  [
+    {
+      "transformerName": "helloWorld",
+      "supportedSourceAndTargetList":
+      [
+        {"sourceMediaType": "text/plain",  "maxSourceSizeBytes": 50, "targetMediaType": "text/html"  }
+      ],
+      "transformOptions":
+      [
+        "helloWorldOptions"
+      ]
+    }
+  ]
+}
+~~~
+
+* **transformOptions** provides a list of transform options (each with its own
+  name) that may be referenced for use in different transformers. This way
+  common options don't need to be repeated for each transformer. They can
+  even be shared between T-Engines. In this example there is only one group
+  of options called `helloWorldOptions`, which has just one option,
+  `language`. Unless an option has a `"required": true` field it is considered
+  to be optional. You don't need to specify _sourceMimetype, sourceExtension,
+  sourceEncoding, targetMimetype, targetExtension_ or _timeout_ as options as
+  these are available to all transformers.
+* **transformers** is a list of transformer definitions. Each transformer
+  definition should have a unique `transformerName`, specify a
+  `supportedSourceAndTargetList` and indicate which options it supports.
+  In this case there is only one transformer called `helloWorld` and it
+  accepts `helloWorldOptions`. A transformer may specify references to 0
+  or more transformOptions.
+* **supportedSourceAndTargetList** is simply a list of source and target
+  Media Types that may be transformed, optionally specifying
+  `maxSourceSizeBytes` and `priority` values. In this case there is only one,
+  from text to HTML, and we have limited the source file size to avoid
+  transforming files that clearly don't contain names.
+
+### Transform pipelines
+
+Transforms may be combined in a pipeline to form a new transformer, where
+the output from one becomes the input to the next and so on. The t-config
+defines the sequence of transform steps and intermediate Media Types. Like
+any other transformer, it specifies a list of supported source and target
+Media Types. If you don't supply any, all possible combinations are assumed
+to be available. The definition may reuse the `transformOptions` of
+transformers in the pipeline, but typically will define its own subset
+of these.
+
+The following example begins with the `helloWorld` Transformer, which takes a
+text file containing a name and produces an HTML file with a `Hello <name>`
+message in the body. This is then transformed back into a text file. This
+example contains just one pipeline transformer, but many may be defined 
+in the same file.
+
+~~~
+{
+  "transformers": [
+    {
+      "transformerName": "helloWorldText",
+      "transformerPipeline" : [
+        {"transformerName": "helloWorld", "targetMediaType": "text/html"},
+        {"transformerName": "html"}
+      ],
+      "supportedSourceAndTargetList": [
+        {"sourceMediaType": "text/plain", "priority": 45,  "targetMediaType": "text/plain" }
+      ],
+      "transformOptions": [
+        "helloWorldOptions"
+      ]
+    }
+  ]
+}
+~~~
+
+* **transformerName** Try to create a unique name for the transform.
+* **transformerPipeline** A list of transformers in the pipeline. The
+  `targetMediaType` specifies the intermediate Media Types between
+  transformers. There is no final `targetMediaType` as this comes from the
+  `supportedSourceAndTargetList`. The `transformerName` may reference a
+  transformer that has not been defined yet. A warning is issued if
+  it remains undefined after all t-config has been combined. Generally
+  it is better for a t-engine rather than the t-router to define pipeline
+  transformers as this limits the number of places that have to be changed.
+  Normally it is obvious which t-engine should contain the definition. 
+* **supportedSourceAndTargetList** The supported source and target Media
+  Types, which refer to the Media Types this pipeline transformer can
+  transform from and to. Additionally, you can set the `priority` and the
+  `maxSourceSizeBytes`. If blank, this indicates that all possible
+  combinations are supported. This is the cartesian product of all source
+  types to the first intermediate type and all target types from the last
+  intermediate type. Any combinations supported by the first transformer
+  are excluded. They will also have the priority from the first transform.
+* **transformOptions** A list of references to options required by the
+  pipeline transformer.
+
+### Failover transforms
+
+A failover transform simply provides a list of transforms to be attempted
+one after another until one succeeds. For example, you may have a fast
+transform that is able to handle a limited set of transforms and another
+that is slower but handles all cases.
+
+~~~
+{
+  "transformers": [
+    {
+      "transformerName": "imgExtractOrImgCreate",
+      "transformerFailover" : [ "imgExtract", "imgCreate" ],
+      "supportedSourceAndTargetList": [
+        {"sourceMediaType": "application/vnd.oasis.opendocument.graphics", "priority": 150, "targetMediaType": "image/png" },
+        ...
+        {"sourceMediaType": "application/vnd.sun.xml.calc.template",       "priority": 150, "targetMediaType": "image/png" }
+      ]
+    }
+  ]
+}
+~~~
+
+* **transformerName** Try to create a unique name for the transform.
+* **transformerFailover** A list of transformers to try. This may include
+  references to transformers that have not been defined yet. Generally it
+  is better for the t-engine rather than the t-router to define failover
+  transformers as this limits the number of places that have to be changed.
+  Normally it is obvious which t-engine should contain the definition. 
+* **supportedSourceAndTargetList** The supported source and target Media
+  Types, which refer to the Media Types this failover transformer can
+  transform from and to. Additionally, you can set the `priority` and the
+  `maxSourceSizeBytes`. Unlike pipelines, it must not be blank.
+* **transformOptions** A list of references to options required by the
+  failover transformer.
+
+### Overriding transforms
+
+It is possible to override a previously defined transform definition. The
+following example removes most of the supported source to target media
+types from the standard `"libreoffice"` transform. It also changes the
+max size and priority of others. This is not something you would normally
+want to do.
+~~~
+{
+  "transformers": [
+    {
+      "transformerName": "libreoffice",
+      "supportedSourceAndTargetList": [
+        {"sourceMediaType": "text/csv", "maxSourceSizeBytes": 1000, "targetMediaType": "text/html" },
+        {"sourceMediaType": "text/csv", "targetMediaType": "application/vnd.oasis.opendocument.spreadsheet" },
+        {"sourceMediaType": "text/csv", "targetMediaType": "application/vnd.oasis.opendocument.spreadsheet-template" },
+        {"sourceMediaType": "text/csv", "targetMediaType": "text/tab-separated-values" },
+        {"sourceMediaType": "text/csv", "priority": 45, "targetMediaType": "application/vnd.ms-excel" },
+        {"sourceMediaType": "text/csv", "priority": 155, "targetMediaType": "application/pdf" }
+      ]
+    }
+  ]
+}
+~~~
+
+### Removing a transformer
+
+To discard a previous transformer definition include its name in the
+optional `"removeTransformers"` list. You might want to do this if you
+have a replacement and wish to keep the overall configuration simple (so it
+contains no alternatives), or you wish to temporarily remove it. The
+following example removes two transformers before processing any other
+configuration in the same T-Engine or pipeline file.
+
+~~~
+{
+  "removeTransformers" : [
+    "libreoffice",
+    "Archive"
+   ]
+  ...
+}
+~~~
+
+### Overriding the supportedSourceAndTargetList
+
+Rather than totally override an existing transform definition, it is
+generally simpler to modify the `"supportedSourceAndTargetList"` by adding
+elements to the optional `"addSupported"`, `"removeSupported"` and
+`"overrideSupported"` lists. You will need to specify the
+`"transformerName"` but you will not need to repeat all the other
+`"supportedSourceAndTargetList"` values, which means if there are changes
+in the original, the same change is not needed in a second place. The
+following example adds one transform, removes two others and changes
+the `"priority"` and `"maxSourceSizeBytes"` of another. This is done before
+processing any other configuration in the same T-Engine or pipeline file.
+~~~
+{
+  "addSupported": [
+    {
+      "transformerName": "Archive",
+      "sourceMediaType": "application/zip",
+      "targetMediaType": "text/csv",
+      "priority": 60,
+      "maxSourceSizeBytes": 18874368
+    }
+  ],
+  "removeSupported": [
+    {
+      "transformerName": "Archive",
+      "sourceMediaType": "application/zip",
+      "targetMediaType": "text/xml"
+    },
+    {
+      "transformerName": "Archive",
+      "sourceMediaType": "application/zip",
+      "targetMediaType": "text/plain"
+    }
+  ],
+  "overrideSupported": [
+    {
+      "transformerName": "Archive",
+      "sourceMediaType": "application/zip",
+      "targetMediaType": "text/html",
+      "priority": 60,
+      "maxSourceSizeBytes": 18874368
+    }
+  ]
+  ...
+}
+~~~
+
+### Default maxSourceSizeBytes and priority values
+
+When defining `"supportedSourceAndTargetList"` elements the `"priority"`
+and `"maxSourceSizeBytes"` are optional and normally have the default
+values of 50 and -1 (no limit). It is possible to change those defaults.
+In precedence order from most specific to most general these are defined
+by combinations of `"transformerName"` and `"sourceMediaType"`.
+
+* **transformer and source media type default** both specified
+* **transformer default** only the transformer name is specified
+* **source media type default** only the source media type is specified
+* **system wide default** neither are specified.
+
+Both `"priority"` and `"maxSourceSizeBytes"` may be specified in an element,
+but if only one is specified it is only that value that is being defaulted.
+
+Being able to change the defaults is particularly useful once a T-Engine
+has been developed as it allows a system administrator to handle
+limitations that are only found later. The `system wide defaults` are
+generally not used but are included for completeness. The following
+example says that the `"Office"` transformer by default should only handle
+zip files up to 18 MB and by default the maximum size of a `.doc` file to be
+transformed is 4 MB. The third element defaults the priority, possibly
+allowing another transformer that has specified a priority of, say, `50` to
+be used in preference.
+
+Default values are only applied after all t-config has been read.
+
+~~~
+{
+  "supportedDefaults": [
+    {
+      "transformerName": "Office",             // default for a source type within a transformer
+      "sourceMediaType": "application/zip",
+      "maxSourceSizeBytes": 18874368
+    },
+    {
+      "sourceMediaType": "application/msword", // defaults for a source type
+      "maxSourceSizeBytes": 4194304,
+      "priority": 45
+    },
+    {
+      "priority": 60                           // system wide default
+    },
+    {
+      "maxSourceSizeBytes": -1                 // system wide default
+    }
+  ]
+  ...
+}
+~~~
\ No newline at end of file
diff --git a/docs/transform-specific-code.md b/docs/transform-specific-code.md
new file mode 100644
index 00000000..dfbaa369
--- /dev/null
+++ b/docs/transform-specific-code.md
@@ -0,0 +1,140 @@
+## Transform specific code
+
+To create a new t-engine an author uses a base t-engine (a Spring Boot
+application) and implements the following interfaces. An implementation of
+the `CustomTransformer` provides the actual transformation code and the
+implementation of the `TransformEngine` says what it is capable of
+transforming. The `TransformConfig` is normally read from a json file on the
+classpath. Multiple `CustomTransformer` implementations may be in a single
+t-engine. As a result the author can concentrate on the code that transforms
+one format to another without really worrying about all the plumbing.
+Typically, the transform specific code uses a 3rd party library or an
+external executable which needs to be added to the Docker image.
+
+~~~
+package org.alfresco.transform;
+
+import org.alfresco.transform.config.TransformConfig;
+import org.alfresco.transformer.probes.ProbeTestTransform;
+
+import java.util.Set;
+
+/**
+ * Interface to be implemented by transform specific code. Provides information
+ * about the t-engine as a whole. So that it is automatically picked up, it must
+ * exist in a package under {@code org.alfresco.transform} and have the Spring
+ * {@code @Component} annotation.
+ */
+public interface TransformEngine
+{
+    /**
+     * @return the name of the t-engine. The t-router reads config from t-engines
+     *         in name order.
+     */
+    String getTransformEngineName();
+
+    /**
+     * @return a definition of what the t-engine supports. Normally read from a json
+     *         Resource on the classpath.
+     */
+    TransformConfig getTransformConfig();
+
+    /**
+     * @return a ProbeTestTransform (will do a quick transform) for k8s liveness and
+     *         readiness probes.
+     */
+    ProbeTransform getProbeTransform();
+}
+~~~
+
+Implementations of the following interface provide the actual transform code.
+
+~~~
+package org.alfresco.transform;
+
+import java.io.InputStream;
+import java.io.OutputStream;
+import java.util.Map;
+
+/**
+ * Interface to be implemented by transform specific code. The
+ * {@code transformerName} should match the transformerName in the
+ * {@link TransformConfig} returned by the {@link TransformEngine}. So that it is
+ * automatically picked up, it must exist in a package under
+ * {@code org.alfresco.transform} and have the Spring {@code @Component} annotation.
+ *
+ * Implementations may also use the {@link TransformManager} if they wish to
+ * interact with the base t-engine.
+ */
+public interface CustomTransformer
+{
+    String getTransformerName();
+
+    void transform(String sourceMimetype, InputStream inputStream,
+                   String targetMimetype, OutputStream outputStream,
+                   Map<String, String> transformOptions,
+                   TransformManager transformManager) throws Exception;
+}
+~~~
+
+The implementation of the following interface is provided by the t-base and
+allows the `CustomTransformer` to interact with the base t-engine. The
+creation of Files is discouraged as it is better not to leave files on disk.
+
+~~~
+package org.alfresco.transform.base;
+
+import java.io.File;
+import java.io.InputStream;
+import java.io.OutputStream;
+import java.util.Map;
+
+/**
+ * Allows {@link CustomTransformer} implementations to interact with the base
+ * t-engine.
+ */
+public interface TransformManager
+{
+    /**
+     * Allows a CustomTransformer to use a local source File rather than the
+     * supplied InputStream. To avoid creating extra files, if a File has already
+     * been created by the base t-engine, it is returned.
+     */
+    File createSourceFile();
+
+    /**
+     * Allows a CustomTransformer to use a local target File rather than the
+     * supplied OutputStream. To avoid creating extra files, if a File has already
+     * been created by the base t-engine, it is returned.
+     */
+    File createTargetFile();
+
+    /**
+     * Allows a single transform request to have multiple transform responses. For
+     * example, images from a video at different time offsets or different pages of
+     * a document. Following a call to this method a transform response is made with
+     * the data sent to the current {@code OutputStream}. If this method has been
+     * called, there will not be another response when {@link CustomTransformer#
+     * transform(String, InputStream, String, OutputStream, Map, TransformManager)}
+     * returns and any data written to the final {@code OutputStream} will be
+     * ignored.
+     * @param index    returned with the response, so that the fragment may be
+     *                 distinguished from other responses. Renditions use the index
+     *                 as an offset into elements. A {@code null} value indicates
+     *                 that there is no more output and any data sent to the current
+     *                 {@code outputStream} will be ignored.
+     * @param finished indicates this is the final fragment. {@code False} indicates
+     *                 that it is expected there will be more fragments. There need
+     *                 not be a call with this parameter set to {@code true}.
+     * @return a new {@code OutputStream} for the next fragment. A {@code null} will
+     *                 be returned if {@code index} was {@code null} or {@code
+     *                 finished} was {@code true}.
+     * @throws TransformException if a synchronous (http) request has been made, as
+     *                 this only works with requests on queues; if the first call to
+     *                 this method indicated there was no output; or if another call
+     *                 is made after indicating there should be no more fragments.
+     * @throws IOException if there was a problem sending the response.
+     */
+    OutputStream respondWithFragment(Integer index, boolean finished) throws IOException;
+}
+~~~
\ No newline at end of file
diff --git a/docs/transformer-selection.md b/docs/transformer-selection.md
new file mode 100644
index 00000000..de132447
--- /dev/null
+++ b/docs/transformer-selection.md
@@ -0,0 +1,27 @@
+## Transformer selection strategy
+
+The TransformRegistry uses t-config to choose which Transformer will be
+used. A transformer definition contains a supported list of source and
+target Media Types. This is used for the most basic selection. It is further
+refined by checking that the definition also supports transform options (the
+parameters) that have been supplied in a transform request.
+
+~~~
+Transformer 1 defines options: Op1, Op2
+Transformer 2 defines options: Op1, Op2, Op3, Op4
+
+Transform request provides values for options: Op2, Op3
+~~~
+If we assume both transformers support the required source and target Media
+Types, Transformer 2 will be selected because it knows about all the supplied
+options. The definition may also specify that some options are required or
+grouped. If any members of an optional group are supplied, all required
+members of that group become required.
+
+The configuration may impose a source file size limit resulting in the
+selection of a different transformer. Size limits are normally added to avoid
+the transforms consuming too many resources.
+
+The configuration may also specify a priority which will be used in
+Transformer selection if there are a number of possible transformers. The
+highest priority is the one with the lowest number.
\ No newline at end of file
diff --git a/docs/transformerDebug.md b/docs/transformerDebug.md
new file mode 100644
index 00000000..7456df34
--- /dev/null
+++ b/docs/transformerDebug.md
@@ -0,0 +1,46 @@
+## TransformerDebug
+
+In addition to any normal logging, the t-engines, t-router and t-client also
+use the `TransformerDebug` class to provide request based logging. The
+following is an example from Alfresco after the upload of a `docx` file.
+
+~~~
+163               docx json AGM 2016 - Masters report.docx 14.8 KB -- metadataExtract --  TransformService
+163               workspace://SpacesStore/0db3a665-328d-4437-85ed-56b753cf19c8 1563306426
+163               docx json  14.8 KB -- metadataExtract -- PoiMetadataExtractor
+163                 cm:title=
+163                 cm:author=James Dobinson
+163               Finished in 664 ms
+...
+164               docx png  AGM 2016 - Masters report.docx 14.8 KB -- doclib --  TransformService
+164               workspace://SpacesStore/0db3a665-328d-4437-85ed-56b753cf19c8 1563306426
+164               docx png   14.8 KB -- doclib -- officeToImageViaPdf
+164.1             docx pdf   libreoffice
+164.2             pdf  png   pdfToImageViaPng
+164.2.1           pdf  png   pdfrenderer
+164.2.2           png  png   imagemagick
+164.2.2             endPage="0"
+164.2.2             resizeHeight="100"
+164.2.2             thumbnail="true"
+164.2.2             startPage="0"
+164.2.2             resizeWidth="100"
+164.2.2             autoOrient="true"
+164.2.2             allowEnlargement="false"
+164.2.2             maintainAspectRatio="true"
+164               Finished in 725 ms
+~~~
+
+This log happens to be from the t-client, but similar log lines exist in the
+t-router and individual t-engines.
+
+All lines start with a reference, which begins with the client’s request
+number (`163`, `164`), if known, followed by a nested pipeline or failover
+structure. The first request extracts metadata and the second creates a
+thumbnail rendition (called `doclib`). The second request is handled by a
+pipeline called `officeToImageViaPdf` which uses `libreoffice` to transform 
+to `pdf` and then another pipeline to convert to `png`. The last step
+(`164.2.2`) in the process resizes the `png` using a number of transform
+options.
+
+If requested, log information is passed back in the TransformReply's
+clientData.
\ No newline at end of file
diff --git a/engines/base/src/main/java/org/alfresco/transform/base/registry/TransformConfigFiles.java b/engines/base/src/main/java/org/alfresco/transform/base/registry/TransformConfigFiles.java
index fd7e63a1..eb8d3b68 100644
--- a/engines/base/src/main/java/org/alfresco/transform/base/registry/TransformConfigFiles.java
+++ b/engines/base/src/main/java/org/alfresco/transform/base/registry/TransformConfigFiles.java
@@ -38,8 +38,8 @@ import java.util.Map;
 @ConfigurationProperties(prefix = "transform.config")
 public class TransformConfigFiles
 {
-    // Populated from Spring Boot properties or such as transform.config.file.<engineName> or environment variables like
-    // TRANSFORM_CONFIG_FILE_<engineName>.
+    // Populated from Spring Boot properties such as transform.config.file.<filename> or environment variables like
+    // TRANSFORM_CONFIG_FILE_<filename>.
     private final Map<String, String> files = new HashMap<>();
 
     public Map<String, String> getFile()