38 Commits

Author SHA1 Message Date
alandavis
c9743063f5 REPO-4452 Add local transformer pipelines (#121)
Set priorities on 5 Tika transform routes so they are not used if LibreOffice is available. This currently has no impact on the Transform Service (priorities are not used for routing), but it does affect repository LocalTransforms, where the wrong T-Engine is otherwise selected.
2019-09-20 12:01:49 +03:00
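
The priority mechanism described in the commit above can be illustrated with a minimal sketch. It assumes that a lower priority number is preferred and that 50 is the default (the default value is mentioned later in this log). The `Route` class, the `select` and `pick` methods, and the value 55 are hypothetical names and numbers chosen for illustration; they are not taken from the repository.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class PrioritySelection {
    // Assumed default priority, matching the "50 (default)" mentioned in the log.
    static final int DEFAULT_PRIORITY = 50;

    static final String DOC = "application/msword";
    static final String PDF = "application/pdf";

    // Hypothetical model of one source -> target route offered by a T-Engine.
    static final class Route {
        final String engine;
        final String source;
        final String target;
        final int priority;

        Route(String engine, String source, String target, int priority) {
            this.engine = engine;
            this.source = source;
            this.target = target;
            this.priority = priority;
        }
    }

    // Returns the matching route with the lowest priority number
    // (assumption: lower number means preferred).
    static Optional<Route> select(List<Route> routes, String source, String target) {
        return routes.stream()
                .filter(r -> r.source.equals(source) && r.target.equals(target))
                .min(Comparator.comparingInt((Route r) -> r.priority));
    }

    // Builds a small registry and reports which engine would be chosen.
    static String pick(boolean libreOfficeAvailable) {
        List<Route> routes = new ArrayList<>();
        // Hypothetical value 55: any number above the default demotes the Tika route.
        routes.add(new Route("tika", DOC, PDF, 55));
        if (libreOfficeAvailable) {
            routes.add(new Route("libreoffice", DOC, PDF, DEFAULT_PRIORITY));
        }
        return select(routes, DOC, PDF).map(r -> r.engine).orElse("none");
    }

    public static void main(String[] args) {
        System.out.println(pick(true));  // prints "libreoffice"
        System.out.println(pick(false)); // prints "tika"
    }
}
```

With this ordering the demoted Tika route is still used as a fallback when no LibreOffice route is registered.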
Cezar.Leahu
0e09a0c415 ATS-515: Minor test update 2019-09-13 16:31:39 +03:00
Cezar.Leahu
7a114062e4 ATS-515: Default "targetEncoding=UTF-8" option for TIKA 2019-09-13 15:35:35 +03:00
Cezar.Leahu
411a7bd508 ATS-515: Default options for TIKA when called through ATS
- add default "UTF-8" target encoding
- restore previously default values for the other options
2019-09-13 12:50:37 +03:00
CezarLeahu
c650bf292c
ATS-545: Code formatting and small improvements (#113) 2019-09-12 16:46:48 +03:00
alandavis
d6777b58eb REPO-4639 Content conversion failed using Tika (#108)
* REPO-4639: Split tika engine_config.json into separate transformers.

* WIP: REPO-4639 Content conversion failed using Tika

The Tika T-Engine "transform" option does not exist when called via the Transform Service or Local transforms, which resulted in no transforms taking place. However, this value is not really needed, as the T-Engine should be able to read its own engine_config.xml to work out which sub transform to use. Transforms only worked via Legacy transforms, which used a T-Engine.

This code is based on tried and tested ACS repository code. It has been further simplified.

TODO:
- replace the ConfigFileFinder class just added with something that uses Spring to read the JSON. i.e. simplify it.
- replace the CombinedConfig class just added with something that does not need the InLineTransformer. i.e. simplify it.
- create tests based on the repo tests
- remove the source and target mimetype checks in Tika as a check against engine_config.xml is cleaner.
- repeat the process for the Misc T-Engine as it has similar code checking source and target mimetypes.
- remove the transform option passed by the legacy transforms.

* Removed CombinedConfig and ConfigFileFinder classes.

* Extracted AbstractTransformRegistry so that it may be used in the ACS repository too.

TODO AbstractTransformRegistry and AbstractTransformRegistry need to be moved to the alfresco-transform-model package

* tidy up only

* REPO-4639: Add priority to duplicate transforms.

* REPO-4639: Refactor TikaTransformationIT to use the new Tika /transform specifications
Changes AbstractTransformerControllerTest as the engine_config is now loaded in TransformRegistryImpl instead of AbstractTransformerController

* Rename to TransformServiceRegistry, so we don't have to change the repo code.

* Added the baseUrl parameter to the register method and fixed the missed rename in the last commit.

* Javadoc change only

* Moved common classes (with repo) AbstractTransformRegistry and TransformServiceRegistry to alfresco-transform-model

* Replace (simplify) all the isTransformable calls with a check against the JSON.
- Tests now only pass targetEncoding to the 'string' transformer.

* Fix failing tests.

* Revert port change

* REPO-4639 : Add priorities to misc engine_config

* REPO-4639 : Add priorities to pdf-renderer and imagemagick engine_config

* Remove test that is @Ignored

* Pick up alfresco-transformer-model 1.0.2.7-REPO-4639-1

* REPO-4639 : Add priorities to libreoffice engine_config

* REPO-4639 : Add priorities to tika engine_config

* REPO-4639 : Remove all priorities with value equal to 50 (default) from engine_config

* Switch over to using TransformServiceRegistry in org.alfresco.transform.client.registry
Reintroduce the noExtensionSourceFilenameTest having removed @Ignore.

* New whitesource issue on commons-compress 1.18. Upgrading to 1.19.

* Removed the text/javascript -> text/plain test as this is not supported

* Modifications as a result of changes to method names in alfresco-transform-model

* Pick up alfresco-transform-model 1.0.2.7-ATS545-2

* Remove unused imports
2019-09-12 15:34:42 +03:00
Alexandru-Eusebiu Epure
bcd6fefe4d
REPO-4617 Add missing source types for T-engines (#95)
Update missing sourceMediaType -> targetMediaType for alfresco-pdf-renderer
Update missing sourceMediaType -> targetMediaType for imagemagick
Update missing sourceMediaType -> targetMediaType for tika libreoffice
Update missing sourceMediaType -> targetMediaType for tika
Update missing sourceMediaType -> targetMediaType for misc transformer
Add T-engine Integration Tests
Fix JavaDoc warnings
Add sample files for tested mimetypes
2019-09-03 13:34:07 +03:00
CezarLeahu
d9747f015d
ATS-466/ATS-538/ATS-539: Incorporate Misc T-Engine in ATS (#98)
- fix multiple Misc Transformer bugs related to file mimeTypes
- remove usage of 'source/targetMimetype' as transform options/parameters
- add 'source/targetMimetype' arguments to the 'processTransform' method
- remove unnecessary code (e.g. useless overridden methods)
- add quick* test resource files
- add integration test for 'Local Transformations' on the Misc engine
- set up Integration Tests POM configuration for all T-Engine modules
2019-08-26 13:59:38 +03:00
CezarLeahu
3c977bd914
ATS-480 : Update to Tika 1.21 and matching POI (#93)
- upgrade tika
- upgrade poi
- fix/update test resource for PDF parsing
(multi-page PDF parsing was changed in tika-parsers 1.21)
2019-08-20 22:20:47 +03:00
CezarLeahu
bb187dc00f
ATS-488 : Remove alfresco-core dependency (#90)
- remove *alfresco-core* dependency
- remove *alfresco-data-model* dependency
- replace _TempFileProvider_ with local implementation
- duplicate _RuntimeExec_ and _ExecParameterTokenizer_ from alfresco-core
- partially duplicate _MimetypeMap_ from alfresco-data-model
2019-08-20 10:05:39 +03:00
CezarLeahu
22de0ce5df
ATS-532 : Code improvements (#89)
- move startup message from controllers to the Application classes (SpringBoot configuration beans)
- added static imports for most static variables and static methods
- simplified a few nested *if*s
- replaced Arrays.asList() with explicit immutable collections
- fixed a few IntelliJ code inspection warnings
2019-08-18 18:45:14 +03:00
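
The `Arrays.asList()` replacement mentioned in the commit above matters because `Arrays.asList()` returns a fixed-size list whose elements can still be overwritten. The sketch below shows the difference using `Collections.unmodifiableList`. This is an assumption made for illustration: the log does not say which immutable collection type the commit actually used, and the class and method names here are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ImmutableListDemo {
    // Fixed-size view over an array: add/remove fail, but set() succeeds.
    static List<String> fixedSize() {
        return Arrays.asList("pdf", "png");
    }

    // Read-only wrapper: every mutating call throws UnsupportedOperationException.
    static List<String> immutable() {
        return Collections.unmodifiableList(Arrays.asList("pdf", "png"));
    }

    // Reports whether the first element of the list can be overwritten.
    static boolean canOverwrite(List<String> list) {
        try {
            list.set(0, "changed");
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canOverwrite(fixedSize())); // prints "true"
        System.out.println(canOverwrite(immutable())); // prints "false"
    }
}
```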
Cezar.Leahu
485347729b ATS-531 : Reformat code 2019-08-14 22:46:36 +03:00
CezarLeahu
3e4f6af0e4
ATS-467 : JMS config startup message (#72) 2019-07-10 13:42:28 +03:00
Lucian Tuca
f1edbb71e9 ATS-460 : ATS: T-Engines - Update license information to *not* refer to Enterprise edition
- updated controller line
2019-07-08 11:40:29 +03:00
CezarLeahu
be48c8e7a9
ATS-467 : T-Engine logs fill up with ActiveMQ errors when used in ACS Community edition
- instantiate JMS beans only when 'activemq.url' property is set
- fix integration tests
2019-07-05 11:39:39 +03:00
CezarLeahu
43b586a565
ATS-443 : Dependency updates (#59) 2019-06-24 17:41:51 +03:00
eknizat
ff0f659ded
REPO-4331: Add remaining core transformers (#45)
* REPO-4331: Add remaining core transformers
* HtmlParserContentTransformer
* AppleIWorksContentTransformer
* StringExtractingContentTransformer
* TextToPdfContentTransformer
* OOXMLThumbnailContentTransformer
2019-06-20 12:31:38 +01:00
Cezar.Leahu
69a836fc89 ATS-252 : Use docker-maven-plugin for the integration tests 2019-06-06 14:06:38 +03:00
Cezar.Leahu
8419cc4b6d ATS-448 : Small T-Engine configuration improvements
- ignore empty fields during serialization
- changed application.properties files to application-default.yaml
2019-06-05 13:21:15 +03:00
DenisGabriela
d2292f94a0 ATS-434 : Implement /info endpoint in T-Engines (#44)
- implement '/info' endpoint
- add engine_config files
- use SNAPSHOT transform-model with new Transform Config models (TODO update after transform-model release)
- remove 'tests' from travis stages
- add new junits
- add test resources
2019-05-31 14:14:03 +03:00
CezarLeahu
70652dcb31
ATS-400 : Update Copyright (#23)
* ATS-400 : Update Copyright

* ATS-400 : Update maven license plugin configuration

* ATS-400 : Update Copyright with the license-maven-plugin

* ATS-400 : Update licence-maven-plugin default configuration
2019-05-20 14:06:25 +03:00
Lucian Tuca
279efd3071 ATS-329
- updated exception package
2019-04-11 16:41:00 +03:00
Lucian Tuca
7c466b7c82 ATS-238 2019-02-05 13:55:43 +00:00
Lucian Tuca
86bb0210e0 ATS-227 : PoC: Improve scaling/performance of transforms via T-Engine queues? 2019-01-11 08:49:10 +00:00
Cezar Leahu
24611ea42e ATS-176 : Log improvements with Slf4j 2018-11-13 15:22:38 +00:00
Cezar.Leahu
77326ad759 ATS-175 : JavaDoc 2018-10-30 18:17:54 +02:00
Cezar.Leahu
ad4ea1574e ATS-175 : Replace HTTP numeric codes with constants 2018-10-30 16:11:10 +02:00
Cezar.Leahu
36ad687f81 ATS-178 : Fix bean initialization errors 2018-10-29 16:42:38 +02:00
Cezar.Leahu
d85c03d362 ATS-175 : T-Engine code cleanup 2018-10-26 16:38:09 +03:00
DenisGabriela
8e7e775eef ATS-126 : Expose JVM metrics - eg. CPU & Memory (within jvm process / container)
- addressed review comments
2018-10-15 16:08:47 +03:00
DenisGabriela
6a7bbb3c4e ATS-126 : Expose JVM metrics - eg. CPU & Memory (within jvm process / container)
- added pod name tag on metrics
2018-10-15 12:41:42 +03:00
Denis Ungureanu
dbedbce8c6 ATS-68 : ATS-16: Fix error status code mapping for expected invalid requests
- fixed code after updating the transform-data-model version (ATS-70)
2018-08-21 15:11:22 +03:00
Denis Ungureanu
8b9593b2bd ATS-68 : ATS-16: Fix error status code mapping for expected invalid requests
- used MediaType.TEXT_PLAIN_VALUE in test instead of "text/plain"
2018-08-21 14:43:39 +03:00
Denis Ungureanu
9fdafcb60f ATS-68 : ATS-16: Fix error status code mapping for expected invalid requests
- updated tests
   - added negative test for 400 reply
2018-08-21 14:05:31 +03:00
Lucian Tuca
dda632c7a5 feature/ATS-16 2018-08-17 09:32:25 +01:00
Andreea Nechifor
a011c2ca39 REPO-3626: changes after review. 2018-07-24 14:37:40 +03:00
Andreea Nechifor
4d2d4acce7 REPO-3626: added a new parameter notExtractBookmarksText 2018-07-24 10:59:03 +03:00
Alan Davis
82c5e3e96a REPO-3425 Transformers: Tika based transformers 2018-06-28 13:25:01 +01:00