Commit Graph

23 Commits

Author SHA1 Message Date
Alex Mukha
17f9b1cf00 REPO-1525: PdfBoxMetadataExtracterTest failures on all DBs (including main PostgreSQL build)
- Committed whitespaces to satisfy license checker...


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/BRANCHES/DEV/5.2.N/root@132695 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2016-11-10 22:28:24 +00:00
Alex Mukha
52d8d9cf59 REPO-1525: PdfBoxMetadataExtracterTest failures on all DBs (including main PostgreSQL build)
- Moved the concurrent test to a separate class - ConcurrencyPdfBoxMetadataExtracterTest
   - It is now utilizing an overridden extractor with a configurable timeout.


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/BRANCHES/DEV/5.2.N/root@132690 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2016-11-10 18:46:46 +00:00
Alex Mukha
c6f82e5489 REPO-1525: PdfBoxMetadataExtracterTest failures on Oracle, MSSQL Server, DB2
- Another Java8 to Java7 compilation fallback.


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/BRANCHES/DEV/5.2.N/root@132557 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2016-11-08 13:14:27 +00:00
Alex Mukha
2b129ae1b6 REPO-1525: PdfBoxMetadataExtracterTest failures on Oracle, MSSQL Server, DB2
- Removed Java8 includes, those are not necessary and break compilation on Java7


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/BRANCHES/DEV/5.2.N/root@132552 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2016-11-08 12:49:06 +00:00
Alex Mukha
815cc0ed95 REPO-1525: PdfBoxMetadataExtracterTest failures on Oracle, MSSQL Server, DB2
- Another attempt to stabilize the test.


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/BRANCHES/DEV/5.2.N/root@132501 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2016-11-08 10:03:45 +00:00
Alex Mukha
ff553f3376 REPO-1525: PdfBoxMetadataExtracterTest failures on Oracle, MSSQL Server, DB2
- An attempt to fix the test (refactored the checks and timeouts)


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/BRANCHES/DEV/5.2.N/root@132496 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2016-11-07 18:45:24 +00:00
Andreea Dragoi
c95cbaccd9 MNT-16709 : Metadata extraction on 200MB PDF file causes large heap utilization
- added concurrent extraction limit
   - added max document size limit

git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/BRANCHES/DEV/5.2.N/root@131709 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2016-10-24 12:13:50 +00:00
Derek Hulley
8991f629ee Remove some deprecated methods and classes in the MetadataExtracter hierarchy
git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/BRANCHES/DEV/5.2.N/root@130493 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2016-09-08 16:02:39 +00:00
Alexandra Leahu
fc20674988 Merged 5.1.N (5.1.2) to 5.2.N (5.2.1)
125892 adragoi: Merged 5.0.N (5.0.4) to 5.1.N (5.1.2)
      125842 rmunteanu: Merged V4.2-BUG-FIX (4.2.7) to 5.0.N (5.0.4) (PARTIAL MERGE)
         125700 adavis: Merged V4.2.5 (4.2.5.7) to V4.2-BUG-FIX (4.2.7)
            125698: Merged DEV to V4.2.5 (4.2.5.7)
               125677 arebegea: MNT-15219 : Excel (.xlsx) containing xmls (shapes/drawings) with multi byte characters may cause OutOfMemory in Tika
                  - Should not have updated version.properties as the original commit needs to be merged forwards.
            125696: Merged DEV to V4.2.5 (4.2.5.7)
               125677 arebegea: MNT-15219 : Excel (.xlsx) containing xmls (shapes/drawings) with multi byte characters may cause OutOfMemory in Tika
                  - Modified tika parser and tika core jars to allow some configuration parameters to be sent from Alfresco side using the metadata map parameter
                  - Excluded by default the parsing of drawings/shapes xmls because there was little valuable data that could be extracted from those xmls


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/BRANCHES/DEV/5.2.N/root@126004 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2016-04-29 11:36:11 +00:00
Raluca Munteanu
6afb44e712 Merged 5.1.N (5.1.2) to 5.2.N (5.2.1)
125606 rmunteanu: Merged 5.1.1 (5.1.1) to 5.1.N (5.1.2)
      125515 slanglois: MNT-16155 Update source headers - add new Copyrights for Java and JSP source files + automatic check in the build


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/BRANCHES/DEV/5.2.N/root@125788 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2016-04-26 13:45:01 +00:00
Raluca Munteanu
8674e2bfc8 Merged 5.1.N (5.1.2) to 5.2.N (5.2.1)
125603 rmunteanu: Merged 5.1.1 (5.1.1) to 5.1.N (5.1.2)
      125484 slanglois: MNT-16155 Update source headers - remove old Copyrights from Java and JSP source files


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/BRANCHES/DEV/5.2.N/root@125781 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2016-04-26 12:48:49 +00:00
Andreea Dragoi
5d13806c7e Merged 5.1.N (5.1.2) to 5.2.N (5.2.1)
124313 adragoi: Merged 5.0.N (5.0.4) to 5.1.N (5.1.2)
      124244 abalmus: MNT-15497 : Keyword tags generated from metadata extraction are formed into a single string rather than split on delimiter
         - Fixed tag separation on delimiter
         - Enhanced existing test


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/BRANCHES/DEV/5.2.N/root@124364 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2016-03-22 14:32:10 +00:00
Alan Davis
4314768f44 Merged 5.1.N (5.1.1) to HEAD (5.1)
119696 rmunteanu: Merged 5.0.N (5.0.4) to 5.1.N (5.1.1)
      119612 amorarasu: Merged V4.2-BUG-FIX (4.2.6) to 5.0.N (5.0.4)
         119559 ragauss: MNT-13919: Check for Metadata Embed Support is Incorrect
           - Added unit test


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@123602 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2016-03-11 21:26:32 +00:00
Tatyana Valkevych
7dd2291753 Merged HEAD-BUG-FIX (5.1/Cloud) to HEAD (5.1/Cloud)
107541: Merged 5.0.N (5.0.3) to HEAD-BUG-FIX (5.1/Cloud) (PARTIAL MERGE)
      107413: Merged DEV to 5.0.N (5.0.3)
         106858 : MNT-13545: JavaDoc : Inconsistencies between the Java doc and the actual code
            - Cleaning of Javadoc
   107565: MNT-13545 Fix compilation after merge of Javadoc


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@107633 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2015-07-02 16:13:03 +00:00
Alan Davis
9552852b91 Merged HEAD-BUG-FIX (5.1/Cloud) to HEAD (5.1/Cloud)
104496: Merged 5.0.N (5.0.2) to HEAD-BUG-FIX (5.1/Cloud)
      104336: Merged NESS/5.0.N-2015_03_23 (5.0.2) to 5.0.N (5.0.2)
         103763: MNT-13920 - rewrite the image dimension properties if any EXIF dimension information is available
         104332: MNT-13920 - code changes based on review, improved javadoc and slight modifications on the extract size method


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@104607 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2015-05-20 09:54:36 +00:00
Alan Davis
f70dc05311 Merged HEAD-BUG-FIX (5.1/Cloud) to HEAD (5.1/Cloud)
100990: Merged 5.0.N (5.0.2) to HEAD-BUG-FIX (5.1/Cloud)
      100834: Merged V4.2-BUG-FIX (4.2.5) to 5.0.N (5.0.2)
         100784: Merged DEV to V4.2-BUG-FIX (4.2.5)
            100732: MNT-13655 : Just first keyword of the IPTC keywords list is extracted as metadata and put into description field of an image
               - Added special handling for multi-valued metadata properties retrieved from the parser.


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@101005 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2015-04-01 01:27:45 +00:00
Alan Davis
862e07f3e2 Merged HEAD-BUG-FIX (5.0/Cloud) to HEAD (5.0/Cloud)
84058: Merged V4.2-BUG-FIX (4.2.4) to HEAD-BUG-FIX (5.0/Cloud)
      83799: MNT-12238: Merged DEV 4.2-BUG-FIX (4.2.4) to V4.2-BUG-FIX (4.2.4)
         MNT-12238: Merged 4.1-BUG-FIX (4.1.10) to V4.2-BUG-FIX (4.2.4)
            80291: Merged V4.1.6 (4.1.6.21) to V4.1-BUG-FIX (4.1.10)
               77378: Merged DEV PATCHES/V4.1.6 (19) to PATCHES/V4.1.6 (20)
                  76649: MNT-11823: Upload of PPTX causes very high memory usage leading to system instability
                     - The patch from MNT-577 has been combined with new changes to avoid hanging while analyzing complicated PPTX documents. The fix simply disables reading the entire contents of a complicated document. The POI metadata extractor may be switched to standard behavior or reconfigured using the following new properties: content.transformer.Poi.poiFootnotesLimit, content.transformer.Poi.poiExtractPropertiesOnly and content-services-context.xml/extracter.Poi/poiAllowableXslfRelationshipTypes
                  77379: MNT-11823: Upload of PPTX causes very high memory usage leading to system instability
                     A test and test data for MNT-577 have been added. A test for MNT-11823 has also been added, but it is commented out because the test data (an appropriate PPTX document) is not currently available. Getters for POI-specific properties have been added to 'PoiMetadataExtracter' for tests. The 'afterPropertiesSet()' logic has also been slightly modified to allow setting the 'poiExtractPropertiesOnly' parameter to 'false'
                  77561: MNT-11823: Upload of PPTX causes very high memory usage leading to system instability
                     Fix for https://bamboo.alfresco.com/bamboo/browse/HF-PATCH416-126 build failure. POI extractor and transformer properties of 'AlfrescoPoiPatchUtils' have been isolated from each other using contexts. Each extractor or transformer now has its own context or uses the default context. Properties of the default context allow parsing the entire contents of XLSF documents, and the footnotes limit is 50. Property names have not been changed, but currently 'content-services-context.xml/extracter.Poi/poiAllowableXslfRelationshipTypes=null' does not lead to 'content.transformer.Poi.poiExtractPropertiesOnly=false', i.e., this list may be empty. The 'PoiMetadataExtracterTest' test has been modified in accordance with the introduced changes. 'poi-OOXML-3.9-beta1-20121109.jar' has been renamed to 'poi-OOXML-3.9-beta1-20121109-patched.jar'
                  79180: MNT-12043: CLONE - Upload of PPTX causes very high memory usage leading to system instability
                     A timeout mechanism has been added to content transformers, along with timeout configuration options. A mechanism to close streams after 'TimoutException' has also been added to transformers and metadata extractors, and the timeout mechanism for input streams has been enabled in 'TikaPoweredContentTransformer'
                  79268: MNT-12043: CLONE - Upload of PPTX causes very high memory usage leading to system instability
                     Fix for the https://bamboo.alfresco.com/bamboo/browse/HF-PATCH416-133 build failure and for the comments of the review https://fisheye.alfresco.com/cru/CR-100#CFR-1184. A new test has been added in 'PoiOOXMLContentTransformerTest.testMnt12043()' to check the newly added timeout mechanism
                  79290: MNT-12043: CLONE - Upload of PPTX causes very high memory usage leading to system instability
                     - Removed methods and properties that are no longer needed
                  79327: MNT-12043: CLONE - Upload of PPTX causes very high memory usage leading to system instability
                     - Increased ADDITIONAL_PROCESSING_TIME to 1500ms to try and avoid a new intermittent test failure.
      83885: MNT-12238 Bring Maven POM file in sync with latest patched version of poi-ooxml


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@84627 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2014-09-18 17:23:49 +00:00
Alan Davis
45a331e5c9 Merged HEAD-BUG-FIX (4.3/Cloud) to HEAD (4.3/Cloud)
61242: Merged V4.2-BUG-FIX (4.2.2) to HEAD-BUG-FIX (Cloud/4.3)
      61241: Merged V4.1-BUG-FIX (4.1.8) to V4.2-BUG-FIX (4.2.2)
         61240: MetadataExtracterLimitsTest has too short a timeout (50ms to 500ms)


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@62401 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2014-02-12 15:04:12 +00:00
Alan Davis
0e7931feaf Merged HEAD-BUG-FIX (4.3/Cloud) to HEAD (4.3/Cloud)
58495: Merged V4.2-BUG-FIX (4.2.1) to HEAD-BUG-FIX (Cloud/4.3)
      58419: MNT-9888: WorkflowModelBuilderTest is susceptible to timezone issues
       - Fix unit tests that failed in build 77. 


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@61996 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2014-02-11 23:41:08 +00:00
Alan Davis
fd77ceb85f Merged HEAD-BUG-FIX (4.3/Cloud) to HEAD (4.3/Cloud)
58489: Merged V4.2-BUG-FIX (4.2.1) to HEAD-BUG-FIX (Cloud/4.3)
      58379: MNT-9888: WorkflowModelBuilderTest is susceptible to timezone issues
       - Fix unit tests, because they fail in build 75.


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@61989 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2014-02-11 23:36:11 +00:00
Alan Davis
57f7266e48 Merged HEAD-BUG-FIX (4.3/Cloud) to HEAD (4.3/Cloud)
57478: Merged V4.2-BUG-FIX (4.2.1) to HEAD-BUG-FIX (Cloud/4.3)
      57288: Merged V4.1-BUG-FIX (4.1.7) to V4.2-BUG-FIX (4.2.1)
         57264: ALF-14221: Creation and Modification Date Tests Fail in Timezones Other than BST
            - Set default time zone to Europe/London in AbstractMetadataExtracterTest
         MNT-9566: Intermittent test failure OpenDocumentMetadataExtracterTest.testSupportedMimetypes
            - Changed OpenDocumentMetadataExtracterTest to run against date objects rather than strings


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@61820 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2014-02-11 20:45:06 +00:00
Derek Hulley
0b7bceb769 Add a bit more 'wiggle room' to the timeout test, which missed the cutoff by 8ms, and fixed formatting
git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@56009 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2013-09-26 03:30:36 +00:00
Samuel Langlois
ab4ca7177f Merged HEAD-QA to HEAD (4.2) (including moving test classes into separate folders)
51903 to 54309 


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@54310 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2013-08-20 17:17:31 +00:00