8 Commits

Author SHA1 Message Date
Alan Davis
819988c518 ALF-12273: Merge V4.0-BUG-FIX to HEAD
33119: Merge V3.4-BUG-FIX (3.4.8) to V4.0-BUG-FIX (4.0.1)
      33099: ALF-10412 Nonreducing 100% CPU Uploading Large Files to Share Site Document Library
      ALF-10976 Excel files bigger than 2mb cause soffice.exe to take 100% of one CPU for more than 2 minutes in previews.
         - Polish TransformerDebug
         - Better config for txt and xlsx to swf   
      33095: ALF-10412 Nonreducing 100% CPU Uploading Large Files to Share Site Document Library
      ALF-10976 Excel files bigger than 2mb cause soffice.exe to take 100% of one CPU for more than 2 minutes in previews.
         - Improvements to TransformerDebug so that calls to getTransformers use trace rather than debug level logging
           allowing one to see the wood for the trees  
      33016: ALF-10412 Nonreducing 100% CPU Uploading Large Files to Share Site Document Library
      ALF-10976 Excel files bigger than 2mb cause soffice.exe to take 100% of one CPU for more than 2 minutes in previews.
         - fix build errors - may not get all of them as not tested on Linux
      33005: ALF-10412 Nonreducing 100% CPU Uploading Large Files to Share Site Document Library
      ALF-10976 Excel files bigger than 2mb cause soffice.exe to take 100% of one CPU for more than 2 minutes in previews.
         - Disable transformers if the source txt or xlsx is too large - avoids transforms that don't finish
           txt limit is 5MB
           xlsx limit is 1MB
         - Added a general 2 minute timeout (ignored by JOD transformers, which already have a 2 minute timeout,
           and by OpenOffice transformers, which would require more work)
         - Previous commit already limited txt -> pdf -> png so that only the 1st pdf page was created when creating a doclib icon
         - Earlier commit already reduced the priority of the background Thread used for transformations so that user interaction
           did not suffer.
      33004: ALF-10412 Nonreducing 100% CPU Uploading Large Files to Share Site Document Library
      ALF-10976 Excel files bigger than 2mb cause soffice.exe to take 100% of one CPU for more than 2 minutes in previews.
         - Added time, size and page limits to transformer configuration to allow one to avoid
           costly transformations and to stop them if they do occur. Limits avoid a transformer being
           selected if the source is too large, or make it throw an Exception or discard data after a given
           time, KBytes read or pages created.
         - Page limits currently only used by TextToPdfContentTransformer for thumbnail (icon) creation.
         - Time, Size and Page limits are currently ignored by JodContentTransformer and OpenOfficeContentTransformerWorker
           once running but the max source size limits may be used to avoid the selection of the transformer in the first
           place.
         - TransformerDebug added to be able to see what is going on. A real eye opener!
           log4j org.alfresco.repo.content.transform.TransformerDebug
      32136: ALF-10412 Nonreducing 100% CPU Uploading Large Files to Share Site Document Library
         Reducing the priority of the async thread pool that is used to perform the transformations so that normal activity (and even garbage collection) is not interrupted by transformations. 
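
As a rough illustration of the source size limits quoted under 33005 (txt 5MB, xlsx 1MB), the Java sketch below shows one way a transformer could be skipped when the source is too large. The class name SourceSizeLimits, the method isTransformable, the mimetype keys and the inclusive comparison are all assumptions made for this example. None of them are taken from the Alfresco code base.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: per-mimetype maximum source sizes used to skip a transformer. */
class SourceSizeLimits {
    // Limits in KBytes, keyed by source mimetype. The values mirror the
    // commit message (txt 5MB, xlsx 1MB); the keys are illustrative.
    private final Map<String, Long> maxSourceSizeKBytes = new LinkedHashMap<>();

    SourceSizeLimits() {
        maxSourceSizeKBytes.put("text/plain", 5L * 1024);
        maxSourceSizeKBytes.put(
                "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet", 1L * 1024);
    }

    /**
     * Returns true if a transformer may be selected for a source of the given
     * mimetype and size. Mimetypes without a configured limit are always allowed.
     * Treating a source exactly at the limit as allowed is an assumption.
     */
    boolean isTransformable(String mimetype, long sourceSizeBytes) {
        Long limitKBytes = maxSourceSizeKBytes.get(mimetype);
        if (limitKBytes == null) {
            return true; // no limit configured for this mimetype
        }
        return sourceSizeBytes <= limitKBytes * 1024;
    }

    public static void main(String[] args) {
        SourceSizeLimits limits = new SourceSizeLimits();
        long sixMegabytes = 6L * 1024 * 1024;
        System.out.println("6MB txt selectable: "
                + limits.isTransformable("text/plain", sixMegabytes));
        System.out.println("6MB pdf selectable: "
                + limits.isTransformable("application/pdf", sixMegabytes));
    }
}
```

Checking the size before selecting a transformer avoids starting a transformation that is unlikely to finish, which is the stated aim of the limits. The time and page limits would need separate handling while the transformation is running.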


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@33223 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2012-01-13 17:25:32 +00:00
Derek Hulley
f2eab4d8d9 Merged DEV/SWIFT to HEAD
28029: Added more tests for PublishingEventHelper and PublishingQueueImpl. Also added WebPublishingTestSuite.
   28034: Support for ALF-8792: RSOLR 036: SOLR APIs to support index integrity checking
          - ACL and ACLTX support
   28036: WCM QS ML UI tweaks for marking something as the initial translation
   28038: ALF-8548: WPUB: F165: Foundation API: Cancel a scheduled publishing event
          - Code and initial test cases
   28051: Fix for ALF-8836: No permission checks for SolrJSONResultSet
   28057: WCM QS ML support for claiming intermediate non-translated folders when translating documents, with tests
   28058: ML-WQS: Slight refactoring to remove RootNavInterceptor.
          This functionality has been brought into the ApplicationDataInterceptor.
          The effective root section is now made available to templates and components in the model.
   28059: ALF-8499. SVC 10: Action Forms.
          This checkin adds an ActionFormProcessor which supports the generation and persistence of Forms based on
          Alfresco spring-injected action beans. The form processor produces a form field for each defined action parameter
          as well as the ubiquitous executeAsynchronously boolean for action execution.
          There is no styling or configuration of these forms and therefore NodeRef parameters will allow selection of any
          cm:cmobject nodes and action constraints like ac-aspects will return every aspect defined in the system.
          To expose these forms in the product, we would need to add form configuration for the built-in actions in order to manage and control such data.
   28064: Fix for ALF-8857: Fix SOLR query caching to respect locale for ordering
   28067: ALF-8846 : Intermittent: DMDeploymentTargetTest
          added more debug logging and throw an explicit exception on trying to create a duplicate directory.
   28068: Publishing: Tidy-up (javadoc and removal of a few unnecessary operations) prior to sprint 1 demo.
   28069: Implemented EnvironmentImpl.checkStatus() method. Also created an AbstractWebPublishingIntegrationTest
          and extended many of the web publishing tests from this class.
   28076: Publishing: More javadoc
   28078: RINF 11: Canned queries
          - minor: rename CannedQuery "query" to "queryAndFilter" and update/fix related JavaDoc (ALF-8827)
          - update PagingRequest - precursor to merge with (Script) PagingDetails (ALF-8855)
   28079: RINF 40: Lucene Removal: PersonService API (ALF-8805) - W.I.P.
          - add GetChildren CQ support for (initially string) property filtering, including unit tests
          - update GetChildren CQ to allow up to three filter and/or sort props
          - add GetChildren CQ unit test for existing DB-based filtering of child types
          - fix GetChildren CQ sorting, for spoofed referenceable props (including missing name)
   28083: Fix for ALF-8858: Fix cache bugs (TX and ACLTX docs not tracked)
   28097: Fix hard-coded checks for aspect counts following sys:localized changes
   28126: Build/test fix (GetChildrenCannedQueryTest.testPropertyStringFiltering)
   28147: RINF 40: Lucene Removal: PersonService API
          - initial impl w/ unit tests
          - note: separate task required to update JavaScript API (People.getPeople)
   28157: RINF 40: Lucene Removal: PersonService API (ALF-8805)
          - fix People.getPeople - put back FTS option (pending ALF-8924)
   28162: Added PublishWebContentJbpmTest to test the Jbpm publish web content process definition.
   28178: Build fix. Removing a trailing comma that my ant build objects to.
   28180: Preventing a NPE within TikaCharsetFinder. Was observed as part of tests for ALF-3757.
   28182: RSOLR 037: Integrate CMIS Dictionary into SOLR engine
   28183: ALF-8846 - fix DMDeploymentTarget(Test)
          - make system auth explicit
          - minor: fixup debug logging
   28187: Fix for ALF-7308. The imgpreview thumbnail ... scale up small images...
          I've exposed an ImageMagick configuration option ('>') as a new ImageRenderingEngine parameter, "allowEnlargement".
          It's not mandatory, defaults to true, and is set to false for doclib and imgpreview thumbnails.
          The net result is that doclib and imgpreview thumbnails of small graphics files (e.g. icons) will never have sizes exceeding their original size.
   28191: RINF 09: Update FileFolderService (ALF-7168)
          - minor: clean-up debug/trace logging
   28192: Fix MT for GetChildren CQ
          - FileFolderService -> list (ALF-7168)
          - PersonService -> getPeople (ALF-8805)
   28194: RINF 09: CMIS getChildren sorting fixes (part of ALF-7168)
          - fix sorting by cmis:contentStreamMimeType and/or cmis:contentStreamLength
          - add warning + debug (if some orderBy sort props need to be ignored - eg. too many or unknown)
          - reviewed w/ Florian
   28195: ALF-8910: RSOLR 037: Integrate CMIS Query Parser into SOLR engine
   28211: Changes for ALF-8646: "RINF 38: Text data encryption"
   28227: Changes for ALF-8646: "RINF 38: Text data encryption"
          o fix build issue relating to missing property definition
   28232: ALF-8928 - Performance degradation when loading documents from RepoStore
   28233: Attempt to resolve OOM hangs in SWIFT builds
          - Set mem.size.max=2048m
   28234: Implementation of ALF-8986. Add support for transformation of Apple iWorks files.
             A new transformer transforms (pages, numbers, keynote) iWorks 09 files to image or SWF for doclib & webpreview thumbnailing.
             This transformer extracts an embedded JPEG or PDF file from a well-known location within the iWorks zip structure & uses that
             to create Alfresco thumbnails. If these zip entries are not present for whatever reason, then the transformation fails in the usual way.
             All of our iWorks 09 test files have an embedded JPEG and more than half have embedded PDFs.
   28243: Init/refresh repo webscripts in single txn
          - found whilst investigating ALF-8928
   28268: Started implementing PublishEventAction. Also updated mapping of nodes from source to live environment to use associations.
   28308: PublishEventAction now supports updating of nodes that have already been published.
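
To illustrate the approach described under 28234 (pulling an embedded JPEG or PDF out of the iWorks zip structure), here is a minimal Java sketch using only java.util.zip. The entry names QuickLook/Thumbnail.jpg and QuickLook/Preview.pdf are an assumption about the "well-known location", and the class EmbeddedPreviewExtractor and its extractEntry method are hypothetical names. This is not the transformer added in that revision.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

/** Hypothetical sketch: read an embedded preview out of a zip-based document. */
class EmbeddedPreviewExtractor {
    // Assumed entry names; the commit message only says "a well-known location".
    static final String JPEG_ENTRY = "QuickLook/Thumbnail.jpg";
    static final String PDF_ENTRY = "QuickLook/Preview.pdf";

    /**
     * Returns the bytes of the named zip entry, or null if the entry is absent.
     * The supplied stream is closed when this method returns.
     */
    static byte[] extractEntry(InputStream zipStream, String entryName) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(zipStream)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.getName().equals(entryName)) {
                    continue; // not the entry we are looking for
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buffer = new byte[8192];
                int read;
                while ((read = zip.read(buffer)) != -1) {
                    out.write(buffer, 0, read);
                }
                return out.toByteArray();
            }
        }
        return null; // a caller would treat this as a failed transformation
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java EmbeddedPreviewExtractor <file>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            byte[] jpeg = extractEntry(in, JPEG_ENTRY);
            System.out.println(jpeg == null
                    ? "no embedded JPEG found"
                    : "embedded JPEG bytes: " + jpeg.length);
        }
    }
}
```

Returning null for a missing entry matches the note that the transformation fails when the zip entries are not present. A real transformer would report that failure through its own exception type.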


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@28321 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2011-06-10 00:21:55 +00:00
Nick Burch
325f8e7923 Tika content transformer support for OOXML office
Enable explicit Tika content transform for OOXML files
Allow the Excel transformer (which does CSV as well as text/html) to handle .xlsx as well as .xls
Also update the .doc parser test to ensure that the older word 6 and word 95 files are correctly handled too


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@20781 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2010-06-23 15:51:03 +00:00
Nick Burch
228d111c56 More Tika content transform updates
New POI-general converter, for things other than excel, and convert the PDF converter too.
The POI-excel converter now does CSV properly, and notes exist for the Text mining converter on the Tika bits needed before it can be replaced.


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@20780 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2010-06-23 14:27:10 +00:00
Nick Burch
f3a7a0aa7c Initial Tika support for Text content transforms
The POI HSSF transformer has been updated to use Tika. A Tika auto-detect
 transformer has also been added, which caters for a large number of 
 previously un-handled cases. Unit tests check this.


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@20769 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2010-06-23 11:40:17 +00:00
Nick Burch
45c757fee8 Add metadata extractor support for .dwg files (ALF-2262)
The code for extracting .dwg files has been contributed to Apache Tika, and the Alfresco metadata extractor calls into Tika to have the work done. We retain our own tests of this, however.


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@19927 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2010-04-21 10:17:11 +00:00
Neil McErlean
de612572d9 Proper fix for unreported issue with OOo-based extraction of Office 07 metadata.
Added a new metadata extractor based on POI for docx, xlsx and pptx mime types.
Changed OpenOfficeMetadataExtracter so that it no longer supports these mime types.
Added the new test code to ContentMinimalContextTestSuite

Some tidying up of code in AbstractMetadataExtracterTest and OpenOfficeMetadataExtracter to reflect the fact that this extractor does not handle these mime types any more.


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@19792 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2010-04-09 12:10:06 +00:00
Nick Burch
21b6c8cf10 Tweak the minimal context to hopefully work on the build machine too, and then re-enable tests + combine one suite
git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@19122 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2010-03-08 14:23:51 +00:00