19 Commits

Author SHA1 Message Date
Nick Burch
62f07a8661 Complete initial Tika-ification of the metadata extractor
The remaining extractors have now been converted to Tika, tests have been
included for the image metadata extraction, and some extension points for
future extractors have been created.


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@20669 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2010-06-16 16:19:38 +00:00
Neil McErlean
de612572d9 Proper fix for unreported issue with OOo-based extraction of Office 07 metadata.
Added a new metadata extractor based on POI for docx, xlsx and pptx mime types.
Changed OpenOfficeMetadataExtracter so that it no longer supports these mime types.
Added the new test code to ContentMinimalContextTestSuite

Some tidying up of code in AbstractMetadataExtracterTest and OpenOfficeMetadataExtracter to reflect the fact that this extractor does not handle these mime types any more.


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@19792 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2010-04-09 12:10:06 +00:00
Kevin Roast
1c897ae1fb Latest SpringSurf libraries:
 - Cleanup and improvements to RequestContext related classes.
 - Removal of obsolete Alfresco util classes.
Fixed up imports back to Alfresco versions of unused SpringSurf util classes

git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@19322 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2010-03-16 19:06:54 +00:00
Paul Holmes-Higgin
cefda8c965 Updated header to LGPL
git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@18931 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2010-03-01 22:48:39 +00:00
Paul Holmes-Higgin
43e93f3c14 Updated header to LGPL
git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@18926 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2010-03-01 22:09:17 +00:00
Nick Burch
bd1e3edf76 Update metadata extractors - Outlook, MP3, Mail and PDF improvements, and increase test coverage
git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@18454 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2010-02-04 14:42:45 +00:00
Kevin Roast
b726c4d6db Merged DEV/TEMPORARY to HEAD
   17667: Branch for SpringSurf integration - from HEAD r17665
   17668: Fix to ensure included script files are not loaded from a cached classpath loader.
   17670: Part 1 of SpringSurf integration - changes relating to spring-surf-core-1.0.0.CI-SNAPSHOT.jar
   17674: Part 2 of SpringSurf integration - changes relating to spring-surf-core-configservice-1.0.0.CI-SNAPSHOT.jar

git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@17788 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2009-12-14 13:41:05 +00:00
Derek Hulley
1278771931 Merged V3.2 to HEAD
   15200: Fixed ETHREEOH-2480: Cannot upload files with .doc extension owing to exception from ContentMetadataExtracter
   15203: ETHREEOH-2246: Rework patch.authorityMigration to be more performant and report progress
   15254: ETHREEOH-2339: Fixed NullPointerException when editing LDAP-synced user who doesn't have an email address
   15267: Applied patch for ETHREEOH-1448: Exception when using the move up/down button when editing a web form
   15270: Mereg (sic) - Record only 15238
   15285: Merged V3.1 to V3.2
      15281: Merged V2.2 to V3.1
         15280: Absorb all metadata exceptions at all levels
      15282: Merged V2.2 to V3.1
         15273: Merged V3.2 to V2.2
            15200: Fixed ETHREEOH-2480: Cannot upload files with .doc extension owing to exception from ContentMetadataExtracter
-------------------------
Modified: svn:mergeinfo
   Merged /alfresco/BRANCHES/V2.2:r15273,15280
   Merged /alfresco/BRANCHES/V3.1:r15281-15282
   Merged /alfresco/BRANCHES/V3.2:r15200,15203,15254,15267,15270,15285



git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@16854 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2009-10-13 10:57:47 +00:00
Derek Hulley
f31523d048 Javadoc tweaks while investigating code changes
git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@13866 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2009-04-07 00:15:48 +00:00
Dave Ward
a166add97f 3rd Party Service admin (OpenOffice, SWFTools, ImageMagick)
- All supporting classes moved out to thirdparty subsystem
- OpenOffice service automatically started if available
- All utility locations editable via JMX (and subsystem can be reinitialized with new values without rebooting Tomcat)
- New ContentTransformerWorker interface introduced in order to allow separation between ContentTransformer registry and third party utilities
- Existing JMX query capabilities preserved


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@13860 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2009-04-06 16:31:33 +00:00
Derek Hulley
1f3aabc6a0 Merged V2.1 to HEAD
   6455: OpenOffice transformer and extractor register regardless of the initial connection state.
   6456: Fix for WCM-636 (Clicking OK twice while deleting web project results in exception)
   6457: Updated installers and associated config
   6458: AR-1669 Add getQnamePath to Javascript
   6459: Fix for AWC-1456 - Word and Excel documents were being stored as octet streams rather than their correct mimetype
   6460: Reverse order of reject & approve transitions, so that approve appears first in list of ui actions.
   6461: Removed Process.exe (often detected as a virus) and updated config wizard.
   6462: Switch to synchronous indexing for AVM by default
   6463: Better support to query the state of AVM indexes
   6464: Added Office 2007 document mimetypes and icons
   6465: Added Office 2007 icons without the typo this time


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@6736 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2007-09-10 22:41:44 +00:00
Derek Hulley
91c962aae5 Fixed recursive initialization of OpenOfficeMetadataExtracter.
Fixed minor incorrect warning when XMLMetadataExtracter is active.


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@6279 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2007-07-17 16:30:44 +00:00
Derek Hulley
8288d99e98 Final fix for AR-357: Metadata extractors are configurable
git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@6246 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2007-07-13 15:35:58 +00:00
Derek Hulley
0c10d61a48 Merged V2.0 to HEAD
   svn merge svn://svn.alfresco.com:3691/alfresco/BRANCHES/V2.0@5141 svn://svn.alfresco.com:3691/alfresco/BRANCHES/V2.0@51352 .
      - FLOSS
      - Some files will need a follow-up
         -root/projects/repository/source/java/org/alfresco/repo/avm/wf/AVMRemoveWFStoreHandler.java (not yet on HEAD: 5094)
         -root/projects/repository/source/java/org/alfresco/filesys/server/state/FileStateLockManager.java (not yet on HEAD: 5093)
         -onContentUpdateRecord (not on HEAD)


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@5167 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2007-02-16 06:44:46 +00:00
Paul Holmes-Higgin
31c250682b Changed licence headers
git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@5081 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2007-02-08 18:59:58 +00:00
Derek Hulley
cfb373ae36 Merge 1.4 to HEAD
   svn merge svn://svn.alfresco.com:3691/alfresco/BRANCHES/V1.4@4340 svn://svn.alfresco.com:3691/alfresco/BRANCHES/V1.4@4350 .
   svn resolved root\projects\3rd-party\.classpath
   svn resolved root\projects\repository\source\java\org\alfresco\repo\workflow\WorkflowInterpreter.java
   svn merge svn://svn.alfresco.com:3691/alfresco/BRANCHES/V1.4@4379 svn://svn.alfresco.com:3691/alfresco/BRANCHES/V1.4@4380 .
   svn merge svn://svn.alfresco.com:3691/alfresco/BRANCHES/V1.4@4420 svn://svn.alfresco.com:3691/alfresco/BRANCHES/V1.4@4421 .
   svn resolved root\projects\3rd-party\.classpath


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@4655 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2006-12-19 14:24:45 +00:00
Kevin Roast
5a513ea900 corrected copyright and author
git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@3394 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2006-07-25 09:23:36 +00:00
Kevin Roast
e31e027039 . Outlook email format meta-data extractor
  - expects .msg files in native Outlook format
  - uses POI library for the parsing of the horrid OLE2 compound document format
  - extracts addressee(s), sent date and originator email address
  ...for the future - could be modified and used as a transformer to allow full-text indexing of Outlook format emails

. Add new aspect "emailed" to the contentmodel to support properties for above extractor

git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@3387 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2006-07-24 15:05:48 +00:00
Derek Hulley
2c2816a39b Upgraded JOOConverter to V2.0.0
 - Fixes AR-505
 - OpenOffice transformations are config driven
 - Incorporated WordPerfect transformations


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@3367 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2006-07-21 14:14:11 +00:00