17 Commits

Author SHA1 Message Date
Nick Burch
01d0f43ab3 Add quick example files for tar
git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@22848 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2010-10-04 12:18:42 +00:00
Neil McErlean
ffdef0597c Implementation of ALF-5066. Support for thumbnailing of .eps files.
Had to relax AbstractImageMagickContentTransformerWorker's restriction to 'image/*' mimetypes to allow 'application/eps'.
  Added application/eps to the MimetypeMap.
  Added application/eps to the view and edit modes of the mimetype.ftl forms control.


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@22838 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2010-10-04 10:50:15 +00:00
Nick Burch
0b01fe9a3a Add sample office files with images and other office files embedded in them
git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@22347 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2010-09-08 16:18:17 +00:00
Nick Burch
4847620d2e More .doc{x} -> html support - basic Tika conversion to HTML now enabled (lacks some of the required elements), image extraction remains stubbed out
git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@22335 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2010-09-08 13:31:31 +00:00
Nick Burch
d2c1cc78e5 Add cm:geographic Aspect, which has cm:latitude and cm:longitude, and update the Tika auto parser to map to this (plus tests)
git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@20925 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2010-07-02 14:57:58 +00:00
Nick Burch
228d111c56 More Tika content transform updates
New POI-general converter, for things other than excel, and convert the PDF converter too.
The POI-excel converter now does CSV properly, and notes exist for the Text mining converter on the Tika bits needed before it can be replaced.


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@20780 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2010-06-23 14:27:10 +00:00
Neil McErlean
0aa0184907 iDay. Indexing and WebPreviewing of Archive (currently zip) files.
I've added 2 transformers: zip to text/plain and zip to pdf.
This means that zip files will be indexable and therefore searchable. They will also now have webpreviews.

Each transformer outputs the names of the entries in the zip file. The webpreview therefore shows a listing of the zip contents (not recursive for zips within zips), and searches match entry names within zips but not the content of those entries.

Also added a test class and a quick.zip file for testing.
These changes required some extension points in AbstractContentTransformerTest to support the zip transformation testing.


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@20580 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2010-06-10 11:24:23 +00:00
Nick Burch
45c757fee8 Add metadata extractor support for .dwg files (ALF-2262)
The code for extracting .dwg files has been contributed to Apache Tika, and the Alfresco metadata extractor calls into Tika to have the work done. We retain our own tests of this, however.


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@19927 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2010-04-21 10:17:11 +00:00
Derek Hulley
7a18e7e52b Removed svn:executable tag
git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@19133 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2010-03-09 00:39:21 +00:00
Nick Burch
1dbaca0890 More test files for text and metadata extractor unit tests
git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@18455 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2010-02-04 14:54:46 +00:00
Derek Hulley
54d7208f7b Fixed AR-1321: Allow '&' in filename
The following filename is valid now: "x ¬ £ % & + ; x.txt"
This was a restriction imposed by WebDAV, but the encoding of the responses is working well and these restrictions can be removed as a result.

Fixed AR-1281: WebDAV upload was assigning incorrect encoding

I added a bean 'charset.finder', which can be fetched from the MimetypeService.
Various plugins now exist to decode a stream and figure out what the encoding is.
WebDAV and CIFS/FTP are now hooked into this so that they guess a little better.

Fixed others:
Added retrying transactions to WebDAV.
Read/write transactions for WebDAV.


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@6073 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2007-06-22 21:27:17 +00:00
Kevin Roast
04a78f17d2 Outlook email messages (in OLE2 .msg format) now converted to text for full-text indexing.
JUnit test for new transformer class.
Added new test to ContentTestSuite.

git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@5951 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2007-06-14 10:51:51 +00:00
Derek Hulley
f03f95325a Upgraded OpenDocumentMetadataExtracter to new infrastructure.
Added more OpenDocument test documents.


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@5690 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2007-05-16 10:27:36 +00:00
Derek Hulley
f3658f2a58 WordPerfect icons and mimetype
git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@2546 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2006-03-14 21:13:34 +00:00
Derek Hulley
d252748bbe More fine-grained access to mimetype config XML. Added more mimetypes to OpenOffice handlers
git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@2225 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2006-01-27 14:34:45 +00:00
Derek Hulley
119ac044d4 Enabled *.ods, *.sdc and *.sxc conversion to PDF and therefore to TXT
git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@2223 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2006-01-26 18:27:52 +00:00
Derek Hulley
e1e6508fec Moving to root below branch label
git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@2005 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2005-12-08 07:13:07 +00:00