Merged BRANCHES/DEV/V4.1-BUG-FIX to HEAD

45144: Fix for ALF-16790 - upload ignores additional aspects for updateNodes
   45174: Merged V4.1-BUG-FIX-2012_11_22 to V4.1-BUG-FIX
   43940: ALF-15209: Add Dashlet to User or Site Dashboard not working when IE8 is operating in Compatibility Mode. 
             Fix check of p_el.attributes["id"] to avoid errors in IE8 compatibility mode
   45175: Fix for ALF-17260 - Tags list not updated after moving/copying a high number of nodes with taggable aspect but no tags 
   45188: ALF-16254 ("Leave Site" behaviour for group based site membership)
   45204: Fix for ALF-17335 - OnCreateNodePolicy not fired when custom type is used in the Share flash upload
   45218: ALF-17248 BaseAssociationEditor.java does not return any results when the query filter consists of "firstname + lastname" 
   45221: Merged V4.1.3 (4.1.3) to V4.1-BUG-FIX (4.1.4) RECORD ONLY
             45220: Merged: V4.1-BUG-FIX (4.1.4) to V4.1.3 (4.1.3)
                44054: Fix for ALF-16337. Datalist assignee not searchable by full name.
                45218: ALF-17248 BaseAssociationEditor.java does not return any results when the query filter consists of "firstname + lastname"
   45245: ALF-17089 (Displaying Url Name instead of site Name in Select form)
   45257: ALF-17318 Unnecessary Canned Query in .getPeople(String,...) on startup.
          - Just one extra query on each run of the FeedNotifier was being made at the end.
            The sequence of queries is necessary.
   45336: Merged DEV to V4.1-BUG-FIX (4.1.4)
             45318: ALF-14086: CLONE - Sort order of folders including hyphens ( - ) are different in folder-tree and view on folders (in Share)
             Sort groups and users on the Java server side using collators.
             - Deprecated a few methods not deprecated in DEV and removed one which had just been added to 4.1.4
   45362: Merged V3.4-BUG-FIX (3.4.13) to V4.1-BUG-FIX <<RECORD ONLY>>
             45361: Merged V3.4 (3.4.12) to V3.4-BUG-FIX (3.4.13)
                45360: ALF-17431: Merged V4.1 (4.1.2) to V3.4 (3.4.12)
                   43622: ALF-16757: Sharepoint doesn't work correctly with SSO
                   - Fix by Pavel
   45385: Merged V4.1.3 (4.1.3) to V4.1-BUG-FIX (4.1.4)
             45384: ALF-17097 60k Site Performance: Admin Console | Groups | Browse Groups (include sys groups): Results do not appear.
                - Error in authorities comparator causing test failure of ALF-14086 in 4.1.4 only.
                  4.1.3 was okay as ALF-14086 now uses the change made for ALF-17097 but only in 4.1.4
   45452: Corrected config check for ALF-16413 - Share asks for Basic-Auth while not needed trying to access RSS feeds (thus breaking SSO).
   45467: Fix for ALF-17509 - patches the FreeMarker built-in ?js_string to correctly encode the "/" character.
   45468: ALF-17492 - WebScript errors must contain useful information
          - SpringSurf libs 1217 provide additional INFO log information on the HTTP method, URL+params that caused the exception.
   45475: Fix for ALF-17510 - Upgrade of htmlparser from 1.6 to 2.1
   45566: Fixed ALF-17530
          - Refactored "successCallback" & "successScope" parameters for multipart uploads to be simply "success" (same for failure)
   45574: Fixed ALF-17528
          - Asserting that request is made using application/json
   45662: Merged HEAD to BRANCHES/DEV/V4.1-BUG-FIX:
             45660: Fixes: ALF-17539 - The server was failing to parse the date. It shouldn't have been trying to parse it at all.
   45849: Merged HEAD to BRANCHES/DEV/V4.1-BUG-FIX:
             45824: Fixes: ALF-13676: Event edit times are now presented using the date-format.shortTime setting & may be entered in either 24h or 12hr formats.
   45876: ALF-17642: Fix broken HtmlParserContentTransformerTest after upgrade of htmlparser to 2.1
          - Since the upgrade slightly changed the behaviour of the transformer, I added some explanatory comments to the test and to the transformer class.
   45927: Fix for ALF-17302  DocLib sort is determined by server locale rather than browser locale 
          - GetChildrenCannedQuery was not using locale based collation
   46014: Fix for ALF-17732 - SWF files are considered insecure content and should not be displayed directly in the browser.
   46160: Fix for ALF-17759 - HTML files are stripped from metadata and style information after they are uploaded.
   46186: Fix for ALF-17786 - Site dashboard page issues too many requests (Site Members dashlet issues avatar requests when it doesn't need to)
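
Several of the fixes listed above (ALF-14086, ALF-17302) move sorting to locale-aware collation on the Java server side. The following is only a rough sketch of that general technique using the JDK's java.text.Collator; it is not the code from these revisions, and the class and method names (CollatorSortSketch, sortForLocale) are hypothetical.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class CollatorSortSketch {

    /**
     * Hypothetical helper: returns a copy of the given names sorted
     * according to the collation rules of the supplied locale.
     */
    public static List<String> sortForLocale(List<String> names, Locale locale) {
        // Collator implements Comparator<Object>, so it can be passed to sort directly
        Collator collator = Collator.getInstance(locale);
        List<String> sorted = new ArrayList<String>(names);
        Collections.sort(sorted, collator);
        return sorted;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("cherry", "Banana", "apple");
        // Plain String ordering would put the upper-case "Banana" first;
        // the collator compares letters before case.
        System.out.println(sortForLocale(names, Locale.ENGLISH));
    }
}
```

Passing the user's locale rather than relying on the server default is the design point here: the same list can legitimately sort differently for different users.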

git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@46287 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
Kevin Roast
2013-02-06 11:00:59 +00:00
parent c82473db8c
commit db49919005
13 changed files with 358 additions and 65 deletions

@@ -33,12 +33,25 @@ import org.htmlparser.beans.StringBean;
import org.htmlparser.util.ParserException;
/**
* @see http://htmlparser.sourceforge.net/
* @see org.htmlparser.beans.StringBean
* Content transformer which wraps the HTML Parser library for
* parsing HTML content.
*
* Tika Note - could be convered to use the Tika HTML parser,
* <p>
* Since HTML Parser was updated from v1.6 to v2.1, META tags
* defining an encoding for the content via http-equiv=Content-Type
* will ONLY be respected if the encoding of the content item
* itself is set to ISO-8859-1.
* </p>
*
* <p>
* Tika Note - could be converted to use the Tika HTML parser,
* but we'd potentially need a custom text handler to replicate
* the current settings around links and non-breaking spaces.
* </p>
*
* @see http://htmlparser.sourceforge.net/
* @see org.htmlparser.beans.StringBean
* @see http://sourceforge.net/tracker/?func=detail&aid=1644504&group_id=24399&atid=381401
*
* @author Derek Hulley
*/
@@ -72,7 +85,8 @@ public class HtmlParserContentTransformer extends AbstractContentTransformer2
File htmlFile = TempFileProvider.createTempFile("HtmlParserContentTransformer_", ".html");
reader.getContent(htmlFile);
// Fetch the encoding of the HTML, if it's set in Alfresco
// Fetch the encoding of the HTML, as set in the ContentReader
// This will default to 'UTF-8' if not specifically set
String encoding = reader.getEncoding();
// Create the extractor
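
The Javadoc added in this hunk states that a META http-equiv=Content-Type encoding is only respected when the content item's own encoding is ISO-8859-1. As a minimal sketch of that rule, assuming nothing about how HTML Parser implements it internally, the decision can be pictured as follows. The class EncodingOverrideSketch, the method effectiveCharset, and the simplified regular expression are all hypothetical and exist only for illustration.

```java
import java.nio.charset.Charset;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EncodingOverrideSketch {

    // Simplified pattern for charset=... inside a META http-equiv Content-Type tag
    // (an assumption for illustration; a real HTML parser is far more tolerant)
    private static final Pattern META_CHARSET = Pattern.compile(
        "<meta[^>]*http-equiv=[\"']?Content-Type[\"']?[^>]*charset=([A-Za-z0-9][\\w.:+-]*)",
        Pattern.CASE_INSENSITIVE);

    /**
     * Hypothetical helper: returns the charset that would be used to decode the HTML.
     * The META tag may only override the content encoding when that encoding is ISO-8859-1.
     */
    public static Charset effectiveCharset(String contentEncoding, String html) {
        Charset declared = Charset.forName(contentEncoding);
        if (!"ISO-8859-1".equals(declared.name())) {
            // Any other content encoding wins over the META tag
            return declared;
        }
        Matcher m = META_CHARSET.matcher(html);
        if (m.find() && Charset.isSupported(m.group(1))) {
            return Charset.forName(m.group(1));
        }
        // No usable META charset: keep the content encoding
        return declared;
    }

    public static void main(String[] args) {
        String html = "<html><head>"
            + "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\">"
            + "</head><body></body></html>";
        System.out.println(effectiveCharset("ISO-8859-1", html));
        System.out.println(effectiveCharset("UTF-16", html));
    }
}
```

This mirrors the single remaining test case in HtmlParserContentTransformerTest: content encoding set to ISO-8859-1, META tag set to UTF-8.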

@@ -147,37 +147,23 @@ public class HtmlParserContentTransformerTest extends AbstractContentTransformer
tmpS.delete();
tmpD.delete();
// Note - since HTML Parser 2.0 META tags specifying the
// document encoding will ONLY be respected if the original
// content type was set to ISO-8859-1.
//
// This means there is now only one test which we can perform
// to ensure that this now-limited overriding of the encoding
// takes effect.
// Nothing on the content, meta set to ISO 8865-1
// Content set to ISO 8859-1, meta set to UTF-8
tmpS = File.createTempFile("test", ".html");
content = new FileContentWriter(tmpS);
content.setMimetype(MimetypeMap.MIMETYPE_HTML);
String str = partA+
"<meta http-equiv=\"Content-Type\" content=\"text/html; charset=ISO-8859-1\">" +
partB+partC;
content.putContent(new ByteArrayInputStream(str.getBytes("ISO-8859-1")));
tmpD = File.createTempFile("test", ".txt");
dest = new FileContentWriter(tmpD);
dest.setMimetype(MimetypeMap.MIMETYPE_TEXT_PLAIN);
transformer.transform(content.getReader(), dest);
assertEquals(
TITLE + "\n" + TEXT_P1 + "\n" + TEXT_P2 + "\n" + TEXT_P3 + "\n",
dest.getReader().getContentString()
);
tmpS.delete();
tmpD.delete();
// Nothing on the content, meta set to UTF-8
tmpS = File.createTempFile("test", ".html");
content = new FileContentWriter(tmpS);
content.setMimetype(MimetypeMap.MIMETYPE_HTML);
str = partA+
"<meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\">" +
partB+partC;
content.putContent(new ByteArrayInputStream(str.getBytes("UTF-8")));
content.setEncoding("ISO-8859-1");
tmpD = File.createTempFile("test", ".txt");
dest = new FileContentWriter(tmpD);