Dave Ward 9963da3d51 Merged V3.3 to HEAD
20794: Merged DEV/V3.3-BUG-FIX to V3.3
      20792: Fix for unit test failures introduced by check in 20771
      20791: ALF-3568: Include axiom jars in WAS shared library to solve Quickr connector issues
      20785: Merged DEV/BELARUS/V3.3-BUG-FIX-2010_06_14 to DEV/V3.3-BUG-FIX
         20644: Function for the browser window closing was implemented. For IE browser the trick with window opener was used. Fixes ALF-1004: After closing Details Space, user doesn't return to his previous location
      20784: Fix for ALF-3516: Enterprise 3.X / Impossible to Create a Blog with Special Characters in the Title (?/!)
      20783: Fix for ALF-1087: Documents checked-out from Share do not have "Upload new version" action in Alfresco Explorer
      20782: Added multiday timed event handling to week view
      20775: Merged V3.3 to DEV/V3.3-BUG-FIX
         20670: Fix for ALF-3260: XSS attack is made in Wiki tab if First/Last user name contain xss. Also fixed double encoding errors found during regression testing.
      20772: Update to node browser to show namespace of attributes.
      20771: ALF-3591 - transferring rules.
         - also extends the behaviour filter.
      20770: ALF-3186 - action parameter values are not fully transferred - need to handle d:any
      20768: AVM - ALF-3611 (OrphanReaper + PurgeTestP + additional NPE fixes)
      20765: (RECORD ONLY) Merged BRANCHES/V3.3 to BRANCHES/DEV/V3.3-BUG-FIX:
         20708: DB2 build - add create/drop db ant targets (use DB2 cmdline - since not possible via JDBC/SQL)
         20722: DB2 build - run db2cmd in same window (follow-on to r20708)
      20764: Fix unreported JSON encoding issue with links components
      20762: Fix ALF-2599: Share - Cannot search for user currently logged on
      20759: DB2: fix FullNodeServiceTest.testLongMLTextValues (ALF-497)
         - TODO: fix create script when merging to HEAD
      20756: DB2: fix JBPMEngine*Test.* (ALF-3640) - follow-on (upgrade patch)
      20746: DB2: fix WebProjectServiceImplTest.testCreateWebProject (ALF-2300)
      20744: DB2: fix JBPMEngine*Test.* (ALF-3640) - missed file
      20743: DB2: fix JBPMEngine*Test.* (ALF-3640)
      20729: AVM - fix purge store so that root nodes are actually orphaned (ALF-3627)
         - also prelim for ALF-3611
      20720: (RECORD ONLY) ALF-3594: Merged HEAD to V3.3-BUGFIX
         20616: ALF-2265: Share 'Uber Filter' part 2
            - WebScriptNTLMAuthenticationFilter detached from its superclass and renamed to WebScriptSSOAuthenticationFilter
            - Now the filter simply chains to the downstream authentication filter rather than call its superclass
            - This means the same filter can be used for Kerberos-protected webscripts as well as NTLM
            - Wired globalAuthenticationFilter behind webscriptAuthenticationFilter in the filter chain in web.xml
            - Configured webscriptAuthenticationFilter for Kerberos subsystem
      20719: Merged DEV/TEMPORARY to V3.3-BUGFIX
         20696: ALF-3180: when using NTLM SSO, a user needs to log in first into the web UI before being able to mount alfresco using CIFS
            The absence of the missing person creation logic in “the org.alfresco.filesys.auth.cifs.PassthruCifsAuthenticator.authenticateUser()” method was fixed. 
      20718: Merged DEV/TEMPORARY to V3.3-BUGFIX
         20659: ALF-3216: Incomplete settings for Lotus Quickr
            The protocol,host,port and context are removed from properties and a dependency on the org.alfresco.repo.admin.SysAdminParams interface is introduced.
      20711: Latest SpringSurf libs - fix for ALF-3557
      20710: Merged HEAD to BRANCHES/DEV/V3.3-BUG-FIX:
         20705: Fix ALF-3585: AtomPub summary can render first part of binary content resulting in invalid XML
      20691: Merged DEV/TEMPORARY to V3.3-BUGFIX
         19404: ALF-220: Editor can't rename files and folders via WebDav
            The Rename method of FileFolderService was used in case of file renaming instead of move method in WebDAV MOVE command.
      20663: ALF-3208 RenderingEngine actions should no longer appear in the list of available actions that can be fired using rules.
      20656: ALF-2645: LDAP sync now logs 'dangling references' for debugging purposes
      20651: ALF-485: FTP passthru authenticator logs authentication failures at debug level to avoid noise in the logs
      20646: Merge V2.2 To V3.3
         14301 : RECORD ONLY - ETWOTWO-1227 - fix to serialize FSR deployments.
         14618 : RECORD ONLY - Merge HEAD to 2.2 13944 : After rename project deploy option disappears.
      20637: ALF-3123: Avoid NPE on Oracle when loading empty string values persisted through JMX and the attribute service
      20633: ALF-2057: LDAP synchronization lock now persists for a maximum of two minutes (instead of 24 hours!)
         - The exclusive lock gained for LDAP sync from the JobLockService is now refreshed at 1 minute intervals and never persists for more than 2 minutes
      20628: ALF-1905: Allow use of anonymous bind for LDAP synchronization (NOT authentication)
         - Previously synchronization AND authentication shared the same setting for java.naming.security.authentication, meaning that if you tried to use anonymous bind for the synchronization side, the authentication side would complain.
         - Now there are two independent environments declared for the 'default' synchronization connection and the authentication connection
         - A new property ldap.synchronization.java.naming.security.authentication declares the authentication type used by synchronization. Set to "none" for anonymous bind.
      20623: Fix for ALF-3188 : Access Denied when updating doc via CIFS
      20620: Merge DEV to V3.3-BUG-FIX
         20456 -  ALF-1824 : Setting alfresco.rmi.services.host on linux does not use specified host/IP
      20617: Merged DEV/BELARUS/V3.3-2010_06_08 to V3.3-BUG-FIX (with corrections)
         20606: ALF-651: Web Services client ContentUtils.convertToByteArray is broken
            - org.alfresco.webservice.util.ContentUtils.convertToByteArray() method has been updated to cover large Input Streams conversion.
            - org.alfresco.webservice.test.ContentUtilsTest is a test for the new functionality implemented in the ContentUtils class.
            - org.alfresco.webservice.test.resources.big-content.pdf is a large content for the ContentUtilsTest.testInputStreamToByteArrayConversion() test.
      20613: Fixed ALF-1746: Metadata extractors are unable to remove ALL aspect-related properties
         - putRawValue keeps hold of 'null' values
         - All policies keep hold of 'null' values
         - Only affects 'carryAspectProperties=false'
      20609: Merged HEAD to V3.3-BUG-FIX
         20578: ALF-3178 - Transfer Service - to transfer rule (ie. ruleFolder with it's children) the PathHelper should allow "-" (dash character)
         20608: ALF-3178 - fix r20578 (mis-applied patch)
      20594: WebDAV BitKinex compatibility fix - Let the XML Parser work out the body encoding if it is not declared in the Content-Type header
      20588: (RECORD ONLY) Merged V3.3 to V3.3-BUG-FIX
         - Merged across all differences from V3.3
   20778: Added revision to version label.
   20777: Fix for ALF-2451 - installer correctly configure Share port
   20722: DB2 build - run db2cmd in same window (follow-on to r20712)
   20721: DB2 build - fix create target and add "/c" to exit "db2cmd"
      - TODO: add wait/timeout target, ideally checking for created DB 


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@20796 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2010-06-24 15:47:38 +00:00


/*
* Copyright (C) 2005 Jesper Steen Møller
*
* This file is part of Alfresco
*
* Alfresco is free software: you can redistribute it and/or modify
* it under the terms of the GNU Lesser General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Alfresco is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public License
* along with Alfresco. If not, see <http://www.gnu.org/licenses/>.
*/
package org.alfresco.repo.content.metadata;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import org.alfresco.repo.content.ContentWorker;
import org.alfresco.service.cmr.repository.ContentIOException;
import org.alfresco.service.cmr.repository.ContentReader;
import org.alfresco.service.namespace.QName;
/**
* Interface for document property extracters.
* <p>
* Please pardon the incorrect spelling of <i>extractor</i>.
*
* @author Jesper Steen Møller
* @author Derek Hulley
*/
public interface MetadataExtracter extends ContentWorker
{
/**
* An enumeration of functional property overwrite policies. These determine whether extracted properties are
* written into the property map or not.
*
* @author Derek Hulley
* @author Jesper Steen Møller
*/
public enum OverwritePolicy
{
/**
* This policy puts the new value if:
* <ul>
* <li>the extracted property is not null</li>
* </ul>
* <tt>null</tt> extracted values are returned in the 'modified' map.
*/
EAGER
{
@Override
public Map<QName, Serializable> applyProperties(Map<QName, Serializable> extractedProperties, Map<QName, Serializable> targetProperties)
{
Map<QName, Serializable> modifiedProperties = new HashMap<QName, Serializable>(7);
for (Map.Entry<QName, Serializable> entry : extractedProperties.entrySet())
{
QName propertyQName = entry.getKey();
Serializable extractedValue = entry.getValue();
// Only write non-null extracted values; nulls are still recorded in the modified map
if (extractedValue != null)
{
targetProperties.put(propertyQName, extractedValue);
}
modifiedProperties.put(propertyQName, extractedValue);
}
return modifiedProperties;
}
},
/**
* This policy puts the new value if the extracted property is not null and
* at least one of the following holds:
* <ul>
* <li>there is no target key for the property</li>
* <li>the target value is null</li>
* <li>the target value is an empty string</li>
* </ul>
* A non-empty string or any non-string target value is left untouched.
* <tt>null</tt> extracted values are returned in the 'modified' map.
*/
PRAGMATIC
{
@Override
public Map<QName, Serializable> applyProperties(Map<QName, Serializable> extractedProperties, Map<QName, Serializable> targetProperties)
{
/*
* Negative and positive checks are mixed in the loop.
*/
Map<QName, Serializable> modifiedProperties = new HashMap<QName, Serializable>(7);
for (Map.Entry<QName, Serializable> entry : extractedProperties.entrySet())
{
QName propertyQName = entry.getKey();
Serializable extractedValue = entry.getValue();
// Null extracted values are recorded but never written to the target
if (extractedValue == null)
{
modifiedProperties.put(propertyQName, extractedValue);
continue;
}
// Handle the shortcut cases where the target value is missing or null
if (!targetProperties.containsKey(propertyQName))
{
// There is nothing currently
targetProperties.put(propertyQName, extractedValue);
modifiedProperties.put(propertyQName, extractedValue);
continue;
}
Serializable originalValue = targetProperties.get(propertyQName);
if (originalValue == null)
{
// The current value is null
targetProperties.put(propertyQName, extractedValue);
modifiedProperties.put(propertyQName, extractedValue);
continue;
}
// Check the string representation
if (originalValue instanceof String)
{
String originalValueStr = (String) originalValue;
if (originalValueStr.length() > 0)
{
// The original value is non-trivial
continue;
}
else
{
// The original string is trivial
targetProperties.put(propertyQName, extractedValue);
modifiedProperties.put(propertyQName, extractedValue);
continue;
}
}
// We have some other object as the original value, so keep it
}
return modifiedProperties;
}
},
/**
* This policy only puts the extracted value if there is no value (null or otherwise) in the properties map.
* It is assumed that the mere presence of a property key is enough to indicate that the target property
* is as intended.
* This policy puts the new value if:
* <ul>
* <li>the extracted property is not null</li>
* <li>there is no target key for the property</li>
* </ul>
* <tt>null</tt> extracted values are returned in the 'modified' map.
*/
CAUTIOUS
{
@Override
public Map<QName, Serializable> applyProperties(Map<QName, Serializable> extractedProperties, Map<QName, Serializable> targetProperties)
{
Map<QName, Serializable> modifiedProperties = new HashMap<QName, Serializable>(7);
for (Map.Entry<QName, Serializable> entry : extractedProperties.entrySet())
{
QName propertyQName = entry.getKey();
Serializable extractedValue = entry.getValue();
// Null extracted values are recorded but never written to the target
if (extractedValue == null)
{
modifiedProperties.put(propertyQName, extractedValue);
continue;
}
// Is the key present in the target values
if (targetProperties.containsKey(propertyQName))
{
// Cautiously bypass the value as there is one already
continue;
}
targetProperties.put(propertyQName, extractedValue);
modifiedProperties.put(propertyQName, extractedValue);
}
return modifiedProperties;
}
};
/**
* Apply the overwrite policy for the extracted properties.
*
* @return
* Returns a map of all properties that were applied to the target map
* as well as any null values that weren't applied but were present.
*/
public Map<QName, Serializable> applyProperties(Map<QName, Serializable> extractedProperties, Map<QName, Serializable> targetProperties)
{
throw new UnsupportedOperationException("Override this method");
}
};
/**
* Get an estimate of the extracter's reliability on a scale from 0.0 to 1.0.
*
* @param mimetype the mimetype to check
* @return Returns a reliability indicator from 0.0 to 1.0
*
* @deprecated This method is replaced by {@link #isSupported(String)}
*/
public double getReliability(String mimetype);
/**
* Determines if the extracter works against the given mimetype.
*
* @param mimetype the document mimetype
* @return Returns <tt>true</tt> if the mimetype is supported, otherwise <tt>false</tt>.
*/
public boolean isSupported(String mimetype);
/**
* Provides an estimate, usually a worst case guess, of how long an
* extraction will take.
* <p>
* This method is used to determine, up front, which of a set of equally
* reliable extracters will be used for a specific extraction.
*
* @return Returns the approximate number of milliseconds per extraction
*
* @deprecated Generally not useful or used. Extraction is normally specifically configured.
*/
public long getExtractionTime();
/**
* Extracts the metadata values from the content provided by the reader and source
* mimetype to the supplied map. The internal mapping and {@link OverwritePolicy overwrite policy}
* between document metadata and system metadata will be used.
* <p>
* The extraction viability can be determined by an up front call to {@link #isSupported(String)}.
* <p>
* The source mimetype <b>must</b> be available on the
* {@link org.alfresco.service.cmr.repository.ContentAccessor#getMimetype()} method
* of the reader.
*
* @param reader the source of the content
* @param destination the map of properties to populate (essentially a return value)
* @return Returns a map of all properties on the destination map that were
* added or modified. If the return map is empty, then no properties
* were modified.
* @throws ContentIOException if a detectable error occurs
*
* @see #extract(ContentReader, OverwritePolicy, Map, Map)
*/
public Map<QName, Serializable> extract(ContentReader reader, Map<QName, Serializable> destination);
/**
* Extracts the metadata values from the content provided by the reader and source
* mimetype to the supplied map.
* <p>
* The extraction viability can be determined by an up front call to {@link #isSupported(String)}.
* <p>
* The source mimetype <b>must</b> be available on the
* {@link org.alfresco.service.cmr.repository.ContentAccessor#getMimetype()} method
* of the reader.
*
* @param reader the source of the content
* @param overwritePolicy the policy stipulating how the system properties must be
* overwritten if present
* @param destination the map of properties to populate (essentially a return value)
* @return Returns a map of all properties on the destination map that were
* added or modified. If the return map is empty, then no properties
* were modified.
* @throws ContentIOException if a detectable error occurs
*
* @see #extract(ContentReader, OverwritePolicy, Map, Map)
*/
public Map<QName, Serializable> extract(
ContentReader reader,
OverwritePolicy overwritePolicy,
Map<QName, Serializable> destination);
/**
* Extracts the metadata from the content provided by the reader and source
* mimetype to the supplied map. The mapping from document metadata to system metadata
* is explicitly provided. The {@link OverwritePolicy overwrite policy} is also explicitly
* set.
* <p>
* The extraction viability can be determined by an up front call to
* {@link #isSupported(String)}.
* <p>
* The source mimetype <b>must</b> be available on the
* {@link org.alfresco.service.cmr.repository.ContentAccessor#getMimetype()} method
* of the reader.
*
* @param reader the source of the content
* @param overwritePolicy the policy stipulating how the system properties must be
* overwritten if present
* @param destination the map of properties to populate (essentially a return value)
* @param mapping a mapping of document-specific properties to system properties.
* @return Returns a map of all properties on the destination map that were
* added or modified. If the return map is empty, then no properties
* were modified.
* @throws ContentIOException if a detectable error occurs
*
* @see #extract(ContentReader, Map)
*/
public Map<QName, Serializable> extract(
ContentReader reader,
OverwritePolicy overwritePolicy,
Map<QName, Serializable> destination,
Map<String, Set<QName>> mapping);
}
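
The three overwrite policies above differ only in when an existing target value may be replaced. The following is a minimal standalone sketch of that decision, not the Alfresco implementation: it assumes plain String keys in place of QName so that it compiles without Alfresco on the classpath, and the names OverwritePolicySketch, Policy, apply and mayOverwrite are hypothetical.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the EAGER / PRAGMATIC / CAUTIOUS policies.
// Assumption: String keys stand in for org.alfresco.service.namespace.QName.
final class OverwritePolicySketch
{
    enum Policy { EAGER, PRAGMATIC, CAUTIOUS }

    // Applies the policy to 'target' and returns the map of modified properties,
    // including null extracted values that were seen but never written.
    static Map<String, Serializable> apply(
            Policy policy,
            Map<String, Serializable> extracted,
            Map<String, Serializable> target)
    {
        Map<String, Serializable> modified = new HashMap<>();
        for (Map.Entry<String, Serializable> entry : extracted.entrySet())
        {
            String key = entry.getKey();
            Serializable value = entry.getValue();
            if (value == null)
            {
                // Recorded for the caller, but the target is left alone
                modified.put(key, null);
                continue;
            }
            if (mayOverwrite(policy, key, target))
            {
                target.put(key, value);
                modified.put(key, value);
            }
        }
        return modified;
    }

    // Decides whether a non-null extracted value may replace the current target value.
    static boolean mayOverwrite(Policy policy, String key, Map<String, Serializable> target)
    {
        switch (policy)
        {
            case EAGER:
                return true;
            case CAUTIOUS:
                // Any existing key, even one mapped to null, blocks the write
                return !target.containsKey(key);
            case PRAGMATIC:
            {
                // Missing key, null value or empty string may be replaced
                Serializable current = target.get(key);
                return current == null || "".equals(current);
            }
            default:
                throw new IllegalArgumentException("Unknown policy: " + policy);
        }
    }

    public static void main(String[] args)
    {
        for (Policy policy : Policy.values())
        {
            Map<String, Serializable> target = new HashMap<>();
            target.put("title", "");
            target.put("author", "Alice");

            Map<String, Serializable> extracted = new HashMap<>();
            extracted.put("title", "Report");
            extracted.put("author", "Bob");
            extracted.put("description", null);

            apply(policy, extracted, target);
            System.out.println(policy + " -> title=" + target.get("title")
                    + ", author=" + target.get("author"));
        }
    }
}
```

With an empty existing title and an existing author, EAGER replaces both, PRAGMATIC replaces only the empty title, and CAUTIOUS replaces neither. In all three cases the null description is reported in the returned map and never written to the target.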