Jon Cox c5d97bfebe Generalized handling of URI schemes to be ignored during link validation.
These are configurable in linkvalidation-service-context.xml.
Also ensured the list of links per file is sorted & unique.

 Here's the list of known URI schemes, taken from the IANA registry:

 http://www.iana.org/assignments/uri-schemes.html

      aaa aaas acap afs cap cid crid data dav dict dns dtn fax file
      ftp go gopher h323 http https iax2 icap im imap info ipp iris
      iris.beep iris.lwz iris.xpc iris.xpcs ldap mailserver mailto
      mid modem msrp msrps mtqp mupdate news nfs nntp opaquelocktoken
      pop pres prospero rtsp service shttp sip sips snmp soap.beep
      soap.beeps tag tel telnet tftp thismessage tip tn3270 tv urn
      vemmi wais xmlrpc.beep xmlrpc.beeps xmpp z39.50r z39.50s

 For now, all of these URI schemes get a free pass (they are skipped,
 not checked) except http & https, which are actually validated.
 Any URI not qualified by one of these scheme designators is
 presumed to be broken.  It would be nice to validate ftp links
 for real, but that won't happen for a while.
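
 The rule above can be sketched roughly as follows. This is an
 illustration only, not the code in this commit: the class name
 LinkSchemePolicy, the Action enum, and classify() are hypothetical,
 the ignored set is a small hard-coded subset of the list above
 (the real list is read from linkvalidation-service-context.xml),
 and it assumes each link has already been resolved to absolute form.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch of the scheme rule described in this commit message.
class LinkSchemePolicy {

    enum Action { VALIDATE, SKIP, BROKEN }

    // Assumption: a hard-coded subset of the known schemes, for illustration.
    // Per the commit message, the real list is configurable in XML.
    private static final Set<String> IGNORED_SCHEMES = new HashSet<>(
            Arrays.asList("ftp", "mailto", "news", "tel", "urn"));

    static Action classify(String uri) {
        int colon = uri.indexOf(':');
        if (colon <= 0) {
            // No scheme designator at all: presumed broken.
            return Action.BROKEN;
        }
        // Scheme names are case-insensitive, so normalize before comparing.
        String scheme = uri.substring(0, colon).toLowerCase(Locale.ROOT);
        if (scheme.equals("http") || scheme.equals("https")) {
            // Only http and https links are checked for real.
            return Action.VALIDATE;
        }
        // Known schemes get a free pass; anything else is presumed broken.
        return IGNORED_SCHEMES.contains(scheme) ? Action.SKIP : Action.BROKEN;
    }

    public static void main(String[] args) {
        for (String uri : new String[] {
                "https://example.com/", "mailto:someone@example.com",
                "bogus:thing", "no-scheme-here" }) {
            System.out.println(uri + " -> " + classify(uri));
        }
    }
}
```

 Keeping the ignored set in configuration rather than in code means
 a scheme such as ftp can be moved to real validation later without
 touching the classification logic.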


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@6189 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2007-07-07 02:58:22 +00:00