Merge branch 'feature/SEARCH-1643' into 'master'

Feature/search 1643

See merge request search_discovery/insightengine!10
This commit is contained in:
Elia Porciani
2019-06-05 09:00:42 +01:00
16 changed files with 690 additions and 101 deletions


@@ -0,0 +1,262 @@
## DocRouters
![Completeness Badge](https://img.shields.io/badge/Document_Level-In_Progress-yellow.svg?style=flat-square)
### Introduction
When an index grows too large to be stored on a single search server, it can be distributed across multiple search servers. This is known as _sharding_. The distributed/sharded index can then be searched using Alfresco/Solr's distributed search capabilities.
Search Services can use any of the following methods for routing documents and ACLs to shards.
The following diagram illustrates the DocRouter class hierarchy; the DocRouter interface, which declares the main contract, and all concrete implementors.
![DocRouter Class Diagram](doc_router_class_overview.png)
The _DateQuarterRouter_ is shown in light gray because it is not currently used.
### How a DocRouter is chosen
To use a specific sharding method, the required configuration properties must be added to solrcore.properties when the shard is created.
The first required property, _shard.method_, holds the mnemonic code associated with the sharding method; depending on the method, additional properties may be needed.
As you can see from the class diagram above, the _DocRouter_ instance creation is a responsibility of the _DocRouterFactory_.
Since we are interested in distributing both ACLs and nodes (see the _DocRouter_ interface), the _MetadataTracker_ and the _ACLTracker_ read the shard.method from the configuration and create/use the corresponding _DocRouter_ instance.
If the sharding method doesn't correspond to a valid option (e.g. an invalid or unknown mnemonic code), the system falls back to _DBID_ routing (_DBIDRouter_).
### Available document routers
The following table lists the available sharding methods and the associated mnemonic codes (i.e. the value of the shard.method attribute we need to configure in solrcore.properties).
| Name | Code |
| -----|---------|
|ACL (MOD) |MOD_ACL_ID|
|ACL (HASH) |ACL_ID|
|DBID (HASH)|DB_ID|
|DBID (RANGE)|DB_ID_RANGE|
|Month|DATE|
|Metadata|PROPERTY|
|Last Registered Indexing Shard|LRIS or LAST_REGISTERED_INDEXING_SHARD|
|Explicit Shard ID (fallback on DBID)|EXPLICIT_ID or EXPLICIT_ID_FALLBACK_DBID|
|Explicit Shard ID (fallback on LRIS)|EXPLICIT_ID_FALLBACK_LRIS|
##### ACL (MOD_ACL_ID)
Nodes and access control lists are grouped by their ACL identifier. This places the nodes together with all the access control information required to determine the access to a node in the same shard. Both the nodes and access control information are sharded.
The overall index size will be smaller than with other methods, as the ACL index information is not duplicated in every shard. Also, the ACL count is usually much smaller than the node count.
This method is beneficial if you have lots of ACLs and the documents are evenly distributed over those ACLs.
> The node distribution may be uneven as it depends on how many nodes share ACLs.
In order to use this method, the following properties are required:
```
shard.method=MOD_ACL_ID
shard.instance=<shard.instance>
shard.count=<shard.count>
```
where
* shard.method is the mnemonic code associated with this router
* shard.instance is the shard identifier, which must be unique across the cluster
* shard.count is the total number of the shards composing the cluster
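As a rough illustration, the modulus-based ownership check can be sketched like this (a hypothetical helper, not the actual SearchServices code):

```java
// Sketch of MOD_ACL_ID routing (illustrative only): a shard owns an ACL,
// and every node guarded by that ACL, when the ACL id modulo the shard
// count equals the shard's own instance identifier.
public class ModAclRoutingSketch
{
    public static boolean ownsAcl(long aclId, int shardCount, int shardInstance)
    {
        return aclId % shardCount == shardInstance;
    }
}
```

Because all nodes sharing an ACL produce the same modulus, they always land on the same shard, which is also why the distribution can be skewed when a few ACLs guard most of the nodes.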
##### ACL (ACL_ID)
This sharding method is similar to the previous one, except that the murmur hash of the ACL ID is used in preference to its modulus.
This gives better distribution of ACLs over shards. The distribution of documents over ACLs is not affected and can still be skewed.
Apart from the different mnemonic code, this option has the same configuration attributes as MOD_ACL_ID.
##### DBID (DB_ID)
This is the default sharding method used in Solr 6. As mentioned above, it is also the method used when an invalid or unknown value is detected in the shard.method property.
Nodes are evenly distributed over the shards at random based on the murmur hash of the DBID. The access control information is duplicated in each shard. The distribution of nodes over each shard is very even and shards grow at the same rate.
In order to use this method, the following properties are required:
```
shard.method=DB_ID
shard.instance=<shard.instance>
shard.count=<shard.count>
```
where
* shard.method is the mnemonic code associated with this router
* shard.instance is the shard identifier, which must be unique across the cluster
* shard.count is the total number of the shards composing the cluster
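The hash-based decision can be sketched as follows (illustrative only; `Long.hashCode` stands in for the murmur hash used by the real router):

```java
// Sketch of DB_ID routing (illustrative only): the DBID is hashed and
// reduced modulo the shard count, spreading nodes evenly over the shards.
public class DbIdRoutingSketch
{
    public static boolean ownsNode(long dbid, int shardCount, int shardInstance)
    {
        // Math.floorMod keeps the result non-negative even for negative hashes.
        return Math.floorMod(Long.hashCode(dbid), shardCount) == shardInstance;
    }
}
```

For any DBID exactly one shard answers true, so every node has a single owner.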
##### DBID Range (DB_ID_RANGE)
This sharding method routes documents within specific DBID ranges to specific shards and allows new shards to be added to the cluster without requiring a reindex. As a consequence, the total number of shards (i.e. the shard.count attribute) doesn't need to be known in advance.
For each shard, you specify the range of DBIDs to be included. As your repository grows you can add shards.
In order to use this method, the following properties are required:
```
shard.method=DB_ID_RANGE
shard.instance=<shard.instance>
shard.range=0-200000
```
where
* shard.method is the mnemonic code associated with this router
* shard.instance is the shard identifier, which must be unique across the cluster
* shard.range is the range of DBIDs that belong to the shard
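The range check can be sketched like this (a hypothetical helper; whether the upper bound is inclusive is an assumption of the sketch):

```java
// Sketch of DB_ID_RANGE routing (illustrative only): a shard accepts a node
// when its DBID falls inside the configured shard.range, here parsed as
// "start-end" with an exclusive upper bound (an assumption of this sketch).
public class DbIdRangeSketch
{
    public static boolean inRange(long dbid, String shardRange)
    {
        String[] bounds = shardRange.split("-");
        long start = Long.parseLong(bounds[0]);
        long end = Long.parseLong(bounds[1]);
        return dbid >= start && dbid < end;
    }
}
```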
##### Month (DATE)
This method assigns dates sequentially through shards based on the month.
For example: If there are 12 shards, each month would be assigned sequentially to each shard, wrapping round and starting again for each year.
The non-random assignment facilitates easier shard management - dropping shards or scaling out replication for some date range. Typical ageing strategies could be based on the created date or destruction date.
The ACL information is replicated in each shard. If the date property is not present on a node, sharding falls back to the DBID murmur hash to randomly distribute these nodes.
In order to use this method, the following properties are required:
```
shard.method=DATE
shard.key=cm:created
shard.instance=<shard.instance>
shard.count=<shard.count>
(optional)shard.date.grouping=4
```
where
* shard.method is the mnemonic code associated with this router
* shard.key is the node property which contains the date used by this sharding method
* shard.instance is the shard identifier, which must be unique across the cluster
* shard.date.grouping is an optional property used for grouping consecutive months together
* shard.count is the total number of the shards composing the cluster
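The sequential month-to-shard assignment can be sketched as follows (a hypothetical helper; the exact grouping semantics of shard.date.grouping are an assumption of this sketch):

```java
import java.time.YearMonth;

// Sketch of DATE routing (illustrative only): months are numbered
// sequentially, optionally grouped in blocks of `grouping` consecutive
// months, and assigned to shards round-robin.
public class DateRoutingSketch
{
    public static int shardFor(YearMonth date, int shardCount, int grouping)
    {
        long monthIndex = date.getYear() * 12L + (date.getMonthValue() - 1);
        return (int) ((monthIndex / grouping) % shardCount);
    }
}
```

With 12 shards and no grouping, each month of the year maps to its own shard and the assignment wraps around every year.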
##### Metadata (PROPERTY)
This method hashes the value of a node property and uses that hash to select the target shard. All nodes with the same property value will be assigned to the same shard. Each shard duplicates all the ACL information.
Only properties of type d:text, d:date and d:datetime can be used. For example, the recipient of an email, the creator of a node, some custom field set by a rule, or by the domain of an email recipient. The keys are randomly distributed over the shards using murmur hash.
In order to use this method, the following properties are required:
```
shard.method=PROPERTY
shard.key=cm:creator
shard.instance=<shard.instance>
shard.count=<shard.count>
(optional)shard.regex=^\d{4}
```
where
* shard.method is the mnemonic code associated with this router
* shard.key is the node property whose value will be used for determining the target shard
* shard.instance is the shard identifier, which must be unique across the cluster
* shard.count is the total number of the shards composing the cluster
* shard.regex is an optional property which applies a regex on the node property value, before hashing it.
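The property-based routing can be sketched like this (a hypothetical helper; `String.hashCode` stands in for the murmur hash, and the "first regex match wins" behaviour is an assumption of the sketch):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch of PROPERTY routing (illustrative only): an optional regex first
// extracts the routing key from the property value; the key is then hashed
// and reduced modulo the shard count.
public class PropertyRoutingSketch
{
    public static int shardFor(String propertyValue, String regex, int shardCount)
    {
        String key = propertyValue;
        if (regex != null && !regex.isEmpty())
        {
            Matcher matcher = Pattern.compile(regex).matcher(propertyValue);
            if (matcher.find())
            {
                key = matcher.group();
            }
        }
        return Math.floorMod(key.hashCode(), shardCount);
    }
}
```

With shard.regex=^\d{4}, values such as "2019/invoices" and "2019/receipts" share the key "2019" and therefore the same shard.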
##### Last Registered Indexing Shard (LRIS)
This method uses, as the name suggests, the last indexing shard which subscribed to the Shard Registry.
Like the _DB_ID_RANGE_ strategy, it's possible to add new shards to the cluster without requiring a reindex. As a consequence, the total number of shards (i.e. the _shard.count_ attribute) isn't needed with this sharding strategy.
At indexing time, when the _MetadataTracker_ periodically asks for transactions and nodes, the Alfresco repository creates the node instances and, using the associated transaction timestamp, asks the _ShardRegistry_ which shard should index/own each node.
This _DocRouter_ then uses that information to determine whether an incoming node belongs to it.
In order to use this method, the following properties are required:
```
shard.method=LRIS
shard.instance=<shard.instance>
```
where
* shard.method is the mnemonic code associated with this router
* shard.instance is the shard identifier, which must be unique across the cluster
###### ShardRegistry and ShardSubscription
On the SearchServices side, the _MetadataTracker_ periodically communicates the hosting shard's state to the Alfresco repository.
Alfresco, on its side, maintains a registry which provides near-real-time information about the Solr cluster (e.g. shards, shard states, topology). The very first time a shard subscribes, the ShardRegistry creates a subscription for it.
A ShardSubscription is a simple entity which includes
* the shard id
* the subscription timestamp
* the owning core name (i.e. the name of the core the shard belongs to)
Shard subscriptions are
* collected and sorted by subscription timestamp in descending order (i.e. the first entry is the last subscriber)
* persisted in the database
Persistence is needed because, in case of a restart, Alfresco must restore the subscriptions map and the correct shard topology. If no persisted definition exists, the map is empty and the registry waits for the first subscriber.
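The bookkeeping described above can be sketched as an in-memory stand-in (names and structure are illustrative, not the real ShardRegistry implementation):

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch of the ShardRegistry subscription bookkeeping (illustrative only):
// subscriptions are kept newest-first, so the "last registered indexing
// shard" is simply the head of the collection.
public class ShardSubscriptionSketch
{
    public static final class Subscription
    {
        public final int shardId;
        public final long timestamp;
        public final String coreName;

        public Subscription(int shardId, long timestamp, String coreName)
        {
            this.shardId = shardId;
            this.timestamp = timestamp;
            this.coreName = coreName;
        }
    }

    private final Deque<Subscription> subscriptions = new ArrayDeque<>();

    // The newest subscriber goes to the front, keeping descending timestamp order.
    public void subscribe(int shardId, long timestamp, String coreName)
    {
        subscriptions.addFirst(new Subscription(shardId, timestamp, coreName));
    }

    public int lastRegisteredIndexingShard()
    {
        return subscriptions.getFirst().shardId;
    }
}
```

In the real system this map is also persisted in the database, which is what allows the topology to be restored after a restart.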
##### Explicit Shard ID with fallback on DBID (EXPLICIT_ID, EXPLICIT_ID_FALLBACK_DBID)
Nodes are routed to shards according to the value of a node property which is supposed to contain the target shard instance identifier.
Note that this differs from the "PROPERTY" doc router seen above because no hashing occurs on the node property:
* it must be numeric (it can be also a string with a numeric content)
* it must contain the target shard instance identifier
If the system is not able to determine a numeric id from the configured node property, then the doc router uses the DB_ID as a fallback strategy.
Although the primary sharding method (the numeric node property) results in an "elastic sharding" approach (we don't need to know the total number of shards in advance), when this doc router is used we still need the _shard.count_ attribute, because it is a required parameter for the fallback router (DB_ID).
![EXPLICIT_ID](explicit_id_with_docid_as_fallback.png)
In order to use this method, the following properties are required:
```
shard.method=EXPLICIT_ID_FALLBACK_DBID
shard.key=cm:targetShardInstance
shard.instance=<shard.instance>
shard.count=<shard.count>
```
where
* shard.method is the mnemonic code associated with this router; note that _EXPLICIT_ID_ can also be used
* shard.key is the node property whose value will be used for determining the target shard
* shard.instance is the shard identifier, which must be unique across the cluster
* shard.count is the total number of the shards composing the cluster (required by the DBID fallback strategy)
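The two-step decision can be sketched like this (a hypothetical helper; `Long.hashCode` stands in for the murmur hash used by the DBID fallback):

```java
// Sketch of EXPLICIT_ID routing with DBID fallback (illustrative only):
// a numeric shard property wins; otherwise the node is routed by a hash
// of its DBID, which is why shard.count is still required.
public class ExplicitIdSketch
{
    public static boolean ownsNode(String shardProperty, long dbid,
                                   int shardCount, int shardInstance)
    {
        if (shardProperty != null)
        {
            try
            {
                return Integer.parseInt(shardProperty.trim()) == shardInstance;
            }
            catch (NumberFormatException notNumeric)
            {
                // Not a numeric id: fall through to the DBID fallback.
            }
        }
        return Math.floorMod(Long.hashCode(dbid), shardCount) == shardInstance;
    }
}
```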
##### Explicit Shard ID with fallback on LRIS (EXPLICIT_ID_FALLBACK_LRIS)
This method still belongs to the "explicit id" family, where nodes are routed to shards according to the value of a node property which is supposed to contain the target shard instance identifier.
So it is very similar to the previous one. The important difference resides in the fallback strategy: in case the target shard cannot be determined using the supplied node property, the "Last Registered Indexing Shard" is used as fallback.
Since both strategies (the primary and the fallback) are "elastic", this sharding method doesn't have to know the total number of shards: new shards can be added without any reindexing.
![EXPLICIT_ID_LRIS](explicit_id.png)
Let's see what happens when a member in the cluster (a Solr or Alfresco node) is restarted.
In all the examples below we have:
* an Alfresco instance, which maintains the persisted subscriptions map
* two shards (i.e. Solr nodes): S1 subscribed at t1, S2 subscribed at t2
**Example #1: One shard is restarted**
![One Shard Restart](one_shard_is_restarted.png)
In order to better explain the system behaviour we'll enumerate two sub-scenarios:
* **Indexed data is not lost**: after restarting, the node will find itself already registered in the subscription map. If it is the last registered indexing shard, Alfresco will continue to use it as the target indexing shard for new incoming data
* **Indexed data is lost**: after restarting, the node will still find its subscription in the ShardRegistry. That means that, on the Alfresco side, the shard is still associated with the proper transaction range. It will request data (e.g. transactions, nodes) from Alfresco, and the doc router will accept only those nodes that belong to a transaction within that range
**Example #2: Two shards are restarted**
![Two shards are restarted](two_shards_are_restarted.png)
Shard subscriptions are persisted on the Alfresco side, so this scenario is similar to the previous one: regardless of whether the shards still have their indexed data after a restart, their corresponding subscriptions are still available in the ShardRegistry. That means the doc router will be aware of the range of transactions that must be (re)indexed in the two nodes.
**Example #3: Alfresco is restarted**
![Alfresco is restarted](alfresco_is_restarted.png)
The Shard Subscriptions Map is persisted in the database, so its whole definition can survive a system crash.
After restarting, subscriptions are restored from the database, and each shard will be aware of the range of transactions that belongs to it.
In order to use this method, the following properties are required:
```
shard.method=EXPLICIT_ID_FALLBACK_LRIS
shard.key=cm:targetShardInstance
shard.instance=<shard.instance>
```
where
* shard.method is the mnemonic code associated with this router
* shard.key is the node property whose value will be used for determining the target shard
* shard.instance is the shard identifier, which must be unique across the cluster
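The fallback composition shared by both explicit routers can be sketched as follows (illustrative only; it mirrors the ExplicitRouter wiring in DocRouterFactory, where any DocRouter can act as the fallback):

```java
// Sketch of the fallback composition used by the explicit routers
// (illustrative only): the explicit shard id wins when present; otherwise
// any delegate strategy (DBID hashing, last registered shard, ...) decides.
public class ExplicitWithFallbackSketch
{
    public interface Fallback
    {
        boolean ownsNode(int shardInstance);
    }

    public static boolean ownsNode(Integer explicitShardId, int shardInstance, Fallback fallback)
    {
        if (explicitShardId != null)
        {
            return explicitShardId == shardInstance;
        }
        return fallback.ownsNode(shardInstance);
    }
}
```

Swapping the delegate is what distinguishes EXPLICIT_ID_FALLBACK_DBID from EXPLICIT_ID_FALLBACK_LRIS without duplicating the explicit check.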

Six binary image files (the diagrams referenced above) were added; sizes: 214 KiB, 185 KiB, 31 KiB, 20 KiB, 207 KiB, 209 KiB.


@@ -60,9 +60,15 @@ public class DocRouterFactory
             case PROPERTY:
                 log.info("Sharding via PROPERTY");
                 return new PropertyRouter(properties.getProperty("shard.regex", ""));
+            case LAST_REGISTERED_INDEXING_SHARD:
+                log.info("Sharding via LAST_REGISTERED_INDEXING_SHARD");
+                return new LastRegisteredShardRouter();
+            case EXPLICIT_ID_FALLBACK_LRIS:
+                log.info("Sharding via EXPLICIT_ID_FALLBACK_LRIS");
+                return new ExplicitRouter(new LastRegisteredShardRouter());
             case EXPLICIT_ID:
                 log.info("Sharding via EXPLICIT_ID");
-                return new ExplicitRouter();
+                return new ExplicitRouter(new DBIDRouter());
             default:
                 log.info("Sharding via DB_ID (default)");
                 return new DBIDRouter();


@@ -1,3 +1,21 @@
+/*
+ * Copyright (C) 2005-2019 Alfresco Software Limited.
+ *
+ * This file is part of Alfresco
+ *
+ * Alfresco is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published by
+ * the Free Software Foundation, either version 3 of the License, or
+ * (at your option) any later version.
+ *
+ * Alfresco is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public License
+ * along with Alfresco. If not, see <http://www.gnu.org/licenses/>.
+ */
 package org.alfresco.solr.tracker;

 import org.alfresco.solr.client.Acl;
@@ -11,9 +29,10 @@ import org.slf4j.LoggerFactory;
 public class ExplicitRouter implements DocRouter {
     protected final static Logger log = LoggerFactory.getLogger(ExplicitRouter.class);

-    private final DBIDRouter fallback = new DBIDRouter();
+    private final DocRouter fallbackRouter;

-    public ExplicitRouter() {
+    public ExplicitRouter(DocRouter fallbackRouter) {
+        this.fallbackRouter = fallbackRouter;
     }

     @Override
@@ -25,11 +44,6 @@ public class ExplicitRouter implements DocRouter {
     @Override
     public boolean routeNode(int shardCount, int shardInstance, Node node) {
-        if(shardCount <= 1)
-        {
-            return true;
-        }
         String shardBy = node.getShardPropertyValue();

         if (shardBy != null && !shardBy.isEmpty())
@@ -59,6 +73,6 @@ public class ExplicitRouter implements DocRouter {
         {
             log.debug("Shard "+shardInstance+" falling back to DBID routing for node "+node.getNodeRef());
         }
-        return fallback.routeNode(shardCount, shardInstance, node);
+        return fallbackRouter.routeNode(shardCount, shardInstance, node);
     }
 }


@@ -0,0 +1,62 @@
/*
 * Copyright (C) 2005-2019 Alfresco Software Limited.
 *
 * This file is part of Alfresco
 *
 * Alfresco is free software: you can redistribute it and/or modify
 * it under the terms of the GNU Lesser General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * Alfresco is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU Lesser General Public License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public License
 * along with Alfresco. If not, see <http://www.gnu.org/licenses/>.
 */
package org.alfresco.solr.tracker;

import org.alfresco.solr.client.Acl;
import org.alfresco.solr.client.Node;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * Routes a document only if the explicitShardId matches the provided shardId
 *
 * @author Elia
 */
public class LastRegisteredShardRouter implements DocRouter
{
    protected final static Logger log = LoggerFactory.getLogger(ExplicitRouter.class);

    public LastRegisteredShardRouter()
    {
    }

    @Override
    public boolean routeAcl(int shardCount, int shardInstance, Acl acl)
    {
        //all acls go to all shards.
        return true;
    }

    @Override
    public boolean routeNode(int shardCount, int shardInstance, Node node)
    {
        Integer explicitShardId = node.getExplicitShardId();
        if (explicitShardId == null)
        {
            log.error("explicitShardId is not set for node " + node.getNodeRef());
            return false;
        }
        return explicitShardId.equals(shardInstance);
    }
}


@@ -20,6 +20,7 @@ package org.alfresco.solr.tracker;
 import java.io.IOException;
 import java.util.ArrayList;
+import java.util.HashMap;
 import java.util.HashSet;
 import java.util.LinkedHashSet;
 import java.util.List;
@@ -221,6 +222,9 @@ public class MetadataTracker extends AbstractTracker implements Tracker
                 .map(Tracker::getTrackerState)
                 .orElse(transactionsTrackerState);
+        HashMap<String, String> propertyBag = new HashMap<>();
+        propertyBag.put("coreName", coreName);
         return ShardStateBuilder.shardState()
                 .withMaster(isMaster)
                 .withLastUpdated(System.currentTimeMillis())
@@ -240,6 +244,7 @@ public class MetadataTracker extends AbstractTracker implements Tracker
                 .withTemplate(shardTemplate)
                 .withHasContent(transformContent)
                 .withShardMethod(ShardMethodEnum.getShardMethod(shardMethod))
+                .withPropertyBag(propertyBag)
                 .endFloc()
                 .endShard()
                 .endShardInstance()
@@ -359,6 +364,7 @@ public class MetadataTracker extends AbstractTracker implements Tracker
         gnp.setStoreProtocol(storeRef.getProtocol());
         gnp.setStoreIdentifier(storeRef.getIdentifier());
         gnp.setShardProperty(shardProperty);
+        gnp.setCoreName(coreName);
         List<Node> nodes = client.getNodes(gnp, (int) info.getUpdates());
         for (Node node : nodes)
@@ -452,6 +458,7 @@ public class MetadataTracker extends AbstractTracker implements Tracker
         gnp.setTransactionIds(txs);
         gnp.setStoreProtocol(storeRef.getProtocol());
         gnp.setStoreIdentifier(storeRef.getIdentifier());
+        gnp.setCoreName(coreName);
         List<Node> nodes = client.getNodes(gnp, (int) info.getUpdates());
         for (Node node : nodes)
         {
@@ -877,6 +884,7 @@ public class MetadataTracker extends AbstractTracker implements Tracker
         gnp.setStoreProtocol(storeRef.getProtocol());
         gnp.setStoreIdentifier(storeRef.getIdentifier());
         gnp.setShardProperty(shardProperty);
+        gnp.setCoreName(coreName);
         List<Node> nodes = client.getNodes(gnp, Integer.MAX_VALUE);
         ArrayList<Node> nodeBatch = new ArrayList<>();
@@ -1050,6 +1058,7 @@ public class MetadataTracker extends AbstractTracker implements Tracker
         gnp.setTransactionIds(txs);
         gnp.setStoreProtocol(storeRef.getProtocol());
         gnp.setStoreIdentifier(storeRef.getIdentifier());
+        gnp.setCoreName(coreName);
         return client.getNodes(gnp, Integer.MAX_VALUE);
     }
     catch (IOException e)


@@ -0,0 +1,202 @@
/*
* Copyright (C) 2005-2019 Alfresco Software Limited.
*
* This file is part of Alfresco
*
* Alfresco is free software: you can redistribute it and/or modify
* it under the terms of the GNU Lesser General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Alfresco is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public License
* along with Alfresco. If not, see <http://www.gnu.org/licenses/>.
*/
package org.alfresco.solr.tracker;
import java.util.Properties;
import org.alfresco.model.ContentModel;
import org.alfresco.repo.index.shard.ShardMethodEnum;
import org.alfresco.repo.search.adaptor.lucene.QueryConstants;
import org.alfresco.solr.AbstractAlfrescoDistributedTest;
import org.alfresco.solr.client.Acl;
import org.alfresco.solr.client.AclChangeSet;
import org.alfresco.solr.client.AclReaders;
import org.alfresco.solr.client.Node;
import org.alfresco.solr.client.NodeMetaData;
import org.alfresco.solr.client.StringPropertyValue;
import org.alfresco.solr.client.Transaction;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.LegacyNumericRangeQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.util.LuceneTestCase;
import org.apache.solr.SolrTestCaseJ4;
import org.junit.AfterClass;
import org.junit.Before;
import org.junit.Test;
import static java.util.Collections.singletonList;
import static org.alfresco.solr.AlfrescoSolrUtils.getAcl;
import static org.alfresco.solr.AlfrescoSolrUtils.getAclChangeSet;
import static org.alfresco.solr.AlfrescoSolrUtils.getAclReaders;
import static org.alfresco.solr.AlfrescoSolrUtils.getNode;
import static org.alfresco.solr.AlfrescoSolrUtils.getNodeMetaData;
import static org.alfresco.solr.AlfrescoSolrUtils.getTransaction;
import static org.alfresco.solr.AlfrescoSolrUtils.indexAclChangeSet;
import static org.carrot2.shaded.guava.common.collect.ImmutableList.of;
/**
* Tests routing based on the last registered shard
*
* @author Elia
*/
@SolrTestCaseJ4.SuppressSSL
@SolrTestCaseJ4.SuppressObjectReleaseTracker (bugUrl = "RAMDirectory")
@LuceneTestCase.SuppressCodecs({"Appending","Lucene3x","Lucene40","Lucene41","Lucene42","Lucene43", "Lucene44", "Lucene45","Lucene46","Lucene47","Lucene48","Lucene49"})
public class DistributedLastRegisteredShardRouterTest extends AbstractAlfrescoDistributedTest
{
private static long MAX_WAIT_TIME = 80000;
private final int timeout = 100000;
@Before
private void initData() throws Throwable
{
initSolrServers(2, "DistributedLastRegisteredShardRoutingTest", getProperties());
indexData();
}
@AfterClass
private static void destroyData() throws Throwable
{
dismissSolrServers();
}
/**
* Setup, indexes and returns the ACL used within the tests.
*
* @return the ACL used within the test.
*/
private Acl getTestAcl() throws Exception
{
AclChangeSet aclChangeSet = getAclChangeSet(1);
Acl acl = getAcl(aclChangeSet);
AclReaders aclReaders = getAclReaders(aclChangeSet, acl, singletonList("joel"), singletonList("phil"), null);
indexAclChangeSet(aclChangeSet, singletonList(acl), singletonList(aclReaders));
//Check for the ACL state stamp.
BooleanQuery.Builder builder =
new BooleanQuery.Builder()
.add(new BooleanClause(new TermQuery(new Term(QueryConstants.FIELD_SOLR4_ID, "TRACKER!STATE!ACLTX")), BooleanClause.Occur.MUST))
.add(new BooleanClause(LegacyNumericRangeQuery.newLongRange(
QueryConstants.FIELD_S_ACLTXID, aclChangeSet.getId(), aclChangeSet.getId() + 1, true, false), BooleanClause.Occur.MUST));
Query waitForQuery = builder.build();
waitForDocCount(waitForQuery, 1, MAX_WAIT_TIME);
return acl;
}
/**
* Default data is indexed in solr.
* 1 folder node with 2 children nodes.
* 1 Child is on the same shard of the parent folder (shard 0) while the other is on shard 1.
*/
private void indexData() throws Exception
{
AclChangeSet aclChangeSet = getAclChangeSet(1);
Acl acl = getAcl(aclChangeSet);
AclReaders aclReaders = getAclReaders(aclChangeSet, acl, singletonList("joel"), singletonList("phil"), null);
indexAclChangeSet(aclChangeSet,
of(acl),
of(aclReaders));
indexTestData(acl);
}
public void indexTestData(Acl acl) throws Exception
{
Transaction txn = getTransaction(0, 3);
/*
* Create node1 in the first shard
*/
Node node1 = getNode(0, txn, acl, Node.SolrApiNodeStatus.UPDATED);
node1.setExplicitShardId(0);
NodeMetaData nodeMetaData1 = getNodeMetaData(node1, txn, acl, "elia", null, false);
nodeMetaData1.getProperties().put(ContentModel.PROP_NAME, new StringPropertyValue("first"));
/*
* Create node2 in the second shard
*/
Node node2 = getNode(1, txn, acl, Node.SolrApiNodeStatus.UPDATED);
node2.setExplicitShardId(1);
NodeMetaData nodeMetaData2 = getNodeMetaData(node2, txn, acl, "elia", null, false);
nodeMetaData2.getProperties().put(ContentModel.PROP_NAME, new StringPropertyValue("second"));
/*
* Create node3 with no explicitShardId
*/
Node node3 = getNode(2, txn, acl, Node.SolrApiNodeStatus.UPDATED);
NodeMetaData nodeMetaData3 = getNodeMetaData(node3, txn, acl, "elia", null, false);
nodeMetaData3.getProperties().put(ContentModel.PROP_NAME, new StringPropertyValue("third"));
/*
* Create node4 in second shard
*/
Node node4 = getNode(3, txn, acl, Node.SolrApiNodeStatus.UPDATED);
node4.setExplicitShardId(1);
NodeMetaData nodeMetaData4 = getNodeMetaData(node4, txn, acl, "elia", null, false);
nodeMetaData4.getProperties().put(ContentModel.PROP_NAME, new StringPropertyValue("second"));
nodeMetaData4.getProperties().put(ContentModel.PROP_NAME, new StringPropertyValue("forth"));
/*
* Create node5 in shard 4 (which does not exist)
*/
Node node5 = getNode(4, txn, acl, Node.SolrApiNodeStatus.UPDATED);
node5.setExplicitShardId(4);
NodeMetaData nodeMetaData5 = getNodeMetaData(node5, txn, acl, "elia", null, false);
nodeMetaData5.getProperties().put(ContentModel.PROP_NAME, new StringPropertyValue("fifth"));
indexTransaction(txn,
of(node1, node2, node3, node4, node5),
of(nodeMetaData1, nodeMetaData2, nodeMetaData3, nodeMetaData4, nodeMetaData5));
/*
* Get sure the nodes are indexed correctly in the shards
*/
waitForShardsCount(params("q", "cm:content:world", "qt", "/afts"), 3, timeout, System.currentTimeMillis());
}
@Test
public void testNodesShouldBeIndexedInSpecifiedShard() throws Exception
{
assertShardCount(0, params("q", "cm:content:world", "qt", "/afts"), 1);
assertShardCount(1, params("q", "cm:content:world", "qt", "/afts"), 2);
assertShardCount(0, params("q", "cm:name:first", "qt", "/afts", "shards", shards), 1);
// The third node shouldn't be found because no explicit shard is set
assertShardCount(0, params("q", "cm:name:third", "qt", "/afts", "shards", shards), 0);
// The fifth node shouldn't be found because the explicit shard set is not running
assertShardCount(0, params("q", "cm:name:fifth", "qt", "/afts", "shards", shards), 0);
}
protected static Properties getProperties()
{
Properties prop = new Properties();
prop.put("shard.method", ShardMethodEnum.LAST_REGISTERED_INDEXING_SHARD.toString());
return prop;
}
}


@@ -22,7 +22,7 @@
 </distributionManagement>
 <properties>
-    <dependency.alfresco-data-model.version>8.32</dependency.alfresco-data-model.version>
+    <dependency.alfresco-data-model.version>8.34</dependency.alfresco-data-model.version>
     <dependency.jackson.version>2.9.9</dependency.jackson.version>
 </properties>


@@ -1,28 +1,28 @@
 /*
  * #%L
  * Alfresco Solr Client
  * %%
  * Copyright (C) 2005 - 2016 Alfresco Software Limited
  * %%
  * This file is part of the Alfresco software.
  * If the software was purchased under a paid Alfresco license, the terms of
  * the paid license agreement will prevail. Otherwise, the software is
  * provided under the following open source license terms:
  *
  * Alfresco is free software: you can redistribute it and/or modify
  * it under the terms of the GNU Lesser General Public License as published by
  * the Free Software Foundation, either version 3 of the License, or
  * (at your option) any later version.
  *
  * Alfresco is distributed in the hope that it will be useful,
  * but WITHOUT ANY WARRANTY; without even the implied warranty of
  * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
  * GNU Lesser General Public License for more details.
  *
  * You should have received a copy of the GNU Lesser General Public License
  * along with Alfresco. If not, see <http://www.gnu.org/licenses/>.
  * #L%
  */
 package org.alfresco.solr.client;

 import java.util.List;
@@ -48,10 +48,11 @@ public class GetNodesParameters
     private Set<QName> excludeNodeTypes;
     private Set<QName> includeAspects;
     private Set<QName> excludeAspects;
     private QName shardProperty;
+    private String coreName;

     public boolean getStoreFilter()
     {
         return (storeProtocol != null || storeIdentifier != null);
@@ -145,17 +146,27 @@ public class GetNodesParameters
     public void setExcludeAspects(Set<QName> excludeAspects)
     {
         this.excludeAspects = excludeAspects;
     }

     public QName getShardProperty()
     {
         return this.shardProperty;
     }

     public void setShardProperty(QName shardProperty)
     {
         this.shardProperty = shardProperty;
     }
+
+    public String getCoreName()
+    {
+        return this.coreName;
+    }
+
+    public void setCoreName(String coreName)
+    {
+        this.coreName = coreName;
+    }
 }


@@ -1,28 +1,28 @@
 /*
  * #%L
  * Alfresco Solr Client
  * %%
  * Copyright (C) 2005 - 2016 Alfresco Software Limited
  * %%
  * This file is part of the Alfresco software.
  * If the software was purchased under a paid Alfresco license, the terms of
  * the paid license agreement will prevail. Otherwise, the software is
  * provided under the following open source license terms:
  *
  * Alfresco is free software: you can redistribute it and/or modify
  * it under the terms of the GNU Lesser General Public License as published by
  * the Free Software Foundation, either version 3 of the License, or
  * (at your option) any later version.
  *
  * Alfresco is distributed in the hope that it will be useful,
  * but WITHOUT ANY WARRANTY; without even the implied warranty of
  * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
  * GNU Lesser General Public License for more details.
  *
  * You should have received a copy of the GNU Lesser General Public License
  * along with Alfresco. If not, see <http://www.gnu.org/licenses/>.
  * #L%
  */
 package org.alfresco.solr.client;

 public class Node
@@ -37,8 +37,9 @@ public class Node
     private long txnId;
     private SolrApiNodeStatus status;
     private String tenant;
     private long aclId;
     private String shardPropertyValue;
+    private Integer explicitShardId;

     public long getId()
     {
@@ -99,32 +100,44 @@ public class Node
     public void setAclId(long aclId)
     {
         this.aclId = aclId;
     }

     /**
      * The property value to use for sharding - as requested
      *
      * @return null - if the node does not have the property, the standard "String" value of the property if it is present on the node.
      * For dates and datetime properties this will be the ISO formatted datetime.
      */
     public String getShardPropertyValue()
     {
         return this.shardPropertyValue;
     }

     public void setShardPropertyValue(String shardPropertyValue)
     {
         this.shardPropertyValue = shardPropertyValue;
     }

+    public Integer getExplicitShardId()
+    {
+        return this.explicitShardId;
+    }
+
+    public void setExplicitShardId(Integer explicitShardId)
+    {
+        this.explicitShardId = explicitShardId;
+    }
+
     @Override
     public String toString()
     {
-        return "Node [id=" + this.id + ", nodeRef=" + this.nodeRef + ", txnId=" + this.txnId
-            + ", status=" + this.status + ", tenant=" + this.tenant + ", aclId="
-            + this.aclId + ", shardPropertyValue=" + this.shardPropertyValue + "]";
+        return "Node [id=" + this.id + ", nodeRef=" + this.nodeRef + ", txnId=" + this.txnId
+            + ", status=" + this.status + ", tenant=" + this.tenant + ", aclId="
+            + this.aclId + ", shardPropertyValue=" + this.shardPropertyValue
+            + ", explicitShardId=" + this.explicitShardId + "]";
     }
 }
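For context, a node entry in the GetNodes JSON response parsed by SOLRAPIClient can now carry either routing hint. This sample payload is illustrative only — the concrete values (node ref, status string, ids) are made up, and only `shardPropertyValue` and `explicitShardId` are the fields introduced or relevant here:

```json
{
  "id": 1234,
  "nodeRef": "workspace://SpacesStore/0d3b26ff-c4c1-4680-8622-8a0b65f00e77",
  "txnId": 57,
  "aclId": 42,
  "shardPropertyValue": "2019-06-05T00:00:00.000Z",
  "explicitShardId": 2
}
```

Both fields are optional: the client only calls the corresponding setter when the key is present in the JSON, as the SOLRAPIClient hunk further down shows.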


@@ -610,6 +610,11 @@ public class SOLRAPIClient
         {
             body.put("shardProperty", parameters.getShardProperty().toString());
         }
+        if (parameters.getCoreName() != null)
+        {
+            body.put("coreName", parameters.getCoreName());
+        }

         PostRequest req = new PostRequest(url.toString(), body.toString(), "application/json");
@@ -668,6 +673,11 @@ public class SOLRAPIClient
         if(jsonNodeInfo.has("shardPropertyValue"))
         {
             nodeInfo.setShardPropertyValue(jsonNodeInfo.getString("shardPropertyValue"));
         }
+        if(jsonNodeInfo.has("explicitShardId"))
+        {
+            nodeInfo.setExplicitShardId(jsonNodeInfo.getInt("explicitShardId"));
+        }
         if(jsonNodeInfo.has("tenant"))