alfresco-community-repo/source/java/org/alfresco/repo/node/index/IndexTransactionTracker.java
Dave Ward a7b885a1c6 Merged V3.4-BUG-FIX to HEAD
29057: ALF-9491: Bitrock 7.2.2
   29063: ALF-8766 Concatenated strings in EN webclient file
   29066: Merge DEV/DEV/BELARUS/V3.4-BUG-FIX-2011_07_13 to DEV/V3.4-BUG-FIX
      29010: ALF-7396: Japanese- Untranslated
   29072: HomeFolderProvider work - Changes as a result of Dave Ward's comments
     (HomeFolderManager not fully done as there is a spring issue with using NodeService, FileFolderService, fileFolderService, SearchService or searchService) 
   29074: ALF-7637 - Share displays incorrect folder contents after copy-on-outbound rule against working copy
   29075: ALF-8406 - Configuring the datalist display for sub-types does not work
   29082: ALF-6847 translation: "Collega" should be reverted to English: "Link" as per term list.
   29087: ALF-5717 property names for wcm quickstart website-model had an invalid format or did not end in .description or .title
   29093: Merge V3.3 to DEV/V3.4-BUG-FIX (28596)
      28596: Remove dependency between subsystems and all the object factories in the parent context!
         - Do not allow eager initialization when looking up parent post processors
         - Removes circular dependencies from sysAdmin subsystem
   29094: Merge HEAD to DEV/V3.4-BUG-FIX ()
      28892: Broke circular references between NodeService beans, NodeIndexer, Lucene and back to NodeService.
         - NodeIndexer is now bootstrapped to pull out reference to the Lucene beans
   29100: Revert Merge V3.3 to DEV/V3.4-BUG-FIX (28596) Caused RepositoryStartupTest to fail 
      28596: Remove dependency between subsystems and all the object factories in the parent context!
         - Do not allow eager initialization when looking up parent post processors
         - Removes circular dependencies from sysAdmin subsystem
   29102: ALF-9048: Make apply_amps.bat work from its installed location
   29103: ALF-8746: Restored Japanese choice format translations
   29104: Merged V3.3 to V3.4-BUG-FIX (Reinstated this revision as it is required)
      28596: Remove dependency between subsystems and all the object factories in the parent context!
         - Do not allow eager initialization when looking up parent post processors
         - Removes circular dependencies from sysAdmin subsystem
   29105: Use org.springframework.aop.target.LazyInitTargetSource in the NodeService public proxy to break a circular dependency
   29106: Make PersonService interact with HomeFolderManager via a lazy proxy to prevent another circular dependency
   - Simple HomeFolderManager interface created
   - Implementation class renamed to PortableHomeFolderManager
   - Removed TODOs from authentication-services-context.xml
   29107: Forgot to remove the serviceRegistry dependency from homeFolderManager
   29108: ALF-9529: Installer memory consumption and startup time improvements
   - Bitrock discover the for loop!
   29109: ALF-9530: Postgres installed as Windows service should run as a postgres user, not System
   - Fix from Bitrock
   29118: Fix for ALF-6737 - It's impossible to view any version of the wiki page if it was renamed with native characters
   29119: Fix for ALF-5787 - strings extracted for L10N in Web form creation help text
   29124: ALF-9530: Follow up fix from Bitrock
   29126: Fix for ALF-8344 - Incorrect message is displayed while recover deleted file
   29127: Fix for ALF-9445 - French - Share, translation on Transfer Target configuration
   29129: ALF-9476: Make FTPS work on IBM JDK
   29133: Fix failing DictionaryRestApiTest
   29136: Fix build issues from 29104:
   - run as system when creating home folders (PortableHomeFolderManager)
   - re-factored onCreateNode out of PortableHomeFolderManager into PersonServiceImpl
   - re-factored property PortableHomeFolderManager.enableHomeFolderCreationAsPeopleAreCreated to PersonServiceImpl.homeFolderCreationEager 
   29137: Fix for ALF-8831 - Internal error occurs in My Tasks Webscripts component
   29138: Fix for ALF-8765 - Layout is displaced if translated string occupies more than 1 line
   29140: Fix for ALF-8668 - Deleting author account causes Failed to load the forum posts
   29142: - PortableHomeFolderManager: Moved code to run as System into PersonServiceImpl so that one must have a valid authority to call the publick makeHomeFolder method. The authority should already be valid if called via PersonServiceImpl.
   - Removed unused policyBehaviourFilter property from PersonServiceImpl
   29146: ALF-8701: partially translated string in html-upload.get_fr
   29147: ALF-8727: DE - changes to Root Category
   29149: ALF-8731: DE - Wiki changes (space before full stop)
   29152: ALF-9503: Add space after colon in strings in file wdr-messages.properties
   29153: Fixed ALF-7899: association.ftl does not render when showTargetLink=true in workflow
   29165: ALF-8749: on submit action properties in wcn-workflow-messages.properties
   29166: Fix for ALF-6220 - Language pack - .ftl localization
   29167: ALF-9550 - Typos in new section of webclient.properties
   29169: Fix for ALF-7844 - W3C: Impossible to activate 'Choose from popular tags in this site' link by Enter/Space keys
   29170: Merge V3.4-TEAM to V3.4-BUG-FIX (3.4.4)
      27471: Fix for ALF-8150 - check for visibility before applying focus to element for IE.
   29171: Fixes: ALF-8922, removes date formatting from API (now returns ISO8601) and instead formats it on the client, using L10N strings.
   29172: Fix for ALF-2023 - Repository Action - Copy item to specific space doesn not include content. The option to 'deep copy' is now exposed in the UI for Run Action and Rules in Explorer.
   29173: Fix for ALF-1446 - Sorting of inline descendants is not observed
   29175: ALF-241 - The item is not coppied via 'Paste All' in Shelf when 'Details' page is opened
   29177: Fix for ALF-9520 - confusing sample config. Reordered sample config file as suggested.
   29178: Fixed ALF-6400: GERMAN: Explorer mouse over hints for TinyMCE are not localized
   Fixed ALF-5766: ALL translations errors in Explorer - Calendars are not localizable for content based on webforms
   29202: Merge DEV/BELARUS/V3.4-BUG-FIX-2011_04_12 to V3.4-BUG-FIX (3.4.4)
      27836: ALF-8524: CLONE - Sharepoint doesn't work with HTTPS
         Changes in url links required for HTTPS support.
   29203: Restored removal of postgresCreateSymLinksLinuxBuildingFromWindows tag (32 bit Linux) from revision 26582
   29211: Fix for ALF-1051 - It is impossible to find link by tag from link details page
   29212: Fix for ALF-5301 - TinyMCE is replacing carriage return with white spaces
   29250: Latest L10N update for all languages (DE, ES, FR, IT, JA) from Gloria (based on r29172)
   29253: L10N Update from Gloria
   29270: Fixed ALF-516: Unable to add content/delete tables in webform content when using FireFox
   29271: Update from Gloria
   29272: Merged BRANCHES/DEV/BELARUS/V3.4-BUG-FIX-2011_07_13 to BRANCHES/DEV/V3.4-BUG-FIX: (with minor modification)
      29223: ALF-7619: When document A has an association with a document B editing A's properties fails if user has no permission to edit B
   29274: ALF-9517 Incorrect behaviour of versions on Copy action. Version is 0.1 rather than 1.0
   29283: Resolve ALF-8273: Valid datetime value cannot be parsed by CMIS AtomPub interface
   29284: Update from Gloria
   29286: ALF-9596: Merged PATCHES/V3.4.1 to V3.4-BUG-FIX
      28150: ALF-8607: Detailed debug logging when out of sync transaction detected by index checker / tracker
      28177: ALF-8607: Corrections to debug logging in AbstractReindexComponent
      28213: ALF-8607: Further corrections to debug logging in AbstractReindexComponent
      - Log attributes from indexes, rather than nodeService properties
      28341: ALF-8607: Stop index checker from 'lying'
      - isTxnPresentInIndex() call must be made in a new transaction in order to get a database view in sync with the current indexes
      28352: ALF-8607: Revisit transaction delineation. Nested transaction only required in checkTxnsImpl()
      28403: ALF-8607: Merged PATCHES/V3.3.4 to PATCHES/V3.4.1
         27823: ALF-7237: Index tracker needs to perform a cascade reindex on updated nodes in order to cope with node moves
      28406: ALF-8607: Improvement to FTS fix. Prevent FTS from restoring documents that have been deleted!
      28412: ALF-8607: Invalidate properties and aspects as well as parent assocs when stale cache entry dected during transaction tracking
      28427: ALF-8607: Prevent NPE with bad NodeRef in ADMLuceneIndexerImpl.createDocumentsImpl()
      28705: ALF-8607: Validate transaction IDs when fetching parent associations
      - Compare the cached child node transaction ID against one fetched from the DB
      - Stops us from pairing up the cached node for an older or newer transaction with the wrong parent associations
      28707: ALF-8607: Merged PATCHES/V3.3.4 to PATCHES/V3.4.1
         28588: ALF-7237: Prevent FTS from ever wiping out a document that still exists and ignore duplicates
      28708: ALF-8607: Make FTS capable of recovering from cache concurrency issues by using a RetryingTransactionHelper and better exception handling.
      - Also avoids skipping the entire batch when the reindexing of a particular document fails.
      28710: ALF-8607: Corrected transaction delineation
      28753: ALF-8607: Prevent errors caused by AbstractReindexComponent diagnostics trying to parse FTSREF document IDs as NodeRefs (which they aren't!)
      28755: ALF-8607: When 'failing over' during FTS indexing, don't bother adding a FTS status document so we don't get stuck in a loop with a problematic document
      28815: ALF-8607: Do two way validation of cached / fetched nodes and their parent associations to avoid skew
      - Should resolve problem of tracking moves to the archive store and moves in general
      28862: ALF-8607: Lucene indexers now support 'read through' behaviour for FTS and Index tracking batches
      - Small discrete read only transactions used to read each reindexed node from the database / cache
      - Avoids cache 'drift' and 'skew' after long running indexing transactions
      28863: ALF-8607: Missing file
      28869: ALF-8607: isTxnPresentInIndex() needs to 'read through' so index tracker and checker don't pollute the cache
      28872: ALF-8607: Optimization to prevent constant writing to AVM indexes whilst 'ticking over'.
      28950: ALF-8607: Improved logic in AbstractReindexComponent.isTxnPresentInIndex() so that we can reliably cope with multi-store transactions (e.g. archive store + spaces store)
      - Due to FTS, the txn ID may have 'drifted in' to one store but not the other so we must validate all stores in the txn
      29098: ALF-8607: Use getNodeRefStatus as a cache validation point for reindexing 'read through' transactions
      - Guarantees that FTS reindexed node will see correct state (well if we had consistent read behaviour it would!)
      - Removes stale nodeRef -> ID mappings (e.g. when original node moved to archive store and substituted with deleted node)
      - Inexplicably seems to produce a ~30x speedup in performance tests on MySQL! Appears to remove a contention point. More investigation required to find out what!
   29287: ALF-9598: Merged PATCHES/V3.4.1 to V3.4-BUG-FIX
      28653: ALF-9189: More efficient usage of IndexReaders to avoid huge memory churn with large indexes
      - A single reading thread could block out all other reading threads because a write lock is obtained whilst constructing a set of FilterIndexReaderByStringId readers and all deletions across all indexes have to be evaluated. We now cache a FilterIndexReaderByStringId for each 'layer' of the index so that we get some reuse. We also defer evaluation of deletions to AFTER the write lock is returned and in some cases never have to evaluate the deletions at all.
      - When merging deletions we now make use of a cached index reader for locating the documents, and only resort to a new reader if deletions have to be performed. Hopefully this will mean that the reader for the largest indexes, containing the least recently used stuff, will get left alone most of the time. 
      28690: ALF-9189: Corrections to previous fix
      - Forgot to remove non-lazy reader initialization
      - Fixed NPE
      - Reinstated correct looping behaviour - each processed delta must be considered as one of the indexes to search for the next processed delta
      29099: ALF-9189: Avoid having to allocate a byte array full of number ones for all occurrences of a term to 'fake' norms.
      - Severe Lucene memory hog during FTS
      29262: ALF-9189: Fixed memory leak during index tracking / reindexing and further memory leak regression
      - Fixed up Lucene refcounting again - remember to propagate through decrefs on ReferenceCounting readers
      - Refined ALF-9189 fix to guarantee mainreader clean up
      - Remember to flush the delta during reindexing / tracking
      - Some extra trace diagnostics to help
   29288: ALF-9600: Merged PATCHES/V3.4.1 to V3.4-BUG_FIX
      28876: ALF-9041: Merged HEAD to PATCHES/V3.4.1
         28850: Latest SpringSurf libs
            - Fix to SSO connector passing empty username
   29289: ALF-8241: assemble-tomcat populates endorsed directory with xalan.jar and serializer.jar and Bitrock installer installs these too
   29291: Merged DEV/SWIFT to V3.4-BUG-FIX (3.4.4) - already merged to HEAD as part of a larger merge
      26104: RM: Remove incomplete and unnecessary unit test     
   29302: Fix for ALF-8885 - Unable to paste item due to system error:null


git-svn-id: https://svn.alfresco.com/repos/alfresco-enterprise/alfresco/HEAD/root@29325 c4b6b30b-aa2e-2d43-bbcb-ca4b014f7261
2011-07-25 19:32:17 +00:00

770 lines
30 KiB
Java

/*
* Copyright (C) 2005-2010 Alfresco Software Limited.
*
* This file is part of Alfresco
*
* Alfresco is free software: you can redistribute it and/or modify
* it under the terms of the GNU Lesser General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* Alfresco is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public License
* along with Alfresco. If not, see <http://www.gnu.org/licenses/>.
*/
package org.alfresco.repo.node.index;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Date;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import org.alfresco.error.AlfrescoRuntimeException;
import org.alfresco.repo.domain.node.Transaction;
import org.alfresco.repo.transaction.RetryingTransactionHelper;
import org.alfresco.repo.transaction.RetryingTransactionHelper.RetryingTransactionCallback;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.springframework.extensions.surf.util.ISO8601DateFormat;
/**
* Component to check and recover the indexes.
*
* @author Derek Hulley
*/
public class IndexTransactionTracker extends AbstractReindexComponent
{
private static Log logger = LogFactory.getLog(IndexTransactionTracker.class);
private IndexTransactionTrackerListener listener;
private NodeIndexer nodeIndexer;
private long maxTxnDurationMs;
private long reindexLagMs;
private int maxRecordSetSize;
private int maxTransactionsPerLuceneCommit;
private boolean disableInTransactionIndexing;
private boolean started;
private List<Long> previousTxnIds;
private Long lastMaxTxnId;
private long fromTimeInclusive;
private Map<Long, TxnRecord> voids;
private boolean forceReindex;
private long fromTxnId;
private String statusMsg;
private static final String NO_REINDEX = "No reindex in progress";
/**
* Set the defaults.
* <ul>
* <li><b>Maximum transaction duration:</b> 1 hour</li>
* <li><b>Reindex lag:</b> 1 second</li>
* <li><b>Maximum recordset size:</b> 1000</li>
* <li><b>Maximum transactions per Lucene commit:</b> 100</li>
* <li><b>Disable in-transaction indexing:</b> false</li>
* </ul>
*/
public IndexTransactionTracker()
{
maxTxnDurationMs = 3600L * 1000L;
reindexLagMs = 1000L;
maxRecordSetSize = 1000;
maxTransactionsPerLuceneCommit = 100;
disableInTransactionIndexing = false;
started = false;
previousTxnIds = Collections.<Long>emptyList();
lastMaxTxnId = Long.MAX_VALUE;
fromTimeInclusive = -1L;
voids = new TreeMap<Long, TxnRecord>();
forceReindex = false;
fromTxnId = 0L;
statusMsg = NO_REINDEX;
}
public synchronized void setListener(IndexTransactionTrackerListener listener)
{
this.listener = listener;
}
public void setNodeIndexer(NodeIndexer nodeIndexer)
{
this.nodeIndexer = nodeIndexer;
}
/**
* Set the expected maximum duration of transaction supported. This value is used to adjust the
* look-back used to detect transactions that committed. Values must be greater than zero.
*
* @param maxTxnDurationMinutes the maximum length of time a transaction will take in minutes
*
* @since 1.4.5, 2.0.5, 2.1.1
*/
public void setMaxTxnDurationMinutes(long maxTxnDurationMinutes)
{
if (maxTxnDurationMinutes < 1)
{
throw new AlfrescoRuntimeException("Maximum transaction duration must be at least one minute.");
}
this.maxTxnDurationMs = maxTxnDurationMinutes * 60L * 1000L;
}
/**
* Transaction tracking should lag by the average commit time for a transaction. This will minimize
* the number of holes in the transaction sequence. Values must be greater than zero.
*
* @param reindexLagMs the minimum age of a transaction to be considered by
* the index transaction tracking
*
* @since 1.4.5, 2.0.5, 2.1.1
*/
public void setReindexLagMs(long reindexLagMs)
{
if (reindexLagMs < 1)
{
throw new AlfrescoRuntimeException("Reindex lag must be at least 1 millisecond.");
}
this.reindexLagMs = reindexLagMs;
}
/**
* Set the number of transactions to request per query.
*/
public void setMaxRecordSetSize(int maxRecordSetSize)
{
this.maxRecordSetSize = maxRecordSetSize;
}
/**
* Set the number of transactions to process per Lucene write.
* Larger values generate less contention on the Lucene IndexInfo files.
*/
public void setMaxTransactionsPerLuceneCommit(int maxTransactionsPerLuceneCommit)
{
this.maxTransactionsPerLuceneCommit = maxTransactionsPerLuceneCommit;
}
/**
* Enable or disabled in-transaction indexing. Under certain circumstances, the system
* can run with only index tracking enabled - in-transaction indexing is not always
* required. The {@link NodeIndexer} is disabled when this component initialises.
*/
public void setDisableInTransactionIndexing(boolean disableInTransactionIndexing)
{
this.disableInTransactionIndexing = disableInTransactionIndexing;
}
/**
* @return Returns <tt>false</tt> always. Transactions are handled internally.
*/
@Override
protected boolean requireTransaction()
{
return false;
}
/** Worker callback for transactional use */
RetryingTransactionCallback<Long> getStartingCommitTimeWork = new RetryingTransactionCallback<Long>()
{
public Long execute() throws Exception
{
return getStartingTxnCommitTime();
}
};
/** Worker callback for transactional use */
RetryingTransactionCallback<Boolean> reindexWork = new RetryingTransactionCallback<Boolean>()
{
public Boolean execute() throws Exception
{
return reindexInTransaction();
}
};
public void resetFromTxn(long txnId)
{
if (logger.isInfoEnabled())
{
logger.info("resetFromTxn: " + txnId);
}
this.fromTxnId = txnId;
this.started = false; // this will cause index tracker to break out (so that it can be re-started)
}
@Override
protected void reindexImpl()
{
if (logger.isInfoEnabled())
{
logger.info("reindexImpl started: " + this);
}
RetryingTransactionHelper retryingTransactionHelper = transactionService.getRetryingTransactionHelper();
if (!started)
{
// Disable in-transaction indexing
if (disableInTransactionIndexing && nodeIndexer != null)
{
logger.warn("In-transaction indexing is being disabled.");
nodeIndexer.setEnabled(false);
}
// Make sure that we start clean
voids.clear();
previousTxnIds = new ArrayList<Long>(maxRecordSetSize);
lastMaxTxnId = null; // So that it is ignored at first
if (this.fromTxnId != 0L)
{
if (logger.isInfoEnabled())
{
logger.info("reindexImpl: start fromTxnId: " + fromTxnId);
}
Long fromTxnCommitTime = getTxnCommitTime(this.fromTxnId);
if (fromTxnCommitTime == null)
{
return;
}
fromTimeInclusive = fromTxnCommitTime;
}
else
{
fromTimeInclusive = retryingTransactionHelper.doInTransaction(getStartingCommitTimeWork, true, true);
}
fromTxnId = 0L;
started = true;
if (logger.isInfoEnabled())
{
logger.info(
"reindexImpl: start fromTimeInclusive: " +
ISO8601DateFormat.format(new Date(fromTimeInclusive)));
}
}
while (true)
{
Boolean repeat = retryingTransactionHelper.doInTransaction(reindexWork, true, true);
// Only break out if there isn't any more work to do (for now)
if (repeat == null || repeat.booleanValue() == false)
{
break;
}
}
// Wait for the asynchronous reindexing to complete
waitForAsynchronousReindexing();
if (logger.isTraceEnabled())
{
logger.trace("reindexImpl: completed: "+this);
}
statusMsg = NO_REINDEX;
}
private Long getTxnCommitTime(final long txnId)
{
RetryingTransactionHelper retryingTransactionHelper = transactionService.getRetryingTransactionHelper();
RetryingTransactionCallback<Long> getTxnCommitTimeWork = new RetryingTransactionCallback<Long>()
{
public Long execute() throws Exception
{
Transaction txn = nodeDAO.getTxnById(txnId);
if (txn != null)
{
return txn.getCommitTimeMs();
}
logger.warn("Txn not found: "+txnId);
return null;
}
};
return retryingTransactionHelper.doInTransaction(getTxnCommitTimeWork, true, true);
}
/**
* @return Returns <tt>true</tt> if the reindex process can exit otherwise <tt>false</tt> if
* a new transaction should be created and the process kicked off again
*/
private boolean reindexInTransaction()
{
List<Transaction> txns = null;
long toTimeExclusive = System.currentTimeMillis() - reindexLagMs;
// Check that the voids haven't been filled
long minLiveVoidTime = checkVoids();
if (minLiveVoidTime <= fromTimeInclusive)
{
// A void was discovered.
// We need to adjust the search time for transactions, i.e. hop back in time but
// this also entails a full build from that point on. So all previous transactions
// need to be reindexed.
fromTimeInclusive = minLiveVoidTime;
previousTxnIds.clear();
}
// get next transactions to index
txns = getNextTransactions(fromTimeInclusive, toTimeExclusive, previousTxnIds);
// If there are no transactions, then all the work is done
if (txns.size() == 0)
{
// We have caught up.
// There is no need to force reindexing until the next unindex transactions appear.
forceReindex = false;
return false;
}
statusMsg = String.format(
"Reindexing batch of %d transactions from %s (txnId=%s)",
txns.size(),
(new Date(fromTimeInclusive)).toString(),
txns.isEmpty() ? "---" : txns.get(0).getId().toString());
if (logger.isDebugEnabled())
{
logger.debug(statusMsg);
}
// Reindex the transactions. Voids between the last set of transactions and this
// set will be detected as well. Additionally, the last max transaction will be
// updated by this method.
long maxProcessedTxnCommitTime = reindexTransactions(txns);
// Call the listener
synchronized (this)
{
if (listener != null)
{
listener.indexedTransactions(fromTimeInclusive, maxProcessedTxnCommitTime);
}
}
// Move the time on.
// The next fromTimeInclusive may well pull back transactions that have just been
// processed. But we keep track of those and exclude them from the results.
if (fromTimeInclusive == maxProcessedTxnCommitTime)
{
// The time didn't advance. If no new transaction appear, we could spin on
// two or more transactions with the same commit time. So we DON'T clear
// the list of previous transactions and we allow them to live on.
}
else
{
// The processing time has moved on
fromTimeInclusive = maxProcessedTxnCommitTime;
previousTxnIds.clear();
}
for (Transaction txn : txns)
{
previousTxnIds.add(txn.getId());
}
if (isShuttingDown() || (! started))
{
// break out if the VM is shutting down or tracker has been reset (ie. !started)
return false;
}
else
{
// There is more work to do and we should be called back right away
return true;
}
}
public String getReindexStatus()
{
return statusMsg;
}
private static final long ONE_HOUR_MS = 3600*1000;
/**
* Find a transaction time to start indexing from (inclusive). The last recorded transaction by ID
* is taken and the max transaction duration substracted from its commit time. A transaction is
* retrieved for this time and checked for indexing. If it is present, then that value is chosen.
* If not, a step back in time is taken again. This goes on until there are no more transactions
* or a transaction is found in the index.
*/
protected long getStartingTxnCommitTime()
{
long now = System.currentTimeMillis();
// Get the last indexed transaction for all transactions
long lastIndexedAllCommitTimeMs = getLastIndexedCommitTime(now, false);
// Now check back from this time to make sure there are no remote transactions that weren't indexed
long lastIndexedRemoteCommitTimeMs = getLastIndexedCommitTime(now, true);
// The one to start at is the least of the two times
long startTime = Math.min(lastIndexedAllCommitTimeMs, lastIndexedRemoteCommitTimeMs);
// Done
// Make sure we recheck any voids
return startTime - maxTxnDurationMs;
}
/**
* Gets the commit time for the last indexed transaction. If there are no transactions, then the
* current time is returned.
*
* @param maxCommitTimeMs the largest commit time to consider
* @param remoteOnly <tt>true</tt> to only look at remotely-committed transactions
* @return Returns the last indexed transaction commit time for all or
* remote-only transactions.
*/
private long getLastIndexedCommitTime(long maxCommitTimeMs, boolean remoteOnly)
{
// Look back in time by the maximum transaction duration
long maxToTimeExclusive = maxCommitTimeMs - maxTxnDurationMs;
long toTimeExclusive = maxToTimeExclusive;
long fromTimeInclusive = 0L;
double stepFactor = 1.0D;
boolean firstWasInIndex = true;
found:
while (true)
{
// Get the most recent transaction before the given look-back
List<Transaction> nextTransactions = nodeDAO.getTxnsByCommitTimeDescending(
0L,
toTimeExclusive,
1,
null,
remoteOnly);
// There are no transactions in that time range
if (nextTransactions.size() == 0)
{
break found;
}
// We found a transaction
Transaction txn = nextTransactions.get(0);
long txnCommitTime = txn.getCommitTimeMs();
// Check that it is in the index
InIndex txnInIndex = isTxnPresentInIndex(txn);
switch (txnInIndex)
{
case YES:
fromTimeInclusive = txnCommitTime;
break found;
case INDETERMINATE:
// If we hit an indeterminate transaction we go back a small amount to try and hit something definitive before a bigger step back
firstWasInIndex = false;
toTimeExclusive = txnCommitTime - 1000;
continue;
default:
firstWasInIndex = false;
// Look further back in time. Step back by 60 seconds each time, increasing
// the step by 10% each iteration.
// Don't step back by more than an hour
long decrement = Math.min(ONE_HOUR_MS, (long) (60000.0D * stepFactor));
toTimeExclusive = txnCommitTime - decrement;
stepFactor *= 1.1D;
continue;
}
}
// If the last transaction (given the max txn duration) was in the index, then we used the
// maximum commit time i.e. the indexes were up to date up until the most recent time.
if (firstWasInIndex)
{
return maxToTimeExclusive;
}
else
{
return fromTimeInclusive;
}
}
private static final int VOID_BATCH_SIZE = 100;
/**
* Voids - otherwise known as 'holes' - in the transaction sequence are timestamped when they are
* discovered. This method discards voids that were timestamped before the given date. It checks
* all remaining voids, passing back the transaction time for the newly-filled void. Otherwise
* the value passed in is passed back.
*
* @return Returns an adjused start position based on any voids being filled
* or <b>Long.MAX_VALUE</b> if no new voids were found
*/
private long checkVoids()
{
long maxHistoricalTime = (fromTimeInclusive - maxTxnDurationMs);
long fromTimeAdjusted = Long.MAX_VALUE;
List<Long> toExpireTxnIds = new ArrayList<Long>(1);
Iterator<Long> voidTxnIdIterator = voids.keySet().iterator();
List<Long> voidTxnIdBatch = new ArrayList<Long>(VOID_BATCH_SIZE);
while (voidTxnIdIterator.hasNext())
{
Long voidTxnId = voidTxnIdIterator.next();
// Add it to the batch
voidTxnIdBatch.add(voidTxnId);
// If the batch is full or if there are no more voids, fire the query
if (voidTxnIdBatch.size() == VOID_BATCH_SIZE || !voidTxnIdIterator.hasNext())
{
if (logger.isDebugEnabled())
{
logger.debug("Checking void txn batch " + voidTxnIdBatch);
}
List<Transaction> filledTxns = nodeDAO.getTxnsByCommitTimeAscending(voidTxnIdBatch);
for (Transaction txn : filledTxns)
{
InIndex inIndex;
if (txn.getCommitTimeMs() == null) // Just coping with Hibernate mysteries
{
continue;
}
else if ((inIndex = isTxnPresentInIndex(txn, true)) != InIndex.NO)
{
if (logger.isDebugEnabled())
{
logger.debug("Expiring indexed void txn " + txn.getId() + " (inIndex = " + inIndex + ")");
}
// It is in the index so expire it from the voids.
// This can happen if void was committed locally.
toExpireTxnIds.add(txn.getId());
}
else
{
if (logger.isDebugEnabled())
{
logger.debug("Found unindexed void txn " + txn.getId() + " (inIndex = " + inIndex + ")");
}
// It's not in the index so we have a timespamp from which to kick off
// It is a bone fide first transaction. A void has been filled.
long txnCommitTimeMs = txn.getCommitTimeMs().longValue();
// If the value is lower than our current one we keep it
if (txnCommitTimeMs < fromTimeAdjusted)
{
fromTimeAdjusted = txnCommitTimeMs;
}
// The query selected them in timestamp order so there is no need to process
// the remaining transactions in this batch - we have our minimum.
break;
}
}
// Wipe the batch clean
voidTxnIdBatch.clear();
}
// Check if the void must be expired or not
TxnRecord voidTxnRecord = voids.get(voidTxnId);
if (voidTxnRecord.txnCommitTime < maxHistoricalTime)
{
if (logger.isDebugEnabled())
{
logger.debug("Expiring void txn " + voidTxnId + " ("
+ (maxHistoricalTime - voidTxnRecord.txnCommitTime) + " ms too old)");
}
// It's too late for this void whether or not it has become live
toExpireTxnIds.add(voidTxnId);
}
}
// Throw away all the expired or removable voids
int voidCountBefore = voids.size();
for (Long toRemoveTxnId : toExpireTxnIds)
{
voids.remove(toRemoveTxnId);
}
int voidCountAfter = voids.size();
if (logger.isDebugEnabled() && voidCountBefore != voidCountAfter)
{
logger.debug("Void count " + voidCountBefore + " -> " + voidCountAfter);
}
// Done
if (logger.isDebugEnabled() && fromTimeAdjusted < Long.MAX_VALUE)
{
logger.debug("Returning to void time " + fromTimeAdjusted);
}
return fromTimeAdjusted;
}
private List<Transaction> getNextTransactions(long fromTimeInclusive, long toTimeExclusive, List<Long> previousTxnIds)
{
List<Transaction> txns = nodeDAO.getTxnsByCommitTimeAscending(
fromTimeInclusive,
toTimeExclusive,
maxRecordSetSize,
previousTxnIds,
false);
// done
return txns;
}
/**
* Checks that each of the transactions is present in the index. As soon as one is found that
* isn't, all the following transactions will be reindexed. After the reindexing, the sequence
* of transaction IDs will be examined for any voids. These will be recorded.
*
* @param txns transactions ordered by time ascending
* @return returns the commit time of the last transaction in the list
* @throws IllegalArgumentException if there are no transactions
*/
private long reindexTransactions(List<Transaction> txns)
{
if (txns.isEmpty())
{
throw new IllegalArgumentException("There are no transactions to process");
}
// Determines the window for void retention
long now = System.currentTimeMillis();
long oldestVoidRetentionTime = (now - maxTxnDurationMs);
// Keep an ordered map of IDs that we process along with their commit times
Map<Long, TxnRecord> processedTxnRecords = new TreeMap<Long, TxnRecord>();
List<Long> txnIdBuffer = new ArrayList<Long>(maxTransactionsPerLuceneCommit);
Iterator<Transaction> txnIterator = txns.iterator();
while (txnIterator.hasNext())
{
Transaction txn = txnIterator.next();
Long txnId = txn.getId();
Long txnCommitTimeMs = txn.getCommitTimeMs();
if (txnCommitTimeMs == null)
{
// What? But let's be cautious and treat this as a void
continue;
}
// Keep a record of it
TxnRecord processedTxnRecord = new TxnRecord();
processedTxnRecord.txnCommitTime = txnCommitTimeMs;
processedTxnRecords.put(txnId, processedTxnRecord);
// Remove this entry from the void list - it is not void
boolean previouslyVoid = voids.remove(txnId) != null;
// Reindex the transaction if we are forcing it or if it isn't in the index already
InIndex inIndex = InIndex.INDETERMINATE;
if (forceReindex || (inIndex = isTxnPresentInIndex(txn, true)) == InIndex.NO)
{
// From this point on, until the tracker has caught up, all transactions need to be indexed
forceReindex = true;
// Add the transaction to the buffer of transactions that need processing
txnIdBuffer.add(txnId);
if (logger.isDebugEnabled())
{
if (previouslyVoid)
{
logger.debug("Reindexing previously void transaction: " + txn + " (inIndex = " + inIndex + ")");
}
else
{
logger.debug("Reindexing transaction: " + txn + " (inIndex = " + inIndex + ")");
}
}
}
else
{
if (logger.isDebugEnabled())
{
if (previouslyVoid)
{
logger.debug("Reindex skipping previously void transaction: " + txn + " (inIndex = " + inIndex
+ ")");
}
else
{
logger.debug("Reindex skipping transaction: " + txn + " (inIndex = " + inIndex + ")");
}
}
}
if (isShuttingDown() || (! started))
{
// break out if the VM is shutting down or tracker has been reset (ie. !started)
break;
}
// Flush the reindex buffer, if it is full or if we are on the last transaction and there are no more
if (txnIdBuffer.size() >= maxTransactionsPerLuceneCommit || (!txnIterator.hasNext() && txnIdBuffer.size() > 0))
{
try
{
// We try the reindex, but for the sake of continuity, have to let it run on
reindexTransactionAsynchronously(txnIdBuffer, false);
}
catch (Throwable e)
{
logger.warn("\n" +
"Reindex of transactions failed: \n" +
" Transaction IDs: " + txnIdBuffer + "\n" +
" Error: " + e.getMessage(),
e);
}
// Clear the buffer
txnIdBuffer = new ArrayList<Long>(maxTransactionsPerLuceneCommit);
}
}
// Use the last ID from the previous iteration as our starting point
Long lastId = lastMaxTxnId;
long lastCommitTime = -1L;
// Walk the processed txn IDs
for (Map.Entry<Long, TxnRecord> entry : processedTxnRecords.entrySet())
{
Long processedTxnId = entry.getKey();
TxnRecord processedTxnRecord = entry.getValue();
boolean voidsAreYoungEnough = processedTxnRecord.txnCommitTime >= oldestVoidRetentionTime;
if (lastId != null && voidsAreYoungEnough)
{
int voidCount = 0;
// Iterate BETWEEN the last ID and the current one to find voids
// Only enter the loop if the current upper limit transaction is young enough to
// consider for voids.
for (long i = lastId.longValue() + 1; i < processedTxnId; i++)
{
// The voids are optimistically given the same transaction time as transaction with the
// largest ID. We only bother w
TxnRecord voidRecord = new TxnRecord();
voidRecord.txnCommitTime = processedTxnRecord.txnCommitTime;
voids.put(new Long(i), voidRecord);
voidCount++;
}
if (logger.isDebugEnabled()&& voidCount > 0)
{
logger.debug("Voids detected: " + voidCount + " in range [" + lastId + ", " + processedTxnId + "]");
}
}
lastId = processedTxnId;
lastCommitTime = processedTxnRecord.txnCommitTime;
}
// Having searched for the nodes, we've recorded all the voids. So move the lastMaxTxnId up.
lastMaxTxnId = lastId;
// Done
return lastCommitTime;
}
private class TxnRecord
{
private long txnCommitTime;
@Override
public String toString()
{
StringBuilder sb = new StringBuilder(128);
sb.append("TxnRecord")
.append("[time=").append(txnCommitTime <= 0 ? "---" : new Date(txnCommitTime))
.append("]");
return sb.toString();
}
}
/**
* A callback that can be set to provide logging and other record keeping
*
* @author Derek Hulley
* @since 2.1.4
*/
public interface IndexTransactionTrackerListener
{
void indexedTransactions(long fromTimeInclusive, long toTimeExclusive);
}
}