Introduction
While working on different Endeca implementations, we come across scenarios where the indexed data does not contain the latest changes or is sometimes corrupted. Apart from these, there is a high possibility of encountering indexing failures. Some indexing failures do not impact the Dgraph, whereas others result in failed Dgraphs, thus bringing down the Endeca servers.
In many applications, re-indexing the data involves the following steps:
1. Identify the root cause of the data failure and fix it in the data import process.
2. Run the data import process. Both ATG and non-ATG applications can spend a considerable amount of time in this stage.
3. Initiate the BCC deployment.
4. Trigger the Endeca indexing process.
There are times when a data import fix takes several days, during which stale index data stays on the production environment, provided that the Dgraph failures are backed by earlier successful generations. This makes monitoring the generated data consumed by the Endeca indexing process critical.
Endeca Validator Component
Using the CAS record store APIs, we can create a custom component that runs just before EndecaScriptService and performs essential checks on the raw data in the CAS record stores before it is indexed.
We can customize this component to perform the following checks before handing control over to EndecaScriptService:
1. Check the count of records being processed. A minimum threshold can be used to keep track of failed records and to cancel EndecaScriptService if the number of failures is too large.
2. This component can be further enhanced by using the record store APIs to inspect and validate every Endeca record when troubleshooting issues. This is similar to the implementation described at the following link.
http://www.ateam-oracle.com/debugging-cas-with-the-endeca-recordstore-inspector/
3. Just as we inspect the product catalog using ATG Dyn Admin, this component can be customized to provide CAS record data lookup based on Endeca.Id.
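The three checks above can be sketched in plain Java. This is only an illustration under stated assumptions, not the actual component: the class name RecordStoreValidator, the minRecords and maxFailed settings, and the "product.name" property used in the usage note are all hypothetical, and each record is simplified to a Map of single-valued properties. In a real component the records would be read through the CAS record store APIs, and the result would decide whether EndecaScriptService gets invoked.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical pre-indexing check; not part of the Endeca or ATG APIs. */
class RecordStoreValidator {

    /** Outcome of one validation pass over the records. */
    static final class Result {
        final int total;       // records seen
        final int failed;      // records missing a required property
        final boolean proceed; // true if indexing may go ahead

        Result(int total, int failed, boolean proceed) {
            this.total = total;
            this.failed = failed;
            this.proceed = proceed;
        }
    }

    private final int minRecords;                  // assumed setting: fewest records expected
    private final int maxFailed;                   // assumed setting: most failures tolerated
    private final List<String> requiredProperties; // properties every record must carry

    RecordStoreValidator(int minRecords, int maxFailed, List<String> requiredProperties) {
        this.minRecords = minRecords;
        this.maxFailed = maxFailed;
        this.requiredProperties = requiredProperties;
    }

    /** Counts records and failures, then applies both thresholds. */
    Result validate(Iterable<Map<String, String>> records) {
        int total = 0;
        int failed = 0;
        for (Map<String, String> record : records) {
            total++;
            if (!isValid(record)) {
                failed++;
            }
        }
        boolean proceed = total >= minRecords && failed <= maxFailed;
        return new Result(total, failed, proceed);
    }

    /** A record is valid when every required property is present and non-blank. */
    boolean isValid(Map<String, String> record) {
        for (String name : requiredProperties) {
            String value = record.get(name);
            if (value == null || value.trim().isEmpty()) {
                return false;
            }
        }
        return true;
    }

    /** Linear lookup by Endeca.Id, comparable to inspecting an item in Dyn Admin. */
    static Optional<Map<String, String>> findById(
            Iterable<Map<String, String>> records, String endecaId) {
        for (Map<String, String> record : records) {
            if (endecaId.equals(record.get("Endeca.Id"))) {
                return Optional.of(record);
            }
        }
        return Optional.empty();
    }
}
```

For example, new RecordStoreValidator(1000, 25, Arrays.asList("Endeca.Id", "product.name")).validate(records) would set proceed to false if fewer than 1000 records arrive or more than 25 of them lack a required property. The calling code could then skip EndecaScriptService and leave the last good index in place.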