Understanding Reconciliation

This topic explains the reconciliation workflow and process as it applies to Smarsh.

What is Data Reconciliation?


Data reconciliation is the process of comparing the data ingested into the target system against the data transmitted from the system of origin and verifying that the two match. Reconciliation for Smarsh Enterprise data involves tracking a message's lifecycle across Smarsh Capture, Archive, and other Apps to ensure completeness at each step.
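
The comparison can be illustrated with a minimal sketch. It assumes that every message carries a unique identifier and that both the system of origin and the target system can export their identifier lists. The function name and result keys are hypothetical and are not part of any Smarsh API.

```python
def reconcile(transmitted_ids, ingested_ids):
    """Compare identifiers sent by the source with identifiers found in the target.

    Hypothetical helper for illustration only.
    """
    sent = set(transmitted_ids)
    received = set(ingested_ids)
    return {
        # Present on both sides: reconciled.
        "matched": sorted(sent & received),
        # Sent by the source but not found in the target: unreconciled.
        "missing_in_target": sorted(sent - received),
        # Found in the target but never reported by the source.
        "unexpected_in_target": sorted(received - sent),
    }


result = reconcile(["m1", "m2", "m3"], ["m1", "m3"])
print(result["missing_in_target"])
```

In this sketch, any identifier listed under missing_in_target corresponds to a message that would need further investigation.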

Understanding Enterprise Data Flow

Before we can understand the reconciliation concept, it is important to know the message lifecycle in the Smarsh enterprise platform. The following workflow depicts how data is sent from the Capture solutions and the various states that are possible on successful archiving.

Figure: Enterprise data flow (images/download/attachments/132938204/enterprise_Data_flow.png)

  1. Data from channels such as email, IM, Collab, and mobile is first captured during the Capture process.

  2. This data is then sent to Enterprise Archive by a process known as Ingestion.
    Enterprise Archive users can use the Data Ingestion API to ingest messages into Enterprise Archive. For more information on how to perform ingestion, refer to the Data Ingestion API.

  3. The Ingestion Pipeline is responsible for handling any failures or issues that occur during the ingestion process.

  4. On successful ingestion, the various data types are archived and discoverable, and can be searched for using Enterprise Archive search. In the vast majority of cases (>99%), this process completes successfully within a few hours of ingestion.

  5. However, this process can fail in certain cases, and these cases change with circumstances. Some known instances are listed below:

    1. Large messages or messages containing large attachments.

    2. Messages containing a large number of participants or distribution lists (DLs), which can lead to extraction issues for participant data.

    3. IM data containing very large threads or a large number of interactions.

What is the Reconciliation Timeline?


The ingestion process and reconciliation timeline can be explained as follows:

  1. To avoid delaying the normal ingestion process, messages in the ingestion pipeline can take up to 3 days before they are marked as archived.

  2. Messages that are not ingested within 3 days are marked as Unreconciled in the Recon dashboard.

  3. Failed messages are then processed using reconciliation remediation workflows.

  4. Reconciliation workflows, like ingestion workflows, are constantly evolving to account for message size, attachment type, and so on.

    Important

    During the alternate processing pipelines stage, no action is required from the customer. The retry mechanism cannot be influenced manually at this stage by any external means.

  5. Since the reconciliation process runs overnight for messages that were sent up to 3 days ago, the issues that are reported could include the following:

    1. Ingestion Failures: Messages that have failed in the ingestion pipeline and in the alternate processing pipelines. These messages are further reviewed individually for resolution and retry errors.

    2. Archive Failures: Messages that have failed in Enterprise Archive's storage or indexing process. This is usually caused by service health issues in Enterprise Archive, and these messages are retried once the service health is restored.

    3. Reporting Failures: Messages that are successfully archived but not updated in the proper logs. Scheduled scripts identify and resolve these reporting failures by validating whether the message has been successfully archived.
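
The 3-day rule in the timeline above can be sketched as a simple status check. This is an illustration only: the function name, the status label "In progress", and the choice to treat a message that is exactly 3 days old as Unreconciled are assumptions, not documented Smarsh behavior.

```python
from datetime import datetime, timedelta

# 3-day window taken from the timeline above.
RECONCILIATION_WINDOW = timedelta(days=3)


def reconciliation_status(sent_at, archived, now):
    """Return a status label for one message (hypothetical helper)."""
    if archived:
        return "Archived"
    # Still inside the window: the ingestion pipeline may yet complete.
    if now - sent_at < RECONCILIATION_WINDOW:
        return "In progress"
    # Assumption: a message that is 3 or more days old and not archived
    # is the one shown as Unreconciled.
    return "Unreconciled"


now = datetime(2024, 1, 10, 12, 0)
print(reconciliation_status(datetime(2024, 1, 9, 12, 0), False, now))
print(reconciliation_status(datetime(2024, 1, 6, 12, 0), False, now))
```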

What is the number reported on the Insights Dashboard?


The number that is reported represents the failed messages within the following time window:

  • From: Time when ingestion began.

  • To: Up to 3 days ago.
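
As a rough sketch of how such a window could be applied, assuming each failed message has a timestamp and that both ends of the window are inclusive (the function and parameter names are hypothetical, and the dashboard's actual computation is not described here):

```python
from datetime import datetime, timedelta


def reported_failures(failure_times, ingestion_start, now):
    """Count failed messages between the start of ingestion and 3 days ago."""
    window_end = now - timedelta(days=3)
    # Assumption: both ends of the window are inclusive.
    return sum(1 for t in failure_times if ingestion_start <= t <= window_end)


failures = [
    datetime(2023, 12, 30),  # before ingestion began: excluded
    datetime(2024, 1, 2),    # inside the window: counted
    datetime(2024, 1, 6),    # inside the window: counted
    datetime(2024, 1, 9),    # within the last 3 days: excluded
]
count = reported_failures(failures, datetime(2024, 1, 1), datetime(2024, 1, 10))
print(count)
```

Messages from the most recent 3 days are left out because they may still be moving through the ingestion pipeline.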

How long are the Insights Dashboard numbers preserved?


The Insights Dashboard numbers are preserved indefinitely: the data is not purged and is cumulative. However, the number of failed messages decreases as messages are resolved in the alternate processing pipelines.

What is the process to address failed messages reported in the Insights Dashboard?


Smarsh support opens internal tickets to track the resolution of failed messages. Customers do not need to report or raise tickets for failed messages.

Important

Customers may choose to review the Insights Dashboard numbers on a weekly or monthly basis. If the number of unreconciled messages is large (>10000), customers can request that a customer-facing investigation be initiated.

What is Reconciliation for Smarsh Enterprise data?


The data reconciliation process in Smarsh compares message counts at each stage of the message lifecycle and surfaces any delays in processing.
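
A minimal sketch of a stage-by-stage comparison is shown below. It assumes that a message count is available for each lifecycle stage. The stage names, the sample counts, and the function name are hypothetical and are used only for illustration.

```python
def stage_gaps(stage_counts):
    """Return the drop in message count between consecutive lifecycle stages.

    stage_counts is an ordered list of (stage_name, count) pairs.
    """
    gaps = []
    for (prev_name, prev_count), (name, count) in zip(stage_counts, stage_counts[1:]):
        # A positive gap means messages have not yet reached the later stage.
        gaps.append((prev_name, name, prev_count - count))
    return gaps


counts = [("captured", 1000), ("ingested", 998), ("archived", 995)]
gaps = stage_gaps(counts)
for earlier, later, missing in gaps:
    print(f"{earlier} -> {later}: {missing} message(s) outstanding")
```

A non-zero gap does not by itself indicate a failure. It may only reflect messages that are still being processed.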

What are the Common Reasons for Unreconciled Items?


The common reasons for unreconciled items during the Capture and Archive processes are listed below.

Capture Unreconciled

  • Network issues from the data source.

  • Incomplete metadata in messages captured from the source.

  • Message characters that cannot be processed, for reasons such as issues with control characters or differing charsets.

  • Deviation from the defined schema during XML data generation.

  • Ingestion endpoint being non-responsive to the incoming payload.

  • Ingestion payload exceeds the size limits on Enterprise Archive.

Archive Unreconciled

  • Extraction issues with participant data, attachments, etc.

  • File and attachment size limitations.

  • Very large threads containing a large number of interactions.