Enabling Historical Data Import
The Historical Data Import feature lets you bring past employee information into your system, so that your workforce records cover both past and present employees. This topic describes the functionality and the steps for importing historical data from CSV files that include the "Start Time" field. "Start Time" specifies the date and time from which each employee record is valid, giving you a view of how your employee data changed over time.
This feature is behind a feature flag. Contact Smarsh support to enable this feature.
Why Include Start Time in Import CSV?
Start Time provides a way to associate specific dates with participant data (e.g., groups, attributes) in your CSV files. This is particularly useful for:
Handling Email Address Reuse: Companies often reuse email addresses over time. The Start Time attribute ensures that a participant is identified accurately as of a specific point in time, eliminating confusion during historical data imports.
Tracking Employee Profile Transitions: Employee profiles evolve due to promotions, name changes, and department shifts. Start Time allows you to capture these historical transitions, enriching your data analysis and enabling more granular access control for your content.
The table below outlines several relevant use cases for including Start Time within the Enterprise Archive import process. It details the associated context, the challenges encountered with traditional data ingestion, and how the inclusion of a "Start Time" attribute in the import CSV addresses these issues.
| Use Case | Context | Challenge | Solution |
| --- | --- | --- | --- |
| Email Address Reuse | Companies often assign email addresses based on a naming convention (e.g., <firstname>@<company.com>, <lastname>.<initials>@<company.com>). This can lead to duplicate email addresses when multiple employees share similar names. While companies may implement strategies to differentiate these addresses (e.g., john.smith@company.com and john.smith.01@company.com), these solutions may not account for past employees with the same name. | Existing data ingestion methods in Enterprise Archive might struggle to pinpoint the correct employee associated with an email address in historical data (Tier 2 data) due to email address reuse over time. Customers have no way to specify which employee used a specific email address during a particular period. | Customers can import historical employee data along with a "Start Time" attribute. This attribute specifies the date and time from which a specific employee record becomes valid. This enables Enterprise Archive to differentiate between employees who shared the same email address at different points in history. |
| Employee Profile Transitions | Some organizations meticulously maintain historical employee data, including details like promotions, department changes, and name modifications. This information is crucial for applying content restrictions based on user groups or custom attributes. | Existing historical data processing methods in Enterprise Archive might not capture these employee profile transitions. This makes it difficult to associate content with the appropriate employee based on their role or status at a specific point in time. | Customers can import historical employee data with various attributes alongside a "Start Time". These attributes could include group memberships, custom classifications, or other details relevant to content access control. By associating these attributes with specific timeframes, Enterprise Archive can accurately determine the appropriate access permissions for each piece of historical content based on the employee's profile at the time it was created or interacted with. |
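To make the email reuse case concrete, the following Python sketch shows one way point-in-time lookup could work. It illustrates the idea only and does not describe how Enterprise Archive resolves participants internally. The function name `owner_at`, the record layout, and the employee labels are hypothetical; timestamps are treated as UTC.

```python
from bisect import bisect_right
from datetime import datetime

# Start Time layout used in the import CSV (MM/DD/YYYY HH:MM:SS.SSS, UTC).
START_TIME_FORMAT = "%m/%d/%Y %H:%M:%S.%f"


def owner_at(records, email, when):
    """Return the employee whose record was valid for `email` at `when`.

    `records` is a list of (email, start_time_string, employee) tuples.
    The record with the latest Start Time at or before `when` wins;
    None is returned if no record had started yet.
    """
    history = sorted(
        (datetime.strptime(start, START_TIME_FORMAT), employee)
        for address, start, employee in records
        if address == email
    )
    starts = [start for start, _ in history]
    index = bisect_right(starts, when)
    return history[index - 1][1] if index else None


# Hypothetical data: the same address used by two different employees.
records = [
    ("john.smith@company.com", "01/01/2005 00:00:00.000", "John Smith (hired 2005)"),
    ("john.smith@company.com", "06/15/2015 00:00:00.000", "John Smith (hired 2015)"),
]
print(owner_at(records, "john.smith@company.com", datetime(2010, 3, 1)))
```

A message dated between the two Start Times is attributed to the earlier employee, and a message dated after the second Start Time is attributed to the later one.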
Importing Historical Data
Import via API: Historical data import is currently available only through a dedicated API endpoint. For detailed API documentation on historical CSV import, see Historical Import of Participants using a CSV file.
Specifying Start Times: Include a "Start Time" column in your CSV file. Values in this column must be in UTC and use the format MM/DD/YYYY HH:MM:SS.SSS. For example, 12/31/2010 09:15:10.000.
Refer to the Sample import participant CSV file for more information.
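If you generate the CSV programmatically, a small helper can produce Start Time values in the required layout. The following Python sketch is one possible approach; the function name `format_start_time` is hypothetical. It converts a timezone-aware datetime to UTC and truncates to millisecond precision.

```python
from datetime import datetime, timedelta, timezone


def format_start_time(moment: datetime) -> str:
    """Format a timezone-aware datetime as MM/DD/YYYY HH:MM:SS.SSS in UTC."""
    if moment.tzinfo is None:
        # A naive datetime has no offset, so it cannot be converted safely.
        raise ValueError("moment must be timezone-aware")
    utc = moment.astimezone(timezone.utc)
    # strftime's %f yields microseconds (6 digits), so build milliseconds by hand.
    return utc.strftime("%m/%d/%Y %H:%M:%S.") + f"{utc.microsecond // 1000:03d}"


# A local time five hours behind UTC becomes 09:15:10.000 UTC.
local = datetime(2010, 12, 31, 4, 15, 10, tzinfo=timezone(timedelta(hours=-5)))
print(format_start_time(local))
```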
Error Handling
Rows with a missing or incorrectly formatted Start Time are rejected during upload.
Uploads containing both valid and invalid entries are partially processed. The error message identifies the specific rows and columns that contain invalid data to help you troubleshoot.
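Because rows with a bad Start Time are rejected, it can help to check the file locally before uploading. The following Python sketch is a client-side pre-check only and may not match the service's own validation exactly. The function names and the "Email" column in the sample are hypothetical; only the "Start Time" column name and format come from this topic.

```python
import csv
import io
import re
from datetime import datetime

START_TIME_COLUMN = "Start Time"
# Exactly MM/DD/YYYY HH:MM:SS.SSS, with three millisecond digits.
_LAYOUT = re.compile(r"\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}\.\d{3}")


def is_valid_start_time(value):
    """Return True if value matches the layout and is a real date and time."""
    if not value or not _LAYOUT.fullmatch(value):
        return False
    try:
        datetime.strptime(value, "%m/%d/%Y %H:%M:%S.%f")
    except ValueError:  # for example, month 13 or hour 25
        return False
    return True


def find_invalid_rows(csv_text):
    """Return the file line numbers whose Start Time is missing or invalid."""
    reader = csv.DictReader(io.StringIO(csv_text))
    invalid = []
    for row in reader:
        if not is_valid_start_time(row.get(START_TIME_COLUMN)):
            invalid.append(reader.line_num)
    return invalid


sample = (
    "Email,Start Time\n"
    "a@company.com,12/31/2010 09:15:10.000\n"
    "b@company.com,2010-12-31\n"
    "c@company.com,\n"
    "d@company.com,13/01/2010 00:00:00.000\n"
)
print(find_invalid_rows(sample))
```

Line numbers count the header as line 1, so the first data row is line 2.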