Change Tracking


An agent can keep track of the latest changes that have been made to extracted data.

 

The agent will mark extracted data as deleted, modified or added. If data is deleted but later returned, the data will be marked as "returned", or "returned modified" if the data returned in a modified state.

 

An agent can be configured to export only data that has changed since the last successful run, or only data that has changed within a specified number of days.

 

Data will be marked as deleted if it was extracted last time the agent ran, but not during the current run. Data will only be marked as deleted if an agent completes successfully. This prevents data from being incorrectly marked as deleted if an agent fails halfway through a run. The Success Criteria options are used to define when an agent run should be considered successful.
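The status rules above can be illustrated with a short Python sketch. This is not Content Grabber code: the function name track_changes, the in-memory store layout, and the lowercase status strings are assumptions made for illustration only. An entry that is found again with no change simply keeps its previous status here.

```python
from typing import Dict, Tuple

# Hypothetical store layout: entry key -> (value, last change status).
Store = Dict[str, Tuple[str, str]]


def track_changes(store: Store, extracted: Dict[str, str], run_succeeded: bool) -> Store:
    """Return a new store with the last change status updated for every entry."""
    result: Store = {}

    # Entries found during the current run.
    for key, value in extracted.items():
        if key not in store:
            result[key] = (value, "added")
            continue
        old_value, old_status = store[key]
        if old_status == "deleted":
            # Data that was deleted earlier has come back.
            status = "returned" if value == old_value else "returned modified"
        elif value != old_value:
            status = "modified"
        else:
            status = old_status  # nothing changed, keep the previous status
        result[key] = (value, status)

    # Entries extracted previously but missing from the current run.
    for key, (old_value, old_status) in store.items():
        if key in extracted:
            continue
        # Only a successful run may mark data as deleted.
        status = "deleted" if run_succeeded else old_status
        result[key] = (old_value, status)

    return result
```

Calling track_changes with run_succeeded set to False leaves missing entries untouched, which mirrors how a failed run does not mark data as deleted.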

 

Default change tracking can be enabled in the Internal Database window, which is available from the Run menu in Content Grabber.

 

Internal database window.

 

Change tracking can be fine-tuned by setting the advanced change tracking options. These options can be set by clicking the Configure Change Tracking button.

 

Advanced change tracking options.

 

The following advanced change tracking options are available.

 

Option

Description

Enabled

Enables or disables change tracking.

Export Method

Specifies how to export data when an agent is not exporting historical data to a database. The option Historical Data Export Method is used instead when exporting historical data to a database. The following options are available:

 

Export All. Exports all data, whether or not it has changed.

 

Since Last Successful Run. Exports all data that has changed since the last successful run.

 

Since Number of Days. Exports all data that has changed within a specified number of days.

Export Method Days

The number of days to use when Export Method is set to Since Number of Days. Only data that has changed within this period is exported.

Historical Data Export Method

This option is used instead of Export Method when exporting historical data to a database. The following options are available:

 

All Data. Exports all data, whether or not it has changed.

 

Changed Data Only. Exports all data that has changed since the last agent run.

Track Deletes

Specifies whether an agent should track deleted data.

 

If an agent does not track deleted data, the last change status will not change for data that was not found in the last successful agent run.

 

If an agent is tracking deleted data, the last change status will be set to Deleted for data that was not found in the last successful agent run.

Last Change Enabled

Exports the type of change that was last made to a data row.

Last Change Column Name

The name of the data column where the type of change is stored.

Change Date Enabled

Exports the date a data row was last changed.

Change Date Column Name

The name of the data column where the change date is stored.

Insert Date Enabled

Exports the date a data row was first inserted. This is the date the data was first extracted.

Insert Date Column Name

The name of the data column where the insert date is stored.

Update Date Enabled

Exports the date a data row was last processed. This is the date the data was last extracted and compared to existing data. Note that the data may not have changed on this date.

Update Date Column Name

The name of the data column where the update date is stored.

Identifier Enabled

Exports the object identifier used in the internal database. This value uniquely identifies the data row and will not change unless the internal database is recreated.

Identifier Column Name

The name of the data column where the object identifier is stored.

Columns Affected Enabled

Exports a column that contains the names of columns affected by a change.

Columns Affected Column Name

The name of the data column where the columns affected value is stored.

Changed Last Run Enabled

Exports a value indicating whether a data row changed the last time the agent was run.

Changed Last Run Column Name

The name of the data column where the value indicating whether a data row changed the last time the agent was run is stored.
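As a rough illustration of how the three Export Method values differ, the following Python sketch filters rows by their change date. The function select_rows, the change_date field, and the inclusive cutoff comparison are assumptions made for this example and do not describe Content Grabber's internal implementation. Exporting everything when no successful run exists yet is also an assumption.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, List, Optional


def select_rows(
    rows: List[Dict[str, Any]],
    export_method: str,
    now: datetime,
    last_successful_run: Optional[datetime] = None,
    export_method_days: int = 0,
) -> List[Dict[str, Any]]:
    """Pick the rows to export for a given Export Method value."""
    if export_method == "Export All":
        return list(rows)
    if export_method == "Since Last Successful Run":
        if last_successful_run is None:
            # Assumption: with no earlier successful run, export everything.
            return list(rows)
        cutoff = last_successful_run
    elif export_method == "Since Number of Days":
        cutoff = now - timedelta(days=export_method_days)
    else:
        raise ValueError(f"Unknown export method: {export_method}")
    # Hypothetical field: each row carries the date it last changed.
    return [row for row in rows if row["change_date"] >= cutoff]
```

With Since Number of Days, the export_method_days argument plays the role of the Export Method Days option.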

 

 

Key Columns

When an agent exports data, it compares extracted data with existing data to determine whether the data has been added, changed or deleted. To detect a change, the agent must be able to uniquely identify each data entry. By default, an agent uses all captured data to identify a data entry. This means that whenever any captured value changes, the entry no longer matches any existing data entry and is identified as a new data entry rather than a modified one.

 

To get change tracking working properly, so that the agent correctly identifies modified data entries, it's important to mark the capture commands that extract data that uniquely identifies a data entry. For example, when extracting product data, a website may display a product ID that a capture command can extract and the agent can use as a unique identifier. Set the command option Key Column to mark a capture command as one that extracts data that can be used to uniquely identify a data entry.

 

Key Column command option.

 

Multiple capture commands can be marked as key columns to combine extracted data from multiple commands into a value that uniquely identifies a data entry.

 

Each container command that is configured to generate a separate data table should have one or more capture commands that are marked as key columns.
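The effect of key columns can be sketched in a few lines of Python. The function row_identity and the column names product_id, name and price are hypothetical and serve only to illustrate the idea. They are not part of Content Grabber.

```python
from typing import Dict, Sequence, Tuple


def row_identity(row: Dict[str, str], key_columns: Sequence[str]) -> Tuple[str, ...]:
    """Build the value used to match an extracted row against existing rows.

    With no key columns, every captured column takes part in the identity,
    so any change makes the row look like a brand new entry.
    """
    columns = list(key_columns) if key_columns else sorted(row)
    return tuple(row[column] for column in columns)


old = {"product_id": "P1", "name": "Widget", "price": "9.99"}
new = {"product_id": "P1", "name": "Widget", "price": "10.49"}

# No key columns: the price change alters the identity, so the row would be "added".
same_without_keys = row_identity(old, []) == row_identity(new, [])

# product_id as key column: the identity is stable, so the row can be "modified".
same_with_key = row_identity(old, ["product_id"]) == row_identity(new, ["product_id"])
```

Passing several column names, for example product_id and name, combines their values into one identity, in the same way that multiple capture commands can be marked as key columns.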

 

Exclude From Change Tracking

Capture commands can be excluded from change tracking, so if the captured data changes, it will not cause the last change status for the data entry to change.

 

Exclude captured data from change tracking.
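A minimal Python sketch of the exclusion idea follows. The function has_tracked_change and the column names are assumptions chosen for illustration. The sketch only shows that excluded columns are ignored when deciding whether a data entry has changed.

```python
from typing import Dict, Iterable


def has_tracked_change(old: Dict[str, str], new: Dict[str, str], excluded: Iterable[str]) -> bool:
    """Return True if any column that is not excluded differs between old and new."""
    tracked = (set(old) | set(new)) - set(excluded)
    return any(old.get(column) != new.get(column) for column in tracked)


old_row = {"product_id": "P1", "price": "9.99", "scraped_at": "2024-01-01"}
new_row = {"product_id": "P1", "price": "9.99", "scraped_at": "2024-01-02"}

# Hypothetical timestamp column excluded: the row does not count as changed.
changed_when_excluded = has_tracked_change(old_row, new_row, {"scraped_at"})

# Nothing excluded: the timestamp difference counts as a change.
changed_when_tracked = has_tracked_change(old_row, new_row, set())
```

A column whose value differs on every run, such as a timestamp, is a typical candidate for exclusion, since it would otherwise cause every data entry to be reported as modified.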