Upgrading to Content Grabber 2


Content Grabber 2 can run alongside Content Grabber 1 on the same computer using the same license. This makes it easier to upgrade, since new agents can be built and run in Content Grabber 2 while existing agents keep running in Content Grabber 1 until they have been converted.

 

Content Grabber 2 is based on the Chrome browser engine, while Content Grabber 1 is based on Internet Explorer, so some websites will display differently in the two versions, and an agent created in Content Grabber 1 may need to be modified to work properly in Content Grabber 2. However, most Content Grabber 1 agents will work in Content Grabber 2 without modification.

 

Some Content Grabber 1 agents using specialized action configurations will need to be modified to work properly in Content Grabber 2, since Activities have been removed and replaced with simpler options.

 

Some Content Grabber 1 agents using specialized export configurations will need to be modified to work properly in Content Grabber 2, since export options now apply to the commands where they are configured, rather than to sub-commands. Content Grabber 2 will automatically make most of the required configuration changes to a Content Grabber 1 agent, but it may not get the configuration completely right.

 

When upgrading a Content Grabber 1 agent to Content Grabber 2, it is strongly recommended that you save the converted agent to a new location so the old agent is not overwritten. The default agent location for Content Grabber 2 is My Documents\Content Grabber 2\Agents.

 

Content Grabber 2 includes the following new features and changes.

 

Web Browser

Chrome-based web browser

  o The embedded web browser is now based on Chrome and is completely self-contained, so no existing web browser is required on the computer where Content Grabber is running, and agent behavior no longer depends on existing browser configuration on the computer.

Edge Selections

  o Click on the edge of a web element to select its parent. This is particularly convenient when selecting web elements such as table rows, which cannot be selected directly in the web browser because they are completely covered by table cells.

 

Agents

Data retention

  o New data retention options. Old data can now be deleted, kept and exported, or kept for duplicate checks only. New default duplicate scripts can be used to copy old data to the current data set, so you end up with a complete current data set. Previously, using duplicate scripts would always make it impossible to know when data was deleted from the target website.

Change tracking

  o An agent can now keep track of the latest changes that have been made to extracted data. The agent will mark extracted data as deleted, modified or added. If data is deleted but later returns, the data will be marked as "returned", or "returned modified" if it returns in a modified state. An agent can be configured to export only the data changes since the last successful run, or within a specified number of days.

  o Data will only be marked as deleted if an agent run completes successfully. This prevents data from being incorrectly marked as deleted if an agent fails halfway through a run.
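
The change tracking rules above can be illustrated with a short sketch. This is not Content Grabber code and does not use its API. It is a Python illustration that assumes each extracted record has a unique key and a single comparable value. The function name `track_changes`, the record layout, and the "unchanged" status are assumptions made for this example only.

```python
from typing import Dict, Tuple

# Hypothetical record layout: key -> (value, status)
Record = Tuple[str, str]


def track_changes(
    previous: Dict[str, Record],
    current: Dict[str, str],
    run_succeeded: bool,
) -> Dict[str, Record]:
    """Classify records by comparing the previous data set with the current run."""
    result: Dict[str, Record] = {}

    # Records found in the current run.
    for key, value in current.items():
        if key not in previous:
            status = "added"
        else:
            old_value, old_status = previous[key]
            if old_status == "deleted":
                # Data that was deleted earlier has come back.
                status = "returned" if value == old_value else "returned modified"
            elif value != old_value:
                status = "modified"
            else:
                status = "unchanged"  # assumed label for data with no change
        result[key] = (value, status)

    # Records missing from the current run.
    for key, (old_value, old_status) in previous.items():
        if key in current:
            continue
        if run_succeeded:
            result[key] = (old_value, "deleted")
        else:
            # A failed run proves nothing about deletion: keep the old status.
            result[key] = (old_value, old_status)

    return result
```

Passing `run_succeeded=False` leaves missing records with their previous status, which mirrors the rule that deletions are only recorded after a successful run.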

Success criteria

  o Success criteria can now be defined to control when an agent run is considered successfully completed. Success criteria can be used to control notifications and change tracking.

Export to single table

  o Multiple agents can now export to a single database table.

Export script templates

  o Script templates make it easier to build export scripts for agents. A script template contains the C# code required to read all data from an agent, so all that needs to be added is the code required to write the data.

Export settings

  o Export settings have been simplified. Data exported from a container can now be separated by setting a single option on the container.

  o Exported data from list containers is now never separated by default, which should make the export data format more predictable.

  o The export option to convert data rows into data columns is now only available for CSV and Excel export. The conversion is done when the CSV or Excel files are generated, so the internal data structures never reflect this conversion. If a container command has been configured to convert data rows into data columns, the internal data structures will contain a separate data table for the data extracted by that container command.
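
As an illustration of converting data rows into data columns only at file-generation time, the sketch below keeps parent and child data in separate tables and flattens them when the CSV text is written. It is not Content Grabber's implementation. It assumes parent rows carry an `id` field and child rows reference it through `parent_id`. The function name `rows_to_columns_csv` and the numbered column names are hypothetical.

```python
import csv
import io


def rows_to_columns_csv(parents, children, child_field, prefix):
    """Write one CSV line per parent, spreading its child rows across columns.

    parents:  list of dicts, each with an "id" key (assumed layout)
    children: list of dicts, each with "parent_id" and `child_field` keys
    """
    # Group child values by parent; the input tables are left untouched.
    grouped = {}
    for child in children:
        grouped.setdefault(child["parent_id"], []).append(child[child_field])

    # The widest group decides how many extra columns are needed.
    width = max((len(values) for values in grouped.values()), default=0)
    parent_fields = list(parents[0].keys()) if parents else []
    header = parent_fields + [f"{prefix}_{i + 1}" for i in range(width)]

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    for parent in parents:
        values = grouped.get(parent["id"], [])
        padded = values + [""] * (width - len(values))  # pad short groups
        writer.writerow([parent[field] for field in parent_fields] + padded)
    return buffer.getvalue()
```

Because the pivot happens inside the writer function, the `parents` and `children` lists keep their original row-based shape, much as the internal data structures stay unchanged in the description above.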

File downloads

  o More reliable file downloads when using "Click to Download".

Screenshots

  o Agents running in a service are now able to take screenshots.

Retry errors

  o After an agent stops, the agent can now continue data extraction, continue data extraction and retry errors, or just retry errors.

  o The current version of an agent can now be used when continuing data extraction or retrying errors. The current version of the agent can differ from the version at the time the data extraction started. An error occurs if the agent has been modified in a way that requires a new internal data structure. It's still possible to use a version of the agent that was current when the data extraction started.
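
The three ways of resuming a stopped agent can be pictured as three filters over the work items of a run. The sketch below is a Python illustration only, not Content Grabber code. The `ResumeMode` names, the item states ("done", "pending", "error"), and the function `items_to_process` are assumptions made for the example.

```python
from enum import Enum


class ResumeMode(Enum):
    CONTINUE = "continue data extraction"
    CONTINUE_AND_RETRY = "continue data extraction and retry errors"
    RETRY_ERRORS = "retry errors"


def items_to_process(items, mode):
    """Return the names of the work items a resumed run would handle.

    items: dict mapping an item name to "done", "pending" or "error"
    """
    wanted_states = {
        ResumeMode.CONTINUE: {"pending"},
        ResumeMode.CONTINUE_AND_RETRY: {"pending", "error"},
        ResumeMode.RETRY_ERRORS: {"error"},
    }[mode]
    return [name for name, state in items.items() if state in wanted_states]
```

Items already marked "done" are never selected, so none of the three modes repeats work that completed before the agent stopped.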

 

User Interface

Simplified action configuration

  o Activities have been replaced with simple options to configure an action command to wait for page loads and/or AJAX. An agent can also wait for a specific URL or a specific selection XPath.

Multiple command selections

  o Multiple commands can be selected in the Agent Explorer to make it easier to delete, copy or move multiple commands.

Group commands

  o A new Group Commands feature allows all selected commands to be placed inside a Group Command with a single click.

Improved XPath editor

  o The XPath editor now has color highlighting and has been moved from the Ribbon menu to a docking window.

 

Self-Contained Agents

New self-contained agents

  o The user interface for self-contained agents is now pure HTML and can be completely customized with the Premium edition of Content Grabber.