Web-scraping tools generally take one of two approaches, macros or configuration, and in both cases the agent follows a sequential list of commands. The macro approach is more user-friendly: it automatically records the actions of a user in a browser. However, it typically restricts access to the code behind the agent. The configuration approach lets the user configure each part of the agent directly, so they can introduce more code structure, controls, and data refinements, or apply their own naming conventions.
Content Grabber gives you the option to either follow the simple macro automation methods, or to take direct control over the treatment of each element and command within your agent.
With Content Grabber, you can visually browse the website and click on the data elements in the order that you want to collect them. Based on the content elements selected, Content Grabber will automatically determine the corresponding action type and provide default names for each command as it builds the agent for you.
Content Grabber main screen - building CarPoint Agent
A Content Grabber agent is a collection of commands that are executed in sequence until all of them have completed. The commands can be either actions (such as a jump to a URL) or data capture commands (such as capturing text). These commands are recorded in order of execution in the Agent Explorer panel of the Content Grabber screen.
Agent Explorer panel with New Agent commands
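The idea of an agent as an ordered list of action and capture commands can be pictured with a small sketch. This is only a conceptual model in Python, not Content Grabber's own API or internal design: the names `Command`, `navigate`, `capture_text`, `run_agent` and the `FAKE_SITE` data are all hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical stand-in for a website: URL -> fields found on that page.
FAKE_SITE: Dict[str, Dict[str, str]] = {
    "https://example.com/cars": {"title": "Used cars", "price": "4500"},
}


@dataclass
class Command:
    name: str                    # display name, like the entries in an explorer panel
    kind: str                    # "action" or "capture"
    run: Callable[[dict], None]  # updates the shared agent state in place


def navigate(url: str) -> Command:
    """Action command: load a page into the agent state."""
    def _run(state: dict) -> None:
        state["page"] = FAKE_SITE[url]
    return Command(name=f"Navigate {url}", kind="action", run=_run)


def capture_text(field_name: str) -> Command:
    """Data capture command: copy one field from the current page."""
    def _run(state: dict) -> None:
        state["data"][field_name] = state["page"][field_name]
    return Command(name=f"Capture {field_name}", kind="capture", run=_run)


def run_agent(commands: List[Command]) -> Dict[str, str]:
    """Execute every command in order and return the captured data."""
    state = {"page": None, "data": {}}
    for command in commands:
        command.run(state)
    return state["data"]


agent = [
    navigate("https://example.com/cars"),
    capture_text("title"),
    capture_text("price"),
]
result = run_agent(agent)
```

In this sketch the order of the list matters: the capture commands rely on the page loaded by the preceding action command, which mirrors the sequential execution described above.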
If you want to make other adjustments or gain more control of your commands, you can make changes in the Configure Agent Command panel.
Configure Agent Command panel
You can also add new commands to your agent, or configure existing ones. To do this, you simply click twice on any web element (content item) and the Content Grabber Message window will appear. From here you can select the command type you want and add it to the Agent Explorer.
Content Grabber Message window pop-up
After you have finished building your agent and have run it for the first time, Content Grabber saves the data locally in a structured database format. Content Grabber can export the extracted web data as a report or to a number of different database types. Data output options include CSV, Excel, XML, SQL Server, MySQL, Oracle and OleDB.
Content Grabber's Data Configuration window
You can also use a Content Grabber export script to completely customize the data export to your own database structures.
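To give a rough feel for what exporting structured data to a flat format such as CSV involves, here is a minimal Python sketch. It does not use or represent Content Grabber's export scripts; the `to_csv` helper and the sample rows are assumptions made up for this example.

```python
import csv
import io
from typing import Dict, List


def to_csv(rows: List[Dict[str, str]]) -> str:
    """Serialize a list of records to CSV text, using the first row's keys as the header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical extracted records, one dictionary per captured item.
rows = [
    {"make": "Toyota", "price": "4500"},
    {"make": "Ford", "price": "3900"},
]
csv_text = to_csv(rows)
```

Each record becomes one line of output under a shared header row, which is the same shape a spreadsheet or database import would expect.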
Content Grabber provides an agent scheduling facility that lets you run your agent automatically at predetermined times. Runs can be scheduled at regular intervals, such as every hour, day, month, or year.
Content Grabber's Scheduling Window
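The notion of running an agent at fixed intervals can be sketched as follows. This is only an illustration of interval-based scheduling in Python, not how Content Grabber's scheduler is implemented or configured; `INTERVALS` and `next_runs` are hypothetical names, and only hourly and daily intervals are modeled because months and years do not have a fixed length.

```python
from datetime import datetime, timedelta
from typing import List

# Hypothetical interval table; only fixed-length intervals are covered here.
INTERVALS = {
    "hourly": timedelta(hours=1),
    "daily": timedelta(days=1),
}


def next_runs(start: datetime, interval: str, count: int) -> List[datetime]:
    """Return the next `count` run times after `start` for the given interval."""
    step = INTERVALS[interval]
    return [start + step * i for i in range(1, count + 1)]


upcoming = next_runs(datetime(2024, 1, 1, 9, 0), "hourly", 3)
```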