When you browse the web yourself, you rarely need to think about the sequence of complex activities that occurs while a web page loads, since you simply wait for the content that you want to see. The most critical content on a web page will usually load well before you get around to viewing a specific part of the page, and features generally function correctly by the time you fill in web forms or click links.
It's very different for web-scraping agents, because they are very fast. An agent will attempt to process a web page as quickly as possible and then continue on to the next page, so it could easily start processing a page before all of the essential content has loaded. It's therefore important to configure an action command to wait for all important browser activities to complete, and for all the content to load, before web page processing begins.
When an action command executes, it waits for certain activities to complete in the web browser. For example, if a command executes a click on a link, it may wait for a page to load or an AJAX call to complete. Some actions may result in a very complex set of activities. An action may load a new page that then uses AJAX to load additional dynamic content onto the page.
Action commands automatically discover web browser activities. After a command fires the action events, it will monitor all activity in the web browser and wait for critical activities to complete. Once no new activities have started for a little while, it will consider the action to be complete.
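The completion rule described above can be sketched in a few lines of Python. This is only an illustrative model, not the product's actual implementation: the `Activity` class, the `completion_time` function, and the 0.5 second `idle_window` are all hypothetical, and the sketch assumes that the quiet period is measured from the end of the last observed activity.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    start: float  # seconds after the action events fired (hypothetical unit)
    end: float    # seconds after the action events fired


def completion_time(activities, idle_window=0.5):
    """Return the time at which the action would be considered complete.

    Assumption: the action is complete once nothing is running and no new
    activity has started for `idle_window` seconds.
    """
    busy_until = 0.0  # latest moment at which something was still happening
    for act in sorted(activities, key=lambda a: a.start):
        if act.start > busy_until + idle_window:
            # The browser stayed quiet for a full window, so the action has
            # already completed; this and later activities are ignored.
            break
        busy_until = max(busy_until, act.end)
    return busy_until + idle_window


activities = [
    Activity(0.1, 0.8),   # page load
    Activity(0.9, 1.2),   # AJAX call started by the new page
    Activity(3.0, 3.5),   # late activity, starts after the quiet period
]
print(completion_time(activities))
```

In this model the third activity starts long after the browser went quiet, so it falls outside the action. That is the situation in which a command may finish before late content has arrived.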
You can specify which activities an action command should wait for. The command can wait for activities in the main web page and in sub-pages that are loaded in web frames.
Page load activities can be optional or required. An error will be reported if a page load activity is required, but no page load occurs. If Wait for page load is set to None, the command will not wait for any page load to occur, which is slightly faster than setting Wait for page load to Optional.
An AJAX activity occurs when a web page loads content from the web server asynchronously. A Script activity occurs when a JavaScript file is loaded by the web page asynchronously. AJAX and Script activities are always optional, which means no error will be reported if a command is configured to wait for AJAX, but no AJAX activities occur.
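The difference between required and optional waits can be summarized as a small decision function. The following Python sketch is an assumption-laden illustration of the rules stated above; `PageLoadWait` and `wait_error` are hypothetical names and do not correspond to any real API.

```python
from enum import Enum
from typing import Optional


class PageLoadWait(Enum):
    # Mirrors the three "Wait for page load" settings described in the text.
    NONE = "None"
    OPTIONAL = "Optional"
    REQUIRED = "Required"


def wait_error(page_load_wait: PageLoadWait, page_loaded: bool,
               wait_for_ajax: bool, ajax_seen: bool) -> Optional[str]:
    """Return an error message, or None if the wait outcome is acceptable."""
    if page_load_wait is PageLoadWait.REQUIRED and not page_loaded:
        return "required page load did not occur"
    # AJAX and Script waits are always optional, so `wait_for_ajax` and
    # `ajax_seen` deliberately never produce an error.
    return None


print(wait_error(PageLoadWait.REQUIRED, page_loaded=False,
                 wait_for_ajax=False, ajax_seen=False))
print(wait_error(PageLoadWait.OPTIONAL, page_loaded=False,
                 wait_for_ajax=True, ajax_seen=False))
```

Only the combination of a required page load and no actual page load yields an error. Waiting for AJAX that never happens is accepted silently.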
Wait configuration panel.
Some websites have very complex activities. For example, many travel websites that provide hotel and flight search functionality will load a waiting page and after a while load the actual search result. An action command will often complete the action after the waiting page is loaded, since it doesn't know that more content will be loaded later. If the website redirects from the waiting page to the search result page, then the Wait option Delayed redirect can often be used successfully, but sometimes websites use other techniques and it can be very difficult for the action command to tell when an action has completed.
Sometimes it's possible to determine that a website action has completed when a specific URL has been loaded. This URL could come from a full page load, a frame page load, or an asynchronous AJAX call. A Wait for Content sub-command can be used to wait for a URL that matches a regular expression.
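To get a feel for URL matching, it can help to try a pattern against a list of URLs before configuring it. The Python sketch below uses the standard `re` module as a stand-in for whichever regular expression engine the product uses, so syntax details may differ. The function name, the example URLs, and the pattern are all made up for illustration.

```python
import re


def first_matching_url(loaded_urls, pattern):
    """Return the first URL that the pattern matches anywhere, or None."""
    regex = re.compile(pattern)
    for url in loaded_urls:
        if regex.search(url):
            return url
    return None


# Hypothetical sequence of URLs loaded after a search action.
loaded_urls = [
    "https://example.com/search/waiting",
    "https://example.com/static/app.js",
    "https://example.com/api/results?id=42",
]

# The question mark is escaped because it is a regex metacharacter.
print(first_matching_url(loaded_urls, r"/api/results\?id=\d+"))
```

A pattern that is too loose (for example `results`) could also match an earlier URL such as a waiting page, so it is worth making the expression as specific as the site allows.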
Sometimes the only reliable way to determine when an action has completed is to wait for certain web content to appear on the web page. A Wait for Content sub-command can be used to wait for web content.
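Conceptually, waiting for content amounts to checking repeatedly for the content until it appears or a time limit is reached. The Python sketch below shows that general polling technique only; it is not how the Wait for Content sub-command is implemented, and `wait_for_content`, `is_present`, and the default timings are hypothetical.

```python
import time


def wait_for_content(is_present, timeout=5.0, poll_interval=0.1):
    """Poll `is_present()` until it returns True or `timeout` seconds pass.

    Returns True if the content appeared in time, otherwise False.
    """
    deadline = time.monotonic() + timeout
    while True:
        if is_present():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)


# Simulated page: the content "appears" on the third check.
checks = {"count": 0}


def results_table_present():
    checks["count"] += 1
    return checks["count"] >= 3


print(wait_for_content(results_table_present, timeout=1.0, poll_interval=0.01))
```

The check runs at least once even with a zero timeout, so content that is already present is always detected.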
Action commands will wait for browser activities for a certain period of time before the wait times out, and the command either considers the action completed or reports the timeout as a page load error. The default timeout values are usually appropriate, but there will sometimes be situations where some timeout values should be modified. For example, timeout values may need to be increased for a very slow website in order for the agent to work properly, or timeout values could be decreased for a very fast website in order to increase agent performance.
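The two possible outcomes of a timeout, and the effect of raising a timeout value for a slow website, can be illustrated with a small model. This Python sketch is purely hypothetical: `resolve_wait`, `PageLoadError`, and the numbers used are assumptions made for illustration and are not product settings.

```python
class PageLoadError(Exception):
    """Raised when a wait times out and the timeout is treated as an error."""


def resolve_wait(activity_seconds, timeout_seconds, timeout_is_error):
    """Model the outcome of waiting for an activity of a given duration."""
    if activity_seconds <= timeout_seconds:
        return "completed"
    if timeout_is_error:
        raise PageLoadError(f"no completion within {timeout_seconds} s")
    # The wait timed out, but the command treats the action as completed.
    return "completed after timeout"


# A slow site whose search results take 45 seconds to arrive.
print(resolve_wait(45.0, timeout_seconds=30.0, timeout_is_error=False))
print(resolve_wait(45.0, timeout_seconds=60.0, timeout_is_error=True))
```

With the shorter timeout the command moves on before the content has arrived, while the longer timeout lets the wait finish normally.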
The Browser Activity screen shows all browser activities that occur after the current action executes. You can use this information to identify potential issues with the configuration of the action. Use the Activity button on the Content Grabber status bar to open the Browser Activity screen, as shown in the figure below:
The Activity button on the status bar
Critical activities have dark coloring and other activities have light coloring. A blue row appears in the sequence at the point where the command recognizes completion of the action. Activities that occur after the action completes do not necessarily indicate a problem. However, if the agent does not work as you expect, you may need to reconfigure the action so that it waits for some or all of those activities.
Activity is seen after the action completes, which may not be a problem