Web Element Content



Web Element Content is typically the most frequently used command, since it is the command that captures text content from the target web page. It lets you choose which type of content to extract from a particular web element.


The properties shown in the following table are available to all Web Element Content commands:

Text

The text that displays in the web browser. This is the most common choice and also the default.

Outer HTML

The entire HTML of the chosen web element, including the HTML of any child elements.

Inner HTML

The entire HTML of all child elements of the selected web element, but not the tag HTML of the chosen element itself.

Tag Text

The text of the selected web element, excluding the text of any child elements.

Unique ID

A unique identifier. The web selection is ignored if this option is selected.

Default File Name

Returns the file name specified by a response from a web server. If no file name is specified, extracts a file name from an href attribute, or returns a unique identifier if the href does not exist.


This attribute is mostly relevant to file capture commands.
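
The difference between the Text, Outer HTML, Inner HTML, and Tag Text options can be illustrated with a small sketch. This is not the product's implementation. It uses Python's standard xml.etree.ElementTree on a hypothetical, well-formed snippet, and the function names are made up for illustration. In particular, the tag_text function reflects one reading of "excluding the text of any child elements" and may differ from how the command treats whitespace.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup standing in for a selected web element.
SNIPPET = '<div class="price">Price: <b>$400</b> each</div>'


def text(elem):
    """Text option: all text in the element, including child elements."""
    return "".join(elem.itertext())


def outer_html(elem):
    """Outer HTML option: the element's own tag plus everything inside it."""
    return ET.tostring(elem, encoding="unicode")


def inner_html(elem):
    """Inner HTML option: everything inside the element, without its own tag."""
    parts = [elem.text or ""]
    for child in elem:
        # tostring() on a child also emits the text that follows the child.
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)


def tag_text(elem):
    """Tag Text option: the element's own text, skipping text inside children."""
    parts = [elem.text or ""]
    parts.extend(child.tail or "" for child in elem)
    return "".join(parts)


elem = ET.fromstring(SNIPPET)
for option in (text, outer_html, inner_html, tag_text):
    print(f"{option.__name__}: {option(elem)!r}")
```

Here the Text result contains "$400" because it comes from the child b element, while the Tag Text result omits it.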


In addition to the default options given above, some web elements may have other attributes available, such as Class, Name, ID, Value, and URL. If a web element has an attribute that is not shown in the default drop-down box, you can simply enter the name of the attribute you want to extract.


The figure below shows the Command Properties panel after choosing Web Element Content from the New Command drop-down:




The following properties are specific to Web Element Content commands and are listed in the HTML Capture section on the command property tab:



HTML Attribute

The HTML attribute to extract from the selected web element.

Concatenate Multiple Web Elements

Concatenates content if multiple web elements are selected. Only the first web element will be used if this property is set to false.

Concatenate Content Separator

The separator to use between content from multiple web elements. This property is only applicable when Concatenate Multiple Web Elements is set to true.
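
The interaction between the two concatenation properties can be sketched as follows. This is only an illustration of the behavior described above, not the product's code. The function name, parameter names, and the default separator are assumptions.

```python
def capture_content(contents, concatenate=False, separator=", "):
    """Combine content captured from one or more selected web elements.

    contents    -- list of strings, one per selected web element
    concatenate -- mirrors Concatenate Multiple Web Elements
    separator   -- mirrors Concatenate Content Separator (default is assumed)
    """
    if not contents:
        return ""
    if concatenate:
        # All selected elements contribute, joined by the separator.
        return separator.join(contents)
    # With concatenation off, only the first element is used
    # and the separator is ignored.
    return contents[0]


sizes = ["Small", "Medium", "Large"]
print(capture_content(sizes))
print(capture_content(sizes, concatenate=True, separator=" | "))
```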


The following properties are available in the Web Selection section on the Properties tab:




The selection XPath for the chosen web element. See the Web Selection Properties article for more information.


Extracting URLs deserves a special mention, because it is more common to extract a link URL than to navigate to the link, or to extract an image URL than to download the image itself. To extract a URL from a web element, select the element and extract the URL or Image URL attribute. You can also enter the actual HTML tag attributes: src for images and href for links. However, the URL and Image URL attributes automatically convert any relative URLs to absolute URLs, which is best in most cases.
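
To show what converting a relative URL to an absolute URL means, here is a minimal sketch using Python's standard urllib.parse.urljoin. It demonstrates the general resolution rule rather than the command's internals. The page URL and attribute values are hypothetical.

```python
from urllib.parse import urljoin

# Hypothetical address of the page the web element was selected on.
PAGE_URL = "https://www.example.com/products/list.html"


def to_absolute(page_url, raw_value):
    """Resolve a raw href or src value against the URL of its page."""
    return urljoin(page_url, raw_value)


# Raw values as they might appear in href or src attributes.
for raw in ("details.html?id=7", "../images/item.png", "https://cdn.example.com/a.png"):
    print(raw, "->", to_absolute(PAGE_URL, raw))
```

Extracting the raw href or src attribute would give the values on the left. The values on the right are the kind of absolute URL that can be used later without knowing which page the link came from.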


Content Transformation Script

The Web Element Content command allows you to use regular expressions or a .NET script to transform the extracted content. In most cases we recommend that you write expressions or use a script to clean the data that you extract. You can also split data, such as the elements of a postal address, into separate fields.


Example: Consider a case in which you want to extract product data that includes a price of $400. You could use a transformation script to strip off the "$" character and leave only the numeric value.
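
The logic of such transformations can be sketched with regular expressions. The example below is written in Python purely for illustration. The command itself uses its own regular expression syntax or a .NET script, which is covered in the Content Transformation Script topic. The function names, the sample values, and the assumed US-style address layout are all hypothetical.

```python
import re


def clean_price(raw):
    """Keep only digits and the decimal point, dropping '$' and commas."""
    return re.sub(r"[^0-9.]", "", raw)


# Assumed layout: "street, city, ST 12345". Real addresses vary widely.
ADDRESS_PATTERN = re.compile(
    r"^(?P<street>[^,]+),\s*(?P<city>[^,]+),\s*(?P<state>[A-Z]{2})\s+(?P<zip>\d{5})$"
)


def split_address(raw):
    """Split a one-line address into separate fields, or return None."""
    match = ADDRESS_PATTERN.match(raw.strip())
    return match.groupdict() if match else None


print(clean_price("$400"))
print(split_address("123 Main St, Springfield, IL 62704"))
```

Returning None for an address that does not match the assumed layout makes it easy to spot records that need a different pattern, instead of silently storing a badly split value.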


Please see the Content Transformation Script topic for more information.


These common properties are also available for this command:

Capture Command Properties

Agent Command Properties