Building a Desktop Application

The Content Grabber application can build stand-alone agents. This is a convenient way to distribute agents that can be configured and run without requiring the full Content Grabber application on the target computer. However, stand-alone agents have a standardized user interface and can only export data to file formats. If you want full control of the user interface and export features, you can build your own desktop application using the Content Grabber API.

 

See these topics for more information:

Visual Studio Configuration

Distributing Your Application

 

Examples

The following example uses the API to run an agent with a set of input parameters. It sets the log level to High and specifies that log information should be written to a file.

 

AgentApi api = new AgentApi(@"C:\Users\Public\Documents\Content Grabber\Agents\qantasApiTest\qantasApiTest.scg");

AgentSettings settings = new AgentSettings();

settings.InputParameters.Add("from", "SYD");

settings.InputParameters.Add("to", "MEL");

settings.InputParameters.Add("departure_date", DateTime.Now.ToString("yyyy-MM-dd"));

settings.InputParameters.Add("return_date", DateTime.Now.ToString("yyyy-MM-dd"));

settings.InputParameters.Add("travel_class", "ECO");

settings.InputParameters.Add("adults", "1");

settings.InputParameters.Add("children", "0");

settings.LogLevel = AgentLogLevel.High;

settings.IsLogToFile = true;            

api.RunAgent(settings);

 

The following example uses the API to run an agent asynchronously and polls the agent status every two seconds until the run finishes.

 

AgentApi api = new AgentApi(@"C:\Users\Public\Documents\Content Grabber\Agents\test\test.scg");

api.StartAgent();

 

AgentStatus status = api.GetAgentStatus();

while(status.IsRunning())

{

 Thread.Sleep(2000); // wait two seconds between status checks (Thread is in System.Threading)

 status = api.GetAgentStatus();

}

if (status.IsCompletedWithFailure())

{

 Console.WriteLine("Failure");        

}

else

{

 Console.WriteLine(api.GetAgentDataAsJson());        

}
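
The loop above keeps polling for as long as the agent reports that it is running. As a minimal sketch, assuming you want to give up after a fixed wait, the helper below applies the same polling pattern with a timeout. Poller and WaitWhile are hypothetical names that are not part of the Content Grabber API; the condition is passed in as a delegate so the sketch compiles without the API assembly.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper, not part of the Content Grabber API.
public static class Poller
{
    // Sleeps in 'interval' steps while 'isRunning' returns true.
    // Returns true if the condition became false before 'timeout' elapsed,
    // or false if the wait was abandoned because the timeout was reached.
    public static bool WaitWhile(Func<bool> isRunning, TimeSpan interval, TimeSpan timeout)
    {
        var clock = Stopwatch.StartNew();
        while (isRunning())
        {
            if (clock.Elapsed >= timeout)
            {
                return false;
            }
            Thread.Sleep(interval);
        }
        return true;
    }
}
```

With the API, the condition would be something like () => api.GetAgentStatus().IsRunning(), using the calls shown in the example above.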

 

Sending Messages to the Host Application

An agent that runs asynchronously on the same computer as the host application can send custom messages to the host application. The agent and the host application run in different processes, so inter-process communication is required for the agent to send messages to the host application. The Content Grabber API provides inter-process communication functionality that is based on named pipes.
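
To illustrate the mechanism in general terms, the following sketch sends one line of text through a named pipe using the standard .NET System.IO.Pipes classes. It does not show how Content Grabber implements its pipe. PipeDemo, RoundTrip, and the pipe name are hypothetical, and a background task stands in for the second process. The receiving end is created before the sender connects.

```csharp
using System;
using System.IO;
using System.IO.Pipes;
using System.Threading.Tasks;

// Hypothetical illustration of named-pipe messaging in .NET.
// This is not the Content Grabber implementation.
public static class PipeDemo
{
    // Sends 'text' through a named pipe and returns what the receiver read.
    public static string RoundTrip(string pipeName, string text)
    {
        // Create the receiving end first so the sender has something to connect to.
        using (var server = new NamedPipeServerStream(pipeName, PipeDirection.In))
        {
            // The sender would normally live in another process; a task stands in for it here.
            Task sender = Task.Run(() =>
            {
                using (var client = new NamedPipeClientStream(".", pipeName, PipeDirection.Out))
                {
                    client.Connect(5000); // wait up to five seconds for the receiver
                    using (var writer = new StreamWriter(client))
                    {
                        writer.WriteLine(text);
                    }
                }
            });

            server.WaitForConnection();
            using (var reader = new StreamReader(server))
            {
                string received = reader.ReadLine();
                sender.Wait();
                return received;
            }
        }
    }
}
```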

 

The following example uses the API method StartPipe to start a named pipe, and then waits for messages to arrive from the agent. The agent always sends the message Completed when it has completed its run. All other messages must be sent from custom scripts in the agent.

 

using (var api = new AgentApi(@"C:\Users\Public\Documents\Content Grabber\Agents\test\test.scg"))

{

 var pipe = api.StartPipe();

 api.StartAgent();

 while (true)

 {

         var message = pipe.GetNextMessage();

 

         if (message == null)

                 continue;

 

         if (message.MessageType == CgMessageType.Completed)

         {

                 break;

         }

         Console.WriteLine(message.Message);

 }

}      

 

The following Execute Script command sends the currently extracted data entry to the host application.

 

using System;

using Sequentum.ContentGrabber.Api;

public class Script

{        

 public static CustomScriptReturn CustomScript(CustomScriptArguments args)

 {

         args.ScriptUtilities.SendMessage(CgMessageType.Data, args.DataRow.ToJson());

         

         return CustomScriptReturn.Empty();

 }

}

 

 

 

Agent API Functions

Function

Description

AgentApi(string agentNameOrPath, string sessionId)

Instantiates a new API class with the specified agent and session ID. You can specify the full path to an agent file or just the name of the agent. If you only specify the agent name, Content Grabber will look for the agent in the default location for the user running the agent service. The default agent location for the local System account is:

 

C:\Users\Public\Documents\Content Grabber\Agents

 

AgentApi(string agentNameOrPath)

Instantiates a new API class without a session. You can specify the full path to an agent file or just the name of the agent. If you only specify the agent name, Content Grabber will look for the agent in the default location for the user running the agent service. The default agent location for the local System account is:

 

C:\Users\Public\Documents\Content Grabber\Agents

 

AgentApi(string key, string agentName, string sessionId)

Instantiates a new API class with the specified agent, session ID and access key. The agentName parameter should be the agent name without the full path. Content Grabber will use the access key to determine the agent's full path.

 

The sessionId can be set to null if no sessionId is required.

 

void Connect(string endPointAddress)

Connects to a Content Grabber agent service. You can specify the server name or IP address and port number. The default connection string for a local service is:

 

http://localhost:8000/ContentGrabber

 

void CloseConnection()

Closes the connection to the Content Grabber agent service.

void StartAgent()

Runs the agent specified when instantiating the API. The agent will run asynchronously.

void StartAgent(AgentSettings settings)

Runs the agent with additional settings. The agent will run asynchronously. See below for more information about the AgentSettings class.

void StopAgent()

Stops the agent if it is currently running.

void CloseAgentSession()

Closes an agent session. When you close an agent session, all data associated with that session is removed and you will not be able to retrieve status information about the agent that ran in this session. You can only close a session if an agent is not currently running in the session.

 

You don't need to close a session. Session data will be removed automatically after the agent has completed running and the session timeout has elapsed. The default session timeout is 30 minutes, so by default session data will be removed automatically 30 minutes after the agent has completed.

AgentStatus GetAgentStatus()

Returns status information about an agent that has been run asynchronously. See below for more information about the AgentStatus class.

DataTable GetAgentProgressAsDataTable()

Returns progress information in a DataTable for an agent that is running asynchronously. See below for details of the information returned.

string GetAgentProgressAsJson()

Returns progress information as a JSON string for an agent that is running asynchronously. See below for details of the information returned.

DataTable GetAgentLogAsDataTable(offset, limit)

Returns log data in a DataTable for an agent that has been run asynchronously. This function does not return any data if logging is disabled or if logging is written to file. See below for more information about the information returned.

 

offset (optional): Index of the first log entry to return.

limit (optional): Index of the last log entry to return.

string GetAgentLogAsJson(offset, limit)

Returns log data as a JSON string for an agent that has been run asynchronously. This function does not return any data if logging is disabled or if logging is written to file. See below for more information about the information returned.

 

offset (optional): Index of the first log entry to return.

limit (optional): Index of the last log entry to return.

DataSet GetAgentExportDataAsDataSet(offset, limit)

Returns extracted data in a DataSet for an agent that has been run asynchronously.

 

offset (optional): Index of the first data entry to return.

limit (optional): Index of the last data entry to return.

string GetAgentExportDataAsJson(offset, limit)

Returns extracted data as a JSON string for an agent that has been run asynchronously.

 

offset (optional): Index of the first data entry to return.

limit (optional): Index of the last data entry to return.

string GetAgentExportDataAsXml(offset, limit)

Returns extracted data as an XML string for an agent that has been run asynchronously.

 

offset (optional): Index of the first data entry to return.

limit (optional): Index of the last data entry to return.

string RunAgentReturnJson(string agentNameOrPath, limit)

Runs an agent synchronously and returns extracted data as a JSON string. The agent is always run in a session when the agent supports sessions, and the session is closed automatically after the agent has completed its run.

 

limit (optional): Maximum number of data rows to return.

string RunAgentReturnXml(string agentNameOrPath, limit)

Runs an agent synchronously and returns extracted data as an XML string. The agent is always run in a session when the agent supports sessions, and the session is closed automatically after the agent has completed its run.

 

limit (optional): Maximum number of data rows to return.

DataSet RunAgentReturnDataSet(string agentNameOrPath, limit)

Runs an agent synchronously and returns extracted data in a DataSet. The agent is always run in a session when the agent supports sessions, and the session is closed automatically after the agent has completed its run.

 

limit (optional): Maximum number of data rows to return.

string RunAgentReturnJson(AgentSettings settings, limit)

Runs an agent synchronously with additional settings and returns extracted data as a JSON string. The agent is always run in a session when the agent supports sessions, and the session is closed automatically after the agent has completed its run.

 

See below for more information about the AgentSettings class.

 

limit (optional): Maximum number of data rows to return.

string RunAgentReturnXml(AgentSettings settings, limit)

Runs an agent synchronously with additional settings and returns extracted data as an XML string. The agent is always run in a session when the agent supports sessions, and the session is closed automatically after the agent has completed its run.

 

See below for more information about the AgentSettings class.

 

limit (optional): Maximum number of data rows to return.

DataSet RunAgentReturnDataSet(AgentSettings settings, limit)

Runs an agent synchronously with additional settings and returns extracted data in a DataSet. The agent is always run in a session when the agent supports sessions, and the session is closed automatically after the agent has completed its run.

 

See below for more information about the AgentSettings class.

 

limit (optional): Maximum number of data rows to return.

Agent GetAgent()

Returns the agent specified when instantiating the API class.

SaveAgent(Agent agent)

Saves the specified agent.

CgPipeIn StartPipe()

Starts a pipe that can be used to send messages from a running agent to the host application. The pipe must be started before the agent is started.

 

AgentSettings

The following agent settings can be specified when running an agent:

Property

Description

bool IsViewBrowser

The web browser is displayed while the agent runs. This option has no effect when an agent is being run from a Windows service.

bool IsNoUi

No user interface is displayed while the agent runs, and the web browser does not render web content on the screen.

AgentRunMethod RunMethod

Specifies how to run the agent: Restart, Continue, or Continue And Retry Errors.

string ScheduleUserName

The full user name, including domain name, used when retrieving schedule information.

string SchedulePassword

The password used when retrieving schedule information.

GlobalDataDictionary GlobalData

Any serializable data object can be stored in this dictionary and will be available to all scripts in an agent. Note that input parameters are eventually stored in this dictionary as well, so it doesn't matter whether you use input parameters or global data to store your input data.

AgentLogLevel LogLevel

Log detail level. Set the log level to None to turn off logging.

bool IsLogHtml

Logs the raw HTML of all web pages processed by the agent.

bool IsLogToFile

Logs data to a file instead of a database table.

string LogFilePath

The log path if logging to a file. The path can be a directory or a specific file. In both cases the directory must exist.

int Timeout

This value specifies the session timeout in minutes when an agent is run asynchronously. All session data is removed automatically when the agent has completed and this timeout has elapsed. The default session timeout is 30 minutes.

 

This value specifies the maximum number of seconds an agent will run when it's run synchronously. When the timeout is reached, the agent will stop and close its session if it's run in a session. The default timeout is 30 seconds.

Dictionary<string, string> InputParameters

A list of input parameters.

 

AgentStatus

An agent can provide the following status information:

 

Property

Description

RunStatus

The RunStatus can be one of the following values:

Completed. The agent has completed successfully.

Incomplete. The agent has completed, but stopped prematurely. The agent may have been stopped manually.

Failed. The agent has completed, but a critical error occurred.

Idle. The agent has never been run.

Starting. The agent is starting.

ExportingData. The agent is exporting data to the specified export target.

Stopping. The agent is in the process of stopping.

Restarting. The agent is restarting. This usually occurs when the agent needs to clear JavaScript memory leaks.

ExportFailed. The agent completed, but failed to export data.

PageLoads

The number of page loads. This includes AJAX calls triggered by agent actions.

StartTime

The amount of time the agent has run.

int MissingElements

The number of times an agent command could not find its specified content when that content was not marked as optional.

int PageErrors

The number of page load errors. This includes errors loading content from AJAX calls that were triggered by agent actions.

DateTime StartTime

The time the agent started.

 

Agent Progress Data

An agent can provide progress data in a DataTable. The DataTable contains a DataRow for each web browser the agent is using to extract data. Each DataRow contains a status column and a description column. The progress data is the same information displayed when running an agent in the Content Grabber agent editor.

 

Agent Log Data

An agent can provide log data in a DataTable. The DataTable contains a log level column and a description column. A log level of 1 means an error, 2 means a warning and 3 means information. The log data is the same data you can view in the Content Grabber agent editor.
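
As a small sketch of how a host application might present this data, the helper below turns the numeric log level into a label and formats each row as text. It assumes the log level is the first column and the description is the second, which this page does not specify. LogFormatter, Describe, and Format are hypothetical names that are not part of the Content Grabber API.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical helper, not part of the Content Grabber API.
public static class LogFormatter
{
    // Maps the numeric log level described above to a readable label.
    public static string Describe(int level)
    {
        switch (level)
        {
            case 1: return "Error";
            case 2: return "Warning";
            case 3: return "Information";
            default: return "Unknown (" + level + ")";
        }
    }

    // Formats each log row as "<label>: <description>".
    // Assumption: column 0 holds the log level and column 1 the description.
    public static List<string> Format(DataTable log)
    {
        var lines = new List<string>();
        foreach (DataRow row in log.Rows)
        {
            int level = Convert.ToInt32(row[0]);
            lines.Add(Describe(level) + ": " + row[1]);
        }
        return lines;
    }
}
```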

 

Agent Export Data

The API can provide extracted data in a DataSet, as an XML string, or as a JSON string. For large amounts of data, use the offset and limit parameters to page through the data. Offset is the index of the first data entry to return and limit is the index of the last data entry to return. The API method GetAgentStatus returns an Export Row Count value, which contains the total number of data entries available. See Data Counting for more information about the Export Row Count value.
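
The paging arithmetic can be sketched as follows. The helper computes successive offset and limit pairs for a known total row count, following this page's definition of limit as the index of the last entry to return. It assumes indexes are zero-based and inclusive, which this page does not state. ExportPaging, GetPages, and the page size are hypothetical and not part of the Content Grabber API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the Content Grabber API.
public static class ExportPaging
{
    // Returns (offset, limit) pairs covering rows 0..totalRows-1.
    // Key is the offset (index of the first row in the page) and
    // Value is the limit (index of the last row in the page).
    // Assumption: indexes are zero-based and the limit is inclusive.
    public static List<KeyValuePair<int, int>> GetPages(int totalRows, int pageSize)
    {
        if (totalRows < 0) throw new ArgumentOutOfRangeException("totalRows");
        if (pageSize <= 0) throw new ArgumentOutOfRangeException("pageSize");

        var pages = new List<KeyValuePair<int, int>>();
        for (int offset = 0; offset < totalRows; offset += pageSize)
        {
            int limit = Math.Min(offset + pageSize, totalRows) - 1;
            pages.Add(new KeyValuePair<int, int>(offset, limit));
        }
        return pages;
    }
}
```

Each pair could then be passed to a call such as api.GetAgentExportDataAsJson(page.Key, page.Value), with the total taken from the Export Row Count value. If the service turns out to use one-based indexes, the loop bounds would need to shift by one.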