libsoup Client Basics

Client-side tutorial

This section explains how to use libsoup as an HTTP client using several new APIs introduced in version 2.42. If you want to be compatible with older versions of libsoup, consult the documentation for that version.


Creating a SoupSession

The first step in using the client API is to create a SoupSession. The session object encapsulates all of the state that libsoup keeps on behalf of your program: cached HTTP connections, authentication information, and so on.

When you create the session with soup_session_new_with_options, you can specify various additional options:

SOUP_SESSION_MAX_CONNS

Allows you to set the maximum total number of connections the session will have open at one time. (Once it reaches this limit, it will either close idle connections, or wait for existing connections to free up before starting new requests.) The default value is 10.

SOUP_SESSION_MAX_CONNS_PER_HOST

Allows you to set the maximum total number of connections the session will have open to a single host at one time. The default value is 2.

SOUP_SESSION_USER_AGENT

Allows you to set a User-Agent string that will be sent on all outgoing requests.

SOUP_SESSION_ACCEPT_LANGUAGE and SOUP_SESSION_ACCEPT_LANGUAGE_AUTO

Allow you to set an Accept-Language header on all outgoing requests. SOUP_SESSION_ACCEPT_LANGUAGE takes a list of language tags to use, while SOUP_SESSION_ACCEPT_LANGUAGE_AUTO automatically generates the list from the user's locale settings.

SOUP_SESSION_HTTP_ALIASES and SOUP_SESSION_HTTPS_ALIASES

Allow you to tell the session to recognize additional URI schemes as aliases for "http" or "https". You can set this if you are using URIs with schemes like "dav" or "webcal" (and in particular, you need to set this if the server you are talking to might return redirects with such a scheme).

SOUP_SESSION_PROXY_RESOLVER and SOUP_SESSION_PROXY_URI

SOUP_SESSION_PROXY_RESOLVER specifies a GProxyResolver to use to determine the HTTP proxies to use. By default, this is set to the resolver returned by g_proxy_resolver_get_default, so you do not need to set it yourself.

Alternatively, if you want all requests to go through a single proxy, you can set SOUP_SESSION_PROXY_URI.

SOUP_SESSION_ADD_FEATURE and SOUP_SESSION_ADD_FEATURE_BY_TYPE

These allow you to specify SoupSessionFeatures (discussed below) to add at construct-time.

Other properties are also available; see the SoupSession documentation for more details.

If you don't need to specify any options, you can just use soup_session_new, which takes no arguments.


Session features

Additional session functionality is provided as SoupSessionFeatures, which can be added to a session either at construction time, via the SOUP_SESSION_ADD_FEATURE and SOUP_SESSION_ADD_FEATURE_BY_TYPE options, or afterward, via the soup_session_add_feature and soup_session_add_feature_by_type functions.

A SoupContentDecoder is added for you automatically. This advertises to servers that the client supports compression, and automatically decompresses compressed responses.

Some other available features that you can add include:

SoupLogger

A debugging aid, which logs all of libsoup's HTTP traffic to stdout (or another place you specify).

SoupCookieJar, SoupCookieJarText, and SoupCookieJarDB

Support for HTTP cookies. SoupCookieJar provides non-persistent cookie storage, while SoupCookieJarText uses a text file to keep track of cookies between sessions, and SoupCookieJarDB uses a SQLite database.

SoupContentSniffer

Uses the HTML5 sniffing rules to attempt to determine the Content-Type of a response when the server does not identify the Content-Type, or appears to have provided an incorrect one.

Use the "add_feature_by_type" property/function to add features that don't require any configuration (such as SoupContentSniffer), and the "add_feature" property/function to add features that must be constructed first (such as SoupLogger). For example, an application might do something like the following:

session = soup_session_new_with_options (
	SOUP_SESSION_ADD_FEATURE_BY_TYPE, SOUP_TYPE_CONTENT_SNIFFER,
	NULL);

if (debug_level) {
	SoupLogger *logger;

	logger = soup_logger_new (debug_level, -1);
	soup_session_add_feature (session, SOUP_SESSION_FEATURE (logger));
	g_object_unref (logger);
}

Creating and Sending SoupMessages

Once you have a session, you send HTTP requests using SoupMessage. In the simplest case, you only need to create the message and it's ready to send:

SoupMessage *msg;

msg = soup_message_new ("GET", "http://example.com/");

In more complicated cases, you can use various SoupMessage, SoupMessageHeaders, and SoupMessageBody methods to set the request headers and body of the message:

SoupMessage *msg;

msg = soup_message_new ("POST", "http://example.com/form.cgi");
soup_message_set_request (msg, "application/x-www-form-urlencoded",
                          SOUP_MEMORY_COPY, formdata, strlen (formdata));
soup_message_headers_append (msg->request_headers, "Referer", referring_url);

(Although this is a bad example, because libsoup actually has convenience methods for dealing with HTML forms, as well as XML-RPC.)

You can also use soup_message_set_flags to change some default behaviors. For example, by default, SoupSession automatically handles responses from the server that redirect to another URL. If you would like to handle these yourself, you can set the SOUP_MESSAGE_NO_REDIRECT flag.

Sending a Message Synchronously

To send a message and wait for the response, use soup_session_send:

GInputStream *stream;
GError *error = NULL;

stream = soup_session_send (session, msg, cancellable, &error);

At the point when soup_session_send returns, the request will have been sent, and the response headers read back in; you can examine the message's status_code, reason_phrase, and response_headers fields to see the response metadata. To get the response body, read from the returned GInputStream, and close it when you are done.

Note that soup_session_send only returns an error if a transport-level problem occurs (e.g., it could not connect to the host, or the request was cancelled). Use the message's status_code field to determine whether the request succeeded at the HTTP level (i.e., "200 OK" vs "400 Bad Request").

If you would prefer to have libsoup gather the response body for you and then return it all at once, you can use the older soup_session_send_message API:

guint status;

status = soup_session_send_message (session, msg);

In this case, the response body will be available in the message's response_body field, and transport-level errors will be indicated in the status_code field via special pseudo-HTTP-status codes like SOUP_STATUS_CANT_CONNECT.

Sending a Message Asynchronously

To send a message asynchronously, use soup_session_send_async:

{
	...
	soup_session_send_async (session, msg, cancellable, my_callback, my_callback_data);
	...
}

static void
my_callback (GObject *object, GAsyncResult *result, gpointer user_data)
{
	GInputStream *stream;
	GError *error = NULL;

	stream = soup_session_send_finish (SOUP_SESSION (object), result, &error);
	...
}

The message will be added to the session's queue, and eventually (when control is returned back to the main loop), it will be sent and the response will be read. When the message has been sent, and its headers received, the callback will be invoked, in the standard GAsyncReadyCallback style.

As with synchronous sending, there is also an alternate API, soup_session_queue_message, in which your callback is not invoked until the response has been completely read:

{
	...
	soup_session_queue_message (session, msg, my_callback, my_callback_data);
	...
}

static void
my_callback (SoupSession *session, SoupMessage *msg, gpointer user_data)
{
	/* msg->response_body contains the response */
}

soup_session_queue_message is slightly unusual in that it steals a reference to the message object, and unrefs it after the last callback is invoked on it. So when using this API, you should not unref the message yourself.


Processing the Response

Once you have received the initial response from the server, synchronously or asynchronously, streaming or not, you can look at the response fields in the SoupMessage to decide what to do next. The status_code and reason_phrase fields contain the numeric status and textual status response from the server. response_headers contains the response headers, which you can investigate using soup_message_headers_get and soup_message_headers_foreach.

SoupMessageHeaders automatically parses several important headers in response_headers for you and provides specialized accessors for them, such as soup_message_headers_get_content_type. There are also several generic methods such as soup_header_parse_param_list (for parsing an attribute-list-type header) and soup_header_contains (for quickly testing if a list-type header contains a particular token). These handle the various syntactical oddities of parsing HTTP headers much better than functions like g_strsplit or strstr.


Handling Authentication

SoupSession handles most of the details of HTTP authentication for you. If it receives a 401 ("Unauthorized") or 407 ("Proxy Authentication Required") response, the session will emit the authenticate signal, providing you with a SoupAuth object indicating the authentication type ("Basic", "Digest", or "NTLM") and the realm name provided by the server. If you have a username and password available (or can generate one), call soup_auth_authenticate to give the information to libsoup. The session will automatically requeue the message and try it again with that authentication information. (If you don't call soup_auth_authenticate, the session will just return the message to the application with its 401 or 407 status.)

If the server doesn't accept the username and password provided, the session will emit authenticate again, with the retrying parameter set to TRUE. This lets the application know that the information it provided earlier was incorrect, and gives it a chance to try again. If this username/password pair also doesn't work, the session will continue to emit authenticate until the provided username/password successfully authenticates, or until the signal handler fails to call soup_auth_authenticate, at which point libsoup will allow the message to fail (with status 401 or 407).

If you need to handle authentication asynchronously (e.g., to pop up a password dialog without recursively entering the main loop), you can do that as well. Call soup_session_pause_message on the message before returning from the signal handler, and g_object_ref the SoupAuth. Then, later on, after calling soup_auth_authenticate (or deciding not to), call soup_session_unpause_message to resume the paused message.

By default, NTLM authentication is not enabled. To add NTLM support to a session, call:

soup_session_add_feature_by_type (session, SOUP_TYPE_AUTH_NTLM);

(You can also disable Basic or Digest authentication by calling soup_session_remove_feature_by_type on SOUP_TYPE_AUTH_BASIC or SOUP_TYPE_AUTH_DIGEST.)


Multi-threaded usage

A SoupSession can be used from multiple threads. However, if you are using the async APIs, then each thread you use the session from must have its own thread-default GMainContext.

SoupMessage is not thread-safe, so once you send a message on the session, you must not interact with it from any thread other than the one where it was sent.


Sample Programs

A few sample programs are available in the libsoup sources, in the examples directory:

  • get is a simple command-line HTTP GET utility using the asynchronous API.

  • simple-proxy uses both the client and server APIs to create a simple (and not very RFC-compliant) proxy server. It shows how to use the SOUP_MESSAGE_OVERWRITE_CHUNKS flag when reading a message to save memory by processing each chunk of the message as it is read, rather than accumulating them all into a single buffer to process all at the end.

More complicated examples are available in GNOME git.