The SPARQL 1.1 specifications contain a number of informative
"Security considerations" sections. This section describes how those
considerations apply to the implementation of Tracker.
Note that most of these considerations derive from situations where a SPARQL store is exposed through a public endpoint, while Tracker does not do that by default. Users should be careful about creating endpoints. For D-Bus endpoints, access through the portal is encouraged.
SPARQL queries using FROM, FROM NAMED, or GRAPH may cause the specified URI to be dereferenced. This may cause additional use of network, disk or CPU resources along with associated secondary issues such as denial of service. The security issues of Uniform Resource Identifier (URI): Generic Syntax [RFC3986] Section 7 should be considered. In addition, the contents of file: URIs can in some cases be accessed, processed and returned as results, providing unintended access to local resources. SPARQL requests may cause additional requests to be issued from the SPARQL endpoint, such as FROM NAMED. The endpoint is potentially within an organisations firewall or DMZ, and so such queries may be a source of indirection attacks.
Graph URIs are virtual in Tracker and do not cause any access outside of
database resources. The only SPARQL features capable of dereferencing or accessing
external resources are SERVICE <uri> and LOAD <rdf-file>.
The SPARQL language permits extensions, which will have their own security implications.
Tracker SPARQL extensions have no special security considerations, other than being code that runs on silicon.
SPARQL queries using SERVICE imply that a URI will be dereferenced, and that the result will be incorporated into a working data set.
Since a SPARQL protocol service may make HTTP requests of other origin servers on behalf of its clients, it may be used as a vector of attacks against other sites or services. Thus, SPARQL protocol services may effectively act as proxies for third-party clients. Such services may place restrictions on the resources that they retrieve or on the rate at which external resources can be retrieved. SPARQL protocol services may log client requests in such a way as to facilitate tracing them with regard to third-party origin servers or services. SPARQL protocol services may choose to detect these and other costly, or otherwise unsafe, queries, impose time or memory limits on queries, or impose other restrictions to reduce the service's (and other service's) vulnerability to denial-of-service attacks. They also may refuse to process such query requests.
Tracker offers 2 types of endpoint that are susceptible to this vector:
- D-Bus endpoints accessed outside a sandbox.
- HTTP endpoints.
In particular, requests made to a D-Bus endpoint through the portal from a sandboxed process have all SERVICE access restricted.
Tracker developers recommend that all access to endpoints created on D-Bus happen through the portal, and that all HTTP endpoints validate the provenance of requests through the block-remote-address signal to limit access to resources.
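As a rough illustration of validating provenance, the sketch below separates the policy decision (plain Python, no Tracker dependency) from a handler shaped like a block-remote-address callback. It assumes the Python GObject-Introspection bindings, that the handler receives a Gio.InetSocketAddress, and that returning True blocks the connection. The loopback-only policy and the function names are hypothetical choices made for this example.

```python
import ipaddress


def should_block(peer_ip: str) -> bool:
    """Hypothetical policy: only loopback peers may use the endpoint."""
    try:
        ip = ipaddress.ip_address(peer_ip)
    except ValueError:
        # Anything that cannot be parsed as an IP address is rejected.
        return True
    return not ip.is_loopback


def on_block_remote_address(endpoint, address):
    """Handler shaped like a block-remote-address callback.

    Assumption: `address` is a Gio.InetSocketAddress, so get_address()
    returns a Gio.InetAddress whose to_string() yields the peer IP.
    Returning True blocks the request, returning False lets it through.
    """
    return should_block(address.get_address().to_string())


# With a real HTTP endpoint object the handler would be connected as:
#   endpoint.connect("block-remote-address", on_block_remote_address)
```

Keeping the policy in a separate function makes it possible to unit-test the decision without a running endpoint.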
There are at least two possible sources of denial-of-service attacks against SPARQL protocol services. First, under-constrained queries can result in very large numbers of results, which may require large expenditures of computing resources to process, assemble, or return. Another possible source are queries containing very complex — either because of resource size, the number of resources to be retrieved, or a combination of size and number — RDF Dataset descriptions, which the service may be unable to assemble without significant expenditure of resources, including bandwidth, CPU, or secondary storage. In some cases such expenditures may effectively constitute a denial-of-service attack. A SPARQL protocol service may place restrictions on the resources that it retrieves or on the rate at which external resources are retrieved. There may be other sources of denial-of-service attacks against SPARQL query processing services.
Tracker does not impose any time limits or rate limits on queries. HTTP endpoints may implement rate limiting through the block-remote-address signal.
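One possible way to rate-limit requests from such a handler is a per-peer sliding window, sketched below. The class name, the limits, and the injectable clock are assumptions made for illustration and are not part of Tracker. The handler again assumes a Gio.InetSocketAddress argument and that returning True blocks the request.

```python
import time
from collections import defaultdict, deque


class RateLimiter:
    """Hypothetical sliding-window limiter keyed by peer address."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock
        self._hits = defaultdict(deque)  # peer -> timestamps of accepted requests

    def should_block(self, peer: str) -> bool:
        now = self._clock()
        hits = self._hits[peer]
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._max:
            return True
        hits.append(now)
        return False


# Illustrative limit: at most 20 requests per peer in any 10 second window.
limiter = RateLimiter(max_requests=20, window_seconds=10.0)


def on_block_remote_address(endpoint, address):
    # Assumption: `address` is a Gio.InetSocketAddress.
    return limiter.should_block(address.get_address().to_string())
```

Note that this only bounds how often a peer can issue requests. It does not bound the cost of any individual query.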
Write access to data makes it inherently vulnerable to malicious access. Standard access and authentication techniques should be used in any networked environment. In particular, HTTPS should be used, especially when implementing the SPARQL HTTP-based protocols. (i.e., encryption with challenge/response based password presentation, encrypted session tokens, etc). Some of the weak points addressed by HTTPS are: authentication, active session integrity between client and server, preventing replays, preventing continuation of defunct sessions.
SPARQL protocol services may remove, insert, and change underlying data via the update operation. To protect against malicious or destructive updates, implementations may choose not to implement the update operation. Alternatively, implementations may choose to use HTTP authentication mechanisms or other implementation-defined mechanisms to prevent unauthorized invocations of the update operation.
Tracker HTTP endpoints do not implement any update mechanisms. D-Bus endpoints accessed through the portal from inside a sandbox are likewise read-only.
Systems that provide both read-only and writable interfaces can be subject to injection attacks in the read-only interface. In particular, a SPARQL endpoint with a Query service should be careful of injection attacks aimed at interacting with an Update service on the same SPARQL endpoint. Like any client code, interaction between the query service and the update service should ensure correct escaping of strings provided by the user. While SPARQL Update and SPARQL Query are separate languages, some implementations may choose to offer both at the same SPARQL endpoint. In this case, it is important to consider that an Update operation may be obscured to masquerade as a query. For instance, a string of unicode escapes in a PREFIX clause could be used to hide an Update Operation. Therefore, simple syntactic tests are inadequate to determine if a string describes a query or an update.
Following the SPARQL 1.1 specification, Tracker implements updates and queries as two different languages with different parser entry points. This separation extends all the way to the public API. As an additional layer of security, read-only queries run on read-only database connections, so it is essentially not possible to change any data through the query APIs.
API user considerations
Users of the Tracker API and SPARQL interface are encouraged to keep the following considerations in mind and take these precautions:
Do not expose any endpoints that do not need exposing.
For local D-Bus endpoints, consider using a graph partitioning scheme that makes it easy to enforce access policies on the data when it is accessed through the portal.
Avoid the possibility of injection attacks. Use TrackerSparqlStatement and avoid string-based approaches to build SPARQL queries from user input.
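The difference between the two approaches can be sketched as follows. The query text, the nie:title property, and the function names are chosen only for illustration. The second function assumes the Python GObject-Introspection bindings, where a prepared statement uses a ~name placeholder and cursor.get_string() returns a (value, length) pair. It expects an already opened Tracker.SparqlConnection.

```python
# String-based approach: user input becomes part of the query text.
UNSAFE_TEMPLATE = 'SELECT ?u {{ ?u nie:title "{title}" }}'

# Statement-based approach: the query text is fixed, ~title is a parameter.
SAFE_QUERY = "SELECT ?u { ?u nie:title ~title }"


def build_query_unsafe(title: str) -> str:
    """Do not do this: the caller controls the structure of the query."""
    return UNSAFE_TEMPLATE.format(title=title)


def find_by_title(conn, title: str) -> list:
    """Bind user input as a value through a prepared statement.

    Assumption: `conn` is a Tracker.SparqlConnection obtained elsewhere.
    """
    stmt = conn.query_statement(SAFE_QUERY, None)
    stmt.bind_string("title", title)  # treated as a literal, never parsed
    cursor = stmt.execute(None)
    uris = []
    while cursor.next(None):
        uris.append(cursor.get_string(0)[0])
    cursor.close()
    return uris


# A hostile "title" closes the string literal, widens the pattern to match
# every resource, and comments out the rest of the template.
hostile = '" . ?u ?p ?o } #'
print(build_query_unsafe(hostile))
```

With the statement-based function the same hostile string is only ever compared against titles as a literal value.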
Consider that IRIs are susceptible to homograph attacks. Quoting https://www.w3.org/TR/sparql11-protocol/#policy-security:
Different IRIs may have the same appearance. Characters in different scripts may look similar (a Cyrillic "о" may appear similar to a Latin "o"). A character followed by combining characters may have the same visual representation as another character (LATIN SMALL LETTER E followed by COMBINING ACUTE ACCENT has the same visual representation as LATIN SMALL LETTER E WITH ACUTE). Users of SPARQL must take care to construct queries with IRIs that match the IRIs in the data. Further information about matching of similar characters can be found in Unicode Security Considerations [UNISEC] and Internationalized Resource Identifiers (IRIs) [RFC3987] Section 8.
Whether this can become a source of confusion or mischief, or is possible at all, depends on how those IRIs are created, used, displayed, or inserted.
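The two cases described in the quoted text behave differently under Unicode normalization, which the following sketch illustrates using only the Python standard library. The example IRIs and helper names are hypothetical, and the script check is a crude heuristic based on Unicode character names, not a complete confusable detector.

```python
import unicodedata


def same_after_nfc(a: str, b: str) -> bool:
    """Compare two IRIs after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def scripts_used(iri: str) -> set:
    """Heuristic: first word of each letter's Unicode name (LATIN, CYRILLIC...)."""
    return {unicodedata.name(ch).split()[0] for ch in iri if ch.isalpha()}


# Precomposed "é" versus "e" followed by COMBINING ACUTE ACCENT.
composed = "http://example.org/caf\u00e9"
decomposed = "http://example.org/cafe\u0301"

# Latin "o" versus CYRILLIC SMALL LETTER O.
latin = "http://example.org/foo"
cyrillic = "http://example.org/f\u043e\u043e"

print(composed == decomposed)                # different code point sequences
print(same_after_nfc(composed, decomposed))  # normalization makes them match
print(same_after_nfc(latin, cyrillic))       # normalization does not help here
print(sorted(scripts_used(cyrillic)))        # mixed scripts hint at a homograph
```

Normalizing IRIs before insertion and lookup addresses the combining-character case only. Look-alike characters from different scripts need a separate check, such as flagging IRIs that mix scripts.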
This is a quick reference of the features offered by the different types of endpoint.
|Endpoint||Query||Update||Graph Constraints||Service Constraints|