Sandboxing and portals

There are times when it makes sense to offer advanced querying capabilities over your data to other processes on the same machine, or even to any application that might want to use them. One real-life example is Tracker Miners, used as the GNOME desktop indexer.

However, it would be naive to expose this data without constraints on what can be accessed. Tracker provides a portal for the Flatpak application sandboxing system, which supplies the mechanisms to filter the data available to sandboxed applications.

How it works

The portal acts as a single entry point for sandboxed applications to have read access to any D-Bus SPARQL endpoint.

[Diagram: a user session with two sandboxes, Sandbox #1 and Sandbox #2. App A and App B each connect to the Portal, which in turn connects to Provider #1 and Provider #2. App A also connects directly to Provider #3.]

The partitioning of the data is decided by the data provider at the time of inserting this data in the RDF triple store, and relies on graphs as the natural partitioning scheme of RDF data. As an example, consider the following updates:

INSERT DATA {
  GRAPH example:A {
    example:item1 a rdfs:Resource ;
  }

  GRAPH example:B {
    example:item2 a rdfs:Resource ;
  }
}

Resulting in the following simple diagram:

[Diagram: graph example:A contains example:item1 and graph example:B contains example:item2; each item has rdf:type rdfs:Resource.]

At that point, SPARQL queries may request data from one graph, from both, or from any graph, e.g.:

# This will return example:item1
SELECT ?element
FROM example:A
{
  ?element a rdfs:Resource
}

Or even query nonexistent graphs:

# This will return no elements
SELECT ?element
FROM example:C
{
  ?element a rdfs:Resource
}

Graphs may overlap with other graphs, or extend each other.

The data filtering performed by the portal strongly relies on these RDF semantics for graphs, except that graph access is enforced through the CONSTRAINT syntax. These policies prevail over any other SPARQL syntax to specify graphs, and will result in filtered graphs being replaced by an empty graph.

Likewise, notification of changes through TrackerNotifier will be filtered on the way to sandboxed applications, so that there are only notifications on changes from the allowed graphs.
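The graph filtering described above can be pictured with a small toy model. This is only an illustrative sketch, not Tracker's implementation: the in-memory store, the `select_subjects` helper and its parameters are all hypothetical names. It shows that a graph outside the allowed set behaves exactly like an empty (or nonexistent) graph, regardless of what the query asks for.

```python
from typing import Dict, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical in-memory store: graph name -> triples held in that graph.
# It mirrors the INSERT DATA example from this section.
STORE: Dict[str, List[Triple]] = {
    "example:A": [("example:item1", "rdf:type", "rdfs:Resource")],
    "example:B": [("example:item2", "rdf:type", "rdfs:Resource")],
}


def select_subjects(
    store: Dict[str, List[Triple]],
    requested_graphs: Iterable[str],
    allowed_graphs: Set[str],
    rdf_type: str,
) -> List[str]:
    """Return subjects of the given type, honoring the graph policy."""
    subjects: List[str] = []
    for graph in requested_graphs:
        # A filtered graph is replaced by an empty graph, so it yields
        # nothing, just like a graph that does not exist in the store.
        triples = store.get(graph, []) if graph in allowed_graphs else []
        for subject, predicate, obj in triples:
            if predicate == "rdf:type" and obj == rdf_type:
                subjects.append(subject)
    return subjects


# The query asks for both graphs, but the policy only allows example:A.
visible = select_subjects(
    STORE, ["example:A", "example:B"], {"example:A"}, "rdfs:Resource"
)
```

With this policy, `visible` only contains `example:item1`; the request for example:B is not an error, it simply matches nothing.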

Using it from data providers

Data providers are free to decide the data partitioning scheme available to sandboxed applications, by creating graphs and inserting data into them. Access to those graphs will be enforced by the portal on their behalf.

Keep in mind that there is a limit on the number of graphs, which applies when graphs are created.

Using it from applications

The portal is transparently integrated in the Tracker library. Applications may write SPARQL queries so that they work the same whether sandboxed or not. No code changes are needed in the app, as tracker_sparql_connection_bus_new() will automatically try to connect via the portal if it cannot talk to the given D-Bus name directly.

The policy on the accessed graphs is usually defined in the Flatpak manifest at the time of bundling the software, e.g.:

{
  "finish-args" : [
    "--add-policy=Tracker3.dbus:org.example.Endpoint=example:A",
    "--add-policy=Tracker3.dbus:org.example.Endpoint=example:B"
  ]
}

The portal will pick the policy from the /.flatpak-info file created by Flatpak, and (with the example arguments) allow access to the graphs example:A and example:B from the org.example.Endpoint D-Bus SPARQL endpoint. Access to other graphs or other services will be disallowed.
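To get a feel for what the portal reads, the following sketch parses a policy out of key-file text shaped like /.flatpak-info. It rests on assumptions not stated in this document: that the policy arguments above end up in a `[Policy Tracker3]` group, with the endpoint as the key and a `;`-separated list of graphs as the value. The sample text and the `read_tracker_policy` helper are hypothetical and meant for inspection or debugging only; the portal itself does this work for you.

```python
import configparser
from typing import Dict, List

# Hypothetical excerpt of a /.flatpak-info file (assumed layout).
SAMPLE_FLATPAK_INFO = """\
[Application]
name=org.example.App

[Policy Tracker3]
dbus:org.example.Endpoint=example:A;example:B;
"""


def read_tracker_policy(text: str) -> Dict[str, List[str]]:
    """Map each endpoint key to the list of graphs it is allowed to see."""
    # Only '=' separates keys from values, because the keys themselves
    # contain ':' (for example "dbus:org.example.Endpoint").
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep the original case of the keys
    parser.read_string(text)

    section = "Policy Tracker3"  # assumed group name
    if not parser.has_section(section):
        return {}

    policy: Dict[str, List[str]] = {}
    for key, value in parser.items(section):
        # Values are ';'-separated lists; drop the empty trailing entry.
        policy[key] = [graph for graph in value.split(";") if graph]
    return policy


policy = read_tracker_policy(SAMPLE_FLATPAK_INFO)
```

An application with no Tracker policy at all yields an empty mapping here, which matches the rule that access to anything not listed is disallowed.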

The Tracker library is included in the GNOME Flatpak SDK and runtime. If the app uses a different runtime, you may need to build Tracker SPARQL in the app manifest.