Nepomuk Information Element (NIE)

Top classes in the ontology. Almost everything else is a subclass of these.

@prefix nie: <http://tracker.api.gnome.org/ontology/v3/nie#> .

The following classes are defined:

DataObject, DataSource, InformationElement

Overview

Introduction

The core of the NEPOMUK Information Element Ontology and the entire Ontology Framework revolve around the concepts of nie:DataObject and nie:InformationElement. They express the representation and content of a piece of data. Their specialized subclasses (defined in the other ontologies) can be used to classify a wide array of desktop resources and express them in RDF.

The nie:DataObject class represents a collection of bytes somewhere (local or remote): the physical entity that contains data. The meaning (interpretation) of that entity (e.g. a music file, a picture) is represented on the nie:InformationElement side of the ontology.

All resources on the desktop are related to each other through two fundamental types of relation:

  • Interpretation, expressed through nie:interpretedAs and its reverse nie:isStoredAs.
  • Containment, expressed through nie:hasPart and its reverse nie:isPartOf.

These properties (or their subproperties with a more specific semantic meaning) provide the scaffolding for a uniform view of the data at an arbitrary level of detail. For a more thorough example, the figure below represents an image in an archive in the attachment of a PDF document in the filesystem:

The horizontal edges express interpretation and the diagonal edges express containment. This approach gives a uniform overview of data regardless of how it’s represented.
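
The same scenario can be sketched in Turtle. This is only an illustration under stated assumptions: the file path and the urn:example: identifiers are made up, the nfo: prefix IRI is assumed to follow the same pattern as nie:, and the class names nfo:PaginatedTextDocument, nfo:Archive, nfo:ArchiveItem and nfo:Image are assumed from the NFO ontology and should be checked against it. Each step alternates between interpretation (nie:interpretedAs / nie:isStoredAs) and containment (nie:hasPart / nie:isPartOf).

```turtle
@prefix nie: <http://tracker.api.gnome.org/ontology/v3/nie#> .
# Assumption: nfo: follows the same IRI pattern as nie:.
@prefix nfo: <http://tracker.api.gnome.org/ontology/v3/nfo#> .

# The PDF file on disk (a DataObject) and its interpretation as a document.
<file:///home/user/report.pdf> a nfo:FileDataObject ;
  nie:url 'file:///home/user/report.pdf' ;
  nie:interpretedAs <urn:example:document> .

<urn:example:document> a nfo:PaginatedTextDocument ;
  nie:isStoredAs <file:///home/user/report.pdf> ;
  nie:hasPart <urn:example:attachment> .

# The attachment is a physical part of the document, interpreted as an archive.
<urn:example:attachment> a nfo:Attachment ;
  nie:isPartOf <urn:example:document> ;
  nie:interpretedAs <urn:example:archive> .

<urn:example:archive> a nfo:Archive ;
  nie:isStoredAs <urn:example:attachment> ;
  nie:hasPart <urn:example:archive-item> .

# The archive member is a physical part of the archive, interpreted as an image.
<urn:example:archive-item> a nfo:ArchiveItem ;
  nie:isPartOf <urn:example:archive> ;
  nie:interpretedAs <urn:example:image> .

<urn:example:image> a nfo:Image ;
  nie:isStoredAs <urn:example:archive-item> .
```

Only the outermost DataObject carries a nie:url in this sketch; the embedded DataObjects have no simple file:// location.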

Common properties

Given that the classes defined in this ontology are the superclasses for almost everything in the Nepomuk set of ontologies, the properties defined here are inherited by many classes. A few of them are especially relevant:

  • nie:title: Title, name or short text describing the item.
  • nie:description: More verbose comment about the element.
  • nie:language: Language of the item.
  • nie:plainTextContent: The raw content of the file, if it makes sense as text.
  • nie:generator: Software or agent that set or produced the information.
  • nie:usageCounter: Number of accesses to the information. It can be an indicator of relevance for advanced searches.
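
As a rough illustration, these properties could be set on a single resource as follows. The resource identifier, the literal values, the nfo: prefix IRI and the nfo:TextDocument class are assumptions made for the example; only the nie: property names come from the list above.

```turtle
@prefix nie: <http://tracker.api.gnome.org/ontology/v3/nie#> .
# Assumption: nfo: follows the same IRI pattern as nie:.
@prefix nfo: <http://tracker.api.gnome.org/ontology/v3/nfo#> .

# Hypothetical text document; the URN and all values are illustrative.
<urn:example:meeting-notes> a nfo:TextDocument ;
  nie:title 'Meeting notes' ;
  nie:description 'Notes taken during the weekly project meeting' ;
  nie:language 'en' ;
  nie:plainTextContent 'Agenda: review open issues. Decision: release on Friday.' ;
  nie:generator 'Example Editor 1.0' ;
  nie:usageCounter 3 .
```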

Date and timestamp representations

A few dates are important in the life cycle of a resource. Most of them are properties of the nie:InformationElement class and are inherited by its subclasses:

  • nie:informationElementDate: This is an "abstract" property that acts as the superproperty of the other dates. Don’t use it directly.
  • nie:contentLastModified: Modification time of a resource. Usually the mtime of a local file, or information from the server for online resources.
  • nie:contentCreated: Creation time of the content. If the content is created by an application, the same application should set the value of this property. Note that this property can be undefined for resources in the filesystem because the creation time is not available in the most common filesystem formats.
  • nie:contentAccessed: For resources coming from the filesystem, this is the usual access time of the file. For other kinds of resources (online or virtual), the application accessing the resource should update its value.
  • nie:lastRefreshed: The time the content was last refreshed, usually for remote resources. The property tables below list it under nie:DataObject rather than nie:InformationElement.
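
A minimal sketch of the content dates on one resource, written as typed xsd:dateTime literals. The resource identifier, the timestamps, the nfo: prefix IRI and the nfo:TextDocument class are assumptions for illustration. nie:informationElementDate is deliberately absent, since it is not meant to be used directly.

```turtle
@prefix nie: <http://tracker.api.gnome.org/ontology/v3/nie#> .
# Assumption: nfo: follows the same IRI pattern as nie:.
@prefix nfo: <http://tracker.api.gnome.org/ontology/v3/nfo#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

# Hypothetical document with illustrative timestamps.
<urn:example:meeting-notes> a nfo:TextDocument ;
  nie:contentCreated '2021-03-01T09:00:00Z'^^xsd:dateTime ;
  nie:contentLastModified '2021-03-02T17:30:00Z'^^xsd:dateTime ;
  nie:contentAccessed '2021-03-05T08:15:00Z'^^xsd:dateTime .
```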

URIs and full representation of a file

One of the most common resources on a desktop is a file. Given the split between Data Objects and Information Elements, it is sometimes unclear how a real file is represented in Nepomuk. Here are some guidelines:

  1. Every file (local or remote) should generate one DataObject instance and one InformationElement instance; the two are different entities.
  2. The URI of the DataObject is the real location of the item (e.g. "file:///path/to/file.mp3").
  3. The URI of the InformationElement(s) will be a generated ID.
  4. Every DataObject must have the property nie:url, which points to the location of the resource and should be used by any program that wants to access it.
  5. The InformationElement and DataObject are related via the nie:isStoredAs / nie:interpretedAs properties.

Here is an example for the image file /home/user/a.jpeg:

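@prefix nie: <http://tracker.api.gnome.org/ontology/v3/nie#> .
# The nfo: and nmm: IRIs are assumed to follow the same pattern as nie:.
@prefix nfo: <http://tracker.api.gnome.org/ontology/v3/nfo#> .
@prefix nmm: <http://tracker.api.gnome.org/ontology/v3/nmm#> .

# Properties as nmm:Photo (the InformationElement, with a generated ID)
<urn:uuid:10293801928301293> a nmm:Photo ;
  nie:isStoredAs <file:///home/user/a.jpeg> ;
  nfo:width 49 ;
  nfo:height 36 ;
  nmm:flash nmm:flash-off ;
  nmm:whiteBalance nmm:white-balance-automatic ;
  nfo:equipment [
    a nfo:Equipment ;
    nfo:make 'Nokia' ;
    nfo:model 'N900' ;
    nfo:equipmentSoftware 'Tracknon'
  ] .

# Properties from nfo:FileDataObject (the DataObject, identified by its location)
<file:///home/user/a.jpeg> a nfo:FileDataObject ;
  nie:interpretedAs <urn:uuid:10293801928301293> ;
  nfo:fileSize 12341234 ;
  nie:url 'file:///home/user/a.jpeg' .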

Classes

DataObject

A unit of data that is created, annotated and processed on the user desktop. It represents a native structure the user works with. The usage of the term ‘native’ is important. It means that a DataObject can be directly mapped to a data structure maintained by a native application. This may be a file, a set of files or a part of a file. The granularity depends on the user. This class is not intended to be instantiated by itself. Use more specific subclasses.

Class hierarchy

rdfs:Resource
  nie:DataObject
    nco:ContactListDataObject
    nfo:FileDataObject
    nfo:HardDiskPartition
    nfo:MediaStream
    nfo:RemotePortAddress
    nfo:SoftwareItem
    nfo:SoftwareService

RDF Diagram

Relations involving nie:DataObject, written as domain, property, range:

  • nie:DataObject, nfo:belongsToContainer, nfo:DataContainer
  • nie:DataObject, nie:dataSource, nie:DataSource
  • nie:DataObject, nie:interpretedAs, nie:InformationElement
  • nie:DataObject, nie:isPartOf, nie:InformationElement
  • nie:InformationElement, nie:depends, nie:DataObject
  • nie:InformationElement, nie:hasPart, nie:DataObject
  • nie:InformationElement, nie:isStoredAs, nie:DataObject
  • nie:InformationElement, nie:links, nie:DataObject
  • nie:InformationElement, nie:relatedTo, nie:DataObject
  • nco:IMAddress, nco:imAvatar, nie:DataObject
  • nco:Contact, nco:key, nie:DataObject
  • nco:OrganizationContact, nco:logo, nie:DataObject
  • nfo:Bookmark, nfo:bookmarks, nie:DataObject
  • nfo:Media, nfo:hasMediaStream, nie:DataObject

Properties

Each entry lists the property name, its type in parentheses, and its description.

  • belongsToContainer (DataContainer): Models the containment relations between Files and Folders (or CompressedFiles).
  • byteSize (integer): File size in bytes.
  • created (dateTime): Date of creation of the DataObject. Note that this date refers to the creation of the DataObject itself (i.e. the physical representation). Compare with nie:contentCreated.
  • dataSource (DataSource): Marks the provenance of a DataObject, that is, which source a data object comes from.
  • interpretedAs (InformationElement): Links the DataObject with the InformationElement it is interpreted as.
  • isPartOf (InformationElement): Generic property used to express containment relationships between DataObjects. NIE extensions are encouraged to provide more specific subproperties of this one. It is advisable for actual instances of DataObjects to use those specific subproperties. Note to the developers: please be aware of the distinction between containment relation and provenance. The isPartOf relation models physical containment: a nie:DataObject (e.g. an nfo:Attachment) is a ‘physical’ part of a nie:InformationElement (a nmo:Message). Also note the difference between physical containment (isPartOf) and logical containment (isLogicalPartOf); the former has a stricter meaning. They may occur independently of each other.
  • lastRefreshed (dateTime): Date when information about this data object was retrieved (for the first time) or last refreshed from the data source. This property is important for metadata extraction applications that don’t receive any notifications of changes in the data source and have to poll it regularly. This may lead to information becoming out of date. In these cases this property may be used to determine the age of data, which is an important element of its dependability.
  • url (string): URL pointing at the location of the resource. In cases where creating a simple file:// or http:// URL for a file is difficult (e.g. for files inside compressed archives), applications are encouraged to use the conventions defined by the Apache Commons VFS Project at http://jakarta.apache.org/commons/vfs/filesystems.html.
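
The difference between nie:created (on the DataObject) and nie:contentCreated (on the InformationElement) can be illustrated with a photo that was taken on one date and copied to the local disk on another. This is a sketch under assumptions: the identifiers and timestamps are made up, and the nfo: and nmm: prefix IRIs are assumed to follow the same pattern as nie:.

```turtle
@prefix nie: <http://tracker.api.gnome.org/ontology/v3/nie#> .
# Assumption: nfo: and nmm: follow the same IRI pattern as nie:.
@prefix nfo: <http://tracker.api.gnome.org/ontology/v3/nfo#> .
@prefix nmm: <http://tracker.api.gnome.org/ontology/v3/nmm#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

# The physical file: created when it was copied to this disk.
<file:///home/user/a.jpeg> a nfo:FileDataObject ;
  nie:url 'file:///home/user/a.jpeg' ;
  nie:created '2021-06-10T14:00:00Z'^^xsd:dateTime ;
  nie:interpretedAs <urn:example:photo> .

# The content: created when the picture was taken.
<urn:example:photo> a nmm:Photo ;
  nie:isStoredAs <file:///home/user/a.jpeg> ;
  nie:contentCreated '2019-08-24T18:42:00Z'^^xsd:dateTime .
```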

DataSource

A superclass for all entities from which DataObjects can be extracted. Each entity represents a native application or some other system that manages information that may be of interest to the user of the Semantic Desktop. Subclasses may include FileSystems, Mailboxes, Calendars, websites etc. The exact choice of subclasses and their properties is considered application-specific. Each data extraction application is supposed to provide its own DataSource ontology. Such an ontology should contain the supported data source types coupled with the properties necessary for the application to gain access to the data sources (paths, URLs, passwords etc.).

Class hierarchy

rdfs:Resource
  nie:DataSource
    tracker:IndexedFolder

RDF Diagram

Relations involving nie:DataSource, written as domain, property, range:

  • nie:DataObject, nie:dataSource, nie:DataSource
  • nie:InformationElement, nie:rootElementOf, nie:DataSource

Predefined instances

nie:DataSource has the following predefined instances:

  • tracker:extractor-data-source
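
As a hedged sketch, a DataObject could record its provenance by pointing nie:dataSource at this predefined instance. The tracker: and nfo: prefix IRIs are assumed to follow the same pattern as nie:, and the file is hypothetical. How Tracker itself assigns this instance is not described here.

```turtle
@prefix nie: <http://tracker.api.gnome.org/ontology/v3/nie#> .
# Assumption: nfo: and tracker: follow the same IRI pattern as nie:.
@prefix nfo: <http://tracker.api.gnome.org/ontology/v3/nfo#> .
@prefix tracker: <http://tracker.api.gnome.org/ontology/v3/tracker#> .

# Hypothetical file whose provenance is the predefined data source.
<file:///home/user/a.jpeg> a nfo:FileDataObject ;
  nie:url 'file:///home/user/a.jpeg' ;
  nie:dataSource tracker:extractor-data-source .
```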

InformationElement

A unit of content the user works with. This is a superclass for all interpretations of a DataObject.

RDF Diagram

Relations involving nie:InformationElement, written as domain, property, range:

  • nie:InformationElement, nie:hasLogicalPart, nie:InformationElement
  • nie:InformationElement, nie:isLogicalPartOf, nie:InformationElement
  • nie:InformationElement, nco:contributor, nco:Contact
  • nie:InformationElement, nco:creator, nco:Contact
  • nie:InformationElement, nco:publisher, nco:Contact
  • nie:InformationElement, nie:depends, nie:DataObject
  • nie:InformationElement, nie:hasPart, nie:DataObject
  • nie:InformationElement, nie:isStoredAs, nie:DataObject
  • nie:InformationElement, nie:links, nie:DataObject
  • nie:InformationElement, nie:relatedTo, nie:DataObject
  • nie:InformationElement, nie:rootElementOf, nie:DataSource
  • nie:InformationElement, slo:location, slo:GeoLocation
  • nie:InformationElement, tracker:hasExternalReference, tracker:ExternalReference
  • nco:Contact, nco:photo, nie:InformationElement
  • nco:Contact, nco:sound, nie:InformationElement
  • nie:DataObject, nie:interpretedAs, nie:InformationElement
  • nie:DataObject, nie:isPartOf, nie:InformationElement
  • nco:Role, nco:video, nie:InformationElement
  • nfo:MediaList, nfo:mediaListEntry, nie:InformationElement
  • nfo:RegionOfInterest, nfo:roiRefersTo, nie:InformationElement

Properties

Each entry lists the property name, its type in parentheses, and its description where one is given.

  • contributor (Contact): An entity responsible for making contributions to the content of the InformationElement.
  • creator (Contact): Creator of a data object, an entity primarily responsible for the creation of the content of the data object.
  • publisher (Contact): An entity responsible for making the InformationElement available.
  • isBootable (boolean): True when the file is bootable, for example an ISO or other disc image.
  • isContentEncrypted (boolean): Might change (IE of DataObject property?)
  • characterSet (string): Character set in which the content of the InformationElement was created. Example: ISO-8859-1, UTF-8. One of the registered character sets at http://www.iana.org/assignments/character-sets. This characterSet is used to interpret any textual parts of the content. If more than one characterSet is used within one data object, use more specific properties.
  • comment (string): A user comment about an InformationElement.
  • contentAccessed (dateTime)
  • contentCreated (dateTime): The date of the content creation. This may not necessarily be equal to the date when the DataObject (i.e. the physical representation) itself was created. Compare with the nie:created property.
  • contentLastModified (dateTime): The date of the last modification of the original content (not its corresponding DataObject or local copy). Compare with nie:lastModified.
  • contentSize (integer): The size of the content. This property can be used whenever the size of the content of an InformationElement differs from the size of the DataObject (e.g. because of compression, encoding, encryption or any other representation issues). The contentSize is expressed in bytes.
  • copyright (string): Content copyright.
  • depends (DataObject): Dependency relation. A piece of content depends on another piece of data in order to be properly understood, used or interpreted.
  • description (string): A textual description of the resource. This property may be used for any metadata fields that provide some meta-information or comment about a resource in the form of a passage of text. This property is not to be confused with nie:plainTextContent. Use more specific subproperties wherever possible.
  • disclaimer (string): A disclaimer.
  • generator (string): Software used to ‘generate’ the contents, e.g. a word processor name.
  • hasLogicalPart (InformationElement): Generic property used to express ‘logical’ containment relationships between InformationElements. NIE extensions are encouraged to provide more specific subproperties of this one. It is advisable for actual instances of InformationElement to use those specific subproperties. Note the difference between ‘physical’ containment (hasPart) and logical containment (hasLogicalPart).
  • hasPart (DataObject): Generic property used to express ‘physical’ containment relationships between DataObjects. NIE extensions are encouraged to provide more specific subproperties of this one. It is advisable for actual instances of DataObjects to use those specific subproperties. Note to the developers: please be aware of the distinction between containment relation and provenance. The hasPart relation models physical containment: an InformationElement (a nmo:Message) can have a ‘physical’ part (an nfo:Attachment). Also note the difference between physical containment (hasPart) and logical containment (hasLogicalPart); the former has a stricter meaning. They may occur independently of each other.
  • identifier (string): An unambiguous reference to the InformationElement within a given context. Recommended best practice is to identify the resource by means of a string conforming to a formal identification system.
  • informationElementDate (dateTime): A point or period of time associated with an event in the lifecycle of an Information Element. A common superproperty for all date-related properties of InformationElements in the NIE Framework.
  • isLogicalPartOf (InformationElement): Generic property used to express ‘logical’ containment relationships between InformationElements. NIE extensions are encouraged to provide more specific subproperties of this one. It is advisable for actual instances of InformationElement to use those specific subproperties. Note the difference between ‘physical’ containment (isPartOf) and logical containment (isLogicalPartOf).
  • isStoredAs (DataObject): Links the information element with the DataObject it is stored in.
  • keyword (string): Adapted DublinCore: the topic of the content of the resource, as a keyword. No sentences here. Recommended best practice is to select a value from a controlled vocabulary or formal classification scheme.
  • language (string): Language the InformationElement is expressed in. Users are encouraged to use the two-letter code specified in RFC 3066.
  • legal (string): A common superproperty for all properties that point at legal information about an Information Element.
  • license (string): Terms and intellectual property rights licensing conditions.
  • licenseType (string): The type of the license. Possible values for this field may include ‘GPL’, ‘BSD’, ‘Creative Commons’ etc.
  • links (DataObject): A linking relation. A piece of content links to or mentions a piece of data.
  • mimeType (string): File MIME type.
  • plainTextContent (string): Plain-text representation of the content of an InformationElement with all markup removed. The main purpose of this property is full-text indexing and search. Its exact content is considered application-specific. The user can make no assumptions about what is and what is not contained within. Applications should use more specific properties wherever possible.
  • relatedTo (DataObject): A common superproperty for all relations between a piece of content and other pieces of data (which may be interpreted as other pieces of content).
  • rootElementOf (DataSource): DataObjects extracted from a single data source are organized into a containment tree. This property links the root of that tree with the data source it has been extracted from.
  • subject (string): The subject or topic of the document.
  • title (string): The title of the document.
  • usageCounter (integer)
  • version (string): The current version of the given data object. Exact semantics is unspecified at this level. Use more specific subproperties if needed.
  • id (string)
  • mediaId (string)
  • location (GeoLocation): This can be subclassed to add semantics.
  • hasExternalReference (ExternalReference): Links the information element with the external reference.

Credits and Copyright

Authors:

  • Antoni Mylka, DFKI, <antoni.mylka@dfki.de>
  • Leo Sauermann, DFKI, <leo.sauermann@dfki.de>
  • Ludger van Elst, DFKI, <elst@dfki.uni-kl.de>
  • Michael Sintek, DFKI, <michael.sintek@dfki.de>

Editors:

  • Antoni Mylka, DFKI, <antoni.mylka@dfki.de>

Contributors:

  • Christiaan Fluit, Aduna, <christiaan.fluit@aduna-software.com>
  • Evgeny ‘phreedom’ Egorochkin, KDE Strigi Developer, <stexx@mail.ru>

Upstream: Upstream version

ChangeLog: Tracker changes

Copyright: © 2007 DFKI © 2009 Nokia. The ontologies are made available under the terms of NEPOMUK software license (FIXME verify)