Publishing Event Streams as Linked Data

Latest version:
http://km.aifb.kit.edu/sites/lodstream/
Vocabulary:
http://events.event-processing.org/types/types.n3
Last modified:
$Id: index.html 23484 2014-08-25 17:55:33Z stuehmer $
Status:
Public draft
Authors:
Andreas Harth, Karlsruher Institut für Technologie
Roland Stühmer, FZI Forschungszentrum Informatik

Abstract

This document describes how to publish event streams as Linked Data.

Introduction

Real-time processing is becoming increasingly important, and sites provide direct access to data streams that deliver data as soon as it appears. For example, Twitter provides API access to a stream of Twitter messages in real time. In addition, with the proliferation of sensors, more streaming data sources will become available. At the same time, a growing amount of public data on the Web can be linked to and reused in RDF. A lot of static data is therefore available to serve as context for events.

Requirements

N-ary predicates. [N-ARY-REL]

The model needs an event class and a timestamp (atomic or interval).

Data sources:

Triples/quads occur once in NiceWeatherStream and once in the historical data source for events.

Event Class

An event is defined as:

Event
anything that happens or is contemplated as happening [EP-GLOSS]
something that happens at a given place and time [WORDNET]

Temporal Modelling

Spatial Modelling

Optional: either lat/long coordinates or URIs of locations which contain descriptions of geometry.

The NeoGeo vocabulary [NEOGEO] and [BASIC-GEO] could be a starting point:

e:e1    {
    <http://events.event-processing.org/ids/e1#event>
            :location [ geo:lat "53.27194"^^xsd:double; geo:long "-9.04889"^^xsd:double ] .
}

Figure 1: Event Model (Class Diagram)

Figure 1 shows the event model as a class diagram. The class "Event" is the superclass of every event that conforms to our model. It builds on related work by inheriting from the class "DUL:Event" from Dolce Ultralight [DUL]. That class provides a notion of time and helps distinguish events (things that happen) from facts (which are always valid).

An instance of class Event MUST have (i) a type, (ii) at least one timestamp and (iii) a relevant stream. We describe the event properties in detail as follows.
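
The three mandatory properties can be checked mechanically. The following is a minimal Python sketch, not part of the vocabulary: it assumes an event has already been parsed into a plain dictionary keyed by property name, and it reads "at least one timestamp" as the presence of :endTime. The helper name missing_required and the dictionary layout are illustrative assumptions.

```python
# Hypothetical in-memory view of an event: property name -> list of values.
# The property names mirror the vocabulary terms used in this document.
REQUIRED = ("rdf:type", ":endTime", ":stream")


def missing_required(event):
    """Return the required properties that are absent or have no value."""
    return [prop for prop in REQUIRED if not event.get(prop)]


complete = {
    "rdf:type": [":AvgTempEvent"],
    ":endTime": ["2011-08-24T14:42:01.011"],
    ":stream": ["http://streams.event-processing.org/ids/NiceWeatherStream#stream"],
}
incomplete = {"rdf:type": [":TempEvent"]}  # no timestamp, no stream

print(missing_required(complete))
print(missing_required(incomplete))
```

An empty result means the event carries all mandatory properties; anything else names what is missing.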

The type of an event must be specified using rdf:type. The type must be the class Event or any subclass.

The event model supports both interval-based and point-based events: a point-based event uses only the property :endTime, whereas an interval-based event uses both :startTime and :endTime. The property :endTime thus has a cardinality of [1..1], whereas :startTime has a cardinality of [0..1]. Both temporal properties are subproperties of DUL:hasEventDate from the superclass. We improve the semantics by distinguishing start from end, whereas the superclass has an alternative, more difficult way of formulating intervals using subobjects that reify the interval.
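
The distinction between point-based and interval-based events follows directly from which temporal properties are present. The sketch below illustrates this in Python under stated assumptions: events are plain dictionaries, timestamps are xsd:dateTime strings without a time zone (as in the examples of this document), and the check that the start does not lie after the end is an added sanity check rather than something the vocabulary prescribes. The function name temporal_kind is hypothetical.

```python
from datetime import datetime


def temporal_kind(event):
    """Classify an event as 'point' or 'interval' from its timestamps."""
    end = event.get(":endTime")
    if end is None:
        # :endTime has cardinality [1..1], so it must always be present.
        raise ValueError(":endTime is mandatory")
    start = event.get(":startTime")
    if start is None:
        # :startTime has cardinality [0..1]; without it the event is a point.
        return "point"
    # Assumption: an interval should not end before it starts.
    if datetime.fromisoformat(start) > datetime.fromisoformat(end):
        raise ValueError(":startTime is later than :endTime")
    return "interval"


point = {":endTime": "2011-08-24T14:40:59.837"}
interval = {
    ":startTime": "2011-08-24T14:40:59.837",
    ":endTime": "2011-08-24T14:42:01.011",
}

print(temporal_kind(point))
print(temporal_kind(interval))
```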

The property :stream associates an event with a stream. Streams are used in our system as a unit of organisation for events governing publish/subscribe and access control. Streams themselves are modelled using title, description and a topic needed for topic-based publish/subscribe.

The first optional property is :location. For geo-referencing of events (where necessary) we re-use the Basic Geo vocabulary from the W3C [BASIC-GEO]. The property may be used to locate events in physical locations on a globe. The property is a subproperty of DUL:hasLocation and geo:location to inherit the semantics from those schemas.

Inter-event relationships may be supported by linking a complex event to the simple events which caused it. Thus, RDF Lists may be used in :members to maintain an ordered and complete account of member events. The linked events are identified by their URI. These linked events could have further member events themselves. This allows modelling of composite events [EP-GLOSS]. The :members property is a subproperty of DUL:hasConstituent from the superclass.
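
Because member events may have members of their own, a consumer that wants the full set of inputs of a complex event has to follow :members recursively. The following Python sketch shows one way to do this. It assumes the :members lists have already been loaded into a dictionary; the event e4 is made up purely to show a second level of nesting, and the function name all_members is hypothetical.

```python
IDS = "http://events.event-processing.org/ids/"

# Hypothetical lookup table: event URI -> ordered list of member event URIs.
# e1 -> (e2, e3) follows the examples in this document; e4 is invented here
# only to demonstrate nesting.
MEMBERS = {
    IDS + "e1#event": [IDS + "e2#event", IDS + "e3#event"],
    IDS + "e2#event": [IDS + "e4#event"],
}


def all_members(uri, members, seen=None):
    """Return all direct and indirect member events in depth-first order."""
    seen = set() if seen is None else seen
    result = []
    for member in members.get(uri, []):
        if member in seen:
            continue  # guard against accidental cycles in the data
        seen.add(member)
        result.append(member)
        result.extend(all_members(member, members, seen))
    return result


for uri in all_members(IDS + "e1#event", MEMBERS):
    print(uri)
```

The order of the RDF List is preserved at each level, which keeps the account of member events ordered as well as complete.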

The property :eventPattern may be used to link a complex event to the pattern which caused the event to be detected. Using such links can help in recording provenance of derived events.

The source of an event may be specified using the :source property. This is an optional property to record the provenance of an event where needed. The property is a subproperty of DUL:involvesAgent. Agents may be human or non-human.

A human readable synopsis of an event may be added using the :message property. This proves useful in scenarios where events are received by human end users. The :message property is a subproperty of dc:title, a popular way of describing things using natural language.

Examples

Example 1
@prefix :        <http://events.event-processing.org/types/> .
@prefix dbpedia: <http://dbpedia.org/resource/> .
@prefix e:       <http://events.event-processing.org/ids/> .
@prefix geo:     <http://www.w3.org/2003/01/geo/wgs84_pos#> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix s:       <http://streams.event-processing.org/ids/> .
@prefix src:     <http://sources.event-processing.org/ids/> .
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .

e:e1 {
    <http://events.event-processing.org/ids/e1#event>
          a       :AvgTempEvent ;
          :endTime "2011-08-24T14:42:01.011"^^xsd:dateTime ;
          :members (<http://events.event-processing.org/ids/e2#event> <http://events.event-processing.org/ids/e3#event>) ;
          :source <http://sources.event-processing.org/ids/NiceWeatherAggregator#source> ;
          :startTime "2011-08-24T14:40:59.837"^^xsd:dateTime ;
          :stream <http://streams.event-processing.org/ids/NiceWeatherStream#stream> .
    dbpedia:Nice :avgTemp [ rdf:value "25" ; :event  <http://events.event-processing.org/ids/e1#event> ] .
}
  1. The event is expressed as quadruples in TriG syntax [TRIG]. The graph name (a.k.a. context) before the curly braces is used in the storage backend to enable efficient indexing of contiguous triples.
  2. The event in this example has the ID e1 as part of its URI. A distinction is made between the event e1:event and the Web resource e1, separating the URI of the event itself from the information resource describing the event. The fragment identifier #event identifies the event.
  3. RDF Lists "(", ")" are used in :members to maintain an ordered and complete account of related events.
  4. N-ary predicates "[", "]" [N-ARY-REL] are used on dbpedia:Nice to record a temperature measurement together with the specific event it belongs to.
  5. There is an event type hierarchy from which type :AvgTempEvent is inherited. This hierarchy can be extended by anyone by referencing the RDF type Event as a super class.
  6. The event links to two other events e2:event, e3:event from a different stream which were used as input to create the :AvgTempEvent. These events are shown below. They could have further input events themselves.
  7. The event links to an entity dbpedia:Nice from Linked Data where further context for the event can be retrieved.
  8. The event links to a stream where current events can be obtained as they happen.
  9. We are aware of the overlap of the namespaces. Unfortunately, stacking of namespaces like e:e1#event is not allowed, so we use the absolute URI instead of e1:event, which avoids defining a new namespace e1, e2, e3, ..., eN for each unique event.
  10. The namespace event-processing.org was chosen as a generic home for this schema.
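
The separation between the event and the information resource describing it (notes 1 and 2) can be illustrated with a short Python sketch using only the standard library. It shows that removing the fragment identifier from the event URI yields the URI of the Web resource, which in Example 1 coincides with the graph name e:e1. The variable names are illustrative.

```python
from urllib.parse import urldefrag

EVENT_URI = "http://events.event-processing.org/ids/e1#event"

# Dropping the fragment yields the URI of the information resource.
document_uri, fragment = urldefrag(EVENT_URI)

# Expansion of the prefixed graph name e:e1 from Example 1.
GRAPH_NAME = "http://events.event-processing.org/ids/" + "e1"

print(document_uri)
print(fragment)
print(document_uri == GRAPH_NAME)
```

HTTP clients strip the fragment before sending a request, so dereferencing the event URI retrieves the document that describes the event.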
Example 2
e:e2 {
    <http://events.event-processing.org/ids/e2#event>
          a       :TempEvent ;
          :endTime "2011-08-24T14:40:59.837"^^xsd:dateTime ;
          :source <http://sources.event-processing.org/ids/NiceWeatherStation01#source> ;
          :stream <http://streams.event-processing.org/ids/NiceTempStream#stream> .
    dbpedia:Nice :curTemp [ rdf:value "23" ; :event  <http://events.event-processing.org/ids/e2#event> ] .
}
Example 3
e:e3 {
    <http://events.event-processing.org/ids/e3#event>
          a       :TempEvent ;
          :endTime "2011-08-24T14:42:01.011"^^xsd:dateTime ;
          :source <http://sources.event-processing.org/ids/NiceWeatherStation02#source> ;
          :stream <http://streams.event-processing.org/ids/NiceTempStream#stream> .
    dbpedia:Nice :curTemp [ rdf:value "27" ; :event  <http://events.event-processing.org/ids/e3#event> ] .
}

Performance

Streams make use of namespace prefixes to reduce the number of characters to be transmitted.
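
As a rough illustration of the saving, the Python sketch below abbreviates absolute URIs using some of the prefixes declared in Example 1. It is a simplification under stated assumptions: the local-name pattern is deliberately narrower than the actual Turtle/TriG grammar, and the function name compact is hypothetical. URIs that cannot be written as a prefixed name, such as event URIs ending in #event, are left as absolute URIs in angle brackets.

```python
import re

# Subset of the prefixes declared in Example 1.
PREFIXES = {
    "": "http://events.event-processing.org/types/",
    "e": "http://events.event-processing.org/ids/",
    "s": "http://streams.event-processing.org/ids/",
}

# Simplified local-name pattern; the real Turtle/TriG grammar permits more.
LOCAL = re.compile(r"[A-Za-z_][A-Za-z0-9_-]*\Z")


def compact(iri, prefixes=PREFIXES):
    """Return a prefixed name if possible, else the IRI in angle brackets."""
    for prefix, namespace in prefixes.items():
        if iri.startswith(namespace):
            local = iri[len(namespace):]
            if LOCAL.match(local):
                return prefix + ":" + local
    return "<" + iri + ">"


for iri in (
    "http://events.event-processing.org/types/AvgTempEvent",
    "http://events.event-processing.org/ids/e1",
    "http://events.event-processing.org/ids/e1#event",
):
    short = compact(iri)
    print(short, "saves", len(iri) + 2 - len(short), "characters")
```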

Further bandwidth savings can be achieved by gzipping the HTTP stream.
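
The effect of gzip on event data can be estimated offline. The following Python sketch compresses a batch of events shaped like Example 2 and compares the sizes. The running event ids e100 to e149 are made up for illustration, and compressing one complete batch is a simplification: a long-lived HTTP stream would compress incrementally instead.

```python
import gzip

# Shape of Example 2; the ids e100..e149 are invented for this sketch.
TEMPLATE = (
    "e:e{n} {{\n"
    "    <http://events.event-processing.org/ids/e{n}#event>\n"
    "          a       :TempEvent ;\n"
    '          :endTime "2011-08-24T14:40:59.837"^^xsd:dateTime ;\n'
    "          :stream <http://streams.event-processing.org/ids/NiceTempStream#stream> .\n"
    "}}\n"
)

payload = "".join(TEMPLATE.format(n=n) for n in range(100, 150)).encode("utf-8")
compressed = gzip.compress(payload)

print(len(payload), "bytes uncompressed")
print(len(compressed), "bytes gzipped")

# Compression is lossless: the receiver recovers the exact TriG text.
restored = gzip.decompress(compressed)
```

Event streams compress well because the same URIs and property names recur in every event. Over HTTP, a gzipped response is signalled with the Content-Encoding: gzip header.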

Another option could be a binary RDF serialization like [HDT-ISWC2010].

A. Change Log

Changes since 2012-11-27:

B. References

[DUL]
Aldo Gangemi. DOLCE+DnS Ultralite (DUL).
[BASIC-GEO]
Dan Brickley. Basic Geo (WGS84 lat/long) Vocabulary, 2003.
[HDT-ISWC2010]
Javier D. Fernández, Miguel A. Martínez-Prieto, Claudio Gutierrez, and Axel Polleres. Binary RDF Representation for Publication and Exchange (HDT), W3C Member Submission 30 March 2011. Available at http://www.w3.org/Submission/2011/SUBM-HDT-20110330
[NEOGEO]
Juan Martín Salas and Andreas Harth. NeoGeo Vocabulary: Defining a shared RDF representation for GeoData, Public draft.
[EP-GLOSS]
David Luckham and Roy Schulte (Editors). Event Processing Glossary - Version 2.0, 2011.
[WORDNET]
Christiane Fellbaum et al. WordNet on "event".
[TRIG]
Chris Bizer and Richard Cyganiak. RDF 1.1 TriG.
[N-ARY-REL]
Natasha Noy and Alan Rector. Defining N-ary Relations on the Semantic Web.

C. Java

For use in Java, we generate classes from the RDF types using RDFReactor. They are currently available as Maven snapshots in the following artefact:
    <dependency>
        <groupId>eu.play-project</groupId>
        <artifactId>play-commons-eventtypes</artifactId>
        <version>1.1</version>
    </dependency>
To resolve the artefact, include this repository:
    <repository>
        <id>ow2.release</id>
        <name>OW2 Maven Releases Repository</name>
        <url>http://repository.ow2.org/nexus/service/local/staging/deploy/maven2</url>
    </repository>