Challenge Criteria
The Semantic Web Challenge 2011 is defined in terms of minimum requirements that submissions must meet and additional desirable features that they should exhibit.
Both are listed below per track.
Open Track
Minimal requirements
- The application has to be an end-user application, i.e. an application that provides practical value to general Web users or, if this is not the case, at least to domain experts.
- The information sources used
  - should be under diverse ownership or control,
  - should be heterogeneous (syntactically, structurally, and semantically), and
  - should contain substantial quantities of real world data (i.e. not toy examples).
- The meaning of data has to play a central role.
  - Meaning must be represented using Semantic Web technologies.
  - Data must be manipulated/processed in interesting ways to derive useful information, and
  - this semantic information processing has to play a central role in achieving things that alternative technologies cannot do as well, or at all.
Additional Desirable Features
In addition to the above minimum requirements, we note other desirable features that will be used as criteria to evaluate submissions.
- The application provides an attractive and functional Web interface (for human users)
- The application should be scalable (in terms of the amount of data used and in terms of distributed components working together). Ideally, the application should use all data that is currently published on the Semantic Web.
- Rigorous evaluations have taken place that demonstrate the benefits of semantic technologies, or validate the results obtained.
- Novelty, in applying semantic technology to a domain or task that has not been considered before
- Functionality is different from or goes beyond pure information retrieval
- The application has clear commercial potential and/or large existing user base
- Contextual information is used for ratings or rankings
- Multimedia documents are used in some way
- There is a use of dynamic data (e.g. workflows), perhaps in combination with static information
- The results should be as accurate as possible (e.g. use a ranking of results according to context)
- There is support for multiple languages and accessibility on a range of devices
Billion Triples Track
The specific goal of the Billion Triples Track is to demonstrate the scalability of applications as well as the capability to deal with the specifics of data that has been crawled from the public Web. We stress that the goal of this track is not to be a benchmarking effort between triple stores, but rather to demonstrate applications that can work on Web scale using realistic Web-quality data.
Minimal requirements
The primary goal of the Billion Triples Track is to demonstrate applications that can work on Web scale using realistic Web-quality data.
- The applications must make use of the Billion Triple Challenge 2011 Dataset provided by the organizers, which has been crawled from the Web. The functionality of the applications is left open: for example, it could involve helping people figure out what is in the data set via browsing, visualization, profiling, etc., or inferencing that adds information not directly queryable in the original data set.
- The tool or application has to make use of at least the first billion triples from the data provided by the organizers. Ideally, the tool or application uses the complete data set.
- The tool or application is allowed to use other data that can be linked to the Billion Triple Challenge 2011 data set, but there is still an expectation that the primary focus will be on the data provided.
- The tool or application does not have to be specifically an end-user application, as defined for the Open Track Challenge, but usability is a concern. The key goal is to demonstrate an interaction with the large data set driven by a user or an application.
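As an illustration of the "first billion triples" requirement above, the following is a minimal sketch of how a tool might stream a bounded prefix of the data without loading it into memory. It assumes, purely for illustration, that the data is available as gzip-compressed, line-based files with one statement per line; the function names and the file names in the usage note are hypothetical and not part of the challenge.

```python
import gzip
from itertools import islice
from typing import Iterable, Iterator


def iter_statements(paths: Iterable[str]) -> Iterator[str]:
    """Yield non-empty, non-comment lines from gzip-compressed files, in order.

    Assumption: one statement per line, as in line-based RDF serializations.
    """
    for path in paths:
        # Web-crawled data may contain invalid bytes; replace them rather than fail.
        with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
            for line in handle:
                line = line.strip()
                if line and not line.startswith("#"):
                    yield line


def first_statements(paths: Iterable[str], limit: int = 1_000_000_000) -> Iterator[str]:
    """Lazily yield at most `limit` statements across all files."""
    return islice(iter_statements(paths), limit)
```

A caller could then iterate over `first_statements(["chunk-000.gz", "chunk-001.gz"])` (hypothetical file names) and pass each line to its own parser or indexer. The lazy iterator stops reading once the limit is reached.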
Additional Desirable Features
In addition to the above minimum requirements, we note other desirable features that will be used as criteria to evaluate submissions.
- The application should do more than simply store/retrieve large numbers of triples
- The application or tool(s) should be scalable (in terms of the amount of data used and in terms of distributed components working together)
- The application or tool(s) should show the use of the very large, mixed quality data set
- The application should either function in real-time or, if pre-computation is needed, have a real-time realization (but we will take a wide view of "real time" depending on the scale of what is done)