Team IN



Observation

Web users are extremely impatient: give them an interface they don't understand immediately, and chances are they will be off to a competitor's website within seconds.
The creators of semantic portals often forget this. They build websites where the user must understand the difference between searching the document text, the terms, or the ontology dictionary, or where the user has to navigate to her cooking recipe starting at "Thing".
The most popular interface on the internet seems to be the simplest: the plain text box where you enter what you are looking for. The queries are usually very short and only very few users are willing to bother with any more options.


Idea

Build a semantic portal that uses a Google-like interface.
Once the user enters a query, the search engine tries to interpret (parts of) the search string as references to parts of the ontology. If this succeeds, the information from the ontology is used to better answer the query: if, for example, part of the query references a concept in the ontology and we have documents that have been manually annotated as being about this topic, these documents will be ranked highly. At the same time, the search engine runs a full-text search over the contents of all documents, and the results from the text search and the ontology search are combined.
Since more and more users of the internet are machines, such a search engine also needs to be accessible as a web service.
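The interpretation step can be illustrated with a minimal sketch. It assumes the lexicon is a simple map from lower-cased natural language terms to topic identifiers and that the longest matching phrase wins. The class name QueryInterpreter, its methods and the topic names in the usage example are hypothetical and are not taken from the actual portal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: maps parts of a query string to ontology topics.
class QueryInterpreter {
    // Assumed lexicon layout: lower-cased term -> topic identifier.
    private final Map<String, String> lexicon = new HashMap<>();
    // Length (in words) of the longest term seen so far.
    private int maxWords = 1;

    void addTerm(String term, String topicId) {
        String key = term.toLowerCase(Locale.ROOT).trim();
        lexicon.put(key, topicId);
        maxWords = Math.max(maxWords, key.split("\\s+").length);
    }

    // Returns the topics referenced by the query, in order of appearance.
    List<String> findTopics(String query) {
        List<String> topics = new ArrayList<>();
        String normalized = query.toLowerCase(Locale.ROOT).trim();
        if (normalized.isEmpty()) {
            return topics;
        }
        String[] words = normalized.split("\\s+");
        int i = 0;
        while (i < words.length) {
            int matched = 0;
            // Try the longest phrase starting at word i first.
            for (int len = Math.min(maxWords, words.length - i); len >= 1; len--) {
                String candidate = String.join(" ", Arrays.copyOfRange(words, i, i + len));
                String topic = lexicon.get(candidate);
                if (topic != null) {
                    if (!topics.contains(topic)) {
                        topics.add(topic);
                    }
                    matched = len;
                    break;
                }
            }
            // Skip the matched phrase, or a single word if nothing matched.
            i += Math.max(matched, 1);
        }
        return topics;
    }

    public static void main(String[] args) {
        QueryInterpreter interpreter = new QueryInterpreter();
        interpreter.addTerm("semantic web", "SemanticWeb");
        interpreter.addTerm("ontology", "Ontology");
        System.out.println(interpreter.findTopics("Semantic Web ontology tutorial"));
    }
}
```

Words that match no lexicon term (such as "tutorial" in the usage example) are simply left to the full-text search, so the user never has to know that an ontology is involved.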


Implementation

We started by defining an OWL ontology of computer science topics. We also defined a lexicon containing natural language terms for the topics in the ontology.
We went on to annotate older AIFB papers with the topics from our ontology. Lucene was then used to index the content of the AIFB papers as well as the annotations.
The actual search engine is programmed in Java; the web interface is implemented as a servlet that runs in Tomcat, and the web service is built using Axis.
The search engine searches for all the terms from the query in the content of the AIFB papers. In addition, it tries to find terms from the lexicon in the query string. If any are found, it searches for documents annotated with the corresponding topic(s) and, with a lesser weight, for documents annotated with a sub- or supertopic.
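The way text hits and annotation hits could be combined into one ranking is sketched below. This is plain Java without Lucene, so it illustrates only the weighting idea, not the real index. The class CombinedRanker, the two weight constants (1.0 for a directly annotated topic, 0.5 for a sub- or supertopic) and the simple additive score are assumptions made for this example; the text above only says that related topics count with a lesser weight.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical ranker combining full-text hits with topic annotations.
class CombinedRanker {
    static final double DIRECT_TOPIC_WEIGHT = 1.0;  // assumed value
    static final double RELATED_TOPIC_WEIGHT = 0.5; // assumed "lesser weight"

    static final class Doc {
        final String title;
        final String content;
        final Set<String> topics; // manual annotations

        Doc(String title, String content, Set<String> topics) {
            this.title = title;
            this.content = content;
            this.topics = topics;
        }
    }

    // Fraction of query terms that occur in the document content.
    static double textScore(Doc doc, String query) {
        String content = doc.content.toLowerCase(Locale.ROOT);
        String[] terms = query.toLowerCase(Locale.ROOT).trim().split("\\s+");
        int hits = 0;
        for (String term : terms) {
            if (!term.isEmpty() && content.contains(term)) {
                hits++;
            }
        }
        return (double) hits / terms.length;
    }

    // Full weight for a directly annotated topic, lesser weight for a
    // sub- or supertopic listed in the 'related' map.
    static double topicScore(Doc doc, Set<String> queryTopics,
                             Map<String, Set<String>> related) {
        double score = 0.0;
        for (String topic : queryTopics) {
            if (doc.topics.contains(topic)) {
                score += DIRECT_TOPIC_WEIGHT;
                continue;
            }
            Set<String> neighbours = related.getOrDefault(topic, Collections.<String>emptySet());
            for (String neighbour : neighbours) {
                if (doc.topics.contains(neighbour)) {
                    score += RELATED_TOPIC_WEIGHT;
                    break;
                }
            }
        }
        return score;
    }

    // Returns the titles of all documents with a non-zero score, best first.
    static List<String> rank(List<Doc> docs, String query, Set<String> queryTopics,
                             Map<String, Set<String>> related) {
        Map<Doc, Double> scores = new HashMap<>();
        List<Doc> matches = new ArrayList<>();
        for (Doc doc : docs) {
            double score = textScore(doc, query) + topicScore(doc, queryTopics, related);
            if (score > 0.0) {
                scores.put(doc, score);
                matches.add(doc);
            }
        }
        matches.sort((a, b) -> Double.compare(scores.get(b), scores.get(a)));
        List<String> titles = new ArrayList<>();
        for (Doc doc : matches) {
            titles.add(doc.title);
        }
        return titles;
    }
}
```

With this scheme a paper that is only annotated with a neighbouring topic still shows up in the results, but below papers that mention the query terms or carry the exact topic.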


The Team

Left to right: Daniel Oberle, Denny Vrandecic, Steffen Lamparter, Valentin Zacharias

See our presentation (PDF).