The 2016 ENtity Summarization Evaluation Campaign (ENSEC 2016)
The volume of entity-centric data on the Web is rapidly increasing. Sources such as RDF and Linked Data, Schema.org, Facebook’s Open Graph, and Google’s Knowledge Graph describe entities (e.g., directors and films) and the relations between them (e.g., directs). The description of an entity, consisting of a set of entity-property-value triples, is sometimes too long to be presented to a user in full. As a substitute, a compact summary can be shown to help the user perform a task (e.g., browsing, searching) efficiently and effectively.
Specifically, an entity summary is a subset of entity-property-value triples selected from the description of an entity. Entity summarization is the task of automatically generating a high-quality entity summary, to be used for a specific task or for general purposes. Although several preliminary solutions have been proposed [1, 2, 3], the problem is still far from being solved. Therefore, this ENtity Summarization Evaluation Campaign (ENSEC) is organized to assess the strengths and weaknesses of entity summarization systems, compare the performance of techniques, and enhance communication among researchers and developers.
ENSEC-2016, co-located with the SumPre 2016 workshop, consists of two tracks: the DBpedia-50 track and the LinkedMDB-30 track. A system can participate in either or both tracks by submitting the summaries it generates for a set of specified entities. The results will be evaluated against gold-standard entity summaries given by human experts.
The DBpedia-50 Track
This track was cancelled due to a lack of sufficient submissions.
The LinkedMDB-30 Track
- Winning system: Yang Li and Liang Zhao. A Common Property and Special Property Entity Summarization Approach Based on Statistical Distribution
- Runner-up: Danyun Xu, Liang Zheng and Yuzhong Qu. CD at ENSEC 2016: Generating Characteristic and Diverse Entity Summaries
- LinkedMDB entity descriptions (30 entities of type Film, Actor, Director)
- Reference dataset and submissions (gold standard summaries for the 30 entities from six independent evaluators, two challenge submissions)
This track consists of 30 entities in LinkedMDB (2012-02-10). The dataset dump can be downloaded here.
For diversity purposes, the 30 entities are composed of ten entities randomly selected from each of the following three major classes in LinkedMDB: Film, Actor, and Director.
The description of each entity consists of at least twenty RDF triples; an entity can be either the subject or the object of a triple. All 30 entity descriptions can be found in this zipped pack of 30 N-Triples files, each named 'EntityLocalName.nt'.
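The constraints on an entity description can be checked with a short script. The following is a minimal sketch, not part of the campaign tooling: the function names parse_ntriple and check_description are hypothetical, and the line splitter is deliberately naive (it assumes one well-formed triple per line and does not validate N-Triples syntax).

```python
def parse_ntriple(line):
    """Split one simple N-Triples line into (subject, predicate, object) strings.

    Naive on purpose: assumes one well-formed triple per line.
    Returns None for blank lines and comment lines.
    """
    body = line.strip()
    if not body or body.startswith("#"):
        return None
    if body.endswith("."):
        body = body[:-1].rstrip()  # drop the terminating " ."
    # Split on the first two whitespace runs only, so literals keep their spaces.
    subject, predicate, obj = body.split(None, 2)
    return subject, predicate, obj


def check_description(lines, entity_iri, min_triples=20):
    """Return the parsed triples after checking the two stated constraints:
    the entity is the subject or the object of every triple, and the
    description has at least `min_triples` triples."""
    entity = f"<{entity_iri}>"
    triples = [t for t in map(parse_ntriple, lines) if t is not None]
    for subject, predicate, obj in triples:
        if entity not in (subject, obj):
            raise ValueError(f"entity is neither subject nor object: {subject} {predicate} {obj}")
    if len(triples) < min_triples:
        raise ValueError(f"expected at least {min_triples} triples, found {len(triples)}")
    return triples
```

A file such as 'EntityLocalName.nt' could be passed in as open(path, encoding="utf-8"), since iterating over a file yields its lines.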
Each participating system should select a subset of five triples from the description of each entity, as a summary for general purposes. Each summary is contained in a separate N-Triples file, named 'EntityLocalName_top5.nt'. All the 30 N-Triples files are packed into 'SystemName_linkedmdb30.zip'. Alternatively, considering that some systems can be configured in different ways (e.g., under different parameter settings), each participating system is allowed to submit the results of two runs under different configurations, packed into two separate files named 'SystemName_linkedmdb30_runA.zip' and 'SystemName_linkedmdb30_runB.zip'.
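The naming and packaging rules above can be sketched as follows. This is an illustrative helper under stated assumptions, not an official submission tool: the function names are hypothetical, and each summary is assumed to be already available as a list of five N-Triples lines keyed by the entity's local name.

```python
import zipfile
from pathlib import Path


def summary_filename(entity_local_name):
    """File name for one entity summary, e.g. 'EntityLocalName_top5.nt'."""
    return f"{entity_local_name}_top5.nt"


def archive_filename(system_name, run=None):
    """Archive name for a single submission, or for run 'A' / 'B' when two runs are submitted."""
    suffix = f"_run{run}" if run else ""
    return f"{system_name}_linkedmdb30{suffix}.zip"


def pack_submission(summaries, system_name, out_dir, run=None,
                    summary_size=5, expected_entities=30):
    """Write one zip archive with one N-Triples file per entity.

    `summaries` maps an entity local name to a list of N-Triples lines.
    All checks run before the archive is created, so a failed check
    leaves no partial file behind.
    """
    if len(summaries) != expected_entities:
        raise ValueError(f"expected {expected_entities} summaries, got {len(summaries)}")
    for name, triples in summaries.items():
        if len(triples) != summary_size:
            raise ValueError(f"{name}: expected {summary_size} triples, got {len(triples)}")
    out_path = Path(out_dir) / archive_filename(system_name, run)
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, triples in sorted(summaries.items()):
            archive.writestr(summary_filename(name), "\n".join(triples) + "\n")
    return out_path
```

Calling pack_submission(summaries, "SystemName", ".", run="A") would produce 'SystemName_linkedmdb30_runA.zip' in the current directory.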
Entity summaries, packed into zip files, should be sent to email@example.com. In addition, participants are expected to submit a paper describing each participating system. The paper may be up to 5 pages in Springer LNCS format and is submitted via EasyChair; it will be included in the proceedings of the SumPre 2016 workshop without peer review. At least one author of each system paper is expected to register for the workshop and attend to present the paper.
Entity summaries generated by participating systems will be compared with gold-standard entity summaries given by a group of human experts. The results will be reported at the SumPre 2016 workshop. Gold-standard entity summaries will also be shared at the workshop, to be used in future research.
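This page does not state which measure is used for the comparison. As one plausible illustration only, the sketch below computes the average overlap between a system summary and the gold-standard summaries for the same entity, i.e. the mean number of triples they share. The function name average_overlap and the choice of measure are assumptions, not the campaign's confirmed procedure.

```python
def average_overlap(system_summary, gold_summaries):
    """Mean number of triples shared between one system summary and each
    gold-standard summary of the same entity.

    Triples are compared as exact strings, so both sides should use the
    same serialization (for example, normalized N-Triples lines).
    """
    if not gold_summaries:
        raise ValueError("at least one gold-standard summary is required")
    system = set(system_summary)
    shared = [len(system & set(gold)) for gold in gold_summaries]
    return sum(shared) / len(shared)
```

With five-triple summaries the score ranges from 0 (no triple shared with any evaluator) to 5 (identical to every evaluator's summary); averaging this score over all 30 entities would give one number per system.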
Winners and runners-up will share Amazon vouchers with a total value of 350 Euros.
Gong Cheng, Nanjing University, China
Kalpa Gunaratna, (Kno.e.sis) Wright State University, USA
Andreas Thalhammer, Karlsruhe Institute of Technology (KIT), Germany
[1] Gong Cheng, Thanh Tran, Yuzhong Qu. RELIN: Relatedness and Informativeness-based Centrality for Entity Summarization. In Proceedings of the 10th International Semantic Web Conference (ISWC’11), Part I, pages 114–129, 2011.
[2] Andreas Thalhammer, Ioan Toma, Antonio J. Roa-Valverde, Dieter Fensel. Leveraging Usage Data for Linked Data Movie Entity Summarization. In Proceedings of the 2nd International Workshop on Usage Analysis and the Web of Data (USEWOD’12), 2012.
[3] Kalpa Gunaratna, Krishnaprasad Thirunarayan, Amit Sheth. FACES: Diversity-Aware Entity Summarization Using Incremental Hierarchical Conceptual Clustering. In Proceedings of the 29th AAAI Conference on Artificial Intelligence (AAAI’15), pages 116–122, 2015.