Measuring Query Processing Performance with dief@t and dief@k



In this demo, we showcase how the metrics dief@t and dief@k measure the continuous efficiency (or diefficiency) of SPARQL query engines.

To measure the diefficiency of approaches, the metrics dief@t and dief@k compute the area under the curve of answer traces. An answer trace records the points in time at which an approach produces each answer. The Answer Trace plot depicts the answer traces of three approaches when executing a query.
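The area-under-the-curve idea can be sketched in a few lines of Python. This is a minimal illustration, not the dief R package itself: the function name `dief_at_t`, the trapezoidal interpolation between recorded points, and the sample timestamps are all assumptions made for the example.

```python
def dief_at_t(answer_times, t):
    """Area under the answer trace up to time t (trapezoidal rule).

    Assumption: the trace is the list of timestamps at which answers
    were produced, and the area is taken between recorded points only.
    """
    # Each point is (timestamp, number of answers produced so far).
    points = [
        (ts, count)
        for count, ts in enumerate(sorted(answer_times), start=1)
        if ts <= t
    ]
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical trace: answers arrive at 1 s, 2 s, and 4 s.
trace = [1.0, 2.0, 4.0]
area_at_2 = dief_at_t(trace, 2.0)  # 1.5
area_at_4 = dief_at_t(trace, 4.0)  # 6.5
```

The sketch does not extend the curve from the last recorded answer to t; whether to do so is a design choice that changes the absolute values but not the idea.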

We compare the performance of the nLDE query engine when executing SPARQL queries with three different configurations: Not Adaptive, Random, and Selective.

We executed the SPARQL queries from Benchmark 1 using nLDE and recorded two outputs:
  • traces: Contains the answer trace per query per approach.
  • metrics: Reports on time for the first tuple, execution time, and number of answers produced per query and approach.
These outputs are available as CSV files at https://doi.org/10.6084/m9.figshare.5008289.
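As a rough sketch of how a traces file could be read, the Python below groups answer timestamps by query and approach. The column names (`test`, `approach`, `answer`, `time`) are an assumption about the CSV layout, and the rows are made up for illustration; they are not taken from the published files.

```python
import csv
import io
from collections import defaultdict

# Made-up rows for illustration only; the column names are an assumption.
SAMPLE_TRACES_CSV = """test,approach,answer,time
Q1,Selective,1,0.5
Q1,Selective,2,0.9
Q1,Random,1,0.7
Q1,Random,2,1.6
"""


def load_traces(csv_file):
    """Group answer timestamps by (query, approach), sorted by time."""
    traces = defaultdict(list)
    for row in csv.DictReader(csv_file):
        traces[(row["test"], row["approach"])].append(float(row["time"]))
    for times in traces.values():
        times.sort()
    return dict(traces)


traces = load_traces(io.StringIO(SAMPLE_TRACES_CSV))
```

To read a downloaded file instead, the same function could be called with `open(path, newline="")`, provided the header matches the assumed columns.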

As part of this demo, we provide the dief R package, available on GitHub. To illustrate its usage, we show snippets of the functions that dief provides to generate the reported results.

In this demo, we will analyze the performance of the nLDE variants when executing different SPARQL queries.

Answer Trace






1. Measuring Performance with dief@t


Measuring dief@t at Different Points in Time

The metric dief@t can be measured at different points in time. Intuitively, approaches that produce answers at a higher rate within a given period of time are more efficient.
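To make the intuition concrete, the following Python sketch evaluates dief@t at two points in time for two hypothetical approaches. The helper, the approach names `early` and `late`, and the timestamps are illustrative assumptions, not the dief R package or measured data.

```python
def dief_at_t(answer_times, t):
    """Trapezoidal area under the answer trace, using answers produced by t."""
    points = [
        (ts, count)
        for count, ts in enumerate(sorted(answer_times), start=1)
        if ts <= t
    ]
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical traces: both approaches finish at 4 s with four answers,
# but "early" produces most of its answers sooner.
approaches = {
    "early": [0.5, 1.0, 1.5, 4.0],
    "late": [2.5, 3.0, 3.5, 4.0],
}

# dief@t for every approach at each chosen point in time.
results = {
    t: {name: dief_at_t(times, t) for name, times in approaches.items()}
    for t in (2.0, 4.0)
}
```

Here `early` obtains the higher dief@t at both t = 2 and t = 4, even though the two approaches have the same total execution time and the same number of answers.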

dief@t interpretation: Higher is better.





Comparing dief@t with Other Metrics

Raw Values of Metrics

We computed four different metrics used in the query processing literature:
  • Time for the first tuple (TFFT)
  • Total execution time (totaltime)
  • Number of answers produced (comp)
  • Throughput (throughput)

dief@t is computed with t equal to the minimum total execution time among all the approaches, so that every approach is compared over the same time interval.
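The choice of t and the throughput metric can be sketched as follows. The record layout, the field names, and the numbers are hypothetical; throughput is assumed here to mean answers produced divided by total execution time.

```python
# Hypothetical metric records for one query (values are made up).
metrics = [
    {"approach": "Not Adaptive", "tfft": 0.8, "totaltime": 12.0, "comp": 240},
    {"approach": "Random", "tfft": 0.6, "totaltime": 10.0, "comp": 240},
    {"approach": "Selective", "tfft": 0.4, "totaltime": 8.0, "comp": 240},
]

# t is the shortest total execution time, so no approach is measured
# beyond the point where the fastest one has already finished.
t = min(record["totaltime"] for record in metrics)

# Assumed definition: answers produced per unit of execution time.
throughput = {
    record["approach"]: record["comp"] / record["totaltime"]
    for record in metrics
}
```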

Radarchart of Computed Metrics

Plot interpretation: Higher is better.




2. Measuring Performance with dief@k


Measuring dief@k at Different Numbers of Answers Produced

The metric dief@k can be measured for different numbers of answers produced. Intuitively, approaches that require a shorter period of time to produce a certain number of answers are more efficient.
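A minimal Python sketch of dief@k, under the assumption that it is the trapezoidal area under the answer trace restricted to the first k answers. The function name and the timestamps are illustrative and are not the dief R package.

```python
def dief_at_k(answer_times, k):
    """Area under the answer trace restricted to the first k answers."""
    times = sorted(answer_times)[:k]
    # Each point is (answers produced so far, timestamp).
    points = list(enumerate(times, start=1))
    area = 0.0
    for (y0, x0), (y1, x1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical traces: "slow" needs twice as long for each answer.
fast = [1.0, 2.0, 3.0]
slow = [2.0, 4.0, 6.0]

fast_area = dief_at_k(fast, 3)  # 4.0
slow_area = dief_at_k(slow, 3)  # 8.0
```

The approach that reaches k answers sooner accumulates less area, which is why a lower dief@k indicates better performance.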

dief@k interpretation: Lower is better.




Measuring dief@k at Different Answer Completeness

dief@k can also be measured at the point where the approaches have produced a given percentage of the answers.
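One way to turn a completeness percentage into a value of k is sketched below. It assumes that the percentage is applied to the smallest number of answers produced by any approach, so that every approach actually reaches k, and that fractional values are rounded down. Both choices, as well as the function name and the answer counts, are assumptions made for illustration.

```python
import math


def k_at_completeness(answers_per_approach, percent):
    """Number of answers k corresponding to a completeness percentage."""
    # Use the smallest answer count so that all approaches reach k.
    reachable = min(answers_per_approach.values())
    return math.floor(reachable * percent / 100)


# Hypothetical answer counts per approach for one query.
answers = {"Not Adaptive": 240, "Random": 240, "Selective": 200}

k_values = {pc: k_at_completeness(answers, pc) for pc in (25, 50, 75, 100)}
```

Each resulting k is then the cutoff at which dief@k is evaluated for every approach.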

Plot interpretation: Lower is better.






Demo by Maribel Acosta and Maria-Esther Vidal.