Data Sprout
Dataset Generation Patterns for Evaluating Knowledge Graph Construction

Knowledge Graph Selection

Select a knowledge graph that is used as a source to generate the data below.

Determines the date format and the names used for boolean values.


Shows, for each cell, a cell comment with the expected statements in Turtle format. Note that some spreadsheet tools cannot handle large numbers of cell comments.

All statements that should be found in the spreadsheet.

For each cell in a sheet, the provenance information as reified statements.
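To illustrate what a reified provenance statement for a single cell could look like, here is a minimal Python sketch that builds a Turtle snippet. The terms rdf:Statement, rdf:subject, rdf:predicate and rdf:object belong to the standard RDF vocabulary. The ex: namespace, the ex:sheet and ex:cell properties, the example entities, and the function name reify_cell are assumptions made for illustration; the vocabulary actually used in the generated files may differ.

```python
# Prefix declarations for the snippet; the ex: namespace is a placeholder.
PREFIXES = (
    "@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .\n"
    "@prefix ex: <http://example.org/> .\n"
)


def reify_cell(subject, predicate, obj, sheet, cell):
    """Return a Turtle snippet that reifies one statement and links it to a cell.

    ex:sheet and ex:cell are hypothetical provenance properties.
    """
    lines = [
        "[] a rdf:Statement ;",
        f"    rdf:subject {subject} ;",
        f"    rdf:predicate {predicate} ;",
        f"    rdf:object {obj} ;",
        f'    ex:sheet "{sheet}" ;',
        f'    ex:cell "{cell}" .',
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(PREFIXES)
    print(reify_cell("ex:alice", "ex:worksFor", "ex:acme", "Employees", "B2"))
```

Each reified statement carries the triple it describes plus the sheet and cell address, so an evaluation can check whether a statement was extracted from the right place.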

For each cell in a sheet, the provenance information as a CSV table.
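A provenance table of this kind can be read with standard CSV tooling. The sketch below writes and queries such a table using Python's csv module. The column names (sheet, cell, subject, predicate, object) and the helper names are hypothetical; the headers in the generated file may be different.

```python
import csv
import io

# Hypothetical column layout; the real file may use different headers.
FIELDS = ["sheet", "cell", "subject", "predicate", "object"]


def provenance_csv(rows):
    """Serialize provenance records (dicts keyed by FIELDS) to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def statements_for_cell(csv_text, sheet, cell):
    """Return the (subject, predicate, object) triples recorded for one cell."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (row["subject"], row["predicate"], row["object"])
        for row in reader
        if row["sheet"] == sheet and row["cell"] == cell
    ]


if __name__ == "__main__":
    text = provenance_csv([
        {"sheet": "Employees", "cell": "B2", "subject": "ex:alice",
         "predicate": "ex:worksFor", "object": "ex:acme"},
        {"sheet": "Employees", "cell": "C2", "subject": "ex:alice",
         "predicate": "ex:age", "object": "42"},
    ])
    print(text)
    print(statements_for_cell(text, "Employees", "B2"))
```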

Explains which patterns were used in each sheet and how often.

When you click the button, a new tab opens. Depending on the settings, the generation can take some time.
If generation succeeds, a ZIP file containing all generated resources is downloaded. If an error occurs, the page displays error information instead.

You can also download already generated datasets.

Generation Patterns
Numeric Information as Text
There is numeric information; therefore, it can also be represented as text.
Acronyms or Symbols
An entity refers to another entity or has a literal value; therefore, to save time, a short acronym or symbol string is used to refer to the entity or literal value.
Multiple Surface Forms
Entities can be mentioned in various ways; therefore, different cells contain distinct surface forms of the same entity.
Property Value as Color
Some entities have different values for the same property; therefore, colors that encode the property values are used to color cell ranges.
Partial Formatting Indicates Relations
Multiple entities in one cell have different relationships with another entity; therefore, partial formatting is used to indicate their relations.
Outdated is Formatted
Information is no longer valid but must not be removed completely; therefore, outdated information is marked with formatting.
Multiple Entities in one Cell
Multiple entities have to be referred to; therefore, the entities are listed in the same cell.
Intra-Cell Additional Information
Additional information is related to information that is already recorded in a cell; therefore, the additional information is recorded in the same cell.
Multiple Types in a Table
Some entities of different types share the same properties; therefore, they are recorded in the same table.
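
A few of the text-based patterns above can be sketched in Python to show how the same underlying value ends up as different cell content. This is an illustrative sketch only, not the generator's implementation. All function names are hypothetical, the spelled-out form is just one reading of "numeric information as text" (storing the digits as a string would be another), and the comma separator for multiple entities is an assumption.

```python
ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def number_as_text(n):
    """Numeric Information as Text: spell out an integer from 0 to 99."""
    if not 0 <= n < 100:
        raise ValueError("only 0..99 are supported in this sketch")
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]}-{ONES[ones]}"


def acronym(label):
    """Acronyms or Symbols: abbreviate a label to its uppercase initials."""
    return "".join(word[0].upper() for word in label.split())


def surface_forms(first, last):
    """Multiple Surface Forms: several ways to mention the same person."""
    return [f"{first} {last}", f"{last}, {first}", f"{first[0]}. {last}"]


def entities_in_one_cell(labels, separator=", "):
    """Multiple Entities in one Cell: list several entities in one cell."""
    return separator.join(labels)


if __name__ == "__main__":
    print(number_as_text(42))
    print(acronym("Knowledge Graph"))
    print(surface_forms("Ada", "Lovelace"))
    print(entities_in_one_cell(["Ada Lovelace", "Alan Turing"]))
```

Patterns that rely on formatting, such as colors or partial formatting, are not covered here because they depend on the spreadsheet file format rather than on the cell text.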