This repository provides all you need to convert CSV files to RDF. It contains:
- A sample CSV file
- A sample XRM mapping that generates CSVW (CSV on the web) mapping files
- A pipeline that converts the input CSV to RDF
- A default GitHub Action configuration that runs the pipeline and creates an artifact for download
This is a GitHub template repository: the copy created when you click the Use this template button above will not be marked as a "fork". Simply do that, then add your data sources and adjust the XRM mapping accordingly:
- Add your CSV files to the `input` directory.
- Generate a `logical-source` block for each CSV by running:

  ```shell
  npm run csv2xrm -- input/yourfile.csv input/another.csv
  ```

  This reads the CSV header of each file, auto-detects the delimiter, and appends one `logical-source` block per file to `mappings/Sources.xrm`. You will be prompted before overwriting an existing file.
- Adjust the generated blocks and create/adjust the XRM mapping files in `mappings/`.
- Execute one of the run scripts to convert your data.
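The steps above can be sketched with a hypothetical semicolon-delimited file (file name and columns invented for illustration):

```shell
# Hypothetical CSV — csv2xrm reads only the header row and auto-detects
# the delimiter (here ";").
cat <<'EOF' > /tmp/orders.csv
id;customer;total
1;Alice;42.50
2;Bob;13.37
EOF
head -n 1 /tmp/orders.csv   # the header the generator inspects
# Generating the logical-source block for it would then be:
# npm run csv2xrm -- /tmp/orders.csv
```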
Make sure to commit the `input`, `mappings`, and `src-gen` directories if you want to build using GitHub Actions.
See Further reading for more information about the XRM mapping language.
The default pipeline can be run with `npm start` or `npm run to-file`. It will:
- Read the CSVW input files
- Convert them to RDF
- Write the result to a file as N-Triples (default: `output/transformed.nt`)
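N-Triples is one triple per line, each terminated by ` .`. A minimal sketch of what the output file contains, with invented URIs (the real output depends on your mapping):

```shell
# Invented subject/predicate/object URIs — for illustration only.
cat <<'EOF' > /tmp/transformed.nt
<http://example.org/row/1> <http://schema.org/name> "Alice" .
<http://example.org/row/2> <http://schema.org/name> "Bob" .
EOF
wc -l < /tmp/transformed.nt   # one line per triple
```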
There are additional pipelines configured in `package.json`:
- `file-to-store`: Uploads the generated output file to an RDF store via the SPARQL Graph Store Protocol
- `to-store` (and `to-store-dev`): Uploads directly to an RDF store (streaming within the pipeline) via the SPARQL Graph Store Protocol
If you want to test the upload to an RDF store, a default Apache Jena Fuseki installation with a dataset named `data` on port 3030 should work out of the box.
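As a sketch of what the upload does, assuming Fuseki's default endpoint layout and a dataset named `data`, the Graph Store Protocol request looks roughly like this:

```shell
# Fuseki's default Graph Store Protocol endpoint: http://<host>:3030/<dataset>/data
FUSEKI_DATASET="http://localhost:3030/data"
GSP_ENDPOINT="$FUSEKI_DATASET/data?default"   # write to the default graph
echo "$GSP_ENDPOINT"
# With a running Fuseki, the upload would be:
# curl -X POST -H 'Content-Type: application/n-triples' \
#      --data-binary @output/transformed.nt "$GSP_ENDPOINT"
```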
Pipeline configuration is done via environment variables and/or by adjusting default variables in the pipeline itself. If you want to pass another default, have a look at the `--variable=XYZ` samples in `package.json` or consult the barnard59 documentation. If you want to adjust it in the pipeline, open the file `pipelines/main.ttl` and edit `<defaultVars> ...`.
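Overriding a variable at run time might look like this; the variable name `output` is an assumption, so check the `--variable=...` samples in `package.json` for the names this repository's pipelines actually declare:

```shell
# Hypothetical variable name "output" — verify against package.json before using.
CMD="npm run to-file -- --variable=output=output/my-data.nt"
echo "$CMD"   # run this from the repository root
```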
This template is built on top of our Zazuko barnard59 pipelining system, a Node.js-based, fully configurable pipeline framework aimed at creating RDF data out of various data sources. Unlike many other data pipelining systems, barnard59 is configured rather than programmed. In case you need pre- or post-processing, you can implement additional pipeline steps in JavaScript.
barnard59 is streaming and can be used to convert very large data sets with a small memory footprint.
We provide additional template repositories:
- xrm-csv-workflow: A template for converting CSV files or relational databases to RDF using DuckDB and Ontop VKG. Better suited for large CSV files — this repository is a good fit for smaller, self-contained projects that only need Node.js.
Sample SPARQL queries for the generated data are available in `docs/QUERIES.sparqlbook`. To run them directly in VS Code, install the SPARQL Notebook extension.
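A hypothetical query in the style of those notebooks, runnable from the command line against a local Fuseki store (dataset name `data` matches the default setup mentioned above):

```shell
# Count all triples in the store.
QUERY='SELECT (COUNT(*) AS ?triples) WHERE { ?s ?p ?o }'
echo "$QUERY"
# With a running store:
# curl -G http://localhost:3030/data/sparql --data-urlencode "query=$QUERY"
```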
- Expressive RDF Mapping Language (XRM) and the documentation for details about the domain-specific language (DSL).
- CSV on the Web: A Primer: Introduction to the CSVW mapping language, which is generated by XRM and consumed by barnard59. This is listed only for reference; you do not have to learn it, since XRM generates it for you.
- SPARQL 1.1 Graph Store HTTP Protocol: The SPARQL Graph Store specification used to upload data to an RDF store like Apache Jena Fuseki