
LDE – Linked Data Elements

Shared building blocks for the full Linked Data lifecycle.


Every organization working with Linked Data ends up building the same infrastructure from scratch: endpoint management, data import, transformation pipelines, dataset discovery.

LDE covers the full Linked Data lifecycle – from discovery and ingestion through transformation to publication – as an open-source toolkit of composable building blocks for Node.js.

Data transformations are expressed as plain SPARQL queries: portable, transparent and free of vendor lock-in.
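Because a transformation is just a SPARQL file, it can be versioned, reviewed and run on any SPARQL 1.1 engine. The following standalone sketch (not part of the @lde API; the file name and query are illustrative) shows that such a query is an ordinary text file:

```typescript
import { readFileSync, writeFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

const queryPath = join(tmpdir(), 'label-to-name.rq');

// A hypothetical transformation: copy rdfs:label into schema:name.
writeFileSync(
  queryPath,
  `PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX schema: <https://schema.org/>
CONSTRUCT { ?s schema:name ?label }
WHERE { ?s rdfs:label ?label }`
);

// Loading the query yields a plain string: no DSL, no compilation step.
export function loadQuery(path: string): string {
  return readFileSync(path, 'utf8');
}

const query = loadQuery(queryPath);
console.log(query.startsWith('PREFIX')); // → true
```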

Key capabilities

  • Discover datasets from DCAT-AP 3.0 registries.
  • Download and import data dumps to a local SPARQL endpoint for querying.
  • Transform datasets with pure SPARQL CONSTRUCT queries: composable stages with fan-out item selection.
  • Analyze datasets with VoID statistics and SPARQL monitoring.
  • Publish results to SPARQL endpoints or local files.
  • Serve RDF data over HTTP with content negotiation (Fastify plugin).
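Content negotiation boils down to choosing an RDF serialization from the request's Accept header. The sketch below illustrates that idea in isolation; it is not the @lde/fastify-rdf plugin API, and the supported media types are assumptions:

```typescript
const RDF_TYPES = ['text/turtle', 'application/n-triples', 'application/ld+json'];

export function negotiate(accept: string): string | undefined {
  // Order the client's preferences by q-value (defaulting to 1).
  const preferences = accept
    .split(',')
    .map((part) => {
      const [type, ...params] = part.trim().split(';');
      const q = params.map((p) => p.trim()).find((p) => p.startsWith('q='));
      return { type: type.trim(), q: q ? parseFloat(q.slice(2)) : 1 };
    })
    .sort((a, b) => b.q - a.q);

  for (const { type } of preferences) {
    if (RDF_TYPES.includes(type)) return type;
    if (type === '*/*') return RDF_TYPES[0];
  }
  return undefined; // no acceptable serialization: respond 406
}

console.log(negotiate('application/ld+json;q=0.8, text/turtle'));
// → text/turtle (text/turtle has the implicit q=1)
```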

Standards

Standard          Usage
DCAT-AP 3.0 (EU)  Dataset discovery and registry queries
SPARQL 1.1        Data transformations, dataset queries and endpoint management
SHACL             Validation (@lde/pipeline-shacl-validator) and documentation generation (@lde/docgen)
VoID              Statistical analysis of RDF datasets (@lde/pipeline-void)
RDF/JS            Internal data model (N3)
LDES (EU)         Event stream consumption and publication (planned)

Quick example

import {
  Pipeline,
  Stage,
  SparqlConstructExecutor,
  SparqlItemSelector,
  SparqlUpdateWriter,
  ManualDatasetSelection,
} from '@lde/pipeline';

// `dataset` is assumed to be a Dataset instance obtained elsewhere,
// for example via @lde/dataset-registry-client.
const pipeline = new Pipeline({
  datasetSelector: new ManualDatasetSelection([dataset]),
  stages: [
    new Stage({
      name: 'per-class',
      itemSelector: new SparqlItemSelector({
        query: 'SELECT DISTINCT ?class WHERE { ?s a ?class }',
      }),
      executors: new SparqlConstructExecutor({
        query:
          'CONSTRUCT { ?class a <http://example.org/Class> } WHERE { ?s a ?class }',
      }),
    }),
  ],
  writers: new SparqlUpdateWriter({
    endpoint: new URL('http://localhost:7200/repositories/lde/statements'),
  }),
});

await pipeline.run();

Packages

Discovery – Find and retrieve dataset descriptions from registries

  • @lde/dataset – Core dataset and distribution objects
  • @lde/dataset-registry-client – Retrieve dataset descriptions from DCAT-AP 3.0 registries

Processing – Transform, enrich and analyze datasets with SPARQL pipelines

  • @lde/pipeline – Build pipelines that query, transform and enrich Linked Data
  • @lde/pipeline-shacl-validator – SHACL validation for pipeline stages
  • @lde/pipeline-void – VoID statistical analysis for RDF datasets
  • @lde/distribution-downloader – Download distributions for local processing
  • @lde/sparql-importer – Import data dumps to a local SPARQL endpoint for querying

Publication – Serve and document your data

  • @lde/fastify-rdf – Fastify plugin for RDF content negotiation and request body parsing
  • @lde/docgen – Generate documentation from RDF such as SHACL shapes

Monitoring – Observe pipeline runs and endpoint health

  • @lde/sparql-monitor – Monitor SPARQL endpoints with periodic checks
  • @lde/pipeline-console-reporter – Console progress reporter for pipelines

Infrastructure – Manage SPARQL servers and run tasks

  • @lde/local-sparql-endpoint – Quickly start a local SPARQL endpoint for testing and development
  • @lde/sparql-server – Start, stop and control SPARQL servers
  • @lde/sparql-qlever – QLever SPARQL adapter for importing and serving data
  • @lde/wait-for-sparql – Wait for a SPARQL endpoint to become available
  • @lde/task-runner – Task runner core classes and interfaces
  • @lde/task-runner-docker – Run tasks in Docker containers
  • @lde/task-runner-native – Run tasks natively on the host system
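Waiting for a SPARQL endpoint typically means polling it with a cheap query (such as an ASK) until it answers. The sketch below shows that pattern with an injectable probe so it can be tested without a server; it is an independent illustration, not the @lde/wait-for-sparql API:

```typescript
export async function waitFor(
  // e.g. () => fetch(endpoint + '?query=ASK%7B%7D').then((r) => r.ok)
  probe: () => Promise<boolean>,
  { retries = 10, delayMs = 500 } = {}
): Promise<boolean> {
  for (let attempt = 0; attempt < retries; attempt++) {
    try {
      if (await probe()) return true; // endpoint answered: done
    } catch {
      // Endpoint not reachable yet; fall through to the delay.
    }
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  return false; // gave up after `retries` attempts
}

// Example with a fake probe that succeeds on the third attempt.
let calls = 0;
const ready = await waitFor(async () => ++calls >= 3, { delayMs: 10 });
console.log(ready, calls); // → true 3
```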

Architecture

graph TD
  subgraph Discovery
    dataset
    dataset-registry-client --> dataset
  end

  subgraph Processing
    pipeline --> dataset-registry-client
    pipeline --> sparql-server
    pipeline --> sparql-importer
    pipeline-shacl-validator --> pipeline
    pipeline-void --> pipeline
    distribution-downloader --> dataset
    sparql-importer --> dataset
  end

  subgraph Publication
    fastify-rdf
    docgen
  end

  subgraph Monitoring
    pipeline-console-reporter --> pipeline
    sparql-monitor
  end

  subgraph Infrastructure
    sparql-qlever --> sparql-importer
    sparql-qlever --> sparql-server
    sparql-qlever --> task-runner-docker
    task-runner-docker --> task-runner
    task-runner-native --> task-runner
    sparql-server
    local-sparql-endpoint
    wait-for-sparql
  end

Who uses LD Elements

Netwerk Digitaal Erfgoed — Dutch national digital heritage infrastructure, commissioned by the Ministry of Education, Culture and Science

Comparison

                   LD Elements                TriplyETL             rdf-connect
Focus              SPARQL-native pipelines    RDF ETL platform      RDF stream processing
Pipeline language  SPARQL + TypeScript        TypeScript DSL        Declarative (RML)
Lock-in            None – plain SPARQL files  Proprietary platform  Framework-specific
License            MIT                        Proprietary           MIT

Development

Prerequisites: Node.js (LTS) and npm.

npm install
npx nx run-many -t build
npx nx run-many -t test
npx nx affected -t lint test typecheck build  # only changed packages

See CONTRIBUTING.md for the full development workflow.

License

MIT – see LICENSE.

Acknowledgements

LD Elements originated at the Dutch national infrastructure for digital heritage (NDE).