Information Integration

Created on Feb. 27, 2013, 7:40 p.m. by Hevok & updated by Hevok on May 2, 2013, 4:42 p.m.

We are combining information from diverse Open Linked Data sets. Everything should be integrated using an ontology (plus the other usual suspects, such as SKOS) and stored as RDF in a triple store (such as Virtuoso). RDF is used because its graph model is an efficient way to store and query information, and an agile one when changes are needed. The front-end is therefore generated from a set of SPARQL queries (in effect a semantic search feature, hidden behind a hopefully well-designed form). Data is also provided as JSON or JSON-LD.
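As a rough illustration of a form hidden in front of SPARQL, the sketch below turns submitted form fields into a query string. It is only a minimal sketch under assumptions: the function names `escape_literal` and `build_search_query`, the variable `?item`, and the idea that each form field maps to one predicate IRI with an exact-match literal are all hypothetical and not taken from the actual system.

```python
from typing import Dict


def escape_literal(value: str) -> str:
    """Escape a string for use inside a double-quoted SPARQL literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def build_search_query(filters: Dict[str, str], limit: int = 50) -> str:
    """Build a SELECT query from form fields (hypothetical helper).

    `filters` maps a predicate IRI to the exact literal value the user typed.
    """
    if not filters:
        raise ValueError("at least one filter is required")
    patterns = []
    for predicate_iri, text in filters.items():
        # Reject characters that could break out of the <...> IRI syntax.
        if any(ch in predicate_iri for ch in '<>" \n\t'):
            raise ValueError(f"invalid predicate IRI: {predicate_iri!r}")
        patterns.append(f'  ?item <{predicate_iri}> "{escape_literal(text)}" .')
    body = "\n".join(patterns)
    return f"SELECT DISTINCT ?item WHERE {{\n{body}\n}}\nLIMIT {int(limit)}"
```

The resulting string would then be sent to the triple store's SPARQL endpoint. Escaping the user input before it reaches the query keeps a form value from altering the query structure.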

To cope with data quality, real-time extraction algorithms are needed so that the data is up to date right after it has been changed at the original source. One also has to deal with conflicting information. When, for instance, two sources give different information, one can simply display both, but it would be more elegant to figure out other techniques for dealing with such conflicts.
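The simple "display both" option can be sketched as follows: every value is kept together with the sources that assert it, and a fact is flagged as conflicting when more than one distinct value exists. This is a minimal sketch, not the system's implementation. The `Statement` type, the function names, and the `ex:` identifiers in the usage example are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Statement(NamedTuple):
    """One extracted value plus the source it came from (hypothetical type)."""
    subject: str
    predicate: str
    value: str
    source: str


def group_by_fact(
    statements: Iterable[Statement],
) -> Dict[Tuple[str, str], Dict[str, List[str]]]:
    """Map (subject, predicate) to {value: sorted list of asserting sources}."""
    grouped = defaultdict(lambda: defaultdict(set))
    for s in statements:
        grouped[(s.subject, s.predicate)][s.value].add(s.source)
    return {
        key: {value: sorted(sources) for value, sources in values.items()}
        for key, values in grouped.items()
    }


def find_conflicts(
    statements: Iterable[Statement],
) -> Dict[Tuple[str, str], Dict[str, List[str]]]:
    """Return only the facts for which the sources disagree."""
    return {
        key: values
        for key, values in group_by_fact(statements).items()
        if len(values) > 1
    }
```

For example, with made-up data:

```python
data = [
    Statement("ex:item1", "ex:size", "10", "sourceA"),
    Statement("ex:item1", "ex:size", "12", "sourceB"),
    Statement("ex:item1", "ex:name", "Item", "sourceA"),
    Statement("ex:item1", "ex:name", "Item", "sourceB"),
]
conflicts = find_conflicts(data)
# Only ("ex:item1", "ex:size") is reported; both values are kept with their sources.
```

Because provenance is kept per value, the front-end can show both values side by side, and a more elegant resolution technique could later be plugged in at the same point.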

The architecture shall be domain-agnostic so that it can be replicated for any dataset and RDF store by adjusting a few parameters in the system.
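One way to picture "a few parameters" is a small configuration object holding everything that is specific to a store or dataset, with the rest of the code depending only on that object. The sketch below is an assumption, not the actual design: `StoreConfig`, its fields, and `build_query_url` are hypothetical names. The `query` and `default-graph-uri` request parameters are the ones defined by the SPARQL Protocol.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlencode


@dataclass(frozen=True)
class StoreConfig:
    """Per-deployment parameters (hypothetical); nothing else is store-specific."""
    endpoint_url: str
    default_graph: Optional[str] = None


def build_query_url(config: StoreConfig, sparql: str) -> str:
    """Build a SPARQL Protocol GET URL for the configured endpoint."""
    params = {"query": sparql}
    if config.default_graph is not None:
        # Restrict the query to one dataset graph when configured.
        params["default-graph-uri"] = config.default_graph
    return f"{config.endpoint_url}?{urlencode(params)}"
```

Pointing the system at another dataset or RDF store would then mean supplying a different `StoreConfig`, with no change to the query-building or front-end code.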


Tags: web, open linked data, semantics, data
Categories: Concept
Parent: Data Integration
