Giant Triple Store

In a Giant Triple Store every Triple is put into one single Table with three Columns: Subject, Property and Object. This is easy to implement, because only one Table has to be created within the Database, and the Property Column can hold any number of different Properties. The drawback is that answering a Query requires many self joins on that one Relational Database Table. For this reason the Indexes have to be chosen with care to enable efficient computation inside a Giant Triple Store.

  • Basic idea: store all Triples in one single Table with the Columns Subject, Property and Object
  • Pros:
      • easy to implement
      • works for huge numbers of Properties, if the Indexes are chosen with care
  • Cons:
      • many self joins
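
The single-Table layout can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration: the Table name triples, the Column names s, p, o, the three composite Indexes and the sample rows are assumptions made for the example, not something the tutorial prescribes.

```python
import sqlite3

# Assumed names: table "triples" with columns s (subject), p (property), o (object).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE triples (
        s TEXT NOT NULL,  -- subject
        p TEXT NOT NULL,  -- property
        o TEXT NOT NULL   -- object
    );
    -- One possible choice of composite indexes, so that lookups starting
    -- from a subject, a property or an object can each use an index.
    CREATE INDEX idx_spo ON triples (s, p, o);
    CREATE INDEX idx_pos ON triples (p, o, s);
    CREATE INDEX idx_osp ON triples (o, s, p);
""")

# Made-up sample triples; every fact, whatever its property, goes into the same table.
rows = [
    ("student1", "type", "GradStudent"),
    ("student1", "bachelorsFrom", "univ0"),
]
conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", rows)

index_names = sorted(
    r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'index' AND tbl_name = 'triples'"
    )
)
triple_count = conn.execute("SELECT COUNT(*) FROM triples").fetchone()[0]
```

Which Indexes actually pay off depends on the Queries that are run; the three orderings above are just one plausible starting point.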

In such a system one has to translate a SPARQL Query into a SQL Query.

For instance, suppose one wants to select the universities from which graduate students hold a Bachelor's degree. This is quite simple: the Query consists of two Graph Patterns that share the same subject, so one self join on the single Table is needed. One selects Triples from t1 and Triples from t2. From t1 one keeps all rows whose type is GradStudent, and one combines them, via the shared subject, with the rows from t2 whose Property is bachelorsFrom.


SELECT ?university WHERE {
    ?v rdf:type :GradStudent;
        :bachelorsFrom ?university . }


SELECT t2.o AS university
FROM triples AS t1, triples AS t2
WHERE t1.p='type' AND
      t1.o='GradStudent' AND
      t2.p='bachelorsFrom' AND
      t1.s=t2.s
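
A self-join Query of this kind can be tried out on a small in-memory SQLite Database. The sketch below assumes a Table triples with Columns s, p, o, writes the join on the shared subject as t1.s = t2.s, and uses made-up sample Triples (student1, univ0 and so on are purely illustrative).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        # Made-up sample data for illustration only.
        ("student1", "type", "GradStudent"),
        ("student1", "bachelorsFrom", "univ0"),
        ("student2", "type", "UndergradStudent"),
        ("student2", "bachelorsFrom", "univ1"),
        ("student3", "type", "GradStudent"),  # no bachelorsFrom triple
    ],
)

# t1 matches the "type GradStudent" pattern, t2 the "bachelorsFrom" pattern;
# t1.s = t2.s forces both patterns to talk about the same subject.
query = """
    SELECT t2.o AS university
    FROM triples AS t1, triples AS t2
    WHERE t1.p = 'type'
      AND t1.o = 'GradStudent'
      AND t2.p = 'bachelorsFrom'
      AND t1.s = t2.s
"""
universities = [row[0] for row in conn.execute(query)]
print(universities)  # ['univ0']
```

Only student1 satisfies both Graph Patterns, so univ0 is the single result; student2 is not a graduate student and student3 has no bachelorsFrom Triple.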

In this way one can translate a SPARQL Query into a SQL Query. In this example there is only one self join, but a Query with many conditions and Graph Patterns needs many self joins: in a direct translation, each additional Triple Pattern adds another copy of the Table to the join. This leads to a blow-up in computation, and the Query takes a long time. Therefore it is recommended to use a better way to encode RDF Triples in Relational Databases, and correspondingly to translate the SPARQL Queries into SQL Queries.
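
How the number of self joins grows with the number of Triple Patterns can be made concrete with a small translator. The function below is a hypothetical sketch, not a full SPARQL-to-SQL compiler: patterns_to_sql, the tuple encoding of the patterns and the Table layout triples(s, p, o) are assumptions for illustration, and prefixes such as rdf: are left out.

```python
def patterns_to_sql(patterns, select_var):
    """Translate a list of triple patterns into SQL over triples(s, p, o).

    Each pattern is an (s, p, o) tuple of strings; a string starting
    with '?' is a variable, anything else is a constant.
    """
    first_seen = {}   # variable -> first column bound to it, e.g. "t1.s"
    conditions = []   # WHERE conditions, in order of appearance
    params = []       # constants, passed separately as SQL parameters
    for i, pattern in enumerate(patterns, start=1):
        for col, term in zip("spo", pattern):
            ref = f"t{i}.{col}"
            if term.startswith("?"):
                if term in first_seen:
                    # Repeated variable: becomes a join condition.
                    conditions.append(f"{first_seen[term]} = {ref}")
                else:
                    first_seen[term] = ref
            else:
                conditions.append(f"{ref} = ?")
                params.append(term)
    # One copy of the table per triple pattern.
    tables = ", ".join(f"triples AS t{i}" for i in range(1, len(patterns) + 1))
    sql = f"SELECT {first_seen[select_var]} FROM {tables}"
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql, params


patterns = [
    ("?v", "type", "GradStudent"),
    ("?v", "bachelorsFrom", "?university"),
]
sql, params = patterns_to_sql(patterns, "?university")
print(sql)
print(params)
```

In this sketch a Query with n Triple Patterns always yields n copies of the Table in the FROM clause, which is exactly the blow-up described above.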
