Graph databases and RDF stores

Involved members: 

Graph databases can be seen as a special case of NoSQL databases; however, they usually have specific implementation techniques and application domains that differ from those of other NoSQL databases. For example, graph databases play an important role in graph analytics for domains such as social media, the life sciences, telecommunications and crime fighting.

On the other hand, the graph paradigm is a very intuitive way of presenting both schemas and data, and many queries that are hard to represent in a textual query language can be readily understood, even by non-experts, when represented as a graph pattern.

In addition, semantic technologies such as RDF are to a large extent based on the storage techniques of graph databases, and RDF stores can therefore be seen as special cases of graph databases.

The research in WISE considers the following topics:

  • Schema languages for graph-based data models: There are several proposals for languages that allow the precise definition of the content of a graph database, but there is not yet a clear winner. The research focuses on investigating the properties of the different proposals and their trade-offs. In addition, new constructs are investigated with respect to their expressive power, their computational effectiveness for typing and consistency checking, and their implementability.
  • Indexing structures for graph databases: Since graph queries and graph data often differ from relational queries and relational data, it is possible to design tailored indexing structures for graph databases. The research aims to compare existing proposals and to develop new indexing structures for specific types of graph queries, such as graph patterns and path expressions.
  • Partitioning and distributing graph data: A particular challenge for graph databases that achieve scalability by distributing the data and the query processing over multiple servers is determining which parts of the graph are stored where. This problem is similar to choosing the right types of indexes, since the best choice may depend on the type of data access that is expected.
  • Efficiently executing Regular Path Queries: A very basic type of graph query is the regular path query (RPQ), which finds all pairs of nodes between which there exists a path whose string of edge labels is in the language of a given regular expression over the alphabet of all edge labels. Such queries are at the core of almost every serious graph query language, and it is therefore worth the effort to find a highly optimized implementation. The research focuses on finding optimal algorithms for executing and optimizing such queries, specifically when the data is distributed over multiple servers.
  • Graph query languages: There are many proposals for a standard language for querying graph data, but there is not yet a clear winner (except perhaps for RDF data, where SPARQL seems to be the commonly accepted standard). The research focuses on comparing the proposals and the proposed constructs, and also on investigating their relationship with languages for other data models such as JSON, XML and the relational data model. A subject of particular interest is the problem of developing a typing regime for such languages in the context of a specified schema.
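
To make the notion of a regular path query from the list above more concrete, the following is a minimal single-machine sketch in Python. It is not the approach studied in the research, only a textbook baseline: the regular expression is assumed to be given already as a nondeterministic finite automaton without epsilon transitions, and the query is answered by a breadth-first search over pairs of (graph node, automaton state). The function name evaluate_rpq, the example graph and the hand-built automaton for the expression knows* worksAt are all illustrative assumptions.

```python
from collections import defaultdict, deque


def evaluate_rpq(edges, nfa_transitions, nfa_start, nfa_accepting):
    """Return all (source, target) node pairs connected by a path whose
    sequence of edge labels is accepted by the given NFA.

    edges:           iterable of (source, label, target) triples
    nfa_transitions: dict mapping (state, label) -> set of next states
                     (assumption: no epsilon transitions)
    """
    # Adjacency list: node -> list of (label, successor).
    adjacency = defaultdict(list)
    nodes = set()
    for source, label, target in edges:
        adjacency[source].append((label, target))
        nodes.update((source, target))

    answers = set()
    for origin in nodes:
        # Breadth-first search over (graph node, automaton state) pairs.
        start = (origin, nfa_start)
        seen = {start}
        queue = deque([start])
        while queue:
            node, state = queue.popleft()
            if state in nfa_accepting:
                answers.add((origin, node))
            for label, successor in adjacency[node]:
                for next_state in nfa_transitions.get((state, label), ()):
                    pair = (successor, next_state)
                    if pair not in seen:
                        seen.add(pair)
                        queue.append(pair)
    return answers


# Hypothetical example graph with labelled edges.
EDGES = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("carol", "worksAt", "acme"),
    ("alice", "worksAt", "initech"),
]

# Hand-built NFA for the expression: knows* worksAt
# State 0 is the start state, state 1 is the only accepting state.
TRANSITIONS = {
    (0, "knows"): {0},
    (0, "worksAt"): {1},
}

if __name__ == "__main__":
    for pair in sorted(evaluate_rpq(EDGES, TRANSITIONS, 0, {1})):
        print(pair)
```

This sketch restarts the search from every node and keeps the whole graph in memory, so it says nothing about the distributed setting mentioned above; it is only meant to show what a correct answer to such a query looks like.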