DatalogRA: Datalog with recursive aggregation in the Spark RDD model
Distributed computations on graphs are becoming increasingly important with the emergence of large graphs, such as social networks and the Web, that contain huge amounts of useful information. Such computations can be easily distributed with specialised frameworks like Hadoop with MapReduce, Giraph/Pregel or GraphLab. Yet solutions that are declarative and query-like, but at the same time efficient, are lacking. Programmers must code all computations by hand and manually optimise each individual program.
This paper presents the implementation of a tool that extends a distributed computation platform, Apache Spark, with the capability to execute queries written in a variant of the declarative query language Datalog, extended specifically to better support graph algorithms.
This approach makes it possible to express graph algorithms in a declarative query language, accessible to a broader group of users than typical programming languages, and execute them on an existing infrastructure for distributed computations.
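To give a feel for what recursive aggregation means in a graph query, the following minimal sketch evaluates a single-source shortest-distance rule to a fixpoint in plain Python. It is only an illustration under stated assumptions: the rule notation in the comments is informal pseudo-notation rather than the DatalogRA syntax, the function name `shortest_distances` and the sample graph are hypothetical, and the loop runs on a single machine rather than on Spark RDDs as the tool does.

```python
# Illustrative sketch only: neither the DatalogRA syntax nor its Spark-based
# implementation. The rules evaluated below, in informal pseudo-notation:
#   dist(source, 0).
#   dist(v, min(d + w)) :- dist(u, d), edge(u, v, w).


def shortest_distances(edges, source):
    """Apply the recursive min-aggregation rule until nothing changes.

    edges: list of (u, v, weight) facts; weights are assumed non-negative.
    Returns a dict mapping each reachable vertex to its minimum distance.
    """
    dist = {source: 0}  # base fact: the source is at distance 0
    changed = True
    while changed:  # repeat until a fixpoint is reached
        changed = False
        for u, v, w in edges:
            if u not in dist:
                continue  # no distance derived for u yet
            candidate = dist[u] + w
            # The min aggregate keeps only the smallest value per vertex.
            if candidate < dist.get(v, float("inf")):
                dist[v] = candidate
                changed = True
    return dist


# Hypothetical sample graph, given as weighted edge facts.
edges = [("a", "b", 4), ("a", "c", 1), ("c", "b", 2), ("b", "d", 5)]
result = shortest_distances(edges, "a")
print(result)
```

The two commented rules carry the whole algorithm; the loop is just one simple way to evaluate them. In a declarative setting, choosing and optimising that evaluation strategy is left to the system rather than to the programmer.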
Publication Reference
Marek Rogala, Jan Hidders, Jacek Sroka: DatalogRA: datalog with recursive aggregation in the spark RDD model. GRADES 2016: 3