DBPL Keynote: The Gremlin Graph Traversal Machine and Language
Gremlin is a graph traversal machine and language designed, developed, and distributed by the Apache TinkerPop project. Gremlin, as a graph traversal machine, is composed of three interacting components: a graph G, a traversal Ψ, and a set of traversers T. The traversers move about the graph according to the instructions specified in the traversal, and the result of the computation is the ultimate locations of all halted traversers. A Gremlin machine can be executed over any supporting graph computing system, such as an OLTP graph database or an OLAP graph processor. Gremlin, as a graph traversal language, is a functional language implemented in the user’s native programming language and is used to define the Ψ of a Gremlin machine. This article provides a mathematical description of Gremlin and details its automaton and functional properties. These properties enable Gremlin to naturally support imperative and declarative querying, host language agnosticism, user-defined domain-specific languages, an extensible compiler/optimizer, single- and multi-machine execution models, hybrid depth- and breadth-first evaluation, as well as the existence of a Universal Gremlin Machine and its respective entailments.
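The interplay of G, Ψ, and T can be illustrated with a toy model. The following is a minimal Python sketch and not TinkerPop's implementation: the graph is a plain adjacency dictionary, each traverser is reduced to its current location, and the names `out`, `run`, `G`, and `psi` as well as the sample data are assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Toy graph G: each vertex maps to a list of (edge_label, target_vertex) pairs.
Graph = Dict[str, List[Tuple[str, str]]]
# A step maps one traverser location to zero or more next locations.
Step = Callable[[Graph, str], Iterable[str]]


def out(label: str) -> Step:
    """Hypothetical step: follow outgoing edges that carry the given label."""
    def step(graph: Graph, vertex: str) -> Iterable[str]:
        return [target for edge_label, target in graph.get(vertex, [])
                if edge_label == label]
    return step


def run(graph: Graph, traversal: List[Step], starts: Iterable[str]) -> List[str]:
    """Apply each step of the traversal to every traverser in turn.

    A traverser with no matching edge produces no successors and drops out.
    The return value holds the locations of the traversers that passed
    through every step, i.e. the halted traversers.
    """
    traversers = list(starts)
    for step in traversal:
        traversers = [nxt for loc in traversers for nxt in step(graph, loc)]
    return traversers


# Illustrative data only.
G: Graph = {
    "marko": [("knows", "josh"), ("knows", "vadas"), ("created", "lop")],
    "josh": [("created", "lop"), ("created", "ripple")],
    "vadas": [],
}
psi = [out("knows"), out("created")]  # the traversal Ψ as a list of steps
halted = run(G, psi, ["marko"])       # T starts with one traverser at "marko"
print(halted)  # ['lop', 'ripple']
```

This sketch advances all traversers one step at a time, which is a purely breadth-first evaluation; it does not attempt to model the hybrid depth- and breadth-first evaluation, the compiler/optimizer, or the multi-machine execution mentioned above.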
Dr. Marko A. Rodriguez focuses his time and energy on advancing the state of the art of graph computing. Marko is a co-founder of Apache TinkerPop, a graph processing framework leveraged by graph system vendors. TinkerPop’s graph traversal language is called Gremlin, and it operates over both OLTP graph databases and OLAP graph processors. Marko is currently a Director of Engineering at DataStax. Previously, he was the founder and CEO of the graph computing firm Aurelius (acquired by DataStax) and a Director’s Fellow at the Center for Nonlinear Studies at the Los Alamos National Laboratory. Dr. Rodriguez lives in a small farming village north of Santa Fe, New Mexico.