Agensgraph I: Setup and Modelling

Ajay
Published in Geek Culture
3 min read · Aug 1, 2021

Agensgraph is a Postgres-based open-source graph database developed by Bitnine. Recently, the technology has been adopted by Apache as the Apache AGE project, which is being developed as an extension of Postgres.

In this post, you will learn how Agensgraph's storage and query engine leverage Postgres. You will also get familiar with graph modeling through a few examples.

Storage

On top of the database-level segregation provided by Postgres, Agensgraph provides more granular segregation, which lets you create multiple graphs inside a single Postgres database.

To represent nodes and the relationships among them, Agensgraph leverages Postgres tables. Every node and edge is uniquely identified within a graph by a graphid identifier. Both nodes and edges have a JSONB properties column, which stores node- or edge-related attributes. Edges have two extra columns that hold the graphids of the starting and ending nodes.

A simple graph representation of a teacher <> student relationship can be captured as follows:
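The row layout described above can be mimicked in plain Python. This is only a sketch of the columns (an id, a properties map, and for edges a start and an end); the class names, the integer ids standing in for graphids, and the sample people are assumptions for illustration, not Agensgraph internals.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class VertexRow:
    id: int  # stand-in for a graphid
    properties: Dict[str, Any] = field(default_factory=dict)  # stand-in for the JSONB column


@dataclass
class EdgeRow:
    id: int  # stand-in for a graphid
    start: int  # id of the starting node
    end: int  # id of the ending node
    properties: Dict[str, Any] = field(default_factory=dict)


# One "table" per type of node or edge (hypothetical sample data).
teacher_table: List[VertexRow] = [VertexRow(1, {"name": "Alice"})]
student_table: List[VertexRow] = [
    VertexRow(2, {"name": "Bob"}),
    VertexRow(3, {"name": "Carol"}),
]
teaches_table: List[EdgeRow] = [
    EdgeRow(10, start=1, end=2, properties={"subject": "math"}),
    EdgeRow(11, start=1, end=3, properties={"subject": "math"}),
]


def students_of(teacher_id: int) -> List[str]:
    """Follow 'teaches' edges from a teacher and return the student names."""
    students_by_id = {row.id: row for row in student_table}
    return [
        students_by_id[edge.end].properties["name"]
        for edge in teaches_table
        if edge.start == teacher_id
    ]


print(students_of(1))
```

Looking up the edge table first and then the node table is, in spirit, the kind of join a relational engine would perform to traverse a relationship.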

Every type of node or edge created in a graph is stored in its own table, which is unique across the Postgres database.

Although nodes and edges are stored internally as Postgres relations, Agensgraph doesn’t allow them to be changed with SQL DML queries. Only Cypher queries can be used for data manipulation.

Query

According to the documentation, the graph can be queried with Cypher, SQL, or a combination of both. However, the Cypher standard is still being developed, so Agensgraph might not support all of its syntax.

In my opinion, the documentation is not thorough enough, so it helps to consult the documentation of another graph database that supports Cypher, for example Neo4j. Keep in mind that Agensgraph supports only a limited set of graph functions, so queries taken from such sources should be tried out before you rely on them. Another useful place to look for supported Cypher queries is the past issues page of Agensgraph.

Agensgraph uses Postgres joins extensively; we will go into depth about how it uses them in the next post.

Modeling

Graph modeling is the task of deciding how to represent a given dataset as a graph. The same dataset can be represented by different graph models, and the kind of queries you want to run influences which model you choose. Let’s say you want to represent a dataset of bank transactions and track the transactions of a user. Each row of the dataset has a payer, a payee, and the amount transacted between them. I could think of two ways to represent this data:

Approach 1:

Approach 2:

In this case, the second representation is better, as it has fewer nodes and edges and still serves our purpose of tracking a user’s transaction activity. Fewer nodes and edges also mean a smaller data size, which generally makes queries faster.
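To make the size argument concrete, here is a minimal sketch that counts nodes and edges for two hypothetical models of the same rows: one that turns every transaction into its own node linked to the payer and the payee, and one that stores each transaction as a single edge between two user nodes. Both models, the sample rows, and the function names are assumptions made for illustration.

```python
from typing import List, Set, Tuple

Transaction = Tuple[str, str, float]  # (payer, payee, amount)


def distinct_users(rows: List[Transaction]) -> Set[str]:
    """Collect every user that appears as a payer or a payee."""
    return {user for payer, payee, _ in rows for user in (payer, payee)}


def size_with_transaction_nodes(rows: List[Transaction]) -> Tuple[int, int]:
    """Hypothetical model A: (user)-[paid]->(transaction)-[paid_to]->(user)."""
    nodes = len(distinct_users(rows)) + len(rows)  # users plus one node per transaction
    edges = 2 * len(rows)  # two edges per transaction
    return nodes, edges


def size_with_transaction_edges(rows: List[Transaction]) -> Tuple[int, int]:
    """Hypothetical model B: (user)-[transferred {amount}]->(user)."""
    nodes = len(distinct_users(rows))  # users only
    edges = len(rows)  # one edge per transaction
    return nodes, edges


SAMPLE_ROWS: List[Transaction] = [
    ("alice", "bob", 50.0),
    ("bob", "carol", 20.0),
    ("alice", "carol", 75.0),
]

print("model A (nodes, edges):", size_with_transaction_nodes(SAMPLE_ROWS))
print("model B (nodes, edges):", size_with_transaction_edges(SAMPLE_ROWS))
```

In this sketch the gap grows with the dataset: every extra row adds one node and two edges to model A, but only one edge to model B.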

So, here are the simple steps you can follow for modeling a dataset:

  1. Understand query requirements.
  2. Go through the sample dataset.
  3. Figure out multiple ways a chunk of data can be represented with different nodes and edges.
  4. Take the representation with the fewest nodes and edges. Using different documentation sources, figure out which queries can fulfill the requirement.
  5. Try out the planned queries by ingesting sample data.
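The last step can be rehearsed before touching the database at all. The sketch below ingests a few sample rows into an in-memory edge list (each transaction stored as one edge between two users) and runs the planned "track a user's transactions" query against it. The sample rows and the helper names `ingest` and `activity` are hypothetical, and a real test would run the equivalent Cypher query in Agensgraph.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, List, Tuple

Transaction = Tuple[str, str, float]  # (payer, payee, amount)
EdgeIndex = DefaultDict[str, List[Tuple[str, float]]]


def ingest(rows: List[Transaction]) -> Tuple[EdgeIndex, EdgeIndex]:
    """Index each transaction edge by its payer and by its payee."""
    outgoing: EdgeIndex = defaultdict(list)
    incoming: EdgeIndex = defaultdict(list)
    for payer, payee, amount in rows:
        outgoing[payer].append((payee, amount))
        incoming[payee].append((payer, amount))
    return outgoing, incoming


def activity(user: str, outgoing: EdgeIndex, incoming: EdgeIndex) -> Dict[str, float]:
    """Summarize what a user sent and received across all transactions."""
    sent = outgoing.get(user, [])
    received = incoming.get(user, [])
    return {
        "sent": sum(amount for _, amount in sent),
        "received": sum(amount for _, amount in received),
        "count": len(sent) + len(received),
    }


SAMPLE_ROWS: List[Transaction] = [
    ("alice", "bob", 50.0),
    ("bob", "carol", 20.0),
    ("alice", "carol", 75.0),
]

OUTGOING, INCOMING = ingest(SAMPLE_ROWS)
print(activity("bob", OUTGOING, INCOMING))
```

If the planned query is awkward to express even on a toy structure like this, that is a hint to revisit the model in step 3.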

Installation Guide

You can run this script to install Agensgraph on Linux:
