Getting Started with Neo4j: Graph Theory and Data Connections

· 10 min read

Hi there 👋

This blog introduces readers to the world of graph databases and explores the fundamentals of building graphs.

History of Graphs

To get started with graph databases, it's important to understand what graphs are. Don't you agree?

Let’s travel back in time to the 1730s.

The story begins in Königsberg, Prussia. The river Pregel divided Königsberg into four sections, and these four sections were connected by seven bridges, as you can see in the image below.

7-bridges

Courtesy: graphacademy.neo4j.com

Enter the mathematician Leonhard Euler. He wondered whether there was a way to walk around the city crossing each of the seven bridges exactly once. To answer the question, he treated each land mass as a vertex and each of the seven bridges as an edge connecting two vertices.

He concluded that no such walk exists: any route that covers all seven bridges has to cross at least one of them twice.

Graph according to Leonhard Euler

Courtesy: graphacademy.neo4j.com

The blue dots represent the land masses, while the black lines connecting these dots represent the bridges linking the four areas. This abstraction laid the foundation for graph theory.
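Euler's argument boils down to counting how many bridges touch each land mass. Here is a minimal Python sketch of that check. The names north, south, island and east are made-up labels for the four areas (they are not taken from the figure), and the rule used is the standard one: a connected graph can be walked using every edge exactly once only if it has zero or two vertices with an odd number of edges.

```python
from collections import Counter

# The seven bridges as pairs of land masses.
# The four names are illustrative labels, not taken from the figure.
BRIDGES = [
    ("north", "island"), ("north", "island"),
    ("south", "island"), ("south", "island"),
    ("north", "east"), ("south", "east"),
    ("island", "east"),
]


def degrees(edges):
    """Count how many bridge ends touch each land mass."""
    count = Counter()
    for a, b in edges:
        count[a] += 1
        count[b] += 1
    return count


def has_euler_walk(edges):
    """Check the odd-degree rule (assumes the graph is connected)."""
    odd = [v for v, d in degrees(edges).items() if d % 2 == 1]
    return len(odd) in (0, 2)


print(dict(degrees(BRIDGES)))
print(has_euler_walk(BRIDGES))
```

Every land mass ends up with an odd number of bridges, so the check fails, which matches Euler's conclusion.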


What’s a graph?

A graph is a structure made up of vertices (the things) and edges (the connections between them). It is usually drawn as dots joined by lines.

Graph Database

A graph database is software that stores data as a network of interconnected nodes and relationships rather than in traditional tables.

Traditional Database

Graph Database (Courtesy: graphacademy.neo4j.com)

What’s Neo4j?

  • Neo4j is a native graph database management system designed to efficiently store, manage, and query data that is structured as a graph.
  • It models data as a graph and uses a storage and query engine built for graphs, rather than layering graph features on top of tables.

Elements of a Graph Database

The elements that constitute a graph are nodes and relationships.

What are Nodes?

Nodes represent objects, entities or things. They are also called vertices and are usually drawn as circles.

What are Relationships?

Relationships connect nodes. They are also called edges and are usually drawn as connecting lines. They define how any two nodes are related.

Below is a representation of the same.

Courtesy: graphacademy.neo4j.com

In the above image, the blue and green circles are nodes (Benjamin Melniker, Liam Neeson, Batman Begins) and the lines connecting these nodes (PRODUCED, ACTED_IN) are the relationships.

Now we know what nodes and relationships are, but there is more to a graph than just these two.

There are also labels and properties.

What’s a label and what’s a property?

LABELS: Labels are tags assigned to nodes that categorize them into groups or types. A single node can have multiple labels.

PROPERTIES: Properties are the attributes of a node or relationship. They are key-value pairs and can be added and removed as needed.

Let’s take an example to understand this better 😁

Courtesy: graphacademy.neo4j.com

The above image comes from the popular movie dataset that’s available on the Neo4j site.

  • The labels in the above image are → Person, Actor and Director.
  • The properties of the Movie node, for example, are → title: 'Forrest Gump' and released: 1994.
  • These properties give us additional information about the Movie node, i.e. the title of the movie and its year of release.
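To make the four building blocks concrete, here is a small Python sketch of a property graph. It is only a mental model, not how Neo4j stores data. The Node and Relationship classes are hypothetical, and the person's name, the ACTED_IN relationship and its roles property are illustrative values added for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set                                     # e.g. {"Person", "Actor"}
    properties: dict = field(default_factory=dict)  # key-value pairs


@dataclass
class Relationship:
    type: str                                       # e.g. "ACTED_IN"
    start: Node
    end: Node
    properties: dict = field(default_factory=dict)


# A node can carry several labels at once.
person = Node({"Person", "Actor"}, {"name": "Tom Hanks"})
movie = Node({"Movie"}, {"title": "Forrest Gump", "released": 1994})

# Relationships connect two nodes and can have properties of their own.
acted_in = Relationship("ACTED_IN", person, movie, {"roles": ["Forrest"]})

# Properties can be added and removed at any time.
person.properties["born"] = 1956
del person.properties["born"]

print(sorted(person.labels), movie.properties)
```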

What makes Neo4j more powerful than traditional databases?

  • INDEX-FREE ADJACENCY: Each node in Neo4j stores direct references to its relationships, so following a relationship to a neighbouring node does not need an index lookup. Indexes are still useful for finding the nodes where a query starts.

  • NATIVE GRAPH DB: Neo4j is built specifically for graph storage and processing, which makes it well suited to managing highly connected data.

  • SCHEMA-OPTIONAL: Neo4j doesn’t require a strict schema, giving flexibility to add structure as needed. You can adapt your data model over time without major changes.

  • CYPHER AS A QUERY LANGUAGE: Neo4j’s Cypher language is simple and visual, designed specifically for graph queries, allowing you to easily find patterns and relationships in your data.
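The idea behind index-free adjacency can be sketched in a few lines of Python. This is a conceptual model under simplifying assumptions, not Neo4j's actual storage format: the Node class and both helper functions are hypothetical. The first helper scans a table of edges, the way a lookup without direct references might work; the second simply follows the references held by the node.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.outgoing = []  # direct references to neighbouring nodes


def connect(a, b):
    """Record the connection on the node itself."""
    a.outgoing.append(b)


def neighbours_by_scan(edge_table, name):
    """Table-style lookup: inspect every row to find matching edges."""
    return [dst for src, dst in edge_table if src == name]


def neighbours_by_reference(node):
    """Index-free adjacency: follow the references the node already holds."""
    return [n.name for n in node.outgoing]


alice, bob, carol = Node("Alice"), Node("Bob"), Node("Carol")
connect(alice, bob)
connect(alice, carol)

edge_table = [("Alice", "Bob"), ("Bob", "Carol"), ("Alice", "Carol")]

print(neighbours_by_scan(edge_table, "Alice"))
print(neighbours_by_reference(alice))
```

Both calls give the same neighbours, but the cost of the second depends only on how many relationships Alice has, not on the size of the whole graph.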


Where there’s a database, there’s querying too.

We’ve understood what nodes, relationships and the other basic terms associated with graph databases are. Now let’s understand the basics of QUERYING!

Neo4j uses a powerful language called Cypher to create, update and retrieve data in the database.

Here are some of the basic notations anyone writing queries should know…

  • Nodes — ( )
  • Relationships — [ ]
  • Properties — { }
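To see how the three notations fit together, here is a small Python sketch that assembles a Cypher pattern as text. The node and rel helpers are hypothetical and exist only to show the shape of a pattern; in a real application you would pass values as query parameters rather than formatting them into the query string.

```python
import json


def node(var, label, **props):
    """Render a node pattern: round brackets, with properties in braces."""
    body = f"{var}:{label}"
    if props:
        pairs = ", ".join(f"{k}: {json.dumps(v)}" for k, v in props.items())
        body += f" {{{pairs}}}"
    return f"({body})"


def rel(rel_type):
    """Render a directed relationship pattern: square brackets and an arrow."""
    return f"-[:{rel_type}]->"


pattern = node("p", "Person", name="Alice") + rel("ACTED_IN") + node("m", "Movie")
print(pattern)
# prints: (p:Person {name: "Alice"})-[:ACTED_IN]->(m:Movie)
```

Read left to right, the pattern says: a Person node named Alice, connected by an ACTED_IN relationship to a Movie node.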

Before we look at how to write queries, we need to understand what a data model is.

Data Model

A data model represents how data is structured and connected within the graph database. “How data is structured” refers to the nodes and relationships in the graph database.

It is similar to a schema in relational databases.

Based on a data model, for instance the one given below, we can write queries that traverse nodes and relationships and return the desired results.

Courtesy: graphacademy.neo4j.com


Querying

Now let’s look at a set of basic commands to get started with querying.

  Creation of Nodes

    Two clauses create nodes in Neo4j: CREATE and MERGE.

Let’s look at the syntax and an example of each.

  • CREATE → Adds new nodes and relationships.

    Syntax:

    CREATE (n:Label {property1: "value1", property2: "value2"})

    Here ‘n’ is a variable that refers to the new node, and ‘Label’ is the label assigned to it.

    Example:

    CREATE (p:Person {name: "Alice", age: 30})

    In the above example, a node with the label ‘Person’ is created with two properties, ‘name’ and ‘age’. ‘p’ is the variable that refers to this node in the rest of the query.

  • MERGE → Ensures that the pattern exists. It creates one if it doesn't.

    Syntax:

    MERGE (n:Label {property1: "value1"})

    Here ‘n’ is a variable that refers to the matched or newly created node, and ‘Label’ is its label.

    Example:

    MERGE (p:Person {name: "Alice"})

    In the above example, a ‘Person’ node with the property name: "Alice" is created only if no such node already exists. ‘p’ is the variable that refers to the matched or newly created node.
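The difference between the two is easiest to see when the same statement runs twice. Below is a minimal Python model of that behaviour. The create and merge functions and the list-of-dicts store are hypothetical stand-ins, not Neo4j APIs; they only mimic the rule that MERGE looks for a node matching the whole pattern before creating one.

```python
def create(store, label, **props):
    """Like CREATE: always adds a new node."""
    node = {"label": label, "props": dict(props)}
    store.append(node)
    return node


def merge(store, label, **props):
    """Like MERGE: reuse a node matching the label and every given
    property, otherwise create a new one."""
    for node in store:
        same_label = node["label"] == label
        same_props = all(node["props"].get(k) == v for k, v in props.items())
        if same_label and same_props:
            return node
    return create(store, label, **props)


created = []
create(created, "Person", name="Alice", age=30)
create(created, "Person", name="Alice", age=30)
print(len(created))  # two identical nodes

merged = []
merge(merged, "Person", name="Alice")
merge(merged, "Person", name="Alice")
print(len(merged))   # still one node

# The whole pattern has to match: an extra property that no existing
# node carries leads to a new node.
merge(merged, "Person", name="Alice", age=30)
print(len(merged))
```

The last call is a common surprise: because no existing node has both name and age, the model adds a second Alice, so it pays to merge on the properties that identify a node and set the rest afterwards.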

  1. RETURN → Specifies what to return from the query.

    A query that only reads data has to end with a RETURN clause. In a query that writes data it is optional, but it helps us display the results and confirm that the intended nodes or relationships were created.

    Example:

    CREATE (p:Person {name: "Alice", age: 30})
    RETURN p

    This query lets us confirm that a ‘Person’ node with name: “Alice” and age: 30 has been successfully created.

  2. MATCH → Used to search for patterns in the graph. It finds nodes and relationships in the graph database. Combined with RETURN, it plays a role similar to the SELECT statement in SQL.

    MATCH (p:Person {name: "Alice"})
    RETURN p

    This query finds any 'Person' node with the property name set to 'Alice'. The variable 'p' represents the node, and RETURN p shows the details of this node.

  3. WHERE → Adds filtering criteria to queries based on the conditions.

    Example-1:

    MATCH (p:Person)
    WHERE p.name = "Alice"
    RETURN p

    This matches all 'Person' nodes but returns only those where the name property is 'Alice'.

    Example-2:

    MATCH (p:Person)
    WHERE p.age > 25 AND p.city = "New York"
    RETURN p.name, p.age, p.city

    This matches 'Person' nodes where the age is greater than 25 and the city is “New York”, and returns their name, age and city.

    The WHERE clause is powerful for narrowing down query results based on various conditions, making queries more specific and efficient.

  4. SET → It is used to add or update properties and labels on nodes or relationships.

    Example-1: Updating a property.

    MATCH (p:Person {name: "Alice"})
    SET p.age = 31
    RETURN p

    This finds the Person node with name: 'Alice' and updates the age property to 31.

    Example-2: Adding a new property.

    MATCH (p:Person {name: "Alice"})
    SET p.city = "New York"
    RETURN p

    This will add a city property to Alice’s node with the value 'New York'.

  5. DELETE → Removes nodes or relationships.

    Example:

    MATCH (p:Person {name: "Alice"})
    DELETE p

    This finds the Person node with name: 'Alice' and deletes it.
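One thing to keep in mind: Neo4j refuses a plain DELETE on a node that still has relationships. To remove such a node together with its relationships, Cypher offers DETACH DELETE. The Python sketch below models that rule; the Graph class and its method names are hypothetical and only illustrate the behaviour.

```python
class Graph:
    def __init__(self):
        self.nodes = set()
        self.relationships = set()  # (start, type, end) triples

    def add_relationship(self, start, rel_type, end):
        self.nodes.update({start, end})
        self.relationships.add((start, rel_type, end))

    def delete(self, node):
        """Like DELETE: only allowed once the node has no relationships."""
        if any(node in (start, end) for start, _, end in self.relationships):
            raise ValueError(f"{node} still has relationships")
        self.nodes.discard(node)

    def detach_delete(self, node):
        """Like DETACH DELETE: drop the node's relationships, then the node."""
        self.relationships = {
            r for r in self.relationships if node not in (r[0], r[2])
        }
        self.nodes.discard(node)


g = Graph()
g.add_relationship("Alice", "KNOWS", "Bob")

try:
    g.delete("Alice")
except ValueError as err:
    print("refused:", err)

g.detach_delete("Alice")
print(g.nodes, g.relationships)
```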


Here’s a quick summary of what you’ve learnt in this blog.

    • History of graphs - How the concept of graphs came into existence.

    • What’s a Graph and Graph Databases - Graphs consist of vertices and edges, and graph databases store data as a network of nodes and relationships.

    • Elements of a Graph Database - Nodes, relationships, labels and properties.

      Graph Database elements
    • Features of Neo4j that make it more powerful - Index-free adjacency, native graph storage, an optional schema and the Cypher query language.

    • Basic commands used in Querying - CREATE, MERGE, RETURN, MATCH, WHERE, SET and DELETE.

You've made it to the end of the blog 😎