Getting Started with Neo4J - Part 3


Bhaskar S 12/09/2017


Overview

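In Part-2 of this series, we briefly explored Cypher, the SQL-like query language for Neo4J, dabbling with the CREATE clause and the various forms of the MATCH clause, via the web-browser interface as well as the command-line interface. In this part, we continue with Cypher, covering paging and counting of results, the UNION clause, deleting nodes, and importing data from CSV files.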

Hands-on with Cypher

Let us continue from where we left off and wrap up a few loose ends.

To query the property values of the keys uid and email for all the nodes with the label User, by skipping the first 2 rows and limiting the output to just 3 rows, execute the following MATCH query:

MATCH (u:User) RETURN u.uid, u.email SKIP 2 LIMIT 3;

On executing the above query, the results should look like the one shown below:

Output.1

+---------------------------------------+
| u.uid     | u.email                   |
+---------------------------------------+
| "charlie" | "charlie.brown@earth.com" |
| "david"   | "david.black@earth.com"   |
| "frank"   | "frank.grey@earth.com"    |
+---------------------------------------+

3 rows available after 86 ms, consumed after another 34 ms
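
The effect of SKIP and LIMIT can be pictured with a few lines of plain Python. This is only a sketch of the semantics, not how Neo4J executes the query; the helper name skip_limit is made up for illustration, and the uid values are listed in the order this database happened to return them. Without an ORDER BY clause, Cypher does not guarantee any particular row order, so the rows that get skipped may differ on another system.

```python
# Plain-Python model of "SKIP 2 LIMIT 3" (illustrative only).
from itertools import islice

# uid values of the existing User nodes, in the order returned by this database
uids = ["alice", "bob", "charlie", "david", "frank", "gary", "harry"]


def skip_limit(rows, skip, limit):
    """Drop the first `skip` rows, then keep at most `limit` of the rest."""
    return list(islice(rows, skip, skip + limit))


page = skip_limit(uids, skip=2, limit=3)
print(page)  # the same three uid values as in Output.1
```

Skipping past the end simply yields fewer rows (or none), which is also how SKIP and LIMIT behave in Cypher.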

To count the nodes with the label User whose property state has the value nj, execute the following MATCH query:

MATCH (u:User) WHERE u.state = 'nj' RETURN COUNT(u.state);

On executing the above query, the results should look like the one shown below:

Output.2

+----------------+
| COUNT(u.state) |
+----------------+
| 3              |
+----------------+

1 row available after 200 ms, consumed after another 0 ms

As can be seen from Output.2 above, the column name for the resulting count is the invoked function COUNT(u.state). To assign a user-friendly column name to the resulting count, use the AS keyword. The following MATCH query also uses COUNT(*) in place of COUNT(u.state), which yields the same count here:

MATCH (u:User) WHERE u.state = 'nj' RETURN COUNT(*) AS Count;

On executing the above query, the results should look like the one shown below:

Output.3

+-------+
| Count |
+-------+
| 3     |
+-------+

1 row available after 41 ms, consumed after another 0 ms
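
The two queries above use COUNT(u.state) and COUNT(*) interchangeably. In Cypher, COUNT(*) counts rows, while COUNT(expression) counts only the rows where the expression is not null. They agree here because the WHERE clause already guarantees that u.state is set. The following plain-Python sketch illustrates the difference; the rows are hypothetical (not taken from our graph), and None stands in for Cypher's null.

```python
# Hypothetical rows; the third one has no state property (None models null).
rows = [
    {"uid": "u1", "state": "nj"},
    {"uid": "u2", "state": "nj"},
    {"uid": "u3", "state": None},
]


def count_star(rows):
    """Model of COUNT(*): every row counts."""
    return len(rows)


def count_expr(rows, key):
    """Model of COUNT(expression): rows where the value is null are skipped."""
    return sum(1 for row in rows if row.get(key) is not None)


print(count_star(rows))           # counts all three rows
print(count_expr(rows, "state"))  # counts only the two rows with a state
```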

One can combine the results from multiple MATCH queries to appear as one using the UNION clause, which also removes duplicate rows from the combined result. One constraint is that all the queries being combined must return the same column names.

The following is an example querying the property values of the keys uid and email from two separate MATCH queries:

MATCH (u1:User) -[g1:BELONGS_TO]-> () WHERE u1.state = 'tx' AND g1.context = 'pm' RETURN u1.uid AS Uid, u1.email AS Email UNION MATCH (u2:User) -[g2:BELONGS_TO]-> () WHERE g2.context = 'pm' RETURN u2.uid AS Uid, u2.email AS Email;

On executing the above query, the results should look like the one shown below:

Output.4

+---------------------------------------+
| Uid       | Email                     |
+---------------------------------------+
| "gary"    | "gary.white@earth.com"    |
| "alice"   | "alice.pink@earth.com"    |
| "bob"     | "bob.green@earth.com"     |
| "charlie" | "charlie.brown@earth.com" |
+---------------------------------------+

4 rows available after 82 ms, consumed after another 7 ms
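
Notice that every row of the first MATCH query also satisfies the second one, so the combined result is the same as that of the second query alone: UNION drops the duplicates, whereas UNION ALL would keep them. The plain-Python sketch below models that behavior. The function names are made up for illustration, and the content of the first branch (only gary) is an assumption, since the result of that query on its own is not shown in this article.

```python
# Plain-Python model of UNION versus UNION ALL (illustrative only).

# Assumption: the first branch (state 'tx' and context 'pm') returns only gary.
first = [("gary", "gary.white@earth.com")]

# Second branch (context 'pm'): the four rows seen in Output.4.
second = [
    ("gary", "gary.white@earth.com"),
    ("alice", "alice.pink@earth.com"),
    ("bob", "bob.green@earth.com"),
    ("charlie", "charlie.brown@earth.com"),
]


def union_all(*branches):
    """Model of UNION ALL: concatenate the rows and keep duplicates."""
    combined = []
    for branch in branches:
        combined.extend(branch)
    return combined


def union(*branches):
    """Model of UNION: concatenate the rows and drop duplicate rows."""
    seen = set()
    distinct = []
    for row in union_all(*branches):
        if row not in seen:
            seen.add(row)
            distinct.append(row)
    return distinct


print(len(union_all(first, second)))  # gary appears twice
print(len(union(first, second)))      # gary appears once
```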

Let us now create an additional node with the label User by executing the following CREATE clause:

CREATE (Zion:User {name:'Zion Red', uid: 'zion', email: 'zion.red@mars.com', state: 'tx'});

On executing the above query, the results should look like the one shown below:

Output.5

0 rows available after 147 ms, consumed after another 0 ms
Added 1 nodes, Set 4 properties, Added 1 labels

Let us now relate the above created User node to the Group node called ProjectManagement by executing the following CREATE clause:

MATCH (u:User) WHERE u.uid = 'zion' CREATE (u) -[:BELONGS_TO {context: 'pm'}]-> (ProjectManagement);

On executing the above query, the results should look like the one shown below:

Output.6

0 rows available after 57 ms, consumed after another 0 ms
Added 1 nodes, Created 1 relationships, Set 1 properties

Note that Output.6 above reports "Added 1 nodes". The name ProjectManagement in this statement is a new variable that is not bound by any MATCH clause, so CREATE makes a brand-new empty node and attaches the relationship to it, rather than to the existing Group node. To link to the existing Group node, one must first match it by a property (for example its name, as the relationships.csv import later in this article does) and then use that bound variable in the CREATE pattern.

Verify that the relationship was created by executing the following MATCH query (the pattern matches a BELONGS_TO relationship to any target node):

MATCH (u) -[:BELONGS_TO {context: 'pm'}]-> () RETURN u.name;

On executing the above query, the results should look like the one shown below:

Output.7

+-----------------+
| u.name          |
+-----------------+
| "Gary White"    |
| "Charlie Brown" |
| "Bob Green"     |
| "Alice Pink"    |
| "Zion Red"      |
+-----------------+

5 rows available after 29 ms, consumed after another 3 ms

To delete the node with the label User whose property uid has the value zion, execute the following MATCH query:

MATCH (u:User) WHERE u.uid = 'zion' DELETE u;

On executing the above query, the results should look like the one shown below:

Output.8

Cannot delete node<9>, because it still has relationships. To delete this node, you must first delete its relationships.

OOPS !!! What happened here ???

Remember from above that we created a BELONGS_TO relationship from this User node. One needs to delete all the relationship(s) of a node before the node itself can be deleted.

To delete the User node for zion along with all its relationships to other nodes, execute the following MATCH query:

MATCH (u:User) -[r]- () WHERE u.uid = 'zion' DELETE r, u;

On executing the above query, the results should look like the one shown below:

Output.9

0 rows available after 34 ms, consumed after another 0 ms
Deleted 1 nodes, Deleted 1 relationships
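
The rule behind Output.8 and Output.9 can be modeled with a tiny in-memory graph in plain Python. This is only a sketch of the rule, not Neo4J's implementation; the Graph class and its method names are hypothetical, and "target" is just a placeholder name for the node at the other end of the relationship.

```python
# Toy model: a node cannot be deleted while relationships are attached to it.
class Graph:
    def __init__(self):
        self.nodes = set()
        self.rels = set()  # each relationship is a (start, type, end) tuple

    def add_node(self, node):
        self.nodes.add(node)

    def relate(self, start, rel_type, end):
        self.rels.add((start, rel_type, end))

    def delete_node(self, node):
        """Model of DELETE u: refuse if any relationship touches the node."""
        if any(node in (start, end) for start, _, end in self.rels):
            raise ValueError(f"cannot delete {node!r}: it still has relationships")
        self.nodes.discard(node)

    def delete_with_relationships(self, node):
        """Model of DELETE r, u: remove the relationships first, then the node."""
        self.rels = {rel for rel in self.rels if node not in (rel[0], rel[2])}
        self.nodes.discard(node)


graph = Graph()
graph.add_node("zion")
graph.add_node("target")
graph.relate("zion", "BELONGS_TO", "target")

try:
    graph.delete_node("zion")
except ValueError as err:
    print(err)  # refused, like Output.8

graph.delete_with_relationships("zion")
print(sorted(graph.nodes))  # the node at the other end is left in place
```

As in Output.9, only one node is deleted; the node at the other end of the relationship stays in the graph. Cypher also provides the DETACH DELETE clause, which deletes a node together with its relationships in one step.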

Oftentimes there is a need to bulk load data from external file(s). Neo4J provides facilities to load data from CSV files.

Since we are using a docker instance of Neo4J, we need to make a few adjustments to our environment in order to import data from CSV files.

Exit the cypher-shell and shut down the running docker instance.

Create an additional directory called import under /home/alice/Neo4J by executing the following command:

mkdir -p /home/alice/Neo4J/import

To launch a new docker instance for Neo4J with the import directory mounted, execute the following command:

docker run --rm --name neo4j --publish=7474:7474 --publish=7687:7687 --volume=$HOME/Neo4J/data:/data --volume=$HOME/Neo4J/logs:/logs --volume=$HOME/Neo4J/conf:/conf --volume=$HOME/Neo4J/import:/import neo4j:3.3.0

The following should be the typical output:

Output.10

Active database: demo_graph.db
Directories in use:
  home:         /var/lib/neo4j
  config:       /var/lib/neo4j/conf
  logs:         /logs
  plugins:      /var/lib/neo4j/plugins
  import:       /import
  data:         /data
  certificates: /var/lib/neo4j/certificates
  run:          /var/lib/neo4j/run
Starting Neo4j.
2017-12-09 20:49:53.944+0000 WARN  Unknown config option: causal_clustering.discovery_listen_address
2017-12-09 20:49:53.949+0000 WARN  Unknown config option: causal_clustering.raft_advertised_address
2017-12-09 20:49:53.950+0000 WARN  Unknown config option: causal_clustering.raft_listen_address
2017-12-09 20:49:53.950+0000 WARN  Unknown config option: ha.host.coordination
2017-12-09 20:49:53.950+0000 WARN  Unknown config option: causal_clustering.transaction_advertised_address
2017-12-09 20:49:53.950+0000 WARN  Unknown config option: causal_clustering.discovery_advertised_address
2017-12-09 20:49:53.951+0000 WARN  Unknown config option: ha.host.data
2017-12-09 20:49:53.951+0000 WARN  Unknown config option: causal_clustering.transaction_listen_address
2017-12-09 20:49:53.967+0000 INFO  ======== Neo4j 3.3.0 ========
2017-12-09 20:49:53.993+0000 INFO  Starting...
2017-12-09 20:49:55.404+0000 INFO  Bolt enabled on 0.0.0.0:7687.
2017-12-09 20:49:58.626+0000 INFO  Started.
2017-12-09 20:49:59.526+0000 INFO  Remote interface available at http://localhost:7474/

Re-launch the cypher-shell by executing the following command:

docker exec -ti neo4j bin/cypher-shell -u neo4j

The following is a sample CSV file called users.csv that contains rows for creating new User nodes:

users.csv
Full_Name,uid,Email,State,Type
Jane Red,jane,jane.red@earth.com,ny,employee
Kyle Orange,kyle,kyle.orange@earth.com,nj,consultant
Lynda Pink,lynda,lynda.pink@earth.com,tx,employee
Mary Green,mary,mary.green@earth.com,ny,employee
Nathan Brown,nathan,nathan.brown@earth.com,tx,consultant

Similarly, the following is a sample CSV file called relationships.csv that contains rows for creating relationships between the User nodes and Group nodes:

relationships.csv
uid,Context,Group_Name
jane,architect,Core Engineering
kyle,developer,Core Engineering
lynda,pm,Project Management
mary,pm,Project Management
mary,manager,Core Engineering
nathan,developer,Core Engineering

Copy the above two sample CSV files users.csv and relationships.csv to the directory /home/alice/Neo4J/import.

To access any CSV file in Neo4J, use the Cypher clause LOAD CSV.

To validate that we are able to access the users.csv file in cypher-shell, execute the following LOAD CSV statement:

LOAD CSV WITH HEADERS FROM "file:///users.csv" AS row RETURN COUNT(*);

On executing the above query, the results should look like the one shown below:

Output.11

+----------+
| COUNT(*) |
+----------+
| 5        |
+----------+

1 row available after 98 ms, consumed after another 98 ms
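
With the WITH HEADERS option, LOAD CSV treats the first line of the file as column names and exposes each remaining line as a map keyed by those names, with every value read as a string. Python's csv.DictReader behaves in a comparable way, so it makes a convenient stand-in for checking a file before importing it. The sketch below is only an analogy, and it embeds the content of users.csv as a string so that it runs on its own.

```python
# Rough Python analogy of LOAD CSV WITH HEADERS using the users.csv content.
import csv
import io

USERS_CSV = """\
Full_Name,uid,Email,State,Type
Jane Red,jane,jane.red@earth.com,ny,employee
Kyle Orange,kyle,kyle.orange@earth.com,nj,consultant
Lynda Pink,lynda,lynda.pink@earth.com,tx,employee
Mary Green,mary,mary.green@earth.com,ny,employee
Nathan Brown,nathan,nathan.brown@earth.com,tx,consultant
"""

# Each data line becomes a dict keyed by the header line, like `row` in Cypher.
rows = list(csv.DictReader(io.StringIO(USERS_CSV)))

print(len(rows))             # number of data rows, as in Output.11
print(rows[0]["Full_Name"])  # comparable to row.Full_Name in Cypher
```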

To list the existing User nodes, execute the following MATCH query:

MATCH (u:User) RETURN u.uid;

On executing the above query, the results should look like the one shown below:

Output.12

+-----------+
| u.uid     |
+-----------+
| "alice"   |
| "bob"     |
| "charlie" |
| "david"   |
| "frank"   |
| "gary"    |
| "harry"   |
+-----------+

7 rows available after 56 ms, consumed after another 13 ms

To import data from the users.csv file as new User nodes (for each row), execute the following LOAD CSV statement:

LOAD CSV WITH HEADERS FROM "file:///users.csv" AS row CREATE (:User {name: row.Full_Name, uid: row.uid, email: row.Email, state: row.State, type: row.Type});

On executing the above statement, the results should look like the one shown below:

Output.13

0 rows available after 211 ms, consumed after another 0 ms
Added 5 nodes, Set 25 properties, Added 5 labels

To verify the list of all the User nodes, execute the following MATCH query:

MATCH (u:User) RETURN u.uid;

On executing the above query, the results should look like the one shown below:

Output.14

+-----------+
| u.uid     |
+-----------+
| "alice"   |
| "bob"     |
| "charlie" |
| "david"   |
| "frank"   |
| "gary"    |
| "harry"   |
| "jane"    |
| "kyle"    |
| "lynda"   |
| "mary"    |
| "nathan"  |
+-----------+

12 rows available after 9 ms, consumed after another 5 ms

VOILA !!! We have successfully imported data from the users.csv file.

To import data from the relationships.csv file for creating relationships of type BELONGS_TO for the newly created User nodes, execute the following LOAD CSV statement:

LOAD CSV WITH HEADERS FROM "file:///relationships.csv" AS row MATCH (u:User) WHERE u.uid = row.uid MATCH (g:Group) WHERE g.name = row.Group_Name CREATE (u) -[:BELONGS_TO {context: row.Context}]-> (g);

On executing the above statement, the results should look like the one shown below:

Output.15

0 rows available after 371 ms, consumed after another 0 ms
Created 6 relationships, Set 6 properties
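
The statement above creates a relationship only for the rows where both MATCH clauses find a node; a row whose uid or Group_Name matches nothing is silently skipped. The plain-Python sketch below models that row-by-row lookup. It is an illustration under stated assumptions rather than Neo4J's execution plan: the function name import_relationships is made up, and the sets of uid values and group names are taken from the outputs in this article.

```python
# Plain-Python model of the relationships.csv import (illustrative only).
import csv
import io

RELATIONSHIPS_CSV = """\
uid,Context,Group_Name
jane,architect,Core Engineering
kyle,developer,Core Engineering
lynda,pm,Project Management
mary,pm,Project Management
mary,manager,Core Engineering
nathan,developer,Core Engineering
"""

# uid values of the User nodes and names of the Group nodes seen in the outputs
users = {"alice", "bob", "charlie", "david", "frank", "gary", "harry",
         "jane", "kyle", "lynda", "mary", "nathan"}
groups = {"Project Management", "Core Engineering"}


def import_relationships(csv_text, users, groups):
    """Return (uid, context, group) for each row where both lookups succeed."""
    created = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Models the two MATCH clauses: both must find a node for the row.
        if row["uid"] in users and row["Group_Name"] in groups:
            created.append((row["uid"], row["Context"], row["Group_Name"]))
    return created


created = import_relationships(RELATIONSHIPS_CSV, users, groups)
print(len(created))  # one relationship per row, as in Output.15
```

Comparing the number of created relationships against the number of rows in the file is a quick way to spot rows that were skipped because of a misspelled uid or group name.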

To verify the list of all the relationships of the type BELONGS_TO, execute the following MATCH query:

MATCH (u:User) -[r:BELONGS_TO]-> (g:Group) RETURN u.uid, r.context, g.name;

On executing the above query, the results should look like the one shown below:

Output.16

+------------------------------------------------+
| u.uid     | r.context   | g.name               |
+------------------------------------------------+
| "mary"    | "pm"        | "Project Management" |
| "lynda"   | "pm"        | "Project Management" |
| "gary"    | "pm"        | "Project Management" |
| "charlie" | "pm"        | "Project Management" |
| "bob"     | "pm"        | "Project Management" |
| "alice"   | "pm"        | "Project Management" |
| "nathan"  | "developer" | "Core Engineering"   |
| "mary"    | "manager"   | "Core Engineering"   |
| "kyle"    | "developer" | "Core Engineering"   |
| "jane"    | "architect" | "Core Engineering"   |
| "harry"   | "architect" | "Core Engineering"   |
| "frank"   | "developer" | "Core Engineering"   |
| "david"   | "developer" | "Core Engineering"   |
| "bob"     | "architect" | "Core Engineering"   |
+------------------------------------------------+

14 rows available after 87 ms, consumed after another 14 ms

BINGO !!! We have successfully imported data from the relationships.csv file.

References

Getting Started with Neo4J - Part 1

Getting Started with Neo4J - Part 2

Neo4J Official Site