Wednesday, January 6, 2016

MySQL to Neo4J

ou’ve probably heard that an effective way to take move data from an existing relational database to graph is using LOAD CSV. But what exactly does the process of converting all or part of the database tables from MySQL to Neo4j using LOAD CSV involve start to finish? We’ll be using the Mysql5 Northwind database as our example. There is a Neo4j tutorial that has a similar explanation using Postgres and discusses the graph modeling aspects as well. So definitely good to read through that. Here we’ll focus on MySQL and the CSV export in preparation for the Neo4j
import.
First we’ll install and connect to the MySQL database:
1
2
$ brew install mysql
$ mysql.server restart
*Note: We’re skipping all MySQL server security because for this demonstration its simply an intermediary to get the data we need for the Neo4j LOAD CSV process.
Now using freely available MySQL Workbench or Sequel Pro connect to your localhost MySQL server. You should be able to do this directly on 127.0.0.1 without any username or password because we skipped the normal process of securing the server.
Import the Northwind.MySQL5.sql that you downloaded above. If you’re using Sequel Pro, you do this by choosing File -> Import… -> browse to your download and select Northwind.MySQL5.sql
When the import is finished you’ll see all the tables available for export to Neo4j. The specific tables we are interested in for our Neo4j graph model are Categories, Customers, Order Details, Orders, Products and Suppliers.
Export each table with right + click and selecting Export -> As csv file.
Customize the CSV file with settings that import smoothly into Neo4j (most should be selected by default):
1. NULL fields should export as a blank because it’s more efficient validate an actual existence or IS NULL check rather than actually creating the property with the literal string value “NULL” as value.
2. Escape values such as quotes with \ so quotes in the middle of the field do not break the CSV structure.
*Note: If you are planning to use dot notation to access columns by name, then you’ll need to make sure to remove any spaces from the column names in the first row of the CSV files before attempting to import into Neo4j.
Now using the latest (2.2.5 as of this article) Neo4j Community Edition, you can continue to follow along with the Cypher below. To import data into Neo4j locally, launch the Neo4j shell by navigating to the installation directory using terminal and launching ./bin/neo4j-shell
Before you copy and paste the Cypher below into the shell to import each one of the CSV files created by each exported table, you’ll need to update the “file://…” paths to match your export location.
If you aren’t using the shell and you prefer to use the Neo4j browser, then you’ll need to execute one statement at a time. Statements are terminated by a semicolon.
Tweet: Some guidelines on how to use @neo4j #graphdb #Cypher MERGE operations consistently and efficiently. http://ctt.ec/T5rcz+

No comments:

Post a Comment