Wednesday, January 13, 2016

Neo4j AWS Cloud

Are you considering a move from on-premise to the cloud, a hybrid approach, or going all in with cloud? As cloud adoption continues to accelerate across every industry, we have built our Neo4j AWS Cloud platform offering to free up your time: we shift the responsibility of deploying and operating servers onto our shoulders so you can focus on your business services and product development.

What is GraphGrid's Neo4j AWS Cloud?

GraphGrid's Neo4j AWS Cloud platform offers you an easy way to run Neo4j Enterprise on Amazon Web Services. We have combined Neo4j + AWS to provide broad computing capacity that works quickly and seamlessly, uniting the native graph database Neo4j with Wide Area Network (WAN) deployments across 12 geographic AWS Regions around the globe.

Amazon Web Services operates in 12 regions: US East (Northern Virginia), where the majority of AWS servers are based, US West (Northern California), US West (Oregon), Brazil (São Paulo), Europe (Ireland and Germany), Southeast Asia (Singapore), East Asia (Tokyo and Beijing), Asia Pacific (Seoul) and Australia (Sydney). In addition, every region has multiple "Availability Zones," which are distinct data centers operated by AWS.

For an optimal worldwide experience, we use geo load-balancing techniques with AWS Elastic Load Balancers, which automatically distribute incoming application traffic across multiple Amazon EC2 instances in the cloud. This step by our certified Neo4j experts ensures that requests are served from the best AWS locations to achieve low latency, while continuously providing fault tolerance with the exact amount of load-balancing capacity to support automatic failover scenarios.

Why We Have Integrated with AWS

Some key advantages of integrating with AWS:

Security:

AWS has built world-class infrastructure, both physically and across the network, that remains resilient through failure modes, natural disasters and even system-wide outages. They also take an end-to-end approach that keeps your data fully protected.

Flexibility:

Leverage the power of the AWS global infrastructure and skip the guesswork of identifying your infrastructure needs. AWS cloud capacity is available with granular flexibility, letting you take advantage of on-demand, spot and reserved instances for steady-state and burst scenarios.

Global Leader:

Amazon holds a remarkable presence across the globe with 12 regions, 37 availability zones and more than 50 edge locations. AWS has also implemented the latest technologies, so you can easily access compute and storage resources as your requirements grow.

Utilizing Advanced AWS Cloud Capabilities for the GraphGrid Data Platform (GGDP)

Our GraphGrid Data Platform (GGDP) leverages Amazon Web Services (AWS) best practices for security, data management, data access and data integration. GGDP is architected for the enterprise, so it offers deployments across all of the Regions and Availability Zones of AWS.

All instances are deployed in a Virtual Private Cloud (VPC), each organized into a Neo4j Cluster that launches with a Security Group permitting access only within cluster-owned subnets. And if you want to extend your current security solutions to a higher grade, GraphGrid also provides security options on request, such as encryption and decryption to protect your data.

For enterprises and organizations, finding cloud platforms that can give your business a competitive advantage right out of the gate while removing the lead time of building in-house becomes vital. To streamline this process for Neo4j, GraphGrid built the Neo4j AWS Cloud platform.

Monday, January 11, 2016

What is a Graph Database?

A graph database, also called a "graph-oriented database," is a specialized type of NoSQL database that uses a graph to query, store, and map relationships. It consists of databases that specifically serve to store graph-oriented data structures.
A graph database is an example of a storage solution where connected elements are linked to one another without an index. Groups of a particular entity can be accessed by dereferencing a pointer.

There are different kinds of graphs that can be stored. They range from a single undirected graph to property graphs to hypergraphs.

A graph database generally meets the following requirements:

Storage is specifically arranged for the data to be represented as a graph, with provisions made for housing edges and vertices.

Storage is optimized for graph traversal without making use of an index when following edges.

A graph database is arranged so that queries leverage proximity data starting from a single root node or multiple root nodes, instead of running global queries.

A data model that is flexible for specific solutions: there is no need to declare schemas for edges or vertices, in contrast to a table-oriented database model.

Has an integrated API along with entry points for further graph theory computations.
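To make the root-anchored, index-free query style concrete, here is a minimal Cypher sketch. The Person label, name property and KNOWS relationship are invented for illustration, not taken from any particular dataset:

```cypher
// Start from a single root node and follow edges directly
// (index-free adjacency); only the initial lookup of the
// root node "Alice" may touch an index.
MATCH (me:Person {name: "Alice"})-[:KNOWS]->()-[:KNOWS]->(fof)
WHERE fof <> me
RETURN DISTINCT fof.name;
```

Everything after the first MATCH of the root node is pure pointer-chasing along relationships, which is why no global index scan is needed.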

Neo4j Graph Database: What Is It Really?

Neo4j is an example of a scalable and reliable graph database created by Neo Technology. It is a native graph database, and it introduced the now-open graph query language called Cypher. A wide range of licenses are available, from an open-source option to an enterprise subscription.

Today, Neo4j is a widely used graph database worldwide, with years of production use and thousands of deployments ranging from startups to Fortune 500 companies.

Neo4j is so popular because of its:

Quality: all transactions are fully ACID

Speed: graph traversal is fast, reaching a depth of 1,000 levels in just milliseconds

Scalability: graph storage of billions of elements on a single machine

Many deploy Neo4j Enterprise with the GraphGrid Data Platform because it is the fastest and most scalable native graph database in the world. The advantages Neo4j offers in modeling connected data allow for solving problems that could not be handled with typical databases. The next post will cover the parts of a native graph database.
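As a tiny illustration of what modeling connected data looks like in Cypher (the names, labels and relationship types here are invented for the example):

```cypher
// People, a company, and the relationships that connect them,
// created as a single statement.
CREATE (a:Person {name: "Ada"})-[:WORKS_AT {since: 2014}]->(c:Company {name: "Acme"})
CREATE (b:Person {name: "Bob"})-[:WORKS_AT {since: 2015}]->(c)
CREATE (a)-[:KNOWS]->(b);

// The connections themselves are now directly queryable:
MATCH (p:Person)-[:WORKS_AT]->(:Company {name: "Acme"})
RETURN p.name;
```

Relationships are first-class citizens here, with their own types and properties, rather than being reconstructed at query time through joins.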

Sunday, January 10, 2016

Build a Modern Graph Data Architecture Use Case

Data Architecture Optimization:

 

Get Results Faster

When Neo4j is used, valuable insights can be discovered, understood and made actionable in hours instead of weeks, with real-time recommendations able to be performed far more efficiently over your graph data than with any other NoSQL store.

New Value

Unlike schema-on-write, which transforms data into a specified schema on load, Neo4j lets you store data in any format and then create the mapping at the moment you analyze your graph data. This exceptional flexibility opens up new possibilities for iterative exploration and delivers new business value.
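A small Cypher sketch of that schema-on-read style (labels and properties are hypothetical):

```cypher
// No schema declared up front; each node carries whatever
// properties its source data had.
CREATE (:Reading {sensor: "s1", value: 7.2, unit: "C"});
CREATE (:Reading {sensor: "s2", value: 41.5});

// Structure is imposed at query time instead of load time:
MATCH (r:Reading)
RETURN r.sensor, r.value, coalesce(r.unit, "unknown") AS unit;
```

The second node never declared a unit, and nothing breaks; the query decides how to interpret the gap.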

Any Workload


With GraphCompute supporting multiple access methods (such as batch, interactive and real-time) on a common data set, GraphGrid enables you to transform and view Neo4j graph data in multiple ways simultaneously, dramatically reducing time to insight.

Tweet: Some guidelines on how to use @neo4j #graphdb #Cypher MERGE operations consistently and efficiently. http://ctt.ec/T5rcz+

Thursday, January 7, 2016

Modeling Time Series Data with Neo4j

I've been receiving many questions recently at trainings and meetups about how to effectively model time series data, with use cases ranging from hour-level precision to microsecond-level precision. In surveying the various approaches possible, I landed on a tree structure as the model that best fit the problem. The two key questions I found myself asking as I went through the process of building the time tree to connect the time series events were, "How granular do I really need to make this to effectively work with and expose the time-based data being analyzed?" and "Do I need to generate all time nodes down to the desired precision level?" The balance to consider is the up-front creation and maintainability of all the time series nodes versus their dynamic creation as time series events require their existence, and the impact the missing nodes may have when querying time series events by various date and time ranges.

Starting the Time Tree 


I ultimately concluded that it would be best to create the hour, minute and second level nodes only when needed to connect an event into a day. So I extended the work done by Mark Needham in his post Neo4j: Cypher – Creating a time tree down to the day. The main modeling change in this step was to use a single CONTAINS relationship going from the higher tree level toward the lower tree level, to simplify movement up and down the entire tree through depth-based pattern matching. Additionally, I concluded that for sub-second level measurements it would be best to store the full precision (i.e. millisecond, microsecond, etc.) on the event node itself, but connect the event to the time tree at the second in which it occurred, because any filtering or pattern analysis is unlikely to be meaningful within a single second (at least for the use cases I've been hearing).
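A sketch of that structure in Cypher (the label names, the seconds-of-day value property, and the OCCURRED_AT relationship are my own illustrative choices, not necessarily the post's exact model):

```cypher
// One CONTAINS relationship type at every level lets depth-based
// patterns walk the whole tree.
MERGE (y:Year {value: 2016})
MERGE (y)-[:CONTAINS]->(m:Month {value: 1})
MERGE (m)-[:CONTAINS]->(d:Day {value: 7})

// Second-level node created lazily, only because an event needs it
// (34215 seconds into the day = 09:30:15 UTC):
MERGE (d)-[:CONTAINS]->(s:Second {value: 34215})

// Full sub-second precision stays on the event node itself:
CREATE (e:Event {timestampMs: 1452159015123})
CREATE (e)-[:OCCURRED_AT]->(s);
```

With a single relationship type, a depth-based pattern can jump levels, e.g. `MATCH (:Year {value: 2016})-[:CONTAINS*3]->(s:Second)<-[:OCCURRED_AT]-(e:Event) RETURN count(e);` counts a year's events without naming the intermediate levels.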


Interfacing the Time Series Events 


In the time series use cases I've been hearing about, there are large numbers of events moving through the system over short periods of time, so I wanted to find an interesting data set of meaningful size to use in validating the effectiveness of the tree-based approach for modeling time series data. There has been an open data movement in government over the last couple of years that has been gaining momentum, so I began searching data.gov and looking through the various cities that have open data portals. I found that the city of Seattle has a data portal at data.seattle.gov, which had a useful, extensive time series dataset containing every 911 fire call received since 2010. There were around 400k entries in the CSV containing an Address, Latitude, Longitude, Incident Type, Incident Number and Date+Time.

After downloading the time series CSV, I needed to do a couple of things to get the data into a friendlier and more complete state for loading into Neo4j.

1. I removed all spaces from the column names.

2. I wanted a millisecond timestamp associated with every row in addition to the long UTC text format, so I ran a process to insert a second time column that converted the Datetime string value to milliseconds and put that value in the new DatetimeMS column.
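That preprocessing step could be sketched in Python roughly like this. The column name `Datetime` and the `%m/%d/%Y %I:%M:%S %p` source format are assumptions about the Seattle CSV, not taken from it, so adjust both to match your file:

```python
import csv
from datetime import datetime, timezone

def add_ms_column(rows, date_field="Datetime", fmt="%m/%d/%Y %I:%M:%S %p"):
    """Yield each CSV row dict with an added epoch-millisecond DatetimeMS value."""
    for row in rows:
        # Treat the source timestamp as UTC and convert to epoch milliseconds.
        dt = datetime.strptime(row[date_field], fmt).replace(tzinfo=timezone.utc)
        row["DatetimeMS"] = str(int(dt.timestamp() * 1000))
        yield row

def convert(in_path, out_path):
    """Rewrite a CSV file with the extra DatetimeMS column appended."""
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames + ["DatetimeMS"])
        writer.writeheader()
        writer.writerows(add_ms_column(reader))
```

Storing the value as an integer string keeps the CSV round-trippable while letting Cypher cast it back to a number on import.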

**Michael Hunger has some very useful tips on using LOAD CSV with Cypher in his post LOAD CSV into Neo4j quickly and successfully, which is a worthwhile read before you begin importing your own CSV data.
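For reference, a LOAD CSV statement for the cleaned file might look like this in Neo4j 2.x Cypher. The file name and the space-stripped column names are assumptions following the cleanup steps above:

```cypher
// Batched commits keep transaction state small on large files.
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "file:///Seattle_Fire_911_Calls.csv" AS row
CREATE (:Event {
  incidentNumber: row.IncidentNumber,
  incidentType:   row.IncidentType,
  address:        row.Address,
  timestampMs:    toInt(row.DatetimeMS)
});
```

The `toInt` cast matters: every value LOAD CSV reads is a string, so the DatetimeMS column would otherwise be stored as text.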

Time Series Graphgist 

I've included an interactive time series graphgist below with commented Cypher for the generation and a few analysis queries. You'll notice I'm only loading the first 1k rows here in this post, and that is just to keep the data set small enough to process and render immediately for this example. In my own testing and demoing for the meetup group I had loaded and connected all the event data. If you're interested in testing out the larger dataset or one of your own, I'd be happy to help.

Some helpful hints for interacting with the graph visualizations:

1. Hold shift and scroll up to zoom in.

2. Hold shift and scroll down to zoom out.

3. Hold shift, then click and drag to move the graph around the display area.

4. Double-click to zoom in around a specific point.

5. Shift and double-click to zoom out around a specific point.

Wednesday, January 6, 2016

MySQL to Neo4j

You’ve probably heard that an effective way to move data from an existing relational database to a graph is using LOAD CSV. But what exactly does the process of converting all or part of the database tables from MySQL to Neo4j using LOAD CSV involve, start to finish? We’ll be using the MySQL5 Northwind database as our example. There is a Neo4j tutorial that gives a similar walkthrough using Postgres and discusses the graph modeling aspects as well, so it is definitely worth reading through. Here we’ll focus on MySQL and the CSV export in preparation for the Neo4j import.
First we’ll install and connect to the MySQL database:
$ brew install mysql
$ mysql.server restart
*Note: We’re skipping all MySQL server security because for this demonstration it’s simply an intermediary to get the data we need for the Neo4j LOAD CSV process.
Now using freely available MySQL Workbench or Sequel Pro connect to your localhost MySQL server. You should be able to do this directly on 127.0.0.1 without any username or password because we skipped the normal process of securing the server.
Import the Northwind.MySQL5.sql that you downloaded above. If you’re using Sequel Pro, you do this by choosing File -> Import… -> browse to your download and select Northwind.MySQL5.sql
When the import is finished you’ll see all the tables available for export to Neo4j. The specific tables we are interested in for our Neo4j graph model are Categories, Customers, Order Details, Orders, Products and Suppliers.
Export each table by right-clicking it and selecting Export -> As csv file.
Customize the CSV file with settings that import smoothly into Neo4j (most should be selected by default):
1. NULL fields should export as a blank, because it’s more efficient to validate with an existence or IS NULL check than to create the property with the literal string “NULL” as its value.
2. Escape values such as quotes with \ so quotes in the middle of the field do not break the CSV structure.
*Note: If you are planning to use dot notation to access columns by name, then you’ll need to make sure to remove any spaces from the column names in the first row of the CSV files before attempting to import into Neo4j.
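To see why blank NULLs matter on the Neo4j side, here is a sketch of a guarded property set using the common FOREACH-over-CASE idiom (the file path and the Customers columns are placeholders for this example):

```cypher
LOAD CSV WITH HEADERS FROM "file:///customers.csv" AS row
CREATE (c:Customer {customerID: row.CustomerID})
// Only set the property when the exported field was non-blank;
// a literal "NULL" string would wrongly pass this check and
// pollute the graph with a fake value.
FOREACH (_ IN CASE WHEN row.Region IS NOT NULL AND row.Region <> ""
              THEN [1] ELSE [] END |
  SET c.region = row.Region);
```

The CASE expression yields a one-element list only for real values, so FOREACH runs the SET zero or one times per row.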
Now using the latest (2.2.5 as of this article) Neo4j Community Edition, you can continue to follow along with the Cypher below. To import data into Neo4j locally, launch the Neo4j shell by navigating to the installation directory using terminal and launching ./bin/neo4j-shell
Before you copy and paste the Cypher below into the shell to import each one of the CSV files created by each exported table, you’ll need to update the “file://…” paths to match your export location.
If you aren’t using the shell and you prefer to use the Neo4j browser, then you’ll need to execute one statement at a time. Statements are terminated by a semicolon.
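The full import script isn’t reproduced in this excerpt, but each table’s statement has roughly this shape; the paths, property names and the PART_OF relationship type below are placeholders for illustration:

```cypher
// Guard against duplicate categories before importing.
CREATE CONSTRAINT ON (c:Category) ASSERT c.categoryID IS UNIQUE;

LOAD CSV WITH HEADERS FROM "file:///path/to/categories.csv" AS row
MERGE (:Category {categoryID: row.CategoryID, name: row.CategoryName});

// Import products, then resolve the CategoryID foreign key
// into a real relationship.
LOAD CSV WITH HEADERS FROM "file:///path/to/products.csv" AS row
MERGE (p:Product {productID: row.ProductID, name: row.ProductName})
WITH p, row
MATCH (c:Category {categoryID: row.CategoryID})
MERGE (p)-[:PART_OF]->(c);
```

The pattern is always the same: load the table, then turn each foreign-key column into a relationship between the nodes it used to join.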

Data Modeling with Neo4j: “School Immunization in California” CSV to Graph


1 state, over 9 million children, and 42,981 rows of CSV immunization data. After many rough drafts, I was finally able to land on an efficient and aesthetically pleasing way to map out the immunization data of children in California (found and downloaded online from the California Department of Education*). In this post our goal is to walk through the data modeling process to show how this CSV data can be connected meaningfully with Neo4j. What makes this data so interesting is its varying degrees of location, three distinct grade levels, and a dense record of immunization numbers and percentages, all spanning two separate school years.
After successfully mapping the data, I could then easily explore it, answering questions such as: Where in California has the lowest number of children vaccinated? Are fewer parents vaccinating their children in 2015 compared to 2014? And which age group is more up to date on its vaccinations? Furthermore, I was able to clearly visualize the data in small and large quantities using the Neo4j graph.

Neo4j for Your Modern Data Architecture

We recently sat down with Neo Technology, the company behind Neo4j, the world’s leading graph database, to talk more about our role as a trusted Neo4j solution partner and to dive deeper into our Neo4j Enterprise offerings.
Talk to me about GraphGrid. What’s your story?
So to understand GraphGrid, let’s dive into a little back story: We co-founded AtomRain nearly seven years ago with the vision to create an elite engineering team capable of providing enterprises worldwide with real business value by solving their most complex business challenges. As we figured out what that looked like practically, we found ourselves moving deeper down the technology stack into the services and data layer where we handled all the heavy lifting necessary to integrate data sources and provide the functionality, performance and scale needed to deliver powerful enterprise service APIs.
In early 2012, we had our first exposure to Neo4j and experienced first hand the potential of graph databases and over the next couple of years refined the integration of Neo4j into our enterprise technology stack.

GraphGrid is the full suite of essential data import, export and routing capabilities for utilizing Neo4j within your modern data architecture. At its core, GraphGrid enables seamless multi-region global Neo4j Enterprise cluster management with automatic failover for disaster recovery.
A powerful job framework enables on-graph analytics and job processing, which removes the need to move data out of Neo4j to do analytics and batch processing of data. Real-time ElasticSearch auto-indexing keeps your search cluster updated with the latest data from your graph.

Using Neo4j Cypher MERGE Effectively

One of the areas in Neo4j and Cypher that brings the most questions and discussion when I’m giving Neo4j trainings or working with other engineers building with Neo4j, is around how to use Cypher MERGE correctly and efficiently. The way I’ve explained Cypher MERGE to all our engineers and all the training attendees is this.
There are a few simple things to understand about how Neo4j handles Cypher MERGE operations to avoid undesired or unexpected behavior when using it.
1. The Cypher MERGE operation is a MATCH or CREATE of the entire pattern. This means that if any element of the pattern does NOT exist, Neo4j will attempt to create the entire pattern.
2. Always MERGE on each segment of the pattern that has the possibility to already exist.
3. After a MERGE operation you are guaranteed to have a usable reference to all identifiers established during the Cypher MERGE operation, because they were either found or created.

Simple Cypher MERGE Examples

Let’s look at a couple examples of Cypher MERGE operations:
Assuming that a unique constraint exists on username for the User label and that (u:User {username: “neo”}) exists in the graph do you think these two statements are equivalent?
Statement 1:
MERGE (neo:User {username: "neo"})-[:KNOWS]->(trinity:User {username: "trinity"});
Statement 2:
MERGE (neo:User {username: "neo"})
MERGE (trinity:User {username: "trinity"})
MERGE (neo)-[:KNOWS]->(trinity);
The answer is no; they’re not equivalent. Here’s why they’re not:
In Statement 1 Neo4j will find that the entire pattern doesn’t MATCH because -[:KNOWS]->(trinity:User {username: “trinity”}) doesn’t exist in the graph.
This will cause Neo4j to attempt to create the entire pattern, which includes the already existing User node with a uniquely constrained username field causing an exception about violating the unique constraint.
In Statement 2 Neo4j is able to MATCH the ‘neo’ User node in the first line and establishes a reference to it. Then in the second line Neo4j doesn’t find the ‘trinity’ User node so a CREATE is performed, which establishes the reference. Then finally in the third statement, using the references established in the two preceding MERGE statements, Neo4j successfully connects both the ‘neo’ and ‘trinity’ User nodes with the KNOWS relationship.

Simple Cypher MERGE Best Practices

Knowing what we know now from the examples above, here are some best practices for your most common, simple Cypher MERGE operations:
In scenarios where node existence in the graph is optional, the best general strategy for Cypher MERGE operations is to always MERGE on the uniquely constrained identifier for each node involved in the total pattern in isolation and then MERGE the relationships using the node references already established.
In scenarios where node existence in the graph is required, the best general strategy for Cypher MERGE operations is to always MATCH the nodes that are expected to exist and MERGE only the relationship using the node references established by the MATCH operation.
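Putting that second best practice into a concrete statement, using the same two User nodes as the examples above:

```cypher
// Both users are required to exist, so MATCH them first and
// MERGE only the relationship between the established references.
// If either MATCH finds nothing, no partial pattern is created.
MATCH (neo:User {username: "neo"})
MATCH (trinity:User {username: "trinity"})
MERGE (neo)-[:KNOWS]->(trinity);
```

Compare this with Statement 2 earlier: swapping the node MERGEs for MATCHes turns "create if missing" into "fail fast if missing," which is usually what you want when the nodes are supposed to already be there.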
There are more advanced Cypher MERGE patterns and strategies, but if you’re just starting out using Cypher MERGE in this way will help you consistently get the desired result.
