Day 58 of #100daysofnetworks
Knowledge Graph Breakthroughs: 6x Bigger, 100% Cheaper, 540x Faster!
Hello everyone!
Today is a great day. As I mentioned in the previous article, I had a breakthrough in thought, followed by a breakthrough in scale. All of the Cypher learning paid off.
I suddenly had enough skill and knowledge to push forward with my own Knowledge Graph experimentation and exploration.
Today’s article is going to strictly be a show-and-tell. I’m not ready to share the code yet, and I am still working on it. But the breakthrough is impressive. With the Cognee approach, I was using 10% of the Artificial Life dataset, and it took nine hours and $20 to create the Knowledge Graph. AI has a cost! Cognee’s approach is great, and I am a huge fan of the work that they are doing. But I have been working with databases since I was a teenager (now in my 40s), and I wanted to try my hand at creating a Knowledge Graph from scratch. I wanted to do the work, find some things out the hard way, and learn from all of it. There is no better learning than doing the work.
The approach I am going to show used 60% of the Artificial Life dataset, took 40 seconds, and cost $0.00. I really like this, and we’re going to keep expanding the capabilities of this KG over the next articles. We are going to do the work and learn a lot!
Breakthrough Results
10% of the dataset to 60% of the dataset (6x increase in coverage)
540 minutes (9 hours) to less than 1 minute (at least a 540x speedup; the measured 40 seconds works out to about 810x)
$20 to $0 (100% decrease in cost)
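For anyone who wants to check the arithmetic behind these three numbers, here is a minimal sketch. The inputs are the figures quoted in this article, and the variable names are only illustrative.

```python
# Figures quoted in the article; variable names are illustrative only.
old_coverage_pct, new_coverage_pct = 10, 60   # share of the dataset used
old_minutes = 9 * 60                          # the nine-hour build
new_minutes_upper_bound = 1                   # "less than 1 minute"
new_seconds_measured = 40                     # the 40-second build
old_cost, new_cost = 20.0, 0.0                # US dollars

coverage_ratio = new_coverage_pct / old_coverage_pct
speedup_lower_bound = old_minutes / new_minutes_upper_bound
speedup_measured = old_minutes * 60 / new_seconds_measured
cost_decrease_pct = (old_cost - new_cost) / old_cost * 100

print(f"coverage: {coverage_ratio:.0f}x")
print(f"speedup: at least {speedup_lower_bound:.0f}x, measured {speedup_measured:.0f}x")
print(f"cost decrease: {cost_decrease_pct:.0f}%")
```

The 540x figure treats the new build as a full minute, so it is a conservative lower bound.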
All of this matters because this is a Network Science series, not just an Artificial Intelligence series. We need more graph coverage to be able to see more things. We want to explore these graphs. And I don’t want to wait nine hours for a KG to build!
Different approaches have different outcomes. Neither is better or worse. I really like Cognee’s AI approach, and it’s fascinating to see the edge types and KG builds. I am definitely going to keep using Cognee. It is one tool at my disposal.
Show-and-Tell Demonstration
This is a simpler Knowledge Graph, and I limited it to two node types:
Person
Article
These two types are a natural starting point, because this is a dataset of Artificial Life research since the 1990s. A more complete Knowledge Graph would contain more node types, like Cognee’s does. I am building this one slowly, and we’re going to expand it gradually, following these iterations:
People and Article Relationships ← we are here
Temporal
Entities
Clusters
???
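To make the first iteration concrete, here is a rough sketch of what a two-node-type graph looks like as plain data. This is not the code behind the build in this article. The records, the `WROTE` relationship name, and the field names are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    name: str


@dataclass(frozen=True)
class Article:
    title: str


# Hypothetical input records: (article title, list of author names).
records = [
    ("Paper A", ["Author One", "Author Two"]),
    ("Paper B", ["Author One"]),
]

people, articles, edges = set(), set(), []
for title, authors in records:
    article = Article(title)
    articles.add(article)
    for name in authors:
        person = Person(name)
        people.add(person)  # the set removes repeated authors
        # "WROTE" is an assumed relationship name, not taken from the article.
        edges.append((person, "WROTE", article))

print(len(people), "people,", len(articles), "articles,", len(edges), "edges")
```

Later iterations (temporal, entities, clusters) would add more node and edge types on top of this same shape.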
I have some ideas that I want to try, to give AI interfaces more nuance, and this is our playground. Ok, let’s start the tour!
First impression: we are using 6x more data, so the relationships are much more illuminating. This is Artificial Life research. The brown dots are people, and the pink dots are articles that they wrote. So, this map/graph shows some interesting things.
Some scientists write a lot of papers. Look at these ‘central’ people nodes.
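One simple way to find these central people without looking at the picture is to count how many articles each person is connected to. Here is a minimal sketch, assuming the authorship pairs have already been pulled out of the database. The names and pairs below are made up.

```python
from collections import Counter

# Hypothetical (person, article) authorship pairs.
authorship = [
    ("Author One", "Paper A"),
    ("Author Two", "Paper A"),
    ("Author One", "Paper B"),
    ("Author One", "Paper C"),
    ("Author Three", "Paper C"),
]

# In a Person-Article graph, a person's degree is their number of articles.
papers_per_person = Counter(person for person, _ in authorship)
top_people = papers_per_person.most_common(2)

for person, count in top_people:
    print(person, count)
```

This is plain degree counting, the simplest notion of centrality, and it is usually a good first pass before trying anything fancier.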
Wow, that is cool, and utterly unexpected. This was not staged. That’s Judea Pearl! This is why I love graph analysis. Every single time, there’s a discovery. What a cool subgraph, definitely worth exploring!
But it is important for us all to understand that we exist inside ecosystems. Look at this denser part of the graph. There is some serious activity in these parts that literally look like parts of a spider web.
This visualization really makes me happy. It is hard to get a good look at this kind of density, and Neo4j is doing really well. I’d love to export this into an edgelist and play with it in Cosmograph!
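An edgelist export could be as small as the sketch below. It assumes the person and article pairs are already in memory, and the `source` and `target` column names are my guess at a sensible header, so check what your visualization tool expects before relying on them.

```python
import csv
import io


def write_edgelist(edges, handle):
    """Write (source, target) pairs as CSV rows with a header line."""
    writer = csv.writer(handle)
    writer.writerow(["source", "target"])  # assumed column names
    for source, target in edges:
        writer.writerow([source, target])


# Hypothetical edges; real ones would come from the graph database.
edges = [("Author One", "Paper A"), ("Author Two", "Paper A")]

buffer = io.StringIO()  # swap for open("edgelist.csv", "w", newline="") to write a file
write_edgelist(edges, buffer)
edgelist_text = buffer.getvalue()
print(edgelist_text)
```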
The queries from previous articles work even better, because there are fewer edge types. As a result, the query itself is easier to understand:
MATCH (a)-[r]-(b)
WHERE a.summary CONTAINS 'interstellar'
RETURN a, r, b
This matches nodes and relationships where at least one of the nodes has a summary that contains the word interstellar.
Since we know Judea Pearl is in here, we can search for “causal graph” and see the papers that mention it. Very cool! Here are the articles if you want to read them!
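If Cypher is new to you, the same search logic can be sketched in plain Python. The nodes, summaries, and `WROTE` relationship name below are hypothetical. One thing this sketch has in common with the Cypher above is that the substring match is case-sensitive, just like `CONTAINS`.

```python
# Hypothetical in-memory graph; ids, summaries, and "WROTE" are made up.
nodes = {
    "p1": {"label": "Person", "name": "Author One"},
    "a1": {"label": "Article", "summary": "A study of causal graph structure."},
    "a2": {"label": "Article", "summary": "Notes on interstellar travel."},
}
edges = [("p1", "WROTE", "a1"), ("p1", "WROTE", "a2")]


def match_summary(term):
    """Return (a, r, b) triples where node a has a summary containing term."""
    results = []
    for source, rel, target in edges:
        # The pattern is undirected, so either endpoint may play the role of a.
        for a, b in ((source, target), (target, source)):
            # Nodes without a summary (people, here) never match.
            if term in nodes[a].get("summary", ""):
                results.append((a, rel, b))
    return results


print(match_summary("causal graph"))
print(match_summary("interstellar"))
```

Because the match is case-sensitive, searching for "Causal" finds nothing here, so it is worth trying a few spellings when a search comes back empty.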
Paying readers can get access to this database. Please reach out to me if you are a paying subscriber and would like access.
I like the outer space stuff, so this just makes me happy. So much research mentioning planets. This is now a really cool Graph Database for exploration. It is less usable for GraphRAG in its current state, but I am going to work on making it GraphRAG-ready, and we’re going to build reliable GraphRAG from scratch. We’re going to try a lot of things, and we are going to learn a lot in the process!
This is really powerful stuff. I’ve done this with files for years, but it makes me so happy to see the usefulness of Graph Databases first-hand, and to feel comfortable working with them!
There is a lot more to build and explore, so that is all for today! Thank you for attending the tour! Please let me know what you think!
Please Support this Work!
I have written over 50 articles for this series. Each one takes about four hours of research, and several pages of writing and editing. Here are some ways you can support the blog!
Please subscribe if you have not. This motivates me like nothing else!
LET’S DO BUSINESS. Reach out to me if you need data or AI help! Happy to help! You can read about the partners we are currently looking for here.
BIGGEST HELP to the BLOG: Please consider upgrading if you are a subscriber. Thank you to all current paying subscribers for making this research and development possible!
Please buy my book to understand how I think about Natural Language Processing and Network Science combined.
Please reach out to me if interested in training related to Data Science, Data Engineering, Network Science, Knowledge Graphs, Artificial Intelligence, or anything else I write about. Feel free to message me on Substack or on LinkedIn!
Feel free to hang out in the comments and have a good time!
We have come so far since the very first day of the very first #100daysofnetworks. I love writing for this series. Thank you for being a part of it!