Today, I want to write about how learning to analyze network graphs opens the door to other topics that may interest you. Specifically, I want to discuss the following:
Geospatial Analysis
Causal Inference
Knowledge Graphs and Artificial Intelligence
Data Flow Mapping and Source Code Analysis
Cybersecurity
When I wrote my book, I was hoping that it would be useful in bridging the gap between the social sciences, software engineering, and data science. The techniques that I have described sit at the overlap of different domains, and they are not common knowledge. We are making use of learning from across domains.
My goal in writing my book was to help people understand how this skillset can be useful in multiple domains.
But not everyone gets excited about Network Science by itself. They may be interested in other things, such as Geospatial Analysis, or Causal Inference, or Artificial Intelligence, or in using Knowledge Graphs, or maybe they would rather pursue a career in Cybersecurity. Or maybe they are just generally interested in Data Science or Software Engineering.
I come from cybersecurity and software engineering. I show how to create Knowledge Graphs. I create Artificial Intelligence. I am learning Causal Inference and Geospatial Analysis. What do they all have in common? Graphs. If you are interested in any of these topics, you will inevitably learn about networks and graphs. So, if you would like to become stronger in any of these, you should learn about Network Science and Network Analysis. It will only make you better and give you more useful capabilities.
In this post, I will talk about some of this overlap, and recommend books that can help.
My Path: Data Operations and Cybersecurity
I come from cybersecurity. If you are in cybersecurity, I wrote my book for you and you will find it relevant. I am one of you. If you are in OSINT, you will also find it very useful, no doubt.
I have already written about this in other posts, but this obsession really started when I first learned relational database design in the 1990s. I just didn’t know it at the time. I really got into designing databases, and into mapping existing production databases into Entity Relationship Diagrams (ERD, or ER Diagrams). These diagrams showed how the tables in a relational database were linked together by primary and foreign keys.
Later, I created my own framework for profiling servers, work that would be needed in order to successfully “uplift” very old, undocumented, orphaned production servers. If there’s no documentation or owner, how do you find out what a server does? I created a process for that. Maybe I will write more about it someday. Are you interested in that? Leave a comment if it sounds useful to you.
After coming up with a methodology for mapping out how production servers worked, I took this further, mapping out production dataflows across entire datacenters. This helped the companies (Intel and McAfee) in a big way and we were able to speed up parts of the uplift process by 10x, no exaggeration. This led to many safe and very boring uplifts, the best kind. We had zero failures or complications. Zero.
Later, I was pulled onto a Data Science team, and I shifted my thinking away from Data Operations and more towards Malware Classification. At one point, I mapped out the evolution of malware using these techniques.
If you are in cybersecurity and you haven’t read my book, you really should. I wrote it especially for you. It is a natural fit. You defend computer networks from malware networks that are created by dark networks of criminals and adversaries. Get it?
Geospatial Analysis
Since those days, I left and created my own company. I now map out the entire internet of billions of websites, in every single language that exists. If you want to know more, you can follow my company on LinkedIn. We do really cool and important work.
I now do a lot of OSINT, which stands for Open Source Intelligence. If you are doing OSINT, then Geospatial Analysis will inevitably capture your attention.
There is overlap between networks, graphs, and Geospatial Analysis. If you want to get from your home to some place thirty minutes away, what’s the fastest route? What will you travel on? You will travel on a transportation network (roads, sidewalks, rail, etc.), and the fastest route is the shortest path through that network when each segment is weighted by travel time. Roads follow a network.
So, if you are interested in Geospatial Analysis, you will very likely learn about networks and graphs, and things like Shortest Paths and Betweenness Centrality will be useful in your work.
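To make the fastest-route idea concrete, here is a minimal sketch in plain Python. The place names and travel times are invented for illustration, and the function name fastest_route is my own. It uses Dijkstra's algorithm, one common way to find a shortest path when every edge has a non-negative weight. A graph library would normally do this for you; the point here is only to show that a route question is a graph question.

```python
import heapq

# Hypothetical toy road network: travel time in minutes between places.
ROADS = {
    "home": {"main_st": 5, "highway_ramp": 8},
    "main_st": {"home": 5, "highway_ramp": 4, "downtown": 20},
    "highway_ramp": {"home": 8, "main_st": 4, "downtown": 10},
    "downtown": {"main_st": 20, "highway_ramp": 10},
}


def fastest_route(graph, start, goal):
    """Dijkstra's algorithm: return (total_minutes, path).

    Returns (inf, []) if the goal cannot be reached from start.
    """
    # Each queue entry is (minutes so far, current place, path taken).
    queue = [(0, start, [start])]
    visited = set()
    while queue:
        minutes, place, path = heapq.heappop(queue)
        if place == goal:
            return minutes, path
        if place in visited:
            continue
        visited.add(place)
        for neighbor, cost in graph[place].items():
            if neighbor not in visited:
                heapq.heappush(queue, (minutes + cost, neighbor, path + [neighbor]))
    return float("inf"), []


minutes, path = fastest_route(ROADS, "home", "downtown")
print(minutes, " -> ".join(path))
```

In this toy network, the route through the highway ramp wins (18 minutes) even though Main Street is the closer first hop, which is exactly the kind of answer a shortest-path calculation gives you.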
I am just getting started in learning about Geospatial Analysis and have been pleased with the overlap between what I already know from network analysis and this domain. Here are some books that I’ve found useful:
I am currently working through the first one and will include some of what I learn in this series. I enjoyed reading the second book earlier this year and am going to jump back into it after finishing the first one. Have books you enjoy and recommend? Please post a comment with the title and author so that others can find them.
Causal Inference
Causal Inference is a cool topic, and one that I hope to learn a lot more about and build my skills in. I am only beginning, but I am really enjoying it.
In my earlier work in data operations, anytime anything important would break, the team would be sent to do a “root cause analysis” of why it broke. If you don’t know what “root cause analysis” is, the goal is to understand why the thing broke. What caused the thing to break? Often, it is not one thing but a cascading failure.
In order to understand why the thing broke, you have to understand the things that impact it. In network thinking, what is upstream? Which of those upstream things caused the thing to break? You begin to think a lot in terms of upstream and downstream.
Already, in your mind, you should be thinking of “the thing” as a thing, a dot, and the things that affect it as other dots, with arrows pointing at the thing. One of those dots is the culprit. Which one?
Causal Inference will help you figure it out. In causal inference, you will make use of causal graphs. Being able to visualize them and inspect them visually is very useful in understanding and figuring things out.
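The upstream and downstream picture can be sketched in a few lines of plain Python. This is not Causal Inference itself, only the graph step that comes before it: listing every dot that has an arrow path leading into “the thing”. The component names and the upstream_of function are hypothetical, made up for illustration.

```python
# Hypothetical dependency graph. Each arrow points from a cause
# to the things it directly affects.
AFFECTS = {
    "power_supply": ["database_server"],
    "disk_array": ["database_server"],
    "database_server": ["web_app"],
    "dns": ["web_app"],
    "web_app": [],
}


def upstream_of(affects, target):
    """Return every node that has a directed path into target."""
    # Invert the arrows so we can walk from an effect back to its causes.
    parents = {}
    for cause, effects in affects.items():
        for effect in effects:
            parents.setdefault(effect, set()).add(cause)

    found = set()
    stack = [target]
    while stack:
        node = stack.pop()
        for parent in parents.get(node, ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


# Everything that could have contributed to the web app breaking.
print(sorted(upstream_of(AFFECTS, "web_app")))
```

Everything returned is only a candidate culprit. Deciding which candidate actually caused the failure is where Causal Inference comes in.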
Here are two books I have read that I recommend:
The top one is my favorite. It is a total page turner, and I sat outside for several hours reading it the first day I got it. Do you have books you recommend? Add them in the comments so that others can learn from you.
Knowledge Graphs and AI
These days, there is so much talk about Artificial Intelligence, and for good reason. If you are already learning or knowledgeable about Artificial Intelligence, then you know how pervasive graphs are in both creating and using AI. Think about it: Neural NETWORK.
And these days, when people are building LLMs, there is a lot of talk about Knowledge Graphs. In this series, I’ve shown how to create Knowledge Graphs using Wikipedia data, and other sources.
If you want to be able to explore those Knowledge Graphs, then you should learn about Network Analysis and Network Visualization. If you want to be able to build your own Knowledge Graphs from interesting data sources, then you should absolutely read this blog and my book.
Here are a couple of books you might find useful:
Graphs are Everywhere
There is no getting away from network graphs. They are everywhere, in everything. Your brain is a network. Ideas follow a network. Disease follows a network. Malware follows a network. People interact, creating social networks. Cause and effect follows a network. If you want to understand reality, then you need to acknowledge networks. If you learn to wield networks and network insights as tools, it puts you at a higher level of thinking and understanding. Broaden your horizons by learning more. If you do any kind of data analysis, you will run into graph data, even without realizing it. Learn to recognize it and use it, and reap rich insights.
“I am only starting to appreciate the enormous power of network thinking thanks to reading David Knickerbocker's book on Network Science with Python. I didn't have any appreciation of the (honestly staggering) number of insights that can be drawn from this.”
What stands out:
Enormous power of network thinking
Only starting to appreciate (it’s easy to overlook)
I didn’t have any appreciation (it’s easy to overlook)
Staggering number of insights (so many that it is hard to manage)
I’m thankful to all of my readers who have let me know how my book has been helpful to them. I am really happy to see this catching on.
So, what are you waiting for? Get started today!
That’s All for Today
Thanks to everyone who has been following along with this series. Happy learning! If you would like to learn more about networks and network analysis, please buy a copy of my book!