Computer Networking: The Hidden Multiplier for Your Ops Career
Understanding Computer Networking makes you a better Ops/SRE engineer.
No, really. Let me convince you.
Why Computer Networking?
There are no standalone computer systems anywhere anymore, except maybe ones in highly secure facilities in areas like government and defense. Even in such facilities there exist networked computers shielded from the global internet. The rest are all connected to each other. Almost all modern systems are distributed - made up of many components talking to each other over the network. With networks being such a ubiquitous part of any system, a strong understanding of them is necessary to build, debug, and maintain systems.
As an Ops/SRE engineer, you may have grown into your role from any of several backgrounds. Depending on your previous experience, your knowledge of networking will be patchy, strong in a few areas and weak in others, or very strong in the foundations. If you are in one of the first two groups, an often-asked question is: how do you get better overall? In fact, what are the things you should know? There’s so much to learn and so little time. The same applies even if you started out as an Ops/SRE engineer.
In this article I’ve tried to encapsulate what constitutes the minimum networking knowledge that an Ops/SRE engineer must have. There is no upper bound.
The Absolute Basics
Protocol layers and encapsulation, TCP/IP, UDP
IPv4/v6 addressing, MAC addresses, DHCP, ARP, CIDR, Subnetting
The domain name system, Name servers, DNS resolution
Basic understanding of internet topology and Autonomous Systems, routing, BGP
Common protocols - HTTP/S, SMTP, FTP, SSH
Cryptography - symmetric and public key encryption, Transport Layer Security, Certificates and how they are used for HTTPS
Authentication basics - Basic auth, Bearer auth, Private key auth
Common tools - traceroute, ping, netstat, dig, curl, nmap
As in anything else, learning by doing is the best method.
Learning by Doing
There are two approaches here and they are complementary:
Follow a course or learning path that has hands-on training and exercises. This can be comprehensive, but unless you have the opportunity to apply what you learn after the course is over, the pieces can be forgotten.
Learn something new when you need to apply it in your daily work - to solve a specific problem. This is work-driven, where you need to arrive at a solution and work backwards from there, learning some of the theory along the way, and understanding enough to resolve your problem. This has the possibility of leaving gaps in your knowledge but you will rarely forget what you learn here.
Creating your own Roadmap
There are many freely available resources to learn from, but the challenge is often creating a path that makes sense for your needs. For example, https://roadmap.sh/devops has networking-related topics but no practical guides.
You can create your own small projects. Here are some examples that I have come up with, in no particular order:
DNS
Buy a cheap domain to play around with. Set up different kinds of records - A, CNAME, and TXT. Deploy a static website and a blog on subdomains. You can set up static website hosting for free with many services like GitHub, Cloudflare, and Render.com.
Set up MX records and use a free email provider like Zoho and check that you are able to receive emails in your domain. Ensure that DMARC, DKIM, and SPF records are correctly set.
Set up BIND as a forwarding DNS server on your laptop.
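Once your records are in place, a quick way to confirm that a name resolves is to ask the system resolver from a script. The following is a minimal sketch using only Python's standard library; it covers address (A/AAAA) lookups only, so for TXT, MX, or CNAME records you would still reach for dig. The hostname in the comment is a placeholder, not a real record.

```python
import socket


def resolve(hostname: str) -> list[str]:
    """Return the sorted, de-duplicated IP addresses the system resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


# Example (placeholder name, substitute the subdomain you created):
#   print(resolve("blog.example.com"))
# A failed lookup raises socket.gaierror.
```

Running this before and after you change a record, and comparing with what dig reports, is a simple way to see caching and TTLs in action.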
Cryptography and Authentication
Create a self-signed certificate on an nginx server and then curl from a remote box. How do you make the cert error go away in curl? Note that self-signed certs are a bad idea - this is just for learning.
Replace the self-signed certificate with one from Let’s Encrypt with auto-renewal. To do this, you will need a public domain name with access to its DNS servers.
Setting up your own SSH keys for, say, your GitHub account will give you an introduction to private keys. Generate the key yourself using ssh-keygen and understand what the options you are providing to it mean.
Set up basic auth for a specific path /web for an nginx web server. Other paths including / should be unauthenticated.
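While testing the basic auth exercise, it helps to see what the client actually sends. Below is a small sketch of how the Authorization header for HTTP Basic auth is built and decoded. The function names are my own, chosen for illustration, and the credentials are dummy values.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of the Authorization header for HTTP Basic auth."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def decode_basic_auth(header_value: str) -> tuple[str, str]:
    """Reverse the encoding, as a server does before checking the password."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a Basic auth header")
    decoded = base64.b64decode(encoded).decode("utf-8")
    # Only the first colon separates the fields; the password may contain colons.
    username, _, password = decoded.partition(":")
    return username, password


header = basic_auth_header("user", "pass")  # dummy credentials
print(header)
print(decode_basic_auth(header))
```

Note that Base64 is an encoding, not encryption: anyone who can see the header can recover the password, which is why basic auth should only be used over HTTPS.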
Subnetting and Routing
Create a VPC in AWS or GCP or another IaaS and divide it manually into 4 subnets for a hypothetical org:
Engineering needs 1000 hosts
Marketing needs 200 hosts
Sales needs 200 hosts
IT needs 500 hosts
Allow room for 50% growth. What are the subnet masks, broadcast addresses, and IP ranges? Do not use any online VLSM calculator. Do it on paper first.
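After you have worked the exercise out on paper, a short script can check your answer. This is a sketch using Python's ipaddress module, with some assumptions of mine: the VPC range is 10.0.0.0/16, subnets are allocated largest first, and only the network and broadcast addresses are unusable. Cloud providers typically reserve a few more addresses per subnet, so treat the result as the classic textbook answer.

```python
import ipaddress
import math


def prefix_for_hosts(hosts: int) -> int:
    """Smallest IPv4 prefix length whose subnet has at least `hosts` usable addresses."""
    needed = hosts + 2  # network and broadcast addresses are not assignable
    # (needed - 1).bit_length() is the number of host bits required to cover `needed`.
    return 32 - (needed - 1).bit_length()


def allocate(base: str, requirements: dict[str, int], growth: float = 0.5):
    """Carve subnets out of `base`, largest first, so each one starts on an aligned boundary."""
    network = ipaddress.IPv4Network(base)
    cursor = int(network.network_address)
    plan = {}
    for name, hosts in sorted(requirements.items(), key=lambda item: -item[1]):
        target = math.ceil(hosts * (1 + growth))  # leave room for growth
        subnet = ipaddress.IPv4Network((cursor, prefix_for_hosts(target)))
        if not subnet.subnet_of(network):
            raise ValueError(f"{base} is too small to fit {name}")
        plan[name] = subnet
        cursor = int(subnet.broadcast_address) + 1  # next free address
    return plan


if __name__ == "__main__":
    org = {"Engineering": 1000, "Marketing": 200, "Sales": 200, "IT": 500}
    for name, subnet in allocate("10.0.0.0/16", org).items():
        print(name, subnet, subnet.netmask, subnet.broadcast_address,
              f"{subnet.num_addresses - 2} usable hosts")
```

Allocating the largest subnet first is a deliberate choice: because every block size is a power of two, each following subnet then starts on a boundary that is valid for its own mask.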
One way to play around with routing between subnets is to create a Google Cloud account and use its free tier. Create multiple VPCs and subnets without the default settings. Now see what is needed to set up routing between instances running in different subnets.
Border Gateway Protocol
Check out the DN42 project. It is a P2P overlay network where you can use tunnelling to create your own autonomous systems and play around with routing protocols like BGP. For more on overlay networks, you can read about Software Defined Networking (SDN).
Tools
ping is probably the CLI tool most of us are familiar with - when we want to check reachability of a remote host. But did you know that a ping failure does not imply that the remote host is unreachable? ping depends on ICMP packets and if your firewall rules block them ping won’t work.
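When ICMP is blocked, a complementary check is to attempt a TCP connection to a port you expect the host to serve. Here is a minimal sketch; the function name and the host and port in the comment are illustrative assumptions.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example (placeholder host): a host that drops ping may still accept HTTPS.
#   print(tcp_reachable("example.com", 443))
```

A False result is not conclusive either: it tells you that this one port is not accepting connections from where you are, which is a different question from whether the host is up.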
The next time you have to debug an Ops issue, try to see which network tools you can use. When debugging problems, especially in distributed systems with many microservices, you also need a visual map of the different services, their inter-dependencies, and who does what. These tools will give you a peek under the hood of how they communicate.
tcpdump is a network traffic capture tool and the output can be intimidating at first. Start with something small - like pinging a remote host and capturing the packets using tcpdump.
Create an echo server (a server that returns whatever the client sends as-is) using nc and analyze the traffic using tcpdump.
List out all the certificate authorities for a well-known domain like www.google.com using openssl only.
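The echo server exercise asks for nc, and that is still the quickest route. If you would also like to see exactly what the server side does while you capture, a few lines of Python can stand in for it. This is a sketch, not a replacement for the exercise; the class names and port 9000 are my own choices.

```python
import socketserver


class EchoHandler(socketserver.BaseRequestHandler):
    """Send every chunk of bytes straight back to the client that sent it."""

    def handle(self) -> None:
        while True:
            data = self.request.recv(4096)
            if not data:  # an empty read means the client closed the connection
                break
            self.request.sendall(data)


class EchoServer(socketserver.ThreadingTCPServer):
    """Threaded TCP server, one thread per client connection."""

    allow_reuse_address = True  # lets you restart quickly on the same port
    daemon_threads = True       # client threads do not block interpreter exit


# To run it (port 9000 is an arbitrary choice):
#   with EchoServer(("127.0.0.1", 9000), EchoHandler) as server:
#       server.serve_forever()
```

With the server running, connect with nc to 127.0.0.1 on port 9000 from another terminal and capture on the loopback interface with tcpdump. You should see the three-way handshake, your payload going out, and the same payload coming back.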
Resources
High Performance Browser Networking - I found this book very useful. Irrespective of what the title says, you can read the chapters in the Networking 101 section and get a good idea.
For a starter book, Douglas Comer’s Internetworking with TCP/IP never gets old.
If video lectures are your thing, watch Nick Feamster's lectures.
For a real world implementation of a non-trivial networking architecture, see the Kubernetes networking implementation in Google Kubernetes Engine (GKE).
(Disclaimer: the following is an affiliate link) I’ve personally used CodeCrafters’ courses on Redis and HTTP in the past. They walk you through building popular protocols, with a built-in test suite. You can sign up using my affiliate link.
Image credits: Photo by Conny Schneider on Unsplash