Title: Site Reliability Engineering
Author(s): Betsy Beyer, Chris Jones, Jennifer Petoff, Niall Richard Murphy
There are some resources that you know you must read, even if it is a steep ascent. Considering that this title is freely available, that it contains so many good ideas, tips, and practices that many people won't ever put half of them into practice, and that it is a big book (around 500 pages), it is hard to complain about it. But still, I'd like to start with my only complaint: most of the book feels like a bunch of academic papers grouped together, often under a common topic, but sometimes overlapping in content despite sitting in different chapters and parts of the book. It has not been an easy book to read because it can get dull at times, to the point that I stopped reading it halfway, and only recently picked it up again and decided to finish it.
That said, everything else is so interesting. Even if sometimes we're not exactly told how their internal systems work, the Google SRE authors explain enough techniques, procedures, ideas and advice to create tickets for years of work at any medium or large company that has more than a few services in production. The book ranges from recommending monitoring everything, placing retries and circuit breakers, or explaining their "production readiness requirements" SRE guidelines, to less often heard concepts like "given enough requests per second, a simple random load balancer strategy can perform better than a round robin one", or how Google employs in some services an initially counter-intuitive practice: in case of high latency, discarding the newer requests instead of the older ones (while also always discarding those that have been waiting for more than X seconds). There are tons of ideas about how, what and when to monitor, log, alert, automate, improve, fix, and a myriad of related actions you can take regarding services, tasks, teams, scenarios... Just take a look at the table of contents to grasp the sheer number of topics it covers.
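To make the request-discarding idea more concrete, here is a minimal sketch of the policy as I described it above, not Google's actual implementation. It assumes that "high latency" can be approximated by a bounded queue being full; the `SheddingQueue` class and its `capacity`, `max_wait_seconds` and `clock` parameters are hypothetical names invented for this illustration.

```python
import time
from collections import deque


class SheddingQueue:
    """Bounded FIFO queue that rejects new arrivals when full and expires stale ones.

    Hypothetical illustration: a full queue stands in for "high latency".
    """

    def __init__(self, capacity, max_wait_seconds, clock=time.monotonic):
        self.capacity = capacity
        self.max_wait_seconds = max_wait_seconds
        self.clock = clock  # injectable so the behaviour can be tested without sleeping
        self._items = deque()  # (enqueue_time, request) pairs, oldest first

    def _expire(self):
        # Always drop requests that have waited longer than the limit ("X seconds").
        now = self.clock()
        while self._items and now - self._items[0][0] > self.max_wait_seconds:
            self._items.popleft()

    def offer(self, request):
        """Return True if the request was accepted, False if it was shed."""
        self._expire()
        if len(self._items) >= self.capacity:
            return False  # overloaded: discard the newer request, keep the older ones
        self._items.append((self.clock(), request))
        return True

    def take(self):
        """Return the oldest request still within its deadline, or None if empty."""
        self._expire()
        if not self._items:
            return None
        return self._items.popleft()[1]
```

A worker loop would call `take()` for its next request, while the frontend calls `offer()` and answers immediately with an error (or a retry hint) when it returns `False`.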
I marked some internal highlights but if you want some really hardcore and useful notes, I can only encourage you to check the in-depth review at danluu.com/google-sre-book/. Every chapter except the final SRE team management and integration ones is summarized.
But even the final management chapters are useful, explaining why interruptions are bad, how to deal with them and mitigate them, how to collaborate between teams, how, when and why to meet...
As I mentioned before, not the easiest book to read but definitely one that most engineers, SREs or not, should read.
It's been quite a while since I started working in tech. I've been in tiny startups, medium companies, and a few big companies; I've done both in-house development and consulting services, the latter sometimes outsourced to clients for long periods; I've both seen workforces grow and adapt to change from within, and landed in already well-oiled machinery (theoretically at least).
You learn to recognize some patterns in people, some archetypes, that typically either are already present or appear sooner or later. The superstars known in the community, the quick learners, the slower but methodical thinkers, the still-junior-but-will-be-awesome profiles... Some of these people will step forward and also help with the technical outreach, often by becoming speakers and/or participating in local user groups. But there will always be a subset, maybe just shy, maybe just not interested in the public spotlight, that in any case won't shine so publicly (and sometimes not even internally, if the company is big enough), and yet get work done and sometimes achieve great things without expecting any ego-boosting public recognition.
They don't speak at conferences, they might not have a blog, Twitter or any "social presence", or if they do they use it rarely... They just work, just focus on doing things the best they can. They usually excel and get great evaluations, and any time you ask a colleague about them, the answer will always be positive. And if you speak with or, even better, directly work alongside them, you will always learn something new.
Those are the heroes who save the tech world, one company at a time. Those are the people I'm trying to emulate as of late.
Taking advantage of the #FreeApril Pluralsight initiative, and considering I wanted to learn the fundamentals of BigQuery, I took the Data Analytics on Google Cloud path, which consists of four courses and 12-something hours of content in total.
Grabbed from the official page, a summarized description reads: "This path teaches how to derive insights through data analysis and visualization using the Google Cloud Platform. Feature interactive scenarios and hands-on labs where participants explore, mine, load, visualize, and extract insights from diverse Google BigQuery datasets. Cover data loading, querying, schema modeling, optimizing performance, query pricing, and data visualization."
Exploring and Preparing your Data with BigQuery: Beginner course (4.5h) about SQL, BigQuery and some Cloud Dataprep, which makes sense as data ingestion is the first step.
Creating New BigQuery Datasets and Visualizing Insights: Intermediate course (2.5h), heavily focused on BigQuery and data ingestion from external sources (online, and local files), SQL JOINs and UNIONs, and an introduction to Google Data Studio.
Achieving Advanced Insights with BigQuery: Advanced course (3.5h), now getting into advanced SQL and BigQuery features, including using arrays and structs to nest and unnest data, performance optimizations and an intro to Cloud Datalab (Google's Jupyter Notebook solution).
Applying Machine Learning to your Data with GCP: Advanced course (~3h), about how to do Machine Learning with BigQuery data, and inside BigQuery (this is the awesome part), by creating models, training and validating them.
The labs allow you to practice on your own, but you also get "fully guided" videos solving them, which works whether you want to do them yourself or prefer to just watch how they're done.
The courses are great, and my only complaint is that the volume is not normalized across the videos: some are really quiet while others are fine, so you need to keep adjusting it between them. Otherwise the production quality is very professional, and you can clearly notice it is "official" Google-built content.
Clean Architecture: Patterns, Practices, and Principles indeed talks about patterns, practices and principles of the aforementioned architecture paradigm, and hints at how to build maintainable and testable software. The problem is that in a 2.5-hour course, half of which is a code example walkthrough, it is impossible to go below the surface of most of the topics.
The example is a C# ASP.NET MVC 5 clean architecture application; it is really easy to follow and nicely explained. I liked this half-slides, half-code approach for a change from other courses. It serves well to see specific implementations and not just theory. The author also mentions quite a few design patterns, some very briefly, others in detail. I liked the "screaming architecture" practice: the organization of each layer of your system screams the intent of the application.
Topics are many but far from deep, so you should take that into account. I'm not saying it is bad, just warning that a 15-minute lecture on microservices won't make you an expert at building them ;)
UX-Driven Software Design consists of 3.5 hours about the methodology that gives the course its name, which is basically a top-down methodology based on emergent design, applying lean/agile practices. We learn about the benefits of starting with mockups, wireframes and prototypes, and why iterating heavily is a better approach than just delivering a version after long development cycles.
And when it gets into the implementation details, it suggests using event sourcing, CQRS and bounded contexts.
You can consider this course a kind of continuation of another one from the same author, Domain Models CQRS and Event Sourcing, although this one is much more focused on examples, advice and high-level rules. It provides many examples and a simple but clean TODO list code walkthrough (of the most relevant parts).
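For readers who have not met event sourcing and CQRS before, here is a minimal sketch of the idea using a TODO list. It is my own illustration in Python, not the course's code, and every name in it (`TodoAdded`, `TodoCompleted`, `EventStore`, `TodoCommands`, `pending_titles`) is hypothetical. The write side only appends events, and the read side rebuilds its view by replaying them.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TodoAdded:
    todo_id: int
    title: str


@dataclass(frozen=True)
class TodoCompleted:
    todo_id: int


@dataclass
class EventStore:
    """Append-only log of events; the single source of truth."""
    events: list = field(default_factory=list)

    def append(self, event):
        self.events.append(event)


class TodoCommands:
    """Write side: validates commands and records what happened as events."""

    def __init__(self, store):
        self.store = store

    def _known_ids(self):
        return {e.todo_id for e in self.store.events if isinstance(e, TodoAdded)}

    def add(self, todo_id, title):
        if todo_id in self._known_ids():
            raise ValueError(f"todo {todo_id} already exists")
        self.store.append(TodoAdded(todo_id, title))

    def complete(self, todo_id):
        if todo_id not in self._known_ids():
            raise ValueError(f"todo {todo_id} does not exist")
        self.store.append(TodoCompleted(todo_id))


def pending_titles(events):
    """Read side: build a query-friendly view by replaying the event log."""
    todos = {}  # dicts keep insertion order, so titles come out in creation order
    for event in events:
        if isinstance(event, TodoAdded):
            todos[event.todo_id] = event.title
        elif isinstance(event, TodoCompleted):
            todos.pop(event.todo_id, None)
    return list(todos.values())
```

Nothing is ever updated in place: completing a task adds a `TodoCompleted` event, and any number of different read models could be derived from the same log.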