Kartones Blog

Be the change you wanna see in this world

Course Review: GCP Cloud Architect (Udemy)

I've been exploring the online video course "learning path" for a year and a half, and this is the third course I've finished. Two are from Udemy (paid but cheap) and one from Coursera (free). As at work we're migrating some services from AWS to GCP, I thought it would be a nice idea to go with Udemy again, based on how good the other AWS course was.

The course, Google Cloud Platform - Cloud Architect ("In depth coverage of all Services, 200+ Practice questions & 4 Case Studies Design"), sounded quite nice, covering services like Compute, Storage & Database, Networking, VPN & VPC, IAM, Security, Management Tools, and intros to ML and BigData. And while yes, it covers all those topics, it is more like an introduction to them.

Sadly, I'll dedicate the remainder of the post to why you shouldn't bother buying that course even if it has some huge discount, so let's get to it:

There is absolutely no cadence when speaking; it is a mere reading of either the slides or the slide notes (on those not so frequent occasions when there's more content than what you see). Add to that typos and mistakes in the slides, plus errors when reading them, and you get a fatal combo. It was so terrible that I studied more than half of the course without sound, pausing and advancing "the slides" myself. Another issue is that many times the author is either not checking his "reading position" against the visual content or doesn't care, so after he reads a big bullet list like a robot, if you're not fast at stopping the video you will barely get to read the final line.

Other issues are terrible speaking errors and pauses, which should have meant re-recording the segment but were instead left in. Or mistakes like leaving MS PowerPoint presentation mode and not bothering to turn it back on until a while later, like here: screenshot

Another of my top annoyances is the mouse pointer. In each and every video you see the little annoying pointer, sometimes used to highlight things but many times just left anywhere, usually covering letters and diagrams and most of the time distracting the viewer.

Everything looks like it comes from a teacher/evangelist kit or directly from Google's web documentation, which I'm not against, but combined with all the previous defects it leaves the course as a whole lacking any quality. If we had simply been given all the slides, it would have been more productive and useful. And the few demos I tried to watch and follow were so terribly prepared (full of long pauses, mistakes, etc.) that I skipped all of them.

Now don't get me wrong about one aspect: the author's English is not too bad, and he could be understood way better if he had made the effort to stop reading the notes and focus on real explanations of the contents shown.

I'm a stubborn guy so I went through all the non-demo videos, because you always learn something and I was past refund date anyway.

It might be a nice introduction and less hardcore than digesting all of Google's GCP online documentation at once, but I really doubt it.

Note: I also left a 0.5 / 5 rating of the course inside the learning platform, alongside a detailed explanation of why so low.


Information regarding GDPR (General Data Protection Regulation)

One topic that is getting more and more attention lately is the GDPR, which stands for General Data Protection Regulation: a new regulation that will start to be fully enforced on May 25th, 2018, and that finally provides many pretty good user-related rules and limitations. For once, and although not everything is clear or properly detailed, overall it is something that benefits everyone using the internet. Even companies benefit, although the areas where it helps them (performance, security, data protection) are not as interesting to them as user tracking, retargeting and other marketing and data-related areas that must change radically.

For a decade, companies have been harvesting more and more data without our consent, so in theory, in less than two months there will be no more automatic opt-out consents, no more dozens of trackers without at least informing you, no more Delaware-based international companies not complying with EU laws and no more tricks to keep you from deleting your accounts. Or at least that's the theory; we'll see how it turns out.

Anyway, this regulation also means that most tech companies are going to be busy these two months adapting to the new laws. At work we've already started to prepare everything, and the first thing I noticed is that there are many posts but relevant, quality info is not so easy to come by, so I decided to write this small blog post and gather what we've found interesting.

First, the most important link of all, the regulation itself: https://gdpr-info.eu/art-16-gdpr/

Reading all of it can be a bit daunting at first, but the site contains a handy search box that lets you easily find detailed explanations inside the 173 "recitals" and, of course, in the main regulation itself. Instead, I recommend that you start by going through the following links from Bozhidar Bozhanov, which provide an interesting and practical digest of the regulation in general and of cookies:

If you read it, you'll see that there are hundreds of mentions of "personal data", but what does that term really cover? This post is a good explanation.

Another excellent summary guide is Stripe's.

Also, while checking how it affects Google Analytics I came upon this post, which covers very important topics regarding Google Analytics, Google Tag Manager and IP anonymization, among other things you should now take care of, like never sending GA any URLs containing personally identifiable data (emails and the like).
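As an illustration of the "no personal data in URLs" advice, here is a minimal sketch of scrubbing email addresses from a URL before reporting it anywhere. It is only an assumption of how one could do it server-side: the function name `redact_emails`, the placeholder text and the deliberately simple regular expression are all made up for this example, and real PII detection needs more than emails.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Deliberately simple pattern (an assumption, not a full email validator).
EMAIL_RE = re.compile(r"[^@\s/?&=]+@[^@\s/?&=]+\.[^@\s/?&=]+")


def redact_emails(url: str, placeholder: str = "REDACTED") -> str:
    """Return the URL with email-like values in the path and query replaced."""
    parts = urlsplit(url)
    path = EMAIL_RE.sub(placeholder, parts.path)
    # parse_qsl decodes percent-encoding, so "jane%40example.com" is caught too.
    query = urlencode(
        [
            (key, EMAIL_RE.sub(placeholder, value))
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
        ]
    )
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))
```

The idea would be to pass every page URL through something like this before it reaches the analytics call, so the raw address never leaves your service.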

If you speak or at least read Spanish, the two following links contain all GDPR info translated and in PDF form:

One topic that caused some discussion at the office was whether, regarding the "consent checkboxes", you could just make all of them mandatory or otherwise not allow people to use your service. According to recitals 42 and 43:

Consent should not be regarded as freely given if the data subject has no genuine or free choice or is unable to refuse or withdraw consent without detriment.

Consent is presumed not to be freely given if it does not allow separate consent to be given to different personal data processing operations despite it being appropriate in the individual case, or if the performance of a contract, including the provision of a service, is dependent on the consent despite such consent not being necessary for such performance.

So, if I interpreted them correctly, you cannot make it an all-or-nothing choice unless the consent is really critical for the service to work. This means you must provide a way to use the service without being tracked by third parties and the like.

As I mentioned before, let's see how all this gets implemented, but at the very least we'll now be able to own our data a bit more, and also request data exports from any service, exercise the "right to be forgotten" (data deletion) or ask for "processing restriction" (in theory, you allow the service to keep your data but they're forbidden to use it for anything other than basic functionality).

Update: Added Stripe's guide link.


Book Review: The One Thing

Review


Title: The One Thing: The Surprisingly Simple Truth Behind Extraordinary Results

Author: Gary W. Keller, Jay Papasan

I am a bit wary of self-help books, as I usually feel I either have to fully trust them or I won't get anything of use (and some are mere money grabbers). Thankfully, this title is not one of them. It is "only" about focusing your energies, work efforts and even lifestyle on one thing at a time, plus creating a plan to achieve "the one thing" that really drives you, and then proceeding towards that ultimate goal by applying some efficiency techniques.

At around 220 pages, it contains some really interesting tips, advice and tactics, but also lots of bloat: small stories which are sometimes given a (at least for me) fragile interpretation to fit that "one thing seeking", and sometimes multiple paragraphs going in circles around a single concept. It is not bad, and maybe it is just that I prefer more condensed information, but my general feeling is that the book could be half the size.

That said, you might fully or only partially agree with the topics presented, but I think there are some good ways of increasing your focus to achieve more.

Notes

  • No one succeeds alone. No one
  • "Where I'd had huge success, I had narrowed my concentration to one thing, and where my success varied, my focus had too"
  • Extraordinary results are directly determined by how narrow you can make your focus
  • Success is built sequentially. It's one thing at a time
  • "Things which matter most must never be at the mercy of things which matter least" -Johann Wolfgang von Goethe
  • Equality is a lie
  • When everything feels urgent and important, everything seems equal
  • The 80/20 Principle (aka Pareto's Principle)
  • Prioritize a to-do list so it becomes a success list. Also, go small, go extreme and say no/discard
  • Multitasking is a lie
  • Disciplined life is a lie. We don't need more discipline, but to direct and manage it better
  • Success is about doing the right thing, not about doing everything right
  • Once a new behaviour becomes a habit, it takes less discipline to maintain
  • It takes an average of 66 days to acquire a new habit
  • The act of living a full life by giving time to what matters is a balancing act
  • In your personal life go short and avoid long periods where you're out of balance. [...] Nothing gets left behind
  • In your professional life go long and embrace long periods out of balance. [...] is required [to remove lesser priorities]
  • Life is the art of progressing
  • Think big - Act big - Succeed big
  • The quality of an answer is determined by the quality of the question
  • Asking questions improves learning and performance by as much as 150%
  • Sometimes questions are more important than answers -Nancy Willard
  • Our purpose sets our priority and our priority determines the productivity our actions produce
  • Financially wealthy people are those who have enough money coming in without having to work to finance their purpose in life
  • Planning is bringing the future into the present so that you can do something about it now -Alan Lakein
  • Purpose without priority is powerless
  • Hyperbolic discounting: the farther away a reward is in the future, the smaller the intermediate motivation to achieve it
  • Goal setting to the now: someday goal -> five-year goal -> one-year goal -> monthly goal -> weekly goal -> daily goal -> right now
  • Getting the most out of what you do, when what you do matters
  • Time blocking: Way of making sure that what has to be done gets done
  • Resting is as important as working
  • Recommendation: block four hours a day
  • Be a "maker" in the morning and a "manager" in the afternoon
  • When stuff pops out in your head, write it down on a task list and get back to what you're supposed to be doing
  • See mastery as a path you go down instead of a destination you arrive at
  • 10,000-hour rule: It takes 10k hours to achieve mastery at something
  • A different result requires doing something different
  • Absorb setbacks and keep going. [...] persevere through problems and keep pushing forward
  • Circumstances won't change by themselves
  • The four thieves of productivity:
      • Inability to say "no"
      • Fear of chaos
      • Poor health habits
      • Environment [that] doesn't support your goals
  • When you say yes to something, it's imperative that you understand what you're saying no to
  • You can't please everyone, so don't try
  • The art of being wise is the art of knowing what to overlook -William James
  • High achievement and extraordinary results require big energy
  • Spend the early hours energizing yourself
  • Surround yourself only with people who are going to lift you higher -Oprah Winfrey
  • To get through the hardest journey we need take only one step at a time, but we must keep on stepping -Chinese proverb
  • Twenty years from now you will be more disappointed by the things that you didn't do than by the ones you did do -Mark Twain
  • A life worth living might be measured in many ways, but the one way that stands above all others is living a life of no regrets
  • Life is too short to pile up woulda, coulda, shouldas

Talk: Python static typing with MyPy


Just a quick post to mention that this week I gave a talk at the Python-Madrid local meetup, and the topic I chose was Python type hints and static typing using MyPy.

Although I explained most slides in more detail, I tried to make them so that they carry some understandable content even without listening to me. If anybody is interested in reading them, the direct link is https://slides.kartones.net/027.html (as usual, in my slides.kartones.net section).

I'd like to give the talk again and I even sent it as a proposal for a London Python-related event, so fingers crossed :)

Update: This blog post about gradual programming explains really well my point about why Python type hints are interesting as a form of "gradual typing". I highly recommend reading it.
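To give a rough idea of what "gradual" means here, the following sketch (not taken from the slides; all function names are invented for illustration) mixes an unannotated function with annotated ones in the same module. By default mypy does not check the bodies of unannotated functions, so you can add hints one function at a time.

```python
from typing import Dict, List, Optional


def total(prices):
    # No annotations: by default mypy does not type check this body.
    return sum(prices)


def total_typed(prices: List[float]) -> float:
    # Annotated: a static checker can now validate callers and the return type.
    return sum(prices)


def find_discount(code: str, discounts: Dict[str, float]) -> Optional[float]:
    # Optional[float] makes the "no discount found" case explicit for callers.
    return discounts.get(code)
```

With the hints in place, a call such as total_typed(["1", "2"]) is something a checker like mypy should flag before the code runs, while the same mistake passed to total() would only show up at runtime.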


Rate limits with Python

Rate limiting is something that most projects get as a feature late, but the earlier it comes the better for everyone. Any non-trivial service you use will have rate limits, whether it is a soft limit ("your XXXXX service can only handle Y operations per second on your current pricing tier") or a hard limit ("no table in the database can have more than 12k columns"), and this is good, because unbounded resources are points of failure. Without restrictions on an HTTP API, you're not only allowing abusive clients to DoS the platform, you're also risking that any internal developer mistake or any big process (like a batch update or a yearly report) takes it down.

So basically we can agree that every system should have resource limits. There are many different ways to put them in place, but commonly either the software you build (e.g. Python services) uses some existing component(s) or implements its own, or you use features of the web server to limit certain types of actions based on some criteria.

Recently at work we wanted to build an internal REST API that would perform small tasks, similar to Google Cloud Tasks (where you queue a task and when dequeued it calls an HTTPS endpoint and you're the one in charge of executing the task on one of the instances behind that hopefully load-balanced URL). To simplify the scenario, let's say we wanted to perform lots of jobs that individually should execute quickly but if massively batched could hurt a database. The best way to avoid problems is making it hard for them to happen, so I wanted to put a limit on the number of requests that the endpoint can receive, so the resources "breathe" enough to never reach too high values.

A good summary of choices regarding rate limit algorithms can be found in the following article: https://konghq.com/blog/how-to-design-a-scalable-rate-limiting-algorithm/

For example, it didn't matter much to us whether we implemented a fixed or a sliding window algorithm, as we don't need that much precision, but one aspect was important: it had to be distributed. The hosts are load balanced, and sometimes there are few instances but other times there are around a dozen, so allowing 2 tasks per second per instance consumes more resources with 12 instances than with 3 and could cause system instability. We preferred to be more accurate with load/usage predictions, so that ruled out alternatives like implementing rate limiting at Nginx.
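To make the arithmetic concrete, here is a tiny sketch (the function name is made up for illustration) of what a per-instance limit adds up to as the number of load-balanced instances changes, using the 2 tasks per second figure from above.

```python
def aggregate_rate(per_instance_limit: int, instances: int) -> int:
    """Total requests per second the backend can receive when each
    instance enforces its own, non-shared limit."""
    return per_instance_limit * instances


# Same per-instance setting, very different load on the shared database.
for instances in (3, 12):
    print(instances, "instances ->", aggregate_rate(2, instances), "tasks/second")
```

With a distributed limit the total stays at whatever value you configure, no matter how many instances are running at that moment.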

Checking Python libraries, the main requirements were for the chosen one to be distributed and easy to use (a decorator being the best option). After some digging, the winner was django-rate-limit, which offered:

  • very easy setup as a Django 1.11 middleware
  • a fixed window distributed rate limit (using Redis for the distributed storage)
  • a simple yet configurable decorator to mark HTTP endpoints at the Django views. As an added bonus it automatically returns 429 HTTP responses when the rate is exceeded, so no manual handling of exceptions!
  • request-path rate-limit key, which while not perfect (no way to rate limit by IP or other custom mechanism like user_id or cookie), was good enough as a starting point; other keys could be implemented in the future without much effort

The library implements one of the two official Redis recommendations for building a rate limiter pattern with INCR, so it was good enough for us, with race conditions small enough to not pose an issue.
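For reference, this is roughly what a fixed window counter built on an INCR-like operation looks like. It is a minimal sketch and not django-rate-limit's actual code: `InMemoryCounterStore` is a hypothetical stand-in for Redis so the example runs on its own, and `allow_request` plus the key format are invented names. With a real Redis the two store calls would map to the INCR and EXPIRE commands.

```python
import time
from typing import Dict, Optional, Tuple


class InMemoryCounterStore:
    """Hypothetical stand-in for Redis with INCR-like and EXPIRE-like calls."""

    def __init__(self) -> None:
        # key -> (count, expiry timestamp or None)
        self._data: Dict[str, Tuple[int, Optional[float]]] = {}

    def incr(self, key: str, now: float) -> int:
        count, expires_at = self._data.get(key, (0, None))
        if expires_at is not None and now >= expires_at:
            count, expires_at = 0, None  # key expired, start over
        count += 1
        self._data[key] = (count, expires_at)
        return count

    def expire(self, key: str, seconds: float, now: float) -> None:
        count, _ = self._data[key]
        self._data[key] = (count, now + seconds)


def allow_request(
    store: InMemoryCounterStore,
    path: str,
    limit: int,
    window_seconds: int,
    now: Optional[float] = None,
) -> bool:
    """Fixed window check keyed by request path: True if under the limit."""
    now = time.time() if now is None else now
    window = int(now // window_seconds)  # all requests in a window share a key
    key = f"ratelimit:{path}:{window}"
    count = store.incr(key, now)
    if count == 1:
        # First hit of the window: let the counter clean itself up later.
        # Against a real store, incrementing and setting the expiry are two
        # separate steps, which is where the small race window comes from.
        store.expire(key, window_seconds, now)
    return count <= limit
```

A view would call something like allow_request(store, request_path, limit=2, window_seconds=1) and answer with a 429 response whenever it returns False.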


We already have it working, and I even wrote some quick Django tests and confirmed everything works as expected.

As a fun fact, since the library requires Python 3, this was the main reason I decided to try migrating ticketea.com to Python 3.

