Kartones Blog

Be the change you want to see in this world

Markov Model Python Example

The current LLM wave is fascinating, so I've begun reading about the basics. I have a draft or two of posts with small experiments that replicate tiny pieces of these systems (text processing is a topic that, I'm not sure why, sparks my curiosity), but I remembered that not long ago I had written a simple Markov model: a Markov chain that generates variants of the sentences found in a text.

It reads a .txt file line by line, assuming each line is a sentence, and fills a Python dictionary with the "chain" of words that forms each sentence (plus the start and end of sentence delimiters).

Using a few sentences from my sample markov_quotes.txt file:

Be the change you want to see in this world
Be the person your dog thinks you are
Everything a person can imagine, others will do

When it finishes reading, it knows it can begin a sentence with "Be" or "Everything". If it (randomly) chooses "Be", then it has only seen the word "the" after it, so that word must follow; but the third word can be either "change" or "person". If it chooses "person", then the next word can be either "your" or "can"; and so it goes until it either picks an end-of-sentence delimiter or reaches the maximum number of words per sentence we've set up.

It could generate the sentence "Be the person can imagine, others will do .". It is incorrect, but the bigger the input text you feed it, the greater the variety and the chance of generating something that makes more sense.
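The process described above can be sketched in a few lines of Python. This is a minimal illustration and not the code from my GitHub repository: the `START` and `END` delimiter values and the function names `build_chain` and `generate_sentence` are made up for this example, and it reads sentences from a list instead of a .txt file.

```python
import random
from collections import defaultdict

# Hypothetical delimiter values, chosen only for this sketch
START = "<START>"
END = "<END>"


def build_chain(sentences):
    """Map each word to the list of words seen right after it."""
    chain = defaultdict(list)
    for sentence in sentences:
        words = [START] + sentence.split() + [END]
        for current_word, next_word in zip(words, words[1:]):
            chain[current_word].append(next_word)
    return chain


def generate_sentence(chain, max_words=20, rng=random):
    """Walk the chain from START until END is picked or max_words is reached."""
    words = []
    current_word = START
    while len(words) < max_words:
        current_word = rng.choice(chain[current_word])
        if current_word == END:
            break
        words.append(current_word)
    return " ".join(words)


quotes = [
    "Be the change you want to see in this world",
    "Be the person your dog thinks you are",
    "Everything a person can imagine, others will do",
]
chain = build_chain(quotes)
print(generate_sentence(chain))
```

Repeated followers are kept in the lists on purpose, so a word seen more often after another one is also more likely to be picked.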

As an example, running it a few times against the quotes file sometimes produces funny philosopher quotes:

Experience is easy .

Don’t teach them like a professional is right not improving .

Be the worst .

He who seek the things will never have to control complexity not a shorter letter .

Spend your own happiness and go home .

Boy Scout Rule Always leave the happiness and go home .

Learn the best and distribute the rules like a priority .

Work hard and practice something you’ve never have written a marvellous thing that you don't feel like doing them to ...

He who thinks you want to think .

Try to create it wrong because nobody sees it again .

I've also included a transcript of a TED talk, which again mostly generates gibberish, but at times almost looks correct.

Not precisely groundbreaking, but fun and illustrative of some basic "brute-forcing" method of creating new text.

You can find the Python code on my GitHub.


Browser Automation via Chromium

Browser automation has advanced a lot, not only regarding the frameworks and tools but also in the most fundamental piece: the browser itself. Google Chrome is now very mature, has the biggest market share (as of mid-2023), and complies with all web standards, so it is an excellent starting point for automation projects.

In this post, I'll mention the most relevant pieces you need to set it up.


Using Google's Chromium instead of the main Chrome has two main advantages:

  • Some of the Google-specific features are removed, and any Google APIs require API keys to function, so they will be disabled by default
  • Unlike with Chrome, it is easy to get previous builds, so you're not forced to always test only with the latest version

But otherwise, it is the same browser.

There is a handy latest build link to download Chromium: https://download-chromium.appspot.com

Be aware that those builds, under Linux, come without the Widevine (DRM) compilation flag, so even if you follow the steps below, it won't work with protected content.

The ungoogled-chromium-binaries GitHub project provides Linux binaries compiled with the DRM flag. From the releases page, it is easy to pick either the latest version or a specific one:

https://github.com/clickot/ungoogled-chromium-binaries/releases/download/112.0.5615.165-1/ungoogled-chromium_112.0.5615.165-1.1_linux.tar.xz

An alternative site that hosts binaries for all platforms compiled with the DRM flag is: https://chromium.woolyss.com/

ChromeDriver is another critical piece, alongside an automation framework like WebDriverIO. It is easy to automate fetching a certain version via their download URLs:

https://storage.googleapis.com/chromium-browser-snapshots/Linux_x64/1109220/chromedriver_linux64.zip

As mentioned before, Chromium might come without the DRM library, Widevine. You can fetch specific versions via URLs like the following:

https://dl.google.com/widevine-cdm/4.10.2557.0-linux-x64.zip
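As a rough sketch of how fetching specific versions could be automated, the following Python helpers build both kinds of URL. The URL patterns are simply inferred from the two example links above (Linux x64 only), so treat them as an assumption that may not hold for every revision or version; the function names and the `download` helper are made up for this example.

```python
import shutil
import urllib.request

# Base URL taken from the ChromeDriver example link above
SNAPSHOTS_BASE = "https://storage.googleapis.com/chromium-browser-snapshots"


def chromedriver_url(revision):
    """Build the ChromeDriver URL for a Chromium snapshot revision (Linux x64)."""
    return f"{SNAPSHOTS_BASE}/Linux_x64/{revision}/chromedriver_linux64.zip"


def widevine_url(version):
    """Build the Widevine CDM URL for a given version (Linux x64)."""
    return f"https://dl.google.com/widevine-cdm/{version}-linux-x64.zip"


def download(url, destination):
    """Save the file at `url` to the local path `destination`."""
    with urllib.request.urlopen(url) as response, open(destination, "wb") as target:
        shutil.copyfileobj(response, target)


print(chromedriver_url(1109220))
print(widevine_url("4.10.2557.0"))
# Example usage (performs a real network request):
# download(chromedriver_url(1109220), "chromedriver_linux64.zip")
```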

To set it up, follow the instructions provided at the chromium-widevine GitHub project, which consist of extracting the files into a certain subfolder structure inside Chromium's main folder.

Another URL you'll use a lot when setting up Chromium automation is https://peter.sh/experiments/chromium-command-line-switches/, because it contains a complete list of the hundreds of command-line arguments/flags/switches. There is no official documentation, so this is really valuable.

For debugging errors thrown by the browser, you probably want to use the flags --enable-logging=stderr --v=1.

Suppose you plan to run automated browsers in a Linux environment without display (like a Docker container, or a CI instance installed without the X Server). In that case, you will probably want to use XVFB (and xvfb-run).
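To illustrate how the debugging flags and xvfb-run fit together, here is a small Python sketch that builds the command line. The function name, its parameters, and the `./chrome-linux/chrome` path are assumptions for this example (adjust the path to wherever you extracted Chromium); `--auto-servernum` asks xvfb-run to pick a free display number.

```python
import subprocess


def build_chromium_command(chromium_path, url, debug=False, use_xvfb=False):
    """Return the argument list to launch Chromium, optionally under xvfb-run."""
    command = [chromium_path]
    if debug:
        # Flags mentioned above for debugging errors thrown by the browser
        command += ["--enable-logging=stderr", "--v=1"]
    command.append(url)
    if use_xvfb:
        # Wrap the whole command so it runs inside a virtual X display
        command = ["xvfb-run", "--auto-servernum"] + command
    return command


command = build_chromium_command(
    "./chrome-linux/chrome",  # hypothetical location of the Chromium binary
    "https://example.com",
    debug=True,
    use_xvfb=True,
)
print(" ".join(command))
# To actually launch the browser:
# subprocess.run(command, check=True)
```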

Finally, if you are really brave, and have some spare time, you can manually download and compile Chromium from the source code, but it is time-consuming.

UPDATE #1: Added a friend's suggestion of another page containing binaries for all platforms (including Linux with the DRM flag).


Book Review: Pro Git

I've mentioned at least once my opinion that I would have preferred Mercurial to have won the distributed version control systems race, because its commands were way more consistent and easy. But as of today, git has come a long way, is a very powerful tool, and has won the battle. So I fully embraced git and have been trying to level up lately.

I've maintained a git cheatsheet for quite some time, and while it still does not cover everything (and won't), I've added a bunch of new content after reading the book.

Review


Title: Pro Git

Author(s): Scott Chacon, Ben Straub

If I had to summarize this book quickly, I'd say: If you use git, you must read it.

I've read dozens of articles and tutorials with varying difficulty levels (the hardest being at times git's own documentation). Since the first chapter, I found the explanations excellent. Everything is nicely explained, accompanied by examples, and any time the topic at hand might be non-trivial to understand, the authors will also include helpful diagrams showing branches, commits, or whatever is needed.

Need to learn about the different states a file can be in (untracked, staged, committed, ...)? Check; need to learn complex strategies to bring commits from some branches to others when all of them had changes? Check; want to know how git stores commit references and even learn how to do low-level operations and other hardcore stuff? Check. To provide some context, the book is heavily based on git, and GitHub is barely mentioned here and there, so you will learn to do things in a generic but proper way and then lean on services such as GitHub or GitLab to maintain your remote repositories, user accounts and the like. But if you want, the book teaches you how to set up your own git servers (and even how the different available communication protocols work).

Over the ~520 pages, there's so much content, sometimes in so much detail, that I skipped most of the server management content. But now I know that it is explained there too, and if I need to, I can go back and check how to manage user credentials and push/pull repository permissions. I recommend picking, at minimum, all the general chapters (which can be around 50% of the book).

I wish I had read the book earlier because now I know how git works internally, which helps me better understand any merge issue, any colleague asking "how do I xxxxx?", and how best to work with the tool.

Minor update: I just remembered to mention another remarkable feature of the book. It is freely available for download. So there's no excuse not to give it a try.


Note-Taking and Knowledge Base

One way that works for me to retain information better is to write it down. Digitally if possible, as I never had great handwriting, so computers were a nice level up 😅. Because of it, I've already done multiple iterations of changing my note-taking and knowledge-base information storage: I began using simple text files, then Microsoft Word documents, then further explorations of wiki software; and, with the advent of smartphones, a myriad of note-taking applications like Evernote and Google Docs.

Note-Taking

Local-only content is not a problem, but I now stay away from binary formats (including databases) for the content. On the other hand, if I use a mobile application for note-taking, I want it to be swift, even if it only keeps local notes; I don't care about having any fancy features, but if it stores content online, it should be quick.

Leaving aside "PC solutions" for a moment to focus on mobile apps, I tried a few apps for months each before jumping to the next one, but the main highlights are: Evernote -> Google Docs -> Dropbox Paper -> Notion -> Markor

Evernote got too big and slow, so it died for me. GDocs is excellent on the desktop, but I was never a fan on mobile. I've used Dropbox Paper for a long time, because on desktop it works excellently, but on mobile the synchronization was sometimes noticeably slow (and apparently dependent on the number of documents you have, even if others didn't change). And Notion, yesterday's new kid on the block, is also slow on mobile, probably because of the many features provided that I don't need. So I settled on using an offline markdown-based text editor, Markor. It's by far the fastest app (both booting up and regarding responsiveness) I've ever seen on Android, and I'm okay with checking notes when I arrive home and manually syncing them.

Knowledge Base

And what about the knowledge base part? After years of using Dropbox Paper for it (~ 60 documents total), what made me change away from it was mainly a "have you tested your backups?" situation 🤨.

I have been strictly exporting all Dropbox Paper documents monthly in markdown format. Once or twice I sampled one or two of them, and everything looked fine. But lately, I've started to peek at "my archives" to see some things that I could convert into blog posts/pages to share (e.g., the list of Linux command line tools I use), so I picked up a few exports, opened them in the IDE to preview the rendered markdown... and found that the formatting was far from good.

Markdown is very simple: you can't change font sizes (outside of using headers and sub-headers), line and paragraph spacing define aspects such as whether a sentence is part of a paragraph or goes on its own, bullet lists are simple bullet lists (unlike MS Word's chaos of indentations and icons), and code blocks should be code blocks, right? Well, it seems it is not that simple, because I got sentences weirdly spaced (or grouped), almost no multi-line code blocks (a few converted to single-line equivalents), and nested lists sometimes got wonky. All of this is visible in the source and when rendering the export vs. the current Paper document side-by-side.

And thus, a new quest for an alternative began, and in the end, I found what I was looking for in Obsidian. I'm not going to do a sales pitch of what it offers (other than offline-first markdown file editing). Instead, I'll explain what, how, and why I'm using some of its features after spending quite some time fixing in anger all the formatting issues in my existing files:

  • My primary use is editing markdown files; a nice "export as html" plugin also gives me the choice of reading them without the app
  • I am not using the Canvas sub-product as of now for my existing KB; It is helpful for brainstorming and topic exploration, though
  • I am still not proficient with how the links and #-links seem to work
  • The read/edit view toggle is quick and immensely useful (it keeps the cursor where you were when you toggle)
  • Opening the document outline's right sidebar is crucial (it should be visible by default!)
  • I am not using it right now, but having the option of online sync (paid feature), plus mobile apps, makes it a complete solution if I ever want to use it for note-taking
  • There are a bunch of exciting plugins I want to check

Fixing the markdown formatting has taken most of my effort so far with this approach. From now on, I expect to enjoy more of its features, as I can begin splitting big documents into smaller ones and linking/referencing them.


Classes in Javascript - 2023 edition

MDN is my go-to reference for anything web-related, even before doing a web search. But at times, it is structured too much as a reference guide, spreading every object property, every feature, every method over single pages... which makes it harder to get a general glimpse of a specific topic.

I wanted a quick refresher on what you can and cannot do with classes in Javascript as of early 2023. I knew about class constructors and inheritance, and recalled something about getters and setters, but, for example, I didn't know about private scoping or the existence of static (I was going to go for the classic prototype modification). After reading everything about JS classes, I've written this small example covering almost everything on the topic:

// definition
class Movable {
  // public properties (can have default value)
  x;
  y;

  // constructor
  constructor(x = 0, y = 0) {
    this.x = x;
    this.y = y;
  }

  // getter
  get position() {
    return [this.x, this.y];
  }

  // normal method
  move(x, y) {
    this.x = x;
    this.y = y;
  }
}

// inheritance
class Robot extends Movable {
  // private property + default value
  #movements = 0;
  // static property (can also be private) + default value
  static CAN_MOVE = true;

  // class constructor
  constructor(x = 0, y = 0) {
    // invoking parent constructor
    super(x, y);
    // not needed because of default value, but could do:
    // this.#movements = 0;
  }

  // static initialization block (runs once, when the class is evaluated)
  // static {}

  // getter
  get movements() {
    return this.#movements;
  }

  // setter
  set movements(value) {
    this.#movements = value;
  }

  // prepend `async` to methods where applicable

  // normal method
  move(x, y) {
    // invoking parent method
    super.move(x, y);
    this.#movements++;
  }

  countAllMovements() {
    // invoking private method
    for (const count of this.#countMovements()) {
      console.log(`Movement #${count + 1}`);
    }
  }

  // static method
  static canMove() {
    // static properties must be used via class
    return Robot.CAN_MOVE;
  }

  // generator method + private method
  *#countMovements() {
    let counter = 0;
    while (counter < this.#movements) {
      yield counter++;
    }
  }
}

// instantiation
const myRobot = new Robot(5, 5);

// static methods must be invoked via class
console.log(Robot.canMove());

myRobot.move(7, 7);
myRobot.move(9, 9);
myRobot.move(10, 7);
console.log(myRobot.position, myRobot.movements);
myRobot.countAllMovements();
myRobot.movements = 0;
console.log(myRobot.movements);
