SWIM: The scalable membership protocol

In a distributed system we have a group of nodes that need to collaborate and send messages to each other. To achieve that they need to first answer a simple question: Who are my peers?
That’s what membership protocols do. They help each node in this system to maintain a list of nodes that are alive, notifying them when a new node joins the group, when someone intentionally leaves and when a node dies (or at least appears to be dead). SWIM, or Scalable Weakly-consistent Infection-style Process Group Membership Protocol, is one of these protocols.

Raft: Consensus made simple(r)

Consensus is one of the fundamental problems in distributed systems. We want clients to perceive our system as a single coherent unit, but at the same time we don’t want to have a single point of failure. We need to have several machines collaborating in a way that they can all agree on the state of the world, even though a lot of things can go wrong. Nodes can crash, messages can be delivered out of order or not be delivered at all, and different nodes can have a different idea of what the world looks like. Making a distributed system behave like a coherent unit in the face of these failures can be a challenge, and that’s why we sometimes need a consensus algorithm, like Raft, that gives us some guarantees about the properties that we can expect of this system.

TCP Flow Control

TCP is the protocol that guarantees we can have a reliable communication channel over an unreliable network. When we send data from one node to another, packets can be lost, they can arrive out of order, the network can be congested or the receiver node can be overloaded. When we are writing an application, though, we usually don’t need to deal with this complexity; we just write some data to a socket and TCP makes sure the packets are delivered correctly to the receiver node. Another important service that TCP provides is what is called Flow Control. Let’s talk about what that means and how TCP does its magic.

A Primer on Database Replication

Replicating a database can make our applications faster and increase our tolerance to failures, but there are a lot of different options available and each one comes with a price tag. It's hard to make the right choice if we do not understand how the tools we are using work and what guarantees they provide (or, more importantly, do _not_ provide), and that's what I want to explore here.

Exponential Backoff in RabbitMQ

RabbitMQ is a core piece of our event-driven architecture at AlphaSights. It decouples our services from each other and makes it extremely easy for a new application to start consuming the events it needs.

Speaking Rabbit: A look into AMQP's frame structure

RabbitMQ supports several different messaging protocols, but there is no doubt that AMQP (0-9-1) is the one most commonly used (and what RabbitMQ was originally developed for).
It’s AMQP that defines exchanges, queues, bindings and most of the things that you, as an application developer, usually have to work with.

Process registry in Elixir: a practical example

Processes in Elixir (and Erlang, for that matter) are identified with a unique process id, the pid.
That’s what we use to interact with them. We send a message to a pid and the VM takes care of delivering it to the correct process. Sometimes, though, relying on the pid of a process can be problematic.

Understanding Shell Script's idiom: 2>&1

When we work with a programming or scripting language, we constantly use idioms: common solutions to a problem that are always written in a certain way. Shell Script is no different, and a quite common but not so well understood idiom is 2>&1, as in ls foo > /dev/null 2>&1.
Let me explain what is going on here and why this works the way it does.
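
As a rough sketch of the idiom, assuming a POSIX shell and a made-up file name that does not exist, the two commands below differ only in the order of their redirections. The variable names and the file name are placeholders for illustration:

```shell
#!/bin/sh
# Hypothetical file name, assumed not to exist, so ls writes only to stderr.
missing=./no-such-file-for-this-demo

# stdout goes to /dev/null first; 2>&1 then points stderr at the same place.
# Nothing reaches the capturing $(...).
quiet=$(ls "$missing" > /dev/null 2>&1) || true

# Reversed: 2>&1 points stderr at the *current* stdout (the capture pipe),
# and only afterwards is stdout sent to /dev/null. The error is still captured.
noisy=$(ls "$missing" 2>&1 > /dev/null) || true

echo "quiet captured: [$quiet]"
echo "noisy captured: [$noisy]"
```

Redirections are processed left to right, which is why the first form silences both streams and the second does not.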

Getting started with Plug in Elixir

In the Elixir world, Plug is the specification that enables different frameworks to talk to different web servers in the Erlang VM. If you are familiar with Ruby, Plug tries to solve the same problem that Rack does, just with a different approach.
Understanding the basics of how Plug works will make it easier to get up to speed with Phoenix, and probably any other web framework that is created for Elixir.

The actor model in 10 minutes

Our CPUs are not getting any faster. What’s happening is that we now have multiple cores on them. If we want to take advantage of all this hardware we have available now, we need a way to run our code concurrently. Decades of untraceable bugs and developers’ depression have shown that threads are not the way to go. But fear not, there are great alternatives out there and today I want to show you one of them: The actor model.

Rethinking your shebang

Are you using #!/bin/{bash,zsh,sh} in your shebang? Most scripts that I have to deal with are, and it sucks. Let me show you why using #!/usr/bin/env {bash,zsh,sh} is better, most of the time. If you are already using #!/usr/bin/env, learn why you shouldn’t just use it blindly.
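
A minimal sketch of the env form, assuming bash is installed somewhere on the PATH (the `interp` variable is only there for illustration):

```shell
#!/usr/bin/env bash
# env searches PATH for bash, so the script does not care whether bash
# lives in /bin, /usr/bin, /usr/local/bin or elsewhere.
interp=$(command -v bash)
echo "bash found at: $interp"
```

The trade-off is that the interpreter now depends on the caller's PATH, so the same script may run under different bash versions in different environments.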

Stop using tail -f (mostly)

I still see a lot of people using tail -f to monitor files that are changing, mostly log files. If you are one of them, let me show you a better alternative: less +F

Creating a RubyGems plugin

Do you know when you install a gem and it adds a custom command to RubyGems? Then you can just run gem <custom-command> <params> and it does something cool? Well, that is just a RubyGems plugin, and although it’s not very well documented, it’s not that hard to create one.

Understanding Bundler's setup process

If you work with Ruby, chances are that you are using Bundler quite a lot. It’s the de facto solution for dependency management, and it’s hard to find a project without a Gemfile. What is not part of the common knowledge, though, is how it works. More specifically, how does it make your code see just the dependencies that it should see and nothing else? Let’s look into Bundler’s code to find out.

Vim as the poor man's sed

Not long ago I wrote about sed, a powerful non-interactive editor that can be used to edit multiple files in a fairly easy way. Today I want to show how we could use vim’s not so well known ex mode to do some of these same tasks, and what the benefits and shortcomings are.

Vim registers: The basics and beyond

Vim’s registers are the kind of thing you don’t think you need until you learn about them. Then it’s hard to imagine life without them, and they become essential to your workflow. It’s still common for people to use vim for years without knowing how to properly work with registers, which I think is a shame, because just a basic understanding of what they are and how they work can make you a lot more productive (and avoid a couple of annoyances).

Implementing a Priority Queue in Ruby

The other day I had to use a priority queue to solve a problem. It was a Java project, so I already had the PriorityQueue class ready to be used. After the code was done, I started to wonder what a solution in Ruby would look like. And then I discovered that Ruby does not have a priority queue implementation in its standard library. How hard could it be to implement my own?

Enough sed to be useful

sed is a text editor that is probably already installed on your machine and can help you be more productive. It can make the boring and time-consuming task of editing multiple files a breeze, and it shouldn’t take more than a few minutes to learn the basics.
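
To give a flavour of what that looks like, here is a small sketch that rewrites a word in several files. The file names and contents are made up for the demo, and it assumes `mktemp -d` is available:

```shell
#!/bin/sh
# Hypothetical demo files created in a temporary directory.
dir=$(mktemp -d)
printf 'hello world\n' > "$dir/a.txt"
printf 'goodbye world\n' > "$dir/b.txt"

# s/world/planet/g replaces every occurrence of "world" on each line.
# Writing to a temp file and moving it back avoids relying on -i,
# whose syntax differs between GNU and BSD sed.
for f in "$dir"/*.txt; do
  sed 's/world/planet/g' "$f" > "$f.tmp" && mv "$f.tmp" "$f"
done

result_a=$(cat "$dir/a.txt")
result_b=$(cat "$dir/b.txt")
echo "$result_a"
echo "$result_b"
rm -rf "$dir"
```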

Understanding Ruby's idiom: array.map(&:method)

Ruby has some idioms that are used pretty commonly, but not very often understood. array.map(&:method_name) is one of them. We can see it being used everywhere to call a method on every array element, but why does this work? What’s really happening under the hood?

The role of a reverse proxy to protect your application against slow clients

When you are running an application server that uses a forking model, slow clients can make your application simply stop handling new requests. A slow client can be just a user with a slow connection sending a large request, or an attacker being slow on purpose. I’ll try to explain what these slow clients are and how a reverse proxy can be used to protect your application server against them.

Working with HTTP cache

The fastest network request is a request not performed. That’s the job of an HTTP cache: avoid unnecessary work. By understanding how it works, we can create web applications and APIs that are more responsive, by reducing latency and the amount of bandwidth used.

An introduction to UNIX processes

Processes are a very important piece in the UNIX world. Basically, almost every program that you execute is running in a process.
Although you may not need to interact directly with them all the time, you are certainly depending on them to get anything done in a UNIX system.

Designing good APIs - Avoiding the type marshalling trap

Automatically serializing a model object into a data format may be tempting, as most frameworks provide this functionality out of the box, but it can bring more problems than benefits.
