Pandas basics

Pandas is a powerful Python library for working with data. Its popularity is explained not only by good performance but also by its ergonomics. Many operations that you can perform on Pandas data can be expressed with standard Python operators. Unfortunately, this also means that Pandas feels almost like a mini-language embedded in Python. Learning this language systematically is difficult, especially since the Pandas documentation, while extensive, is in my opinion designed more as a reference than as a learning resource. ...

February 12, 2024

Calculating TOTP with bash

A colleague of mine has just posted a shell script for calculating TOTP secrets. From my POV, using a Python virtual env is overkill, and I was pretty sure that you could do it with bash and a few extra utilities as well. The bash script would probably even be faster (albeit not necessarily more secure, unless you’d gotten all the quotes right). There’s a “proper” shell implementation by Rich Felker on GitHub, which I didn’t read before implementing a non-POSIX bash solution. ...

January 8, 2024

Generating Clojure Native Images with GraalVM and clj-nix

I’ve been building a little project with Clojure to automatically switch my IKEA Trådfri lights when I start playing a video and restore them when I pause. I’m lazy, I know. I used Clojure because it’s fun, and I like to use fun languages when not writing code for money. Also, there’s a good CoAPS library available for Java, which is a big plus, as many other languages don’t have support for CoAPS, which is CoAP + TLS. ...

September 5, 2023

Demystifying Text Generators

In this article, I’ll try and demystify text generation via neural networks by explaining how the technology works in very basic terms. Hopefully though, my explanation will be complete enough to give the reader an understanding that’s good enough to critically examine the hype around text generators and correct some misunderstandings I’ve seen around me. There’s also some terminology in there that will probably help you understand discussions on text generation (and also make you sound smarter☺). ...

May 5, 2023

Analyzing multi-gigabyte JSON files locally

I’ve had the pleasure of having had to analyse multi-gigabyte JSON dumps in a project context recently. JSON itself is actually a rather pleasant format to consume, as it’s human-readable and there is a lot of tooling available for it. JQ allows expressing sophisticated processing steps in a single command line, and Jupyter with Python and Pandas allow easy interactive analysis to quickly find what you’re looking for. However, with multi-gigabyte files, analysis becomes quite a lot more difficult. Running a single jq command will take a long time. When you’re iteratively building jq commands as I do, you’ll quickly grow tired of having to wait about a minute for your command to succeed, only to find out that it didn’t in fact return what you were looking for. Interactive analysis is similar. Reading all 20 gigabytes of JSON will take a fair amount of time. You might find out that the data doesn’t fit into RAM (which may well happen, since JSON is a human-readable format after all), or end up having to restart your Python kernel, which means you’ll have to endure the loading time again. ...

March 9, 2023

Recoding mixed-encoding text files

Some of my colleagues are regular users of Clojure, a JVM-based Lisp. I do like Lisp-based languages. I’ve attempted last year’s Advent of Code in Racket, and it was a great experience (although I am apparently not as great a coder as I thought I was 😢). Anyway. JVM-based languages are normally not great for scripting, on account of the longish startup time of the virtual machine. The aforementioned colleagues have sung the praises of Babashka, which is a version of Clojure suitable for scripting “where you would use bash otherwise”. ...

August 25, 2022

Solvespace mini tutorial

3D modelling can be fun for making purely virtual models of things. With the advent of 3D printing, however, you can now model things that don’t exist yet, and get them delivered to your home for a modest fee (or even print them yourself, if you happen to own a 3D printer). However, 3D CAD tools are often complicated, expensive or both. I’ve found and grown to like Solvespace. It’s a parametric CAD program that looks like old DOS software. But a) that’s the way I like it and b) it’s pretty well documented. I wouldn’t want to model a car with this, but for simple geometric shapes, it’s perfectly serviceable. ...

April 6, 2022

Accessing the IP management interface on a Supermicro board

I’ve recently built a new storage server (post coming up). This being the first time I’ve had access to a server board, I had some problems accessing the IP management interface that is provided by Supermicro mainboards. This IP management interface (IPMI for short) allows you to do all sorts of nifty things, like powering the server on or off remotely, or configuring the BIOS without having to connect a screen and keyboard. There’s some old information on the internet, and I also wanted to write down the detailed instructions, so others don’t have to go to the same trouble as I did. ...

March 8, 2022

Updating the BIOS on a Supermicro board

I’ve recently bought hardware for a new NAS server build. Some of that is still in the throes of shipping (postal services have been pretty unreliable as of late), but most of it has arrived: A Supermicro mainboard, case, M.2 SSD and a PSU allow me to get started on the build. As I like to live dangerously, I also upgraded the BIOS on the mainboard. Since the process was somewhat involved, I’m documenting it here. Also note: As alluded to by the subtitle, you could always buy a license from Supermicro to install a BIOS update via the out-of-band management engine. I didn’t want to spend another 30 bucks on a license, so I did it the cheap/hard way. ...

February 8, 2022

Adventures in ELFland

I’ve been trying to get WireGuard to run on my oldish VServer. The kernel is either too old, or the provider hasn’t compiled it with the necessary modules. wireguard-go seemed like a solution for this problem. However, I did not want to install a Go toolchain on the server, so I built the program locally. Go places a premium on easy distribution, so this might have worked. However, it didn’t: wireguard-go uses network functions, and so the resulting binaries aren’t statically linked by default. What’s worse is that the resulting file depends on a version of glibc that isn’t on my server. I saw this as an opportunity to dive deeper into the innards of ELF files, and learn something new. ...

January 26, 2022