Nudin's Blog

Writing a GIMP 3.0 Plugin – Resources and Notes

2025-02-12

GIMP 3.0 is almost out, and I'm thrilled. Not only will this version bring new features and improvements, but it finally modernizes the technology stack in two important ways:

It will use GTK 3
It will use Python 3

This will make it easier and more sustainable to develop plugins for GIMP. The old Python 2 plugins will not work anymore, but the new Python 3 plugins will be more future-proof. I have a few ideas for plugins I want to write, so at the end of last year I started digging into it. And while I'm far from the only one with this idea, information on writing GIMP 3 plugins is naturally still sparse. So I decided to write down some collected resources and notes. If you have more resources or tips, feel free to contact me; I'll happily update the article.

Plugins can be written in C, Python 3, Scheme, JavaScript and Vala. I will focus on Python 3.
As with GIMP 2.10, GIMP 3.0 ships with a Python interpreter embedded and will not use the system Python.
It currently ships Python 3.11
The C-API documentation is available at https://developer.gimp.org/api/3.0/
The Python API documentation is not yet available to read online, but can be found in a tarball on the download page: https://download.gimp.org/pub/gimp/v3.0/api-docs/ It is, however, of limited benefit, since it doesn't contain any information that isn't in the C-API documentation, except for the exact Python method names, which can easily be guessed from the names of their C counterparts.
A basic tutorial on how to write a plugin in Python 3 can be found at https://testing.docs.gimp.org/3.0/en/gimp-using-python-plug-in-tutorial.html
Your plugin should be placed in a subfolder of ~/.config/GIMP/3.0/plug-ins/ and the main file must be named the same as the subfolder.
GIMP contains a Python REPL that can be started with Filters -> Development -> Python-Fu -> Console or quicker with the search feature (/ and then type Python)
The Python REPL has some limited code completion that sadly only works for classes, not for instances.
You don't need to restart GIMP when you edit your plugin. It's loaded from disk on every run.
Run GIMP from terminal to see errors and stdout output of your plugin.
The GIMP source code repository contains several plugins that can be used as examples. They are located in the plug-ins folder.
If you want to use any Python modules other than the standard library, you have to bundle them with your plugin. You can install them in the plugin folder with pip install --target /path/to/plugin somepackage. You can use the site module to add additional paths to the Python search path.
If you get stuck, you can ask for help on the IRC channel #gimp-dev on irc.gimp.org.
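
The bundling tip above could be sketched roughly like this. It assumes the third-party packages were installed with pip install --target into a subfolder named vendor next to the plugin's main file; the folder name and the helper add_vendor_dir are made up for illustration and are not part of GIMP.

```python
import os
import site


def add_vendor_dir(plugin_file, subdir="vendor"):
    """Make packages bundled next to the plugin importable.

    `subdir` is a hypothetical folder name; use whatever directory you
    passed to `pip install --target`.
    """
    plugin_dir = os.path.dirname(os.path.abspath(plugin_file))
    vendor_dir = os.path.join(plugin_dir, subdir)
    # addsitedir appends the directory to sys.path and also processes
    # any .pth files found inside it.
    site.addsitedir(vendor_dir)
    return vendor_dir
```

In the plugin's main file you would call add_vendor_dir(__file__) before importing the bundled package.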

Shorter Hacks 23: Bash previous folder

2025-02-09

In bash you can not only cd into the last directory with cd -, but also use tilde-dash with any other command. For example: ls ~- will show the files in the previous directory.

Shorter Hacks 22: Systemd-analyze calendar

2022-12-13

Compared to cron, systemd timers have their own, alternative time specification format. This is more human readable than the old cron specification – but a "new" format means new possibilities to screw up. Luckily systemd has a feature to check if a time specification is sound and really means what you intend:

$ systemd-analyze calendar "Mon,Tue *-01..04 12:00:00"
  Original form: Mon,Tue *-01..04 12:00:00
Normalized form: Mon,Tue *-*-01..04 12:00:00
    Next elapse: Mon 2023-01-02 12:00:00 CET
       (in UTC): Mon 2023-01-02 11:00:00 UTC
       From now: 2 weeks 6 days left

Shorter Hacks 21: Git push previous commit

2022-11-27

You can not only push the current state of a branch to its remote branch, you can push any commit to any remote branch. This way you can, for example, push all your commits except the latest with:

git push remotename @~:branchname

I regularly use this when my current commit is not finished yet, but I want to push the already finished work done in previous commits. How does this work? The first argument to git push is the remote to which you want to push code (usually origin), the second specifies which commit to push to which remote branch. The @ is short for HEAD – the current commit, and the ~ selects its parent commit. You can use @~~ or @~2 if you want to skip the last two commits and so on. After the colon, you specify the name of the branch on the remote that you want to push to.

Shorter Hacks 20: IPython history

2022-11-20

In IPython you can print all commands of the current session (in a copy-pasteable form) with %hist. If you want to display the history of the previous session, use %hist ~1/. To search the full history for a keyword, use %hist -g foobar.

Shorter Hacks 19: Terminate hanging ssh sessions

2022-10-22

When the network connection breaks while you work in an ssh session or the ssh server terminates – maybe due to a reboot – the ssh session freezes and stays frozen for a long time. There is an ssh keybinding to stop such a frozen session: First press return, then ~ and finally . – this will immediately terminate the connection and exit the client.

Shorter Hacks 18: Filter lines in less

2022-10-16

The pager less has an option to filter displayed lines by a regex pattern; basically a built-in grep. Press & and then type your filter, and only matching lines will be shown. A ! or Ctrl+N at the start of the pattern turns it into an inverse filter.

Shorter Hacks 17: SSH Purge host from known_hosts file

2022-09-28

When the host key of a computer you want to ssh to has changed for a valid reason, ssh will block attempts to connect to it to avoid man-in-the-middle attacks. Deleting the key from the known_hosts file with an editor can be annoying, especially if the file is "hashed". Therefore, ssh-keygen offers a feature to do that:

ssh-keygen -R hostname

Shorter Hacks 16: IPython Autoreload

2022-09-18

When developing some Python code and testing it in IPython, I love the autoreload feature of IPython. When enabled, it will reload imported modules automatically, so you will always use the newest version of your code. It even patches modifications of class methods into existing class instances. To enable it, first load it with %load_ext autoreload, then enable it with %autoreload 2. See the documentation for explanations of the options and how to autoreload only selected imports. Here is a code snippet to demonstrate its power:

In [1]: %load_ext autoreload

In [2]: %autoreload 2

In [3]: from test import Foo

In [4]: foo = Foo()

In [5]: foo.bar()
1

# Edit the code
In [6]: foo.bar()
2

You can put the following into your ipython_config¹ to load and enable the autoreload extension automatically on startup (I only use the first line):

c.InteractiveShellApp.extensions.append("autoreload")
c.InteractiveShellApp.exec_lines = ["%autoreload 2"]

1: Either

~/.ipython/profile_default/ipython_config.py or
~/.config/ipython/profile_default/ipython_config.py

Shorter Hacks 15: Dig output format

2022-09-09

You probably use dig to query DNS records. The default output of dig is rather verbose, but you can configure it. Try the following for a succinct output:

dig +nostats +nocomments +nocmd +noquestion +identify +multiline @1.1.1.1 schoenitzer.de

You can set this as default output by adding the options to the file ~/.digrc.

Shorter Hacks 14: Get external IP behind NAT

2022-04-08

You are behind a NAT and need your "outer" IP address? You can use curl with one of several handy websites, for example curl icanhazip.com or curl ifconfig.me. If you want to self-host such a service with nginx it's as simple as:

location / {
    default_type text/plain;
    return 200 "$remote_addr\n";
}

Shorter Hacks 13: Git exclude files

2022-04-02

If you want to ignore a file without putting it into the public .gitignore, you can list it in .git/info/exclude – it works the same way but is local to your computer and not version controlled.

Shorter Hacks 12: Open editor from within less

2022-03-29

Opened a file in less, but then realised you have to edit it? The key v (as in vim?) opens up your editor, as defined in VISUAL or EDITOR.

Shorter Hacks 11: journalctl list boots

2022-03-14

journalctl --list-boots lists your last boots with start and shutdown dates, as far back as the journal goes. You can use journalctl -b num to view the logs of previous boots. Besides that, I sometimes use it to check "at what time did I go to bed yesterday?"

Shorter Hacks 10: disown

2022-03-20

Some long running command is still active in your ssh session and you need to leave; but you don't want to stop it and therefore regret that you didn't execute it in screen/tmux or with nohup? You can still use disown. Stop the process with <C-Z>, resume it in the background with bg and run disown so that you can end your ssh session without killing it.

Shorter Hacks 9: git autostash

2022-03-14

You want to rebase, merge or pull but have uncommitted changes? Use the option --autostash. Example: git pull --autostash is equivalent to `git stash; git pull; git stash pop`

Shorter Hacks 8: limit strace to syscalls with files

2022-02-28

I often use strace to see which files a program uses and, often even more interesting, which (non-existing) files it fails to open. With the flag -y strace will decode file descriptors, so you see the path of the files, and with -e %file strace will only print syscalls that operate on a file: open, stat, chmod, unlink, … Please note that this does not include syscalls operating on a file descriptor like read or write – you can add them: -e %file,read,write but keep in mind there are multiple syscalls capable of reading from or writing to file descriptors.

Shorter Hacks 7: better search in man pages

2022-02-06

I love the man page system on Linux/Unix and frequently look something up. But one thing drove me crazy: since the description of a flag is written on a new line below the flag, most of the time you have to scroll up again after every search to see the flag you've been searching for. If you don't understand what I mean, search for "symbolic link" in the man page of grep to find the flag -R. Since you're most likely reading man pages with less as pager, there is a way to make this much more convenient: less has the option -j, which positions a search match not on the first line of the screen but on the n-th line. Either set MANPAGER='less -j2' (or a higher number), or add the option to the environment variable LESS to enable it everywhere.

Shorter Hacks 6: pipe stderr

2022-01-29

Normal pipes in bash/zsh only redirect stdout to the target process. If you, for example, want to grep the output of strace, you need to also redirect stderr to the target. You can do this with strace ls |& grep open. On older versions of bash you need the less handy strace ls 2>&1 | grep open.

Shorter Hacks 5: git word diff

2022-01-23

If a diff is hard to read because the changes are small, try `git diff --word-diff`. While the default diffing algorithm works line wise, this will show you changes within a line.

Shorter Hacks 4: strace failed-only

2022-01-14

Want to see why some program is failing? Strace often has the information, but the output is too much to read. You can use the option -Z to limit output to failed syscalls only. -z on the other hand limits to successful syscalls.

Shorter Hacks 3: grep ps

2022-01-02

I was always annoyed that if you grep the output of ps you will always also get the process of grep itself. Then I found this trick: ps aux | grep [h]top
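
Why this works: the bracket expression [h] matches the single letter h, so the pattern still matches the text htop, but grep's own command line now contains the literal text [h]top, which the pattern does not match. A small sketch using Python's re module, which treats this particular pattern the same way grep does (the two process lines are shortened and made up):

```python
import re

pattern = re.compile(r"[h]top")

# Two lines roughly as `ps aux` might print them (shortened, made up).
ps_lines = [
    "michi  1234  0.3  0.1  htop",
    "michi  5678  0.0  0.0  grep [h]top",
]

# Only the real htop process survives; the grep line contains "h]top",
# never the plain substring "htop".
matches = [line for line in ps_lines if pattern.search(line)]
print(matches)
```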

Shorter Hacks 2: get local ip with hostname

2021-12-24

To get the host's IP address I used to use ip addr – but the output is noisy and the address hard to spot. Use hostname -i instead, to get exactly what you need.

Shorter Hacks 1: git dash

2021-12-19

You probably know cd - to move back into the previous directory. Git offers the same handy shortcut to refer to the last checked-out branch. So, you can conveniently switch between two branches by git checkout -, or merge the previous branch by git merge -, etc.

Arch Packages You Might Want to Install

2021-11-02

After installing a fresh Arch, the system is pretty bare and you might spend a long time thinking about which packages you might need. But even after installing all packages you came up with, the installation is likely followed by a long period where you regularly discover missing tools. To help speed things up, here is an opinionated list of Arch packages that most people might want to install. It's not meant as a list to blindly copy-paste, but rather as a checklist. For some packages that won't be relevant for everyone, I added some comments.

# Hardware related and drivers
alsa-firmware
alsa-utils
bluez-utils
pulseaudio
pulseaudio-bluetooth
linux-firmware
v4l-utils
wireless_tools
wpa_supplicant
networkmanager
intel-ucode      # If you use an Intel processor
sof-firmware     # Firmware for many sound cards
linux-zen        # Alternative kernel


# Admin tools
awk
bash-completion
base-devel
curl
dnsutils
git
gnupg
inetutils
iproute2
iptables
iw
iwd
jq
less
lsof
man
net-tools
nmap
openssh
strace
sudo
unzip
usbutils
whois


# More Advanced Debugging Tools
bcc
bcc-tools
bpf
bpftrace
tcpdump
wireshark


# Basic terminal applications
bc
neomutt    # enhanced version of the mutt mail client
neovim     # enhanced successor of vim
python-neovim
screen
tmux
wget


# Convenience terminal tools
bat
dust
fd
lynx
ncdu
ripgrep
zoxide
wl-clipboard  # Command-line copy/paste utilities for Wayland
xclip         # Command-line copy/paste utilities for X11
xsel          # Command-line copy/paste utilities for X11


# Replace by your favorite terminal emulator
# If you haven't yet, give kitty a try.
kitty
konsole


# X and/or Wayland + a desktop environment + display manager
# Replace by your favorite choice
wayland
xorg-xwayland
xorg
sddm
plasma-wayland-session


# Common Desktop Applications
chromium
firefox
firefox-i18n-de      # Replace by your language
thunderbird
thunderbird-i18n-de  # Replace by your language
keepassxc
vlc
meld
mpv
youtube-dl
slack
hunspell-de    # Replace by your language
languagetool   # More advanced spell and grammar checking


# If you use a Yubikey
libfido2
pcsclite
hopenpgp-tools
yubico-pam
yubikey-manager
yubikey-manager-qt
yubikey-personalization

Audio volume in Anki

2021-01-03

TL;DR: Install mpv and put volume=80 into ~/.local/share/Anki2/mpv.conf to set the audio volume of Anki to 80%.

Audio volume handling is pretty high on my list of "annoying things I can't believe are still an issue". Whenever I listen to music or watch a video, there's a high chance I have to turn up the volume to hear anything. Not much later some other application plays audio at a much higher volume, leaving me worrying whether I just woke up the neighbors. Sadly a lot of applications don't even have a volume setting…

One of those applications is Anki. Since I use it a lot, I decided to fix this and a few other things that bug me by writing plugins for them. I had searched the internet several times for existing ways to set the volume, so I was surprised when, while writing a plugin for it, I found that there is an easy – but undocumented – way to set the playback volume for Anki. At least if you use Linux and have mpv installed. Mpv is one of a handful of multimedia players that Anki can use for playback. Anki will run an mpv instance as a daemon in the background and control it via IPC. Mpv is started with the option --config-dir=<ankidir>, so mpv will ignore your default configuration file and search in the Anki directory instead. So you can create an mpv configuration file in this directory and in it, among other things, set the volume or enable audio filters. I set the volume to a level compatible with my other applications. If that's not enough – for example because the audio files in your Anki decks have different volumes – you could enable a dynamic range compression filter.

I'm still considering writing a simple Anki plugin to set the volume, since I suspect more people could use it. For now, I am satisfied with the existing solution and can next try to find a way to set the volume for websites like Duolingo that also lack a volume setting…

PS: I also considered writing a PulseAudio module to fix this problem in general, but the documentation is sparse and it seemed like a bigger project.

A better button for my Tomu

2020-10-17

A while ago I ordered a Tomu. More specifically, an "original" Tomu. And not only one but 60… but not all of them for myself. ;) Until recently I used my Tomus only sparingly. The major issue that held me back was that the "button" that has to be pressed at every authorisation was pretty hard to touch. The origin of that problem is that the buttons are just two tiny conductive areas on the board and the case has a fixture right next to them. Often touching it was no issue at all, but sometimes it didn't work for the first five tries – and even though rare, this was too annoying.

Recently I decided to fix this. I soldered two simple pins onto the button, so now it is enough to touch these two pins, which can even be done blindly. This worked great, but there was a new annoyance: when touched carelessly, the pins pricked. So I used the soldering iron one more time to add two tiny balls of solder to their tips. I'm pretty pleased with the result.

Here is what the result looks like (not pretty but functional):

Arbitrary virtual memory usage with numpy

2020-09-25

When recently hacking on htop, I came across a bug about large memory sizes being displayed wrong. But to test for it you need a process that uses 98GiB of memory. Luckily any type of memory will do, so virtual memory is enough. How do you get a process to have a fixed, arbitrarily large amount of virtual memory? Sure, I could write a few lines of C to do it, but with numpy present on my system anyway it's way easier:

$ echo 1 | sudo tee /proc/sys/vm/overcommit_memory
$ python
>>> import numpy
>>> x = numpy.empty([1024**3//8, 98])

First we tell the kernel to always accept malloc's, even if the size is way over the available (1 = unlimited overcommitment). Then in Numpy we create an empty matrix with the right size to use the desired amount of space. Since the floats are 8 bytes large, we use 1024^3/8 as one dimension and can then set the number of GiB as the second dimension.
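
The size arithmetic from the paragraph above, spelled out as a small sketch. It deliberately avoids numpy and allocates nothing; the helper shape_for_gib is a made-up name, and the item size of 8 bytes assumes numpy's default float64 dtype.

```python
GIB = 1024 ** 3


def shape_for_gib(gib, itemsize=8):
    """Return a 2-D shape whose elements occupy `gib` GiB in total.

    itemsize=8 corresponds to 8-byte floats (assumed: numpy's default dtype).
    """
    # One "row unit" of GIB // itemsize elements is exactly 1 GiB,
    # so the second dimension is simply the number of GiB.
    return (GIB // itemsize, gib)


rows, cols = shape_for_gib(98)
total_bytes = rows * cols * 8
print(rows, cols, total_bytes // GIB)
```

The resulting tuple is what you would pass to numpy.empty in the session above.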

The advantage over a hard-coded malloc in a compiled C program is that you can change the size on the fly for free: just overwrite x with a new empty array of the desired size…

When you're finished, restore the default setting for overcommitment, in which the kernel uses some heuristics to determine whether it should accept a memory allocation:

$ echo 0 | sudo tee /proc/sys/vm/overcommit_memory

Happy hacking.

Javascript: Country Code to Flag

2020-07-11

Here's a fun little Javascript function:

function getflag(langcode) {
	var first = langcode.charCodeAt(0) + 127397;
	var second = langcode.charCodeAt(1) + 127397;
	var flag=`&#${first};&#${second};`;
	return flag;
}

getflag("DJ")    // 🇩🇯
getflag("DE")    // 🇩🇪
getflag("SE")    // 🇸🇪

How does this work? Instead of adding the flags of every country to the Unicode standard, Unicode defines 26 special characters 🇦 to 🇿 that can be combined according to ISO 3166 to form a flag. So combining 🇵 with 🇪 will result in the flag of Peru (PE): 🇵🇪. While Unicode otherwise often uses a zero width joiner to merge characters (concatenating the five characters 👩 Woman, Zero Width Joiner, 👩 Woman, Zero Width Joiner and 👧 Girl results in 👩‍👩‍👧), in this case it's even enough to write the two characters next to each other. Since the 26 special characters 🇦 to 🇿 are ordered the same way as the "normal" ASCII letters A to Z, adding a constant offset is enough. The function above generates the HTML entities for the characters. If you add those to the HTML document, you will get the country's flag.
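
The same offset trick, ported to Python as a sketch. Unlike the JavaScript version it returns the characters themselves instead of HTML entities; the function name flag and the constant name are made up for this example.

```python
# 127397 is the distance between "A" (65) and 🇦 (U+1F1E6 = 127462).
REGIONAL_INDICATOR_OFFSET = 127397


def flag(country_code):
    """Turn a two-letter ISO 3166 code such as "DE" into a flag emoji."""
    return "".join(
        chr(ord(letter) + REGIONAL_INDICATOR_OFFSET)
        for letter in country_code.upper()
    )


peru = flag("PE")  # the two regional indicator characters for P and E
```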

So Much Nothing

2020-06-20

If you know about sparse files, this is of no relevance for you, but might still amuse you. If you don't know about sparse files, I recommend changing that because they are not only fun, but also very useful — for example when handling filesystem images.

    $ dd if=/dev/null of=hugeemptyfile bs=4M count=0 seek=2000G
    $ ls -l hugeemptyfile
    -rw-r--r-- 1 michi users 7,9E 11. Mai 01:45 hugeemptyfile
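
If you want to play with this from Python, a rough equivalent could look like the sketch below. It uses a deliberately small size, and it assumes a POSIX system and a filesystem that supports sparse files; the helper make_sparse_file is made up for this example.

```python
import os
import tempfile


def make_sparse_file(path, size):
    """Create a file with an apparent size of `size` bytes without writing data."""
    with open(path, "wb") as f:
        f.truncate(size)  # extends the file; the gap reads back as zeros


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "emptyfile")
    make_sparse_file(path, 64 * 1024 ** 2)  # 64 MiB, a small demo size
    st = os.stat(path)
    apparent_size = st.st_size
    # st_blocks counts the 512-byte units actually allocated (POSIX only);
    # for a sparse file this is far below the apparent size.
    allocated = st.st_blocks * 512

print(apparent_size, allocated)
```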

So Much Nothing by Michael F. Schönitzer is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Convert images to PDF with ImageMagick

2019-06-03

I often have to join multiple images into one PDF file. My way to do this was always to convert the images to PDF files using ImageMagick and then concatenate the PDF files with pdftk.

Lately this failed with the following error:

convert: attempt to perform an operation not allowed by the security policy `PDF' @ error/constitute.c/IsCoderAuthorized/408.

The reason is a change in the default policies that forbids converting files to PDF. The motivation for this is that ImageMagick uses Ghostscript as its PDF back end and Ghostscript had several security issues.

You can list the policy files and the active policies with convert -list policy. You could open the policy file and disable the line that states

<policy domain="coder" rights="none" pattern="{PS,PS2,PS3,EPS,PDF,XPS}" />

to be able to write PDF files again. Be aware that this would get overwritten on your next ImageMagick update. Better create a policy file in your home directory. On Linux this is most likely ${HOME}/.config/ImageMagick/policy.xml¹. In it you can enable PDF reading and writing support with a simple policy. The policy file should look like this:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE policymap [
  <!ELEMENT policymap (policy)+>
  <!ATTLIST policymap xmlns CDATA #FIXED ''>
  <!ELEMENT policy EMPTY>
  <!ATTLIST policy xmlns CDATA #FIXED '' domain NMTOKEN #REQUIRED
    name NMTOKEN #IMPLIED pattern CDATA #IMPLIED rights NMTOKEN #IMPLIED
    stealth NMTOKEN #IMPLIED value CDATA #IMPLIED>
]>
<policymap>
  <policy domain="coder" rights="read | write" pattern="PDF" />
</policymap>

Now you can again create your PDF files the good old way:

$ convert foo.jpg foo.pdf
$ convert bar.png bar.pdf
$ pdftk foo.pdf bar.pdf cat output foobar.pdf

Of course you should think about the security implications for your system.

1: If that's not working for you check out convert -debug configure logo: null: to find in which folders ImageMagick searches for config files on your system.

Asking Gender in Surveys – doing it right

2019-01-08

How to read a vimscript stacktrace

2018-08-31

Reading vimscript stacktraces

When you get an error in vimscript, vim will show you a stacktrace, as you might know it from other languages. But reading them is not trivial and I haven't found any documentation of it so far. When you see one for the first time, you might interpret it wrong and search for the error in the wrong location.

Let's look at a stacktrace from vimwiki:

Error detected while processing function vimwiki#base#follow_link[58]..vimwiki#base#open_link[30]..vimwiki#base#edit_file:
line   21:
E325: ATTENTION
Error detected while processing function vimwiki#base#follow_link[58]..vimwiki#base#open_link:
line   30:
E171: Missing :endif
Error detected while processing function vimwiki#base#follow_link:
line   58:
E171: Missing :endif

The most relevant information is in the first three lines:

The first line tells us that the error occurred in the function vimwiki#base#edit_file, which was called by the function vimwiki#base#open_link, which was called by the function vimwiki#base#follow_link. From the names of the functions we learn in which file we will find them: autoload/vimwiki/base.vim.

The second line of the stacktrace tells us the line number where the bug occurred: line 21. But, here's the catch: all line numbers are relative to the function. So the bug occurs 21 lines below the definition of the function vimwiki#base#edit_file. The numbers in the square brackets are also relative line numbers, of where in the functions the next function was called.

The third line tells us the error that occurred. In this case the error is called ATTENTION and has the error code E325. You can look it up with :help E325 or :help ATTENTION.

The rest of the lines show how the error propagates through the callers. They are seldom useful (at least to me).
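
To make the reading rules above concrete, here is a small sketch that splits the first line of such a stacktrace into its call chain. It only handles the format shown in the example above; the function name parse_call_chain is made up.

```python
import re

# A frame is a function name, optionally followed by "[N]", the relative
# line inside that function where the next function was called.
FRAME = re.compile(r"^(?P<name>.+?)(?:\[(?P<line>\d+)\])?$")
PREFIX = "Error detected while processing function "


def parse_call_chain(header):
    """Return [(function_name, relative_line), ...], outermost caller first.

    relative_line is None for the innermost function: its line number is
    printed on the following "line N:" row instead.
    """
    chain = header.strip()
    if chain.startswith(PREFIX):
        chain = chain[len(PREFIX):]
    chain = chain.rstrip(":")
    frames = []
    for part in chain.split(".."):
        m = FRAME.match(part)
        line = int(m.group("line")) if m.group("line") else None
        frames.append((m.group("name"), line))
    return frames
```

Applied to the first line of the vimwiki example, this yields follow_link at relative line 58, open_link at relative line 30 and finally edit_file, where the error occurred.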

Introducing Hewa

2018-08-01

While reading, typos and spelling mistakes in texts can be annoying. I myself make a lot of typos and spelling mistakes, but that doesn't mean that they annoy me any less when I stumble over one while reading. I personally already use Vim's built-in spell checker as well as LanguageTool to check my texts, but that doesn't find everything. Asking someone to proofread every text improves the situation greatly, but some mistakes will still remain. And more importantly: you might not have someone willing to proofread everything you write. I thought it would be nice to have an easy way to report small mistakes. Others use the comment section for that, but my blog has no comments (on purpose), and most people won't write an email to submit a typo they found while reading.

For this reason I wrote hewa. It adds an edit-button to every entry of my blog, that allows you to edit the page in the browser and send me your suggestion. The edit mode opens instantly, no loading of any sort of editor. Simply change the text, click save and I'll receive the patch.

Today I'm activating the first version of hewa in my blog. It's still not very mature and might not run with proprietary browsers like IE. It's written in JavaScript — without jQuery or any other libraries — and Python with Flask. I will of course release the source code under a free license. But I still need to clean up the code before I can do that.

This is also an experiment: I'm not sure if people will use it or if I'll end up receiving more spam and vandalism than actual corrections. Spam and vandalism is also the reason that I don't apply the changes to the text but rather save the diffs. I have a tool that allows me to look at those patches and apply or drop them.

Linux Raw Sockets

2018-03-18

Recently I did a userspace implementation of the Host Identity Protocol (HIPv2, RFC 7401) with the upcoming Diet Exchange (HIP DEX, IETF draft 6). Doing so, I've learnt a lot about raw socket programming under Linux and here I want to share a few things with you.

So, I assume you have already worked with network sockets before – if not, don't fear, it's not that hard and there are plenty of nice introductions out there. I can for example recommend Beej's Guide to Network Programming. For this article I'll start with a normal UDP/TCP based socket and work my way down the layers. So we open a traditional socket by:

sockfd = socket(AF_INET, SOCK_DGRAM, 0);

This will open a UDP based datagram socket via IPv4. The first argument of socket() specifies the domain of your socket; in our case that's the Internet Protocol. Sometimes you will see AF… here and sometimes PF…; this doesn't matter, they are the same. While PF stands for protocol family, AF is short for address family. Historically it was thought that in the future there might be multiple protocol families sharing the same address family, but this never happened. So the correct way would be to use PF_INET in the socket call and AF_INET in your struct sockaddr_in, but most people nowadays use the address family everywhere. With the second argument, type, we specify if we want to use a connection-based protocol like TCP (SOCK_STREAM) or a protocol without connections like UDP (SOCK_DGRAM). The third argument, protocol, specifies which protocol we actually want to use. We could set UDP or TCP here (IPPROTO_UDP, IPPROTO_TCP) but setting 0 works too: this sets the protocol to the default protocol for the combination of the domain and type field. For AF_INET and SOCK_DGRAM the default is UDP and for SOCK_STREAM it's TCP. You might also see IPPROTO_IP as protocol, which is simply 0 by definition. But the above variant seems to be the most common one.
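
Python's socket module wraps the same system call with the same three arguments, so the variants described above can be tried out without compiling anything. This is only a sketch; the variable names are made up and nothing is sent.

```python
import socket

# protocol 0: let the kernel pick the default for the (domain, type) pair
udp_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, 0)
tcp_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM, 0)

# Spelling the protocol out explicitly is equivalent to the first line.
udp_explicit = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)

socks = (udp_sock, tcp_sock, udp_explicit)
kinds = [s.type for s in socks]
for s in socks:
    s.close()

# IPPROTO_IP is 0 by definition, which is why it also works as "default".
print(kinds, socket.IPPROTO_IP)
```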

But hey, it's 2018 – why the heck should we limit ourselves to IPv4? Luckily it's easy enough to support IPv6: just replace AF_INET by AF_INET6 and it will work with both IPv4 and IPv6! So don't you dare to ever use AF_INET anymore without a good excuse. By the way: if you want IPv6 only, you can set the socket option IPV6_V6ONLY.

But we don't want to talk about ordinary TCP/UDP sockets here! So let's dig down into the mysterious world of raw sockets.

The first thing I want to note is: you'll need super user rights for creating a raw socket or more precisely the CAP_NET_RAW capability otherwise you'll get the error ”Operation not permitted.” (EPERM).

sockfd = socket(AF_INET, SOCK_RAW, IPPROTO_UDP);
sockfd = socket(AF_INET6, SOCK_RAW, IPPROTO_UDP);

The first kind of Raw-Socket we look at is what you get by setting type to SOCK_RAW but still set protocol to TCP or UDP. You will still only receive the type of packet specified (here UDP), but this time you will not only receive the data but also the layer 4 (TCP/UDP) header and you're also responsible to set the layer 4 header yourself.

Contrary to above, here the choice of domain does matter a lot. First of all here AF_INET6 will only receive IPv6 and not both! Second what you get if you read from the socket differs: if you read from the first variant with AF_INET you will get the IPv4 header, the UDP/TCP header and the data; in the second variant your read will instead result in only the UDP/TCP header and data but not the IPv6-Header!

The third important difference between AF_INET and AF_INET6 for raw sockets is the endianness: unlike IPv4 raw sockets, all data sent via IPv6 raw sockets must be in the network byte order and all data received via raw sockets will be in the network byte order.

If you want to send something through the socket, your packet has to include the layer 4 header but not the IP header. (Note: this is unspecified in POSIX, but I focus on Linux here.) But what if we want to change something in the IP header? For IPv4 there are two options: you can set the desired field(s) via calls to setsockopt, or, if you want to build the full header on your own, you can use the socket option IP_HDRINCL to announce that you will construct the header yourself and write both header and payload to the socket:

sockfd = socket(AF_INET, SOCK_RAW, IPPROTO_UDP);
int on = 1;
setsockopt(sockfd, IPPROTO_IP, IP_HDRINCL, &on, sizeof(on));

Even if you use this, you won't have to deal with the source address and packet ID: the kernel will fill them in for you if you leave them all zero. The IP checksum and the total length field will be set by the kernel whether you want it or not.
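
As an illustration of what "constructing the header yourself" means, here is a sketch that packs a minimal 20-byte IPv4 header without options. The fields the kernel fills in anyway are left at zero. The function name, the TTL of 64 and the documentation address 192.0.2.1 are assumptions made for this example; nothing is sent.

```python
import socket
import struct


def build_ipv4_header(dst, payload_len, proto=socket.IPPROTO_UDP, ttl=64):
    """Pack a 20-byte IPv4 header (no options) in network byte order."""
    version_ihl = (4 << 4) | 5    # IPv4, header length of 5 * 32 bit
    tos = 0
    total_len = 20 + payload_len  # overwritten by the kernel anyway
    packet_id = 0                 # zero: kernel fills it in
    flags_frag = 0
    checksum = 0                  # always computed by the kernel
    src = b"\x00" * 4             # zero: kernel fills in the source address
    return struct.pack(
        "!BBHHHBBH4s4s",
        version_ihl, tos, total_len, packet_id, flags_frag,
        ttl, proto, checksum, src, socket.inet_aton(dst),
    )


header = build_ipv4_header("192.0.2.1", payload_len=8)
```

The bytes returned here, followed by the layer 4 header and the payload, are what you would write to a socket that has IP_HDRINCL set.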

What's important here: IPv6 doesn't have IP_HDRINCL or a direct equivalent, as per RFC 3542 section 3. You can, however, also set various parameters via setsockopt. Alternatively, the IPv6 advanced socket API employs another framework called “ancillary data”. For outgoing packets one can set the majority of the fields in the header as well as supported header extensions via ancillary data, and for received packets the majority of the fields and header extensions can be read with the same framework. A description of ancillary data is out of the scope of this article, but the basic idea is: you specify which values you want to set via a call to setsockopt, then you write the values for the header fields and the actual data into a struct msghdr and send this via sendmsg().

If you want to send data with a transport protocol which has no user interface you can set the protocol field to raw too:

sockfd = socket(AF_INET, SOCK_RAW, IPPROTO_RAW);

This will automatically set IP_HDRINCL and allow you to send your data with arbitrary layer 4 protocols. Most common use: sending ICMP packets. Receiving data is, however, not possible with this type of socket!

So far we got full control over layer 4 and partial control over layer 3. It's time to step down one further level into the dungeon.

sockfd = socket(AF_PACKET, SOCK_DGRAM, htons(ETHERTYPE_IPV6));

This is called a packet socket. It allows you to receive and send raw packets at the device driver level (layer 2). In the above version we used the protocol to specify that we only want to receive IPv6 packets. We can drop this requirement to receive all packets, no matter if they are IPv4, IPv6 or something else:

sockfd = socket(AF_PACKET, SOCK_DGRAM, htons(ETH_P_ALL));

By default, a packet socket will receive all packets matching the protocol. You can use bind() to bind the packet socket to an interface.

Setting the type field to SOCK_DGRAM results in the cooked mode: when reading from the socket you will read the packet without the MAC header, but you can comfortably get the MAC addresses by using recvfrom(), and likewise you can use sendto() to specify the destination via the sockaddr_ll struct. Alternatively we can set type to SOCK_RAW:

sockfd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));

This is the lowest we can get: this way Ethernet frames are passed from the device driver to your application without any changes, including the full layer 2 header. Likewise, when writing to the socket, the user-supplied buffer has to contain all the headers of layers 2 to 4.
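To show what "including the full layer 2 header" means for the receiving side, here is a Python sketch that splits such a frame into its Ethernet header fields and the remaining payload. The function name is invented for this example, and it assumes a plain Ethernet II frame without a VLAN tag.

```python
import struct


def parse_ethernet_header(frame):
    """Split a raw Ethernet II frame into (dst MAC, src MAC, EtherType, payload)."""
    if len(frame) < 14:
        raise ValueError("frame is shorter than an Ethernet header")
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])

    def fmt(mac):
        # Render the 6 address bytes in the usual colon-separated hex notation.
        return ":".join(f"{byte:02x}" for byte in mac)

    return fmt(dst), fmt(src), ethertype, frame[14:]
```

An EtherType of 0x86DD in the result means the payload starts with an IPv6 header, 0x0800 means IPv4.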

This is the deepest we can go in userspace – at this point we have full control of the complete ethernet frame. I hope you enjoyed our journey into the rabbit hole.

Sources and further readings:

Beej's Guide to Network Programming
socket(7)
raw(7)
packet(7)
sendto(2), recvfrom(2)
UNIX Network Programming, Volume 1 by W. Richard Stevens
IPv6 Core Protocols Implementation by Qing Li, Tatuya Jinmei and Keiichi Shima
IPv6 Socket API Extensions: Programmer's Guide by Qing Li, Tatuya Jinmei and Keiichi Shima
Linux Kernel source code

Awesome way to debug python

2018-02-08

This is so awesome. Probably the most foolproof method to debug Python after printf-debugging:

import code
code.interact(local=dict(globals(), **locals()))

You can add this at any position in your code you like. Whenever the second line is executed, the Python read-eval-print loop (REPL) is started and you can interact with your program live: you can check the values of your variables and run arbitrary code. After you exit the shell, your code continues to run normally.

But it gets even better. You can also use this if you don't know where in the code the problem is occurring. You can even use it preventively, before you know about any bugs (but make sure to remove it before shipping to customers). Add this to the top of your code:

import signal
import code

def debug_handler(signum, frame):
    code.interact(local=globals())

signal.signal(signal.SIGUSR1, debug_handler)

This will register a signal handler for the signal SIGUSR1. Your program will run completely normally, but whenever you run into a bug and want to debug, simply send the corresponding signal to the process:

$ kill -SIGUSR1 <pid>

This will open a REPL and you can look at any global variables and run code — you can even modify the content of variables or rewrite whole functions! So you could try out a potential bugfix live without restarting the program.
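The handler above only exposes the globals. A possible variation, which is not part of the original snippet, uses the frame argument that Python passes to every signal handler: it refers to the code that was running when the signal arrived, so its local variables can be inspected too. The names namespace_from_frame and install_debug_handler are made up for this sketch.

```python
import code
import signal


def namespace_from_frame(frame):
    """Merge the globals and locals of the interrupted frame into one new dict."""
    namespace = dict(frame.f_globals)
    namespace.update(frame.f_locals)     # locals shadow globals of the same name
    return namespace


def debug_handler(signum, frame):
    # The merged dict is a copy: fine for looking around, but assignments made
    # in the REPL will not change the variables of the running program.
    code.interact(local=namespace_from_frame(frame))


def install_debug_handler():
    """Register the handler; has to be called from the main thread."""
    signal.signal(signal.SIGUSR1, debug_handler)
```

Call install_debug_handler() once at startup and send SIGUSR1 as before.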

Actually, you can also edit variables and functions the first way (without a signal), but you have to decide whether you want access to only the global or only the local scope. Merging globals and locals with dict() creates a copy, so assignments made in the REPL no longer change the originals. If you really need to access and modify both local and global variables, you could use a little trick:

code.interact(local={'gvar': globals(), 'lvar': locals()})

But you then have to access the variables and functions via gvar['varname'], which is much less easy and elegant… In that case you might want to switch to something more advanced like pdb anyway.
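The copy problem can be reproduced without starting a REPL at all. In the following sketch the dictionaries module_vars and function_vars are made-up stand-ins for globals() and locals(); the merge is the same dict(a, **b) construction as in the first example.

```python
def merged_copy_demo():
    """Show that rebinding a name in the merged dict leaves the original untouched."""
    module_vars = {"answer": 41, "items": []}    # stand-in for globals()
    function_vars = {"x": 1}                     # stand-in for locals()

    merged = dict(module_vars, **function_vars)  # a new, shallow copy
    merged["answer"] = 42                        # rebinding: only the copy changes
    merged["items"].append("shared")             # mutation: the list object is shared

    return module_vars["answer"], merged["answer"], module_vars["items"]
```

So values can still be mutated in place through the merged dict, but a plain assignment in the REPL ends up in the copy.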

There are two additional tricks. First: you can use the parameters banner= and exitmsg= to override Python's default message when starting the REPL and to show something when exiting it. Second: if you have IPython installed, it offers the same functionality, so use the following instead to get a colorful, more interactive shell:

import IPython
IPython.embed()

Only downside of IPython: I haven't found a way to change variables.
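Coming back to the first trick, a call with banner= and exitmsg= could look like the following sketch. The wrapper name open_debug_shell and the message texts are invented for this example; the optional readfunc argument is passed through to code.interact and is mainly there so the shell can be driven by a script instead of a keyboard.

```python
import code


def open_debug_shell(namespace, readfunc=None):
    """Start a REPL on the given namespace with custom start and exit messages."""
    code.interact(
        banner="Debug shell, press Ctrl-D to continue the program",
        readfunc=readfunc,        # None means: read from the terminal as usual
        local=namespace,
        exitmsg="Leaving debug shell, resuming program",
    )
```

In a program you would simply call open_debug_shell(globals()) at the place you want to inspect. Note that exitmsg requires Python 3.6 or newer.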

Happy Hacking!

Up2date software versions for Wikidata

2018-01-05

I'm a supporter of Wikidata and free software, so naturally I care about Wikidata's items about free software. There are at least 17,000 of them and their quality (as always on wikis) varies a lot. Many of them exist because of corresponding Wikipedia articles, but others were created by imports, for example from Gentoo's Portage.

One aspect where Wikidata could really shine is version numbers of software. If a new version of Firefox is released, traditionally the version number has to be updated in all 120 language versions of Wikipedia which have an article about Firefox. With Wikidata this is not necessary anymore: update the version number once on Wikidata and all Wikipedias can show the newest version number instantly. Sadly this is still not reality. A lot of Wikipedia communities are still skeptical about Wikidata, and so version numbers are still often edited on the local Wikipedias instead of being fetched from Wikidata.

One key step towards improving this situation is to improve the data quality of Wikidata. Up till now too many of Wikidata's items about free software have outdated version numbers, which is not surprising: if they are not used in Wikipedia, they don't get updated by Wikipedians.

There are a few promising ways to improve this. Github-wiki-bot by Konstin is one: it imports version numbers from GitHub. But that only works for some projects. For a short while now I've been working on another approach: checking version numbers against those in the repository database of Arch Linux. I chose Arch since it contains very fresh versions of software, so most of the time the versions in the Arch repos are the newest versions available.

To check the versions, I need the Arch Package identifier (P3454) to be present in Wikidata, so I first wrote a script to help me add those. For every piece of software that runs on Linux, I check whether the Arch repository contains a package with the same name and website. With that I could add the Arch Package identifier to roughly 600 items.

Then I wrote a second script that checks the version numbers of those items against those in the Arch repositories. If the version available for Arch is newer than the newest version we have in Wikidata, I print it to a website, sorted by the size of the difference in the version numbers. You can find this list here – it's updated a few times per day.

The list contained a few items with terribly outdated versions (like 3 years old and two major versions behind!). Over the last few days I updated several hundred items from this list by hand, starting with those where the major version number was out of date and then all where the minor version number was out of date. The list now only contains items outdated in the third or fourth version number: all first and second version numbers are up to date. And I hope that I can keep the version numbers at least this much up to date.
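The idea of ranking items by where their version number is outdated can be sketched in a few lines of Python. This is not the actual script, only an illustration under simplifying assumptions: versions are plain dotted numbers, and things like Arch's epoch or package release suffix are ignored. The function names are made up.

```python
import re


def parse_version(version):
    """Turn a string like '2.10.4' into a tuple of ints, e.g. (2, 10, 4)."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def first_outdated_position(known, latest):
    """Return the index of the first outdated component, or None if up to date.

    0 means the major version is behind, 1 the minor version, and so on.
    """
    old, new = parse_version(known), parse_version(latest)
    if old >= new:
        return None
    for index, (a, b) in enumerate(zip(old, new)):
        if a != b:
            return index
    return len(old)              # known version is a prefix, e.g. 1.2 vs 1.2.1
```

Sorting a list of (known, latest) pairs with first_outdated_position as the key would put the items with an outdated major version first.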

This is not finished, of course. Of the 17,000 items about free software, fewer than 700 have a reference to the corresponding Arch package! The Arch repos contain 10,000 packages. So even without knowing exactly how large the overlap between Wikidata and the Arch repos is, it's for sure much bigger than what we currently have!

Your help is needed!

Vim revelation of the day: wildmode

2018-01-03

How often have I been annoyed by Vim's command and file autocompletion! I was sure I had looked into this in the past. Was that just imagination, or was I searching for the wrong terms? Now I finally stumbled over the right options:

set wildmenu
set wildmode=longest:full
set wildignore+=*.a,*.o,*.hi
set wildignore+=*.pdf,*.gz,*.aux,*.out,*.nav,*.snm,*.vrb

Wildmenu shows a nice list of the available completions. With wildmode you can configure what Vim should do if multiple completions match. And most importantly, wildignore makes it ignore files that you won't ever want to open (for example binary files), based on their file extension.
