Recently I wrote a userspace implementation of the Host Identity Protocol (HIPv2, RFC 7401) with the upcoming Diet Exchange (HIP DEX, IETF draft 6). Along the way I learnt a lot about raw socket programming under Linux, and here I want to share a few things with you.
I assume you have already worked with network sockets before. If not, don't fear: it's not that hard and there are plenty of nice introductions out there. I can, for example, recommend Beej's Guide to Network Programming. For this article I'll start with a normal UDP/TCP based socket and work my way down the layers. So we open a traditional socket with:
sockfd = socket(AF_INET, SOCK_DGRAM, 0);
This will open a UDP based datagram socket via IPv4. The first argument of socket() specifies the domain of your socket; in our case that's the Internet Protocol. Sometimes you will see AF… here and sometimes PF…; this doesn't matter, they are the same. While PF stands for protocol family, AF is short for address family. Historically it was thought that in the future there might be multiple protocol families sharing the same address family, but this never happened. So the correct way would be to use PF_INET in the socket call and AF_INET in your struct sockaddr_in, but most people nowadays use the address family everywhere.
With the second argument, type, we specify whether we want to use a connection-based protocol like TCP (SOCK_STREAM) or a protocol without connections like UDP (SOCK_DGRAM). The third argument specifies which protocol we actually want to use. We could set UDP or TCP here (IPPROTO_UDP or IPPROTO_TCP), but setting 0 works too: this sets the protocol to the default protocol for the combination of the domain and type fields. For SOCK_DGRAM the default is UDP and for SOCK_STREAM it's TCP. You might also see IPPROTO_IP as protocol, which is simply 0 by definition. But the above variant seems to be the most common one.
But hey, it's 2018. Why the heck should we limit ourselves to IPv4? Luckily it's easy enough to support IPv6: just replace AF_INET with AF_INET6 and it will work with both IPv4 and IPv6! So don't you dare to ever use AF_INET anymore without a good excuse.
By the way: if you want IPv6 only, you can set the socket option IPV6_V6ONLY.
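A small sketch of both variants, assuming Linux and two helper names (open_udp6, get_v6only) invented for this example. It sets IPV6_V6ONLY explicitly in both cases, because the default can be changed system-wide via the net.ipv6.bindv6only sysctl.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Open a UDP socket over AF_INET6. With v6only == 0 the socket also
 * handles IPv4 traffic (seen as IPv4-mapped addresses); with
 * v6only == 1 it is restricted to IPv6.
 * Returns the descriptor, or -1 on error. */
int open_udp6(int v6only)
{
    int fd = socket(AF_INET6, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY,
                   &v6only, sizeof(v6only)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Read the IPV6_V6ONLY flag back; returns 0, 1, or -1 on error. */
int get_v6only(int fd)
{
    int val = -1;
    socklen_t len = sizeof(val);
    if (getsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &val, &len) < 0)
        return -1;
    return val;
}
```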
But we don't want to talk about ordinary TCP/UDP sockets here! So let's dig down into the mysterious world of raw sockets.
The first thing I want to note is: you'll need superuser rights to create a raw socket, or more precisely the CAP_NET_RAW capability; otherwise you'll get the error "Operation not permitted" (EPERM).
sockfd = socket(AF_INET, SOCK_RAW, IPPROTO_UDP);
sockfd = socket(AF_INET6, SOCK_RAW, IPPROTO_UDP);
The first kind of raw socket we look at is what you get by setting type to SOCK_RAW while still setting protocol to TCP or UDP. You will still only receive the type of packet specified (here UDP), but this time you will receive not only the data but also the layer 4 (TCP/UDP) header, and you are also responsible for setting the layer 4 header yourself when sending.
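Since the missing capability is the first thing you will trip over, here is a minimal sketch of opening such a socket while keeping the error code around. The helper name open_raw4 and the convention of returning the negated errno are choices made for this example only.

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Try to open a raw IPv4 socket for the given layer 4 protocol.
 * Returns the descriptor on success or the negated errno on failure,
 * so the caller can tell "no CAP_NET_RAW" (-EPERM) from other errors. */
int open_raw4(int protocol)
{
    int fd = socket(AF_INET, SOCK_RAW, protocol);
    if (fd < 0)
        return -errno;
    return fd;
}
```

A caller would check for -EPERM after open_raw4(IPPROTO_UDP) and tell the user to run the program as root or to grant it CAP_NET_RAW.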
Contrary to above, here the choice of domain matters a lot. First of all, AF_INET6 will only receive IPv6 and not both! Second, what you get when you read from the socket differs: if you read from the first variant with AF_INET, you will get the IPv4 header, the UDP/TCP header and the data; in the second variant your read will instead return only the UDP/TCP header and the data, but not the IPv6 header!
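For the AF_INET case this means you have to skip the IPv4 header yourself before you reach the UDP header. The following is a sketch of how that could look, using struct iphdr and struct udphdr as glibc defines them on Linux; the function name parse_raw_udp4 is made up for this example.

```c
#include <netinet/in.h>
#include <netinet/ip.h>
#include <netinet/udp.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Locate the UDP header in a buffer read from an AF_INET raw socket.
 * Such a buffer starts with the IPv4 header, whose length in bytes is
 * IHL * 4 (20 to 60). On success the UDP header is copied to *udp and
 * the offset of the payload within buf is returned; on a malformed or
 * truncated buffer the function returns -1. */
int parse_raw_udp4(const uint8_t *buf, size_t len, struct udphdr *udp)
{
    if (len < sizeof(struct iphdr))
        return -1;

    struct iphdr ip;
    memcpy(&ip, buf, sizeof(ip));   /* copy to avoid unaligned access */
    size_t ip_len = (size_t)ip.ihl * 4;

    if (ip.version != 4 || ip_len < sizeof(struct iphdr))
        return -1;
    if (ip.protocol != IPPROTO_UDP ||
        len < ip_len + sizeof(struct udphdr))
        return -1;

    memcpy(udp, buf + ip_len, sizeof(*udp));
    return (int)(ip_len + sizeof(struct udphdr));
}
```

With an AF_INET6 raw socket the same buffer would already start at the UDP header, so the IPv4 part of this function would not be needed.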
The third important difference between AF_INET and AF_INET6 for raw sockets is the endianness: unlike IPv4 raw sockets, all data sent via IPv6 raw sockets must be in network byte order and all data received via IPv6 raw sockets will be in network byte order.
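In practice this means converting every multi-byte header field with htons() or htonl() before it goes on the wire. Here is a small sketch for a UDP header; fill_udp_header is a name invented for this example, and the checksum is simply left at 0, which over IPv4 means "no checksum" while UDP over IPv6 requires a real one.

```c
#include <netinet/in.h>
#include <netinet/udp.h>
#include <stdint.h>
#include <string.h>

/* Fill in a UDP header for a payload of payload_len bytes. Arguments
 * are in host byte order; every multi-byte field is converted with
 * htons() before it is stored. The checksum is left at 0 here. */
void fill_udp_header(struct udphdr *udp, uint16_t src_port,
                     uint16_t dst_port, uint16_t payload_len)
{
    memset(udp, 0, sizeof(*udp));
    udp->source = htons(src_port);
    udp->dest   = htons(dst_port);
    udp->len    = htons((uint16_t)(sizeof(*udp) + payload_len));
    udp->check  = 0;
}
```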
If you want to send something through the socket, your packet has to include the layer 4 header but not the IP header. (Note: this is unspecified in POSIX, but I focus on Linux here.) But what if we want to change something in the IP header? For IPv4 there are two options: you can set the desired field(s) via calls to setsockopt, or, if you want to build the full header on your own, you can use the socket option IP_HDRINCL to tell the kernel that you will construct the header and write both header and payload to the socket:
sockfd = socket(AF_INET, SOCK_RAW, IPPROTO_UDP);
int on = 1;
setsockopt(sockfd, IPPROTO_IP, IP_HDRINCL, &on, sizeof(on));
Even if you use this, you don't have to deal with the source address and the packet ID: the kernel will fill them in for you if you leave them zero. The IP checksum and the total length field will be set by the kernel whether you want it or not.
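A sketch of what preparing such a header could look like, using glibc's struct iphdr. The function name fill_ip_header and the TTL of 64 are assumptions made for this example; the ID, source address and checksum are deliberately left at zero so the kernel fills them in.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <netinet/ip.h>
#include <stdint.h>
#include <string.h>

/* Prepare a minimal IPv4 header (no options) for an IP_HDRINCL socket.
 * id, saddr and check stay zero so that the kernel fills them in.
 * dst is a dotted-quad string; returns 0 on success, -1 if it does
 * not parse. */
int fill_ip_header(struct iphdr *ip, const char *dst,
                   uint8_t protocol, uint16_t payload_len)
{
    memset(ip, 0, sizeof(*ip));
    ip->version  = 4;
    ip->ihl      = 5;            /* 5 * 4 = 20 bytes, no options */
    ip->ttl      = 64;           /* arbitrary value for this sketch */
    ip->protocol = protocol;
    ip->tot_len  = htons((uint16_t)(sizeof(*ip) + payload_len));

    struct in_addr addr;
    if (inet_pton(AF_INET, dst, &addr) != 1)
        return -1;
    ip->daddr = addr.s_addr;     /* already in network byte order */
    return 0;
}
```

The finished header goes at the start of the buffer, directly followed by the layer 4 header and the payload, and the whole buffer is then handed to sendto().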
What's important here: IPv6 doesn't have IP_HDRINCL or a direct equivalent, as per RFC 3542 section 3. You can, however, also set various parameters via setsockopt. Alternatively, the IPv6 advanced socket API employs another framework called "ancillary data". For outgoing packets one can set the majority of the header fields as well as the supported extension headers via ancillary data, and for received packets the majority of the fields and extension headers can be read with the same framework. A description of ancillary data is out of the scope of this article, but the basic idea is this: for received packets you tell the kernel via setsockopt which values you want to get, and for outgoing packets you put the values for the header fields next to the actual data into a struct msghdr and send this via sendmsg().
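To give a rough idea of the sending side, here is a sketch that attaches a hop limit to an outgoing message as ancillary data (IPV6_HOPLIMIT from RFC 3542). The function name set_hop_limit is made up for this example, and the caller is assumed to provide a suitably aligned control buffer and to set the destination in msg_name before calling sendmsg().

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Prepare msg so that sendmsg() sends the data in iov with the given
 * IPv6 hop limit attached as ancillary data. ctrl must provide at
 * least CMSG_SPACE(sizeof(int)) bytes. Returns 0 on success, -1 if
 * ctrl is too small. msg_name is left for the caller to set. */
int set_hop_limit(struct msghdr *msg, struct iovec *iov,
                  void *ctrl, size_t ctrl_len, int hop_limit)
{
    if (ctrl_len < CMSG_SPACE(sizeof(int)))
        return -1;

    memset(msg, 0, sizeof(*msg));
    memset(ctrl, 0, ctrl_len);
    msg->msg_iov        = iov;
    msg->msg_iovlen     = 1;
    msg->msg_control    = ctrl;
    msg->msg_controllen = CMSG_SPACE(sizeof(int));

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(msg);
    cmsg->cmsg_level = IPPROTO_IPV6;
    cmsg->cmsg_type  = IPV6_HOPLIMIT;
    cmsg->cmsg_len   = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &hop_limit, sizeof(hop_limit));
    return 0;
}
```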
If you want to send data with a transport protocol which has no user interface, you can set the protocol field to raw too:
sockfd = socket(AF_INET, SOCK_RAW, IPPROTO_RAW);
This will automatically set IP_HDRINCL and allow you to send your data with arbitrary layer 4 protocols. Most common use: sending ICMP packets. Receiving data is, however, not possible with this type of socket!
So far we got full control over layer 4 and partial control over layer 3. It's time to step down one further level into the dungeon.
sockfd = socket(AF_PACKET, SOCK_DGRAM, htons(ETHERTYPE_IPV6));
This is called a packet socket; it allows you to receive and send raw packets at the device driver level (layer 2). In the above version we used the protocol argument to specify that we only want to receive IPv6 packets. We can drop this requirement to receive all packets, no matter if it's IPv4, IPv6 or something else:
sockfd = socket(AF_PACKET, SOCK_DGRAM, htons(ETH_P_ALL));
By default, a packet socket will receive all packets matching the protocol, from every interface. You can use bind() to bind the packet socket to a single interface.
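Binding a packet socket works with a struct sockaddr_ll instead of the usual struct sockaddr_in. A minimal sketch, with helper names (fill_bind_addr, bind_to_interface) invented for this example and "eth0" standing in for whatever your interface is called:

```c
#include <arpa/inet.h>
#include <linux/if_packet.h>
#include <net/ethernet.h>
#include <net/if.h>
#include <string.h>
#include <sys/socket.h>

/* Fill a link-layer address that selects one interface and one
 * protocol. protocol is an ETH_P_* value in host byte order. */
void fill_bind_addr(struct sockaddr_ll *sll, unsigned int ifindex,
                    unsigned short protocol)
{
    memset(sll, 0, sizeof(*sll));
    sll->sll_family   = AF_PACKET;
    sll->sll_protocol = htons(protocol);
    sll->sll_ifindex  = (int)ifindex;
}

/* Bind a packet socket to the interface called ifname (for example
 * "eth0"). Returns 0 on success, -1 on error. */
int bind_to_interface(int fd, const char *ifname, unsigned short protocol)
{
    unsigned int ifindex = if_nametoindex(ifname);
    if (ifindex == 0)
        return -1;

    struct sockaddr_ll sll;
    fill_bind_addr(&sll, ifindex, protocol);
    return bind(fd, (struct sockaddr *)&sll, sizeof(sll));
}
```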
Setting the type field to SOCK_DGRAM results in the cooked mode: when reading from the socket you will get the packet without the MAC header, but you can comfortably get the sender's MAC address by using recvfrom(), and likewise you can use sendto() to specify the destination via a struct sockaddr_ll.
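As an illustration of the cooked mode, the sketch below reads one packet with recvfrom() and turns the link-layer address it reports into the familiar colon notation. Both function names are made up for this example, and only 6-byte (Ethernet style) addresses are handled.

```c
#include <linux/if_packet.h>
#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Format the link-layer address stored in sll as "aa:bb:cc:dd:ee:ff".
 * Returns 0 on success, -1 if it is not a 6-byte MAC address or out
 * is too small (18 bytes are needed). */
int format_mac(const struct sockaddr_ll *sll, char *out, size_t out_len)
{
    if (sll->sll_halen != 6 || out_len < 18)
        return -1;
    const unsigned char *a = sll->sll_addr;
    snprintf(out, out_len, "%02x:%02x:%02x:%02x:%02x:%02x",
             a[0], a[1], a[2], a[3], a[4], a[5]);
    return 0;
}

/* Read one packet from a cooked (SOCK_DGRAM) packet socket. buf
 * receives the packet without the layer 2 header; *from receives the
 * sender's link-layer address. Returns the bytes read or -1. */
ssize_t recv_cooked(int fd, void *buf, size_t len,
                    struct sockaddr_ll *from)
{
    socklen_t from_len = sizeof(*from);
    return recvfrom(fd, buf, len, 0, (struct sockaddr *)from, &from_len);
}
```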
Alternatively we can set type to SOCK_RAW:
sockfd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
This is the lowest we can get: this way Ethernet frames are passed from the device driver to your application without any changes, including the full layer 2 header. Likewise, when writing to the socket the user-supplied buffer has to contain all the headers of layers 2 to 4.
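Reading from such a socket therefore starts with taking the Ethernet header apart. A minimal sketch, assuming plain Ethernet II frames without VLAN tags and using glibc's struct ether_header; the function name parse_ethernet is made up for this example.

```c
#include <arpa/inet.h>
#include <net/ethernet.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Split a frame read from an AF_PACKET/SOCK_RAW socket into its
 * Ethernet header and payload. Copies the header to *eth, points
 * *payload at the first byte after it and returns the EtherType in
 * host byte order, or -1 if the frame is too short.
 * VLAN tags are not handled. */
int parse_ethernet(const uint8_t *frame, size_t len,
                   struct ether_header *eth, const uint8_t **payload)
{
    if (len < sizeof(struct ether_header))
        return -1;
    memcpy(eth, frame, sizeof(*eth));
    *payload = frame + sizeof(struct ether_header);
    return ntohs(eth->ether_type);
}
```

A return value of ETHERTYPE_IP or ETHERTYPE_IPV6 then tells you which layer 3 header to expect at *payload.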
This is the deepest we can go in userspace – at this point we have full control of the complete ethernet frame. I hope you enjoyed our journey into the rabbit hole.