What's the difference between sockaddr, sockaddr_in, and sockaddr_in6?
To give more information that other people may find useful, I have decided to answer my own question, although I did not initially intend to.
After some digging into the Linux source code, I found the following:
There are multiple protocols, and they all implement getsockname. Each one has its own underlying address data structure. For example, IPv4 has sockaddr_in, IPv6 has sockaddr_in6, and the AF_UNIX socket has sockaddr_un. sockaddr is used as the common data struct in the signatures of the Linux networking API. That API will copy the sockaddr_in, sockaddr_in6, or sockaddr_un to a sockaddr, based on another parameter (the length), using memcpy.
All of those data structures begin with the same family field, sa_family (named sin_family and sin6_family in the specific structs).
Because of all this, the code snippet is valid: both sockaddr_in and sockaddr_in6 start with that family field, so the pointer can be cast to the correct data structure after a check on sa_family.
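The check-then-cast pattern described above can be sketched as follows. This is only a minimal illustration, not code from the kernel or from the question; the helper name sockaddr_port is made up for the example.

```c
#include <arpa/inet.h>   /* ntohs */
#include <netinet/in.h>  /* struct sockaddr_in, struct sockaddr_in6 */
#include <sys/socket.h>  /* struct sockaddr, AF_INET, AF_INET6 */

/* Hypothetical helper: returns the port in host byte order,
 * or -1 if the address family is not one we handle. */
int sockaddr_port(const struct sockaddr *sa)
{
    switch (sa->sa_family) {          /* the common leading field */
    case AF_INET:
        /* The family says this is really a sockaddr_in. */
        return ntohs(((const struct sockaddr_in *)sa)->sin_port);
    case AF_INET6:
        /* The family says this is really a sockaddr_in6. */
        return ntohs(((const struct sockaddr_in6 *)sa)->sin6_port);
    default:
        return -1;
    }
}
```

The caller only ever sees a struct sockaddr pointer; the family field is what tells the helper which concrete struct is behind it.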
BTW, I'm not sure why sizeof(sockaddr_in6) > sizeof(sockaddr). It means that memory allocated based on the size of sockaddr is not large enough for IPv6, which is error-prone, but I guess it is that way for historical reasons.
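One common way around the size problem is to allocate a struct sockaddr_storage, which POSIX specifies as large enough for any supported address type, and pass a pointer to it wherever a struct sockaddr pointer is expected. A minimal sketch, where the helper name local_family is an assumption for illustration:

```c
#include <assert.h>      /* static_assert (C11) */
#include <netinet/in.h>  /* struct sockaddr_in6 */
#include <sys/socket.h>  /* struct sockaddr_storage, getsockname */

/* sockaddr_storage is meant to hold any address type, including IPv6. */
static_assert(sizeof(struct sockaddr_storage) >= sizeof(struct sockaddr_in6),
              "sockaddr_storage must fit sockaddr_in6");

/* Hypothetical helper: returns the address family of fd's local address,
 * or -1 if getsockname() fails. */
int local_family(int fd)
{
    struct sockaddr_storage ss;   /* big enough for IPv4, IPv6, AF_UNIX */
    socklen_t len = sizeof ss;    /* in: buffer size, out: bytes written */

    if (getsockname(fd, (struct sockaddr *)&ss, &len) != 0)
        return -1;
    return ss.ss_family;
}
```

Because the buffer is sized for the largest address type, the same code works whether the socket turns out to be IPv4 or IPv6.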
Casting between sockaddr and sockaddr_in6
ai_family and ai_addr are fields of the addrinfo struct, so presumably the code you are quoting had called getaddrinfo() beforehand.
The result of getaddrinfo() is a NULL-terminated linked list of addrinfo structs, where the addrinfo::ai_addr field is a pointer to an allocated memory block that is of sufficient size to hold a socket address of the reported addrinfo::ai_family type. The size of the address is reported in the addrinfo::ai_addrlen field.
For AF_INET, the addrinfo::ai_addr field is pointing at a memory block containing a sockaddr_in struct.
For AF_INET6, the addrinfo::ai_addr field is pointing at a memory block containing a sockaddr_in6 struct.
That is why the type-casts work.
The addrinfo::ai_addr field is declared as struct sockaddr* so it can be passed as-is to the addr parameter of the bind() and connect() functions without type-casting. The addrinfo::ai_addrlen field can be passed as-is to their addrlen parameter.
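As a rough sketch of that last point, the loop below walks the getaddrinfo() result list and hands ai_addr and ai_addrlen to bind() unchanged. The helper name bind_first and the choice of SOCK_STREAM are assumptions made for the example, not something taken from the quoted code.

```c
#include <netdb.h>       /* getaddrinfo, freeaddrinfo, struct addrinfo */
#include <string.h>      /* memset */
#include <sys/socket.h>  /* socket, bind */
#include <unistd.h>      /* close */

/* Hypothetical helper: binds a TCP socket to the first usable address
 * found for host/service. Returns the descriptor, or -1 on failure. */
int bind_first(const char *host, const char *service)
{
    struct addrinfo hints, *res, *ai;
    int fd = -1;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;      /* accept IPv4 or IPv6 results */
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_PASSIVE;      /* only matters when host is NULL */

    if (getaddrinfo(host, service, &hints, &res) != 0)
        return -1;

    for (ai = res; ai != NULL; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;
        /* ai_addr and ai_addrlen go in as-is: no cast, no sizeof. */
        if (bind(fd, ai->ai_addr, ai->ai_addrlen) == 0)
            break;
        close(fd);
        fd = -1;
    }
    freeaddrinfo(res);
    return fd;
}
```

For example, bind_first("127.0.0.1", "0") should give back a socket bound to the IPv4 loopback address on a port chosen by the OS, without the caller ever naming sockaddr_in.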
What's the pad to sockaddr_in for?
There are multiple protocol families. Each family has its own address structure.
Example: AF_INET uses sockaddr_in, AF_INET6 uses sockaddr_in6, AF_UNIX uses sockaddr_un, etc. But sockaddr is the base structure. A pointer to any of these structures must be type-cast to struct sockaddr * when binding/connecting a socket.
int bind(int sockfd, const struct sockaddr *addr, socklen_t addrlen);
int connect(int sockfd, const struct sockaddr *addr, socklen_t addrlen);
Let's look at the structures of sockaddr_in and sockaddr:
struct sockaddr_in {
    short sin_family;          /* Protocol family (always AF_INET) */
    unsigned short sin_port;   /* Port number in network byte order */
    struct in_addr sin_addr;   /* IP address in network byte order */
    unsigned char sin_zero[8]; /* Pad to sizeof(struct sockaddr) */
};
struct in_addr {
    uint32_t s_addr;           /* Address in network byte order (big-endian) */
};
The structure of sockaddr is:
struct sockaddr
{
    sa_family_t sa_family;
    char sa_data[14];
};
Look at the sizes of the elements in the two structures sockaddr_in and sockaddr.
The first element in both structures is the same and occupies the same memory.
sin_port --> 2 bytes
sin_addr --> 4 bytes
sin_zero[8] --> 8 bytes
Total = 14 bytes (equal to the size of sa_data[14])
We add padding bytes to make their structure sizes equal.
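The arithmetic above can be checked at compile time. The sketch below assumes the layout found on Linux with glibc; other systems may arrange the leading fields differently (some add a length byte), so treat the assertions as illustrative rather than guaranteed. The helper name sockaddr_in_payload_size is made up for the example.

```c
#include <assert.h>      /* static_assert (C11) */
#include <netinet/in.h>  /* struct sockaddr_in */
#include <stddef.h>      /* offsetof, size_t */
#include <sys/socket.h>  /* struct sockaddr */

/* Assumption: Linux/glibc layout. With the padding, both structs have
 * the same size, and the family field sits at the same offset. */
static_assert(sizeof(struct sockaddr_in) == sizeof(struct sockaddr),
              "sin_zero pads sockaddr_in up to sizeof(struct sockaddr)");
static_assert(offsetof(struct sockaddr_in, sin_family) ==
              offsetof(struct sockaddr, sa_family),
              "family field is at the same offset in both structs");

/* Hypothetical helper: bytes used by everything after the family field. */
size_t sockaddr_in_payload_size(void)
{
    struct sockaddr_in in;   /* only used inside sizeof, never read */
    return sizeof in.sin_port + sizeof in.sin_addr + sizeof in.sin_zero;
}
```

On such a system the helper returns 2 + 4 + 8, matching the 14 bytes of sa_data.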
reference:
https://man7.org/linux/man-pages/man2/bind.2.html
https://man7.org/linux/man-pages/man2/connect.2.html
Why do we cast sockaddr_in to sockaddr when calling bind()?
No, it's not just convention.
sockaddr is a generic descriptor for any kind of socket operation, whereas sockaddr_in is a struct specific to IP-based communication (IIRC, "in" stands for "InterNet"). As far as I know, this is a kind of "polymorphism": the bind() function pretends to take a struct sockaddr *, but in fact, it will assume that the appropriate type of structure is passed in; i.e. one that corresponds to the type of socket you give it as the first argument.
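To make the "polymorphism" concrete, here is a small sketch in which the same bind() call receives either a sockaddr_in or a sockaddr_in6, depending on the family the socket was created with. The helper name bind_loopback and the use of UDP are assumptions for the example only.

```c
#include <arpa/inet.h>   /* htonl, htons */
#include <netinet/in.h>  /* sockaddr_in, sockaddr_in6, in6addr_loopback */
#include <string.h>      /* memset */
#include <sys/socket.h>  /* socket, bind */
#include <unistd.h>      /* close */

/* Hypothetical helper: creates a UDP socket of the given family bound to
 * the loopback address on an OS-chosen port. Returns the fd, or -1. */
int bind_loopback(int family)
{
    int rc = -1;
    int fd = socket(family, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    if (family == AF_INET) {
        struct sockaddr_in a4;
        memset(&a4, 0, sizeof a4);
        a4.sin_family = AF_INET;
        a4.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
        a4.sin_port = htons(0);              /* let the OS pick a port */
        rc = bind(fd, (struct sockaddr *)&a4, sizeof a4);
    } else if (family == AF_INET6) {
        struct sockaddr_in6 a6;
        memset(&a6, 0, sizeof a6);
        a6.sin6_family = AF_INET6;
        a6.sin6_addr = in6addr_loopback;
        a6.sin6_port = htons(0);
        rc = bind(fd, (struct sockaddr *)&a6, sizeof a6);
    }
    if (rc != 0) {                           /* unsupported family or bind error */
        close(fd);
        return -1;
    }
    return fd;
}
```

In both branches the struct passed in matches the family of the socket in the first argument, which is what bind() relies on.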
How sockaddr holds sockaddr_storage or sockaddr_in6?
Remember that all functions that take a struct sockaddr pointer also take the size of the structure. Together with the meta-data on the actual socket, it's easy for the system to know what kind of structure you're passing.
Also note that it's always pointers to the address structures being passed around, not the actual structures, which would not work. So you never do, e.g.,
(struct sockaddr) a_in6_sockaddr
you do
(struct sockaddr *) &a_in6_sockaddr
What is the correct way to convert a struct sockaddr * to struct sockaddr_in6 * with valid C code?
So if the way we do socket programming (and what is also recommended by the books) is a hack, what is the correct way to rewrite the above code so that it is also a valid C code as per the C standard?
TL;DR: continue to do what you present in your example.
The code you presented appears to be syntactically correct. It may or may not exhibit undefined behavior under some circumstances. Whether or not it does depends on the behavior of getaddrinfo().
There is no way to do this in C that meets all the functional requirements and is any better protected against undefined behavior than the standard technique you've presented. That's why it's the standard technique. The issue here is that the function must support all conceivable address types, including types that have not yet been defined. It could declare the socket address pointer as a void *, which would not require casting, but that wouldn't actually change anything about whether any given program exhibits undefined behavior.
For its part, getaddrinfo() is designed with exactly such usage in mind, so it is its problem if using the expected cast on the result allows for misbehavior. Moreover, getaddrinfo() is not part of the C standard library -- it is standardized (only) by POSIX, which also incorporates the C standard. Analyzing that function in the light of C alone therefore demonstrates an inappropriate hyperfocus. Though the casts raise some concern in light of C alone, you should expect that in the context of getaddrinfo() and other POSIX networking functions using struct sockaddr *, casting to the correct specific address type and accessing the referenced object produces reliable results.
Additionally, I think AnT's answer to your other question is oversimplified and overly negative. I'm considering whether to write a contrasting answer.
Comparing IPV4 socket(sockaddr_in) with IPV6 Socket(sockaddr_in6)
As Joachim Pileborg reasoned, you don't need to care about this when the IPv4 address comes from an earlier packet received on the same socket because you will be comparing one mapped IPv4 address to another. It is only in the case that the IPv4 address was obtained from an external source that you have to care.
As João Augusto pointed out, you neglected to check that the IPv6 address indeed is an IPv4 mapped address before comparing the last 32 bits. There is a macro IN6_IS_ADDR_V4MAPPED that will help you do this:
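/* s6_addr32 is a Linux extension, not part of POSIX; the union member
   names behind it (in6_u / __in6_u) differ between headers */
if (
    IN6_IS_ADDR_V4MAPPED(&(ipv6_clientdata->sin6_addr)) &&
    (ipv6_clientdata->sin6_port == ipv4_storeddata->sin_port) &&
    (ipv6_clientdata->sin6_addr.s6_addr32[3] == ipv4_storeddata->sin_addr.s_addr)
) {
    addrfound = true;
}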
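If you would rather not depend on 32-bit accessors inside in6_addr at all, the same comparison can be written against the s6_addr byte array, which POSIX does define. A minimal sketch, with same_endpoint as a made-up helper name:

```c
#include <netinet/in.h>  /* sockaddr_in, sockaddr_in6, IN6_IS_ADDR_V4MAPPED */
#include <stdbool.h>     /* bool */
#include <string.h>      /* memcmp */

/* Hypothetical helper: true if v6 is an IPv4-mapped address
 * (::ffff:a.b.c.d) carrying the same IPv4 address and port as v4. */
bool same_endpoint(const struct sockaddr_in6 *v6,
                   const struct sockaddr_in *v4)
{
    if (!IN6_IS_ADDR_V4MAPPED(&v6->sin6_addr))
        return false;
    if (v6->sin6_port != v4->sin_port)   /* both in network byte order */
        return false;
    /* The IPv4 address occupies the last 4 of the 16 bytes in s6_addr;
     * s_addr is also in network byte order, so the bytes line up. */
    return memcmp(&v6->sin6_addr.s6_addr[12], &v4->sin_addr.s_addr, 4) == 0;
}
```

Comparing bytes with memcmp avoids both the non-portable member names and any alignment assumptions.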
What is the difference between struct addrinfo and struct sockaddr
struct addrinfo is returned by getaddrinfo(), and contains, on success, a linked list of such structs for a specified hostname and/or service.
The ai_addr member isn't actually a struct sockaddr, because that struct is merely a generic one that contains common members for all the others, and is used in order to determine what type of struct you actually have. Depending upon what you pass to getaddrinfo(), and what that function found out, ai_addr might actually be a pointer to struct sockaddr_in, or struct sockaddr_in6, or whatever else, depending upon what is appropriate for that particular address entry. This is one good reason why they're kept "separate", because that member might point to one of a bunch of different types of structs, which it couldn't do if you tried to hardcode all the members into struct addrinfo, because those different structs have different members.
This is probably the easiest way to get this information if you have a hostname, but it's not the only way. For an IPv4 connection, you can just populate a struct sockaddr_in structure yourself, if you want to and you have the data to do so, and avoid going through the rigamarole of calling getaddrinfo(), which you might have to wait for if it needs to go out into the internet to collect the information for you. You don't have to use struct addrinfo at all.
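Populating the structure yourself could look something like the sketch below, assuming you already have the address as a dotted-quad string. The helper name make_ipv4_addr is hypothetical.

```c
#include <arpa/inet.h>   /* inet_pton, htons */
#include <netinet/in.h>  /* struct sockaddr_in */
#include <stdint.h>      /* uint16_t */
#include <string.h>      /* memset */
#include <sys/socket.h>  /* AF_INET */

/* Hypothetical helper: fills *out from an IPv4 literal and a port given
 * in host byte order. Returns 0 on success, -1 if ip is not a valid
 * IPv4 literal. No name lookup is performed. */
int make_ipv4_addr(const char *ip, uint16_t port, struct sockaddr_in *out)
{
    memset(out, 0, sizeof *out);         /* also clears sin_zero */
    out->sin_family = AF_INET;
    out->sin_port = htons(port);         /* store in network byte order */
    return inet_pton(AF_INET, ip, &out->sin_addr) == 1 ? 0 : -1;
}
```

The result can then be passed to connect() or bind() as (struct sockaddr *)&addr together with sizeof addr, with no struct addrinfo involved.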