I was reading the man page for dlopen(), and I stumbled on this block of code:
cosine = (double (*)(double)) dlsym(handle, "cos");
/* According to the ISO C standard, casting between function
pointers and 'void *', as done above, produces undefined results.
POSIX.1-2001 and POSIX.1-2008 accepted this state of affairs and
proposed the following workaround:
*(void **) (&cosine) = dlsym(handle, "cos");
This (clumsy) cast conforms with the ISO C standard and will
avoid any compiler warnings.
The 2013 Technical Corrigendum 1 to POSIX.1-2008 improved matters
by requiring that conforming implementations support casting
'void *' to a function pointer. Nevertheless, some compilers
(e.g., gcc with the '-pedantic' option) may complain about the
cast used in this program. */
I know that casting a function pointer to a void pointer and vice versa is undefined behavior. And that the standard’s reasoning for making it undefined behavior is because of architectural differences where a function pointer may not be the same size as a data pointer or in some cases a function pointer actually being represented with two values (so I’ve heard at least). I understand how the workaround avoids undefined behavior: casting the address of cosine to void ** is really just casting a data pointer, one that points to a pointer to a function, to void **, which is perfectly valid, and it is then perfectly valid to dereference a void ** and assign to it the void * that dlsym() returns. However, wouldn’t this code be just as error-prone as casting the function pointer to a void pointer when the previously mentioned architectural quirks are present? If so, shouldn’t the standard also specify that this workaround is undefined behavior? That leads to a further question: could a non-error-prone dlsym() function even be implemented to begin with?
*(void **) (&cosine) = dlsym(handle, "cos");
This (clumsy) cast conforms with the ISO C standard and will
avoid any compiler warnings.
No, it does not “conform.” Specifically, it is not strictly conforming code and its behavior is not defined by the C standard.
Inferring from the previous code, cosine is defined to be double (*)(double), a pointer to a function taking a double and returning a double. The above code writes to it using an lvalue of type void *. This violates the aliasing rules in C 2018 6.5 7. That paragraph says which combinations of effective type and type used for access are defined by the C standard, and accessing a pointer to a function through an lvalue of type void * is not among them.
Further, the C standard does not require double (*)(double) and void * to have the same size or the same representation, so writing the bytes could, in theory, completely mess up the pointer. (However, in practice this is the rarer hazard. More common are compilers whose optimizations take advantage of the aliasing rules and thereby break the program, from the perspective of an unwary programmer.)
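For illustration, here is a minimal sketch of a byte-copy variant of the workaround. It sidesteps the aliasing problem, because memcpy copies the bytes instead of accessing the function pointer through a void * lvalue, but it still assumes that both pointer kinds have the same size and a compatible representation, which the C standard does not guarantee. The names twice, unary_fn, pack_fn, and unpack_fn are hypothetical, and pack_fn merely stands in for whatever dlsym() does internally.

```c
#include <string.h>

/* Hypothetical pointer type for the function we want to look up. */
typedef double (*unary_fn)(double);

/* Hypothetical example function standing in for a symbol found by dlsym(). */
double twice(double x) { return 2.0 * x; }

/* The byte copies below only make sense if both pointer kinds occupy the
   same number of bytes. The C standard does not promise this, so check it. */
_Static_assert(sizeof(void *) == sizeof(unary_fn),
               "void * and function pointers differ in size");

/* Stand-in for dlsym(): store the bytes of a function pointer in a void *.
   Assumption: the implementation accepts these bytes as a valid void *. */
void *pack_fn(unary_fn fn) {
    void *raw;
    memcpy(&raw, &fn, sizeof raw);
    return raw;
}

/* Recover the function pointer by copying bytes, not by writing through
   a void * lvalue, so no aliasing rule is violated here. */
unary_fn unpack_fn(void *raw) {
    unary_fn fn;
    memcpy(&fn, &raw, sizeof fn);
    return fn;
}
```

With these definitions, unpack_fn(pack_fn(twice))(1.5) calls twice and yields 3.0 on implementations where the size assumption holds. The representation assumption remains, so this is no more portable than what POSIX already requires of the plain cast.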
One fix would be to create a dlsymf routine that returns a pointer to some function type, as the C standard defines the behavior of converting between pointers to different function types (as long as an appropriate type is used for the actual call).
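To make the idea concrete, here is a rough sketch of how the result of such a routine would be used. No dlsymf exists in POSIX, so lookup_fn below is a hypothetical stand-in that searches a small static table instead of a loaded library; generic_fn, half, and negate are likewise made up for the example. The point is only that every conversion involved is between function pointer types.

```c
#include <stddef.h>
#include <string.h>

/* Generic function pointer type. A value of this type must be converted
   back to the function's real type before it is called. */
typedef void (*generic_fn)(void);

/* Hypothetical functions that a library might export. */
double half(double x) { return x / 2.0; }
int negate(int n) { return -n; }

/* Hypothetical stand-in for a dlsymf()-style lookup. It returns a generic
   function pointer, or a null pointer if the name is unknown. */
generic_fn lookup_fn(const char *name) {
    static const struct {
        const char *name;
        generic_fn fn;
    } table[] = {
        { "half",   (generic_fn) half },
        { "negate", (generic_fn) negate },
    };

    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (strcmp(table[i].name, name) == 0)
            return table[i].fn;
    }
    return NULL;
}
```

A caller converts the result back before calling it, for example double (*h)(double) = (double (*)(double)) lookup_fn("half"); and then h(3.0) gives 1.5. Calling through the wrong type would still be undefined, exactly as the answer says.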
I know that casting a function pointer to a void pointer and vice versa is undefined behavior. And that the standard’s reasoning for making it undefined behavior is because of architectural differences where a function pointer may not be the same size as a data pointer or in some cases a function pointer actually being represented with two values (so I’ve heard at least).
That is not the reason. The fact that two types have different sizes is not an impediment to converting between them, as evidenced by the fact that we may easily convert int to char, long long int, or double, which commonly have sizes different from int. A conversion is allowed to perform computations to produce its result. (A conversion is effectively an operation that takes a value in one type and produces, to the extent feasible, the same value in another type. It is not merely taking the operand bytes to represent a value in a new type.) The standard also requires conversions between void * and any pointer-to-object type to work, provided alignment requirements are satisfied, even though those pointers can also have different sizes.
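The difference between converting a value and reusing its bytes can be seen in a small sketch. The function names convert_value and copy_bytes are made up for the example, and the bit pattern mentioned afterwards assumes that float is an IEEE 754 binary32 type, which is common but not required by the C standard.

```c
#include <stdint.h>
#include <string.h>

/* Conversion: computes the same numeric value in the new type. */
uint32_t convert_value(float f) {
    return (uint32_t) f;
}

/* Reinterpretation: copies the object representation unchanged, so the
   result depends on how float is encoded. */
uint32_t copy_bytes(float f) {
    uint32_t bits;
    _Static_assert(sizeof bits == sizeof f, "float is not 32 bits here");
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

For 1.0f, convert_value returns 1, while copy_bytes returns 0x3F800000 under the IEEE 754 assumption. Only the first is what the C standard means by a conversion.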
I do not know particularly what the actual reason was. Perhaps it was seen as onerous for some C implementations with separated data and instruction spaces and unusual addressing schemes to perform the conversion and little benefit to requiring them to do so was perceived. But it was not due to the sizes of the pointers.
In any case, the solution is straightforward: Define the behavior of converting a void *, particularly a void * returned by dlsym, to a pointer to a function type. Effectively, any C implementation supporting POSIX must do this, so that dlsym works. The fact that the C standard says this is undefined behavior does not mean we must leave it undefined. The standard’s meaning for “undefined behavior” is only that the standard does not impose any requirements. It does not require us to keep it undefined; we can add our own definition of what it does, and then the behavior will be defined for our C implementation.
In fact, C 2018 J.5.7 1 notes this as a common extension:
A pointer to an object or to void may be cast to a pointer to a function, allowing data to be invoked as a function (6.5.4).
I am not aware of any POSIX compliant implementation of C that doesn’t make casting from void * to a function pointer well-defined, and I’d actually assumed that this was a requirement for POSIX compliance. (Of course, for a call through that pointer to be well-defined, you must choose a function pointer type that is compatible with the declared type of the function you are actually calling.)
@AndrewHenle it’s in the Examples section of the manpage for dlopen. Are you saying that in a POSIX compliant implementation of the standard library it’s perfectly valid to do: cosine = (double (*)(double)) dlsym(handle, "cos")? Or that the cast results in an unusable pointer? Your comment seems to imply it’s both but I don’t see how that’s possible.
@AndrewHenle actually I see what you’re saying, the manpage for dlopen is a bit different from what I’ve posted, which was from a website. I’ve updated the question with the current entry in the manpage.
@DanilaBerezin Thank you. I misread the source of the quote. FWIW, note the “The 2013 Technical Corrigendum 1 to POSIX.1-2008 improved matters by requiring that conforming implementations support casting ‘void *’ to a function pointer.” Which was clearly already an implied requirement the day dlopen() and dlsym() were born, likely back in the 1970s as it already existed in AT&T SYS V (See sco.com/developers/devspecs/vol1a.pdf)
Re “Nevertheless, some compilers (e.g., gcc with the ‘-pedantic’ option) may complain about the cast used in this program.”? Let’s just say GCC devs seem to forget the entire Rationale for the C standard when they go all wiggy over undefined behavior (italics original): “The X3J11 charter clearly mandates the Committee to codify common existing practice.” Casting void * to a function pointer long predates the C standard calling it “undefined behavior”. It’s only UB to accommodate architectures that have different sizes for function pointers