According to http://www.cplusplus.com/reference/cstdlib/strtol/ this function has a signature of long int strtol (const char* str, char** endptr, int base)
.
I wonder, though: If it gets passed a const char *
to the beginning of the string, how does it manage to turn that into a non-const pointer to the first unprocessed character without cheating? What would an implementation of strtol look like that doesn't perform a const_cast?
How do you implement strtol
under const-correctness?
You don't, because strtol
's definition is inherently not const
-correct.
This is a flaw in the C standard library.
There are several standard functions that take a const char*
argument (expected to point to the beginning of a character array) and give back a non-const
char*
pointer that can be used to modify that array.
strchr
is one example:
char *strchr(const char *s, int c);
For example:
#include <string.h>
int main(void) {
const char *s = "hello";
char *ptr = strchr(s, 'h');
*ptr = 'H';
}
This program has undefined behavior. On my system, it dies with a segmentation fault.
The problem doesn't occur in strchr
itself. It promises not to modify the string you pass to it, and it doesn't. But it returns a pointer that the caller can then use to modify it.
The ANSI C committee, back in the late 1980s, could have split each such function into two versions, one that acts on const
character arrays and another for non-const
arrays:
char *strchr(char *s, int c);
const char *strcchr(const char *s, int c);
But that would have broken existing pre-ANSI code, written before const
existed. This is the same reason C has not made string literals const
.
C++, which inherits most of C's standard library, deals with this by providing overloaded versions of some functions.
The bottom line is that you, as a C programmer, are responsible for not modifying objects you've defined as const
. In most cases, the language helps you enforce this, but not always.
As for how these functions manage to return a non-const
pointer to const
data, they probably just use a cast internally (not a const_cast
, which exists only in C++). That's assuming they're implemented in C, which is likely but not required.
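To make the "cast internally" idea concrete, here is a minimal sketch of how a strchr-like function could be written in C. The name my_strchr is hypothetical, and a real library implementation may look quite different (for instance, it may be hand-written assembly). The point is only that the function reads through a const pointer and drops const with a single cast on the way out.

```c
#include <stddef.h>

/* Hypothetical name; a sketch, not any particular library's strchr. */
char *my_strchr(const char *s, int c)
{
    const char ch = (char)c;   /* the search value is compared as a char */

    for (;; s++) {
        if (*s == ch)
            return (char *)s;  /* const is cast away; nothing is written here */
        if (*s == '\0')
            return NULL;       /* hit the terminator without finding ch */
    }
}
```

The function itself never writes through s. Whether writing through the returned pointer is valid depends entirely on whether the caller's array was modifiable in the first place.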
Most likely it just uses casting.
There are numerous functions in the standard library that have this same property. Sacrificing type safety for simplicity is the likely reason, since C cannot overload functions as C++ can.
The programmer is expected to take responsibility and not modify the string through endptr
if str
is, for example, a string literal.
With its limited type system, C is a practical language for practical people.
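As an illustration of that responsibility, here is a small sketch of a caller that does write through the end pointer. The helper name parse_and_truncate is hypothetical. It takes a plain char *, which signals that the buffer must be writable. Passing it a string literal would still compile in C, but the write would be undefined behavior.

```c
#include <stdlib.h>

/* Hypothetical helper: parses the leading decimal number in a writable
   buffer and cuts the buffer off right after it. */
long parse_and_truncate(char *buf)
{
    char *end;
    long value = strtol(buf, &end, 10);

    /* Valid only because buf points to modifiable storage.
       If no digits were found, end == buf and the buffer becomes empty. */
    *end = '\0';
    return value;
}
```

Called with an array such as char buf[] = "123abc", this returns 123 and leaves "123" in the buffer.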
strtol
does do a const_cast
(or equivalent). Casting a const
away is not a problem; using the resulting pointer to modify the originally-const
pointee may be.
But strtol
just returns this pointer to you without tampering with it, so everything is fine.
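A rough sketch of what that looks like inside a strtol-like function is below. It is a simplified stand-in, not a real library's code: the name my_strtol10 is hypothetical, it handles base 10 only, and it assumes a two's complement long (so that LONG_MIN is -LONG_MAX - 1). The single cast is marked in the comments.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for strtol: base 10 only. */
long my_strtol10(const char *str, char **endptr)
{
    const char *p = str;
    int negative = 0, any_digits = 0, overflow = 0;
    unsigned long acc = 0, limit;

    while (isspace((unsigned char)*p))      /* skip leading whitespace */
        p++;
    if (*p == '+' || *p == '-')             /* optional sign */
        negative = (*p++ == '-');

    /* Largest allowed magnitude: LONG_MAX, or LONG_MAX + 1 when negative. */
    limit = (unsigned long)LONG_MAX + (negative ? 1UL : 0UL);

    for (; isdigit((unsigned char)*p); p++) {
        unsigned long digit = (unsigned long)(*p - '0');
        any_digits = 1;
        if (acc > (limit - digit) / 10)
            overflow = 1;                   /* keep consuming digits anyway */
        else
            acc = acc * 10 + digit;
    }

    if (endptr != NULL)
        /* The only place const is dropped; nothing is written through p. */
        *endptr = (char *)(any_digits ? p : str);

    if (overflow) {
        errno = ERANGE;
        return negative ? LONG_MIN : LONG_MAX;
    }
    if (negative)
        return acc == limit ? LONG_MIN : -(long)acc;
    return (long)acc;
}
```

The characters of str are only ever read. The cast affects just the type of the pointer stored in *endptr.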
How do you implement strtol under const-correctness?
Use of C11 _Generic
would allow code to call either of
// when passed argument for `str` is `char *` and for `endptr` is `char **`
long strtol(const char* str, char** endptr, int base);
// or
// when passed argument for `str` is `const char *` and for `endptr` is `const char **`
long strtol_c(const char* str, const char** endptr, int base);
// and warn/error otherwise
The implementation below simply forwards to strtol(), since only the function signature needs to change. Because the resulting interface differs from strtol()
, it should be called something else such as strtol_s()
.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
long int strtol_c(const char * restrict nptr, const char ** restrict endptr, int base) {
return strtol((char *) nptr, (char **) endptr, base);
}
#define strtol_s(n,e,b) _Generic(n, \
char *: strtol((n), (e), (b)), \
const char *: strtol_c((n), (e), (b)), \
default: 0 \
)
int main(void) {
char *src = malloc(100);
strcpy(src, "456");
const char *srcc = "123";
char *endptr;
const char *endcptr;
long L[6] = { 0 };
// OK - matching str and *endptr
L[0] = strtol_s(src, &endptr, 0);
// warning: passing argument 2 of 'strtol' from incompatible pointer type
L[1] = strtol_s(src, &endcptr, 0);
// warning: passing argument 2 of 'strtol_c' from incompatible pointer type
L[2] = strtol_s(srcc, &endptr, 0);
// OK - matching str and *endptr
L[3] = strtol_s(srcc, &endcptr, 0);
// OK
L[4] = strtol(src, &endptr, 0);
// warning: passing argument 2 of 'strtol' from incompatible pointer type
L[5] = strtol(src, &endcptr, 0);
return !L[0];
}
What is lost: strtol_s()
is a macro, not a true function, so you cannot take a pointer to it.
how does it manage to turn that into a non-const pointer to the first unprocessed character without cheating?
strtol()
, although it takes a char **endptr
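as the second argument, does not modify the characters that *endptr
points to. It only stores a pointer into *endptr
, and storing a pointer does not write to the string.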
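On the caller's side, the pointer that comes back can be assigned straight to a const char *, which restores const-correctness without any cast. A minimal sketch, where the helper name parse_readonly is hypothetical:

```c
#include <stdlib.h>

/* Hypothetical wrapper for read-only input: the end position is handed
   back as a pointer to const, so callers cannot write through it. */
long parse_readonly(const char *str, const char **rest)
{
    char *end;                           /* the type strtol insists on */
    long value = strtol(str, &end, 10);

    *rest = end;   /* char * converts implicitly to const char * */
    return value;
}
```

The non-const char * exists only as a local variable inside the wrapper, so code that calls parse_readonly never sees a pointer it could use to modify the string.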