Ruby
3.4.0dev (2024-11-05 revision 348a53415339076afc4a02fcd09f3ae36e9c4c61)
|
Our own, locale independent, character handling routines. More...
#include "ruby/internal/config.h"
#include "ruby/internal/attr/artificial.h"
#include "ruby/internal/attr/const.h"
#include "ruby/internal/attr/constexpr.h"
#include "ruby/internal/attr/nonnull.h"
#include "ruby/internal/dllexport.h"
Go to the source code of this file.
Macros | |
Old character classification macros | |
What is this ISPRINT business? Well, according to our VCS and some internet surfing, it appears that the initial intent of these macros were to mimic codes appear in common in several GNU projects. As far as @shyouhei detects they seem to originate GNU regex (that standalone one rather than Gnulib or Glibc), and at least date back to 1995. Let me lawfully quote from a GNU coreutils commit https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=49803907f5dbd7646184a8912c9db9b09dcd0f22
So the intent was to reroute old problematic systems that no longer exist. At the same time the problems described above no longer hurt us, because we decided to completely avoid using system-provided isupper etc. to reinvent the wheel. These macros are entirely legacy; please ignore them. But let me also put stress that GNU people are wise; they use those macros only inside of their own implementations and never let them be public. On the other hand ruby has thoughtlessly publicised them to 3rd party libraries since its beginning, which is a very bad idea. These macros are too easy to get conflicted with definitions elsewhere. New programs should stick to the
| |
#define | ISASCII rb_isascii |
Old name of rb_isascii. More... | |
#define | ISPRINT rb_isprint |
Old name of rb_isprint. More... | |
#define | ISGRAPH rb_isgraph |
Old name of rb_isgraph. More... | |
#define | ISSPACE rb_isspace |
Old name of rb_isspace. More... | |
#define | ISUPPER rb_isupper |
Old name of rb_isupper. More... | |
#define | ISLOWER rb_islower |
Old name of rb_islower. More... | |
#define | ISALNUM rb_isalnum |
Old name of rb_isalnum. More... | |
#define | ISALPHA rb_isalpha |
Old name of rb_isalpha. More... | |
#define | ISDIGIT rb_isdigit |
Old name of rb_isdigit. More... | |
#define | ISXDIGIT rb_isxdigit |
Old name of rb_isxdigit. More... | |
#define | ISBLANK rb_isblank |
Old name of rb_isblank. More... | |
#define | ISCNTRL rb_iscntrl |
Old name of rb_iscntrl. More... | |
#define | ISPUNCT rb_ispunct |
Old name of rb_ispunct. More... | |
#define | TOUPPER rb_toupper |
Old name of rb_toupper. More... | |
#define | TOLOWER rb_tolower |
Old name of rb_tolower. More... | |
#define | STRCASECMP st_locale_insensitive_strcasecmp |
Old name of st_locale_insensitive_strcasecmp. More... | |
#define | STRNCASECMP st_locale_insensitive_strncasecmp |
Old name of st_locale_insensitive_strncasecmp. More... | |
#define | STRTOUL ruby_strtoul |
Old name of ruby_strtoul. More... | |
Functions | |
locale insensitive functions | |
int | st_locale_insensitive_strcasecmp (const char *s1, const char *s2) |
Our own locale-insensitive version of strcasecmp(3) . More... | |
int | st_locale_insensitive_strncasecmp (const char *s1, const char *s2, size_t n) |
Our own locale-insensitive version of strcnasecmp(3) . More... | |
unsigned long | ruby_strtoul (const char *str, char **endptr, int base) |
Our own locale-insensitive version of strtoul(3) . More... | |
static int | rb_isascii (int c) |
Our own locale-insensitive version of isascii(3) . More... | |
static int | rb_isupper (int c) |
Our own locale-insensitive version of isupper(3) . More... | |
static int | rb_islower (int c) |
Our own locale-insensitive version of islower(3) . More... | |
static int | rb_isalpha (int c) |
Our own locale-insensitive version of isalpha(3) . More... | |
static int | rb_isdigit (int c) |
Our own locale-insensitive version of isdigit(3) . More... | |
static int | rb_isalnum (int c) |
Our own locale-insensitive version of isalnum(3) . More... | |
static int | rb_isxdigit (int c) |
Our own locale-insensitive version of isxdigit(3) . More... | |
static int | rb_isblank (int c) |
Our own locale-insensitive version of isblank(3) . More... | |
static int | rb_isspace (int c) |
Our own locale-insensitive version of isspace(3) . More... | |
static int | rb_iscntrl (int c) |
Our own locale-insensitive version of iscntrl(3) . More... | |
static int | rb_isprint (int c) |
Identical to rb_isgraph(), except it also returns true for ‘’ '`. More... | |
static int | rb_ispunct (int c) |
Our own locale-insensitive version of ispunct(3) . More... | |
static int | rb_isgraph (int c) |
Our own locale-insensitive version of isgraph(3) . More... | |
static int | rb_tolower (int c) |
Our own locale-insensitive version of tolower(3) . More... | |
static int | rb_toupper (int c) |
Our own locale-insensitive version of toupper(3) . More... | |
Our own, locale independent, character handling routines.
RBIMPL
or rbimpl
are implementation details. Don't take them as canon. They could rapidly appear then vanish. The name (path) of this header file is also an implementation detail. Do not expect it to persist at the place it is now. Developers are free to move it anywhere anytime at will. __VA_ARGS__
is always available. We assume C99 for ruby itself but we don't assume languages of extension libraries. They could be written in C++98. Definition in file ctype.h.
|
inlinestatic |
Our own locale-insensitive version of isalnum(3)
.
[in] | c | Byte in question to query. |
true | c is listed in either IEEE 1003.1 section 7.3.1.1 "upper", "lower", or "digit". |
false | Anything else. |
c
is an int. This means that when you pass a char
value here, it experiences "integer promotion" as defined in ISO/IEC 9899:2018 section 6.3.1.1 paragraph 1. Definition at line 326 of file ctype.h.
Referenced by rb_ispunct().
|
inlinestatic |
Our own locale-insensitive version of isalpha(3)
.
[in] | c | Byte in question to query. |
true | c is listed in either IEEE 1003.1 section 7.3.1.1 "upper" or "lower". |
false | Anything else. |
c
is an int. This means that when you pass a char
value here, it experiences "integer promotion" as defined in ISO/IEC 9899:2018 section 6.3.1.1 paragraph 1. Definition at line 279 of file ctype.h.
Referenced by rb_isalnum().
|
inlinestatic |
Our own locale-insensitive version of isascii(3)
.
[in] | c | Byte in question to query. |
false | c is out of range of ASCII character set. |
true | Yes it is. |
c
is an int. This means that when you pass a char
value here, it experiences "integer promotion" as defined in ISO/IEC 9899:2018 section 6.3.1.1 paragraph 1.
|
inlinestatic |
Our own locale-insensitive version of isblank(3)
.
[in] | c | Byte in question to query. |
true | c is listed in IEEE 1003.1 section 7.3.1.1 "blank". |
false | Anything else. |
c
is an int. This means that when you pass a char
value here, it experiences "integer promotion" as defined in ISO/IEC 9899:2018 section 6.3.1.1 paragraph 1.
|
inlinestatic |
Our own locale-insensitive version of iscntrl(3)
.
[in] | c | Byte in question to query. |
true | c is listed in IEEE 1003.1 section 7.3.1.1 "cntrl". |
false | Anything else. |
c
is an int. This means that when you pass a char
value here, it experiences "integer promotion" as defined in ISO/IEC 9899:2018 section 6.3.1.1 paragraph 1.
|
inlinestatic |
Our own locale-insensitive version of isdigit(3)
.
[in] | c | Byte in question to query. |
true | c is listed in IEEE 1003.1 section 7.3.1.1 "digit". |
false | Anything else. |
c
is an int. This means that when you pass a char
value here, it experiences "integer promotion" as defined in ISO/IEC 9899:2018 section 6.3.1.1 paragraph 1. Definition at line 302 of file ctype.h.
Referenced by rb_isalnum(), and rb_isxdigit().
|
inlinestatic |
Our own locale-insensitive version of isgraph(3)
.
[in] | c | Byte in question to query. |
true | c is listed in either IEEE 1003.1 section 7.3.1.1 "upper", "lower", "digit", or "punct". |
false | Anything else. |
c
is an int. This means that when you pass a char
value here, it experiences "integer promotion" as defined in ISO/IEC 9899:2018 section 6.3.1.1 paragraph 1.
|
inlinestatic |
Our own locale-insensitive version of islower(3)
.
[in] | c | Byte in question to query. |
true | c is listed in IEEE 1003.1 section 7.3.1.1 "lower". |
false | Anything else. |
c
is an int. This means that when you pass a char
value here, it experiences "integer promotion" as defined in ISO/IEC 9899:2018 section 6.3.1.1 paragraph 1. Definition at line 255 of file ctype.h.
Referenced by rb_isalpha(), and rb_toupper().
|
inlinestatic |
Identical to rb_isgraph(), except it also returns true for ‘’ '`.
[in] | c | Byte in question to query. |
true | c is listed in either IEEE 1003.1 section 7.3.1.1 "upper", "lower", "digit", "punct", or a ‘’ '`. |
false | Anything else. |
c
is an int. This means that when you pass a char
value here, it experiences "integer promotion" as defined in ISO/IEC 9899:2018 section 6.3.1.1 paragraph 1.
|
inlinestatic |
Our own locale-insensitive version of ispunct(3)
.
[in] | c | Byte in question to query. |
true | c is listed in IEEE 1003.1 section 7.3.1.1 "punct". |
false | Anything else. |
c
is an int. This means that when you pass a char
value here, it experiences "integer promotion" as defined in ISO/IEC 9899:2018 section 6.3.1.1 paragraph 1.
|
inlinestatic |
Our own locale-insensitive version of isspace(3)
.
[in] | c | Byte in question to query. |
true | c is listed in IEEE 1003.1 section 7.3.1.1 "space". |
false | Anything else. |
c
is an int. This means that when you pass a char
value here, it experiences "integer promotion" as defined in ISO/IEC 9899:2018 section 6.3.1.1 paragraph 1.
|
inlinestatic |
Our own locale-insensitive version of isupper(3)
.
[in] | c | Byte in question to query. |
true | c is listed in IEEE 1003.1 section 7.3.1.1 "upper". |
false | Anything else. |
c
is an int. This means that when you pass a char
value here, it experiences "integer promotion" as defined in ISO/IEC 9899:2018 section 6.3.1.1 paragraph 1. Definition at line 232 of file ctype.h.
Referenced by rb_isalpha(), and rb_tolower().
|
inlinestatic |
Our own locale-insensitive version of isxdigit(3)
.
[in] | c | Byte in question to query. |
true | c is listed in IEEE 1003.1 section 7.3.1.1 "xdigit". |
false | Anything else. |
c
is an int. This means that when you pass a char
value here, it experiences "integer promotion" as defined in ISO/IEC 9899:2018 section 6.3.1.1 paragraph 1.
|
inlinestatic |
Our own locale-insensitive version of tolower(3)
.
[in] | c | Byte in question to convert. |
c | The byte is not listed in in IEEE 1003.1 section 7.3.1.1 "upper". |
otherwise | Byte converted using the map defined in IEEE 1003.1 section 7.3.1 "tolower". |
c
is an int. This means that when you pass a char
value here, it experiences "integer promotion" as defined in ISO/IEC 9899:2018 section 6.3.1.1 paragraph 1.
|
inlinestatic |
Our own locale-insensitive version of toupper(3)
.
[in] | c | Byte in question to convert. |
c | The byte is not listed in in IEEE 1003.1 section 7.3.1.1 "lower". |
otherwise | Byte converted using the map defined in IEEE 1003.1 section 7.3.1 "toupper". |
c
is an int. This means that when you pass a char
value here, it experiences "integer promotion" as defined in ISO/IEC 9899:2018 section 6.3.1.1 paragraph 1. unsigned long ruby_strtoul | ( | const char * | str, |
char ** | endptr, | ||
int | base | ||
) |
Our own locale-insensitive version of strtoul(3)
.
The conversion is done as if the current locale is set to the "C" locale, no matter actual runtime locale settings.
strtoul("i", 0, 36)
would return zero if it is locale sensitive and the current locale is tr_TR
. [in] | str | String of digits, optionally preceded with whitespaces (ignored) and optionally + or - sign. |
[out] | endptr | NULL, or an arbitrary pointer (overwritten on return). |
[in] | base | 2 to 36 inclusive for each base, or special case 0 to detect the base from the contents of the string. |
endptr
is not NULL, it is updated to point the first such byte where conversion failed. errno
on failure.EINVAL
: Passed base
is out of range.ERANGE
: Converted integer is out of range of long
. strtoul
implementation shall render ERANGE
whenever it finds the input string represents a negative integer. Such thing can never be representable using unsigned long
. However this implementation does not honour that language. It just casts such negative value to the return type, resulting a very big return value. This behaviour is at least questionable. But we can no longer change that at this point. int st_locale_insensitive_strcasecmp | ( | const char * | s1, |
const char * | s2 | ||
) |
Our own locale-insensitive version of strcasecmp(3)
.
The "case" here always means that of the POSIX Locale. It doesn't depend on runtime locale settings.
[in] | s1 | Comparison LHS. |
[in] | s2 | Comparison RHS. |
-1 | s1 is "less" than s2 . |
0 | Both strings converted into lowercase would be identical. |
1 | s1 is "greater" than s2 . |
int st_locale_insensitive_strncasecmp | ( | const char * | s1, |
const char * | s2, | ||
size_t | n | ||
) |
Our own locale-insensitive version of strcnasecmp(3)
.
The "case" here always means that of the POSIX Locale. It doesn't depend on runtime locale settings.
[in] | s1 | Comparison LHS. |
[in] | s2 | Comparison RHS. |
[in] | n | Comparison shall stop after first n bytes are scanned. |
-1 | s1 is "less" than s2 . |
0 | Both strings converted into lowercase would be identical. |
1 | s1 is "greater" than s2 . |