(25c72b0e8e206e5baec71d4ece7551b7da7da445)

Our own, locale independent, character handling routines.

#include "ruby/internal/config.h"
#include "ruby/internal/attr/artificial.h"
#include "ruby/internal/attr/const.h"
#include "ruby/internal/attr/constexpr.h"
#include "ruby/internal/attr/nonnull.h"
#include "ruby/internal/dllexport.h"

Macros
Old character classification macros
What is this ISPRINT business? Well, according to our VCS and some internet surfing, it appears that the initial intent of these macros was to mimic code that appears in common in several GNU projects. As far as @shyouhei can detect, they seem to originate from GNU regex (the standalone one, rather than Gnulib or Glibc), and date back to at least 1995.

Let me lawfully quote from a GNU coreutils commit https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=49803907f5dbd7646184a8912c9db9b09dcd0f22

Jim Meyering writes: "... Some ctype macros are valid only for character codes that isascii says are ASCII (SGI's IRIX-4.0.5 is one such system –when using /bin/cc or gcc but without giving an ansi option). So, all ctype uses should be through macros like ISPRINT... If STDC_HEADERS is defined, then autoconf has verified that the ctype macros don't need to be guarded with references to isascii. ... Defining isascii to 1 should let any compiler worth its salt eliminate the && through constant folding."

Bruno Haible adds: "... Furthermore, isupper(c) etc. have an undefined result if c is outside the range -1 <= c <= 255. One is tempted to write isupper(c) with c being of type ‘char’, but this is wrong if c is an 8-bit character >= 128 which gets sign-extended to a negative value. The macro ISUPPER protects against this as well."

So the intent was to work around old problematic systems that no longer exist. At the same time, the problems described above no longer hurt us, because we decided to avoid the system-provided isupper etc. completely and reinvent the wheel. These macros are entirely legacy; please ignore them.

But let me also stress that the GNU people are wise: they use those macros only inside their own implementations and never make them public. Ruby, on the other hand, has thoughtlessly publicised them to 3rd party libraries since its beginning, which is a very bad idea. These macros conflict too easily with definitions elsewhere.

New programs should stick to the `rb_` prefixed names.

Note: It seems we just mimic the API. We do not share their implementation with GPL-ed programs.
#define	ISASCII rb_isascii
	Old name of rb_isascii.

#define	ISPRINT rb_isprint
	Old name of rb_isprint.

#define	ISGRAPH rb_isgraph
	Old name of rb_isgraph.

#define	ISSPACE rb_isspace
	Old name of rb_isspace.

#define	ISUPPER rb_isupper
	Old name of rb_isupper.

#define	ISLOWER rb_islower
	Old name of rb_islower.

#define	ISALNUM rb_isalnum
	Old name of rb_isalnum.

#define	ISALPHA rb_isalpha
	Old name of rb_isalpha.

#define	ISDIGIT rb_isdigit
	Old name of rb_isdigit.

#define	ISXDIGIT rb_isxdigit
	Old name of rb_isxdigit.

#define	ISBLANK rb_isblank
	Old name of rb_isblank.

#define	ISCNTRL rb_iscntrl
	Old name of rb_iscntrl.

#define	ISPUNCT rb_ispunct
	Old name of rb_ispunct.

#define	TOUPPER rb_toupper
	Old name of rb_toupper.

#define	TOLOWER rb_tolower
	Old name of rb_tolower.

#define	STRCASECMP st_locale_insensitive_strcasecmp
	Old name of st_locale_insensitive_strcasecmp.

#define	STRNCASECMP st_locale_insensitive_strncasecmp
	Old name of st_locale_insensitive_strncasecmp.

#define	STRTOUL ruby_strtoul
	Old name of ruby_strtoul.

Functions
locale insensitive functions
int	st_locale_insensitive_strcasecmp (const char *s1, const char *s2)
	Our own locale-insensitive version of `strcasecmp(3)`.

int	st_locale_insensitive_strncasecmp (const char *s1, const char *s2, size_t n)
	Our own locale-insensitive version of `strncasecmp(3)`.

unsigned long	ruby_strtoul (const char *str, char **endptr, int base)
	Our own locale-insensitive version of `strtoul(3)`.

static int	rb_isascii (int c)
	Our own locale-insensitive version of `isascii(3)`.

static int	rb_isupper (int c)
	Our own locale-insensitive version of `isupper(3)`.

static int	rb_islower (int c)
	Our own locale-insensitive version of `islower(3)`.

static int	rb_isalpha (int c)
	Our own locale-insensitive version of `isalpha(3)`.

static int	rb_isdigit (int c)
	Our own locale-insensitive version of `isdigit(3)`.

static int	rb_isalnum (int c)
	Our own locale-insensitive version of `isalnum(3)`.

static int	rb_isxdigit (int c)
	Our own locale-insensitive version of `isxdigit(3)`.

static int	rb_isblank (int c)
	Our own locale-insensitive version of `isblank(3)`.

static int	rb_isspace (int c)
	Our own locale-insensitive version of `isspace(3)`.

static int	rb_iscntrl (int c)
	Our own locale-insensitive version of `iscntrl(3)`.

static int	rb_isprint (int c)
	Identical to rb_isgraph(), except it also returns true for `' '`.

static int	rb_ispunct (int c)
	Our own locale-insensitive version of `ispunct(3)`.

static int	rb_isgraph (int c)
	Our own locale-insensitive version of `isgraph(3)`.

static int	rb_tolower (int c)
	Our own locale-insensitive version of `tolower(3)`.

static int	rb_toupper (int c)
	Our own locale-insensitive version of `toupper(3)`.

Detailed Description

Our own, locale independent, character handling routines.

Author: Ruby developers ruby-core@ruby-lang.org

Copyright: This file is a part of the programming language Ruby. Permission is hereby granted, to either redistribute and/or modify this file, provided that the conditions mentioned in the file COPYING are met. Consult the file for details.

Warning: Symbols prefixed with either RBIMPL or rbimpl are implementation details. Don't take them as canon. They could rapidly appear then vanish. The name (path) of this header file is also an implementation detail. Do not expect it to persist at the place it is now. Developers are free to move it anywhere anytime at will.

Note: To ruby-core: remember that this header can possibly be recursively included from extension libraries written in C++. Do not expect, for instance, that __VA_ARGS__ is always available. We assume C99 for ruby itself, but we don't assume the languages of extension libraries. They could be written in C++98.

Definition in file ctype.h.

Function Documentation

◆ rb_isalnum()

static int rb_isalnum ( int c )

inline static

Our own locale-insensitive version of isalnum(3).

Parameters

[in] c Byte in question to query.

Return values

true	`c` is listed in either IEEE 1003.1 section 7.3.1.1 "upper", "lower", or "digit".
false	Anything else.

Note: Not only does this function work under the POSIX Locale, it also assumes that its execution character set is what ruby calls an ASCII-compatible character set, which does not include, for instance, EBCDIC or UTF-16LE.

Warning: c is an int. This means that when you pass a char value here, it experiences "integer promotion" as defined in ISO/IEC 9899:2018 section 6.3.1.1 paragraph 1.
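The promotion mentioned in the warning can be illustrated with a minimal sketch. The names `sketch_isalnum` and `sketch_promotion_demo` are hypothetical and are not part of ruby's API; the range checks assume an ASCII-compatible execution character set and 8-bit bytes, and are not taken from ruby's actual implementation.

```c
#include <assert.h>

/* Hypothetical helper (not ruby's implementation): an ASCII-only test for
 * the "upper", "lower" and "digit" classes of the POSIX locale. */
int sketch_isalnum(int c)
{
    return ('0' <= c && c <= '9') ||
           ('A' <= c && c <= 'Z') ||
           ('a' <= c && c <= 'z');
}

/* Shows what integer promotion does to a byte >= 0x80 that is stored in a
 * signed char, and how casting through unsigned char keeps it in 0..255. */
void sketch_promotion_demo(void)
{
    signed char byte = -23;              /* same bits as 0xE9 on two's complement */
    int promoted = byte;                 /* promoted value stays negative: -23 */
    int as_byte = (unsigned char)byte;   /* 233, i.e. 0xE9 */

    assert(promoted == -23);
    assert(as_byte == 233);

    /* Neither representation is an ASCII letter or digit. */
    assert(!sketch_isalnum(promoted));
    assert(!sketch_isalnum(as_byte));
}
```

Because the sketch only compares against ASCII ranges, both the negative and the non-negative form of the byte are simply rejected; no table lookup is involved that a negative index could break.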

Definition at line 326 of file ctype.h.

Referenced by rb_ispunct().

◆ rb_isalpha()

static int rb_isalpha ( int c )

inline static

Our own locale-insensitive version of isalpha(3).

Parameters

[in] c Byte in question to query.

Return values

true	`c` is listed in either IEEE 1003.1 section 7.3.1.1 "upper" or "lower".
false	Anything else.

Note: Not only does this function work under the POSIX Locale, it also assumes that its execution character set is what ruby calls an ASCII-compatible character set, which does not include, for instance, EBCDIC or UTF-16LE.

Warning: c is an int. This means that when you pass a char value here, it experiences "integer promotion" as defined in ISO/IEC 9899:2018 section 6.3.1.1 paragraph 1.

Definition at line 279 of file ctype.h.

Referenced by rb_isalnum().

◆ rb_isascii()

static int rb_isascii ( int c )

inline static

Our own locale-insensitive version of isascii(3).

Parameters

[in] c Byte in question to query.

Return values

false	`c` is out of the range of the ASCII character set.
true	`c` is within the range of the ASCII character set.

Warning: c is an int. This means that when you pass a char value here, it experiences "integer promotion" as defined in ISO/IEC 9899:2018 section 6.3.1.1 paragraph 1.

Definition at line 209 of file ctype.h.

◆ rb_isblank()

static int rb_isblank ( int c )

inline static

Our own locale-insensitive version of isblank(3).

Parameters

[in] c Byte in question to query.

Return values

true	`c` is listed in IEEE 1003.1 section 7.3.1.1 "blank".
false	Anything else.

Note: Not only does this function work under the POSIX Locale, it also assumes that its execution character set is what ruby calls an ASCII-compatible character set, which does not include, for instance, EBCDIC or UTF-16LE.

Warning: c is an int. This means that when you pass a char value here, it experiences "integer promotion" as defined in ISO/IEC 9899:2018 section 6.3.1.1 paragraph 1.

Definition at line 372 of file ctype.h.

◆ rb_iscntrl()

static int rb_iscntrl ( int c )

inline static

Our own locale-insensitive version of iscntrl(3).

Parameters

[in] c Byte in question to query.

Return values

true	`c` is listed in IEEE 1003.1 section 7.3.1.1 "cntrl".
false	Anything else.

Note: Not only does this function work under the POSIX Locale, it also assumes that its execution character set is what ruby calls an ASCII-compatible character set, which does not include, for instance, EBCDIC or UTF-16LE.

Warning: c is an int. This means that when you pass a char value here, it experiences "integer promotion" as defined in ISO/IEC 9899:2018 section 6.3.1.1 paragraph 1.

Definition at line 418 of file ctype.h.

◆ rb_isdigit()

static int rb_isdigit ( int c )

inline static

Our own locale-insensitive version of isdigit(3).

Parameters

[in] c Byte in question to query.

Return values

true	`c` is listed in IEEE 1003.1 section 7.3.1.1 "digit".
false	Anything else.

Note: Not only does this function work under the POSIX Locale, it also assumes that its execution character set is what ruby calls an ASCII-compatible character set, which does not include, for instance, EBCDIC or UTF-16LE.

Warning: c is an int. This means that when you pass a char value here, it experiences "integer promotion" as defined in ISO/IEC 9899:2018 section 6.3.1.1 paragraph 1.

Definition at line 302 of file ctype.h.

Referenced by rb_isalnum(), and rb_isxdigit().

◆ rb_isgraph()

static int rb_isgraph ( int c )

inline static

Our own locale-insensitive version of isgraph(3).

Parameters

[in] c Byte in question to query.

Return values

true	`c` is listed in either IEEE 1003.1 section 7.3.1.1 "upper", "lower", "digit", or "punct".
false	Anything else.

Note: Not only does this function work under the POSIX Locale, it also assumes that its execution character set is what ruby calls an ASCII-compatible character set, which does not include, for instance, EBCDIC or UTF-16LE.

Warning: c is an int. This means that when you pass a char value here, it experiences "integer promotion" as defined in ISO/IEC 9899:2018 section 6.3.1.1 paragraph 1.

Definition at line 489 of file ctype.h.

◆ rb_islower()

static int rb_islower ( int c )

inline static

Our own locale-insensitive version of islower(3).

Parameters

[in] c Byte in question to query.

Return values

true	`c` is listed in IEEE 1003.1 section 7.3.1.1 "lower".
false	Anything else.

Note: Not only does this function work under the POSIX Locale, it also assumes that its execution character set is what ruby calls an ASCII-compatible character set, which does not include, for instance, EBCDIC or UTF-16LE.

Warning: c is an int. This means that when you pass a char value here, it experiences "integer promotion" as defined in ISO/IEC 9899:2018 section 6.3.1.1 paragraph 1.

Definition at line 255 of file ctype.h.

Referenced by rb_isalpha(), and rb_toupper().

◆ rb_isprint()

static int rb_isprint ( int c )

inline static

Identical to rb_isgraph(), except it also returns true for `' '`.

Parameters

[in] c Byte in question to query.

Return values

true	`c` is listed in either IEEE 1003.1 section 7.3.1.1 "upper", "lower", "digit", "punct", or is a `' '`.
false	Anything else.

Note: Not only does this function work under the POSIX Locale, it also assumes that its execution character set is what ruby calls an ASCII-compatible character set, which does not include, for instance, EBCDIC or UTF-16LE.

Warning: c is an int. This means that when you pass a char value here, it experiences "integer promotion" as defined in ISO/IEC 9899:2018 section 6.3.1.1 paragraph 1.
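The relationship between rb_isgraph() and rb_isprint() can be sketched as follows. This is only an illustration under the assumption of an ASCII-compatible character set, where the "upper", "lower", "digit" and "punct" classes together cover `0x21` through `0x7e`; the names `sketch_isgraph` and `sketch_isprint` are hypothetical and the code is not ruby's actual implementation.

```c
#include <assert.h>

/* Hypothetical helper: true for the visible ASCII characters, that is
 * '!' (0x21) through '~' (0x7e). */
int sketch_isgraph(int c)
{
    return 0x21 <= c && c <= 0x7e;
}

/* Hypothetical helper: the same set plus the space character. */
int sketch_isprint(int c)
{
    return c == ' ' || sketch_isgraph(c);
}

/* Small self-check showing that the two differ only for ' '. */
void sketch_print_demo(void)
{
    assert(sketch_isprint(' '));
    assert(!sketch_isgraph(' '));
    assert(sketch_isprint('~') && sketch_isgraph('~'));
    assert(!sketch_isprint('\t'));   /* tab is "blank" but not printable */
}
```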

Definition at line 442 of file ctype.h.

◆ rb_ispunct()

static int rb_ispunct ( int c )

inline static

Our own locale-insensitive version of ispunct(3).

Parameters

[in] c Byte in question to query.

Return values

true	`c` is listed in IEEE 1003.1 section 7.3.1.1 "punct".
false	Anything else.

Note: Not only does this function work under the POSIX Locale, it also assumes that its execution character set is what ruby calls an ASCII-compatible character set, which does not include, for instance, EBCDIC or UTF-16LE.

Warning: c is an int. This means that when you pass a char value here, it experiences "integer promotion" as defined in ISO/IEC 9899:2018 section 6.3.1.1 paragraph 1.

Definition at line 465 of file ctype.h.

◆ rb_isspace()

static int rb_isspace ( int c )

inline static

Our own locale-insensitive version of isspace(3).

Parameters

[in] c Byte in question to query.

Return values

true	`c` is listed in IEEE 1003.1 section 7.3.1.1 "space".
false	Anything else.

Note: Not only does this function work under the POSIX Locale, it also assumes that its execution character set is what ruby calls an ASCII-compatible character set, which does not include, for instance, EBCDIC or UTF-16LE.

Warning: c is an int. This means that when you pass a char value here, it experiences "integer promotion" as defined in ISO/IEC 9899:2018 section 6.3.1.1 paragraph 1.

Definition at line 395 of file ctype.h.

◆ rb_isupper()

static int rb_isupper ( int c )

inline static

Our own locale-insensitive version of isupper(3).

Parameters

[in] c Byte in question to query.

Return values

true	`c` is listed in IEEE 1003.1 section 7.3.1.1 "upper".
false	Anything else.

Note: Not only does this function work under the POSIX Locale, it also assumes that its execution character set is what ruby calls an ASCII-compatible character set, which does not include, for instance, EBCDIC or UTF-16LE.

Warning: c is an int. This means that when you pass a char value here, it experiences "integer promotion" as defined in ISO/IEC 9899:2018 section 6.3.1.1 paragraph 1.

Definition at line 232 of file ctype.h.

Referenced by rb_isalpha(), and rb_tolower().

◆ rb_isxdigit()

static int rb_isxdigit ( int c )

inline static

Our own locale-insensitive version of isxdigit(3).

Parameters

[in] c Byte in question to query.

Return values

true	`c` is listed in IEEE 1003.1 section 7.3.1.1 "xdigit".
false	Anything else.

Note: Not only does this function work under the POSIX Locale, it also assumes that its execution character set is what ruby calls an ASCII-compatible character set, which does not include, for instance, EBCDIC or UTF-16LE.

Warning: c is an int. This means that when you pass a char value here, it experiences "integer promotion" as defined in ISO/IEC 9899:2018 section 6.3.1.1 paragraph 1.

Definition at line 349 of file ctype.h.

◆ rb_tolower()

static int rb_tolower ( int c )

inline static

Our own locale-insensitive version of tolower(3).

Parameters

[in] c Byte in question to convert.

Return values

c	The byte is not listed in IEEE 1003.1 section 7.3.1.1 "upper".
otherwise	Byte converted using the map defined in IEEE 1003.1 section 7.3.1 "tolower".

Note: Not only does this function work under the POSIX Locale, it also assumes that its execution character set is what ruby calls an ASCII-compatible character set, which does not include, for instance, EBCDIC or UTF-16LE.

Warning: c is an int. This means that when you pass a char value here, it experiences "integer promotion" as defined in ISO/IEC 9899:2018 section 6.3.1.1 paragraph 1.

Definition at line 514 of file ctype.h.

◆ rb_toupper()

static int rb_toupper ( int c )

inline static

Our own locale-insensitive version of toupper(3).

Parameters

[in] c Byte in question to convert.

Return values

c	The byte is not listed in IEEE 1003.1 section 7.3.1.1 "lower".
otherwise	Byte converted using the map defined in IEEE 1003.1 section 7.3.1 "toupper".

Note: Not only does this function work under the POSIX Locale, it also assumes that its execution character set is what ruby calls an ASCII-compatible character set, which does not include, for instance, EBCDIC or UTF-16LE.

Warning: c is an int. This means that when you pass a char value here, it experiences "integer promotion" as defined in ISO/IEC 9899:2018 section 6.3.1.1 paragraph 1.
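The return values documented for rb_tolower() and rb_toupper() can be sketched as below. The names `sketch_tolower` and `sketch_toupper` are hypothetical, the code assumes an ASCII-compatible character set, and it is not claimed to match ruby's actual implementation.

```c
#include <assert.h>

/* Hypothetical helper: maps 'A'..'Z' to 'a'..'z' and returns every other
 * value unchanged, including negative values and bytes >= 0x80. */
int sketch_tolower(int c)
{
    return ('A' <= c && c <= 'Z') ? c + ('a' - 'A') : c;
}

/* Hypothetical helper: maps 'a'..'z' to 'A'..'Z' and returns every other
 * value unchanged. */
int sketch_toupper(int c)
{
    return ('a' <= c && c <= 'z') ? c - ('a' - 'A') : c;
}

/* Small self-check of the "c" and "otherwise" cases. */
void sketch_case_demo(void)
{
    assert(sketch_tolower('Q') == 'q');   /* converted */
    assert(sketch_tolower('q') == 'q');   /* not "upper": returned as is */
    assert(sketch_toupper('q') == 'Q');   /* converted */
    assert(sketch_toupper('1') == '1');   /* not "lower": returned as is */
}
```

Unlike a locale-sensitive conversion, this mapping never touches bytes outside the 26 ASCII letters, so the result does not depend on the runtime locale.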

Definition at line 539 of file ctype.h.

◆ ruby_strtoul()

unsigned long ruby_strtoul	(	const char *	str,
		char **	endptr,
		int	base
	)

Our own locale-insensitive version of strtoul(3).

The conversion is done as if the current locale is set to the "C" locale, no matter actual runtime locale settings.

Note: This is needed because strtoul("i", 0, 36) would return zero if it is locale sensitive and the current locale is tr_TR.

Parameters

[in]	str	String of digits, optionally preceded by whitespace (ignored) and an optional `+` or `-` sign.
[out]	endptr	NULL, or an arbitrary pointer (overwritten on return).
[in]	base	`2` to `36` inclusive for each base, or special case `0` to detect the base from the contents of the string.

Returns: Converted integer, cast to unsigned long.

Postcondition: If endptr is not NULL, it is updated to point to the first byte where conversion failed.
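The parameter and postcondition contract above can be illustrated with a minimal sketch. It calls the standard `strtoul` as a stand-in, on the assumption that the program runs in the default "C" locale; ruby_strtoul() has the same signature according to this page, so it could be substituted where ruby's headers are available. The wrapper name `sketch_parse_ulong` is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical wrapper: parses a leading unsigned integer from `s`.
 * Returns 1 on success, storing the value in *out and the address of the
 * first unconverted byte in *rest. Returns 0 if no digits were consumed
 * or if the conversion set errno. */
int sketch_parse_ulong(const char *s, int base,
                       unsigned long *out, const char **rest)
{
    char *end;
    unsigned long value;

    errno = 0;                       /* errno is only meaningful if reset first */
    value = strtoul(s, &end, base);  /* stand-in for ruby_strtoul() */
    if (end == s || errno != 0) {
        return 0;
    }
    *out = value;
    *rest = end;
    return 1;
}
```

For example, parsing `"123abc"` in base 10 yields 123 with `*rest` pointing at `"abc"`, and base 0 makes `"0x1f"` parse as 31.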

Note

This function sets errno on failure.

EINVAL: Passed base is out of range.
ERANGE: Converted integer is out of range of long.

Warning: As far as @shyouhei reads ISO/IEC 9899:2018 section 7.22.1.4, a conforming strtoul implementation shall render ERANGE whenever it finds that the input string represents a negative integer. Such a value can never be representable using unsigned long. However, this implementation does not honour that language. It just casts such a negative value to the return type, resulting in a very big return value. This behaviour is at least questionable. But we can no longer change that at this point.
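To see why the cast described in this warning produces "a very big return value", consider the following sketch of the conversion alone. The helper name `sketch_cast_negative` is hypothetical; the code shows only the C conversion rule for signed-to-unsigned casts, not ruby_strtoul() itself.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: converts a signed value to unsigned long. For a
 * negative input, C defines the result as the value plus ULONG_MAX + 1. */
unsigned long sketch_cast_negative(long value)
{
    return (unsigned long)value;
}

/* Small self-check: -1 becomes the largest unsigned long. */
void sketch_cast_demo(void)
{
    assert(sketch_cast_negative(-1L) == ULONG_MAX);
    assert(sketch_cast_negative(-5L) == ULONG_MAX - 4UL);
}
```

Callers that need to reject negative input may therefore want to inspect the string for a leading `-` themselves rather than rely on the return value.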

Note: Not only does this function work under the "C" locale, it also assumes that its execution character set is what ruby calls an ASCII-compatible character set, which does not include, for instance, EBCDIC or UTF-16LE.

Definition at line 117 of file util.c.

Referenced by ruby_strtoul().

◆ st_locale_insensitive_strcasecmp()

int st_locale_insensitive_strcasecmp	(	const char *	s1,
		const char *	s2
	)

Our own locale-insensitive version of strcasecmp(3).

The "case" here always means that of the POSIX Locale. It doesn't depend on runtime locale settings.

Parameters

[in]	s1	Comparison LHS.
[in]	s2	Comparison RHS.

Return values

-1	`s1` is "less" than `s2`.
0	Both strings converted into lowercase would be identical.
1	`s1` is "greater" than `s2`.

Note: Not only does this function work under the POSIX Locale, it also assumes that its execution character set is what ruby calls an ASCII-compatible character set, which does not include, for instance, EBCDIC or UTF-16LE.
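A locale-insensitive comparison with the return values listed above can be sketched as follows. This is an assumption-laden illustration, not the code in st.c: the names `sketch_lower` and `sketch_strcasecmp` are hypothetical, and only the 26 ASCII letters are folded.

```c
#include <assert.h>

/* Hypothetical helper: folds 'A'..'Z' to lowercase, leaves other bytes as is. */
static int sketch_lower(unsigned char c)
{
    return ('A' <= c && c <= 'Z') ? c + ('a' - 'A') : c;
}

/* Hypothetical comparison: returns -1, 0 or 1, comparing byte by byte
 * after ASCII-only case folding. Bytes are read as unsigned char so that
 * values >= 0x80 are never sign-extended. */
int sketch_strcasecmp(const char *s1, const char *s2)
{
    for (;; s1++, s2++) {
        int a = sketch_lower((unsigned char)*s1);
        int b = sketch_lower((unsigned char)*s2);

        if (a != b) {
            return a < b ? -1 : 1;
        }
        if (a == 0) {
            return 0;   /* both strings ended at the same position */
        }
    }
}
```

With this sketch, `sketch_strcasecmp("Hello", "hELLO")` is 0, while `sketch_strcasecmp("abc", "abd")` is -1 because the first differing bytes compare as `'c' < 'd'`.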

Definition at line 2023 of file st.c.

◆ st_locale_insensitive_strncasecmp()

int st_locale_insensitive_strncasecmp	(	const char *	s1,
		const char *	s2,
		size_t	n
	)

Our own locale-insensitive version of strncasecmp(3).

The "case" here always means that of the POSIX Locale. It doesn't depend on runtime locale settings.

Parameters

[in]	s1	Comparison LHS.
[in]	s2	Comparison RHS.
[in]	n	Comparison shall stop after first `n` bytes are scanned.

Return values

-1	`s1` is "less" than `s2`.
0	Both strings converted into lowercase would be identical.
1	`s1` is "greater" than `s2`.

Note: Not only does this function work under the POSIX Locale, it also assumes that its execution character set is what ruby calls an ASCII-compatible character set, which does not include, for instance, EBCDIC or UTF-16LE.

Warning: This function is not timing safe.

Definition at line 2047 of file st.c.
