The parser state used to parse regular expressions.
| Type | Field | Description |
| --- | --- | --- |
| pm_parser_t * | parser | The parser that is currently being used. |
| const uint8_t * | start | A pointer to the start of the source that we are parsing. |
| const uint8_t * | cursor | A pointer to the current position in the source. |
| const uint8_t * | end | A pointer to the end of the source that we are parsing. |
| const pm_encoding_t * | encoding | The encoding of the source. |
| pm_regexp_name_callback_t | name_callback | The callback to call when a named capture group is found. |
| pm_regexp_name_data_t * | name_data | The data to pass to the name callback. |
| const uint8_t * | node_start | The start of the regexp node (for error locations). |
| const uint8_t * | node_end | The end of the regexp node (for error locations). |
| const pm_encoding_t * | explicit_encoding | The explicit encoding determined by escape sequences. |
| const uint8_t * | property_name | Pointer to the first non-POSIX property name (for /n error messages). |
| size_t | property_name_length | Length of the first non-POSIX property name found. |
| const uint8_t * | unicode_property_name | Pointer to the first Unicode-only property name (for /e, /s error messages). |
| size_t | unicode_property_name_length | Length of the first Unicode-only property name found. |
| pm_buffer_t | hex_escape_buffer | Buffer of hex escape byte values >= 0x80, separated by 0x00 sentinels. |
| uint32_t | non_ascii_literal_count | Count of non-ASCII literal bytes (not from escapes). |
| bool | extended_mode | Whether or not the regular expression currently being parsed is in extended mode, wherein whitespace is ignored and comments are allowed. |
| bool | encoding_changed | Whether the encoding has changed from the default. |
| bool | shared | Whether the source content is shared (for named capture callback). |
| bool | has_unicode_escape | Whether a \u{...} escape with value >= 0x80 was seen. |
| bool | has_hex_escape | Whether a \xNN escape (or \M-x, etc.) with value >= 0x80 was seen. |
| bool | last_escape_was_unicode | Tracks whether the last encoding-setting escape was \u (true) or \x (false). |
| bool | has_property_escape | Whether any \p{...} or \P{...} property escape was found. |
| bool | has_unicode_property_escape | Whether a Unicode-only property escape was found (not POSIX or script). |
| bool | invalid_unicode_range | Whether a \u escape with invalid range (surrogate or > 0x10FFFF) was seen. |
| bool | hex_group_active | Whether we are accumulating consecutive hex escape bytes. |
| bool | has_invalid_multibyte | Whether an invalid multibyte character was found during parsing. |
Definition at line 23 of file regexp.c.
◆ cursor
| const uint8_t* pm_regexp_parser_t::cursor |
A pointer to the current position in the source.
Definition at line 31 of file regexp.c.
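The start, cursor, and end pointers describe a conventional cursor-based scanner. The following is a minimal sketch of that pattern, not the code in regexp.c: the `demo_scanner_t` type and its helper functions are hypothetical names, and the sketch assumes that `end` points one past the last byte of the source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the three source pointers of pm_regexp_parser_t. */
typedef struct {
    const uint8_t *start;  /* first byte of the source */
    const uint8_t *cursor; /* next byte to be read */
    const uint8_t *end;    /* assumed to be one past the last byte */
} demo_scanner_t;

static void demo_scanner_init(demo_scanner_t *scanner, const uint8_t *source, size_t length) {
    assert(scanner != NULL && source != NULL);
    scanner->start = source;
    scanner->cursor = source;
    scanner->end = source + length;
}

/* True once every byte of the source has been consumed. */
static bool demo_scanner_eof(const demo_scanner_t *scanner) {
    return scanner->cursor >= scanner->end;
}

/* Consume `value` if it is the next byte; report whether it was consumed. */
static bool demo_scanner_accept(demo_scanner_t *scanner, uint8_t value) {
    if (!demo_scanner_eof(scanner) && *scanner->cursor == value) {
        scanner->cursor++;
        return true;
    }
    return false;
}

/* Offset of the cursor from the start of the source. */
static size_t demo_scanner_offset(const demo_scanner_t *scanner) {
    return (size_t) (scanner->cursor - scanner->start);
}
```

Keeping `start` alongside `cursor` lets a scanner of this shape turn the current position into an offset without any extra bookkeeping.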
◆ encoding
| const pm_encoding_t* pm_regexp_parser_t::encoding |
The encoding of the source.
Definition at line 37 of file regexp.c.
◆ encoding_changed
| bool pm_regexp_parser_t::encoding_changed |
Whether the encoding has changed from the default.
Definition at line 91 of file regexp.c.
◆ end
| const uint8_t* pm_regexp_parser_t::end |
A pointer to the end of the source that we are parsing.
Definition at line 34 of file regexp.c.
◆ explicit_encoding
| const pm_encoding_t* pm_regexp_parser_t::explicit_encoding |
The explicit encoding determined by escape sequences.
NULL if no encoding-setting escape has been seen, UTF-8 for \u escapes, or the source encoding for \x escapes.
Definition at line 56 of file regexp.c.
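The three states of explicit_encoding can be illustrated with a small sketch. It is an assumption-laden model rather than the logic in regexp.c: `demo_encoding_t`, `demo_state_t`, and both `demo_note_*` functions are hypothetical, and the sketch does not model which escapes count as encoding-setting (for example, any value threshold).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for pm_encoding_t; only pointer identity matters here. */
typedef struct {
    const char *name;
} demo_encoding_t;

static const demo_encoding_t demo_utf8 = { "UTF-8" };

/* Hypothetical subset of the parser state. */
typedef struct {
    const demo_encoding_t *encoding;          /* encoding of the source */
    const demo_encoding_t *explicit_encoding; /* NULL until an escape sets it */
    bool last_escape_was_unicode;             /* true for a Unicode escape, false for a hex escape */
} demo_state_t;

/* Record an encoding-setting Unicode escape: the explicit encoding becomes UTF-8. */
static void demo_note_unicode_escape(demo_state_t *state) {
    assert(state != NULL);
    state->explicit_encoding = &demo_utf8;
    state->last_escape_was_unicode = true;
}

/* Record an encoding-setting hex escape: the explicit encoding becomes the source encoding. */
static void demo_note_hex_escape(demo_state_t *state) {
    assert(state != NULL);
    state->explicit_encoding = state->encoding;
    state->last_escape_was_unicode = false;
}
```

A NULL explicit_encoding therefore doubles as the "no encoding-setting escape seen yet" marker, so no separate flag is needed for that case.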
◆ extended_mode
| bool pm_regexp_parser_t::extended_mode |
Whether or not the regular expression currently being parsed is in extended mode, wherein whitespace is ignored and comments are allowed.
Definition at line 88 of file regexp.c.
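As an illustration of what extended mode implies for a scanner, the sketch below skips whitespace and `#` comments only when the flag is set. The function names are hypothetical, and the sketch deliberately ignores details a real parser has to handle, such as escaped `#` characters and character classes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static bool demo_is_space(uint8_t byte) {
    return byte == ' ' || byte == '\t' || byte == '\n' || byte == '\r' || byte == '\f' || byte == '\v';
}

/*
 * Return the first position at or after `cursor` that holds a significant byte.
 * Outside extended mode every byte is significant, so nothing is skipped.
 */
static const uint8_t *demo_skip_insignificant(const uint8_t *cursor, const uint8_t *end, bool extended_mode) {
    assert(cursor != NULL && end != NULL && cursor <= end);
    if (!extended_mode) return cursor;

    while (cursor < end) {
        if (demo_is_space(*cursor)) {
            cursor++;
        } else if (*cursor == '#') {
            /* A comment runs up to (not including) the end of the line. */
            while (cursor < end && *cursor != '\n') cursor++;
        } else {
            break;
        }
    }
    return cursor;
}
```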
◆ has_hex_escape
| bool pm_regexp_parser_t::has_hex_escape |
Whether a \xNN escape (or \M-x, etc.) with value >= 0x80 was seen.
Definition at line 100 of file regexp.c.
◆ has_invalid_multibyte
| bool pm_regexp_parser_t::has_invalid_multibyte |
Whether an invalid multibyte character was found during parsing.
Definition at line 121 of file regexp.c.
◆ has_property_escape
| bool pm_regexp_parser_t::has_property_escape |
Whether any \p{...} or \P{...} property escape was found.
Definition at line 109 of file regexp.c.
◆ has_unicode_escape
| bool pm_regexp_parser_t::has_unicode_escape |
Whether a \u{...} escape with value >= 0x80 was seen.
Definition at line 97 of file regexp.c.
◆ has_unicode_property_escape
| bool pm_regexp_parser_t::has_unicode_property_escape |
Whether a Unicode-only property escape was found (not POSIX or script).
Definition at line 112 of file regexp.c.
◆ hex_escape_buffer
| pm_buffer_t pm_regexp_parser_t::hex_escape_buffer |
Buffer of hex escape byte values >= 0x80, separated by 0x00 sentinels.
Definition at line 79 of file regexp.c.
◆ hex_group_active
| bool pm_regexp_parser_t::hex_group_active |
Whether we are accumulating consecutive hex escape bytes.
Definition at line 118 of file regexp.c.
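One way to picture how hex_group_active and hex_escape_buffer could cooperate is sketched below. This is a guess at the general scheme, not the implementation: the fixed-size `demo_hex_groups_t` stands in for pm_buffer_t, all function names are hypothetical, and the sketch assumes that each run of consecutive hex escapes is closed with one trailing 0x00 and that escapes below 0x80 are simply not stored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CAPACITY 64

/* Hypothetical fixed-size stand-in for pm_buffer_t plus the group flag. */
typedef struct {
    uint8_t bytes[DEMO_CAPACITY];
    size_t length;
    bool group_active; /* mirrors hex_group_active */
} demo_hex_groups_t;

static void demo_append(demo_hex_groups_t *groups, uint8_t byte) {
    assert(groups->length < DEMO_CAPACITY);
    groups->bytes[groups->length++] = byte;
}

/* Record one hex escape byte; only values >= 0x80 are stored. */
static void demo_hex_escape(demo_hex_groups_t *groups, uint8_t byte) {
    if (byte < 0x80) return;
    demo_append(groups, byte);
    groups->group_active = true;
}

/* Called when something other than a hex escape is seen: close the open group. */
static void demo_hex_group_end(demo_hex_groups_t *groups) {
    if (!groups->group_active) return;
    demo_append(groups, 0x00);
    groups->group_active = false;
}
```

Because only values of 0x80 and above are stored, a 0x00 byte can never be mistaken for data, which is what makes it usable as a separator.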
◆ invalid_unicode_range
| bool pm_regexp_parser_t::invalid_unicode_range |
Whether a \u escape with invalid range (surrogate or > 0x10FFFF) was seen.
Definition at line 115 of file regexp.c.
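The range condition described above can be written as a single predicate. The function name is hypothetical; the bounds are the standard Unicode surrogate block (U+D800 to U+DFFF) and the last valid code point (U+10FFFF).

```c
#include <stdbool.h>
#include <stdint.h>

/* True when `value` is a surrogate or lies beyond the last Unicode code point. */
static bool demo_invalid_unicode_range(uint32_t value) {
    bool surrogate = value >= 0xD800 && value <= 0xDFFF;
    return surrogate || value > 0x10FFFF;
}
```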
◆ last_escape_was_unicode
| bool pm_regexp_parser_t::last_escape_was_unicode |
Tracks whether the last encoding-setting escape was \u (true) or \x (false).
This matters for error messages when both types are mixed.
Definition at line 106 of file regexp.c.
◆ name_callback
| pm_regexp_name_callback_t pm_regexp_parser_t::name_callback |
The callback to call when a named capture group is found.
Definition at line 40 of file regexp.c.
◆ name_data
| pm_regexp_name_data_t* pm_regexp_parser_t::name_data |
The data to pass to the name callback.
Definition at line 43 of file regexp.c.
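name_callback and name_data follow the usual C pairing of a function pointer with a caller-supplied data pointer. The sketch below shows that pairing only. Everything in it is hypothetical: the callback signature is an assumption and may not match pm_regexp_name_callback_t, and the scanning loop is a toy that ignores escapes and character classes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical callback shape; the real pm_regexp_name_callback_t may differ. */
typedef void (*demo_name_callback_t)(const uint8_t *name, size_t length, void *data);

/* Hypothetical user data: counts names and remembers the longest length. */
typedef struct {
    size_t count;
    size_t longest;
} demo_name_data_t;

static void demo_record_name(const uint8_t *name, size_t length, void *data) {
    (void) name;
    demo_name_data_t *names = (demo_name_data_t *) data;
    names->count++;
    if (length > names->longest) names->longest = length;
}

/* Find groups of the form (?<name>...) and invoke the callback for each name. */
static void demo_each_name(const uint8_t *start, const uint8_t *end, demo_name_callback_t callback, void *data) {
    for (const uint8_t *cursor = start; (size_t) (end - cursor) >= 3; cursor++) {
        if (memcmp(cursor, "(?<", 3) != 0) continue;

        const uint8_t *name = cursor + 3;
        /* (?<= and (?<! are lookbehinds, not named groups. */
        if (name < end && (*name == '=' || *name == '!')) continue;

        const uint8_t *close = memchr(name, '>', (size_t) (end - name));
        if (close == NULL) break;
        if (close > name) callback(name, (size_t) (close - name), data);
        cursor = close;
    }
}
```

Passing the data pointer through unchanged lets the caller decide what to accumulate without the scanner knowing anything about it.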
◆ node_end
| const uint8_t* pm_regexp_parser_t::node_end |
The end of the regexp node (for error locations).
Definition at line 49 of file regexp.c.
◆ node_start
| const uint8_t* pm_regexp_parser_t::node_start |
The start of the regexp node (for error locations).
Definition at line 46 of file regexp.c.
◆ non_ascii_literal_count
| uint32_t pm_regexp_parser_t::non_ascii_literal_count |
Count of non-ASCII literal bytes (not from escapes).
Definition at line 82 of file regexp.c.
◆ parser
| pm_parser_t* pm_regexp_parser_t::parser |
The parser that is currently being used.
Definition at line 25 of file regexp.c.
◆ property_name
| const uint8_t* pm_regexp_parser_t::property_name |
Pointer to the first non-POSIX property name (for /n error messages).
POSIX properties (Alnum, Alpha, etc.) work in all encodings. Script properties (Hiragana, Katakana, etc.) work in /e, /s, /u. Unicode-only properties (L, Ll, etc.) work only in /u.
Definition at line 64 of file regexp.c.
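The compatibility rules above amount to a small lookup from property kind and encoding flag to allowed or not. A minimal sketch, using hypothetical enum and function names and covering only the four flags mentioned in the text:

```c
#include <stdbool.h>

/* Hypothetical classification of a property name. */
typedef enum {
    DEMO_PROPERTY_POSIX,       /* e.g. Alnum, Alpha */
    DEMO_PROPERTY_SCRIPT,      /* e.g. Hiragana, Katakana */
    DEMO_PROPERTY_UNICODE_ONLY /* e.g. L, Ll */
} demo_property_kind_t;

/* Hypothetical names for the /n, /e, /s, and /u regexp flags. */
typedef enum {
    DEMO_FLAG_N,
    DEMO_FLAG_E,
    DEMO_FLAG_S,
    DEMO_FLAG_U
} demo_encoding_flag_t;

/* Whether a property of the given kind is usable under the given flag. */
static bool demo_property_allowed(demo_property_kind_t kind, demo_encoding_flag_t flag) {
    switch (kind) {
        case DEMO_PROPERTY_POSIX: return true;                       /* all encodings */
        case DEMO_PROPERTY_SCRIPT: return flag != DEMO_FLAG_N;       /* /e, /s, /u */
        case DEMO_PROPERTY_UNICODE_ONLY: return flag == DEMO_FLAG_U; /* /u only */
    }
    return false;
}
```

Read this way, property_name is what an /n error needs (any non-POSIX property fails there), while unicode_property_name is what an /e or /s error needs (only Unicode-only properties fail there).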
◆ property_name_length
| size_t pm_regexp_parser_t::property_name_length |
Length of the first non-POSIX property name found.
Definition at line 67 of file regexp.c.
◆ shared
| bool pm_regexp_parser_t::shared |
Whether the source content is shared (for named capture callback).
Definition at line 94 of file regexp.c.
◆ start
| const uint8_t* pm_regexp_parser_t::start |
A pointer to the start of the source that we are parsing.
Definition at line 28 of file regexp.c.
◆ unicode_property_name
| const uint8_t* pm_regexp_parser_t::unicode_property_name |
Pointer to the first Unicode-only property name (for /e, /s error messages).
NULL if only POSIX or script properties have been seen.
Definition at line 73 of file regexp.c.
◆ unicode_property_name_length
| size_t pm_regexp_parser_t::unicode_property_name_length |
Length of the first Unicode-only property name found.
Definition at line 76 of file regexp.c.