class REXML::Source
A Source wraps a buffer or other object, can be searched for patterns, and supports consuming the text it matches.
Attributes
buffer[R]
The current buffer (what we're going to read next)
line[R]
The line number of the last consumed text
Public Class Methods
Constructor
@param arg must be a String, and should be a valid XML document
@param encoding if non-nil, sets the encoding of the source to this value, overriding all encoding detection
# File lib/rexml/source.rb, line 41
def initialize(arg, encoding=nil)
  @orig = @buffer = arg
  if encoding
    self.encoding = encoding
  else
    detect_encoding
  end
  @line = 0
end
Public Instance Methods
# File lib/rexml/source.rb, line 85
def consume( pattern )
  @buffer = $' if pattern.match( @buffer )
end
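As a rough illustration of what consume does, the sketch below reimplements it on a hypothetical `TinySource` class (an invented name, not part of REXML). It uses `MatchData#post_match` instead of the `$'` global, which is assumed to be equivalent for this simple case.

```ruby
# Hypothetical stand-in for REXML::Source; illustration only.
class TinySource
  attr_reader :buffer

  def initialize(text)
    @buffer = text
  end

  # Drop everything up to and including the first match of pattern.
  # The buffer is left untouched when the pattern does not match.
  def consume(pattern)
    md = pattern.match(@buffer)
    @buffer = md.post_match if md
  end
end

src = TinySource.new("<a>text</a>")
src.consume(/\A<a>/)
puts src.buffer   # prints "text</a>"
```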
@return the current line in the source
# File lib/rexml/source.rb, line 115
def current_line
  lines = @orig.split
  res = lines.grep @buffer[0..30]
  res = res[-1] if res.kind_of? Array
  lines.index( res ) if res
end
@return true if the Source is exhausted
# File lib/rexml/source.rb, line 106
def empty?
  @buffer == ""
end
Inherited from Encoding. Overridden to support optimized en/decoding.
# File lib/rexml/source.rb, line 54
def encoding=(enc)
  return unless super
  encoding_updated
end
# File lib/rexml/source.rb, line 99
def match(pattern, cons=false)
  md = pattern.match(@buffer)
  @buffer = $' if cons and md
  return md
end
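The effect of the cons flag can be seen in a small sketch. `match_with_consume` is a hypothetical free-standing helper (not REXML's API) that returns both the match and the remaining buffer, so the difference between the two modes is visible without any instance state.

```ruby
# Hypothetical helper mirroring match(pattern, cons); illustration only.
# Returns the MatchData (or nil) together with the buffer that remains.
def match_with_consume(buffer, pattern, cons = false)
  md = pattern.match(buffer)
  buffer = md.post_match if cons && md
  [md, buffer]
end

md, rest = match_with_consume("name='x' id='y'", /\A(\w+)=/)
puts md[1]   # prints "name"
puts rest    # buffer is unchanged because cons is false

_md, rest = match_with_consume("name='x' id='y'", /\A(\w+)=/, true)
puts rest    # prints "'x' id='y'"
```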
# File lib/rexml/source.rb, line 89
def match_to( char, pattern )
  return pattern.match(@buffer)
end
# File lib/rexml/source.rb, line 93
def match_to_consume( char, pattern )
  md = pattern.match(@buffer)
  @buffer = $'
  return md
end
# File lib/rexml/source.rb, line 110
def position
  @orig.index( @buffer )
end
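Because position is computed with `String#index`, it reports the first place the remaining buffer occurs in the original text. The sketch below uses a hypothetical `position_of` helper (not part of REXML) to show the ordinary case and one case where repeated content gives a surprising answer.

```ruby
# Hypothetical helper; illustration only.
# Mirrors the lookup position performs: where does the remaining
# buffer first occur inside the original text?
def position_of(orig, buffer)
  orig.index(buffer)
end

# Ordinary case: "<a/>" has been consumed, "<b/>" remains.
puts position_of("<a/><b/>", "<b/>")   # prints 4

# Repeated content: after consuming the first "ab", the remaining "ab"
# also occurs at the very start, so index reports 0 rather than 2.
puts position_of("abab", "ab")         # prints 0
```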
# File lib/rexml/source.rb, line 82
def read
end
Scans the source for a given pattern. Note that this is not your usual scan() method. For one thing, the pattern argument has some requirements; for another, the source can be consumed. This method is easy to misuse. Originally, the patterns were easier to construct and this method was more robust, because it generated search regexes on the fly; however, that was computationally expensive and slowed down the entire REXML package considerably, since this is by far the most commonly called method.
@param pattern must be a Regexp, and must be in the form of /^\s*(#{your pattern, with no groups})(.*)/. The first group will be returned; the second group is used if the consume flag is set.
@param consume if true, the pattern returned will be consumed, leaving everything after it in the Source.
@return the pattern, if found, or nil if the Source is empty or the pattern is not found.
# File lib/rexml/source.rb, line 75
def scan(pattern, cons=false)
  return nil if @buffer.nil?
  rv = @buffer.scan(pattern)
  @buffer = $' if cons and rv.size>0
  rv
end
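To make the required pattern shape concrete, here is a minimal sketch that builds a two-group pattern of the documented form and applies it with `String#scan`, as the method body does. `scan_two_group` and the inner pattern are hypothetical names chosen for illustration, not part of REXML.

```ruby
# Hypothetical helper; illustration only.
# Wraps an inner pattern (which must have no capture groups of its own)
# in the documented two-group form and scans the text with it.
def scan_two_group(text, inner)
  pattern = /^\s*(#{inner})(.*)/
  text.scan(pattern)
end

result = scan_two_group("  <doc> trailing text", /<\w+>/)
p result   # prints [["<doc>", " trailing text"]]

# A pattern that does not match yields an empty array.
p scan_two_group("plain text", /<\w+>/)   # prints []
```

Note that `String#scan` returns an array of group arrays, so a caller interested in the first group would read `result[0][0]`.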
Private Instance Methods
# File lib/rexml/source.rb, line 123
def detect_encoding
  buffer_encoding = @buffer.encoding
  detected_encoding = "UTF-8"
  begin
    @buffer.force_encoding("ASCII-8BIT")
    if @buffer[0, 2] == "\xfe\xff"
      @buffer[0, 2] = ""
      detected_encoding = "UTF-16BE"
    elsif @buffer[0, 2] == "\xff\xfe"
      @buffer[0, 2] = ""
      detected_encoding = "UTF-16LE"
    elsif @buffer[0, 3] == "\xef\xbb\xbf"
      @buffer[0, 3] = ""
      detected_encoding = "UTF-8"
    end
  ensure
    @buffer.force_encoding(buffer_encoding)
  end
  self.encoding = detected_encoding
end
# File lib/rexml/source.rb, line 144
def encoding_updated
  if @encoding != 'UTF-8'
    @buffer = decode(@buffer)
    @to_utf = true
  else
    @to_utf = false
    @buffer.force_encoding ::Encoding::UTF_8
  end
end
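The byte-order-mark check in detect_encoding can be sketched on its own. `sniff_bom` below is a hypothetical helper, not REXML's API; unlike the method above, it works on a binary copy and returns the stripped text instead of mutating the buffer in place.

```ruby
# Hypothetical helper; illustration only.
# Returns the encoding name implied by a leading BOM and the
# text with that BOM removed. Without a BOM it assumes UTF-8.
def sniff_bom(text)
  bytes = text.dup.force_encoding(Encoding::BINARY)
  if bytes.start_with?("\xFE\xFF".b)
    ["UTF-16BE", bytes[2..-1]]
  elsif bytes.start_with?("\xFF\xFE".b)
    ["UTF-16LE", bytes[2..-1]]
  elsif bytes.start_with?("\xEF\xBB\xBF".b)
    ["UTF-8", bytes[3..-1]]
  else
    ["UTF-8", bytes]
  end
end

name, body = sniff_bom("\xEF\xBB\xBF<a/>")
puts name   # prints "UTF-8"
puts body   # prints "<a/>"
```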