class Prism::Source

This represents a source of Ruby code that has been parsed. It is used in conjunction with locations to allow them to resolve line numbers and source ranges.

Attributes

source [R]

The source code that this source object represents.

start_line [R]

The line number where this source starts.

Public Class Methods

for (source, start_line, offsets)

(String source, Integer start_line, Array[Integer] offsets) → Source

Source

# File lib/prism/parse_result.rb, line 33
def self.for(source, start_line, offsets)
  if source.ascii_only?
    ASCIISource.new(source, start_line, offsets)
  elsif source.encoding == Encoding::BINARY
    source.force_encoding(Encoding::UTF_8)

    if source.valid_encoding?
      new(source, start_line, offsets)
    else
      # This is an extremely niche use case where the file is marked as
      # binary, contains multi-byte characters, and those characters are not
      # valid UTF-8. In this case we'll mark it as binary and fall back to
      # treating everything as a single-byte character. This _may_ cause
      # problems when asking for code units, but it appears to be the
      # cleanest solution at the moment.
      source.force_encoding(Encoding::BINARY)
      ASCIISource.new(source, start_line, offsets)
    end
  else
    new(source, start_line, offsets)
  end
end

Create a new source object with the given source code. Use this method instead of new. It returns a specialized, more performant ASCIISource when the source code contains no multibyte characters, and a Source otherwise (a binary string that is not valid UTF-8 also falls back to ASCIISource).
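The branch selection in the listing above can be sketched without loading Prism. The helper name source_kind is hypothetical and not part of the library; it only reports which branch would be taken. Unlike the real method, it works on a copy, so the caller's string keeps its encoding.

```ruby
# Hypothetical helper: reports which kind of source object the
# selection logic shown above would pick for a given string.
def source_kind(source)
  if source.ascii_only?
    :ascii_source
  elsif source.encoding == Encoding::BINARY
    # Work on a copy so the argument is not re-tagged as UTF-8.
    candidate = source.dup.force_encoding(Encoding::UTF_8)
    candidate.valid_encoding? ? :source : :ascii_source
  else
    :source
  end
end

source_kind("a = 1")        # only ASCII bytes
source_kind("\u00E9 = 1")   # UTF-8 string with a multibyte character
source_kind("\u00E9".b)     # binary-tagged bytes that are valid UTF-8
source_kind("\xFF".b)       # binary-tagged bytes that are not valid UTF-8
```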

Note that if you are calling this method manually, you will need to supply the start_line and offsets parameters. start_line is the line number that the source starts on. It is typically 1, but it can differ if this source is a subset of a larger source or if this is an eval. offsets is an array of byte offsets for the start of each line in the source code. It can be calculated by iterating through the source code and recording the byte offset that immediately follows each newline character. The first element is always 0 to mark the first line.
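A minimal sketch of the offsets calculation described above, using only core Ruby. compute_line_offsets is a hypothetical helper, not a Prism method. It assumes that lines end with "\n" and that a trailing newline contributes a final entry equal to the byte size of the source.

```ruby
# Hypothetical helper: byte offsets at which each line starts.
def compute_line_offsets(source)
  offsets = [0] # the first line always starts at byte 0
  source.each_byte.with_index do |byte, index|
    # A new line starts on the byte right after each "\n" (0x0A).
    offsets << index + 1 if byte == 0x0A
  end
  offsets
end

offsets = compute_line_offsets("a = 1\nb = 2\n")
# offsets is [0, 6, 12]
```

The resulting array could then be passed as the third argument to Prism::Source.for, with 1 as start_line for a standalone file.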

new (source, start_line, offsets)

(String source, Integer start_line, Array[Integer] | String offsets) → void

Source

# File lib/prism/parse_result.rb, line 78
def initialize(source, start_line, offsets)
  @source = source
  @start_line = start_line
  @offsets = offsets
end

Create a new source object with the given source code. The offsets parameter can be either an Array of Integer byte offsets or a packed binary string of uint32_t values (from the C extension).

Public Instance Methods

byte_offset (line, column)

(Integer line, Integer column) → Integer

Source

# File lib/prism/parse_result.rb, line 124
def byte_offset(line, column)
  normal = line - @start_line
  raise IndexError if normal < 0
  offsets.fetch(normal) + column
rescue IndexError
  raise ArgumentError, "line #{line} is out of range"
end

Converts a line number and a column (in bytes) to a byte offset. Raises ArgumentError if the line is out of range.
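The arithmetic can be illustrated with a standalone sketch. byte_offset_for is a hypothetical function that takes the offsets array and start_line explicitly instead of reading them from a Source instance.

```ruby
# Hypothetical standalone version of the line/column conversion.
def byte_offset_for(offsets, start_line, line, column)
  index = line - start_line
  if index < 0 || index >= offsets.length
    raise ArgumentError, "line #{line} is out of range"
  end
  offsets[index] + column
end

# "a = 1\nb = 2\n" has lines starting at bytes 0 and 6 (plus the end marker 12).
offsets = [0, 6, 12]
byte_offset_for(offsets, 1, 2, 4)   # line 2, column 4: 6 + 4 = 10
byte_offset_for(offsets, 10, 11, 0) # same source, but numbered from line 10: 6
```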

character_column (byte_offset)

(Integer byte_offset) → Integer

Source

# File lib/prism/parse_result.rb, line 173
def character_column(byte_offset)
  character_offset(byte_offset) - character_offset(line_start(byte_offset))
end

Return the column in characters for the given byte offset.
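Byte columns and character columns diverge as soon as a line contains a multibyte character. The sketch below reproduces the subtraction with plain strings. The function names are hypothetical, and line_start is passed in by hand rather than looked up.

```ruby
# Hypothetical helpers mirroring the calculation shown above.
def character_offset_of(source, byte_offset)
  source.byteslice(0, byte_offset).length
end

def character_column_of(source, byte_offset, line_start)
  character_offset_of(source, byte_offset) - character_offset_of(source, line_start)
end

# The second line starts at byte 2. "\u00E9" (e with acute accent) is two bytes in UTF-8.
source = "a\n\u00E9 = 1"
equals_at = 5 # byte offset of "=" on the second line

byte_column = equals_at - 2                            # 3 bytes into the line
char_column = character_column_of(source, equals_at, 2) # 2 characters into the line
```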

character_offset (byte_offset)

(Integer byte_offset) → Integer

Source

# File lib/prism/parse_result.rb, line 166
def character_offset(byte_offset)
  (source.byteslice(0, byte_offset) or raise).length
end

Return the character offset for the given byte offset.

code_units_cache (encoding)

(Encoding encoding) → CodeUnitsCache

Source

# File lib/prism/parse_result.rb, line 207
def code_units_cache(encoding)
  CodeUnitsCache.new(source, encoding)
end

Generate a cache that targets a specific encoding for calculating code unit offsets.

code_units_column (byte_offset, encoding)

(Integer byte_offset, Encoding encoding) → Integer

Source

# File lib/prism/parse_result.rb, line 215
def code_units_column(byte_offset, encoding)
  code_units_offset(byte_offset, encoding) - code_units_offset(line_start(byte_offset), encoding)
end

Returns the column of the given byte offset, counted in code units of the given encoding.

code_units_offset (byte_offset, encoding)

(Integer byte_offset, Encoding encoding) → Integer

Source

# File lib/prism/parse_result.rb, line 191
def code_units_offset(byte_offset, encoding)
  return byte_offset if encoding == Encoding::UTF_8

  byteslice = (source.byteslice(0, byte_offset) or raise).encode(encoding, invalid: :replace, undef: :replace)

  if encoding == Encoding::UTF_16LE || encoding == Encoding::UTF_16BE
    byteslice.bytesize / 2
  else
    byteslice.length
  end
end

Returns the offset from the start of the file for the given byte offset, counted in code units of the given encoding.

This method is tested with UTF-8, UTF-16, and UTF-32. If another encoding has a notion of code units that differs from its number of characters, it is not captured here.

We purposefully replace invalid and undefined characters with replacement characters in this conversion. This happens for two reasons. First, it’s possible that the given byte offset will not occur on a character boundary. Second, it’s possible that the source code will contain a character that has no equivalent in the given encoding.
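As a worked example of how the same byte offset maps to different code unit counts, the sketch below applies the conversion from the listing above to a string that starts with a character outside the Basic Multilingual Plane. code_units_before is a hypothetical helper name.

```ruby
# Hypothetical helper: code units that precede byte_offset in source.
def code_units_before(source, byte_offset, encoding)
  return byte_offset if encoding == Encoding::UTF_8

  prefix = source.byteslice(0, byte_offset)
                 .encode(encoding, invalid: :replace, undef: :replace)

  if encoding == Encoding::UTF_16LE || encoding == Encoding::UTF_16BE
    prefix.bytesize / 2 # each UTF-16 code unit is two bytes
  else
    prefix.length       # one code unit per character
  end
end

# U+1F600 takes 4 bytes in UTF-8 and a surrogate pair (2 units) in UTF-16.
source = "\u{1F600} = 1"
equals_at = 5 # byte offset of "="

utf8_units  = code_units_before(source, equals_at, Encoding::UTF_8)    # 5
utf16_units = code_units_before(source, equals_at, Encoding::UTF_16LE) # 3
utf32_units = code_units_before(source, equals_at, Encoding::UTF_32LE) # 2
```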

column (byte_offset)

(Integer byte_offset) → Integer

Source

# File lib/prism/parse_result.rb, line 159
def column(byte_offset)
  byte_offset - line_start(byte_offset)
end

Return the column in bytes for the given byte offset.

deep_freeze ()

() → void

Source

# File lib/prism/parse_result.rb, line 222
def deep_freeze
  source.freeze
  offsets.freeze
  freeze
end

Freeze this object and the objects it contains.

encoding ()

() → Encoding

Source

# File lib/prism/parse_result.rb, line 102
def encoding
  source.encoding
end

Returns the encoding of the source code, which is set by parameters to the parser or by the encoding magic comment.

line (byte_offset)

(Integer byte_offset) → Integer

Source

# File lib/prism/parse_result.rb, line 136
def line(byte_offset)
  start_line + find_line(byte_offset)
end

Returns the line number for the given byte offset, found by a binary search through the offsets.
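The private find_line helper is not shown on this page, so the following is only a sketch of one way to run such a search, built on Array#bsearch_index. The function name line_for is hypothetical.

```ruby
# Hypothetical sketch: map a byte offset to a line number by
# binary search over the sorted line-start offsets.
def line_for(offsets, start_line, byte_offset)
  # Index of the first line that starts after byte_offset, if any.
  next_line = offsets.bsearch_index { |offset| offset > byte_offset }
  # The line containing byte_offset is the one just before it.
  index = (next_line || offsets.length) - 1
  start_line + index
end

offsets = [0, 6, 12] # line starts for "a = 1\nb = 2\n"
line_for(offsets, 1, 0) # first byte of the first line: 1
line_for(offsets, 1, 5) # the "\n" that ends the first line: 1
line_for(offsets, 1, 6) # first byte of the second line: 2
```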

line_end (byte_offset)

(Integer byte_offset) → Integer

Source

# File lib/prism/parse_result.rb, line 152
def line_end(byte_offset)
  offsets[find_line(byte_offset) + 1] || source.bytesize
end

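Returns the byte offset of the end of the line corresponding to the given byte offset. This is the offset at which the next line starts, or the byte size of the source when the given offset is on the last line.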

line_start (byte_offset)

(Integer byte_offset) → Integer

Source

# File lib/prism/parse_result.rb, line 144
def line_start(byte_offset)
  offsets[find_line(byte_offset)]
end

Return the byte offset of the start of the line corresponding to the given byte offset.
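Taken together, the start and end of a line are enough to cut the whole line out of the source. The sketch below assumes hand-written offsets and uses a linear scan (Array#rindex) in place of the library's binary search, purely for brevity. line_bounds is a hypothetical helper.

```ruby
# Hypothetical helper: [start, end) byte range of the line that
# contains byte_offset.
def line_bounds(source, offsets, byte_offset)
  index = offsets.rindex { |offset| offset <= byte_offset }
  line_start = offsets[index]
  line_end = offsets[index + 1] || source.bytesize
  [line_start, line_end]
end

source = "a = 1\nb = 2\n"
offsets = [0, 6, 12]

line_start, line_end = line_bounds(source, offsets, 8)
whole_line = source.byteslice(line_start, line_end - line_start)
# whole_line is "b = 2\n", including the trailing newline
```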

lines ()

() → Array[String]

Source

# File lib/prism/parse_result.rb, line 109
def lines
  source.lines
end

Returns the lines of the source code as an array of strings.

offsets ()

() → Array[Integer]

Source

# File lib/prism/parse_result.rb, line 67
def offsets
  offsets = @offsets
  return offsets if offsets.is_a?(Array)
  @offsets = offsets.unpack("L*")
end

The list of byte offsets at which each line of the source code starts. When initialized from the C extension, this may be a packed binary string of uint32_t values that is lazily unpacked on first access.
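The packed representation can be reproduced with Array#pack and String#unpack. The "L*" directive reads native-endian 32-bit unsigned integers, which matches the unpack call in the listing above. The sample values are made up for illustration.

```ruby
# Build a packed string the way a C extension might hand it over:
# three uint32_t values, four bytes each.
packed = [0, 6, 12].pack("L*")
packed_size = packed.bytesize # 12 bytes

# Unpacking restores the Integer array expected by the other methods.
unpacked = packed.unpack("L*")
# unpacked is [0, 6, 12]
```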

replace_offsets (offsets)

(Array[Integer] offsets) → void

Source

# File lib/prism/parse_result.rb, line 94
def replace_offsets(offsets)
  @offsets = offsets
end

Replace the value of offsets with the given value.

replace_start_line (start_line)

(Integer start_line) → void

Source

# File lib/prism/parse_result.rb, line 87
def replace_start_line(start_line)
  @start_line = start_line
end

Replace the value of start_line with the given value.

slice (byte_offset, length)

(Integer byte_offset, Integer length) → String

Source

# File lib/prism/parse_result.rb, line 117
def slice(byte_offset, length)
  source.byteslice(byte_offset, length) or raise
end

Perform a byteslice on the source code using the given byte offset and byte length.
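Because both arguments are counted in bytes, the result can differ from ordinary character-based indexing when the source contains multibyte characters. A short comparison using only String methods:

```ruby
# "\u00E9" (e with acute accent) occupies bytes 0 and 1 in UTF-8.
source = "\u00E9 = 1"

by_bytes = source.byteslice(3, 1) # byte 3 is "="
by_chars = source[3, 1]           # character 3 is the space after "="

# A byte range covering the multibyte character returns it whole.
first_char = source.byteslice(0, 2)
```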