Packed Data

Quick Reference

These tables summarize the directives for packing and unpacking.

For Integers

Directive     | Meaning
--------------|---------------------------------------------------------------
C             | 8-bit unsigned (unsigned char)
S             | 16-bit unsigned, native endian (uint16_t)
L             | 32-bit unsigned, native endian (uint32_t)
Q             | 64-bit unsigned, native endian (uint64_t)
J             | pointer width unsigned, native endian (uintptr_t)

c             | 8-bit signed (signed char)
s             | 16-bit signed, native endian (int16_t)
l             | 32-bit signed, native endian (int32_t)
q             | 64-bit signed, native endian (int64_t)
j             | pointer width signed, native endian (intptr_t)

S_ S!         | unsigned short, native endian
I I_ I!       | unsigned int, native endian
L_ L!         | unsigned long, native endian
Q_ Q!         | unsigned long long, native endian
              |   (raises ArgumentError if the platform has no long long type)
J!            | uintptr_t, native endian (same as J)

s_ s!         | signed short, native endian
i i_ i!       | signed int, native endian
l_ l!         | signed long, native endian
q_ q!         | signed long long, native endian
              |   (raises ArgumentError if the platform has no long long type)
j!            | intptr_t, native endian (same as j)

S> s> S!> s!> | each the same as the directive without >, but big endian
L> l> L!> l!> |   S> is the same as n
I!> i!>       |   L> is the same as N
Q> q> Q!> q!> |
J> j> J!> j!> |

S< s< S!< s!< | each the same as the directive without <, but little endian
L< l< L!< l!< |   S< is the same as v
I!< i!<       |   L< is the same as V
Q< q< Q!< q!< |
J< j< J!< j!< |

n             | 16-bit unsigned, network (big-endian) byte order
N             | 32-bit unsigned, network (big-endian) byte order
v             | 16-bit unsigned, VAX (little-endian) byte order
V             | 32-bit unsigned, VAX (little-endian) byte order

U             | UTF-8 character
w             | BER-compressed integer

For Floats

Directive | Meaning
----------|--------------------------------------------------
D d       | double-precision, native format
F f       | single-precision, native format
E         | double-precision, little-endian byte order
e         | single-precision, little-endian byte order
G         | double-precision, network (big-endian) byte order
g         | single-precision, network (big-endian) byte order

For Strings

Directive | Meaning
----------|-----------------------------------------------------------------
A         | arbitrary binary string (remove trailing nulls and ASCII spaces)
a         | arbitrary binary string
Z         | null-terminated string
B         | bit string (MSB first)
b         | bit string (LSB first)
H         | hex string (high nibble first)
h         | hex string (low nibble first)
u         | UU-encoded string
M         | quoted-printable, MIME encoding (see RFC2045)
m         | base64 encoded string (RFC 2045) (default)
          |   (base64 encoded string (RFC 4648) if followed by 0)
P         | pointer to a structure (fixed-length string)
p         | pointer to a null-terminated string

Additional Directives for Packing

Directive | Meaning
----------|----------------------------------------------------------------
@         | moves to absolute position
X         | back up a byte
x         | null byte

Additional Directives for Unpacking

Directive | Meaning
----------|----------------------------------------------------------------
@         | skip to the offset given by the length argument
X         | skip backward one byte
x         | skip forward one byte

Packing and Unpacking

Certain Ruby core methods deal with packing and unpacking data: Array#pack, String#unpack, and String#unpack1.

Each of these methods accepts a string template, consisting of zero or more directive characters, each followed by zero or more modifier characters.

Examples (directive 'C' specifies ‘unsigned character’):

[65].pack('C')      # => "A"  # One element, one directive.
[65, 66].pack('CC') # => "AB" # Two elements, two directives.
[65, 66].pack('C')  # => "A"  # Extra element is ignored.
[65].pack('')       # => ""   # No directives.
[65].pack('CC')               # Extra directive raises ArgumentError.

'A'.unpack('C')   # => [65]      # One character, one directive.
'AB'.unpack('CC') # => [65, 66]  # Two characters, two directives.
'AB'.unpack('C')  # => [65]      # Extra character is ignored.
'A'.unpack('CC')  # => [65, nil] # Extra directive generates nil.
'AB'.unpack('')   # => []        # No directives.

The string template may contain any mixture of valid directives (directive 'c' specifies ‘signed character’):

[65, -1].pack('cC')  # => "A\xFF"
"A\xFF".unpack('cC') # => [65, 255]

The string template may contain whitespace (which is ignored) and comments, each of which begins with character '#' and continues up to and including the next newline:

[0,1].pack("  C  #foo \n  C  ")    # => "\x00\x01"
"\0\1".unpack("  C  #foo \n  C  ") # => [0, 1]

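Any directive may be followed by either of these modifiers: '*' (the directive is applied as many times as needed) or an integer count (the directive is applied that many times).

If an element does not fit the given directive, only its least significant bits are encoded: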

[257].pack("C").unpack("C") # => [1]
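
As a minimal sketch of the two repeat modifiers ('*' and an integer count), the following applies directive 'C' to several elements at once; the variable names are illustrative only:

```ruby
# '*' applies the directive to every remaining element.
star_packed   = [65, 66, 67].pack('C*')  # => "ABC"
star_unpacked = 'ABC'.unpack('C*')       # => [65, 66, 67]

# An integer count applies the directive exactly that many times.
count_packed   = [65, 66, 67].pack('C2') # => "AB" (the third element is ignored)
count_unpacked = 'ABC'.unpack('C2')      # => [65, 66]
```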

Packing Method

Method Array#pack accepts optional keyword argument buffer that specifies the target string (instead of a new string):

[65, 66].pack('C*', buffer: 'foo') # => "fooAB"

The method can accept a block:

# Packed string is passed to the block.
[65, 66].pack('C*') {|s| p s }    # => "AB"

Unpacking Methods

Methods String#unpack and String#unpack1 each accept an optional keyword argument offset that specifies an offset into the string:

'ABC'.unpack('C*', offset: 1)  # => [66, 67]
'ABC'.unpack1('C*', offset: 1) # => 66

Both methods can accept a block:

# Each unpacked object is passed to the block.
ret = []
"ABCD".unpack("C*") {|c| ret << c }
ret # => [65, 66, 67, 68]

# The single unpacked object is passed to the block.
'AB'.unpack1('C*') {|ele| p ele } # => 65

Integer Directives

Each integer directive specifies the packing or unpacking for one element in the input or output array.

8-Bit Integer Directives
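
A short sketch of the 8-bit directives from the quick reference ('C' unsigned, 'c' signed); the variable names are illustrative only:

```ruby
# 'C' packs an 8-bit unsigned integer; 'c' packs an 8-bit signed integer.
unsigned_bytes = [0, 1, 255].pack('C*')   # => "\x00\x01\xFF"
signed_bytes   = [0, 1, -1].pack('c*')    # => "\x00\x01\xFF"

# The same bytes unpack differently depending on signedness.
as_unsigned = unsigned_bytes.unpack('C*') # => [0, 1, 255]
as_signed   = unsigned_bytes.unpack('c*') # => [0, 1, -1]
```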

16-Bit Integer Directives
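
A short sketch of the 16-bit directives from the quick reference. Because 's' and 'S' use the platform's byte order, their packed bytes are not shown, only round trips; the variable names are illustrative only:

```ruby
# 'n' is always big-endian (network order); 'v' is always little-endian.
big    = [0x0102].pack('n') # => "\x01\x02"
little = [0x0102].pack('v') # => "\x02\x01"

# 's' (signed) and 'S' (unsigned) use native byte order.
native_signed   = [-2].pack('s').unpack('s')    # => [-2]
native_unsigned = [65534].pack('S').unpack('S') # => [65534]

# The same two bytes read back as unsigned instead of signed.
reinterpreted = [-2].pack('s').unpack('S')      # => [65534]
```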

32-Bit Integer Directives
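
A short sketch of the 32-bit directives from the quick reference. The native-endian directives 'l' and 'L' are shown only as round trips, since their byte order depends on the platform; the variable names are illustrative only:

```ruby
# 'N' is always big-endian (network order); 'V' is always little-endian.
big    = [0x01020304].pack('N') # => "\x01\x02\x03\x04"
little = [0x01020304].pack('V') # => "\x04\x03\x02\x01"

# 'l' (signed) and 'L' (unsigned) use native byte order and are 4 bytes wide.
width         = [0].pack('L').bytesize     # => 4
native_signed = [-1].pack('l').unpack('l') # => [-1]
reinterpreted = [-1].pack('l').unpack('L') # => [4294967295]
```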

64-Bit Integer Directives
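
A short sketch of the 64-bit directives from the quick reference ('Q' unsigned, 'q' signed), with the '>' modifier used where a fixed byte order is needed; the variable names are illustrative only:

```ruby
# 'Q' packs a 64-bit unsigned integer in native byte order.
packed = [2**64 - 1].pack('Q')
width  = packed.bytesize         # => 8

# The same eight bytes read as unsigned ('Q') or signed ('q').
as_unsigned = packed.unpack('Q') # => [18446744073709551615]
as_signed   = packed.unpack('q') # => [-1]

# With '>' the byte order is fixed to big-endian.
big = [1].pack('Q>')             # => "\x00\x00\x00\x00\x00\x00\x00\x01"
```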

Platform-Dependent Integer Directives
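
A short sketch of the platform-dependent directives from the quick reference ('i' and 'I' for C int, 'j' and 'J' for pointer-width integers). The widths vary by platform, so the example measures them instead of assuming them; the variable names are illustrative only:

```ruby
# Width in bytes of a C int and of a pointer-width integer on this platform.
int_width     = [0].pack('I').bytesize
pointer_width = [0].pack('J').bytesize

# Signed and unsigned variants share the same width.
same_int_width     = [0].pack('i').bytesize == int_width     # => true
same_pointer_width = [0].pack('j').bytesize == pointer_width # => true

# Values survive a round trip through the native representation.
int_round_trip     = [-1].pack('i').unpack('i') # => [-1]
pointer_round_trip = [-1].pack('j').unpack('j') # => [-1]
```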

Other Integer Directives
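
A short sketch of the two remaining integer directives from the quick reference: 'U' (UTF-8 character) and 'w' (BER-compressed integer). The variable names are illustrative only:

```ruby
# 'U' packs each integer as the UTF-8 encoding of that code point.
utf8        = [0x41, 0x20AC].pack('U*') # => "A€"
code_points = utf8.unpack('U*')         # => [65, 8364]

# 'w' packs each non-negative integer in BER-compressed form:
# seven bits per byte, with the high bit set on every byte except the last.
ber       = [1, 128, 300].pack('w*')
ber_bytes = ber.bytes                   # => [1, 129, 0, 130, 44]
ber_back  = ber.unpack('w*')            # => [1, 128, 300]
```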

Modifiers for Integer Directives

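Directives 's', 'S', 'i', 'I', 'l', 'L', 'q', 'Q', 'j', and 'J' may be suffixed with a '!' or '_' modifier, which selects the underlying platform’s native size for the corresponding C type (short, int, long, long long, or intptr_t).

Native-size modifiers are silently ignored for directives that are always native size ('i', 'I', 'j', and 'J').

An endian modifier may also be suffixed to the directives above: '>' for big-endian or '<' for little-endian.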
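
A short sketch of the endian modifiers ('>' and '<') and the native-size modifiers ('!' and '_') on integer directives, checked against the equivalences given in the quick reference; the variable names are illustrative only:

```ruby
# '>' forces big-endian and '<' forces little-endian.
same_as_n = [0x0102].pack('S>') == [0x0102].pack('n') # => true
same_as_v = [0x0102].pack('S<') == [0x0102].pack('v') # => true
big_long  = [1].pack('L>')                            # => "\x00\x00\x00\x01"

# Without a modifier, 'L' is always 32 bits wide;
# with '!' or '_' it takes the width of the platform's C long.
fixed_long_width  = [0].pack('L').bytesize            # => 4
native_long_width = [0].pack('L!').bytesize

# 'I' is always native size, so the modifier changes nothing.
modifier_ignored = [0].pack('I!') == [0].pack('I')    # => true
```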

Float Directives

Each float directive specifies the packing or unpacking for one element in the input or output array.

Single-Precision Float Directives
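
A short sketch of the single-precision directives from the quick reference. The fixed-order directives 'e' and 'g' are used so that the bytes do not depend on the platform; the variable names are illustrative only:

```ruby
# 1.0 as an IEEE 754 single is 0x3F800000.
little = [1.0].pack('e').bytes # => [0, 0, 128, 63]  (little-endian)
big    = [1.0].pack('g').bytes # => [63, 128, 0, 0]  (big-endian)

# 'f' and 'F' use the native format and are 4 bytes wide.
native_width = [1.0].pack('f').bytesize # => 4

# Single precision cannot represent 0.1 as closely as a Ruby Float does,
# so a round trip returns a slightly different value.
lossy = [0.1].pack('e').unpack1('e')
lossy == 0.1                   # => false
```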

Double-Precision Float Directives
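
A short sketch of the double-precision directives from the quick reference. The fixed-order directives 'E' and 'G' are used so that the bytes do not depend on the platform; the variable names are illustrative only:

```ruby
# 1.0 as an IEEE 754 double is 0x3FF0000000000000.
little = [1.0].pack('E').bytes # => [0, 0, 0, 0, 0, 0, 240, 63]  (little-endian)
big    = [1.0].pack('G').bytes # => [63, 240, 0, 0, 0, 0, 0, 0]  (big-endian)

# 'd' and 'D' use the native format and are 8 bytes wide.
native_width = [1.0].pack('d').bytesize # => 8

# A Ruby Float is a double, so the round trip is exact.
exact = [0.1].pack('E').unpack1('E')    # => 0.1
```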

A float value may be infinity or not-a-number:

inf = 1.0/0.0                  # => Infinity
[inf].pack('f')                # => "\x00\x00\x80\x7F"
"\x00\x00\x80\x7F".unpack('f') # => [Infinity]

nan = inf/inf                  # => NaN
[nan].pack('f')                # => "\x00\x00\xC0\x7F"
"\x00\x00\xC0\x7F".unpack('f') # => [NaN]

String Directives

Each string directive specifies the packing or unpacking for one byte in the input or output string.

Binary String Directives
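
A short sketch of the binary string directives from the quick reference ('a', 'A', and 'Z'), showing how each pads when packing and what each strips when unpacking; the variable names are illustrative only:

```ruby
# Packing to a fixed width: 'a' and 'Z' pad with nulls, 'A' pads with spaces.
null_padded  = ['abc'].pack('a5') # => "abc\x00\x00"
space_padded = ['abc'].pack('A5') # => "abc  "
# 'Z*' appends a terminating null.
z_terminated = ['abc'].pack('Z*') # => "abc\x00"

# Unpacking: 'a' keeps everything, 'A' strips trailing nulls and spaces,
# and 'Z' stops at the first null.
raw        = "abc \x00".unpack1('a*')   # => "abc \x00"
stripped   = "abc \x00".unpack1('A*')   # => "abc"
up_to_null = "abc\x00def".unpack1('Z*') # => "abc"
```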

Bit String Directives
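
A short sketch of the bit string directives from the quick reference: 'B' reads and writes the most significant bit first, 'b' the least significant bit first. Both examples produce the byte 0x41 ("A"); the variable names are illustrative only:

```ruby
# Packing a string of '0' and '1' characters into bytes.
msb_packed = ['01000001'].pack('B*') # => "A"
lsb_packed = ['10000010'].pack('b*') # => "A"

# Unpacking a byte back into a string of bits.
msb_bits = 'A'.unpack1('B*')         # => "01000001"
lsb_bits = 'A'.unpack1('b*')         # => "10000010"
```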

Hex String Directives
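
A short sketch of the hex string directives from the quick reference: 'H' treats the first hex digit of each pair as the high nibble, 'h' treats it as the low nibble. The variable names are illustrative only:

```ruby
# Packing hex digits into bytes.
high_first_packed = ['4142'].pack('H*') # => "AB"
low_first_packed  = ['1424'].pack('h*') # => "AB"

# Unpacking bytes back into hex digits.
high_first_hex = 'AB'.unpack1('H*')     # => "4142"
low_first_hex  = 'AB'.unpack1('h*')     # => "1424"
```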

Pointer String Directives
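
A cautious sketch of directive 'p' (pointer to a null-terminated string). The packed bytes are a raw memory address, so they differ on every run and are not shown; the example assumes the source string stays referenced while the packed string is in use. The variable names are illustrative only:

```ruby
source = 'abc'

# The packed string holds the address of the string's bytes, not the bytes.
packed = [source].pack('p')
pointer_sized = packed.bytesize == [0].pack('J').bytesize # => true

# Unpacking follows the pointer and reads up to the terminating null.
restored = packed.unpack1('p')                            # => "abc"
```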

Other String Directives
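
A short sketch of the encoding directives from the quick reference: 'm' (base64), 'u' (UU-encoding), and 'M' (quoted-printable). The variable names are illustrative only:

```ruby
# 'm' produces RFC 2045 base64, which ends each line with a newline;
# 'm0' produces RFC 4648 base64 with no line breaks.
base64        = ['abc'].pack('m')   # => "YWJj\n"
base64_strict = ['abc'].pack('m0')  # => "YWJj"
base64_back   = base64.unpack1('m') # => "abc"

# 'u' produces a UU-encoded line.
uu      = ['abc'].pack('u')         # => "#86)C\n"
uu_back = uu.unpack1('u')           # => "abc"

# 'M' produces quoted-printable: '=' is escaped as "=3D" and input
# that does not end in a newline gets a trailing soft line break "=\n".
qp      = ['a=b'].pack('M')         # => "a=3Db=\n"
qp_back = qp.unpack1('M')           # => "a=b"
```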

Offset Directives
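
A short sketch of the offset directives from the quick reference ('@', 'x', and 'X'), which behave slightly differently when packing and when unpacking; the variable names are illustrative only:

```ruby
# Packing: 'x' writes a null byte, 'X' backs up one byte,
# and '@' moves to an absolute position (padding with nulls if needed).
with_null   = [65, 66].pack('CxC')      # => "A\x00B"
overwritten = [65, 66, 67].pack('CCXC') # => "AC"
positioned  = [65].pack('@2C')          # => "\x00\x00A"

# Unpacking: 'x' skips forward one byte, 'X' skips backward one byte,
# and '@' skips to the given offset.
skip_forward  = 'ABCD'.unpack('CxC')    # => [65, 67]
skip_backward = 'ABCD'.unpack('CXC')    # => [65, 65]
at_offset     = 'ABCD'.unpack('@2C')    # => [67]
```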