Ruby VM Stack and Frame Layout
This document explains the Ruby VM stack architecture, including how the value stack (SP) and control frames (CFP) share a single contiguous memory region, and how individual frames are structured.
VM Stack Architecture
The Ruby VM uses a single contiguous stack (ec->vm_stack) with two regions growing toward each other. Understanding this requires distinguishing the overall architecture (how CFPs and values share one stack) from individual frame internals (how values are organized within a single frame).
High addresses (ec->vm_stack + ec->vm_stack_size)
    ↓
    [CFP region starts here] ← RUBY_VM_END_CONTROL_FRAME(ec)
    [CFP - 1]     New frame pushed here (grows downward)
    [CFP - 2]     Another frame
    ...
    (Unused space - stack overflow when they meet)
    ...           Value stack grows UP toward higher addresses
    [SP + n]      Values pushed here
    [ec->cfp->sp] Current executing frame's stack pointer
    ↑
Low addresses (ec->vm_stack)
The "unused space" represents the free space available for new frames and values. When this gap closes (CFP meets SP), a stack overflow occurs.
Stack Growth Directions
Control Frames (CFP):
- Start at ec->vm_stack + ec->vm_stack_size (high addresses)
- Grow downward toward lower addresses as frames are pushed
- Each new frame is allocated at cfp - 1 (lower address)
- The rb_control_frame_t structure itself moves downward
Value Stack (SP):
- Starts at ec->vm_stack (low addresses)
- Grows upward toward higher addresses as values are pushed
- Each frame's cfp->sp points to the top of its value stack
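The two growth directions can be sketched with a toy model in C. This is a simplified illustration, not the VM's code: every toy_* name is hypothetical, the frame struct holds only sp, and a plain assert stands in for real overflow handling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef uintptr_t toy_value_t;            /* stand-in for VALUE */

typedef struct toy_frame {                /* stand-in for rb_control_frame_t */
    toy_value_t *sp;                      /* top of this frame's value stack */
} toy_frame_t;

typedef struct toy_ec {                   /* stand-in for the execution context */
    toy_value_t *vm_stack;                /* lowest address of the region */
    size_t vm_stack_size;                 /* size in toy_value_t slots */
    toy_frame_t *cfp;                     /* most recently pushed control frame */
} toy_ec_t;

/* Highest address of the region: control frames are carved out below it. */
toy_frame_t *toy_end_control_frame(const toy_ec_t *ec)
{
    return (toy_frame_t *)(ec->vm_stack + ec->vm_stack_size);
}

/* Allocate the shared region; returns 0 on allocation failure. */
int toy_init(toy_ec_t *ec, size_t size)
{
    ec->vm_stack = malloc(size * sizeof(toy_value_t));
    if (ec->vm_stack == NULL) return 0;
    ec->vm_stack_size = size;
    ec->cfp = toy_end_control_frame(ec);  /* no frames pushed yet */
    return 1;
}

/* Push a control frame at cfp - 1; its values start at `sp`. */
toy_frame_t *toy_push_frame(toy_ec_t *ec, toy_value_t *sp)
{
    toy_frame_t *cfp = ec->cfp - 1;       /* control frames grow downward */
    assert((toy_value_t *)cfp >= sp);     /* toy model: no real overflow handling */
    cfp->sp = sp;
    ec->cfp = cfp;
    return cfp;
}

/* Push a value onto a frame's value stack. */
void toy_push_value(toy_frame_t *cfp, toy_value_t v)
{
    *cfp->sp++ = v;                       /* values grow upward */
}

void toy_free(toy_ec_t *ec)
{
    free(ec->vm_stack);
    ec->vm_stack = NULL;
}
```

After toy_init, each toy_push_frame call returns an address one frame lower than the previous one, while each toy_push_value call moves that frame's sp one slot higher.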
Stack Overflow
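When recursive calls push too many frames, CFP grows downward until it collides with SP growing upward. The VM detects this with CHECK_VM_STACK_OVERFLOW0, which computes const struct rb_control_frame_struct *bound = (void *)&sp[margin]; and raises if cfp <= &bound[1]. In other words, the check fails when, after reserving margin value slots above sp, the gap left below cfp is no larger than one control frame.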
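The comparison can be tried out in isolation. The following is a minimal sketch that mirrors only the pointer arithmetic described above; the ovf_* names and the two-field frame struct are assumptions made for illustration, and the real macro raises an exception rather than returning a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t ovf_value_t;            /* stand-in for VALUE */

typedef struct ovf_frame {                /* stand-in for rb_control_frame_t */
    ovf_value_t *sp;
    ovf_value_t *ep;
} ovf_frame_t;

/* Free value slots between the value-stack top and the lowest control frame. */
size_t ovf_free_slots(const ovf_frame_t *cfp, const ovf_value_t *sp)
{
    assert((const ovf_value_t *)cfp >= sp);   /* regions must not have crossed */
    return (size_t)((const ovf_value_t *)cfp - sp);
}

/*
 * Reserve `margin` value slots above `sp`, treat that address as the start
 * of a control frame, and report overflow when `cfp` is not strictly above
 * the end of that frame.
 */
bool ovf_would_overflow(const ovf_frame_t *cfp, const ovf_value_t *sp, size_t margin)
{
    const ovf_frame_t *bound = (const void *)&sp[margin];
    return cfp <= &bound[1];
}
```

With a 64-slot region and one frame already pushed at the top, ovf_would_overflow flips from false to true once sp plus margin comes within one frame of cfp.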
Understanding Individual Frame Value Stacks
Each frame has its own portion of the overall VM stack, called its "VM value stack" or simply "value stack". This space is pre-allocated when the frame is created, with its size determined by:
- local_size - space for local variables
- stack_max - maximum depth for temporary values during execution
The frame's value stack grows upward from its base (where self/arguments/locals live) toward cfp->sp (the current top of temporary values).
Visualizing How Frames Fit in the VM Stack
The left side shows the overall VM stack with CFP metadata separated from frame values. The right side zooms into one frame's value region, revealing its internal structure.
Overall VM Stack (ec->vm_stack):           Zooming into Frame 2's value stack:
High addr (vm_stack + vm_stack_size)       High addr (cfp->sp)
    ↓                                    ┌
  [CFP 1 metadata]                       │  [Temporaries]
  [CFP 2 metadata] ─────────┐            │  [Env: Flags/Block/CME] ← cfp->ep
  [CFP 3 metadata]          │            │  [Locals]
  ────────────────          │          ┌─┤  [Arguments]
  (unused space)            │          │ │  [self]
  ────────────────          │          │ └
  [Frame 3 values]          │          │    Low addr (frame base)
  [Frame 2 values] <────────┴──────────┘
  [Frame 1 values]
    ↑
Low addr (vm_stack)
Examining a Single Frame's Value Stack
Now let's walk through a concrete Ruby program to see how a single frame's value stack is structured internally:
def foo(x, y)
  z = x.casecmp(y)
end

foo(:one, :two)
First, after arguments are evaluated and right before the send to foo:
                                ┌────────────┐
  putself                       │    :two    │
  putobject :one           0x2  ├────────────┤
  putobject :two                │    :one    │
► send <:foo, argc:2>      0x1  ├────────────┤
  leave                         │    self    │
                           0x0  └────────────┘
The put* instructions have pushed 3 items onto the stack. It's now time to add a new control frame for foo. The following is the shape of the stack after one instruction in foo:
 cfp->sp=0x8 at this point.
                           0x8  ┌────────────┐◄──Stack space for temporaries
                                │    :one    │   live above the environment.
                           0x7  ├────────────┤
  getlocal x@0                  │ < flags >  │   foo's rb_control_frame_t
► getlocal y@1             0x6  ├────────────┤◄──has cfp->ep=0x6
  send <:casecmp, argc:1>       │ <no block> │
  dup                      0x5  ├────────────┤  The flags, block, and CME triple
  setlocal z@2                  │ <CME: foo> │  (VM_ENV_DATA_SIZE) form an
  leave                    0x4  ├────────────┤  environment. They can be used to
                                │   z (nil)  │  figure out what local variables
                           0x3  ├────────────┤  are below them.
                                │    :two    │
                           0x2  ├────────────┤  Notice how the arguments, now
                                │    :one    │  locals, never moved. This layout
                           0x1  ├────────────┤  allows for argument transfer
                                │    self    │  without copying.
                           0x0  └────────────┘
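The transition between the two diagrams can be replayed with a small sketch. It is modeled on the diagrams only, not taken from the VM source: the ex_* names, the EX_NIL marker, and the integer stand-ins for the CME, block, and flags are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t ex_value_t;             /* stand-in for VALUE */

#define EX_NIL ((ex_value_t)0)            /* hypothetical nil marker */

typedef struct ex_frame {                 /* stand-in for rb_control_frame_t */
    ex_value_t *sp;
    ex_value_t *ep;
} ex_frame_t;

/*
 * `sp` points just past the receiver and arguments the caller already
 * pushed. The arguments stay where they are and become locals; only the
 * remaining locals and the environment triple are written above them.
 */
void ex_setup_frame(ex_frame_t *cfp, ex_value_t *sp, size_t local_size,
                    ex_value_t cme, ex_value_t block, ex_value_t flags)
{
    assert(cfp != NULL && sp != NULL);
    for (size_t i = 0; i < local_size; i++) {
        *sp++ = EX_NIL;                   /* non-argument locals start as nil */
    }
    *sp++ = cme;                          /* environment triple, in the */
    *sp++ = block;                        /* order shown in the diagram */
    *sp++ = flags;
    cfp->ep = sp - 1;                     /* ep points at the flags slot */
    cfp->sp = sp;                         /* temporaries go above the environment */
}
```

Calling ex_setup_frame with sp at slot 3 (just above self, :one, and :two) and local_size of 1 (for z) leaves ep at slot 6 and sp at slot 7. Pushing one temporary then gives the cfp->sp=0x8 shown in the diagram.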
Given that locals have lower addresses than cfp->ep, it makes sense that getlocal in insns.def has val = *(vm_get_ep(GET_EP(), level) - idx);. When accessing variables in the immediate scope, where level=0, it's essentially val = cfp->ep[-idx];.
Note that this EP-relative index has a different basis than the index that comes after "@" in disassembly listings. The "@" index is relative to the 0th local (x in this case).
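The relationship between the two indexes can be worked out from the example layout, where x@0 sits five slots below ep and z@2 sits three slots below it. The conversion below is inferred from that diagram rather than quoted from the VM source, and the idx_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t idx_value_t;            /* stand-in for VALUE */

enum { IDX_ENV_DATA_SIZE = 3 };           /* flags, block, CME in the diagram */

/*
 * Convert an "@"-style index (0 = first local) into a distance below ep.
 * The last local sits directly under the environment triple, so it is
 * IDX_ENV_DATA_SIZE slots below ep; earlier locals are further down.
 */
size_t idx_ep_offset(size_t local_count, size_t at_index)
{
    assert(at_index < local_count);
    return IDX_ENV_DATA_SIZE + (local_count - 1 - at_index);
}

/* Read a local in the immediate scope (level 0), i.e. ep[-idx]. */
idx_value_t idx_getlocal(const idx_value_t *ep, size_t local_count, size_t at_index)
{
    return *(ep - idx_ep_offset(local_count, at_index));
}
```

For foo, which has three locals, idx_ep_offset(3, 0) is 5 and idx_ep_offset(3, 2) is 3, matching ep=0x6 with x stored in slot 0x1 and z in slot 0x3.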
Q&A
Q: It seems that the receiver is always at an offset relative to EP, like locals. Couldn't we use EP to access it instead of using cfp->self?
A: Not all calls put the self in the callee on the stack. Two examples are Proc#call, where the receiver is the Proc object, but self inside the callee is Proc#receiver, and yield, where the receiver isn't pushed onto the stack before the arguments.
Q: Why have cfp->ep when it seems that everything is below cfp->sp?
A: In the example, cfp->ep points to the stack, but it can also point to the GC heap. Blocks can capture and evacuate their environment to the heap.
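A rough sketch of why a separate ep is useful: once the locals and the environment triple are copied elsewhere, ep can be repointed and ep-relative reads keep working, while sp stays on the VM stack. The code below is an assumption-heavy illustration. It uses malloc in place of the GC heap, the env_* names are hypothetical, and it does not reflect how the VM actually performs the copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uintptr_t env_value_t;            /* stand-in for VALUE */

enum { ENV_DATA_SIZE = 3 };               /* flags, block, CME in the diagram */

typedef struct env_frame {                /* stand-in for rb_control_frame_t */
    env_value_t *sp;
    env_value_t *ep;
} env_frame_t;

/*
 * Copy the locals plus the environment triple off the stack and repoint ep
 * at the copy. Returns the new block (the caller frees it in this sketch),
 * or NULL if the allocation fails.
 */
env_value_t *env_escape(env_frame_t *cfp, size_t local_count)
{
    assert(cfp != NULL && cfp->ep != NULL);
    size_t n = local_count + ENV_DATA_SIZE;
    env_value_t *first = cfp->ep - (n - 1);       /* lowest local slot */
    env_value_t *heap = malloc(n * sizeof(env_value_t));
    if (heap == NULL) return NULL;
    memcpy(heap, first, n * sizeof(env_value_t));
    cfp->ep = heap + (n - 1);                     /* same relative layout */
    return heap;
}
```

After env_escape, ep[-5] still yields the first local even though ep no longer points into the stack, and later writes through ep do not touch the original stack slots.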