Architecture#
VMSifter uses a two-tier architecture: a Python layer for orchestration, fuzzing logic, and result collection, and a C layer for low-level VM interaction, instruction injection, and performance counter readout. The two layers communicate over a Unix domain socket.
Components#
SifterExecutor#
File: vmsifter/executor.py
The main orchestrator. Responsibilities:
Creates a Unix domain socket and a ProcessPoolExecutor
Queries Xen for available physical CPUs (PCPUs), excluding those allocated to Dom0 and optionally filtering SMT siblings
Instantiates the selected fuzzer and partitions its search space across available PCPUs
For each PCPU: creates an injector, accepts its socket connection, creates a Worker, and submits it to the process pool
On worker completion: deallocates injector + worker, returns PCPU to the available pool, logs statistics
Worker#
File: vmsifter/worker.py
Runs the per-CPU instruction execution loop in a subprocess. Core loop (handle_client()):
Receive InjectorResultMessage from the C injector via socket
Convert to a FuzzerExecResult subclass (Interrupted / NMI / EPT / Other)
Feed result to fuzzer generator via gen.send(result), receiving the next instruction
If the result is "final" (not a retry), log it to CSV
Send instruction bytes to injector via socket
Repeat until the fuzzer raises StopIteration
Each worker logs to {workdir}/worker_{id}.log and returns WorkerStats on completion containing instruction count, exit reason distribution, and execution speed.
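The yield/send loop described above can be sketched in a few lines. This is a minimal illustration and not the code in vmsifter/worker.py: the `Result` class, `toy_fuzzer`, `run_worker`, and the `execute` callback (which stands in for the whole socket round trip to the C injector) are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Result:
    kind: str   # hypothetical labels: "interrupted" or anything else (final)
    insn: bytes


def toy_fuzzer(candidates):
    """Yield an instruction, receive its execution result through send()."""
    for insn in candidates:
        while True:
            result = yield insn
            if result.kind != "interrupted":
                break  # final result: move on to the next candidate


def run_worker(gen, execute):
    """Drive the generator until it raises StopIteration; keep final results."""
    final = []
    try:
        insn = next(gen)  # prime the generator to obtain the first instruction
        while True:
            result = execute(insn)        # stand-in for socket send + receive
            if result.kind != "interrupted":
                final.append(result)      # only final results would be logged
            insn = gen.send(result)       # feed result, get next instruction
    except StopIteration:
        pass
    return final
```

Because the generator sees each result before choosing the next instruction, a retry is expressed simply by yielding the same bytes again.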
Injectors#
File: vmsifter/injector/
Interface between VMSifter and virtualization platforms.
XenVMInjector (xenvm.py)#
Once per process (protected by a thread-safe lock): creates a "parent" XTF test VM via the C injector's --setup mode. The parent VM initializes registers with canary values (0x1100 + register_offset) through a magic CPUID leaf (0x13371337) and configures performance counters.
Per worker: forks the parent VM using xc_memshr_fork() (lightweight copy-on-write clone), pins the fork to a specific PCPU via xl vcpu-pin, and spawns the C injector subprocess in --socket mode.
C Injector (src/main.c, src/forkvm.c, src/vmi.c)#
The C injector is the low-level component that directly interacts with the VM:
main.c: Core logic including parent VM setup (setup_parent()), guest memory preparation (setup_memory()), VMEXIT event handler (exit_cb()), and the instruction injection loop
forkvm.c: VM forking via xc_memshr_fork(), which creates new HVM domains with HAP and memory sharing
vmi.c: LibVMI initialization wrapper
On each VMEXIT, the C injector:
Reads performance counters via xc_vcpu_get_msrs()
Captures all general-purpose registers and CR2
Sends the InjectorResultMessage (264-byte struct) to the Python worker via socket
Receives the next instruction bytes
Writes the instruction to guest memory at address 0xa000 - insn_size
Resets registers to canary state
Resumes the VM with VMI_EVENT_RESPONSE_RESET_FORK_STATE
Fuzzers#
File: vmsifter/fuzzer/
All fuzzers implement AbstractInsnGenerator and use Python generators (yield/send pattern), allowing the next instruction to depend on the previous execution result.
TunnelFuzzer (tunnel.py)#
The primary fuzzing algorithm. Systematically explores the x86 instruction space byte-by-byte using a "marker" index:
Maintains a position (marker_idx) in the instruction byte sequence
EPT execute fault = instruction incomplete, needs more bytes
Valid instruction = check for shortest encoding via backwards search (shorten until EPT fault)
Interrupted = retry same instruction
Nibble-skipping optimization: after ~10 similar results, skip chunks of the search space
Partitionable by first-byte range for parallelism across workers
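Partitioning by first-byte range can be pictured as splitting the 256 possible leading bytes into contiguous slices, one per worker. The helper below is a hypothetical sketch of that idea; the name `partition_first_byte` and the even-split policy are assumptions, not the project's actual partitioning code.

```python
def partition_first_byte(n_workers, lo=0x00, hi=0xFF):
    """Split the inclusive first-byte range [lo, hi] into contiguous slices."""
    if n_workers < 1:
        raise ValueError("need at least one worker")
    total = hi - lo + 1
    base, extra = divmod(total, n_workers)
    ranges, start = [], lo
    for i in range(n_workers):
        size = base + (1 if i < extra else 0)  # spread the remainder
        if size == 0:
            continue  # more workers than byte values: leave the rest idle
        ranges.append((start, start + size - 1))
        start += size
    return ranges


# Four workers each get 64 consecutive first-byte values.
print([(hex(a), hex(b)) for a, b in partition_first_byte(4)])
```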
RandomFuzzer (random.py)#
Generates random instruction bytes: random length (1 to insn_buf_size) filled with os.urandom().
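A minimal sketch of that generation step, assuming a default insn_buf_size of 15 (the x86 maximum instruction length); the function name `random_insn` is hypothetical:

```python
import os
import random


def random_insn(insn_buf_size=15, rng=random):
    """Return between 1 and insn_buf_size random bytes."""
    length = rng.randint(1, insn_buf_size)  # inclusive on both ends
    return os.urandom(length)
```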
CsvFuzzer (csv.py)#
Replays instructions from CSV files for differential testing. Supports prefix variation generation and baseline comparison (csv_log_diff_only).
DrizzlerFuzzer (drizzler.py)#
A specification-aware multi-instruction fuzzer. Where the Tunnel fuzzer generates and executes one instruction at a time, Drizzler generates sequences of multiple instructions assembled together and executed as a single test case in the VM. This approach is similar to Google's SiliFuzz project, which also runs multi-instruction sequences on real CPUs to find hardware issues. SiliFuzz, however, generates tests by fuzzing a CPU emulator (Unicorn) and compares end state against the emulator's prediction, while Drizzler uses direct specification-aware mutation and detects anomalies through VMEXIT behavior and register/performance counter deltas.
The multi-instruction approach can find issues that single-instruction testing misses: pipeline interactions, state dependencies between instructions, and behavior that only manifests under specific microarchitectural conditions.
Architecture:
The fuzzer is built from several cooperating classes:
X86Spec: Models the x86 ISA subset used for test generation. Defines register sets across all widths (8/16/32/64-bit, excluding RSP/RBP to avoid corrupting the stack), legacy and REX prefixes, and 6 instruction groups for random background instruction generation: arithmetic (add/sub), data movement (mov), multiplication/division (div/mul), increment/decrement (inc/dec), nop variants, and cache flushing (clflush).
Operand: Represents an operand specification with toggleable types: registers, immediates, register-memory references (WORD PTR [reg + offset]), and direct memory references ([0xb000 + offset]). Generates random concrete operands respecting bitwidth constraints and x86 encoding rules (e.g., avoiding REX + high-byte register conflicts). Memory operands target the 0xB000 region, which the C injector maps with read/write permissions when the --drizzler flag is set.
Instruction: Pairs a mnemonic with two operands and a prefix list. Builds valid operand combinations per bitwidth to avoid impossible encodings. Generates concrete assembly strings via getCanonical(), single-prefix variants via getSinglePrefixTests(), and randomly chained prefix variants via getChainedPrefixTest(). Supports a custom preparation callback (e.g., prepareMOVSB sets up RSI, RDI, RCX before a movsb).
Driver: The test generation engine. Initialized with a random seed, it:
Builds an injection pool from target instructions (and their prefix variants, in aggressive mode)
Generates 0-512 random background instructions from the instruction groups
For each target instruction, emits a test set: one base test (no prefix) + single-prefix variants + randomly chained prefix variants (up to maxTestsPerUnit=12 per target)
Test execution cycle:
Each test in a test set is run through 6 variations: 2 phases × 3 injection modes.
Base phase (no injection): the background instructions + target instruction, without randomly injected variants
Injection phase: same, but target instruction variants from the injection pool are randomly inserted among the background instructions (1% chance per position, up to 6 injections)
Each phase runs in 3 modes:
Plain (type 0): instructions as-is
Serialized (type 1): lfence after each instruction
Flushed (type 2): clflush for memory operands
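The cycle above can be sketched as follows. This is an illustrative model only: the function names, the string representation of instructions, and the exact placement of injected variants are assumptions, and the flushed mode is left out because the text does not specify how the clflush operands are chosen.

```python
import itertools
import random

PHASES = ("base", "injection")
MODES = {0: "plain", 1: "serialized", 2: "flushed"}


def variations():
    """Enumerate the 2 phases x 3 modes = 6 variations of one test."""
    return list(itertools.product(PHASES, MODES))


def inject(background, pool, rng, chance=0.01, max_injections=6):
    """Insert pool entries among the background instructions.

    Each position gets an injected variant with probability `chance`,
    up to `max_injections` in total.
    """
    out, injected = [], 0
    for insn in background:
        if injected < max_injections and rng.random() < chance:
            out.append(rng.choice(pool))
            injected += 1
        out.append(insn)
    return out


def serialize(instructions):
    """Serialized mode (type 1): an lfence after each instruction."""
    out = []
    for insn in instructions:
        out.extend([insn, "lfence"])
    return out
```

Passing a seeded `random.Random` instance as `rng` keeps the output reproducible, which matches the seed-driven determinism described for the Driver.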
The assembly string is converted to machine code using the Keystone assembler engine. A custom fix_db_and_assemble() handler splits out raw prefix bytes (db 0xNN) that Keystone cannot assemble directly, assembles the remaining instructions, and stitches the result back together.
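The split-and-stitch idea can be illustrated without Keystone by passing the assembler in as a callback. This sketch is not the project's fix_db_and_assemble(): the function name, the `;` statement separator, and the `assemble` callback are assumptions. In a real setup the callback might wrap Keystone's `Ks.asm()`.

```python
import re

# Matches a raw byte directive such as "db 0xf3".
DB_RE = re.compile(r"^db\s+(0x[0-9a-fA-F]{1,2})$")


def assemble_with_raw_bytes(asm, assemble):
    """Assemble `asm`, emitting `db 0xNN` statements as literal bytes.

    `assemble` is any callable mapping an assembly string to bytes.
    """
    out = bytearray()
    pending = []  # consecutive statements the assembler can handle

    def flush():
        if pending:
            out.extend(assemble("; ".join(pending)))
            pending.clear()

    for stmt in (s.strip() for s in asm.split(";")):
        if not stmt:
            continue
        match = DB_RE.match(stmt)
        if match:
            flush()                              # keep byte order intact
            out.append(int(match.group(1), 16))  # raw prefix byte
        else:
            pending.append(stmt)
    flush()
    return bytes(out)
```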
Target instructions are defined in DrizzlerFuzzer.setup(). Currently configured: lzcnt (with register and memory operands across 16/32/64-bit widths) and movsb (with a preparation function that initializes RSI, RDI, RCX).
C-side support: When the --drizzler flag is passed to the C injector, it maps additional guest memory pages (0xB000-0x1FFFF) with read/write permissions and fills them with canary bytes (0x41). This supports Drizzler's memory operands which reference [0xb000 + offset].
Not partitionable across workers (unlike Tunnel). Test generation is deterministic for a given seed, controlled by num_seeds.
Guest Memory Layout and Setup#
The guest VMβs memory layout is established in two phases: first by the XTF test harness (which runs inside the guest), then by the C injector (which manipulates the guest from outside via LibVMI).
Memory Map#
The following guest physical address ranges are relevant to VMSifter's operation:
| GPA Range | Size | Purpose | EPT Permissions |
|---|---|---|---|
| 0x0000-0x0FFF | 4 KB | Null page (guard) | No access (VMI_MEMACCESS_RWX) |
| 0x1000-0x1FFF | 4 KB | Low memory (not explicitly managed) | - |
| 0x2000-0x9FFF | 32 KB | Code region (8 pages) | Read-only (write + execute denied) |
| 0xA000 - insn_size to 0x9FFF | up to 15 B | Instruction injection zone (within code page 9) | Same as code region |
| 0xA000 | - | Boundary address: instructions are written backwards from here | - |
| 0xB000-0x1FFFF | 84 KB | Drizzler data pages (only in drizzler mode) | Read+Write (execute denied) |
| IDT base (runtime) | 4 KB | Interrupt Descriptor Table | Read-only (write + execute denied) |
| GDT base (runtime) | 4 KB | Global Descriptor Table | Read-only (write + execute denied) |
| Stack (RSP, runtime) | 4 KB | Guest stack page | Read+Write (execute denied) |
| | - | XTF binary (…) | - |
The IDT, GDT, and stack addresses are not hardcoded β the C injector reads them from the VMβs register state (idtr_base, gdtr_base, rsp) after forking.
EPT Permission Scheme#
EPT (Extended Page Tables) permissions control what the guest can do with each physical page. VMSifter restricts permissions to generate precise VMEXITs:
Code pages (0x2000-0x9FFF): VMI_MEMACCESS_RW. Execute is allowed, but writes cause an EPT violation. This protects the code region where instructions are injected.
IDT/GDT pages: VMI_MEMACCESS_WX. Read is allowed, but writes and executes cause EPT violations.
Stack page: VMI_MEMACCESS_X. Read and write allowed, execute causes an EPT violation.
Null page (0x0): VMI_MEMACCESS_RWX. All access denied, acting as a null pointer guard.
Drizzler data pages (0xB000-0x1FFFF): VMI_MEMACCESS_X. Read and write allowed, execute denied. Filled with 0x41 canary bytes.
The key insight is that EPT execute faults on the code region signal that an instruction fetch crossed the 0xA000 boundary, meaning the instruction encoding was incomplete and needed more bytes. The Tunnel fuzzer relies on this to distinguish complete from incomplete instructions.
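One detail that is easy to misread: the letters in a LibVMI VMI_MEMACCESS_* flag name the accesses that trigger an event, so the guest is left with the complement. The sketch below simply derives the remaining accesses from the flag suffixes listed above; the `PAGES` dictionary and `allowed_accesses` helper are illustrative names, not part of VMSifter.

```python
# Flag suffix (trapped accesses) per page type, as listed in this section.
PAGES = {
    "code": "RW",
    "idt_gdt": "WX",
    "stack": "X",
    "null": "RWX",
    "drizzler_data": "X",
}


def allowed_accesses(trapped):
    """Return the accesses that do NOT trap for a given flag suffix."""
    return "".join(a for a in "RWX" if a not in trapped) or "none"


for name, trapped in PAGES.items():
    print(f"{name:14} VMI_MEMACCESS_{trapped:3} -> allowed: {allowed_accesses(trapped)}")
```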
Page Deduplication#
VM forks created by xc_memshr_fork() share all memory pages with the parent via copy-on-write (COW). Before fuzzing begins, the C injector forces deduplication of critical pages using the page_dedup() helper:
// Force COW copy, then set EPT permissions
vmi_read_8_pa(vmi, addr, &tmp); // read from hypervisor
vmi_write_8_pa(vmi, addr, &tmp); // write back (triggers COW)
vmi_set_mem_event(vmi, addr>>12, perm, 0); // set EPT permissions
This ensures the fork has its own copies of pages that will be modified during fuzzing, and that EPT permissions are correctly configured before the first instruction executes.
Pagetable Deduplication#
In addition to data pages, the C injector deduplicates all guest pagetable pages that map the working address range (0x1000-0x9FFF and the stack). This is critical because x86 processors set Access (A) and Dirty (D) bits in pagetable entries on memory access, which would trigger unwanted EPT violations on shared pages.
The populate_pagetable_pages() function walks the guest pagetable hierarchy for each relevant virtual address and deduplicates every level:
Legacy (32-bit): PGD and PTE pages
PAE: adds PDPE page
IA-32e (64-bit): adds PML4E page
All pagetable pages are marked with VMI_MEMACCESS_WX (read-only). The processor only needs read access to walk pagetables; A/D bit updates require write access from the hardware, which VMSifter has already handled by deduplicating the pages.
Instruction Injection#
Instructions are injected into the guest at the boundary of code page 9 (GPA 0x9000-0x9FFF), written backwards from address 0xA000:
Guest Physical Memory (code page 9):
0x9000 ┌────────────────────────┐
       │                        │
       │   (unused space)       │
       │                        │
       ├────────────────────────┤
       │ insn byte 0            │ ← 0xA000 - insn_size (= RIP)
       │ insn byte 1            │
       │ ...                    │
       │ insn byte N-1          │ ← 0x9FFF
0xA000 └────────────────────────┘ ← page boundary
       │ (next page: code ends  │
       │  here, EPT exec fault  │
       │  if fetch crosses)     │
A 3-byte instruction is written at 0x9FFD-0x9FFF and RIP is set to 0x9FFD. If the CPU attempts to fetch beyond 0x9FFF (e.g., because it decoded a prefix and needs more bytes), the fetch crosses into the page at 0xA000 and triggers an EPT execute violation, signaling that the instruction encoding was incomplete.
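The address arithmetic is small enough to check by hand. The helper below is a hypothetical sketch (the name `place_instruction` and the 15-byte default limit are assumptions) that returns the first and last byte address for an instruction written backwards from the boundary:

```python
BOUNDARY = 0xA000  # first byte of the page after the code region


def place_instruction(insn, max_len=15):
    """Return (start, end) guest physical addresses for `insn`.

    `start` is also the value RIP is set to.
    """
    if not 1 <= len(insn) <= max_len:
        raise ValueError("instruction length out of range")
    start = BOUNDARY - len(insn)
    return start, BOUNDARY - 1


start, end = place_instruction(b"\x0f\x01\xd0")  # any 3-byte sequence
print(hex(start), hex(end))  # 0x9ffd 0x9fff
```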
Register Initialization (Canary Values)#
Before each instruction executes, all registers are reset to deterministic "canary" values. This allows VMSifter to detect which registers an instruction modified by comparing post-execution values against the known baseline.
| Register | Canary Value | Register | Canary Value |
|---|---|---|---|
| RIP | Set to instruction address | R8 | 0x1109 |
| RAX | 0x1101 | R9 | 0x110A |
| RBX | 0x1102 | R10 | 0x110B |
| RCX | 0x1103 | R11 | 0x110C |
| RDX | 0x1104 | R12 | 0x110D |
| RSI | 0x1105 | R13 | 0x110E |
| RDI | 0x1106 | R14 | 0x110F |
| RSP | 0x1107 | R15 | 0x1110 |
| RBP | 0x1108 | CR2 | 0x1111 |
The formula is 0x1100 + enum_index, where the enum order is: RIP(0), RAX(1), RBX(2), RCX(3), RDX(4), RSI(5), RDI(6), RSP(7), RBP(8), R8(9), …, R15(16), CR2(17). RIP is special-cased: it is set to 0xA000 - insn_size rather than a canary. Custom initial values can be provided via the --regs-init-value flag.
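As a quick check of the formula, the canary map and a register-delta comparison can be written as below. The names `REGS`, `canaries`, and `reg_delta` are hypothetical, and the delta representation is an assumption rather than the format VMSifter logs.

```python
# Enum order as given in the text: RIP, RAX..RBP, R8..R15, CR2.
REGS = (
    ["RIP", "RAX", "RBX", "RCX", "RDX", "RSI", "RDI", "RSP", "RBP"]
    + [f"R{i}" for i in range(8, 16)]
    + ["CR2"]
)


def canaries(insn_size):
    """Baseline register values before an instruction of `insn_size` bytes."""
    values = {name: 0x1100 + index for index, name in enumerate(REGS)}
    values["RIP"] = 0xA000 - insn_size  # RIP is special-cased
    return values


def reg_delta(before, after):
    """Registers whose value differs from the baseline, with the new value."""
    return {r: after[r] for r in before if after[r] != before[r]}


baseline = canaries(insn_size=3)
observed = dict(baseline, RAX=0)      # pretend the instruction zeroed RAX
print(reg_delta(baseline, observed))  # {'RAX': 0}
```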
After each instruction executes and VMEXITs, the fork's state is reset to the parent snapshot (VMI_EVENT_RESPONSE_RESET_FORK_STATE), undoing any memory or register modifications the instruction caused. The C injector then overwrites registers with the canary values and writes the next instruction to memory before resuming.
XTF Guest Initialization (CPUID Handshake)#
The XTF test harness running inside the guest VM configures the CPU environment before the C injector takes over. Communication between the guest and the injector uses a magic CPUID leaf (0x13371337), which triggers a VMEXIT that the C injector intercepts:
| Subleaf | Direction | Purpose |
|---|---|---|
| 0 | Guest → Injector | Setup complete signal. Injector pauses VM, clears HVM params, initializes register canaries, reads perf counter baseline. |
| 1 | Injector → Guest | Performance counter configuration. Returns 4 … |
| 2 | Injector → Guest | SSE/AVX enablement flag in RAX. Guest sets CR4 flags (…) |
| 3 | Injector → Guest | SYSCALL enablement flag in RAX. Guest configures … |
| 4 | Injector → Guest | FPU emulation flag in RAX. Guest sets … |
The guest initialization sequence:
Query subleaf 2 β optionally enable SSE/AVX
Query subleaf 3 β optionally enable SYSCALL/SYSRET
Query subleaf 4 β optionally enable FPU emulation
Query subleaf 1 β configure performance counters
Call subleaf 0 β signal ready; injector takes over execution from this point
After subleaf 0, the injector pauses the VM, unsets HVM parameters (Xenstore, IOREQ, console, etc.) to minimize hypervisor interference, and the guest never executes its own code again: all subsequent execution is injected instructions.
Socket Protocol#
The Python worker and C injector communicate over a Unix domain socket (AF_UNIX, SOCK_STREAM) in a synchronous request-response loop:
Message sizes:
Instruction: 1-15 bytes (configurable via insn_buf_size)
Result: 264-byte InjectorResultMessage struct
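Because SOCK_STREAM sockets may deliver a message in several pieces, a reader of the fixed-size result typically loops until all bytes have arrived. The sketch below shows that pattern on the Python side; `recv_exact` and `RESULT_SIZE` are hypothetical names, the size is the 264 bytes quoted above, and how the variable-length instruction is framed in the other direction is not covered here.

```python
import socket

RESULT_SIZE = 264  # size of InjectorResultMessage as stated in this section


def recv_exact(sock, size):
    """Read exactly `size` bytes from a stream socket or raise."""
    buf = bytearray()
    while len(buf) < size:
        chunk = sock.recv(size - len(buf))
        if not chunk:
            raise ConnectionError("peer closed before the full message arrived")
        buf.extend(chunk)
    return bytes(buf)


# Local demonstration with a connected pair of Unix stream sockets.
worker_end, injector_end = socket.socketpair()
injector_end.sendall(bytes(RESULT_SIZE))     # fake result message
raw = recv_exact(worker_end, RESULT_SIZE)    # worker reads the result
worker_end.sendall(b"\x90")                  # worker replies with an instruction
worker_end.close()
injector_end.close()
```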
InjectorResultMessage Structure#
| Field | Type | Description |
|---|---|---|
| | uint64 | VMEXIT reason |
| | uint64 | VMEXIT qualification |
| | uint64 | Value read from guest stack |
| | uint64[7] | Performance counters (3 fixed + 4 programmable) |
| | uint64[18] | 17 GP registers + CR2 |
| | uint64 | Guest linear address |
| | uint32 | Interrupt info field |
| | uint32 | Interrupt error code |
| | uint32 | IDT vectoring info |
| | uint32 | IDT vectoring error |
| | uint32 | Instruction length reported by CPU |
| | uint32 | Instruction info field |
Result Classification#
FuzzerExecResult hierarchy based on VMEXIT reason:
Interrupted: External interrupt received β retry same instruction
NMI: Includes interrupt type, page fault error code, CR2, stack value
EPT: EPT violation with R/W/X qualification (execute fault = incomplete instruction)
Other: All remaining VMX exit reasons (76 possible)
The factory method FuzzerExecResult.factory_from_injector_message() maps the C struct to the appropriate Python subclass.
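A rough sketch of such a dispatch, using the basic exit reason numbers from the Intel SDM (0 = exception or NMI, 1 = external interrupt, 48 = EPT violation) and the EPT qualification bits 0-2 (read, write, instruction fetch). The function `classify` and its string return values are illustrative assumptions; the real factory returns FuzzerExecResult subclasses and may decode the message differently.

```python
EXIT_EXCEPTION_OR_NMI = 0
EXIT_EXTERNAL_INTERRUPT = 1
EXIT_EPT_VIOLATION = 48


def classify(reason, qualification=0):
    """Map a VMEXIT reason (and EPT qualification) to a result category."""
    basic = reason & 0xFFFF  # basic exit reason lives in bits 15:0
    if basic == EXIT_EXTERNAL_INTERRUPT:
        return "Interrupted"
    if basic == EXIT_EXCEPTION_OR_NMI:
        return "NMI"
    if basic == EXIT_EPT_VIOLATION:
        # Qualification bits 0, 1, 2: data read, data write, instruction fetch.
        access = "".join(
            letter for bit, letter in enumerate("RWX") if (qualification >> bit) & 1
        )
        return f"EPT({access})"
    return "Other"


print(classify(48, 0b100))  # EPT(X): execute fault, instruction incomplete
```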
Output Format#
CSVOutput (vmsifter/output.py) writes per-worker CSV files:
results_{id}.csv: valid instructions
invalid_instructions_{id}.csv: invalid opcodes
Columns: insn (hex-encoded bytes), length, exit-type, misc (cpu_len, insn_info, stack value, page fault EC), pfct1-pfct7 (performance counter deltas), reg-delta (register changes from canary values).
Data Flow#
CPU Allocation#
Query Xen via xl info for total CPUs
Query xl vcpu-list Domain-0 for Dom0-allocated CPUs
Available = Total - Dom0
If SMT disabled in config: filter out odd-numbered CPUs (keep physical cores only)
Each VM fork is pinned to its assigned PCPU via xl vcpu-pin --ignore-global-affinity-masks
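The selection logic in the first four steps reduces to a set difference plus an optional parity filter. A minimal sketch, assuming the totals and Dom0 CPU list have already been parsed from the xl output (the function name and parameters are hypothetical):

```python
def available_pcpus(total_cpus, dom0_cpus, smt_enabled=True):
    """Return PCPU ids usable for VM forks.

    total_cpus: CPU count reported by Xen
    dom0_cpus:  ids of CPUs allocated to Dom0
    smt_enabled: when False, odd-numbered CPUs are dropped
    """
    dom0 = set(dom0_cpus)
    cpus = [cpu for cpu in range(total_cpus) if cpu not in dom0]
    if not smt_enabled:
        cpus = [cpu for cpu in cpus if cpu % 2 == 0]
    return cpus


print(available_pcpus(8, [0, 1], smt_enabled=False))  # [2, 4, 6]
```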