Windows Application Fuzzing: AFL, WinAFL, and Coverage-Guided Techniques

Introduction

Fuzzing is one of the most widely used techniques for discovering memory corruption vulnerabilities in Windows applications. Coverage-guided fuzzing, notably with WinAFL (a Windows fork of AFL), uses execution feedback to explore program paths systematically and uncover edge cases that lead to crashes, buffer overflows, and other exploitable conditions.

Fuzzing Fundamentals

Coverage-Guided Fuzzing

Traditional fuzzing generates random inputs, but coverage-guided fuzzing uses code coverage feedback to guide mutation:

Initial corpus → Execute → Measure coverage → Mutate interesting inputs → Repeat

Key metrics:

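Edge coverage: Number of unique branch transitions (edges between basic blocks) executed
Path coverage: Inputs that reached new coverage; AFL reports each such input as a "path"
Crash uniqueness: Distinct crashes, judged by the fuzzer from coverage and refined during triage with stack hashes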
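The loop above can be sketched in a few lines of Python. This is a toy model, not how AFL or WinAFL is implemented: the `target` function, its branch names, and the single-byte `mutate` step are all made up for illustration. The point is only that an input is kept when it reaches coverage no earlier input reached.

```python
import random

def target(data: bytes) -> set:
    """Toy target: returns the set of branch IDs the input reached."""
    covered = {"entry"}
    if len(data) > 0 and data[0] == ord("F"):
        covered.add("F")
        if len(data) > 1 and data[1] == ord("U"):
            covered.add("FU")
            if len(data) > 2 and data[2] == ord("Z"):
                covered.add("FUZ")
    return covered

def mutate(data: bytes, rng: random.Random) -> bytes:
    """Overwrite one randomly chosen byte with a random value."""
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)

def fuzz(seeds, iterations, rng):
    """Keep every mutated input that reaches coverage not seen before."""
    corpus = list(seeds)
    seen = set()
    for seed in corpus:
        seen |= target(seed)
    for _ in range(iterations):
        candidate = mutate(rng.choice(corpus), rng)
        coverage = target(candidate)
        if coverage - seen:          # new coverage: the input is "interesting"
            seen |= coverage
            corpus.append(candidate)
    return corpus, seen

corpus, seen = fuzz([b"AAAA"], 100_000, random.Random(0))
print(len(corpus), sorted(seen))
```

Purely random four-byte inputs would need on the order of 256^3 attempts to hit the innermost branch; keeping each partial success in the corpus lets the loop solve one byte at a time.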

American Fuzzy Lop (AFL)

AFL revolutionized fuzzing with genetic algorithms and compile-time instrumentation:

// AFL compile-time instrumentation
cur_location = <COMPILE_TIME_RANDOM>;
shared_mem[cur_location ^ prev_location]++;
prev_location = cur_location >> 1;

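This records edge transitions (A→B) rather than just block hits. Shifting prev_location right by one bit preserves direction, so A→B and B→A update different map entries, and a tight loop A→A does not collapse to index 0.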
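The effect of the shift can be checked with a small simulation of the snippet above. This is a sketch under simplifying assumptions: block IDs are picked by hand rather than at compile time, and the map uses plain Python integers instead of AFL's 8-bit counters.

```python
MAP_SIZE = 1 << 16  # 64 KB map, as in classic AFL

def trace_edges(block_ids):
    """Replay a sequence of block IDs through the edge-hashing scheme."""
    shared_mem = {}
    prev_location = 0
    for cur_location in block_ids:
        index = (cur_location ^ prev_location) % MAP_SIZE
        shared_mem[index] = shared_mem.get(index, 0) + 1
        prev_location = cur_location >> 1
    return shared_mem

A, B = 0x1234, 0x5678   # hand-picked IDs for two basic blocks

a_then_b = trace_edges([A, B])
b_then_a = trace_edges([B, A])

print({hex(k): v for k, v in a_then_b.items()})
print({hex(k): v for k, v in b_then_a.items()})
```

Without the shift, both orders would touch index `A ^ B`, so the fuzzer could not tell the two transitions apart.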

WinAFL Architecture

DynamoRIO Instrumentation

WinAFL uses DynamoRIO for runtime binary instrumentation without source code:

Application → DynamoRIO → Coverage tracking → WinAFL → Corpus management

Advantages:

No source code required
Works with closed-source binaries
Coverage can be limited to selected modules, including system DLLs

Installation and Setup

# Install prerequisites (the vctools workload provides the C++ compiler)
choco install -y git cmake python3 visualstudio2022buildtools visualstudio2022-workload-vctools

# Clone WinAFL with its submodules
git clone --recursive https://github.com/googleprojectzero/winafl.git
cd winafl

# Build DynamoRIO
git clone --recursive https://github.com/DynamoRIO/dynamorio.git
mkdir dynamorio/build
cd dynamorio/build
cmake -G "Visual Studio 17 2022" -A x64 ..
cmake --build . --config Release

# Build WinAFL against the DynamoRIO build tree
cd ../..
mkdir build
cd build
cmake -G "Visual Studio 17 2022" -A x64 "-DDynamoRIO_DIR=..\dynamorio\build\cmake" ..
cmake --build . --config Release

Target Selection and Harness Development

Identifying Fuzz Targets

Good fuzzing targets include:

Parsers for complex file formats (PDF, Office, images)
Network protocol handlers
Compression/decompression routines
Cryptographic implementations

Example: fuzzing an image parser in a media library

// Target function: ParseJPEG in image.dll
// void ParseJPEG(const uint8_t* data, size_t len)

// Harness wrapper
#include <windows.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

extern "C" __declspec(dllimport) void ParseJPEG(const uint8_t* data, size_t len);

int main(int argc, char** argv) {
    if (argc < 2) {
        printf("Usage: %s <input_file>\n", argv[0]);
        return 1;
    }

    // Read input file
    HANDLE hFile = CreateFileA(argv[1], GENERIC_READ, FILE_SHARE_READ, NULL,
                               OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (hFile == INVALID_HANDLE_VALUE) {
        return 1;
    }

    DWORD fileSize = GetFileSize(hFile, NULL);
    if (fileSize == INVALID_FILE_SIZE) {
        CloseHandle(hFile);
        return 1;
    }

    // Allocate at least one byte so an empty file does not yield NULL
    uint8_t* buffer = (uint8_t*)malloc(fileSize ? fileSize : 1);
    DWORD bytesRead = 0;
    if (buffer == NULL || !ReadFile(hFile, buffer, fileSize, &bytesRead, NULL)) {
        free(buffer);
        CloseHandle(hFile);
        return 1;
    }
    CloseHandle(hFile);

    // Call target function with the number of bytes actually read
    ParseJPEG(buffer, bytesRead);

    free(buffer);
    return 0;
}

Persistent Mode Fuzzing

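For maximum speed, use persistent mode to avoid process creation overhead. WinAFL does not rely on an in-source loop macro or on stdin for this: DynamoRIO hooks the function named by -target_method and, after each return, restores its arguments and runs it again on the rewritten input file. The target function therefore has to open the input, parse it, close it, and return normally:

// Persistent mode harness
#define _CRT_SECURE_NO_WARNINGS
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

extern "C" __declspec(dllimport) void ParseJPEG(const uint8_t* data, size_t len);
void ResetParser();  // hypothetical helper that clears the parser's global state

// Exported and not inlined so WinAFL can find it by name
extern "C" __declspec(dllexport) __declspec(noinline)
int fuzz_iteration(const char* path, size_t max_len) {
    // Re-read the input on every iteration; WinAFL rewrites the file between runs
    FILE* f = fopen(path, "rb");
    if (f == NULL) return 1;
    uint8_t* buffer = (uint8_t*)malloc(max_len);
    size_t len = buffer ? fread(buffer, 1, max_len, f) : 0;
    fclose(f);  // close before returning so the fuzzer can overwrite the file

    if (buffer != NULL) {
        // Reset global state
        ResetParser();

        // Call target
        ParseJPEG(buffer, len);
        free(buffer);
    }
    return 0;
}

int main(int argc, char** argv) {
    if (argc < 2) return 1;
    // Cap inputs at 64 KB
    return fuzz_iteration(argv[1], 64 * 1024);
}

Performance: Persistent mode typically gives a large speedup, often an order of magnitude or more, over starting a new process for every input (Windows has no fork).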

Running WinAFL

Basic Fuzzing Campaign

# Create directories
mkdir in, out

# Create seed corpus (valid JPEG files)
copy sample1.jpg in\
copy sample2.jpg in\

# Run WinAFL
afl-fuzz.exe -i in -o out -D C:\dynamorio\build\bin64 -t 20000 `
    -- -coverage_module image.dll -target_module harness.exe `
    -target_method fuzz_iteration -nargs 2 -fuzz_iterations 5000 `
    -- harness.exe @@

# Parameters:
# -i: Input corpus directory
# -o: Output directory (queue, crashes, hangs)
# -D: DynamoRIO bin directory
# -t: Timeout per execution (ms)
# -coverage_module: Module to track coverage in (can be repeated)
# -target_module: Module that contains the target function
# -target_method: Function to fuzz (persistent mode)
# -nargs: Number of arguments the target function takes
# -fuzz_iterations: Iterations to run before restarting the target process
# @@: Replaced with input file path
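The command line has three parts separated by `--`: fuzzer options, instrumentation options, and the target command. When campaigns are scripted, it can help to assemble these parts programmatically. The helper below is a hypothetical sketch (the function name and its parameters are not part of WinAFL); it only builds the argument list and does not launch anything.

```python
def build_afl_fuzz_argv(in_dir, out_dir, dynamorio_bin, timeout_ms,
                        coverage_modules, target_module, target_method,
                        nargs, target_cmd, fuzz_iterations=5000):
    """Assemble an afl-fuzz.exe argument list in WinAFL's three-part layout."""
    # Part 1: fuzzer options
    argv = ["afl-fuzz.exe", "-i", in_dir, "-o", out_dir,
            "-D", dynamorio_bin, "-t", str(timeout_ms), "--"]
    # Part 2: instrumentation options
    for module in coverage_modules:
        argv += ["-coverage_module", module]
    argv += ["-target_module", target_module,
             "-target_method", target_method,
             "-nargs", str(nargs),
             "-fuzz_iterations", str(fuzz_iterations),
             "--"]
    # Part 3: target command; "@@" marks where the input path goes
    argv += list(target_cmd)
    return argv

argv = build_afl_fuzz_argv(
    in_dir="in", out_dir="out",
    dynamorio_bin=r"C:\dynamorio\build\bin64", timeout_ms=20000,
    coverage_modules=["image.dll"], target_module="harness.exe",
    target_method="fuzz_iteration", nargs=2,
    target_cmd=["harness.exe", "@@"],
)
print(" ".join(argv))
```

A list like this can be passed to `subprocess.Popen` directly, which avoids shell quoting issues with `@@` and backslashes.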

Understanding AFL UI

┌─ process timing ────────────────────────────────┐
│   run time : 2 days, 4 hrs, 15 min, 2 sec       │
│last new path : 0 days, 0 hrs, 3 min, 12 sec     │
│ last uniq crash : 0 days, 1 hrs, 22 min, 7 sec  │
│  last uniq hang : none seen yet                  │
└──────────────────────────────────────────────────┘
┌─ overall results ───────────────────────────────┐
│   cycles done : 147                              │
│  total paths : 8234                              │
│ uniq crashes : 12                                │
│   uniq hangs : 0                                 │
└──────────────────────────────────────────────────┘

Key metrics:

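total paths: Queue entries that reached new coverage (AFL's "paths", not complete execution paths)
uniq crashes: Crashes the fuzzer judges distinct by coverage; several may share one underlying bug
last new path: Time since coverage last expanded; a long gap suggests the campaign has plateaued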
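The same numbers are also written to a `fuzzer_stats` text file in the output directory, which is easier to monitor from scripts than the UI. The parser below is a minimal sketch that assumes the classic AFL layout of one `key : value` pair per line; the sample text and its key names are illustrative and may differ between versions.

```python
def parse_fuzzer_stats(text):
    """Parse 'key : value' lines into a dict of strings."""
    stats = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as Windows
        # paths ("C:\...") are kept intact
        key, sep, value = line.partition(":")
        if sep:
            stats[key.strip()] = value.strip()
    return stats

# Illustrative sample using the numbers from the UI above
SAMPLE = """\
cycles_done       : 147
paths_total       : 8234
unique_crashes    : 12
unique_hangs      : 0
command_line      : afl-fuzz.exe -i in -o out -D C:\\dynamorio\\build\\bin64
"""

stats = parse_fuzzer_stats(SAMPLE)
print(stats["paths_total"], stats["unique_crashes"])
```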

Corpus Minimization

Reducing Redundant Inputs

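# Minimize corpus using winafl-cmin.py (WinAFL's counterpart to afl-cmin)
# --working-dir is the directory holding the WinAFL binaries; adjust to your build
python winafl-cmin.py --working-dir C:\winafl\build\bin\Release `
    -D C:\dynamorio\build\bin64 -t 100000 -i out\queue -o minimized_corpus `
    -coverage_module image.dll -target_module harness.exe `
    -target_method fuzz_iteration -nargs 2 -- harness.exe @@

# Minimize individual testcases
afl-tmin.exe -i crash.jpg -o minimized_crash.jpg -D C:\dynamorio\build\bin64 `
    -- -coverage_module image.dll -target_module harness.exe `
    -target_method fuzz_iteration -nargs 2 -- harness.exe @@

winafl-cmin.py: Keeps only inputs that trigger unique coverage
afl-tmin: Reduces individual file size while preserving the crash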
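Corpus minimization is essentially a set-cover problem: pick a small set of inputs whose combined coverage equals that of the whole corpus. The sketch below uses a plain greedy heuristic on made-up coverage sets. It is a simplification for illustration; the real tools work on coverage maps collected from the target and also prefer smaller files.

```python
def minimize_corpus(coverage_by_input):
    """Greedy set cover: repeatedly keep the input that adds the most new edges.

    coverage_by_input maps an input name to the set of edges it covers.
    """
    remaining = set().union(*coverage_by_input.values())
    kept = []
    while remaining:
        # sorted() makes tie-breaking deterministic (first name wins)
        best = max(sorted(coverage_by_input),
                   key=lambda name: len(coverage_by_input[name] & remaining))
        kept.append(best)
        remaining -= coverage_by_input[best]
    return kept

# Hypothetical coverage data: input name -> edge IDs
coverage = {
    "a.jpg": {1, 2, 3},
    "b.jpg": {2, 3},
    "c.jpg": {3, 4},
    "d.jpg": {4},
}
print(minimize_corpus(coverage))
```

Here `b.jpg` and `d.jpg` are dropped because every edge they reach is already covered by the two kept files.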

Example reduction:

Original crash file: 45 KB
Minimized crash: 187 bytes (99.6% reduction)
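Test case minimization works by deleting pieces of the input and keeping each deletion that still reproduces the crash. The following is a toy version of that idea, assuming a `still_crashes` callback that stands in for rerunning the target; afl-tmin's actual algorithm has more stages (for example block normalization) and is not reproduced here.

```python
def minimize_input(data, still_crashes):
    """Remove blocks of shrinking size while the crash still reproduces."""
    if not still_crashes(data):
        raise ValueError("input does not reproduce the crash")
    block = max(len(data) // 2, 1)
    while block >= 1:
        pos = 0
        while pos < len(data):
            candidate = data[:pos] + data[pos + block:]
            if candidate and still_crashes(candidate):
                data = candidate      # deletion kept; retry at the same offset
            else:
                pos += block          # deletion rejected; move on
        block //= 2
    return data

# Stand-in for the target: "crashes" whenever the input contains b"BUG"
def crashes(data):
    return b"BUG" in data

original = b"x" * 100 + b"BUG" + b"y" * 50
minimized = minimize_input(original, crashes)
reduction = 100.0 * (1 - len(minimized) / len(original))
print(minimized, f"{reduction:.1f}% smaller")
```

The reduction figure is computed the same way as in the example above: one minus the ratio of minimized size to original size.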

Crash Triage and Analysis

Crash Deduplication

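WinAFL itself separates crashes by coverage, which tends to over-count. A common triage step is to bucket them by a hash of the top stack frames. The script below assumes a debugger run has already saved each crash's stack trace next to the crashing input as <crash file>.stack.txt, one frame per line (a hypothetical convention):

import hashlib
import os

def extract_stack_trace(crash_file):
    """
    Read the stack trace saved next to the crashing input.
    Assumes <crash_file>.stack.txt exists with one frame per line.
    """
    with open(crash_file + '.stack.txt', 'r', encoding='utf-8') as f:
        return [line.strip() for line in f if line.strip()]

def compute_crash_hash(crash_file):
    """
    Compute crash signature from the top stack frames
    """
    stack_trace = extract_stack_trace(crash_file)

    # Hash first 5 frames
    frames = stack_trace[:5]
    signature = '|'.join(frames)

    # MD5 serves only as a bucketing key here, not as a security measure
    return hashlib.md5(signature.encode()).hexdigest()

def deduplicate_crashes(crash_dir):
    """
    Group crashes by unique signature
    """
    crash_buckets = {}

    for filename in sorted(os.listdir(crash_dir)):
        # WinAFL names crash inputs id_000000_...; skip traces and README.txt
        if not filename.startswith('id_') or filename.endswith('.stack.txt'):
            continue

        path = os.path.join(crash_dir, filename)
        crash_hash = compute_crash_hash(path)
        crash_buckets.setdefault(crash_hash, []).append(filename)

    # Print unique crashes
    for i, (crash_hash, files) in enumerate(crash_buckets.items()):
        print(f"Crash bucket {i} ({len(files)} samples):")
        print(f"  Representative: {files[0]}")
        print(f"  Hash: {crash_hash}\n")

    return crash_buckets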

Exploitability Analysis with !exploitable

# Load crash in WinDbg
windbg -z crash.dmp

# Load !exploitable extension
.load msec.dll

# Analyze exploitability
!exploitable

# Output example:
# EXPLOITABILITY CLASSIFICATION: EXPLOITABLE
# EXPLANATION: User mode write AV starting at ntdll!RtlFreeHeap+0x254
# The target crashed on a user mode write access violation.
# The user mode write access violation occurred at address 0x41414141.

Exploitability ratings:

EXPLOITABLE: High confidence of exploitation
PROBABLY_EXPLOITABLE: Likely exploitable with effort
UNKNOWN: Requires manual analysis
PROBABLY_NOT_EXPLOITABLE: Unlikely to be exploitable

Advanced Techniques

Custom Mutators

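Implement domain-specific mutations for structured formats. The sketch below follows the general shape of an AFL++-style custom mutator hook; WinAFL instead loads custom test case processing from a user DLL passed with -l, so treat the signature and the jpeg_* helpers as illustrative: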

// Custom mutator for JPEG files
#include "afl-fuzz.h"

// Preserve JPEG structure while mutating data
size_t afl_custom_fuzz(uint8_t** out_buf, uint8_t* in_buf,
                       size_t in_size, unsigned int seed) {
    // Parse JPEG structure
    jpeg_parser_t parser;
    jpeg_parse(&parser, in_buf, in_size);

    // Mutate only image data segments (preserve markers)
    for (int i = 0; i < parser.num_segments; i++) {
        if (parser.segments[i].type == JPEG_SEGMENT_DATA) {
            // Apply AFL mutations to data segment
            mutate_segment(&parser.segments[i], seed);
        }
    }

    // Rebuild JPEG
    *out_buf = jpeg_rebuild(&parser, &in_size);

    return in_size;
}

Dictionary-Based Fuzzing

Provide magic values for better coverage:

# jpeg.dict - JPEG magic values
marker_soi="\xff\xd8"
marker_eoi="\xff\xd9"
marker_sos="\xff\xda"
marker_dqt="\xff\xdb"
marker_app0="\xff\xe0"
marker_comment="\xff\xfe"

# Use with -x flag
afl-fuzz.exe -i in -o out -x jpeg.dict ...
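Dictionary files are easy to generate from a table of tokens, which avoids escaping mistakes when a format has many markers. The sketch below writes entries in the `name="\x.."` form shown above. The helper name is made up, and the two extra markers (SOF0 and DHT) are additions to the list above.

```python
def format_dict_entry(name, token):
    """Render one dictionary line, hex-escaping every byte of the token."""
    escaped = "".join(f"\\x{byte:02x}" for byte in token)
    return f'{name}="{escaped}"'

JPEG_MARKERS = {
    "marker_soi": b"\xff\xd8",
    "marker_eoi": b"\xff\xd9",
    "marker_sos": b"\xff\xda",
    "marker_dqt": b"\xff\xdb",
    "marker_sof0": b"\xff\xc0",  # added here: baseline start-of-frame
    "marker_dht": b"\xff\xc4",   # added here: Huffman table
}

dict_text = "\n".join(format_dict_entry(n, t) for n, t in JPEG_MARKERS.items())
print(dict_text)
```

Escaping every byte, including printable ones, keeps the output valid regardless of quotes or backslashes inside a token.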

Parallel Fuzzing

Distribute fuzzing across multiple cores:

# Options shared by every instance
$target = '-D C:\dynamorio\build\bin64 -t 20000 -- -coverage_module image.dll ' +
          '-target_module harness.exe -target_method fuzz_iteration -nargs 2 ' +
          '-fuzz_iterations 5000 -- harness.exe @@'

# Master instance (deterministic fuzzing)
Start-Process afl-fuzz.exe -ArgumentList "-i in -o out -M master01 $target"

# Secondary instances (randomized fuzzing)
Start-Process afl-fuzz.exe -ArgumentList "-i in -o out -S secondary01 $target"
Start-Process afl-fuzz.exe -ArgumentList "-i in -o out -S secondary02 $target"

# Instances that share the same -o directory sync automatically:
# each one periodically imports interesting inputs from the others' queues

Speedup: Close to linear at best, with roughly one instance per CPU core; expect diminishing returns as instances rediscover the same inputs
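When the number of instances depends on the machine, the per-instance options can be generated instead of typed by hand. The sketch below is a hypothetical helper that produces one `-M` instance and `-S` instances for the remaining cores; the instance names are arbitrary, and the shared instrumentation and target options still need to be appended to each list.

```python
def plan_instances(cores, in_dir="in", out_dir="out"):
    """Return one argument list per fuzzer instance: one -M, the rest -S."""
    if cores < 1:
        raise ValueError("need at least one core")
    plan = []
    for index in range(cores):
        if index == 0:
            role, name = "-M", "master01"
        else:
            role, name = "-S", f"secondary{index:02d}"
        plan.append(["afl-fuzz.exe", "-i", in_dir, "-o", out_dir, role, name])
    return plan

for args in plan_instances(3):
    print(" ".join(args))
```

Every instance points at the same output directory, which is what allows them to exchange inputs.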

Real-World Case Studies

CVE-2020-0938: Adobe Font Manager

Target: Windows font parsing (atmfd.dll)

Harness:

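// Minimal font loading harness (link with gdi32.lib)
#include <windows.h>

int main(int argc, char** argv) {
    if (argc < 2) return 1;

    HANDLE hFile = CreateFileA(argv[1], GENERIC_READ, FILE_SHARE_READ, NULL,
                               OPEN_EXISTING, 0, NULL);
    if (hFile == INVALID_HANDLE_VALUE) return 1;
    DWORD size = GetFileSize(hFile, NULL);
    HANDLE hMap = CreateFileMappingA(hFile, NULL, PAGE_READONLY, 0, 0, NULL);
    LPVOID pFont = hMap ? MapViewOfFile(hMap, FILE_MAP_READ, 0, 0, 0) : NULL;

    if (pFont != NULL) {
        // Trigger font parsing; the return value is the resource handle
        DWORD numFonts = 0;
        HANDLE hFontResource = AddFontMemResourceEx(pFont, size, NULL, &numFonts);
        if (hFontResource != NULL) {
            RemoveFontMemResourceEx(hFontResource);
        }
        UnmapViewOfFile(pFont);
    }
    if (hMap != NULL) CloseHandle(hMap);
    CloseHandle(hFile);
    return 0;
}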

Results:

48 hours fuzzing
23 unique crashes discovered
CVE-2020-0938: Memory corruption in Adobe Type 1 (PostScript) multi-master font parsing

CVE-2019-0708: BlueKeep (RDP)

Target: Remote Desktop Protocol (TermDD.sys)

Approach:

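# Network protocol fuzzing harness
import socket
import sys

def fuzz_rdp_handshake(mutated_packet, host="target", port=3389, timeout=5.0):
    """
    Send mutated RDP handshake packet
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # X.224 Connection Request
        sock.sendall(mutated_packet)
        return sock.recv(1024)

# File-driven wrapper: the fuzzer writes each mutated packet to input_file
def main(input_file):
    with open(input_file, 'rb') as f:
        packet = f.read()

    try:
        fuzz_rdp_handshake(packet)
    except OSError:
        # Resets and timeouts are expected. A crash happens in the remote
        # driver, so it has to be detected on the target (for example with
        # a kernel debugger), not by the local fuzzer process.
        pass

if __name__ == "__main__":
    main(sys.argv[1])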

Impact: Remote code execution without authentication

Integration with CI/CD

Continuous Fuzzing Pipeline

# GitHub Actions workflow
name: Continuous Fuzzing

on:
  push:
    branches: [main]
  schedule:
    - cron: '0 0 * * *'  # Daily

jobs:
  fuzz:
    runs-on: windows-latest
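    timeout-minutes: 350  # GitHub-hosted runners stop jobs after 6 hours

    steps:
      - uses: actions/checkout@v4

      - name: Setup WinAFL
        run: |
          # No Chocolatey package is assumed; build DynamoRIO and WinAFL from source
          git clone --recursive https://github.com/DynamoRIO/dynamorio.git C:\dynamorio
          cmake -S C:\dynamorio -B C:\dynamorio\build -G "Visual Studio 17 2022" -A x64
          cmake --build C:\dynamorio\build --config Release
          git clone --recursive https://github.com/googleprojectzero/winafl.git C:\winafl
          cmake -S C:\winafl -B C:\winafl\build -G "Visual Studio 17 2022" -A x64 "-DDynamoRIO_DIR=C:\dynamorio\build\cmake"
          cmake --build C:\winafl\build --config Release

      - name: Build harness
        run: |
          cmake -B build -DCMAKE_BUILD_TYPE=Release
          cmake --build build --config Release

      - name: Run fuzzing
        run: |
          # corpus\ must hold seed files checked into the repository.
          # Run from the WinAFL bin directory so winafl.dll is found;
          # the harness path below assumes the default CMake output layout.
          cd C:\winafl\build\bin\Release
          .\afl-fuzz.exe -i "$env:GITHUB_WORKSPACE\corpus" -o "$env:GITHUB_WORKSPACE\crashes" `
            -D C:\dynamorio\build\bin64 -t 20000 `
            -- -coverage_module target.dll -target_module harness.exe `
            -target_method fuzz_iteration -nargs 2 -fuzz_iterations 5000 `
            -- "$env:GITHUB_WORKSPACE\build\Release\harness.exe" "@@"
        timeout-minutes: 240  # leaves time for the build, triage, and upload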

      - name: Analyze crashes
        if: always()
        run: |
          python scripts/triage_crashes.py crashes/

      - name: Upload artifacts
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: fuzzing-results
          path: crashes/
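The workflow calls `scripts/triage_crashes.py`, which is not shown. A minimal sketch of what such a script might do is below: look for crash inputs under the output directory and turn their presence into an exit code so the pipeline can flag the run. The function names are hypothetical, and the layout (a `crashes` folder at the top level or one per instance, files starting with `id`) is assumed from the usual AFL output structure.

```python
from pathlib import Path

def find_crash_inputs(out_dir):
    """Collect crash inputs from <out>/crashes and <out>/<instance>/crashes."""
    out = Path(out_dir)
    crash_dirs = [out / "crashes"] + sorted(out.glob("*/crashes"))
    found = []
    for crash_dir in crash_dirs:
        if crash_dir.is_dir():
            # Skip helper files such as README.txt; crash inputs start with "id"
            found += sorted(p for p in crash_dir.iterdir()
                            if p.is_file() and p.name.startswith("id"))
    return found

def triage_exit_code(out_dir):
    """Print crash inputs and return 1 if any exist, else 0."""
    crashes = find_crash_inputs(out_dir)
    for path in crashes:
        print(f"crash input: {path}")
    return 1 if crashes else 0
```

A script entry point would then call `sys.exit(triage_exit_code(sys.argv[1]))`, so that a run with crashes fails the step while the artifacts are still uploaded by the `if: always()` step.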

Best Practices

1. Seed Corpus Quality

# Collect diverse valid samples
# Good: Various file types, edge cases, minimal samples
copy valid_grayscale.jpg corpus\
copy valid_rgb.jpg corpus\
copy valid_cmyk.jpg corpus\
copy valid_progressive.jpg corpus\
copy minimal.jpg corpus\  # Smallest valid file

# Bad: Redundant similar files
# Don't copy 1000 nearly-identical images

2. Timeout Configuration

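# Measure baseline execution time
Measure-Command { .\harness.exe corpus\sample.jpg }

# Set timeout to 5-10x baseline
# Baseline: 50ms → Timeout: 250-500ms
# Under DynamoRIO the target runs slower and the first iteration includes
# process startup, so measure the instrumented run or keep a generous margin

afl-fuzz.exe -t 500 ...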
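The 5-10x rule is simple arithmetic, but taking the median of several runs makes it less sensitive to one slow outlier. The helper below is a sketch; the function name, the default factor, and the 100 ms floor are assumptions chosen for illustration.

```python
import statistics

def suggest_timeout_ms(samples_ms, factor=5, floor_ms=100):
    """Suggest a timeout: median baseline times a safety factor, with a floor."""
    baseline = statistics.median(samples_ms)
    return max(int(round(baseline * factor)), floor_ms)

# Three timed runs of the harness, in milliseconds (example values)
runs = [48, 50, 55]
print(suggest_timeout_ms(runs))             # lower end of the 5-10x range
print(suggest_timeout_ms(runs, factor=10))  # upper end
```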

3. Deterministic vs Random Fuzzing

Deterministic stages (run once, the first time a queue entry is fuzzed):
  - Bit flips
  - Byte flips
  - Arithmetic mutations
  - Known integers (0, -1, MAX_INT)
  - Dictionary tokens (when -x is used)

Randomized stages (every cycle):
  - Havoc mode (stacked mutations)
  - Splicing (combine inputs)
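Two of the deterministic stages are easy to show in code. The generators below are a simplified sketch of a walking single-bit flip and of overwriting each byte with "interesting" 8-bit values; the value table matches the one in classic AFL, but real AFL also skips mutations that earlier stages already produced, which is omitted here.

```python
# Boundary values for the 8-bit "known integers" stage
INTERESTING_8 = [-128, -1, 0, 1, 16, 32, 64, 100, 127]

def walking_bit_flips(data):
    """Yield one mutant per bit, flipping bits from most to least significant."""
    for bit in range(len(data) * 8):
        buf = bytearray(data)
        buf[bit >> 3] ^= 128 >> (bit & 7)
        yield bytes(buf)

def interesting_bytes(data):
    """Yield mutants with each byte replaced by each interesting value."""
    for pos in range(len(data)):
        for value in INTERESTING_8:
            buf = bytearray(data)
            buf[pos] = value & 0xFF   # store signed values as two's complement
            yield bytes(buf)

flips = list(walking_bit_flips(b"\x00\x00"))
print(len(flips), flips[0], flips[-1])
print(len(list(interesting_bytes(b"A"))))
```

Because these stages enumerate every position, their cost grows with input size, which is one reason small seed files are preferred.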

4. Coverage Maximization

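// Ensure short inputs still reach deeper code paths
#include <cstdint>
#include <vector>

void ParseImage(const uint8_t* data, size_t len);  // target; MIN_SIZE is its minimum input size

void FuzzOne(const uint8_t* data, size_t len) {
    // BAD: raising len without growing the buffer makes the parser read
    // past the end of data and reports crashes that are not real bugs
    //   if (len < MIN_SIZE) len = MIN_SIZE;

    // GOOD: copy into a zero-padded buffer that really holds MIN_SIZE bytes
    std::vector<uint8_t> padded(data, data + len);
    if (padded.size() < MIN_SIZE) {
        padded.resize(MIN_SIZE, 0);
    }
    ParseImage(padded.data(), padded.size());

    // In a fuzzing-only build, checks the fuzzer cannot satisfy
    // (checksums, signatures) can be stubbed out so later code is reachable
}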

Conclusion

Coverage-guided fuzzing with WinAFL is among the most effective techniques for discovering vulnerabilities in Windows applications. By combining DynamoRIO instrumentation, careful corpus management, and persistent mode execution, researchers can systematically explore large numbers of execution paths to uncover critical security flaws.

Modern vulnerability research demands continuous fuzzing infrastructure integrated into development pipelines, enabling proactive security before attackers discover exploitable bugs.
