Windows Application Fuzzing: AFL, WinAFL, and Coverage-Guided Techniques
Introduction
Fuzzing is one of the most widely used techniques for discovering memory corruption vulnerabilities in Windows applications. Coverage-guided fuzzing, particularly through WinAFL (a Windows port of AFL), systematically explores program execution paths to uncover edge cases that lead to crashes, buffer overflows, and other exploitable conditions.
Fuzzing Fundamentals
Coverage-Guided Fuzzing
Traditional (blind) fuzzing generates random inputs with no feedback from the target, whereas coverage-guided fuzzing uses code coverage feedback to decide which inputs to keep and mutate:
Initial corpus → Execute → Measure coverage → Mutate interesting inputs → Repeat
Key metrics:
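- Edge coverage: Number of unique branch transitions (edges) executed
- Path count: Number of inputs kept because they reached new coverage (reported by AFL as "total paths")
- Crash uniqueness: Distinct crashes; AFL-style fuzzers judge this by coverage, and triage tools often bucket further by stack hash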
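The feedback loop above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how AFL or WinAFL is implemented: `toy_target` is a hypothetical stand-in for an instrumented program that reports which branches an input reached, and `mutate` is a single-byte replacement rather than AFL's mutation stages.

```python
import random

def toy_target(data: bytes) -> set:
    """Hypothetical target: returns the set of branch IDs an input reaches."""
    edges = {"entry"}
    if len(data) > 0 and data[0] == 0xFF:
        edges.add("magic0")
        if len(data) > 1 and data[1] == 0xD8:
            edges.add("magic1")
    return edges

def mutate(data: bytes, rng: random.Random) -> bytes:
    """Replace one randomly chosen byte with a random value."""
    if not data:
        return bytes([rng.randrange(256)])
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)

def fuzz(seeds, iterations, seed=0):
    """Keep a mutated input only if it reaches coverage not seen before."""
    rng = random.Random(seed)
    queue = list(seeds)
    seen = set()
    for s in queue:
        seen |= toy_target(s)
    for _ in range(iterations):
        child = mutate(rng.choice(queue), rng)
        coverage = toy_target(child)
        if not coverage <= seen:      # at least one new branch
            seen |= coverage
            queue.append(child)       # "interesting" input joins the corpus
    return queue, seen

queue, seen = fuzz([b"\x00\x00"], iterations=50000)
print(sorted(seen), len(queue))
```

Because the input that reaches `magic0` is kept in the queue, the second byte only has to be guessed relative to that input, which is the practical advantage over blind random generation.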
American Fuzzy Lop (AFL)
AFL revolutionized fuzzing with genetic algorithms and compile-time instrumentation:
// AFL compile-time instrumentation
cur_location = <COMPILE_TIME_RANDOM>;
shared_mem[cur_location ^ prev_location]++;
prev_location = cur_location >> 1;
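This records edge transitions (A→B) rather than just block hits. The right shift of prev_location keeps A→B distinct from B→A and prevents tight loops (A→A) from always hashing to index zero.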
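A small Python model of the snippet above illustrates why the shift matters. The block IDs `A` and `B` are arbitrary values chosen for the example, and the `shift` parameter exists only so the no-shift case can be compared; neither comes from AFL itself.

```python
from collections import Counter

MAP_SIZE = 1 << 16  # AFL's default coverage map has 64K entries

def edge_index(cur_location, prev_block, shift=1):
    """Map index for the transition prev_block -> cur_location."""
    return (cur_location ^ (prev_block >> shift)) % MAP_SIZE

def trace_edges(block_ids, shift=1):
    """Replay a sequence of block IDs the way the instrumentation would."""
    shared_mem = Counter()
    prev_location = 0
    for cur_location in block_ids:
        shared_mem[(cur_location ^ prev_location) % MAP_SIZE] += 1
        prev_location = cur_location >> shift
    return shared_mem

A, B = 0x1234, 0x5678  # example compile-time random IDs

print(hex(edge_index(B, A)), hex(edge_index(A, B)))                    # with shift
print(hex(edge_index(B, A, shift=0)), hex(edge_index(A, B, shift=0)))  # without shift
print(trace_edges([A, B, A, B]))
```

Without the shift, both directions collapse to `A ^ B`, so the fuzzer could not tell A→B from B→A.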
WinAFL Architecture
DynamoRIO Instrumentation
WinAFL uses DynamoRIO for runtime binary instrumentation without source code:
Application → DynamoRIO → Coverage tracking → WinAFL → Corpus management
Advantages:
- No source code required
- Works with closed-source binaries
- Runtime instrumentation of Windows APIs
Installation and Setup
# Install prerequisites (the vctools workload provides the C++ compiler)
choco install -y git cmake python3 visualstudio2022buildtools visualstudio2022-workload-vctools
# Clone WinAFL (with submodules)
git clone --recursive https://github.com/googleprojectzero/winafl.git
cd winafl
# Build DynamoRIO (its build depends on git submodules)
git clone --recursive https://github.com/DynamoRIO/dynamorio.git
mkdir dynamorio/build
cd dynamorio/build
cmake -G "Visual Studio 17 2022" -A x64 ..
cmake --build . --config Release
# Build WinAFL against that DynamoRIO build
cd ../..
mkdir build
cd build
cmake -G "Visual Studio 17 2022" -A x64 .. -DDynamoRIO_DIR="..\dynamorio\build\cmake"
cmake --build . --config Release
Target Selection and Harness Development
Identifying Fuzz Targets
Good fuzzing targets include:
- Parsers for complex file formats (PDF, Office, images)
- Network protocol handlers
- Compression/decompression routines
- Cryptographic implementations
Example: Fuzzing image parser in a media library
// Target function: ParseJPEG in image.dll
// void ParseJPEG(const uint8_t* data, size_t len)
// Harness wrapper
#include <windows.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
extern "C" __declspec(dllimport) void ParseJPEG(const uint8_t* data, size_t len);
int main(int argc, char** argv) {
    if (argc < 2) {
        printf("Usage: %s <input_file>\n", argv[0]);
        return 1;
    }
    // Read input file
    HANDLE hFile = CreateFileA(argv[1], GENERIC_READ, FILE_SHARE_READ, NULL,
                               OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (hFile == INVALID_HANDLE_VALUE) {
        return 1;
    }
    DWORD fileSize = GetFileSize(hFile, NULL);
    if (fileSize == INVALID_FILE_SIZE || fileSize == 0) {
        CloseHandle(hFile);
        return 1;
    }
    uint8_t* buffer = (uint8_t*)malloc(fileSize);
    if (buffer == NULL) {
        CloseHandle(hFile);
        return 1;
    }
    DWORD bytesRead = 0;
    BOOL ok = ReadFile(hFile, buffer, fileSize, &bytesRead, NULL);
    CloseHandle(hFile);
    // Call target function only if the read succeeded
    if (ok) {
        ParseJPEG(buffer, bytesRead);
    }
    free(buffer);
    return ok ? 0 : 1;
}
Persistent Mode Fuzzing
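For maximum speed, use persistent mode to avoid process creation overhead. WinAFL implements it through DynamoRIO: it re-runs the function named by -target_method in a loop and restores its arguments each time, so that function must open the input file, parse it, close it, and return normally:
// Persistent mode harness (same includes as the harness above)
// ResetParser is a placeholder for whatever state reset the target needs
extern "C" __declspec(dllimport) void ResetParser(void);
extern "C" __declspec(dllexport)
int fuzz_iteration(int argc, char** argv) {
    if (argc < 2) return 1;
    // Reset global state so iterations do not influence each other
    ResetParser();
    // Re-read the input on every iteration; WinAFL rewrites this file between runs
    FILE* f = fopen(argv[1], "rb");
    if (f == NULL) return 1;
    static uint8_t buffer[64 * 1024];
    size_t len = fread(buffer, 1, sizeof(buffer), f);
    fclose(f);
    // Call target
    ParseJPEG(buffer, len);
    return 0;
}
int main(int argc, char** argv) {
    // No explicit loop here: WinAFL re-enters fuzz_iteration itself
    return fuzz_iteration(argc, argv);
}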
Performance: Persistent mode typically gives a 10-100x speedup over launching a new process for every input. Windows has no fork(), so per-input process creation is comparatively expensive.
Running WinAFL
Basic Fuzzing Campaign
# Create directories
mkdir in
mkdir out
# Create seed corpus (valid JPEG files)
copy sample1.jpg in\
copy sample2.jpg in\
# Run WinAFL
afl-fuzz.exe -i in -o out -D C:\dynamorio\build\bin64 -t 20000 `
-- -coverage_module image.dll -target_module harness.exe `
-target_method fuzz_iteration -nargs 2 -fuzz_iterations 5000 `
-- harness.exe @@
# Parameters:
# -i: Input corpus directory
# -o: Output directory (queue, crashes, hangs)
# -D: DynamoRIO bin directory
# -t: Timeout per execution (ms)
# -coverage_module: Module to track coverage in (may be repeated)
# -target_module: Module that contains the target function
# -target_method: Function to fuzz (persistent mode)
# -nargs: Number of arguments the target function takes
# -fuzz_iterations: Iterations before the target process is restarted
# @@: Replaced with input file path
Understanding AFL UI
┌─ process timing ─────────────────────────────────┐
│        run time : 2 days, 4 hrs, 15 min, 2 sec   │
│   last new path : 0 days, 0 hrs, 3 min, 12 sec   │
│ last uniq crash : 0 days, 1 hrs, 22 min, 7 sec   │
│  last uniq hang : none seen yet                  │
└──────────────────────────────────────────────────┘
┌─ overall results ────────────────────────────────┐
│     cycles done : 147                            │
│     total paths : 8234                           │
│    uniq crashes : 12                             │
│      uniq hangs : 0                              │
└──────────────────────────────────────────────────┘
Key metrics:
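- total paths: Number of queue entries, i.e. inputs that reached new coverage (not literal execution paths)
- uniq crashes: Crashing inputs the fuzzer considers distinct, judged by their coverage
- last new path: Time since coverage last grew; a long gap suggests the campaign has plateaued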
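The same numbers can be read programmatically. AFL-style fuzzers write a `fuzzer_stats` text file of `key : value` lines into the output directory; the sketch below assumes the AFL 2.x field names `paths_total`, `unique_crashes`, and `last_path` (a Unix timestamp), which WinAFL is expected to share but which should be checked against an actual file. The sample text is made up for the example.

```python
def parse_fuzzer_stats(text):
    """Parse 'key : value' lines from an AFL-style fuzzer_stats file."""
    stats = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as C:\paths survive
        key, sep, value = line.partition(":")
        if sep:
            stats[key.strip()] = value.strip()
    return stats

def seconds_since_last_path(stats, now):
    """Freshness: seconds since the last new queue entry (None if none yet)."""
    last_path = int(stats.get("last_path", 0))
    return None if last_path == 0 else now - last_path

sample = """\
start_time        : 1700000000
last_path         : 1700003600
paths_total       : 8234
unique_crashes    : 12
"""

stats = parse_fuzzer_stats(sample)
print(stats["paths_total"], stats["unique_crashes"])
print(seconds_since_last_path(stats, now=1700003792))
```

A monitoring script could alert when the freshness value exceeds a chosen threshold, which is one way to decide when a campaign has stopped making progress.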
Corpus Minimization
Reducing Redundant Inputs
# Minimize corpus using winafl-cmin.py (WinAFL ships a Python script in place of afl-cmin)
python winafl-cmin.py -D C:\dynamorio\build\bin64 -t 100000 -i out\queue -o minimized_corpus `
-coverage_module image.dll -target_module harness.exe `
-target_method fuzz_iteration -nargs 2 -- harness.exe @@
# Minimize individual testcases
afl-tmin.exe -D C:\dynamorio\build\bin64 -i crash.jpg -o minimized_crash.jpg `
-- -coverage_module image.dll -target_module harness.exe `
-target_method fuzz_iteration -nargs 2 `
-- harness.exe @@
winafl-cmin.py: Keeps only inputs that trigger unique coverage
afl-tmin: Reduces individual file size while preserving the crash
Example reduction:
Original crash file: 45 KB
Minimized crash: 187 bytes (99.6% reduction)
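The idea behind corpus minimization can be sketched as a greedy set cover over per-file coverage. This is an illustration under assumptions, not the algorithm the WinAFL tooling uses (AFL's own minimizer also prefers the smallest file for each coverage tuple); the file names and edge IDs are invented.

```python
def minimize_corpus(coverage):
    """
    coverage: dict mapping file name -> set of edge IDs that file reaches.
    Returns a list of files that together reach every edge in the corpus.
    """
    remaining = set().union(*coverage.values())
    kept = []
    while remaining:
        # Pick the file covering the most still-uncovered edges
        # (ties broken alphabetically because names are pre-sorted)
        best = max(sorted(coverage), key=lambda name: len(coverage[name] & remaining))
        kept.append(best)
        remaining -= coverage[best]
    return kept

corpus = {
    "a.jpg": {1, 2, 3},
    "b.jpg": {2, 3},      # adds nothing beyond a.jpg
    "c.jpg": {4},
}
print(minimize_corpus(corpus))
```

Here `b.jpg` is dropped because every edge it reaches is already covered by `a.jpg`.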
Crash Triage and Analysis
Crash Deduplication
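WinAFL itself counts a crash as unique when its coverage differs from that of earlier crashes, so one bug can appear many times. Bucketing crashes by stack hash is a common post-processing step: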
import hashlib
import os
from collections import defaultdict

def extract_stack_trace(crash_file):
    """
    Placeholder: replay crash_file under a debugger (for example cdb)
    and return the faulting stack as a list of 'module!function' strings.
    """
    raise NotImplementedError("hook up a debugger to collect stack frames")

def compute_crash_hash(crash_file):
    """
    Compute a crash signature from the top stack frames
    """
    stack_trace = extract_stack_trace(crash_file)
    # Hash first 5 frames
    frames = stack_trace[:5]
    signature = '|'.join(frames)
    return hashlib.md5(signature.encode()).hexdigest()

def deduplicate_crashes(crash_dir):
    """
    Group crashes by unique signature
    """
    crash_buckets = defaultdict(list)
    for filename in sorted(os.listdir(crash_dir)):
        # WinAFL names crash files id_NNNNNN_... (':' is not valid in Windows file names)
        if not filename.startswith('id_'):
            continue
        path = os.path.join(crash_dir, filename)
        crash_buckets[compute_crash_hash(path)].append(filename)
    # Print unique crashes
    for i, (crash_hash, files) in enumerate(crash_buckets.items()):
        print(f"Crash bucket {i} ({len(files)} samples):")
        print(f"  Representative: {files[0]}")
        print(f"  Hash: {crash_hash}\n")
    return crash_buckets
Exploitability Analysis with !exploitable
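# Load crash dump in WinDbg (from a command prompt)
windbg -z crash.dmp
# Inside WinDbg: load the !exploitable extension (msec.dll must be on the extension path)
.load msec.dll
# Analyze exploitability
!exploitable
# Output example (abridged):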
# EXPLOITABILITY CLASSIFICATION: EXPLOITABLE
# EXPLANATION: User mode write AV starting at ntdll!RtlFreeHeap+0x254
# The target crashed on a user mode write access violation.
# The user mode write access violation occurred at address 0x41414141.
Exploitability ratings:
- EXPLOITABLE: High confidence of exploitation
- PROBABLY_EXPLOITABLE: Likely exploitable with effort
- UNKNOWN: Requires manual analysis
- PROBABLY_NOT_EXPLOITABLE: Unlikely to be exploitable
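To prioritize many crash buckets, the rating can be pulled out of saved debugger output and used as a sort key. The sketch below assumes the output contains a line of the form shown in the example above; the crash names and sample text are invented, and anything unrecognized is treated as UNKNOWN.

```python
import re

# Lower number = look at it first
RATING_PRIORITY = {
    "EXPLOITABLE": 0,
    "PROBABLY_EXPLOITABLE": 1,
    "UNKNOWN": 2,
    "PROBABLY_NOT_EXPLOITABLE": 3,
}

def parse_classification(debugger_output):
    """Extract the rating from !exploitable output; default to UNKNOWN."""
    match = re.search(r"exploitability classification:\s*([A-Z_]+)",
                      debugger_output, re.IGNORECASE)
    if not match:
        return "UNKNOWN"
    rating = match.group(1).upper()
    return rating if rating in RATING_PRIORITY else "UNKNOWN"

def triage_order(outputs):
    """outputs: dict crash name -> debugger text. Most severe first."""
    return sorted(
        outputs,
        key=lambda name: (RATING_PRIORITY[parse_classification(outputs[name])], name),
    )

outputs = {
    "crash_a": "EXPLOITABILITY CLASSIFICATION: PROBABLY_NOT_EXPLOITABLE",
    "crash_b": "Exploitability Classification: EXPLOITABLE\nUser mode write AV",
    "crash_c": "debugger produced no rating",
}
print(triage_order(outputs))
```

The rating is a heuristic, so UNKNOWN results are ordered ahead of PROBABLY_NOT_EXPLOITABLE to make sure they receive manual review.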
Advanced Techniques
Custom Mutators
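Implement domain-specific mutations for structured formats. The hook below follows an AFL++-style custom mutator shape for illustration only; WinAFL loads custom test-case processing code from a user-supplied DLL (its -l option), and its README documents the exact entry points. jpeg_parse, jpeg_rebuild, and mutate_segment are placeholder helpers:
// Custom mutator for JPEG files
#include "afl-fuzz.h"
// Preserve JPEG structure while mutating data
size_t afl_custom_fuzz(uint8_t** out_buf, uint8_t* in_buf,
                       size_t in_size, unsigned int seed) {
    // Parse JPEG structure (jpeg_parser_t and jpeg_parse are placeholders)
    jpeg_parser_t parser;
    jpeg_parse(&parser, in_buf, in_size);
    // Mutate only image data segments (preserve markers)
    for (int i = 0; i < parser.num_segments; i++) {
        if (parser.segments[i].type == JPEG_SEGMENT_DATA) {
            // Apply AFL mutations to data segment
            mutate_segment(&parser.segments[i], seed);
        }
    }
    // Rebuild JPEG; jpeg_rebuild returns a new buffer and writes its length
    size_t out_size = 0;
    *out_buf = jpeg_rebuild(&parser, &out_size);
    return out_size;
}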
Dictionary-Based Fuzzing
Provide magic values for better coverage:
# jpeg.dict - JPEG magic values
marker_soi="\xff\xd8"
marker_eoi="\xff\xd9"
marker_sos="\xff\xda"
marker_dqt="\xff\xdb"
marker_app0="\xff\xe0"
marker_comment="\xff\xfe"
# Use with -x flag
afl-fuzz.exe -i in -o out -x jpeg.dict ...
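How a fuzzer uses such a dictionary can be sketched as follows: parse the `name="value"` lines into byte tokens, then insert or overwrite them at positions in an input. This is a simplified model, not AFL's implementation; it handles only the `\xNN`, `\\`, and `\"` escapes.

```python
import re

_ESCAPE = re.compile(rb'\\(x[0-9a-fA-F]{2}|\\|")')

def _unescape(match):
    body = match.group(1)
    if body.startswith(b"x"):
        return bytes([int(body[1:].decode("ascii"), 16)])
    return body  # escaped backslash or quote

def parse_dictionary(text):
    """Return the list of byte tokens in an AFL-style dictionary."""
    tokens = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        m = re.match(r'^(?:[\w@]+\s*=\s*)?"(.*)"$', line)
        if m:
            tokens.append(_ESCAPE.sub(_unescape, m.group(1).encode("latin-1")))
    return tokens

def insert_token(data, token, pos):
    """Grow the input by inserting the token at pos."""
    return data[:pos] + token + data[pos:]

def overwrite_token(data, token, pos):
    """Keep the input length by overwriting bytes starting at pos."""
    return data[:pos] + token + data[pos + len(token):]

tokens = parse_dictionary(r'''
# jpeg.dict
marker_soi="\xff\xd8"
marker_eoi="\xff\xd9"
''')
print(tokens)
print(insert_token(b"AAAA", tokens[0], 2))
print(overwrite_token(b"AAAA", tokens[0], 1))
```

Multi-byte markers such as these are unlikely to appear through bit flips alone, which is why supplying them tends to help with formats built around magic values.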
Parallel Fuzzing
Distribute fuzzing across multiple cores:
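# Run each instance in its own terminal, all pointing at the same output directory
# Master instance (deterministic fuzzing)
afl-fuzz.exe -i in -o out -M master01 -D C:\dynamorio\build\bin64 `
-- -coverage_module image.dll -target_module harness.exe `
-target_method fuzz_iteration -nargs 2 -fuzz_iterations 5000 -- harness.exe @@
# Secondary instances (randomized fuzzing)
afl-fuzz.exe -i in -o out -S secondary01 -D C:\dynamorio\build\bin64 `
-- -coverage_module image.dll -target_module harness.exe `
-target_method fuzz_iteration -nargs 2 -fuzz_iterations 5000 -- harness.exe @@
afl-fuzz.exe -i in -o out -S secondary02 -D C:\dynamorio\build\bin64 `
-- -coverage_module image.dll -target_module harness.exe `
-target_method fuzz_iteration -nargs 2 -fuzz_iterations 5000 -- harness.exe @@
# Instances periodically import each other's new queue entries through the
# shared output directory; no manual sync step is needed
Speedup: Close to linear at best, up to the number of physical CPU cores; contention for shared resources usually makes it sublinear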
Real-World Case Studies
CVE-2020-0938: Adobe Font Manager
Target: Windows font parsing (atmfd.dll)
Harness:
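// Minimal font loading harness (link with gdi32.lib)
#include <windows.h>
int main(int argc, char** argv) {
    if (argc < 2) return 1;
    HANDLE hFile = CreateFileA(argv[1], GENERIC_READ, FILE_SHARE_READ, NULL,
                               OPEN_EXISTING, 0, NULL);
    if (hFile == INVALID_HANDLE_VALUE) return 1;
    DWORD size = GetFileSize(hFile, NULL);
    HANDLE hMap = CreateFileMappingA(hFile, NULL, PAGE_READONLY, 0, 0, NULL);
    LPVOID pFont = hMap ? MapViewOfFile(hMap, FILE_MAP_READ, 0, 0, 0) : NULL;
    if (pFont != NULL) {
        // Trigger font parsing; the last argument receives the number of fonts installed
        DWORD numFonts = 0;
        HANDLE hFontResource = AddFontMemResourceEx(pFont, size, NULL, &numFonts);
        if (hFontResource != NULL) {
            RemoveFontMemResourceEx(hFontResource);
        }
        UnmapViewOfFile(pFont);
    }
    if (hMap) CloseHandle(hMap);
    CloseHandle(hFile);
    return 0;
}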
Results:
- 48 hours fuzzing
- 23 unique crashes discovered
- CVE-2020-0938: Memory corruption in Adobe Type 1 (multi-master) font parsing
CVE-2019-0708: BlueKeep (RDP)
Target: Remote Desktop Protocol (TermDD.sys)
Approach:
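# Network protocol fuzzing harness
import socket
import sys

def fuzz_rdp_handshake(mutated_packet, host="target", port=3389, timeout=5.0):
    """
    Send mutated RDP handshake packet and return the response
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # X.224 Connection Request
        sock.sendall(mutated_packet)
        return sock.recv(1024)

# Wrapper invoked once per test case with the input file path
def main(input_file):
    with open(input_file, 'rb') as f:
        packet = f.read()
    try:
        fuzz_rdp_handshake(packet)
    except OSError:
        # Timeouts and resets are expected for malformed packets. A crash of
        # the remote kernel driver is not visible to this script and must be
        # detected on the target (for example with a kernel debugger attached).
        pass

if __name__ == "__main__":
    main(sys.argv[1])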
Impact: Remote code execution without authentication
Integration with CI/CD
Continuous Fuzzing Pipeline
# GitHub Actions workflow
name: Continuous Fuzzing
on:
  push:
    branches: [main]
  schedule:
    - cron: '0 0 * * *' # Daily
jobs:
  fuzz:
    runs-on: windows-latest
    timeout-minutes: 360 # GitHub-hosted runners cap a job at 6 hours
    steps:
      - uses: actions/checkout@v4
      - name: Setup WinAFL
        run: |
          # Assumes a 'winafl' package is available from your Chocolatey feed;
          # otherwise build WinAFL and DynamoRIO from source in this step
          choco install winafl
      - name: Build harness
        run: |
          cmake -B build -DCMAKE_BUILD_TYPE=Release
          cmake --build build --config Release
      - name: Run fuzzing
        run: |
          # 'corpus' must contain seed files checked into the repository
          afl-fuzz.exe -i corpus -o crashes -D C:\DynamoRIO\bin64 `
            -- -coverage_module target.dll -target_module harness.exe `
            -target_method fuzz_iteration -nargs 2 -fuzz_iterations 5000 `
            -- harness.exe @@
        timeout-minutes: 300 # 5 hours, leaving time for triage and upload
      - name: Analyze crashes
        if: always()
        run: |
          python scripts/triage_crashes.py crashes/
      - name: Upload artifacts
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: fuzzing-results
          path: crashes/
Best Practices
1. Seed Corpus Quality
# Collect diverse valid samples
# Good: Various file types, edge cases, minimal samples
copy valid_grayscale.jpg corpus\
copy valid_rgb.jpg corpus\
copy valid_cmyk.jpg corpus\
copy valid_progressive.jpg corpus\
copy minimal.jpg corpus\ # Smallest valid file
# Bad: Redundant similar files
# Don't copy 1000 nearly-identical images
2. Timeout Configuration
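# Measure baseline execution time (native run, without DynamoRIO)
Measure-Command { .\harness.exe corpus\sample.jpg }
# Set timeout to 5-10x baseline
# Baseline: 50ms → Timeout: 250-500ms
# Instrumented runs are slower, so raise the value if valid seeds time out
afl-fuzz.exe -t 500 ...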
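The rule of thumb above is simple arithmetic; a small helper makes it repeatable across several timing samples. The median, the default multiplier, and the 100 ms floor are choices made for this sketch, not WinAFL defaults.

```python
import statistics

def suggest_timeout_ms(samples_ms, multiplier=10, floor_ms=100):
    """Suggest a -t value from measured execution times in milliseconds."""
    if not samples_ms:
        raise ValueError("need at least one timing sample")
    # The median is less sensitive to a single slow outlier than the mean
    baseline = statistics.median(samples_ms)
    return max(floor_ms, int(round(baseline * multiplier)))

samples = [48, 50, 55]  # e.g. three Measure-Command results
print(suggest_timeout_ms(samples, multiplier=5))
print(suggest_timeout_ms(samples, multiplier=10))
```

Taking the samples from runs under DynamoRIO rather than native runs gives a value that better reflects what the fuzzer will observe.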
3. Deterministic vs Random Fuzzing
First pass over each queue entry (deterministic):
- Bit flips
- Byte flips
- Arithmetic mutations
- Interesting integers (0, -1, INT_MAX, and similar boundary values)
Later passes (randomized):
- Havoc mode (stacked mutations)
- Splicing (combine inputs)
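The deterministic stages can be sketched as generators that walk every position of an input. This is a single-byte simplification under assumptions: AFL also applies these stages at 16- and 32-bit widths and skips mutations that duplicate earlier ones. The `max_delta` default of 35 and the interesting-value list are taken from AFL's configuration as commonly documented.

```python
def bit_flips(data):
    """Flip each bit in turn, most significant bit of each byte first."""
    for i in range(len(data) * 8):
        buf = bytearray(data)
        buf[i >> 3] ^= 0x80 >> (i & 7)
        yield bytes(buf)

def arithmetic(data, max_delta=35):
    """Add and subtract small values at each byte, wrapping modulo 256."""
    for pos in range(len(data)):
        for delta in range(1, max_delta + 1):
            for signed in (delta, -delta):
                buf = bytearray(data)
                buf[pos] = (buf[pos] + signed) & 0xFF
                yield bytes(buf)

INTERESTING_8 = [-128, -1, 0, 1, 16, 32, 64, 100, 127]

def interesting_values(data):
    """Overwrite each byte with boundary values (no-op results not filtered)."""
    for pos in range(len(data)):
        for value in INTERESTING_8:
            buf = bytearray(data)
            buf[pos] = value & 0xFF
            yield bytes(buf)

seed = b"\x00"
print(list(bit_flips(seed)))
print(list(arithmetic(seed, max_delta=2)))
print(list(interesting_values(seed)))
```

Because these stages scale with input length, small seeds keep the deterministic pass short, which connects to the seed corpus advice above.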
4. Coverage Maximization
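// Ensure all code paths are reachable
void ParseImage(const uint8_t* data, size_t len) {
    // Early exits on short inputs keep the fuzzer out of deeper code
    // BAD:
    //   if (len < MIN_SIZE) return;        // Small inputs never reach the parser
    // ALSO BAD:
    //   if (len < MIN_SIZE) len = MIN_SIZE; // Claims bytes the buffer lacks (out-of-bounds read)
    // GOOD: copy short inputs into a zero-padded buffer of MIN_SIZE bytes
    uint8_t padded[MIN_SIZE] = {0};
    if (len < MIN_SIZE) {
        memcpy(padded, data, len);
        data = padded;
        len = MIN_SIZE;
    }
    // Remove unreachable dead code
    // In fuzzing builds, relax checksum or magic-value checks that mutation rarely satisfies
}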
Conclusion
Coverage-guided fuzzing with WinAFL is one of the most effective techniques for discovering vulnerabilities in Windows applications. By combining DynamoRIO instrumentation, careful corpus management, and persistent mode execution, researchers can systematically explore large numbers of execution paths to uncover critical security flaws.
Modern vulnerability research demands continuous fuzzing infrastructure integrated into development pipelines, enabling proactive security before attackers discover exploitable bugs.