Introduction
My Rusty Pension Fund is my attempt at writing tools for my bug bounty hobby.
I have scanners, fuzzers, wordlist generators, task managers, iOS and macOS apps, the WORKS! Although most of it doesn’t really work, and I have never found a real bug with any of these tools.
Still, all good fun to build and learn stuff!
Its focus is to build a very fast, memory- and CPU-optimized scanner for various scenarios. The initial focus is on specialized scanners for:
- TCP Syn port scanning
- TLS certificate Subject Alt Name scraper
- HTTP(S) scanning
The Rust library will implement its own custom network stack using libpnet to be able to optimize everything as much as possible, similar to masscan. The network stack should be easily extensible, so new scan types that leverage the custom network stack can be added in the future.
The core concept of the network stack is that it uses libpnet at the datalink layer and constructs its own Ethernet, IP and TCP packets. It runs in two separate threads, one for sending packets and another for receiving packets. Similar to masscan, synchronisation between the send and receive threads is avoided by deriving the TCP sequence and ack numbers from a hash, so we can keep track of which sent packet corresponds to which received packet.
How do the libraries work together
Main principle is:
All MRPF tools like scanning, filtering, fuzzing logic should be in external libs (eg. tcp_syn_scanner, cert_transparency, matchers, http1_scanner, etc).
This will allow us to build different ‘front ends’ for these tools.
MRPF Jobs
- The mrpf_core::tasks is the only place that defines Tasks.
- Tasks pull in the external MRPF tool libs and execute them in the execute function of the relevant task.
- The worker nodes are very simple. They pull in the mrpf_core::tasks library and run the tasks. With this architecture, it should be possible to have a single lambda that can execute any arbitrary task. This would make the AWS specific code very small, allowing us to move to other workers without any real effort.
TBD whether we build a separate API for scheduling jobs or integrate it with the MRPF API. A compromise could be to build a separate API, but include endpoints for both the MRPF API and MRPF Jobs in the MRPF API Client.
WebSocket interface for iOS/macOS frontend
The mrpf_scanner_api provides a WebSocket interface for real-time communication between iOS and macOS applications. It allows you to schedule tools and receive updates on their progress.
MRPF API
This is an API that stores all our recon data.
MRPF API Client
This is an async client for interacting with the MRPF API.
CLI
The mrpf_cli is a command-line interface for interacting with the MRPF framework. It allows users to initiate scans, manage tasks, and retrieve results directly from the terminal.
At the moment it hasn’t really been implemented yet.
How will we build REST APIs?
I want it to be reasonably easy to move my REST APIs away from AWS. I’d also like to save costs, as API Gateway can get expensive. You could run a Rust API fully inside a single Lambda. There are two well-known paths for this:
- Use the official lambda_http crate (ALB / API Gateway / Function URLs)
- Run a normal Axum/Hyper server inside Lambda via the AWS Lambda Web Adapter (eg. check this blog https://blog.yuki-dev.com/blogs/9qjgwg-des1z and AWS reInvent own slides here: https://d1.awsstatic.com/events/Summits/reinvent2023/BOA311_Unlocking-serverless-web-applications-with-AWS-Lambda-Web-Adapter.pdf)
Remember ‘The Algorithm’
Elon Musk’s algorithm, to be applied to everything, in this specific order:
1. Question every requirement
   What things do I REALLY need?
   - Continuously find attack surface
   - Quickly fuzz endpoints with payloads
   - Detect anomalies in responses
   What things don’t I really need but am I often looking for?
   - Store all the results
   - Have a fancy UI
2. Delete any part of the process you can
   The most common thing an engineer does is optimize a thing that shouldn’t exist.
3. Simplify and optimize
4. Accelerate cycle time
5. Automate
Core Principles
Computer systems, protocols and applications are all built on abstractions.
These abstractions help us reason about higher level concepts and speed up development by hiding complexity and to avoid reinventing the wheel.
However, in security research it’s crucial to understand the underlying reality behind these abstractions. Security vulnerabilities often arise from a mismatch between these layers of abstraction.
I love to understand how things really work. When you start to dig into the lower-level systems, you develop a good intuition for how a system operates. I’ve always had a tendency to want to build everything from scratch, as if it’s cheating to use something that you don’t fully understand. This can hold you back, but now with the advent of AI, learning and building systems has become much easier.
My goal with MRPF is to try to keep the underlying reality front and center. This sometimes comes at the cost of less intuitive or more verbose interfaces, but differentiates the toolset from most other tools out there.
TODO: Write my blog around misconceptions around hosts, ips, domains, root paths, dns, TLS SNI etc.
System Architecture
The system architecture of the MRPF project can be broken down into a few key components:
MRPF API
The MRPF API serves as the interface for clients to interact with the collected data and manage the scheduling of tasks.
Task Manager
The Task Manager is responsible for scheduling and managing various scanning tasks. It handles task creation, execution, and monitoring.
Network Engine
The Network Engine is a fast network scanning engine based on masscan. It is responsible for sending and receiving network packets using separate receive and transmit threads. The engine itself exposes traits which can be used to implement different scanning techniques.
TCP SYN Scanner
The TCP SYN Scanner is a specific implementation of a scanning technique that utilizes TCP SYN packets to identify open ports on target hosts. It leverages the Network Engine for packet transmission and reception.
HTTP/1.1 Scanner
The HTTP/1.1 Scanner is designed to perform HTTP/1.1 requests to target hosts and analyze the responses. It can be used to identify web servers, gather information about web applications, and detect potential vulnerabilities.
TLS Certificate Scanner
The TLS Certificate Scanner is responsible for retrieving and analyzing TLS/SSL certificates from target hosts. It can be used to identify the certificate issuer, expiration dates, and potential vulnerabilities in the SSL configuration. It uses the Network Engine for network communication and has a custom TLS implementation to extract certificate information without relying on external libraries.
DNS Resolver
NOT IMPLEMENTED YET
The DNS Resolver is responsible for querying DNS servers to resolve domain names to IP addresses. It utilizes the Network Engine to perform these queries and gather relevant data.
Whois Resolver
NOT IMPLEMENTED YET
The Whois Resolver is responsible for querying Whois databases to retrieve information about domain names and IP addresses. It utilizes the Network Engine to perform these queries and gather relevant data.
Models
The various components of the MRPF project utilize a shared set of models defined in the mrpf_models crate. These models define the data structures and types used throughout the system, ensuring consistency and interoperability between different components.
Iterator Models
Most of the models are reasonably straightforward. However, the Ipv4Range, Ipv4Addresses, Ports and PortRange models deserve special attention.
These models implement a custom Iterator that shuffles the order of IP addresses and ports to avoid predictable scanning patterns. The algorithm idea was taken from masscan.
The iterator ensures each item is returned only once but avoids having to store all items in memory at once. It accomplishes this by only storing the start and end values within the Ipv4Addresses and PortRange models. The iterator uses a Feistel cipher to generate a pseudo-random permutation of the range of values, allowing for efficient iteration without repetition.
The low memory footprint is very useful in our task manager as it reduces the SQS message sizes, the database storage requirements and the RAM usage of the workers.
In the future we may introduce similar iterators, for instance for domain names. When trying to fuzz for new subdomains, we could reduce memory footprint by storing it as a hierarchical structure instead of a flat list.
The MRPF API
The MRPF API allows clients to programmatically interact with the MRPF platform. It provides endpoints for managing recon data like targets, domains and wordlists, as well as for triggering tasks.
Current State
At the moment the code is still running on the MRPF Python codebase, with a DocumentDB backend. I would love to get this into Rust for better performance and already have some of the models defined in mrpf_models.
Some things I want to work on:
- Revisit the templating engine for Rust. Think about how to represent things; our wordlists probably need to work better with bytes and then have methods to convert to UTF-8/16/etc where applicable
- Move away from DocumentDB to PostgreSQL. This will give me back the triggers for timestamps that I like very much. Also, DynamoDB for at least the transparency records was just too costly, so let’s get back to the drawing board
Ideas and Future Work
Had some insights?
For my MRPF API, I think I might be pushing too quickly to fit everything into fully predefined structs. However, when reading and writing data, I often only want a subset:
- list all active fqdns of a target id
hmm, is it true? Is this the only real example I’ve found?
Ok, let’s think about the write queries:
- tcp syn scan needs to append ports to an existing ip address
- SNI scanner needs to create new fqdn objects and services (ip/port that the sni was found on)
- Http scanner needs to update the WebApp content hash
- CrtSh needs to create fqdn objects
- DNS resolver needs to update fqdn objects, create new ones found through PTR, or update zones with their NS and SOA records
All these things can be done with my current task/job manager, BUT would these actually be better run continuously? Scans with larger amounts of data can better bypass rate limits due to more randomization. Easier to alert when a new domain has been found?
Templating System
The mrpf_templates crate provides a standalone, configurable template engine used across MRPF for generating permutations of strings from variable data. It powers HTTP request building, DNS domain generation, task definitions, and more.
Core Concepts
A template is a string containing literals and expressions. Expressions are delimited by a configurable prefix/suffix pair — the default is $[ / ], while the HTTP scanners use ${ / }.
Hello ${name}, welcome to ${place}!
Variables can have multiple values. When a template references multiple multi-valued variables, the engine produces the Cartesian product of all combinations.
Architecture
The system has three layers:
- Engine — configures delimiters, functions, reserved variables, post-processors, and nesting depth
- Template — a parsed AST from a template string
- Renderer — a lazy iterator that produces one rendered string per Cartesian combination
Engine::parse("...") → Template
Engine::renderer(template, variables, round_robin) → Renderer (Iterator<Item = String>)
The Engine is wrapped in Arc so multiple Renderer instances can share it across threads. Renderer is Send + 'static, suitable for scanner transmit threads.
Variable Tiers
The template system has three tiers of variables, resolved in this priority order:
1. Regular Variables (Cartesian Product)
Standard variables whose values are expanded as a Cartesian product. If path has 3 values and host has 2 values, the template produces 6 outputs.
Template: GET /${path} HTTP/1.1\r\nHost: ${host}\r\n\r\n
Variables:
path: [admin, login, api]
host: [example.com, test.com]
→ 6 rendered strings (3 × 2)
2. Static Variables (Round-Robin)
User-defined uppercase variables ([A-Z0-9_]+) that cycle through their values per Cartesian output without expanding the product count. Stored in the database as StaticVar with values directly in a TEXT[] column.
Template: ${GREETING} ${name}
Variables (cartesian):
name: [moon, world]
Static variables (round-robin):
GREETING: [hi, hello]
→ 2 outputs (NOT 4):
"hi moon" (GREETING[0])
"hello world" (GREETING[1])
Round-robin wraps around when the Cartesian product is larger than the number of static values:
Variables: name: [a, b, c]
Static: RR: [x, y]
→ "x a", "y b", "x c" (wraps at index 2)
Multiple round-robin variables cycle independently based on their own value counts.
3. Reserved Variables (Builtin Passthrough)
Uppercase keywords like IPV4, PORT, SNI, HOST, CONTENT_LENGTH, and FUZZ that are not resolved during template rendering. They pass through as literal ${IPV4} in the output and are filled later at transmit-time by the scanner’s replace_host_variables().
Template: ${host}:${IPV4}:${PORT}
Variables: host: [example.com]
→ "example.com:${IPV4}:${PORT}"
If the engine has a reserved set configured, referencing an unknown uppercase variable (not in data, not reserved) produces an UnknownReservedVariable error — catching typos like ${PROT} instead of ${PORT}.
Resolution Priority
When the engine encounters a variable reference:
- Data lookup — if the variable exists in the provided data map, use its values
- Reserved passthrough — if the name is in the reserved set, emit it as a literal
- Unknown uppercase error — if a reserved set is configured and the name (e.g. ${NAME}) is all-uppercase, error
- Missing variable error — otherwise, error
This means you can override a reserved variable by providing it in the data map (e.g., providing FUZZ values directly).
Expressions
References
Simple variable reference:
${variable_name}
With JSONPath-style path access:
${object.property}
${array[0]}
${data.items[*].name}
${data.items.*.name}
Functions
Built-in functions: upper, lower, trim, join.
$[upper(name)] → "HELLO", "WORLD"
$[lower(name)] → "hello", "world"
$[trim(name)] → strips whitespace
$[join(names, " - ")] → "a - b - c"
Functions can be nested and take variable references or literal strings as arguments.
Custom functions can be registered on the engine:
engine.register("repeat", Box::new(|args| { ... }));
Literal Strings
Quoted strings inside expressions:
$[join(names, ", ")]
^^^^ literal string argument
Engine Configuration
Use the builder for full control:
let engine = Engine::builder()
    .delimiters("${", "}")              // custom delimiters
    .with_builtins()                    // upper, lower, trim, join
    .reserved(["IPV4", "PORT"])         // passthrough variables
    .post_process(|s| s.to_lowercase()) // transform final output
    .max_depth(5)                       // nested resolution passes
    .build();
Nested Resolution
When max_depth > 1, the engine re-parses rendered output that still contains the prefix delimiter. This allows variables whose values contain further template expressions:
Variables:
greeting: ["${name} from ${place}"]
name: ["Alice"]
place: ["Wonderland"]
Template: ${greeting}
Pass 1 → "${name} from ${place}"
Pass 2 → "Alice from Wonderland"
Resolution stops when output stabilizes or max_depth passes are reached. Post-processors only apply to the final result.
Post-Processors
Transform every rendered string after all resolution passes:
.post_process(|s| s.to_lowercase())
.post_process(|s| format!("[{}]", s))
// Applied in order: "HELLO" → "hello" → "[hello]"
Rendering Modes
render_to_strings — Full Cartesian Set
Returns a HashSet<String> with all combinations materialized:
let results: HashSet<String> = engine.render_to_strings(&template, &data)?;
render_to_string — Joined Single String
Joins multi-valued expressions with a separator (default ,):
let result: String = engine.render_to_string(&template, &data, Some(", "))?;
renderer — Lazy Iterator
For scanners that need to stream combinations without materializing the full product:
let engine = Arc::new(engine);
let renderer = engine.renderer(template, &variables, &round_robin);
// renderer.len() returns total combinations (Cartesian only)
for result in renderer {
    let rendered: String = result?;
    // send to network...
}
The Renderer uses an odometer-style index to advance through combinations lazily. Round-robin variables are filled per iteration without affecting the combination count.
Usage Across MRPF
| Consumer | Delimiters | Features Used |
|---|---|---|
| Task Manager | $[ / ] | Full: functions, nested resolution, JSONPath |
| HTTP/1.1 Scanner | ${ / } | Renderer iterator, reserved vars, round-robin, post-processors |
| HTTP/1.1 Sequencer | ${ / } | Single-render, reserved vars |
| DNS Resolver | ${ / } | Renderer iterator |
| CLI auto-resolve | ${ / } | Reference extraction for API variable lookup |
Crate Structure
shared/mrpf_templates/src/
├── lib.rs # Public API re-exports
├── template.rs # Engine, EngineBuilder, Template, Renderer
├── parser.rs # Template string → AST parser
├── ast.rs # TemplatePart, Expr types
├── data.rs # TemplateData trait
├── functions.rs # FunctionRegistry, built-in functions
├── from_template.rs # FromTemplate trait for type conversions
├── error.rs # Error types
└── transform.rs # transform() function (feature-gated)
Network Engine
The core of my network stack is based on masscan. Here’s a diagram of the three threads used:
Sending Thread
- Construct Ethernet Packet: Create the Ethernet frame with appropriate source and destination MAC addresses.
- Construct IP Packet: Create the IP packet with source and destination IP addresses.
- Generate Sequence Number: Generate a unique sequence number based on the source/destination IP and port pairs.
- Create TCP Packet: Construct the TCP packet with the generated sequence number and other necessary fields.
- Send Packet: Send the constructed packet over the network.
- Send Status: Notify the status report thread that a packet has been sent.
Receiving Thread
- Listen for Incoming Packets: Continuously listen for incoming packets on the network.
- Filter Relevant Packets: Filter out packets that are not relevant based on the unique sequence number.
- Handle Packet: Process the relevant packet (e.g., extract data, acknowledge receipt).
- Send Status: Notify the status report thread that a packet has been received and handled.
Status Report Thread
- Receive Status Updates: Continuously receive status updates from the sending and receiving threads.
- Update Status and Statistics: Update the current status and statistics of the scan based on the received updates.
- Print Status and Statistics: Print the updated status and statistics to the console or log.
graph TD
A[Main Thread] -->|Start| B[Sending Thread]
A -->|Start| C[Receiving Thread]
A -->|Start| D[Status Report Thread]
B --> B1[Construct Ethernet Packet]
B1 --> B2[Construct IP Packet]
B2 --> B3[Generate Sequence Number]
B3 --> B4[Create TCP Packet]
B4 --> B5[Send Packet]
B5 -->|Send Status| D
C --> C1[Listen for Incoming Packets]
C1 --> C2[Filter Relevant Packets]
C2 --> C3[Handle Packet]
C3 -->|Send Status| D
D --> D1[Receive Status Updates]
D1 --> D2[Update Status and Statistics]
D2 --> D3[Print Status and Statistics]
Rate limiting
The transmit_handler function implements a token bucket algorithm to control the rate at which packets are sent. It works as follows:
1. Initialization
   - A token bucket is initialized with a certain number of tokens (RATE_LIMIT_PACKETS_PER_INTERVAL).
   - Each token represents permission to send one packet.
   - The bucket is refilled at regular intervals (RATE_LIMIT_INTERVAL).
2. Packet Transmission
   - For each packet to be sent, the algorithm checks if there are tokens available in the bucket.
   - If tokens are available, a token is consumed and the packet is sent.
   - If no tokens are available, the algorithm waits until the bucket is refilled.
3. Refilling the Bucket
   - The bucket is refilled at a fixed interval (RATE_LIMIT_INTERVAL).
   - When the interval elapses, the bucket is refilled to its maximum capacity (RATE_LIMIT_PACKETS_PER_INTERVAL).
4. Handling Buffer Full Errors
   - If packet transmission fails due to a full buffer (NO_BUFFER_SPACE_AVAILABLE_ERROR), the algorithm waits for a short period (100ms) before retrying.
This algorithm ensures that packets are sent at a controlled rate, preventing network congestion and ensuring fair usage of network resources.
Great reads
This blog post nicely describes the pros and cons of async Rust vs normal threads. It illustrates that async Rust is not always the best choice for every use case.
Obviously we would need to include how masscan works
And the bulk of the masscan code can be found in main.c
Infrastructure
Since we require privileges for raw sockets, we can no longer run on AWS Lambda. However, Fargate with spot pricing could be a good alternative.
We have to think about how to deploy this. AWS Batch could help, or we can create an sqs queue that will hold our tasks, then after pushing our tasks we’ll start up x amount of fargate tasks to retrieve stuff from sqs and kill them after a single job.
AWS Lambda ARM
- 0.0000000017 per 128mb per ms
- 0.0000017 per 128mb per second
- 0.000102 per 128mb per minute
- 0.0000000267 per 2048mb per ms
- 0.0000267 per 2048mb per second
- 0.0016 per 2048mb per minute
- pay per second
- size measured in memory
Fargate spot pricing ARM
- 0,00016429 per vCPU per minute
- pay per minute
- size measured in vCPU, minimum = 2Gb mem
This means Lambda is 0.00016429 - 0.000102 = 0.00006229 CHEAPER per vCPU per minute when we don’t care about memory (128mb Lambda). It is 0.0016 - 0.00016429 = 0.00143571 MORE EXPENSIVE per vCPU per minute when we compare to a 2048mb Lambda.
Let’s say we run 10 tasks for a full hour a day, every day of the month.
With Lambda at 2048mb we would pay 0.0016 * 60 * 10 * 31 = $29.76. With Fargate we would pay 0.00016429 * 60 * 10 * 31 = $3.055794.
UPDATE: the Fargate cost described above is actually still missing the GB-per-hour cost. BUT I also see Fargate tasks have a lower minimum: 0.25 vCPU with 0.5GB.
They have a Fargate pricing calculation example (I don’t think it even uses spot instances!):
5 tasks running on ARM for 10 minutes every day, with 1vCPU and 2GB mem, for the whole month cost a total of $1.02.
That is missing data transfer cost and public ip cost but still, I think we can work with that!! AWS Batch will be great for this as well.
TLS Scraper
Ideas and Future Work
…
HTTP/1.1 Scanner
Ideas and Future Work
…
HTTP/1 Sequencer
The HTTP/1 Sequencer (mrpf_http1_sequencer) executes ordered HTTP/1 requests using raw packet I/O. Each step targets its own endpoint, enabling multi-host sequences. It supports variable extraction between steps and simultaneous request delivery using the first-sequence-sync technique for race condition testing.
Unlike the HTTP/1.1 Scanner which fires requests at scale across many targets, the sequencer executes steps sequentially — extracting data from responses to feed into subsequent requests. This makes it suitable for multi-step attack chains, cross-host authentication flows, and race condition exploitation.
Architecture
The sequencer uses two long-lived threads that persist for the entire sequence. A single datalink channel is opened once and reused across all steps. Before each step, the executor sends a SetTarget control message to the RX thread so it filters packets for the current step’s target.
graph TB
subgraph Executor Thread
E1[Render templates with current variables]
E2[Create connections + send SYN packets]
E3[Forward RX packets via TX]
E4[Wait for RaceReady signals]
E5[Fire trigger packets in tight loop]
E6[Collect step results]
E7[Run extractors + update variables]
E8[Send SequenceStepResult to progress]
end
subgraph RX Thread
R1[Filter incoming ethernet frames]
R2[Drive TCP handshake]
R3[Drive TLS state machine]
R4[Send outgoing packets to executor]
R5[Race: send future byte + signal RaceReady]
R6[Emit completed request/response pairs]
end
subgraph Progress Thread
P1[Collect SequenceStepResults]
end
E1 --> E2 --> E3
E3 --> E4 --> E5 --> E6 --> E7 --> E8
E8 -->|next step| E1
R1 --> R2 --> R3 --> R4
R3 --> R5
R3 --> R6
R4 -->|SendPacket| E3
R5 -->|RaceReady + trigger packet| E4
R6 -->|StepResult| E6
E8 -->|SequenceStepResult| P1
Thread Communication
The RX thread communicates with the executor via RxMessage:
| Message | Direction | Purpose |
|---|---|---|
| SendPacket(Vec<u8>) | RX -> Executor | Outgoing TCP/TLS packet to send via TX |
| RaceReady { syn_cookie, trigger_packet } | RX -> Executor | Connection ready for race trigger |
| StepResult { syn_cookie, request, response } | RX -> Executor | Completed request/response pair |
| Error { syn_cookie, error } | RX -> Executor | Connection-level error |
The executor controls the RX thread via RxControl:
| Message | Direction | Purpose |
|---|---|---|
| SetTarget { ip, port } | Executor -> RX | Update packet filter for the next step’s target |
| Stop | Executor -> RX | Shut down the RX thread gracefully |
Execution Flow
For each step in the sequence:
- Set target — send RxControl::SetTarget to the RX thread with this step’s IP and port
- Render templates with current variables (two-pass rendering using the step’s target)
- Create SequencerConnection(s) with rendered payloads
- Send SYN packet(s) via the datalink TX
- Event loop: forward packets from RX, collect results
- Extract values from responses (single steps only), merge into variable map
- Report SequenceStepResult to progress thread
sequenceDiagram
participant E as Executor
participant RX as RX Thread
participant T as Target
Note over E: Step N begins
E->>T: SYN packet(s)
T->>RX: SYN-ACK
RX->>E: SendPacket (ACK)
E->>T: ACK
Note over RX: TLS handshake (if HTTPS)
RX->>E: SendPacket (ClientHello)
E->>T: ClientHello
T->>RX: ServerHello + Cert + Done
RX->>E: SendPacket (Key Exchange + Finished)
E->>T: Key Exchange + Finished
Note over RX: Encrypt & send HTTP request
RX->>E: SendPacket (encrypted request data)
E->>T: Request data
T->>RX: Response data
RX->>E: StepResult (request, response)
Note over E: Run extractors, update variables
Note over E: Step N+1 begins...
First-Sequence-Sync (Race Conditions)
The key innovation of the sequencer is first-sequence-sync for race condition testing. When a step uses SequenceStepType::Race, all requests are delivered to the server simultaneously with sub-microsecond precision.
Why Raw Packets?
Standard sockets with Barrier synchronization have 1-5ms of jitter between threads. Raw packets through a single datalink TX in a tight loop achieve microsecond-level jitter. Combined with first-sequence-sync, the server’s TCP stack delivers data to ALL connections simultaneously.
How It Works
For N concurrent race requests:
- All N connections complete TCP + TLS handshakes normally
- The full HTTP request is encrypted (TLS) or staged (plain TCP)
- All data is sent except the last 2 bytes per connection
- A “future byte” (the last byte) is sent at TCP sequence number S+1 — the server’s TCP stack buffers it because there’s a 1-byte gap at sequence S
- When all N connections signal “race ready”, N trigger packets (1 byte each at sequence S) are sent in a single tight TX loop
- Each trigger fills the gap, causing TCP reassembly and simultaneous data delivery to the application layer
sequenceDiagram
participant E as Executor
participant T as Target Server
Note over E,T: N connections established + TLS complete
rect rgb(70, 70, 120)
Note over E,T: Hold-back Phase (per connection)
E->>T: Request data (all except last 2 bytes)
E->>T: Future byte at seq S+1 (out-of-order)
Note over T: TCP buffers future byte (gap at seq S)
end
Note over E: All N connections "race ready"
rect rgb(120, 70, 70)
Note over E,T: Trigger Phase (tight loop)
E->>T: Conn 1: trigger byte at seq S
E->>T: Conn 2: trigger byte at seq S
E->>T: Conn N: trigger byte at seq S
end
Note over T: TCP reassembly completes for ALL connections
Note over T: Application receives N requests simultaneously
For TLS connections, withholding even 1 byte of a TLS record prevents the server from decrypting ANY of the HTTP request until the trigger arrives. This makes the synchronization even tighter than plain TCP.
Template System
Templates use a two-pass rendering system:
Pass 1 — User variables via mrpf_templates::Engine:
- ${VAR} syntax with skip_missing=true
- Single value replacement only (no Cartesian product)
- Variables from config + extracted from previous steps
Pass 2 — Host variables via byte-level Aho-Corasick replacement:
- ${IPV4} — target IP address
- ${PORT} — target port
- ${SNI} — TLS Server Name Indication
- ${CONTENT_LENGTH} — computed from body after \r\n\r\n
Extractors
Extractors pull values from HTTP responses and store them as template variables for subsequent steps. They are only available on Single steps (not Race steps).
| Source | Description | Example |
|---|---|---|
| Header(name) | HTTP header value (case-insensitive) | Set-Cookie |
| JsonPath(path) | Dot-notation JSON path | data.token, items[0].id |
| Regex { pattern, group } | Regex capture group | csrf_token=([^;]+), group 1 |
| StatusCode | HTTP status code as string | 200, 302 |
| Body | Entire response body | Full text content |
Connection Types
SequencerConnection supports both plain TCP and TLS, implementing the mrpf_engine::Connection trait:
- Plain TCP: Stages raw HTTP bytes, emits in MSS-sized chunks
- TLS: Uses rustls::UnbufferedClientConnection for raw packet TLS, encrypts HTTP payload into appdata staging
- Race mode: Both types support hold-back of last 2 bytes for first-sequence-sync
File Structure
scanners/mrpf_http1_sequencer/src/
├── lib.rs # Public API: Http1Sequencer
├── config.rs # Http1Sequence, SequenceStep, SequenceStepType, Extractor, SequenceTarget
├── connection.rs # SequencerConnection: Connection impl, TLS+plain, race mode
├── error.rs # Error/Result types
├── executor.rs # Step orchestration, TX forwarding, trigger sending
├── extractor.rs # Response data extraction (header, JSON, regex, etc.)
├── progress.rs # SequencerMessage, SequenceStepResult, ProgressHandler
├── receive.rs # RX thread: packet processing, TLS driving, race signaling
└── template.rs # Two-pass template rendering
Tcp Syn Scanner
Ideas and Future Work
…
DNS Resolver
Ideas and Future Work
…
Whois Resolver
Ideas and Future Work
…
Ideas for Improvement
This is a random collection of ideas I have for improving the network engine. Collected from random notes and thoughts I had lying around everywhere, trying to bring more structure to my notes.
Refactor network stack
Remove dependency on libpnet and re-implement linux and macOS code myself. This will prepare me to choose the most optimal transmission method for the two platforms and for instance use XDP/eBPF to reduce kernel signals.
I suspect the following library has a lot of the code that I could reuse for more flexibility to dig into the way we’re sending packets:
https://github.com/pkts-rs/tappers
This will probably also prevent me from receiving packets not destined for me, as we’ll have a dedicated interface.
I have to look at this part: it uses writev to send a packet. It states that only IP and IPv6 are supported, so I think that means there’s no Ethernet support. At the moment, the only reason for having Ethernet support is to be able to spoof my IP address. Since we’re going to have a dedicated virtual interface, we can just set the IP address of the interface to the one we want to spoof and we should be good to go.
Had a quick look at the race condition attack and that uses ip and tcp, so we should be able to build that already.
It’s also really nice as it already splits the interface into tun(send) and tap(receive) devices. This will make it easier to split it between threads.
Before doing this, I should carefully read this blog post, as it might be that the tun/tap device actually requires MORE syscalls than the current implementation.
If a TUN/TAP device is not the way to go, look at libpnet and re-implement the relevant parts myself.
Idea
This came to me during an outdoor run, the first one in a month or two letting myself go. A good reminder that working out really benefits you and makes life better overall.
LATER UPDATE, I did some work on this: Check the following branch: https://github.com/thiezn/mrpf/compare/main...mm/attempttomovetosendmsg
My idea:
I’m building a very fast and memory-efficient TCP SYN scanner using the Rust pnet library on the datalink layer.
I’ve split transmit and receive of packets into separate threads.
The transmit thread at the moment composes a tcp packet from the ethernet layer up, and then uses the pnet datalink tx channel to send traffic out. The send function is taking up a large percentage of the total memory used.
What if I created a larger buffer where I combine multiple packets, and then finally send that full buffer through the datalink tx channel in one go? Would the network layer below be able to send those packets in one go, and would it perhaps reduce the memory footprint and the switching between user and kernel space?
Clearly think about your response and outline the pros, cons and potential pitfalls.
Copilot’s response (replaced the initial response with a new one from GPT5.2. Perhaps we should re-run the question every time more capable models appear?)
⚠️ Quick safety note (because the same techniques are dual‑use): I’ll focus on generic packet I/O efficiency (buffering, syscall/copy reduction, and memory behavior) and not on tactics for aggressive scanning or evasion.
Short answer
No—you generally can’t concatenate multiple Ethernet frames into one big buffer and send them “in one go” through pnet’s datalink TX channel and expect the kernel/NIC to split them into multiple packets. A datalink send call corresponds to one link-layer frame; the kernel will treat your buffer as one frame, and if it’s larger than MTU/jumbo limits it will typically be dropped or error out. Linux’s packet socket docs explicitly describe TX as queuing the user supplied buffer as a packet to the driver, unmodified. [man7.org]
If your goal is fewer copies/allocations, the more “native” way in pnet is to use build_and_send (construct in-place in a preallocated write buffer), and/or use a kernel interface designed for batching such as PACKET_MMAP / PACKET_TX_RING—but that’s outside what pnet exposes directly. [docs.rs], [kernel.org]
Why concatenation doesn’t work (and what would happen)
1) Frame boundaries matter at L2
Packet sockets / datalink TX is message-oriented: you provide one buffer = one packet. Linux packet sockets say: “When transmitting a packet, the user-supplied buffer should contain the physical-layer header. That packet is then queued unmodified to the network driver…” [man7.org]
So if you concatenate N frames:
- The kernel/NIC won’t “parse” your blob into N frames.
- You’ll instead attempt to transmit one oversized Ethernet frame.
2) MTU/jumbo constraints become a hard limit
Ethernet has an MTU (typically 1500 payload; ~1514/1518 with headers/FCS). If your concatenated “super-frame” exceeds MTU/jumbo settings, you’ll see:
- sendto errors (depending on backend),
- driver drops,
- or silent drops downstream (switch/NIC). [man7.org]
3) Even with jumbo frames, it’s still one frame
Jumbo frames just increase the maximum size of a single frame; they do not turn one write into multiple frames.
Bottom line: concatenation is not a batching mechanism at Ethernet level.
What will help: in-place build + reusable buffers (within pnet)
Use build_and_send instead of send_to
pnet_datalink::DataLinkSender documents two TX paths:
- send_to(packet, ...) may require an additional copy depending on OS/backend. [docs.rs]
- build_and_send(num_packets, packet_size, closure) calls your closure with a mutable slice backed by pnet’s write buffer, letting you build in-place and avoid that copy. [docs.rs], [docs.rs]
That’s the closest thing pnet provides to what you’re trying to do (reduce allocations/copies), without changing the semantics of “one frame per send”.
Pros
- Lower allocation pressure / fewer transient buffers (build into a preallocated write buffer). [docs.rs], [docs.rs]
- Often lower memory footprint in profilers because you’re not allocating per packet in user space (or causing extra copies). [docs.rs]
Cons / pitfalls
- You must size packet_size correctly and keep construction strictly within that slice.
- Your throughput can still be limited by syscall-per-packet and driver/NIC queueing (you’ve reduced copies, not necessarily syscalls).
- Some backends/options are “hints” and may be ignored. [docs.rs]
Tune pnet::datalink::Config for buffer sizing
Config has write_buffer_size and read_buffer_size (default 4096) that influence internal buffering.
While this doesn’t enable “multi-frame send”, it can: [docs.rs]
- prevent “insufficient capacity” paths in build_and_send,
- reduce internal reallocations if you accidentally trigger growth elsewhere. [docs.rs], [docs.rs]
Pitfall: Oversizing these buffers can increase baseline RSS if you create many channels/threads.
If your real goal is fewer syscalls / kernel crossings: use an API that supports batching
If your profiler shows “send” dominating time/memory, the root cause is often syscall/copy overhead and kernel socket buffer churn, not the packet crafting itself.
PACKET_MMAP / TX_RING (Linux): designed for batching & fewer copies
Linux kernel docs describe PACKET_MMAP as a shared ring buffer mapped into user space that can be used to send/receive packets efficiently, reducing syscalls and packet copies; for transmission, multiple packets can be sent through one system call.
A practical explanation also notes you can place multiple packets into the TX ring and then make one syscall to notify the kernel to transmit them. [kernel.org] [csulrong.github.io]
Pros
- Fewer syscalls (batch kick).
- Fewer copies (shared memory ring reduces user↔kernel copying). [kernel.org], [csulrong.github.io]
- Potentially much better CPU efficiency at high packet rates.
Cons / pitfalls
- Complexity: ring management, status flags, alignment, backpressure.
- Portability: Linux-specific (and kernel-version nuances).
- Still not “free”: NIC limits, qdisc, driver behavior can bottleneck.
- Not directly exposed by pnet’s public API (you’d likely need a different crate or custom bindings). (pnet_datalink mentions Linux uses AF_PACKET, but doesn’t claim PACKET_MMAP support.) [docs.rs]
AF_XDP / DPDK / netmap (even faster, even more complexity)
These are the next tier for raw packet TX at very high rates, but they’re much more invasive and typically require:
- pinned memory,
- special driver support,
- significant setup code,
- reduced portability.
(Just flagging the existence; not recommending unless you truly need it.)
“Bigger buffer” idea — pros, cons, pitfalls (explicitly)
✅ Potential pros (in theory)
- If your current code allocates one Vec per packet, switching to a reused buffer (not concatenated) can reduce allocator churn and transient memory.
- If you move to a ring/buffer pool, you can improve cache locality and reduce fragmentation.
❌ Cons / why it won’t accomplish what you think (with pnet)
- Concatenation does not mean multi-frame TX; it means one oversized frame. [man7.org]
- You may hit MTU limits and drop/error. [man7.org]
- Even if you keep the buffer ≤ MTU, you’ve just built one packet, not many.
⚠️ Pitfalls you’re likely to run into
- Silent drops due to size/driver constraints.
- Incorrect framing (Ethernet preamble/FCS are not in your buffer; NIC handles them per frame—another reason it can’t “split”).
- Backpressure behavior changes: buffering many packets in user space can increase latency and burstiness.
- Offload interactions: if you craft checksums yourself, NIC offloads can cause surprises; you may need to manage/disable offloads depending on your environment (generic caution).
Practical recommendations (lowest effort → highest impact)
1) Make sure you’re using build_and_send
This is the intended way in pnet to avoid extra copies vs send_to. [docs.rs], [docs.rs]
2) Reuse packet templates / avoid per-packet allocations
- Build a baseline Ethernet+IPv4+TCP SYN template once in a fixed-size buffer.
- For each packet, only mutate the fields that change (src IP/port, seq, checksum).
- Use a small buffer pool if you need parallelism.
(You’ll typically see big wins if your “send” cost is actually “allocate+copy+send”.)
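The template-reuse recommendation above can be sketched in plain Rust. This is a minimal, illustrative example (not the project's actual code): it prebuilds one Ethernet+IPv4+TCP SYN frame in a fixed-size buffer, then patches only the per-target fields and refreshes the IPv4 header checksum. The offsets assume a standard 14-byte Ethernet header and 20-byte IPv4/TCP headers with no options.

```rust
const FRAME_LEN: usize = 54; // 14 (eth) + 20 (ipv4) + 20 (tcp, no options)

/// RFC 1071 internet checksum over `data`.
fn internet_checksum(data: &[u8]) -> u16 {
    let mut sum = 0u32;
    for chunk in data.chunks(2) {
        let word = if chunk.len() == 2 {
            u16::from_be_bytes([chunk[0], chunk[1]])
        } else {
            u16::from_be_bytes([chunk[0], 0])
        };
        sum += word as u32;
    }
    while sum >> 16 != 0 {
        sum = (sum & 0xFFFF) + (sum >> 16);
    }
    !(sum as u16)
}

/// Build the immutable parts of the SYN frame once.
fn build_template(src_ip: [u8; 4], src_port: u16) -> [u8; FRAME_LEN] {
    let mut f = [0u8; FRAME_LEN];
    f[12..14].copy_from_slice(&0x0800u16.to_be_bytes()); // EtherType: IPv4
    f[14] = 0x45; // version 4, IHL 5
    f[16..18].copy_from_slice(&40u16.to_be_bytes()); // total length: 20 + 20
    f[22] = 64; // TTL
    f[23] = 6; // protocol: TCP
    f[26..30].copy_from_slice(&src_ip);
    f[34..36].copy_from_slice(&src_port.to_be_bytes());
    f[46] = 5 << 4; // TCP data offset: 5 words
    f[47] = 0x02; // TCP flags: SYN
    f
}

/// Patch only the per-target fields and refresh the IPv4 checksum.
/// This is the per-packet work inside a build_and_send closure.
fn patch(frame: &mut [u8; FRAME_LEN], dst_ip: [u8; 4], dst_port: u16, seq: u32) {
    frame[30..34].copy_from_slice(&dst_ip);
    frame[36..38].copy_from_slice(&dst_port.to_be_bytes());
    frame[38..42].copy_from_slice(&seq.to_be_bytes());
    frame[24..26].fill(0); // zero checksum before recomputing
    let csum = internet_checksum(&frame[14..34]);
    frame[24..26].copy_from_slice(&csum.to_be_bytes());
    // TCP checksum (pseudo-header) is omitted from this sketch; compute it
    // the same way, or rely on NIC offload, depending on your setup.
}
```

The `patch` body is exactly the kind of closure body that fits `build_and_send`: no allocation, only in-place mutation of a preallocated slice.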
3) Tune Config.write_buffer_size
Ensure it’s comfortably ≥ your frame size so build_and_send has capacity and doesn’t fall back or fail. [docs.rs], [docs.rs]
4) If you truly need batching across syscalls: switch backend
If the syscall boundary is the bottleneck, the real fix is a batching-capable mechanism like PACKET_MMAP TX_RING on Linux.
This is a design change (and likely not pnet), but it’s the correct way to do “send many packets with one kick”. [kernel.org], [csulrong.github.io]
A couple clarifying questions (so I can give sharper advice)
- Which OS are you targeting (Linux only, or cross-platform)? pnet backends differ a lot. [docs.rs]
- When you say “send is taking a large percentage of total memory used”: do you mean RSS growth, allocations/sec, or heap retained (e.g., from jemalloc/mimalloc stats)?
- Are you currently calling send_to with a freshly allocated Vec<u8> each time, or reusing a buffer?
If you share a small (sanitized) snippet of your TX loop (just buffer creation + pnet send call), I can point out the most likely allocation/copy hotspots and how to restructure them without changing your overall architecture.
Reducing pselect calls
Looking at the flame graph, the receive thread spends quite some time in the pselect system call (about half). Obviously, since it uses a timeout, it could be that there are just no packets to receive. However, looking at the implementation, it seems the pnet code is only reading a single packet when pselect is ready.
Instead, we should try to read more packets to avoid extra pselect calls when more packets are ready.
UPDATE: No actually it seems the code IS reading multiple packets. It’s dependent on the read buffer size of the datalink, so we could perhaps tweak performance a bit by playing around with the buffer size there.
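The "drain after wakeup" idea can be illustrated with a stdlib-only sketch. This is a generic analogy using a nonblocking UDP socket rather than pnet's datalink channel: after the readiness call (pselect in pnet) fires once, keep reading until the socket would block instead of returning to pselect per packet. The `drain` function is hypothetical, not pnet API.

```rust
use std::io::ErrorKind;
use std::net::UdpSocket;

/// Read every packet currently queued on the socket, returning how many
/// were drained. Stops at WouldBlock instead of blocking again.
fn drain(socket: &UdpSocket, buf: &mut [u8]) -> std::io::Result<usize> {
    socket.set_nonblocking(true)?;
    let mut packets = 0;
    loop {
        match socket.recv(buf) {
            Ok(_) => packets += 1, // process the packet here
            Err(e) if e.kind() == ErrorKind::WouldBlock => return Ok(packets),
            Err(e) => return Err(e),
        }
    }
}
```

The same shape applies to a raw packet source: one readiness syscall, then a tight read loop, amortising the wakeup cost across all queued packets.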
TLS parse improvements
Task Manager
The Task Manager is the orchestration layer for all MRPF scan workflows. It coordinates task lifecycle, fan-out/fan-in execution, scheduling, and worker dispatch. It runs as a stateless AWS Lambda triggered by SQS messages and scheduled events.
Core Concepts
Manager – Stateless orchestrator holding a DbClient and QueueClient. Initialized once during Lambda cold start and reused across invocations. Processes incoming SQS messages through a carefully ordered pipeline. Holds no mutable state between invocations.
TaskCollection – A complete workflow definition wrapping a root TaskContainer. Serialized as JSON for SQS transport. The collection’s state is derived from its root container’s state.
TaskContainer – Tree node with Sequential or Parallel execution mode. Children are either nested TaskContainers or Tasks. Child ordering matters in sequential mode. Container state is computed from its children’s states (not stored directly).
Task – Individual work unit. Holds a TaskKind (the definition/payload), TaskState, CompletionPolicy, and a timeout. Task IDs are UUIDv7, generated in code (not by the database) to support idempotent at-least-once SQS delivery.
TaskKind – Enum of all concrete task types: generators (TcpSynGenerator, DnsScanGenerator, TlsScanGenerator), workers (TcpSyn, DnsScan, TlsScan), aggregators (TcpSynAggregator, DnsScanAggregator, TlsScanAggregator), data tasks (InsertData, ListModels, UpsertModels, Filter variants, Conditional), and notifications (ErrorNotification, GenericNotification, TaskResultNotification).
Task State Lifecycle
Pending -> Running -> Succeeded
-> Failed
-> Timeout
-> PendingTaskCreation -> Succeeded (when all child tasks created)
-> PartiallyFailed
Terminal states: Succeeded, Failed, Timeout, PartiallyFailed, Disabled
- Pending – Not yet scheduled.
- Running – Dispatched to a worker and actively executing.
- PendingTaskCreation – Exclusive to generator tasks. The generator has finished producing CreateTask messages but is waiting for the Manager to confirm all child tasks have been inserted (via created_task_ids matching expected_task_count).
- PartiallyFailed – Applied to containers when some (but not all) children failed. Also set when tolerated failures exist.
- Disabled – Skipped entirely. Disabled children do not count toward container state.
- Timeout – The task exceeded its timeout_secs while in Running or PendingTaskCreation state.
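The lifecycle above can be written down as a Rust enum. Variant names come from the document; the two helper methods are illustrative assumptions, not the real mrpf_core API.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum TaskState {
    Pending,
    Running,
    PendingTaskCreation,
    Succeeded,
    Failed,
    Timeout,
    PartiallyFailed,
    Disabled,
}

impl TaskState {
    /// Terminal states as listed above.
    fn is_terminal(self) -> bool {
        matches!(
            self,
            TaskState::Succeeded
                | TaskState::Failed
                | TaskState::Timeout
                | TaskState::PartiallyFailed
                | TaskState::Disabled
        )
    }

    /// Only Running and PendingTaskCreation tasks can time out.
    fn can_timeout(self) -> bool {
        matches!(self, TaskState::Running | TaskState::PendingTaskCreation)
    }
}
```

Encoding the terminal set and the timeout-eligible set as methods keeps the pipeline steps (timeout processing, container state derivation) from each hardcoding their own list.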
Manager Pipeline
The Manager::run method processes messages in a fixed order. The ordering prevents race conditions and ensures each step operates on the most current state.
- parse_messages – Categorize incoming SQS messages into three buckets: CreateTaskCollection, CreateTask, and CompleteTask.
- process_task_timeouts – Query for running/pending-creation tasks that have exceeded their timeout. Set them to Timeout. Returns affected root container IDs.
- create_task_collections – Insert new task collections into the database (containers first, then tasks, in a single transaction with READ COMMITTED isolation).
- create_tasks_from_generators – Insert tasks produced by generator workers. Updates the generator’s created_task_ids array. Looks up the generator’s parent container if no explicit parent_id is provided.
- process_task_completions – Apply CompleteTask messages: update task state in the database (with FOR UPDATE row lock), optionally store task data. Always adds the root container to the re-evaluation set, even on error.
- complete_generator_tasks – Find all tasks in PendingTaskCreation state where created_task_ids length equals expected_task_count. Transition them to Succeeded. Uses a recursive CTE to resolve root container IDs.
- schedule_next_tasks – For each root container that had activity, reconstruct the full TaskCollection from the database (using REPEATABLE READ isolation), walk the tree to find pending tasks, set them to Running, and dispatch to the appropriate worker queue. If bare metal tasks are dispatched, launch EC2 instances.
- finalize_task_collections – Delete completed collections from task_manager tables (CASCADE handles descendants). Update recon.job_history with final state and completion timestamp.
- start_due_task_collections – Query recon.jobs for eligible jobs (not running, recurrence not 0) using FOR UPDATE ... SKIP LOCKED. Evaluate cron schedules. Insert job_history rows, decrement recurrence, and send CreateTaskCollection messages back to the Manager queue.
Fan-out/Fan-in Pattern
Distributed tasks follow a three-stage pattern within a sequential container:
- Generator (e.g., TcpSynGenerator) – Reads input parameters, partitions work, and produces N worker tasks plus 1 aggregator task. Sends CreateTask messages to the Manager queue. Transitions to PendingTaskCreation with an expected_task_count.
- Worker tasks (e.g., TcpSyn) – Execute in parallel within a parallel container. Each stores intermediate results in task_manager.task_data.
- Aggregator (e.g., TcpSynAggregator) – Runs after all worker tasks complete. Reads results from task_data/task_collection_data, combines them, and writes final output.
The sequential container ensures the aggregator waits for all parallel workers to reach a terminal state before starting.
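The fan-out arithmetic a generator performs can be sketched as follows. `partition` and the fixed chunking strategy are illustrative only; the real generators partition IP ranges, domains, etc. according to their own parameters.

```rust
/// Split the targets into fixed-size chunks, one chunk per worker task,
/// and report expected_task_count: N worker tasks plus 1 aggregator.
fn partition(targets: &[u32], chunk_size: usize) -> (Vec<Vec<u32>>, usize) {
    let workers: Vec<Vec<u32>> = targets
        .chunks(chunk_size)
        .map(|chunk| chunk.to_vec())
        .collect();
    let expected_task_count = workers.len() + 1; // + the aggregator task
    (workers, expected_task_count)
}
```

The generator would send one CreateTask per chunk (plus the aggregator) and then enter PendingTaskCreation with this `expected_task_count`, so the Manager knows when all children have been inserted.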
CompletionPolicy
Each task carries a CompletionPolicy that controls sequential container behavior on failure:
- FailOnFailure (default) – A failed task stops the sequential container. The container propagates the failure.
- ContinueOnPartialFailure – Continue if the task partially fails (some sub-tasks failed).
- ContinueOnFailure – Skip the failed task and proceed to the next child. The failure is tolerated but tracked: the container enters PartiallyFailed at completion if any tolerated failures occurred.
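The policy rules above can be condensed into a single decision function. Names mirror the document; the function shape and the `Next` outcome type are assumptions, not the real Task Manager code.

```rust
/// What a sequential container does after one child reaches a final state.
#[derive(Debug, PartialEq)]
enum Next {
    Proceed,          // move on to the next child
    ProceedTolerated, // move on, but mark the container PartiallyFailed later
    StopContainer,    // propagate the failure
}

enum ChildState {
    Succeeded,
    Failed,
    PartiallyFailed,
}

enum Policy {
    FailOnFailure,
    ContinueOnPartialFailure,
    ContinueOnFailure,
}

fn after_child(state: &ChildState, policy: &Policy) -> Next {
    match (state, policy) {
        (ChildState::Succeeded, _) => Next::Proceed,
        // A fully failed child is only tolerated by ContinueOnFailure.
        (ChildState::Failed, Policy::ContinueOnFailure) => Next::ProceedTolerated,
        // A partial failure is tolerated by both Continue* policies.
        (
            ChildState::PartiallyFailed,
            Policy::ContinueOnPartialFailure | Policy::ContinueOnFailure,
        ) => Next::ProceedTolerated,
        _ => Next::StopContainer,
    }
}
```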
Concurrency Safety
- The Manager is the single point of coordination. Workers never modify task state directly in the database.
- Workers send CompleteTask messages via SQS. The Manager applies state transitions.
- Task state updates use SELECT ... FOR UPDATE row locks to prevent concurrent modifications across Lambda invocations.
- start_due_jobs uses FOR UPDATE ... SKIP LOCKED so concurrent Manager invocations do not double-start the same job.
- Task collection reads use REPEATABLE READ, READ ONLY transaction isolation for a consistent snapshot during tree assembly.
- Task collection writes use READ COMMITTED, READ WRITE isolation.
SQS Message Types
TaskManagerQueueMessage (Manager queue):
- CreateTaskCollection { collection } – Full workflow definition to insert and start.
- CreateTask { task, generator_id, parent_id } – New task to insert, generated by a running generator.
- CompleteTask { task_id, state, root_container_id, data, expected_task_count } – Worker completion notification.
WorkerQueueMessage (Worker queues):
- StartTask { root_container_id, task } – Dispatched by the Manager to a worker queue.
Infrastructure
The system leverages AWS services for scalability and security, with SQS for event-driven invocations.
%%{init: {
"theme": "base",
"themeVariables": {
"primaryColor": "#0b5cab",
"primaryTextColor": "#ffffff",
"lineColor": "#6b7280",
"tertiaryColor": "#eef2ff",
"fontFamily": "Segoe UI, Roboto, Helvetica, Arial, sans-serif"
},
"flowchart": { "diagramPadding": 8, "curve": "basis" }
}}%%
flowchart TD
%% ---------------------------
%% Nodes (defined first)
%% ---------------------------
EB[EB Scheduler]
TM[Task Manager]
WP1[Workers]
WP2[Workers]
SQS[SQS Queue]
TT[(Task Table)]
DT[(Data Table)]
ST[(Statistics Table)]
%% ---------------------------
%% Subgraphs / groupings
%% ---------------------------
subgraph PG[PostgreSQL]
ST
TT
DT
end
subgraph VPC[AWS VPC]
TM
PG
WP2
end
subgraph EXT[External]
WP1
end
%% ---------------------------
%% Edges
%% ---------------------------
EB -->| Check task timeouts
5 min | SQS
EB -->| Gather statistics
1 hour | SQS
EB -->| Cleanup old tasks and data
1 day | SQS
WP1 -->| Push completion | SQS
WP2 -->| Push completion | SQS
SQS -->| Trigger invoke | TM
TM -->| Store statistics | ST
TM -->| Manage tasks | TT
TM -->| Mutate data | DT
WP2 -->| Mutate data | DT
TM -->| Dispatch tasks | WP1
TM -->| Dispatch tasks | WP2
%% ---------------------------
%% Styling
%% ---------------------------
%% Database tables as cylinders (already set by [( )]); add color:
classDef db fill:#89CFF0,stroke:#0096FF,stroke-width:1px,color:#3b2f00;
class ST,TT,DT db;
classDef rounded rx:8,ry:8,stroke:#2b5fab,stroke-width:1.2px,fill:#0b5cab,color:#ffffff;
class TM,WP1,WP2 rounded;
classDef roundedInfra rx:8,ry:8,stroke:#F36717,stroke-width:1.2px,fill:#E25606,color:#ffffff;
class EB,SQS roundedInfra;
%% Subgraph backgrounds & borders
style PG fill:#fff6e5,stroke:#ff8c00,stroke-width:2px,rx:10,ry:10
style VPC fill:#f0f7ff,stroke:#0b5cab,stroke-width:1.5px,rx:10,ry:10
style EXT fill:#f7f7f7,stroke:#9ca3af,stroke-width:1px,rx:10,ry:10
%% Links (edges)
linkStyle default stroke:#6b7280,stroke-width:2px
Tasks and Containers
Data Management
Each task collection run maintains temporary state in the task_data table. This allows tasks to build on previous outputs. The task templating variables allow you to define how data is passed between tasks.
Workers
Workers execute tasks dispatched by the Manager. Three worker types exist, differentiated by environment, capabilities, and queue.
Worker Types
| Type | Environment | DB Access | Raw Sockets | Queue |
|---|---|---|---|---|
| External | Lambda (outside VPC) | No | No | External SQS |
| Internal | Lambda (inside VPC) | Read/Write task_data + recon | No | Internal SQS |
| Bare Metal | EC2 (inside VPC) | Read/Write task_data + recon | Yes | Bare Metal SQS |
External Workers
Run on AWS Lambda outside the VPC. No database access. No raw socket capabilities.
Used for tasks that call external services: crt.sh, Censys, Shodan, and similar public APIs. Results are returned to the Manager via CompleteTask messages with an optional data payload. The Manager writes this data into task_manager.task_data on the worker’s behalf.
Internal Workers
Run on AWS Lambda inside the VPC. Direct read/write access to task_manager.task_data, task_manager.task_collection_data, and recon tables.
Used for data mutations, filtering, aggregation, model upserts, and any task that needs to query or modify stored results without raw socket access. Examples: Filter variants, ListModels, UpsertModels, InsertData, aggregator tasks, Conditional.
Bare Metal Workers
Run on EC2 instances inside the VPC. Full database access plus raw socket capabilities.
Used for TCP SYN scanning, custom TLS scanning, DNS scanning, and any task requiring raw packet construction. These are the only workers that can run the mrpf_engine-based scanners.
Lifecycle:
- Launched on-demand by the Manager via EC2 RunInstances when bare metal tasks are dispatched.
- Uses a launch template (mathijs-worker-ami) with the $Latest version.
- On startup, the worker polls the Bare Metal SQS queue. If no messages arrive within a configurable timeout (environment variable), the worker shuts itself down to minimize costs.
- Maximum 5 concurrent instances (capped by MAX_BARE_METAL_WORKERS).
Capacity handling:
- The Manager accepts an optional comma-separated list of subnet IDs (BARE_METAL_WORKER_SUBNETS env var) across different availability zones.
- It iterates through instance type alternatives (None/launch-template default, t4g.micro, t4g.small, c8gd.medium, c7g.medium) and subnets, attempting each combination until one succeeds or all are exhausted.
- InsufficientInstanceCapacity errors trigger fallback to the next subnet/instance-type combination.
Task Routing
The Manager routes each task based on WorkerRequirements declared by its TaskDefinition:
requires_raw_sockets() -> Bare Metal queue
requires_internal_worker() -> Internal queue
otherwise -> External queue
The routing logic in detail:
- If WORKER_REQUIREMENTS contains RawSocketAccess -> Bare Metal queue.
- If WORKER_REQUIREMENTS contains TaskDataAccess or ReconDataAccess (and NOT RawSocketAccess) -> Internal queue.
- Otherwise -> External queue.
WorkerRequirements
Each task definition declares a static WORKER_REQUIREMENTS slice:
- TaskDataAccess – Direct database access to task_manager.task_data and task_manager.task_collection_data.
- QueueAccess – Access to the Task Manager SQS queue (for generator tasks that produce CreateTask messages).
- ReconDataAccess – Access to the recon database tables (for reading/writing models like targets, domains, endpoints).
- RawSocketAccess – Ability to send/receive raw packets (TCP SYN, custom TLS handshakes).
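The routing rules follow directly from the requirements list. A sketch in Rust; the enum and variant names follow the document, while the function shape is an assumption:

```rust
#[derive(PartialEq)]
enum WorkerRequirement {
    TaskDataAccess,
    QueueAccess,
    ReconDataAccess,
    RawSocketAccess,
}

#[derive(Debug, PartialEq)]
enum WorkerQueue {
    BareMetal,
    Internal,
    External,
}

/// Route a task to a worker queue based on its static requirements.
/// RawSocketAccess dominates; data access implies an in-VPC worker.
fn route(reqs: &[WorkerRequirement]) -> WorkerQueue {
    use WorkerRequirement::*;
    if reqs.contains(&RawSocketAccess) {
        WorkerQueue::BareMetal
    } else if reqs.contains(&TaskDataAccess) || reqs.contains(&ReconDataAccess) {
        WorkerQueue::Internal
    } else {
        WorkerQueue::External
    }
}
```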
Task generation and aggregation
The task manager allows for dynamic generation of distributed tasks based on data captured during a task collection run.
For example, we might capture new domain names from scraping TLS certificates of known hosts on a target. We can then feed these new domains into a task generator for DNS resolution. The task generator will create the required tasks in the running task collection, adding them to a destination container.
These dynamically generated tasks typically run in parallel across multiple workers to avoid rate limits and speed up the process. The outputs of these tasks all have the same shape, and often we want to aggregate the results into a single output destination. For this we introduce task aggregators. An aggregator is also created by the generator task and put in the queue of the task collection, to run after the generated tasks are done. The aggregator then collects all the results from the generated tasks and aggregates them into a single output.
Ideas and Future Work
…
Templating
Our Task Manager has a built-in templating engine that allows you to provide static data, reference data stored in the task collection database table, and use functions for basic data transformations.
The rendered result of a template is a DataKind, which is always a collection of values of the same type. Even if there is just a single value, it will be wrapped in a set.
Literals
The most basic type of expression is a literal. This allows you to provide a static value directly to a task argument. Some examples are:
- String literal:
some_literal_string - Integer literal:
42
References
References allow you to retrieve data from various places in the system or task definition. The basic syntax for a reference is as follows:
$[<reference_type>:<key>]
Where <reference_type> indicates the type of the reference and <key> is the identifier of the specific data you want to reference. At the moment we support Data references and Task Parameter References. Since data references are the most commonly used, the reference type can be omitted and it will default to a data reference.
Data References
Data references allow you to reference data stored in the task collection’s data storage. During runtime of a task, these references will be resolved to their actual values from the database.
The basic syntax for a data reference is as follows:
$[some_key]
Where some_key is the key of the data within a task collection you want to reference. When providing an expression to a task argument, the task manager will evaluate the expression at runtime and replace it with the corresponding value from the task collection data storage (retrieved from the PostgreSQL database).
A lightweight version of JSONPath syntax is supported for accessing nested data structures. Here are some examples:
-
Retrieve a nested key
$[another_key.hello] -
Access an array element by index
$[another_key.hello[0]]
TODO: This example doesn’t match my another_key data example, fix it
- Return an array of values from an array of objects
$[another_key.how.are.[*].you]or$[another_key.how.*.you]
Our database supports a variety of data types, including set of strings, JSON objects and predefined common models like IPv4 Ranges and Domains. When referencing data, the task manager will automatically handle type conversions as needed to ensure that the data is in the correct format for the task argument.
Here is an example of some data that might be stored in a task collection. Specific object properties can be extracted through the JSONPath syntax mentioned above.
| key | kind | value |
|---|---|---|
| some_key | set of strings | [“some”, “string”] |
| another_key | generic_object | {"Hello": ["World", "Moon"], "how": {"are": [{"you": "doing"}, {"things": "going"}]}, …} |
| some_ips | ipv4_ranges | [{“start”: “127.0.0.1”, “end”: “127.0.0.1”}, …] |
| some_domains | set of domains | [{"fqdn": "example.com", "is_active": true, "dns_chain": ["a.com", ["10.0.0.1", "10.0.0.2"]]}, …] |
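The JSONPath-lite lookup described above can be sketched over a hand-rolled value type (the real engine resolves against data in PostgreSQL, and this sketch omits the `[*]`/`*` wildcard forms):

```rust
use std::collections::HashMap;

#[derive(Clone, Debug, PartialEq)]
enum Value {
    Str(String),
    Array(Vec<Value>),
    Object(HashMap<String, Value>),
}

/// Resolve a dotted path like "hello" or "how.are" with optional "[N]"
/// index segments, e.g. "hello[0]". Returns None on any mismatch.
fn resolve<'a>(mut v: &'a Value, path: &str) -> Option<&'a Value> {
    for seg in path.split('.') {
        // Split "key[2]" into the key part and the optional index part.
        let (key, idx) = match seg.find('[') {
            Some(i) => (&seg[..i], seg[i + 1..seg.len() - 1].parse::<usize>().ok()),
            None => (seg, None),
        };
        if !key.is_empty() {
            v = match v {
                Value::Object(map) => map.get(key)?,
                _ => return None,
            };
        }
        if let Some(i) = idx {
            v = match v {
                Value::Array(items) => items.get(i)?,
                _ => return None,
            };
        }
    }
    Some(v)
}
```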
Task Parameter References
In certain cases you want to be able to reference parameters defined within the task itself. This can be especially handy if you want to re-use standard values for different arguments.
One example of this is when performing HTTP requests. Often a simple GET request will have the same content. It might look something like this:
GET / HTTP/1.1
Host: example.com
Let’s say we want to fuzz across different SNI values. We could construct a standard body and use self-referencing data variables to insert the SNI value in both the TLS SNI field and the Host header.
- kind: http1
ip: 10.0.0.1
port: 443
tls: true
sni: $[host_fuzzing]
content: |
GET / HTTP/1.1
Host: $[task:sni]
Combining this with our data references, we could create standard content payloads and re-use them across multiple tasks.
- insert_data:
key: http_get_content
kind: string
value: |
GET / HTTP/1.1
Host: $[task:sni]
- kind: http1
ip: 10.0.0.1
port: 443
tls: true
sni: $[host_fuzzing]
content: $[http_get_content]
...
Combined literals and references
You can combine literals and data references within a single expression. When doing so, the task manager will evaluate the entire expression and produce a set of values based on all possible combinations of the literals and referenced data.
Let’s take the following example:
- Combining a literal with a data reference
prefix_$[some_key]_suffix
If some_key contains the values ["some", "string"], the resulting set of values would be:
- prefix_some_suffix
- prefix_string_suffix
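The expansion behaviour can be sketched as a small cross-product over every `$[key]` reference in the template. This is an illustrative stand-in for the real templating engine, with resolved data passed in as a plain map:

```rust
use std::collections::HashMap;

/// Expand a template like "prefix_$[some_key]_suffix": each `$[key]` is
/// replaced by every value stored under `key`, and the result is the
/// cross-product across all references, surrounding literals included.
fn expand(template: &str, data: &HashMap<&str, Vec<String>>) -> Vec<String> {
    let mut results = vec![String::new()];
    let mut rest = template;
    while let Some(start) = rest.find("$[") {
        let end = rest[start..].find(']').expect("unterminated reference") + start;
        let literal = &rest[..start];        // text before the reference
        let key = &rest[start + 2..end];     // name inside $[...]
        let values = &data[key];
        let mut next = Vec::with_capacity(results.len() * values.len());
        for prefix in &results {
            for value in values {
                next.push(format!("{prefix}{literal}{value}"));
            }
        }
        results = next;
        rest = &rest[end + 1..];
    }
    // Append the trailing literal after the last reference.
    results.into_iter().map(|s| format!("{s}{rest}")).collect()
}
```

With multiple references in one template, the same loop naturally produces all combinations, which is exactly the behaviour the HTTP fuzzing examples below rely on.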
Functions
In addition to simple data references, the expression syntax supports a variety of built-in functions that can be used to manipulate and transform data. Here are some examples:
- capitalize($[some_data_key]): Converts a string to uppercase.
- split($[some_data_key], "."): Splits a string into an array based on the specified delimiter (in this case, a period).
Examples
Here are some more complex examples that combine literals, data references and functions:
Retrieving and transforming data
This expression first splits the string retrieved from some_data_key at each period, takes the first element of the resulting array, and then capitalizes it.
capitalize(split($[some_data_key], ".")[0])
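For a single resolved value, the composed expression amounts to this one-liner (using the document's definition of capitalize as uppercasing; the function name here is just for illustration):

```rust
/// Split on the delimiter, take the first element, uppercase it --
/// i.e. what capitalize(split($[some_data_key], ".")[0]) does to one value.
fn capitalize_split_first(value: &str, delim: &str) -> String {
    value.split(delim).next().unwrap_or("").to_uppercase()
}
```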
Accessing nested data
This expression retrieves an array of values from the property field of each object in the array_key array within the some_data_key data structure.
$[some_data_key.array_key[*].property]
We also support the alternative wildcard JSONPath syntax:
$[some_data_key.array_key.*.property]
Applying templates to task definitions
Here’s an example of how these expressions might be used in the context of a TCP SYN task generator:
- kind: tcp_syn_generator
ipv4_address_ranges: "$[target_ip_ranges]"
ports:
- 80
- "$[https_ports]"
The result of rendering this template would be a set of TCP SYN tasks, each with a specific IP address from the target_ip_ranges data reference and a destination port that is either 80 or one of the ports specified in the https_ports data reference.
Note that templates are always converted to a set of values, even if the result is a single value. This ensures consistency in how task arguments are handled.
In the example above, if https_ports contains the values [443, 8443], the resulting task ports variable would contain a single set of integers {80, 443, 8443}. ipv4_address_ranges would contain a set of all individual IP addresses derived from the provided ranges.
For ease of use, parameters for task templates can be either a single expression or an array of expressions. If an array is provided, the results of each expression will be combined into a single set.
For example, the following two configurations are equivalent:
ipv4_address_ranges: "$[target_ip_ranges]"
ipv4_address_ranges:
- "$[target_ip_ranges]"
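The merge-into-one-set behaviour from the ports example can be sketched as follows, with reference resolution stubbed out (the literals and the already-resolved reference values are simply passed in):

```rust
use std::collections::BTreeSet;

/// Merge literal values and resolved reference values into a single set,
/// mirroring how `ports: [80, "$[https_ports]"]` renders to {80, 443, 8443}.
fn merge_ports(literals: &[u16], resolved: &[Vec<u16>]) -> BTreeSet<u16> {
    literals
        .iter()
        .copied()
        .chain(resolved.iter().flatten().copied())
        .collect()
}
```

Using a set (rather than a list) also gives the deduplication the templating engine guarantees: a port contributed by both a literal and a reference appears once.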
HTTP fuzzing examples
When performing HTTP fuzzing, we often want to iterate over several different data variables (e.g. wordlists) and generate all possible combinations. Let’s see how this can be achieved with our templating engine.
Let’s say we have the following data stored in our task collection:
| key | kind | value |
|---|---|---|
| path_traversal | array of strings | [“../”, “.;/”] |
| paths | array of strings | [“/admin”, “/login”] |
| hostnames | array of strings | [“localhost”, “127.0.0.1”] |
We can then define an HTTP fuzzing task template like this:
- kind: http_fuzz
method: GET
host: target.com
sni: target.com
body: |
GET /$[path_traversal]/$[paths] HTTP/1.1
Host: $[hostnames]
When this template is rendered, the task manager will generate a set of HTTP fuzzing tasks that covers all combinations of the provided path traversal strings, paths and hostnames. This would generate 2 (path_traversal) * 2 (paths) * 2 (hostnames) = 8 unique body payloads:
GET /../admin HTTP/1.1
Host: localhost
GET /.;/admin HTTP/1.1
Host: localhost
GET /../login HTTP/1.1
Host: localhost
GET /.;/login HTTP/1.1
Host: localhost
GET /../admin HTTP/1.1
Host: 127.0.0.1
GET /.;/admin HTTP/1.1
Host: 127.0.0.1
GET /../login HTTP/1.1
Host: 127.0.0.1
GET /.;/login HTTP/1.1
Host: 127.0.0.1
Note that we could have taken this further by also templating the host and sni fields, but this should give you an idea of how powerful the templating engine can be when combined with data stored in the task collection.
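The combination expansion above can be sketched with three nested loops; `expand_bodies` is a hypothetical helper for illustration, not the engine’s real API:

```rust
// Expand every combination of the three data variables into a request
// body, in the same order as the eight payloads listed above
// (hostname varies slowest, then path, then path-traversal string).
fn expand_bodies(traversals: &[&str], paths: &[&str], hosts: &[&str]) -> Vec<String> {
    let mut bodies = Vec::new();
    for h in hosts {
        for p in paths {
            for t in traversals {
                bodies.push(format!(
                    "GET /{t}{p} HTTP/1.1\r\nHost: {h}\r\n\r\n",
                    // strip the leading '/' from the path so the rendered
                    // request matches the payloads shown above
                    p = p.trim_start_matches('/'),
                ));
            }
        }
    }
    bodies
}
```

Calling it with the table data from above produces the same `2 * 2 * 2 = 8` bodies, starting with `GET /../admin HTTP/1.1` against `localhost`.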
Examples
Here are some examples of task collection structures to illustrate different use cases. Note that this is the YAML representation of how developers would define these task collections. The MRPF API will convert these definitions into a slightly different internal JSON representation to allow for features like task generators, data aggregations, conditionals, and loops.
- name: Example Task Collection
description: An example task collection demonstrating various features.
tasks:
- sequential:
- kind: get_target
target_id: "victim"
output: target
- parallel:
- kind: tcp_syn_scanner
ipv4_address_ranges: "$[target.ip_ranges]"
ports: "80,443,8080-8089"
output: open_ports
- kind: dns_lookup
domains: "$[domains[*].fqdn]"
record_types: A
output: domains
- if:
- when:
contains: { var: "$[open_ports]", value: "443" }
then:
- kind: http_fuzz
method: GET
host: "$[target.domain]"
sni: "$[target.domain]"
tls: true
content: |
GET $[paths] HTTP/1.1
Host: $[target.domain]
output: http_responses
- when:
contains: { var: "$[open_ports]", value: "80" }
then:
- kind: http_fuzz
method: GET
host: "$[target.domain]"
tls: false
content: |
GET $[paths] HTTP/1.1
Host: $[target.domain]
output: http_responses
- else:
- kind: notification
message: "No HTTP ports open on $[target.domain], skipping HTTP fuzzing."
- if:
- when:
non_empty: "$[http_responses]"
then:
- loop:
condition: "$[http_responses[*].status_code]"
do:
- kind: notification
message: "Received status code $[item] from $[target.domain]"
- else:
- kind: notification
message: "No HTTP responses for $[target.domain]"
CLI Tool
The MRPF CLI (mrpf) is a command-line interface to all MRPF scanners. It provides subcommands for each scanner type, a TOML-based configuration system, flexible input/output options, and API integration for notifications.
Architecture
The CLI follows a clean separation between argument parsing, configuration, and scanner execution:
graph TB
A[main.rs] -->|parse args| B[cli.rs<br>Clap derive structs]
A -->|load config| C[config.rs<br>TOML + CLI merge]
A -->|dispatch| D[commands/]
C -->|build| E[ScannerConfig]
D --> F[dns.rs]
D --> G[whois.rs]
D --> H[tcpsyn.rs]
D --> I[tls.rs]
D --> J[http.rs]
D --> K[generate.rs]
D --> L[notify.rs]
F & G & H & I & J -->|use| E
F & G & H & I & J -->|write| M[output.rs<br>console / JSON file]
L -->|uses| N[mrpf_api_client]
Subcommands
| Command | Purpose | Requires root |
|---|---|---|
| `mrpf dns` | DNS resolution (A, AAAA, CNAME, MX, etc.) | Yes |
| `mrpf whois` | WHOIS domain lookups | Yes |
| `mrpf tcpsyn` | TCP SYN port scanning | Yes |
| `mrpf tls` | TLS/SNI certificate discovery | Yes |
| `mrpf http` | Templated HTTP/1.1 requests | Yes |
| `mrpf generate` | Generate sample JSON input files or config | No |
| `mrpf notify` | Send notification via MRPF API | No |
Configuration
The CLI uses a layered configuration model where CLI arguments always take priority:
CLI flags > config file > auto-detected defaults
Config file location
- Default: `~/.mrpf/config.toml`
- Override: `mrpf --config /path/to/config.toml <command>`
Only the [cli] section is read by mrpf_cli.
Config file format
[cli]
interface = "en0"
# src_ip = "192.168.1.100" # optional — auto-detected from interface
# router_ip = "192.168.1.1" # optional — auto-detected from interface
rate_limit = 1000 # packets per second
base_url = "https://api.mrpf.example.com" # used by `mrpf notify`
api_key = "your-api-key-here" # used by `mrpf notify`
The ScannerConfig is built by resolving the interface, extracting its MAC and IPv4 address, and applying any overrides from the config file or CLI flags. When src_ip or router_ip are not explicitly configured, they default to the interface’s actual IP and the first IP in its subnet respectively.
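That precedence chain can be sketched as a merge of optional layers; the `ConfigLayer` type and `resolve` function below are hypothetical, simplified from the real `ScannerConfig` build:

```rust
// One source of configuration; every field is optional so the
// layers can be merged with Option::or.
#[derive(Debug, Default, Clone, PartialEq)]
struct ConfigLayer {
    interface: Option<String>,
    src_ip: Option<String>,
    rate_limit: Option<u32>,
}

// CLI flags win over the config file, which wins over values
// auto-detected from the interface.
fn resolve(cli: ConfigLayer, file: ConfigLayer, detected: ConfigLayer) -> ConfigLayer {
    ConfigLayer {
        interface: cli.interface.or(file.interface).or(detected.interface),
        src_ip: cli.src_ip.or(file.src_ip).or(detected.src_ip),
        rate_limit: cli.rate_limit.or(file.rate_limit).or(detected.rate_limit),
    }
}
```

`Option::or` keeps the first `Some` it sees, which is exactly the “first layer that sets a value wins” behavior described above.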
Input / Output
Targets
All scanner subcommands accept targets in two ways:
- CLI positional arguments: `mrpf dns example.com google.com`
- JSON input file: `mrpf dns -f dns_input.json`
The mrpf generate <type> command creates sample input files pre-filled with sensible defaults for each scanner type.
Output
- Console (default): Pretty-printed JSON to stdout
- JSON file:
--output json --output-file results.json
API Integration
The notify subcommand uses mrpf_api_client to send notifications to the MRPF API via PUT /notifications. This enables the CLI to report scan results or status updates that trigger push notifications to the Apple app.
The API client uses reqwest with rustls-tls and authenticates via the x-api-key header.
Design Decisions
Why clap derive over builder
Clap’s derive API keeps argument definitions close to their types, making it easy to add new subcommands by creating a new file in commands/ with its Args struct. The Commands enum in cli.rs is the single point where all subcommands are registered.
Why TOML for config
TOML is simple, human-readable, and works well with serde while allowing multiple services to share a single ~/.mrpf/config.toml via separate sections (e.g. [cli], [scanner], [proxy]).
Why one-shot tokio runtime for notify
The scanner subcommands are synchronous (they use the engine’s thread-based architecture). Only the notify command needs async (for reqwest). Rather than making the entire binary async, a one-shot tokio::runtime::Runtime is created only when needed.
The Apple universal iOS/macOS app
To make it easier to work with my recon data and task scheduler I’ve created a universal iOS/macOS app in Swift. It provides a nice frontend for all my tools and data.
The alternative was to build some kind of web frontend, but truth be told, I just don’t enjoy writing JavaScript. Swift, and especially SwiftUI, feels a lot more fun and rewarding to build with, and can offer a much more focused user experience on phones.
The current iteration works with the older MPF API and task manager, but I want to move this to the new MRPF API and task manager once I’ve built that out a bit more.
HTTP Repeater/scanner
The MRPF Scanner API provides a WebSocket interface to the various scanners built on top of the MRPF network engine. The current macOS app can interact with it, and I’m working towards functionality similar to the Repeater in Caido and Burp. Instead of making single requests, I’ve built the templating engine into it, similar to how I’m constructing the task manager. This makes fuzzing easier; I guess it’s more akin to Burp’s Intruder.
I feel I should be able to find a better UI balance than Caido and Burp, and a mixture between the request and Intruder tabs is getting me almost there. I need to iterate more. Another thing that would really help is more mature wordlist generation from within the app. The killer feature will be bringing together the task manager for scanning, all the collected data, and the repeater/inspector in one app.
What’s the current status?
The iOS/macOS app needs work; it would be really nice to give it a big refactor, and I want to leverage the latest macOS 26 version. This also introduces Copilot directly in Xcode, which should help me learn Swift and best practices a lot faster. The time to be a solo developer is now: I can finally build everything myself if I just manage to keep focus on the things I really want to move forward.
- Better handling of textarea in my ‘burp’ mimicking feature
- Revisit the job template composition. There’s a bunch of inefficient, strange code that I think I can make more ergonomic in Swift. All that casting, generics, and Codable stuff is a mess.
- Fully buy into the two-column Split View and align the macOS design with Liquid Glass. An alternative might be to switch completely to a TabView design. Apparently on iPadOS this tab view now transforms into a sidebar automatically; I’m not sure if this carries over to macOS as well.
- macOS works OK-ish, but iOS is lagging behind. It probably needs a few different design patterns to work well on the platform, and some actions might just not be suited for a phone.
- Make a more robust wordlist section. Apple’s easy integration with language models can be especially helpful here to generate new wordlists on the fly. I also need to dig into the wordlist problem a lot deeper and take it to a more professional level: supporting different encodings, rating words by potential impact, linking things across targets, efficient storage and retrieval in the database, full integration with the templating engine, etc.
MRPF Universal Application Architecture
The app’s architecture is a modular, concurrency-safe design for a universal SwiftUI app using Swift 6 features (async/await, actors for isolation). It decouples UI from data/network/navigation/error logic, emphasizes non-blocking operations, and supports extensibility for new models/endpoints. Key updates incorporate reusable ErrorPublisher instances for both global and scoped error handling, addressing multi-window requirements on macOS (e.g., independent scanner errors per window, shared REST errors across all). AppContext centralizes initialization and propagation, simplifying complex setups across platforms, windows, previews, and backgrounds.
- Decoupling and Data Flow: SwiftUI views (UX) interact only with high-level services/repositories/managers/publishers via @Environment and async calls (e.g., fetch(), startScan(), navigation triggers). Views use @Query for reactive local data from SwiftData, keeping them ignorant of network/caching/navigation/error details. Business logic is offloaded to actors/services/managers.
- Concurrency and Performance: All network/storage ops are async and background-threaded. Actors (e.g., DataRepository, ScannerService) isolate mutable state to prevent data races. URLSession and WebSocket tasks run non-blockingly; SwiftData uses background contexts for writes to avoid UI stalls.
- Navigation Management: NavigationManager (@Observable class or actor) is scene/window-scoped, handling navigation state (e.g., paths/stacks) independently per window on macOS. Created and injected per WindowGroup instance for multi-window isolation.
- Network Clients:
- APIClient (actor): Global, handles REST requests with error throwing.
- WebSocketClient (actor): Global, manages WS connections, sending, and receiving as an AsyncStream. Shared across services.
- Data Management:
- DataRepository (actor): Global facade for REST/WS on models. Handles caching, fetches, subscriptions, and storage. Publishes errors to global ErrorPublisher.
- ScannerService (@Observable actor): Scoped per window/instance (e.g., one per macOS window). Handles scan workflows, streams results, optional storage. Publishes errors to its own scoped ErrorPublisher.
- Storage: SwiftData (ModelContainer) is app-global, initialized in AppContext. Models are @Model-conformed; services use background contexts.
- Error Handling (Reusable/Decoupled): ErrorPublisher is a reusable @MainActor @Observable class managing a list of errors for multi-error support and selective dismissal:
- var errors: [ErrorItem] = [] where ErrorItem = (id: UUID, error: Error, title: String).
- Methods: publish(_ error: Error, title: String), dismiss(id: UUID), dismissAll().
- Global Instance: For app-wide errors (e.g., REST API). Shared across all windows; injected app-wide.
- Scoped Instances: For per-instance errors (e.g., scanner-specific). Each ScannerService has its own ErrorPublisher; injected to the relevant window/view subtree.
- UX Display: Each window’s root view uses a custom ErrorOverlay or .alert that observes both global and local (scanner) publishers, combines their errors into a single list, and provides dismiss buttons (per error or all). This enables:
- Global errors to appear in all windows.
- Scoped errors to appear only in the affected window.
- Concurrent errors shown together; dismiss one without affecting others.
- Central Coordination with AppContext: AppContext (shared instance or singleton) initializes globals (e.g., ModelContainer, APIClient, DataRepository, WebSocketClient, global ErrorPublisher). It provides factories for scoped objects (e.g., createScannerService() -> ScannerService with new ErrorPublisher). Propagation:
- App-wide globals via .environment in App body.
- Scene/window-scoped (e.g., NavigationManager, ScannerService with scoped ErrorPublisher) created/injected in WindowGroup builder for per-window isolation on macOS.
- Handles platform-specifics: Background wakeups for notifications, mock data for Xcode previews, window scene setup on macOS.
- Extensibility: Add models via protocols. New services reuse clients/publishers. AppContext centralizes additions.
- UX Bridge: Views trigger in Task {}, observe streams/status/@Query. No ViewModels. Navigation via scoped manager. Errors via combined overlay for global+scoped. This ensures scalability, multi-window correctness (independent scanner states/errors on macOS), and reusable error handling per 2026 SwiftUI practices.
graph TD
subgraph "App Setup & Coordination"
AppContext[AppContext Class
(init globals: ModelContainer, APIClient, Repository, WebSocketClient, global ErrorPublisher; factories for scoped: ScannerService w/ scoped EP, NavigationManager; handles previews/background)]
AppContext -->|app-wide injection
(e.g., repository, global EP)| App[App Struct]
App -->|scene-specific creation/injection
(e.g., new NavManager, new ScannerService w/ scoped EP per window)| WindowScene[WindowGroup/Scene
(multi-window on macOS)]
AppContext -.->|handles previews/background/windowscene| PreviewsBackground[Previews, Background Wakeups, Window Scenes]
end
subgraph "UX Layer"
WindowScene -->|binds scoped: NavManager, ScannerService, scoped EP| Views[SwiftUI Views (UX)]
Views -->|async calls (e.g., fetch())| DataRepository
Views -->|async calls (e.g., startScan())| ScannerService
Views -->|observes @Query| SwiftData[SwiftData (ModelContainer)]
Views -->|observes status/streams| ScannerService
Views -->|triggers navigation| NavigationManager[NavigationManager
(scoped per-window)]
RootViewPerWindow[Root View per Window] -->|custom ErrorOverlay: combines & shows lists from global + scoped EPs
(dismiss one/all)| ErrorPublisherGlobal[Global ErrorPublisher
(app-wide errors, e.g., REST)]
RootViewPerWindow -->|custom ErrorOverlay| ErrorPublisherScoped[Scoped ErrorPublisher
(per-scanner/window errors)]
end
subgraph "Service Layer (Actors/Managers)"
DataRepository[DataRepository Actor
(global; caching, storage)] -->|uses for REST| APIClient[APIClient Actor
(global)]
DataRepository -->|uses for subscriptions| WebSocketClient[WebSocketClient Actor
(global)]
ScannerService[ScannerService @Observable Actor
(scoped per-window; scan logic)] -->|uses for WS| WebSocketClient
end
subgraph "Network/Storage"
APIClient -.->|network I/O| RemoteREST[Remote REST API]
WebSocketClient -.->|network I/O| RemoteWS[Remote WS Service]
DataRepository -->|background ops| SwiftData
ScannerService -->|optional background store| SwiftData
end
subgraph "Error Handling (Reusable)"
DataRepository -->|publishes (MainActor.run)| ErrorPublisherGlobal
ScannerService -->|publishes| ErrorPublisherScoped
APIClient -->|throws| DataRepository
WebSocketClient -->|throws| ScannerService
end
style AppContext fill:#f9f,stroke:#333
style App fill:#f9f,stroke:#333
style WindowScene fill:#f9f,stroke:#333
style Views fill:#bbf,stroke:#333
style NavigationManager fill:#fd9,stroke:#333
style DataRepository fill:#fd9,stroke:#333
style ScannerService fill:#fd9,stroke:#333
style APIClient fill:#dfd,stroke:#333
style WebSocketClient fill:#dfd,stroke:#333
style SwiftData fill:#9f9,stroke:#333
style ErrorPublisherGlobal fill:#f99,stroke:#333
style ErrorPublisherScoped fill:#f99,stroke:#333
style RemoteREST fill:#ccc,stroke:#333
style RemoteWS fill:#ccc,stroke:#333
style PreviewsBackground fill:#ccc,stroke:#333
MRPF Scanner API
This is a WebSocket interface to the different scanners built on top of the MRPF network engine. It can run on any machine that has Rust; at the moment I’m focused on running it on my MacBook itself, but I can see it being useful on a VM in the cloud. It would be good if such a VM could somehow become one of the workers of our task manager, as these things have some overlap. The MRPF Scanner API should be only a frontend for the scanners and should not do any scanning itself, to keep separation of concerns.
Running websockets on any server is great as we can have bare-metal workers this way (or for instance my macbook). What would also be nice though is to actually use AWS Lambda for certain tasks here as well. We could leverage AWS WebSocket API Gateway with a lambda backing. Apparently all the keepalive stuff is handled by API gateway, you only pay for real messages and the lambda execution time.
Certificate Transparency Records
The Certificate Transparency records produced by the big certificate issuers are a goldmine for finding new domains. The most popular way to retrieve them is through the crt.sh website, or better yet through its PostgreSQL database. However, crt.sh has strict rate limits and has difficulty returning all records for larger domains. We can do better, so I’ve written my own code that scrapes the certificate transparency logs directly from the issuers.
However, the main problem eventually is cost. I wanted to use AWS DynamoDB for this, and although I got it to work and learned a lot about how to model things there, it turns out to be quite costly for this use case. I am better off moving this to PostgreSQL. The Lambda invocation costs are also quite high, so it makes more sense to run the initial scraping of all older logs on EC2, my VPS, or my MacBook. Once the initial bulk is done, we could probably use Lambda to keep the incremental updates going.
Ideas and Future Work
…
TODO
I’m afraid I need to rework my approach again. I figured out that the initial connection timeout is very important for dealing with a lot of the rate limits. Because of this, my approach of sending ranges in batches of one per log server isn’t holding up: it causes the loop to wait for the full connection timeout before it processes a new batch. Instead I should:
- Create a proper RateLimit class, similar to the Rust SDK interface.
- Tweak rate limits and connection timeouts to find the optimal balance per server.
- Provide the scan log servers with a `HashMap<LogRange, Vec<Ranges>>`.
- Once a range for a particular log server is completed, pluck another one from the HashMap.
- If a range fails to complete within the rate-limit timeframe, move on to the next range and leave the failed range as Pending in the database.
- See if we can store each log server’s optimal rate limits and preferred range sizes in the database. We could base the latter on the average entry count a log server returns, so servers returning 1024 entries per request can use larger ranges than servers that return a lower count like 32. Something like `LOGSERVER#<url> SK: PROPERTIES`.
use std::{ops::Range, time::Duration};

// Sketch of the planned client. Struct fields can't have inline defaults
// in Rust, so the intended defaults are noted in comments; `Entry` and the
// error type are placeholders.
struct RetryConfig {
    max_retries: usize,            // default: 3
    initial_backoff: Duration,     // default: 3s
    max_backoff: Range<Duration>,  // e.g. 20s..22s, a range to randomize within
    exponential_backoff: bool,     // default: true
    // With exponential backoff this step grows exponentially, otherwise
    // linearly. A range so we can randomize.
    step: Range<Duration>,         // e.g. 1s..2s
}

struct LogServer {
    retry_config: RetryConfig,
    average_entry_size: u16, // default: 1024
    url: String,
    mirror: Option<String>,
}

struct CertTransparencyClient {
    log_server: LogServer,
    retry_config: RetryConfig,
    // Maximum number of times a range can fail before we stop trying it.
    max_range_failures: usize, // default: 3
}

impl CertTransparencyClient {
    async fn get_entries(&self, range: Range<u64>) -> Result<Vec<Entry>, Error> {
        // - Retrieve entries.
        // - If entries == 0, retry.
        // - If entries != 0, reset the retry counter.
        // - On MaxRangeRetries, return an error; the caller should skip this
        //   range and try another one.
        // - On MaxLogServerRetries, return an error; the caller should abandon
        //   this log server completely.
        // - If the range completed, return all retrieved entries.
        //
        // Open question: do we want to handle name extraction here already?
        // Maybe this should become get_names, with get_entries surfacing
        // RateLimited and connection errors; get_names would then parse those
        // errors and either retry, raise MaxLogServerRetries/MaxRangeRetries,
        // or return the entries.
        todo!()
    }
}
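As a small deterministic sketch of the exponential part of the retry configuration above (jitter from the randomized ranges is omitted; `backoff_schedule` is a hypothetical helper, not part of the planned API):

```rust
use std::time::Duration;

// Exponential backoff starting at `initial`, doubling on every
// attempt, capped at `max`.
fn backoff_schedule(initial: Duration, max: Duration, max_retries: usize) -> Vec<Duration> {
    (0..max_retries as u32)
        .map(|attempt| (initial * 2u32.pow(attempt)).min(max))
        .collect()
}
```

For example, a 3s initial backoff capped at 20s over 4 retries yields 3s, 6s, 12s, 20s; the real implementation would add a random offset drawn from the `step` and `max_backoff` ranges.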
Coding Guidelines
See .github/copilot-instructions.md for the full coding guidelines and project overview.
Faster macOS builds
You can get faster builds on macOS by disabling XProtect scanning for the terminal app.
Here are better docs on how to add Terminal to Developer Tools: https://nexte.st/docs/installation/macos/#gatekeeper
Error Handling
My prompt:
I am writing a large rust codebase with several libraries and binaries. The code will only be used by myself so doesn’t have to take external users into account at all.
I want to have a proper way of handling errors without having to write too much boilerplate code and matches everywhere.
What are best practices for error handling in my scenario, and what do experts advise on this? I would prefer not to bring in external crates. Perhaps Jon Gjengset or others have good advice?
- Should I use `?` as much as possible?
- What should my typical return type be? `Result<something, Error>`?
- How do I handle/catch errors defined by external crates and capture them easily upstream without too much hassle?
- Should I define my own error types to make this easier?
- Is there a way to avoid using `Box` all over the place?

Think hard about maintainability for my requirements and idiomatic Rust.
The 80/20 pattern (no external crates)
I suspect this 80/20 line comes from this blog.
The Rust by Example book has a chapter that looks a lot like this approach.
Per library crate
- Define one crate-local error enum and a `Result<T>` alias.
- Implement `Display`, `std::error::Error`, and a few `From<_>` conversions for the external error types you actually use.
- Lean on `?` everywhere; it auto-converts via `From`.
Per binary crate
Return `Result<(), Box<dyn std::error::Error>>` from `main` (or use your own top-level error). This is the only place you need a `Box`, keeping it out of the rest of your code. Rust’s `Termination` trait makes `main` returning a `Result` work as expected (nonzero exit when `Err`).
Minimal template
// lib/src/error.rs
use std::{error::Error as StdError, fmt};

#[derive(Debug)]
pub enum Error {
    Io(std::io::Error),
    Json(serde_json::Error),
    // Add more *typed* variants as needed.
    // Optionally a catch-all if you want (kept rare):
    Other(String),
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::Io(e) => write!(f, "I/O error: {e}"),
            Error::Json(e) => write!(f, "JSON error: {e}"),
            Error::Other(msg) => write!(f, "{msg}"),
        }
    }
}

impl StdError for Error {
    fn source(&self) -> Option<&(dyn StdError + 'static)> {
        match self {
            Error::Io(e) => Some(e),
            Error::Json(e) => Some(e),
            Error::Other(_) => None,
        }
    }
}

// Focused conversions so `?` just works:
impl From<std::io::Error> for Error {
    fn from(e: std::io::Error) -> Self { Error::Io(e) }
}
impl From<serde_json::Error> for Error {
    fn from(e: serde_json::Error) -> Self { Error::Json(e) }
}

// Your crate-wide Result alias
pub type Result<T> = std::result::Result<T, Error>;
// lib/src/lib.rs
mod error;
pub use error::{Error, Result};

use std::fs;

pub fn load_config(path: &str) -> Result<serde_json::Value> {
    let data = fs::read_to_string(path)?; // auto-converts io::Error -> Error via From
    let v = serde_json::from_str(&data)?; // auto-converts serde_json::Error -> Error via From
    Ok(v)
}
// bin/src/main.rs
use mylib::load_config;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let cfg = load_config("config.json")?; // prints a neat chain on failure
    println!("{cfg}");
    Ok(())
}
Why this works well:
- `?` is the standard way to propagate errors; it’s concise and idiomatic.
- Implementing `std::error::Error` + `Display` lets errors compose and present well.
- Keeping `Box<dyn Error>` only at the binary boundary avoids “Box everywhere” while still letting `main` fail gracefully with a proper exit code via `Termination`.
Adding lightweight context (still no external crates)
If you sometimes want more context (e.g., which file failed to parse), add a single contextual variant and a tiny helper trait:
// in error.rs
use std::borrow::Cow;

#[derive(Debug)]
pub enum Error {
    Io(std::io::Error),
    Json(serde_json::Error),
    Context { msg: Cow<'static, str>, source: Box<dyn StdError + Send + Sync> },
    Other(String),
}

impl From<std::io::Error> for Error { /* as before */ }
impl From<serde_json::Error> for Error { /* as before */ }

// Extension trait to tack on context ergonomically
pub trait ResultExt<T> {
    fn context(self, msg: impl Into<Cow<'static, str>>) -> Result<T>;
}

impl<T, E> ResultExt<T> for std::result::Result<T, E>
where
    E: StdError + Send + Sync + 'static,
{
    fn context(self, msg: impl Into<Cow<'static, str>>) -> Result<T> {
        self.map_err(|e| Error::Context { msg: msg.into(), source: Box::new(e) })
    }
}
Usage:
use crate::error::ResultExt;

fn load_config(path: &str) -> Result<serde_json::Value> {
    let data = std::fs::read_to_string(path)
        .context(format!("reading config from {path}"))?;
    serde_json::from_str(&data)
        .context(format!("parsing JSON from {path}"))
}
- This keeps your enum typed for the common cases and uses one boxed variant only when you add context, not “all over the place.”
- If you want stack traces during debugging, you can attach a `std::backtrace::Backtrace` field to `Context` and enable it with `RUST_BACKTRACE=1` (or `RUST_LIB_BACKTRACE=1`).
Answers to your specific questions
1) “Should I use ? as much as possible?”
Yes. Use `?` whenever you cannot or do not want to recover at that level; it’s the idiomatic, concise way to propagate errors, and it triggers `From` to convert error types along the way.
Reserve explicit `match` for cases where you’re actually handling the error (retrying, defaulting, logging with additional side effects, etc.). The Book’s error chapter lays out this philosophy (recoverable `Result` vs unrecoverable `panic!`).
2) “What should my typical return type be?”
- In library crates: `pub type Result<T> = std::result::Result<T, crate::Error>;` with a crate-specific `Error` enum. This keeps signatures short and type-checked.
- In binary crates: either reuse your library `Error` or keep the boundary generic: `fn main() -> Result<(), Box<dyn std::error::Error>>` (simplest), relying on `Termination` to convert `Err` to a non-zero exit code.
3) “How do I capture external crate errors upstream without hassle?”
Add `From<TheirError>` impls to your error enum; then `?` “just works.” Also implement `source()` to expose the underlying error so higher layers (or your CLI) can print the chain. The standard library’s `Error` trait is designed for this chaining.
impl From<reqwest::Error> for Error {
    fn from(e: reqwest::Error) -> Self { Error::Other(e.to_string()) }
}
// or better: add a dedicated variant `Http(reqwest::Error)` and a Display arm
When adding context, use the small `ResultExt::context` trick above rather than writing verbose `map_err` everywhere.
4) “Should I define my own error types to make this easier?”
Yes. A domain-specific error enum per crate is a widely used pattern. It avoids stringly-typed errors, makes it easy to distinguish cases, and composes well. The Book and the `std::error::Error` docs show how to implement `Display` + `Error` so errors print nicely and chain correctly.
If you expect your library crates to evolve, you can mark the enum #[non_exhaustive] (optional, more relevant for public APIs).
5) “Is there a way to avoid using Box all over the place?”
Absolutely. Keep errors typed (enums) throughout your libs; use at most one `Box<dyn Error>`:
- at the binary boundary (`main`) for convenience, and/or
- in a single contextual variant (if you want arbitrary sources + messages).

Everywhere else, use concrete variants, which are zero-cost and ergonomic with `?`.
A few “expert‑style” tips
- When to `panic!`: only for programmer bugs / invariant violations (e.g. `unreachable!()`, “this should never happen”), not for I/O or user-driven failures. The Book explicitly distinguishes recoverable from unrecoverable errors.
- `Display` vs `Debug`: make `Display` short and human-readable; rely on `source()` chains for details. `Debug` can be derived.
- Backtraces in development: consider adding a `Backtrace` field (captured on construction) to your contextual variant and enable it via env vars when you need it.
- `main` ergonomics: returning `Result` from `main` is fine and standard; the `Termination` trait glues it to exit codes. If you need full control over formatting, handle the top error yourself instead of returning it.
If you ever relax “no external crates”
Many experienced Rustaceans use:
- `thiserror` (derive `Error` + `Display` + `#[from]` on variants) to eliminate boilerplate in library error enums.
- `anyhow` in binary crates for an ergonomic `anyhow::Error` + `.context()` builder (you can still keep typed errors in libs).
You don’t need them for your scenario, but they are the shortest path to zero boilerplate if you change your mind.
TL;DR (what to do tomorrow)
- Per lib: create an `Error` enum + `Result<T>` alias; implement a handful of `From` impls and `Display`.
- Use `?` everywhere you propagate; use `map_err` only when you genuinely handle/transform; optionally use `ResultExt::context` to add messages.
- Per bin: `fn main() -> Result<(), Box<dyn std::error::Error>>` (or your top-level typed error) and let `Termination` handle exit codes.
- Keep `Box` at the edges, not in your core APIs.
Bug Bounty Ideas
This section contains random ideas I have regarding bug bounty hunting, security research and general security topics.
AI-Generated Wordlists
- Build instructions for different attack vectors or reconnaissance techniques.
- Provide a target (e.g. a domain, ip range or a MRPF Target object), existing wordlist and the build instruction to an AI model and let it generate a new wordlist. Store the generated words and run your scans. Gather statistics on the results and repeat the process.
Good articles to get ideas from for instructions
Protections implemented by AWS for scanning
Blackfoot is the EC2 elastic networking backend
AWS MadPot:
- When they detect a connection to an IP that isn’t allocated to any Elastic IP, they know it’s a scanner, so they can mark your IP. This is not great for my TLS scanner…
- Sonaris is the S3 protection
The important part is whether you can run your scanner on AWS infrastructure that uses dynamic outbound public IPs; they very likely won’t block those, as that would mean impacting other AWS customers.
I’m not sure where I got the above information anymore; I think it was in a Critical Thinking podcast episode.
Summary of some services
Learned this during some discussions
- Mithra - a network to inspect DNS requests per region. It also provides a 'benign reputation source' that GuardDuty uses to prevent false positives. Route53 domain blocking also uses Mithra; perhaps some AWS internal services use it too, but that wasn't really clear.
- MadPot - think a more standard honeypot solution. When it detects a properly validated attack, it can replicate the blacklisted IPs to the whole network.
- Blackfoot - analyzes all inbound and outbound flows to VPCs (13T flows an hour): how many of these come from malicious IPs, using MadPot to determine whether an IP is really malicious.
- Sonaris - internal threat intelligence tool that looks at network traffic to find potential security threats, e.g. attempts by people trying to find public buckets, vulnerable services on EC2, etc.
- SnowTube - tracks which public IPs are associated with EC2; this is published to an SNS topic. It would be beautiful if we could subscribe to this topic! Are there explicit account or Org conditions? Can we leverage AWS services to listen to this topic? How can we find out the name of the SNS topic?
- IP Ownership - a service managed by the EC2 team that records which IP addresses are associated with which instances at a point in time.
How does GuardDuty work?
- S3 malware scanning uses Bitdefender to help with hashes. They also have a few other internal rules for it.
- GuardDuty gathers all required data itself; it does not need you to enable the sources (VPC flow logs, CloudTrail, DNS logs, S3/RDS access logs).
GuardDuty infrastructure
GuardDuty is built using a lot of the ‘normal’ AWS services, like Lambda, S3, EC2, RDS, Firehose.
- Frontend running in the customer account; these are the actual resources that will be checked.
- Non-GuardDuty internal components: S3, Route53 logs, flow logs, service logs from S3, EKS audit logs, IP Ownership (the EC2-managed service that records which IP addresses are associated with which instances at a point in time), Mithra (DNS inspection).
Their evaluation components:
- Stateless processor: evaluation against the threat intel providers, e.g. IP Ownership, external vendor intel (CrowdStrike and Proofpoint are definitely used), etc.
- Stateful processing: this is where machine learning models are applied; what kinds of things can it detect?
- Malware engine:
Another service:
- Account service: which accounts have GuardDuty enabled, what is the delegate account, which features are enabled, etc.
Security boundaries:
GuardDuty runs internally across a whole lot of these 'micro' services. They spread their services across different accounts, using accounts as a security boundary. Often they just use IAM roles and resource policies to control this; they don't put everything behind API gateways etc.
DNS Graph statistics
They get all their data from Route53 to build their mitigations (200TB of DNS logs with 5B subdomain nodes as of October 2025).
Domain (TLD + 1) -> CNAME -> subdomain -(DNS)-> EC2 instances; subdomain -(DNS)-> AWS account
Domain reputation pipeline:
- Create a graph for the Domain target
- Train models on the graph
- Evaluate models using ??
Firenze for Model Evaluation
What are the manual sec engineer steps, e.g. for evaluating domains?
- A new domain comes in: are any IP addresses already sinkholed (making it more likely to be malicious)? Is it low popularity? Is it nonsense? Is the TLD abused often?
Firenze uses the signals that sec engineers generate, or new weak signals that engineers identify, to better evaluate a model and provide guardrails. This is used to improve Mithra.
There is a whitepaper firenze-model-evaluation-using-weak-signals
High level getting findings into GuardDuty
- Ingest signals, apply ETL
- Signals delta table -> clustering
- Clustering -> compound signals delta table
- Compound signals delta table -> prioritization
- Compound signals delta table -> training
- Prioritization -> attack sequences (S3)
- Attack sequences (S3) -> Security Hub
- Attack sequences (S3) -> Finding Gateway -> findings into the API for the GuardDuty console
Future features
Note the last bit of this Jeff Bezos YouTube video: thinking small is a self-fulfilling prophecy.
LETS BUILD THE BEST BUG BOUNTY TOOLS IN THE WORLD! THIS INCLUDES ULTRA FAST AND SCALABLE SCANNERS, AND A BEAUTIFUL GUI TO MANAGE THEM! PROXY SERVER, REPEATER, BIG CONTINUOUS SCRAPERS, CERT TRANSPARENCY MONITORING, DNS RESOLVER, AND MORE!
IT WILL BE THE BEST AND CHEAPEST TOOL IN THE WORLD! BUILT USING RUST AND LEVERAGING THE BEST SCALABLE AND CHEAPEST AWS CLOUD SERVICES.
TCP Fast Open
See if we want to implement TCP Fast Open for SNI scanning. Linux apparently supports it by default; Windows doesn't. This could reduce the round-trip time of SNI scraping tasks.
Race condition testing
Implement a scanner that does this: https://flatt.tech/research/posts/beyond-the-limit-expanding-single-packet-race-condition-with-first-sequence-sync/
Since this technique relies on crafting IP-reassembled packets and using TCP sequence numbers in a particular way, it will likely not be able to leverage the normal Engine, as that leverages syn cookies to sync the transmit and receive threads.
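For context, the syn-cookie trick that the normal Engine relies on can be sketched as below: the SYN's sequence number is a hash of the probe target plus a per-run seed, so the receive thread can validate a SYN-ACK without sharing state with the send thread. The hasher choice and function names here are mine, not MRPF's; masscan uses the same idea with its own hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::net::Ipv4Addr;

// Derive a deterministic 32-bit cookie from the probe target and a run seed.
fn syn_cookie(dst: Ipv4Addr, port: u16, seed: u64) -> u32 {
    let mut h = DefaultHasher::new();
    dst.hash(&mut h);
    port.hash(&mut h);
    seed.hash(&mut h);
    h.finish() as u32
}

// A valid SYN-ACK acknowledges our sequence number + 1.
fn is_our_probe(src: Ipv4Addr, src_port: u16, ack: u32, seed: u64) -> bool {
    ack.wrapping_sub(1) == syn_cookie(src, src_port, seed)
}

fn main() {
    let target = Ipv4Addr::new(192, 0, 2, 10);
    let seed = 0xdead_beef;
    let seq = syn_cookie(target, 443, seed);
    // The remote's SYN-ACK would carry ack = seq + 1.
    assert!(is_our_probe(target, 443, seq.wrapping_add(1), seed));
    assert!(!is_our_probe(target, 443, seq, seed));
    println!("ok");
}
```

The race-condition technique breaks this model because it needs to control sequence numbers directly rather than derive them from a hash.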
HTTP1 Pipelining
Can I use my engine to implement HTTP1 pipelining?
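If the engine can keep a connection open, pipelining is mostly a matter of writing several requests back-to-back before reading any response. A minimal sketch of the wire bytes (the helper name and paths are illustrative):

```rust
// Build the bytes for several pipelined HTTP/1.1 requests on one connection.
// HTTP/1.1 connections are persistent by default, so no Connection header is
// needed to keep the socket open between requests.
fn pipelined_requests(host: &str, paths: &[&str]) -> Vec<u8> {
    let mut buf = Vec::new();
    for path in paths {
        buf.extend_from_slice(
            format!("GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n").as_bytes(),
        );
    }
    buf
}

fn main() {
    let wire = pipelined_requests("example.com", &["/a", "/b"]);
    let text = String::from_utf8(wire).unwrap();
    // Two complete requests, written before any response is read.
    assert_eq!(text.matches("GET ").count(), 2);
    assert!(text.starts_with("GET /a HTTP/1.1\r\n"));
    println!("ok");
}
```

The receive side then has to parse responses in order, since HTTP/1.1 gives no other way to correlate them with requests.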
HTTP-to-DNS
Use my engine to very quickly resolve domains against Cloudflare. Can we make this very memory efficient, given that most packets will look very much alike? I could perhaps use the streaming JSON library https://github.com/pydantic/jiter
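One sketch of why the packets look so alike: with Cloudflare's JSON DNS-over-HTTPS endpoint (cloudflare-dns.com/dns-query with an `Accept: application/dns-json` header), only the queried name varies between requests, so each request can be stamped out from a shared template. The helper below is illustrative and skips percent-encoding of the name:

```rust
// Build one DoH JSON query; everything except `name` is a fixed template,
// which is what an engine could precompute and reuse across millions of
// lookups. (Hypothetical helper; a real one would percent-encode `name`.)
fn doh_request(name: &str) -> String {
    format!(
        "GET /dns-query?name={name}&type=A HTTP/1.1\r\nHost: cloudflare-dns.com\r\nAccept: application/dns-json\r\nConnection: keep-alive\r\n\r\n"
    )
}

fn main() {
    let req = doh_request("example.com");
    assert!(req.starts_with("GET /dns-query?name=example.com&type=A HTTP/1.1\r\n"));
    assert!(req.contains("Accept: application/dns-json\r\n"));
    println!("ok");
}
```

The JSON responses are similarly repetitive, which is where a streaming parser like jiter could avoid allocating a full document per answer.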
Tricks using DNS lookups for recon
Read this for discovering S3 buckets using DNS enumeration:
MPF - The previous iteration
Before there was MRPF, there was MPF. The original idea was the same: build my own tooling around bug bounty hunting.
It was built in Python and I learned a lot from building it. I wanted to investigate whether I could build a custom network stack and have more control over HTTP traffic, but Python was a bit limited here. For instance, a lot of the HTTP client libraries don't allow much customization around TLS. The libraries often build abstractions around the networking layers, making it difficult to customize things like TLS SNI or ALPN. Also, concepts like domain names, host names and IP addresses are all mixed together. This is nice from a user perspective, but I want very specific control over all these factors to find misconfigurations.
Initially I started to build a custom network stack in C using libuv. Unfortunately my laptop crashed and I was stupid enough not to have committed all that code into a repo. Also, I was very much struggling with writing safe concurrent C code.
After a while I decided to try and rebuild the network stack and looked into Rust. By this time ChatGPT was really getting good and it really helped me quickly get up to speed with a new language. I learned a LOT and started to love some of the rust concepts like ownership, fearless concurrency and the way it makes refactoring code bases a lot easier.
This has now led me to try and rebuild MPF completely in Rust, hence the M(y) R(usty) P(ension) F(und) project.
What are the things that don't work that well in my current iteration of MPF?
- The task manager parallelization is not optimal
- The memory management of building parallel tasks is not optimal
- Python is quite memory hungry for large scans
- All scanning tasks run on Lambda; I can't mix in bare metal or containers
- Job scheduling is helpful but also very repetitive for each target. It would be better to have generic continuous scanning for all targets. This will also help spread out load with my new randomizing Rust scanners
- The database model has some limitations:
- Nothing showing where certain results came from
- Not possible to construct the mermaid graph representation I came up with
- Too many things are stacked inside the Domain object that are not 100% correct. For example, IP addresses should be their own entities, TCP/UDP ports are related to an IP, not a domain, and the order of resolved IP addresses in the domain object is not stable, making it seem like we have a lot of updates.
- The task manager code is quite difficult to read and I'm not confident it's robust enough.
- Introducing new tasks is quite labor intensive
- Tasks do not easily show the task template they belong to, making parsing log files more difficult
- The statistics of all the scans happening are not easily accessible or useful
- THE MAIN THING, it hasn’t helped me a single time to find or get closer to any actual bug/bounty. I have learned a bunch of things though so that is something..
Let's watch the Beazley talk and build my job alongside it
https://www.youtube.com/watch?v=r-A78RgMhZU
Contributing
At the moment the project is solely maintained by me, with the purpose of learning and experimenting with Rust, distributed systems and bug bounties. I don't have any intention of putting this out into the world, as I'd like to be able to break things when I want and work at my own pace.
However, I love to talk to like-minded people. If you happened to come across my hidden little corner of the internet, feel free to get in touch.