Workers

Workers execute tasks dispatched by the Manager. Three worker types exist, differentiated by environment, capabilities, and queue.

Worker Types

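| Type | Environment | DB Access | Raw Sockets | Queue |
|------|-------------|-----------|-------------|-------|
| External | Lambda (outside VPC) | No | No | External SQS |
| Internal | Lambda (inside VPC) | Read/Write task_data + recon | No | Internal SQS |
| Bare Metal | EC2 (inside VPC) | Read/Write task_data + recon | Yes | Bare Metal SQS |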

External Workers

Run on AWS Lambda outside the VPC. No database access. No raw socket capabilities.

Used for tasks that call external services: crt.sh, Censys, Shodan, and similar public APIs. Results are returned to the Manager via CompleteTask messages with an optional data payload. The Manager writes this data into task_manager.task_data on the worker’s behalf.
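
The hand-off described above can be sketched as follows. This is a minimal illustration, not the actual message format: the field names `task_id` and `data`, the `TaskDataStore` type, and `handle_complete_task` are all hypothetical, and an in-memory map stands in for task_manager.task_data.

```rust
use std::collections::HashMap;

/// Hypothetical shape of a CompleteTask message; the field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct CompleteTask {
    pub task_id: String,
    /// Optional result payload produced by an external worker.
    pub data: Option<String>,
}

/// In-memory stand-in for task_manager.task_data, keyed by task id.
#[derive(Debug, Default)]
pub struct TaskDataStore {
    rows: HashMap<String, String>,
}

impl TaskDataStore {
    /// Returns the stored payload for a task, if any.
    pub fn get(&self, task_id: &str) -> Option<&str> {
        self.rows.get(task_id).map(String::as_str)
    }
}

/// Manager-side handling: persist the payload, if present, on the worker's behalf.
/// Returns true when a payload was written.
pub fn handle_complete_task(store: &mut TaskDataStore, msg: CompleteTask) -> bool {
    match msg.data {
        Some(payload) => {
            store.rows.insert(msg.task_id, payload);
            true
        }
        None => false,
    }
}
```

Because the worker only ever emits a message, it needs no database credentials or VPC connectivity; all writes stay on the Manager side.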

Internal Workers

Run on AWS Lambda inside the VPC. Direct read/write access to task_manager.task_data, task_manager.task_collection_data, and recon tables.

Used for data mutations, filtering, aggregation, model upserts, and any task that needs to query or modify stored results without raw socket access. Examples: Filter variants, ListModels, UpsertModels, InsertData, aggregator tasks, Conditional.

Bare Metal Workers

Run on EC2 instances inside the VPC. Full database access plus raw socket capabilities.

Used for TCP SYN scanning, custom TLS scanning, DNS scanning, and any task requiring raw packet construction. These are the only workers that can run the mrpf_engine-based scanners.

Lifecycle:

  • Launched on-demand by the Manager via EC2 RunInstances when bare metal tasks are dispatched.
  • Instances are created from the mathijs-worker-ami launch template, using its $Latest version.
  • On startup, the worker polls the Bare Metal SQS queue. If no messages arrive within a configurable timeout (set through an environment variable), the worker shuts itself down to minimize costs.
  • Maximum 5 concurrent instances (capped by MAX_BARE_METAL_WORKERS).
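
The idle-shutdown behaviour in the lifecycle above might look roughly like the loop below. It is a sketch under assumptions: `run_until_idle`, `receive`, and `handle` are hypothetical names, `receive` stands in for an SQS receive call, and the timeout is passed in as a `Duration` because the text does not name the environment variable.

```rust
use std::time::{Duration, Instant};

/// Polls `receive` until no message has arrived for `idle_timeout`, then returns
/// the number of messages handled. A real worker would shut the instance down
/// after this function returns.
///
/// `receive` is a hypothetical stand-in for one SQS receive call. With SQS long
/// polling it would block for a while; here it may return immediately.
pub fn run_until_idle<R, H>(idle_timeout: Duration, mut receive: R, mut handle: H) -> usize
where
    R: FnMut() -> Vec<String>,
    H: FnMut(String),
{
    let mut handled = 0;
    let mut last_message = Instant::now();
    loop {
        let batch = receive();
        if batch.is_empty() {
            // Nothing to do: stop once the idle window has fully elapsed.
            if last_message.elapsed() >= idle_timeout {
                return handled;
            }
        } else {
            for msg in batch {
                handle(msg);
                handled += 1;
            }
            // Any received message resets the idle window.
            last_message = Instant::now();
        }
    }
}
```

Resetting the timer after every non-empty batch means an instance stays alive as long as work keeps arriving and only exits after a full quiet period.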

Capacity handling:

  • The Manager accepts an optional comma-separated list of subnet IDs (BARE_METAL_WORKER_SUBNETS env var) across different availability zones.
  • It iterates through instance type alternatives (None, meaning the launch-template default, then t4g.micro, t4g.small, c8gd.medium, c7g.medium) and subnets, attempting each combination until one succeeds or all are exhausted.
  • InsufficientInstanceCapacity errors trigger fallback to the next subnet/instance-type combination.
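
The fallback described above can be sketched as a nested loop over instance types and subnets. Several details here are assumptions rather than documented behaviour: the loop order (instance types outside, subnets inside), aborting on any error other than InsufficientInstanceCapacity, and the names `parse_subnets`, `launch_with_fallback`, and `LaunchError`. The `launch` closure is a hypothetical stand-in for the EC2 RunInstances call.

```rust
/// Errors a launch attempt can produce in this sketch.
#[derive(Debug, Clone, PartialEq)]
pub enum LaunchError {
    InsufficientInstanceCapacity,
    Other(String),
}

/// Instance type alternatives from the text; `None` means the launch-template default.
pub const INSTANCE_TYPES: [Option<&str>; 5] = [
    None,
    Some("t4g.micro"),
    Some("t4g.small"),
    Some("c8gd.medium"),
    Some("c7g.medium"),
];

/// Parses a comma-separated subnet list (for example the BARE_METAL_WORKER_SUBNETS value).
/// If no subnet is given, a single `None` entry means "do not override the subnet".
pub fn parse_subnets(raw: Option<&str>) -> Vec<Option<String>> {
    let subnets: Vec<Option<String>> = raw
        .unwrap_or("")
        .split(',')
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .map(|s| Some(s.to_string()))
        .collect();
    if subnets.is_empty() { vec![None] } else { subnets }
}

/// Tries each (instance type, subnet) combination until `launch` succeeds.
/// Only capacity errors trigger fallback; other errors abort (an assumption).
pub fn launch_with_fallback<F>(
    subnets: &[Option<String>],
    mut launch: F,
) -> Result<String, LaunchError>
where
    F: FnMut(Option<&str>, Option<&str>) -> Result<String, LaunchError>,
{
    let mut last_err = LaunchError::Other("no combinations to try".to_string());
    for instance_type in INSTANCE_TYPES.iter().copied() {
        for subnet in subnets {
            match launch(instance_type, subnet.as_deref()) {
                Ok(instance_id) => return Ok(instance_id),
                Err(LaunchError::InsufficientInstanceCapacity) => {
                    last_err = LaunchError::InsufficientInstanceCapacity;
                }
                Err(other) => return Err(other),
            }
        }
    }
    Err(last_err)
}
```

Spreading attempts across availability zones before giving up on an instance type keeps the preferred (cheapest or default) type in use whenever any zone has capacity for it.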

Task Routing

The Manager routes each task based on WorkerRequirements declared by its TaskDefinition:

requires_raw_sockets()        -> Bare Metal queue
requires_internal_worker()    -> Internal queue
otherwise                     -> External queue

The routing logic in detail:

  1. If WORKER_REQUIREMENTS contains RawSocketAccess -> Bare Metal queue.
  2. If WORKER_REQUIREMENTS contains TaskDataAccess or ReconDataAccess (and NOT RawSocketAccess) -> Internal queue.
  3. Otherwise -> External queue.
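
The three routing rules above translate into a small function. This is a sketch, not the Manager's actual code: the `Queue` enum and the `route` function are hypothetical names, while the requirement variants mirror those listed in this document.

```rust
/// Requirement flags as named in this document.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WorkerRequirement {
    TaskDataAccess,
    QueueAccess,
    ReconDataAccess,
    RawSocketAccess,
}

/// Destination queues, one per worker type (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Queue {
    External,
    Internal,
    BareMetal,
}

/// Picks a queue from a task's requirements, checking rules in priority order.
pub fn route(requirements: &[WorkerRequirement]) -> Queue {
    use WorkerRequirement::*;
    if requirements.contains(&RawSocketAccess) {
        // Rule 1: raw sockets always win, even if database access is also needed.
        Queue::BareMetal
    } else if requirements.contains(&TaskDataAccess) || requirements.contains(&ReconDataAccess) {
        // Rule 2: database access without raw sockets.
        Queue::Internal
    } else {
        // Rule 3: everything else, including tasks that only need QueueAccess.
        Queue::External
    }
}
```

Note that QueueAccess on its own does not influence routing in this sketch, which matches the rules as written: a task with only that requirement falls through to the External queue.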

WorkerRequirements

Each task definition declares a static WORKER_REQUIREMENTS slice:

  • TaskDataAccess – Direct database access to task_manager.task_data and task_manager.task_collection_data.
  • QueueAccess – Access to the Task Manager SQS queue (for generator tasks that produce CreateTask messages).
  • ReconDataAccess – Access to the recon database tables (for reading/writing models like targets, domains, endpoints).
  • RawSocketAccess – Ability to send/receive raw packets (TCP SYN, custom TLS handshakes).
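
One way to declare such a static slice is as an associated constant on a trait. The sketch below assumes that shape. The trait layout, the default implementations of `requires_raw_sockets()` and `requires_internal_worker()`, and the three example task types are hypothetical and only illustrate the idea.

```rust
/// Requirement flags as listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WorkerRequirement {
    TaskDataAccess,
    QueueAccess,
    ReconDataAccess,
    RawSocketAccess,
}

/// Hypothetical trait shape: each task type declares its requirements statically.
pub trait TaskDefinition {
    const WORKER_REQUIREMENTS: &'static [WorkerRequirement];

    /// True when the task must run on a bare metal worker.
    fn requires_raw_sockets() -> bool {
        Self::WORKER_REQUIREMENTS.contains(&WorkerRequirement::RawSocketAccess)
    }

    /// True when the task needs database access but no raw sockets.
    fn requires_internal_worker() -> bool {
        !Self::requires_raw_sockets()
            && (Self::WORKER_REQUIREMENTS.contains(&WorkerRequirement::TaskDataAccess)
                || Self::WORKER_REQUIREMENTS.contains(&WorkerRequirement::ReconDataAccess))
    }
}

/// Illustrative task types; the names and requirement sets are made up.
pub struct ExampleSynScan;
pub struct ExampleUpsert;
pub struct ExampleApiLookup;

impl TaskDefinition for ExampleSynScan {
    const WORKER_REQUIREMENTS: &'static [WorkerRequirement] =
        &[WorkerRequirement::RawSocketAccess, WorkerRequirement::TaskDataAccess];
}

impl TaskDefinition for ExampleUpsert {
    const WORKER_REQUIREMENTS: &'static [WorkerRequirement] =
        &[WorkerRequirement::ReconDataAccess];
}

impl TaskDefinition for ExampleApiLookup {
    // No requirements: such a task would land on an external worker.
    const WORKER_REQUIREMENTS: &'static [WorkerRequirement] = &[];
}
```

Keeping the requirements in a constant lets the Manager decide where a task goes without instantiating it.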