Workers
Workers execute tasks dispatched by the Manager. Three worker types exist, differentiated by environment, capabilities, and queue.
Worker Types
| Type | Environment | DB Access | Raw Sockets | Queue |
|---|---|---|---|---|
| External | Lambda (outside VPC) | No | No | External SQS |
| Internal | Lambda (inside VPC) | Read/Write task_data + recon | No | Internal SQS |
| Bare Metal | EC2 (inside VPC) | Read/Write task_data + recon | Yes | Bare Metal SQS |
External Workers
Run on AWS Lambda outside the VPC. No database access. No raw socket capabilities.
Used for tasks that call external services: crt.sh, Censys, Shodan, and similar public APIs. Results are returned to the Manager via CompleteTask messages with an optional data payload. The Manager writes this data into task_manager.task_data on the worker’s behalf.
Internal Workers
Run on AWS Lambda inside the VPC. Direct read/write access to task_manager.task_data, task_manager.task_collection_data, and recon tables.
Used for data mutations, filtering, aggregation, model upserts, and any task that needs to query or modify stored results without raw socket access. Examples: Filter variants, ListModels, UpsertModels, InsertData, aggregator tasks, Conditional.
Bare Metal Workers
Run on EC2 instances inside the VPC. Full database access plus raw socket capabilities.
Used for TCP SYN scanning, custom TLS scanning, DNS scanning, and any task requiring raw packet construction. These are the only workers that can run the mrpf_engine-based scanners.
Lifecycle:
- Launched on-demand by the Manager via EC2 RunInstances when bare metal tasks are dispatched.
- Uses a launch template (mathijs-worker-ami) with the $Latest version.
- On startup, the worker polls the Bare Metal SQS queue. If no messages arrive within a configurable timeout (environment variable), the worker shuts itself down to minimize costs.
- Maximum 5 concurrent instances (capped by MAX_BARE_METAL_WORKERS).
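The idle-shutdown behaviour in the lifecycle list can be sketched roughly as follows. This is a minimal illustration and not the worker's actual code: the `poll` and `handle` closures stand in for the SQS receive call and the task executor, the function names are hypothetical, and the environment variable name is left as a parameter because the source does not give it.

```rust
use std::time::{Duration, Instant};

/// Reads the idle timeout (in seconds) from an environment variable.
/// The variable name is passed in because the real name is not documented here.
pub fn idle_timeout_from_env(var: &str, default_secs: u64) -> Duration {
    let secs = std::env::var(var)
        .ok()
        .and_then(|v| v.parse::<u64>().ok())
        .unwrap_or(default_secs);
    Duration::from_secs(secs)
}

/// Polls for messages until none has arrived for `idle_timeout`.
/// Returns the number of messages handled; the caller would then
/// trigger the instance shutdown.
pub fn poll_until_idle<P, H>(mut poll: P, mut handle: H, idle_timeout: Duration) -> usize
where
    P: FnMut() -> Option<String>, // stand-in for an SQS receive call
    H: FnMut(String),             // stand-in for task execution
{
    let mut handled = 0;
    let mut last_activity = Instant::now();
    loop {
        match poll() {
            Some(message) => {
                handle(message);
                handled += 1;
                // Any received message resets the idle clock.
                last_activity = Instant::now();
            }
            None => {
                if last_activity.elapsed() >= idle_timeout {
                    return handled;
                }
                // Assumed short pause between empty polls.
                std::thread::sleep(Duration::from_millis(5));
            }
        }
    }
}
```

Resetting the clock on every message means a busy worker stays up indefinitely, while a worker that starts with an empty queue exits after a single timeout window.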
Capacity handling:
- The Manager accepts an optional comma-separated list of subnet IDs (BARE_METAL_WORKER_SUBNETS env var) across different availability zones.
- It iterates through instance type alternatives (None/launch-template default, t4g.micro, t4g.small, c8gd.medium, c7g.medium) and subnets, attempting each combination until one succeeds or all are exhausted.
- InsufficientInstanceCapacity errors trigger fallback to the next subnet/instance-type combination.
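The fallback loop described above might look something like the sketch below. Several details are assumptions rather than facts from the source: the iteration order (instance type in the outer loop, subnet in the inner loop), aborting on any error other than InsufficientInstanceCapacity, and the names `LaunchError`, `parse_subnets`, and `launch_with_fallback`. The `try_launch` closure stands in for the real EC2 RunInstances call.

```rust
/// Error kinds a launch attempt can return in this sketch.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum LaunchError {
    InsufficientInstanceCapacity,
    Other(String),
}

/// Instance type alternatives from the text; `None` means
/// "use the launch template default".
pub const INSTANCE_TYPES: [Option<&str>; 5] = [
    None,
    Some("t4g.micro"),
    Some("t4g.small"),
    Some("c8gd.medium"),
    Some("c7g.medium"),
];

/// Parses a comma-separated subnet list. An absent or empty list yields a
/// single `None`, meaning "no subnet override" (an assumption).
pub fn parse_subnets(raw: Option<&str>) -> Vec<Option<String>> {
    let subnets: Vec<Option<String>> = raw
        .unwrap_or("")
        .split(',')
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .map(|s| Some(s.to_string()))
        .collect();
    if subnets.is_empty() { vec![None] } else { subnets }
}

/// Tries every instance-type/subnet combination until one launch succeeds.
pub fn launch_with_fallback<F>(
    subnets: &[Option<String>],
    mut try_launch: F,
) -> Result<String, LaunchError>
where
    F: FnMut(Option<&str>, Option<&str>) -> Result<String, LaunchError>,
{
    for instance_type in INSTANCE_TYPES {
        for subnet in subnets {
            match try_launch(instance_type, subnet.as_deref()) {
                Ok(instance_id) => return Ok(instance_id),
                // Capacity errors fall through to the next combination.
                Err(LaunchError::InsufficientInstanceCapacity) => continue,
                // Assumed: any other error aborts the search.
                Err(other) => return Err(other),
            }
        }
    }
    // All combinations exhausted.
    Err(LaunchError::InsufficientInstanceCapacity)
}
```

With two subnets configured, this ordering tries the launch-template default in both availability zones before moving on to t4g.micro.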
Task Routing
The Manager routes each task based on WorkerRequirements declared by its TaskDefinition:
requires_raw_sockets() -> Bare Metal queue
requires_internal_worker() -> Internal queue
otherwise -> External queue
The routing logic in detail:
- If WORKER_REQUIREMENTS contains RawSocketAccess -> Bare Metal queue.
- If WORKER_REQUIREMENTS contains TaskDataAccess or ReconDataAccess (and NOT RawSocketAccess) -> Internal queue.
- Otherwise -> External queue.
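The three rules above can be expressed as a small function. This is a sketch only: the enum variants mirror the requirement names used in this document, but the `WorkerQueue` type and the free-function signatures are assumptions and may not match the Manager's actual code.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WorkerRequirement {
    TaskDataAccess,
    QueueAccess,
    ReconDataAccess,
    RawSocketAccess,
}

/// Hypothetical name for the three destination queues.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WorkerQueue {
    External,
    Internal,
    BareMetal,
}

pub fn requires_raw_sockets(reqs: &[WorkerRequirement]) -> bool {
    reqs.contains(&WorkerRequirement::RawSocketAccess)
}

pub fn requires_internal_worker(reqs: &[WorkerRequirement]) -> bool {
    reqs.iter().any(|r| {
        matches!(
            r,
            WorkerRequirement::TaskDataAccess | WorkerRequirement::ReconDataAccess
        )
    })
}

/// Raw socket access is checked first, so it wins over database access.
pub fn route(reqs: &[WorkerRequirement]) -> WorkerQueue {
    if requires_raw_sockets(reqs) {
        WorkerQueue::BareMetal
    } else if requires_internal_worker(reqs) {
        WorkerQueue::Internal
    } else {
        WorkerQueue::External
    }
}
```

Under the rules as stated, a task that declares only QueueAccess falls through to the External queue, since neither of the first two checks matches it.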
WorkerRequirements
Each task definition declares a static WORKER_REQUIREMENTS slice:
- TaskDataAccess – Direct database access to task_manager.task_data and task_manager.task_collection_data.
- QueueAccess – Access to the Task Manager SQS queue (for generator tasks that produce CreateTask messages).
- ReconDataAccess – Access to the recon database tables (for reading/writing models like targets, domains, endpoints).
- RawSocketAccess – Ability to send/receive raw packets (TCP SYN, custom TLS handshakes).
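As an illustration of what such a declaration could look like, the sketch below models WORKER_REQUIREMENTS as an associated constant on a trait. The trait shape, the struct names `CrtShLookup` and `TcpSynScan`, and the specific requirements assigned to each task are all assumptions made for the example. Only the requirement names and the UpsertModels task name come from this document.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WorkerRequirement {
    TaskDataAccess,
    QueueAccess,
    ReconDataAccess,
    RawSocketAccess,
}

/// Assumed trait shape: each task type exposes a static requirements slice.
pub trait TaskDefinition {
    const WORKER_REQUIREMENTS: &'static [WorkerRequirement];
}

/// Hypothetical external task: calls a public API, needs nothing special.
pub struct CrtShLookup;
impl TaskDefinition for CrtShLookup {
    const WORKER_REQUIREMENTS: &'static [WorkerRequirement] = &[];
}

/// Internal task: assumed to read task data and write recon models.
pub struct UpsertModels;
impl TaskDefinition for UpsertModels {
    const WORKER_REQUIREMENTS: &'static [WorkerRequirement] = &[
        WorkerRequirement::TaskDataAccess,
        WorkerRequirement::ReconDataAccess,
    ];
}

/// Hypothetical bare metal task: assumed to store results and send raw packets.
pub struct TcpSynScan;
impl TaskDefinition for TcpSynScan {
    const WORKER_REQUIREMENTS: &'static [WorkerRequirement] = &[
        WorkerRequirement::TaskDataAccess,
        WorkerRequirement::RawSocketAccess,
    ];
}
```

Because the slice is a compile-time constant, the requirements can be inspected from the task type alone, without constructing a task instance.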