Sloth-Runner Master-Agent Architecture
`sloth-runner` uses a master-agent architecture for distributed task execution, so you can orchestrate and run tasks across multiple remote machines from a central control point.
Core Concepts
Master Server
The Master Server is the central component of the `sloth-runner` ecosystem. Its primary responsibilities include:
- Agent Registry: Maintains a registry of all connected and available agents.
- Task Orchestration: Receives task execution requests and dispatches them to the appropriate agents.
- Communication Hub: Acts as the communication hub between the user (via the CLI) and the agents.
Agent
An Agent is a lightweight process that runs on a remote machine. Its main functions are:
- Registration: Registers itself with the Master Server upon startup, providing its network address and name.
- Task Execution: Receives commands and tasks from the Master Server and executes them locally.
- Status Reporting: Reports the status and output of executed tasks back to the Master Server.
Communication Protocol
The Master and Agents communicate using gRPC, a high-performance, open-source RPC framework, which keeps communication between the distributed components efficient and reliable.
Installation and Startup
Master Server Installation
To set up the `sloth-runner` Master Server, run it on your local machine or on a designated control server. The master listens for agent connections on a specified port.
Command:
- `-p, --port <port>`: Specifies the port on which the master server listens for agent connections. The default port is `50053`.
- `--daemon`: (Optional) Runs the master server as a background daemon process. This is recommended for continuous operation.
Example:
To start the master server on port `50053` in daemon mode:
Upon successful startup, the master will log that it is listening for agent registrations.
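
The exact log message is not shown here, so a quick way to confirm that the master is accepting connections is to probe its port from an agent machine. The following is a minimal sketch that assumes only bash and the coreutils `timeout` command; `check_port` is a hypothetical helper name, not part of `sloth-runner`.

```shell
#!/usr/bin/env bash
# Return 0 if a TCP connection to host:port succeeds within 3 seconds.
# Relies on bash's built-in /dev/tcp redirection, so nc is not required.
check_port() {
  local host="$1" port="$2"
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Usage, from an agent machine (the address is an example):
#   check_port 192.168.1.21 50053 && echo "master reachable"
```

Where `nc` is installed, `nc -zv <master_ip> <master_port>` performs the same check.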
Agent Installation
Agents are deployed on the remote machines where you intend to execute tasks. Each agent needs to be configured with a unique name and the address of the Master Server.
Command:
sloth-runner agent start --name <agent_name> --master <master_ip>:<master_port> --port <agent_port> --bind-address <agent_ip> [--daemon]
- `--name <agent_name>`: A unique name for this agent (e.g., `agent1`, `web-server-agent`). The master uses this name to identify and address the agent.
- `--master <master_ip>:<master_port>`: The IP address and port of the running Master Server. Agents connect to this address to register and receive tasks.
- `--port <agent_port>`: The port on which the agent itself listens for direct communication from the master (e.g., for task execution requests). The default port is `50051`.
- `--bind-address <agent_ip>`: Crucial for remote agents. This is the IPv4 address that the agent binds to and reports to the master. It ensures the master can connect to the agent, especially in environments with multiple network interfaces or IPv6 preference. Always set this to the remote machine's accessible IPv4 address.
- `--daemon`: (Optional) Runs the agent as a background daemon process.
Example:
To start an agent named `agent1` on a machine with IP `192.168.1.16`, connecting to a master at `192.168.1.21:50053`, and listening on port `50051`:
sloth-runner agent start --name agent1 --master 192.168.1.21:50053 --port 50051 --bind-address 192.168.1.16 --daemon
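
Because `--bind-address` must be an IPv4 address, a small wrapper can reject anything else before the agent starts. This is a sketch under stated assumptions: `is_ipv4` and `start_agent` are hypothetical helper names, and only the `sloth-runner agent start` flags documented above are used.

```shell
#!/usr/bin/env bash
# Succeed only if the argument is a dotted-quad IPv4 address (0-255 per octet).
is_ipv4() {
  local ip="$1" octet
  local -a octets
  [[ "$ip" =~ ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ ]] || return 1
  IFS='.' read -r -a octets <<<"$ip"
  for octet in "${octets[@]}"; do
    # 10# forces base 10 so that an octet such as "08" is not parsed as octal
    (( 10#$octet <= 255 )) || return 1
  done
  return 0
}

# Hypothetical wrapper: validate the bind address, then start the agent.
start_agent() {
  local name="$1" master="$2" bind_ip="$3" port="${4:-50051}"
  if ! is_ipv4 "$bind_ip"; then
    echo "error: '$bind_ip' is not an IPv4 address" >&2
    return 1
  fi
  sloth-runner agent start --name "$name" --master "$master" \
    --port "$port" --bind-address "$bind_ip" --daemon
}

# Usage: start_agent agent1 192.168.1.21:50053 192.168.1.16
```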
Task Execution Workflow
- Master Startup: The `sloth-runner` master server starts and begins listening for agent registrations.
- Agent Startup & Registration: An agent starts on a remote machine, connects to the configured master, and registers itself, providing its unique name and accessible network address.
- Agent Listing: The user can list all registered agents using `sloth-runner agent list` from the master's machine.
- Task Request: The user initiates a task execution on a specific agent using `sloth-runner agent run <agent_name> <command>`.
- Task Dispatch: The master receives the request, looks up the agent's address in its registry, and dispatches the command to the target agent via gRPC.
- Task Execution: The agent receives the command, executes it locally (e.g., using `bash -c <command>`), and captures its standard output, standard error, and exit status.
- Result Reporting: The agent sends the execution results (stdout, stderr, success/failure) back to the master.
- Output Presentation: The master receives the results and presents them to the user as formatted, colored output (as described in the Enhanced `sloth-runner agent run` Output documentation).
This architecture provides a flexible and scalable way to manage and execute tasks across your infrastructure.
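
The Task Execution step can be approximated locally to see what an agent has to collect for each command. The sketch below is not the agent's actual implementation; `run_and_capture` and the `TASK_*` variables are hypothetical names used only for illustration.

```shell
#!/usr/bin/env bash
# Run a command with `bash -c` and capture stdout, stderr, and the exit
# status separately, mirroring what the agent reports back to the master.
run_and_capture() {
  local cmd="$1"
  local out_file err_file
  out_file="$(mktemp)"
  err_file="$(mktemp)"

  bash -c "$cmd" >"$out_file" 2>"$err_file"
  TASK_STATUS=$?
  TASK_STDOUT="$(cat "$out_file")"
  TASK_STDERR="$(cat "$err_file")"
  rm -f "$out_file" "$err_file"
}

run_and_capture 'echo hello; echo oops >&2; exit 3'
echo "status=$TASK_STATUS stdout=$TASK_STDOUT stderr=$TASK_STDERR"
# prints: status=3 stdout=hello stderr=oops
```

Keeping the two streams and the exit status separate is what lets the master distinguish a failed command from one that merely wrote warnings to stderr.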
Special Configurations
Agents in Incus/LXC Containers
When deploying agents inside Incus (or LXC) containers, you need to configure port forwarding and use the `--report-address` flag because the container's internal IP is not accessible from the master.
Quick Start
For a fast setup in an Incus container:
# 1. On the HOST - Configure port forwarding
sudo incus config device add main sloth-proxy proxy \
listen=tcp:0.0.0.0:50052 \
connect=tcp:127.0.0.1:50051
# 2. In the CONTAINER - Install with bootstrap script
sudo incus exec main -- bash -c "curl -fsSL https://raw.githubusercontent.com/chalkan3-sloth/sloth-runner/master/bootstrap.sh | bash -s -- --name main --master 192.168.1.29:50053 --incus 192.168.1.17:50052"
# Done! The agent is now running and configured.
Setup Steps
- Configure Port Forwarding on the Host
Add a proxy device to forward a host port to the container's agent port:
# On the host machine running Incus
sudo incus config device add <container_name> sloth-proxy proxy \
listen=tcp:0.0.0.0:<host_port> \
connect=tcp:127.0.0.1:<agent_port>
Example:
sudo incus config device add main sloth-proxy proxy \
listen=tcp:0.0.0.0:50052 \
connect=tcp:127.0.0.1:50051
- Start Agent with Report Address
Inside the container, start the agent with:
Option A: Using Bootstrap Script (Recommended)
# Inside the container
bash <(curl -fsSL https://raw.githubusercontent.com/chalkan3-sloth/sloth-runner/master/bootstrap.sh) \
--name <agent_name> \
--master <master_ip>:<master_port> \
--incus <host_ip>:<host_port>
The `--incus` flag automatically:
- sets `--bind-address 0.0.0.0` (listen on all interfaces)
- sets `--report-address <host_ip>:<host_port>` (master connects via host)
- creates and enables a systemd service
Option B: Manual Configuration
Pass the following flags:
- `--bind-address 0.0.0.0` to listen on all interfaces
- `--report-address <host_ip>:<host_port>` to tell the master how to reach this agent
# Inside the container
sloth-runner agent start \
--name <agent_name> \
--master <master_ip>:<master_port> \
--port <agent_port> \
--bind-address 0.0.0.0 \
--report-address <host_ip>:<host_port> \
--daemon
Example:
# Inside container "main" on host 192.168.1.17
sloth-runner agent start \
--name main \
--master 192.168.1.29:50053 \
--port 50051 \
--bind-address 0.0.0.0 \
--report-address 192.168.1.17:50052 \
--daemon
- Systemd Service Configuration (Recommended)
Create a systemd service file at `/etc/systemd/system/sloth-runner-agent.service`:
[Unit]
Description=Sloth Runner Agent - <agent_name>
Documentation=https://chalkan3.github.io/sloth-runner/
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
User=root
WorkingDirectory=/var/lib/sloth-runner
Restart=always
RestartSec=5s
StartLimitInterval=60s
StartLimitBurst=5
# Agent Configuration
ExecStart=/usr/local/bin/sloth-runner agent start \
--name <agent_name> \
--master <master_ip>:<master_port> \
--port <agent_port> \
--bind-address 0.0.0.0 \
--report-address <host_ip>:<host_port>
# Logging
StandardOutput=journal
StandardError=journal
SyslogIdentifier=sloth-runner-agent
# Performance
LimitNOFILE=65536
# Security
NoNewPrivileges=true
PrivateTmp=true
[Install]
WantedBy=multi-user.target
Then enable and start the service:
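
A typical sequence, assuming the unit file above was saved as `sloth-runner-agent.service`, uses only standard `systemctl` and `journalctl` commands. The `mkdir` step is included because systemd refuses to start a service whose `WorkingDirectory` does not exist.

```shell
# Create the working directory referenced by WorkingDirectory= in the unit
sudo mkdir -p /var/lib/sloth-runner

# Reload unit files so systemd picks up the new service definition
sudo systemctl daemon-reload

# Enable the agent at boot and start it immediately
sudo systemctl enable --now sloth-runner-agent.service

# Confirm that it is running and inspect the most recent log lines
sudo systemctl status sloth-runner-agent.service --no-pager
sudo journalctl -u sloth-runner-agent.service -n 20 --no-pager
```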
Port Mapping Summary
Component | Internal IP:Port | Exposed Host IP:Port | Master Sees |
---|---|---|---|
Container Agent | 10.x.x.x:50051 | host_ip:50052 | host_ip:50052 |
Host Agent | host_ip:50051 | host_ip:50051 | host_ip:50051 |
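
The "Master Sees" column follows a simple rule that can be written down explicitly. The sketch below encodes an assumption inferred from the table rather than from the agent's source: when `--report-address` is set, the master sees that value, otherwise it sees the bind address combined with the agent port. `advertised_address` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Assumed rule (inferred from the table): --report-address wins when given,
# otherwise the master sees <bind-address>:<port>.
advertised_address() {
  local bind_ip="$1" port="$2" report_addr="${3:-}"
  if [ -n "$report_addr" ]; then
    echo "$report_addr"
  else
    echo "$bind_ip:$port"
  fi
}

advertised_address 0.0.0.0 50051 192.168.1.17:50052   # container agent
advertised_address 192.168.1.17 50051                 # host agent
# prints:
# 192.168.1.17:50052
# 192.168.1.17:50051
```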
Troubleshooting
Agent shows as "Active" but commands time out:
- Verify port forwarding is configured: `incus config device list <container_name>`
- Check the agent is using `--report-address` with the host's IP and forwarded port
- Test connectivity from the master machine: `nc -zv <host_ip> <host_port>`
Multiple containers on the same host:
- Use different host ports for each container (e.g., 50052, 50053, 50054)
- Update each agent's `--report-address` accordingly
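
For several containers on one host, the port assignments can be planned before anything is changed. The sketch below only prints the `incus` proxy command and the matching `--report-address` for each container; `plan_ports` is a hypothetical helper, and the container names `web` and `db` are examples.

```shell
#!/usr/bin/env bash
HOST_IP="192.168.1.17"   # host address used in the examples above
BASE_PORT=50052          # first forwarded host port
AGENT_PORT=50051         # agent port inside every container

# Print one proxy command and one report address per container name,
# giving each container the next free host port.
plan_ports() {
  local index=0 name host_port
  for name in "$@"; do
    host_port=$((BASE_PORT + index))
    echo "sudo incus config device add $name sloth-proxy proxy listen=tcp:0.0.0.0:$host_port connect=tcp:127.0.0.1:$AGENT_PORT"
    echo "  # agent '$name': --report-address $HOST_IP:$host_port"
    index=$((index + 1))
  done
}

plan_ports main web db
```

Review the printed commands, then run each one on the host and start the corresponding agent with the listed `--report-address`.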