Setup Guide
This guide provides comprehensive instructions for setting up a development environment for Osprey.
Prerequisites
- Operating System: macOS, Linux, or Windows (with WSL recommended)
- Python 3.11 or higher (check with python --version)
- Git for version control
- uv for Python package management
- npm
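The prerequisites above can be checked with a short script. This is a minimal sketch, not part of the Osprey repository: the helper names python_ok and missing_tools are made up for illustration, and the check only confirms that each tool is on PATH, not that its version is suitable.

```python
import shutil
import sys

# Minimum Python version and command-line tools listed in the prerequisites.
MIN_PYTHON = (3, 11)
REQUIRED_TOOLS = ("git", "uv", "npm")


def python_ok(version=None, minimum=MIN_PYTHON):
    """Return True if the given (or the running) interpreter meets the minimum."""
    if version is None:
        version = sys.version_info[:2]
    return tuple(version[:2]) >= minimum


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH (hypothetical helper)."""
    return [tool for tool in tools if which(tool) is None]
```

Calling missing_tools() with no arguments returns an empty list when git, uv, and npm are all installed; anything it returns still needs to be installed before continuing.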
Project Setup
1. Clone the Repository
git clone git@github.com:roostorg/osprey.git
cd osprey
2. Install Dependencies
# Install all dependencies including development tools
uv sync
This command will:
- Create a virtual environment automatically
- Install all production dependencies
- Install development dependencies (ruff, mypy, pre-commit) automatically
- Use the locked versions from uv.lock for reproducible builds
Note: uv sync includes development dependencies by default. Use uv sync --no-dev if you only want production dependencies.
3. Set Up Pre-commit Hooks
uv run pre-commit install
This installs git hooks that automatically run code quality checks before each commit.
4. Verify Setup
Run these commands to ensure everything is working correctly:
# Check linting configuration
uv run ruff check
# Check formatting
uv run ruff format --diff
# Run type checking
uv run mypy .
# Test pre-commit hooks
uv run pre-commit run --all-files
Expected Results:
- Ruff should report “All checks passed!” or show specific issues to fix
- mypy should run without errors
- Pre-commit should run all hooks successfully
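If you prefer a single pass/fail summary, the four verification commands can be wrapped in a small runner. The sketch below is an illustration only: run_command and run_checks are hypothetical helpers, and a check counts as passed only when its command exits with status 0.

```python
import subprocess

# The four verification commands from this step, in the same order.
CHECKS = [
    ("ruff lint", ["uv", "run", "ruff", "check"]),
    ("ruff format", ["uv", "run", "ruff", "format", "--diff"]),
    ("mypy", ["uv", "run", "mypy", "."]),
    ("pre-commit", ["uv", "run", "pre-commit", "run", "--all-files"]),
]


def run_command(argv):
    """Run one command and return its exit code (127 if it cannot be started)."""
    try:
        return subprocess.run(argv, check=False).returncode
    except FileNotFoundError:
        return 127


def run_checks(checks=CHECKS, runner=run_command):
    """Run every check, even after a failure, and map each name to pass/fail."""
    return {name: runner(argv) == 0 for name, argv in checks}
```

Calling run_checks() from the repository root returns a dictionary such as {"ruff lint": True, ...}; the runner parameter exists so the logic can be exercised without invoking uv.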
5. Getting Started
docker compose up -d
Or use the wrapper script:
./start.sh
This starts up many services, including:
- Osprey Worker: The main engine that processes input events given the rules and UDFs
- Test Data Producer: Optional with --profile test_data
- Osprey UI: Frontend service that hosts the React code for the web interface and communicates with the UI API
- Osprey UI API: Backend service that provides data and functionality to the web interface
- Kafka (KRaft mode): Message streaming for user-generated events
- Postgres: A database that the Worker, UI API, and Druid use for various purposes, such as the Postgres-backed Labels Service (in the example plugins)
- Druid: A database that consumes Osprey Worker outputs to power the UI API for real-time querying
Alternatively, you can start Osprey with osprey-coordinator; refer to the Coordinator README for more information.
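When starting the stack from a script, the docker compose invocation and its optional profile can be assembled programmatically. The following is a rough sketch: compose_up_command and start_stack are hypothetical helpers, not part of Osprey or of start.sh.

```python
import subprocess


def compose_up_command(profiles=(), services=(), detached=True):
    """Build the argv for `docker compose up` with optional profiles and services."""
    argv = ["docker", "compose"]
    for profile in profiles:
        argv += ["--profile", profile]
    argv.append("up")
    if detached:
        argv.append("-d")
    argv += list(services)
    return argv


def start_stack(profiles=(), services=()):
    """Start the stack; raises CalledProcessError if Docker reports a failure."""
    subprocess.run(compose_up_command(profiles, services), check=True)
```

For example, compose_up_command(profiles=["test_data"]) yields the argument list for docker compose --profile test_data up -d.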
6. (Optional) Open ports for the UI/UI API
By default, docker-compose.yaml binds running services to 127.0.0.1. If you are running Docker Compose on a headless machine, you may need to modify this configuration and/or change your firewall rules, specifically for ports 5002 and 5004.
For example, if you use Tailscale to access your Osprey instance, you may change 127.0.0.1:5002:5002 to <Tailscale IP>:5002:5002. Alternatively, if you wish for your instance to be accessible from the public internet, you may set it simply to 5002:5002 to bind to 0.0.0.0.
Be aware that some firewalls, such as iptables/UFW, do not prevent access to ports published through Docker networking. If you omit the bind address and rely on UFW alone, the port stays reachable from the public internet unless the firewall is configured to account for Docker.
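To make the difference between these bindings concrete, the sketch below parses a short-syntax ports entry and reports which host address it publishes on. It is an illustration under simplifying assumptions: parse_port_mapping and is_loopback_only are hypothetical helpers, only IPv4 addresses and the short [IP:]HOST:CONTAINER form are handled, and an entry with no address is treated as binding to 0.0.0.0.

```python
from typing import NamedTuple, Optional


class PortBinding(NamedTuple):
    host_ip: str              # interface the port is published on
    host_port: Optional[str]  # None means Docker picks an ephemeral host port
    container_port: str


def parse_port_mapping(mapping: str) -> PortBinding:
    """Parse a short-syntax ports entry such as "127.0.0.1:5002:5002"."""
    # Drop an optional protocol suffix like "/tcp".
    spec = mapping.split("/", 1)[0]
    parts = spec.split(":")
    if not all(parts) or len(parts) > 3:
        raise ValueError(f"unsupported port mapping: {mapping!r}")
    if len(parts) == 3:
        return PortBinding(parts[0], parts[1], parts[2])
    if len(parts) == 2:
        # No address given: the port is published on all interfaces.
        return PortBinding("0.0.0.0", parts[0], parts[1])
    return PortBinding("0.0.0.0", None, parts[0])


def is_loopback_only(mapping: str) -> bool:
    """Return True if the mapping is reachable only from the local machine."""
    return parse_port_mapping(mapping).host_ip.startswith("127.")
```

With these helpers, is_loopback_only("127.0.0.1:5002:5002") is True, while is_loopback_only("5002:5002") is False, which is the case that needs firewall attention.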
7. Access the Application
The UI will automatically connect to the backend services running in Docker containers.
- Osprey UI: localhost:5002
- Backend API: localhost:5004
- Worker Service: localhost:5001
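A quick way to confirm that the services listed above are listening is to attempt a TCP connection to each port. This is a minimal sketch with hypothetical helper names (port_is_open, check_endpoints); an open port only shows that something accepts connections, not that the service behind it is healthy.

```python
import socket

# Ports taken from the list above.
ENDPOINTS = {
    "Osprey UI": 5002,
    "Backend API": 5004,
    "Worker Service": 5001,
}


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_endpoints(host: str = "localhost", endpoints=ENDPOINTS):
    """Map each service name to whether its port accepts connections."""
    return {name: port_is_open(host, port) for name, port in endpoints.items()}
```

Calling check_endpoints() after docker compose up -d returns a dictionary keyed by service name; containers may need a short while after startup before their ports accept connections.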
Plugins
In Osprey, UDFs and output sinks are designed to be easily portable. This is done through a plugin system based on pluggy. An example plugin package is provided for reference; see example_plugins/register_plugins.py:
from typing import Any, Sequence, Type

# Osprey imports omitted here; see example_plugins/register_plugins.py.

@hookimpl_osprey
def register_udfs() -> Sequence[Type[UDFBase[Any, Any]]]:
    # Register custom user-defined functions
    ...

@hookimpl_osprey
def register_output_sinks(config: Config) -> Sequence[BaseOutputSink]:
    # Define output destinations
    # By default it prints the execution results to the console
    ...

@hookimpl_osprey
def register_ast_validators() -> None:
    # Register AST validators
    ...
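For readers new to pluggy, the general mechanism behind hooks like these can be sketched in isolation. Everything below is a stand-alone illustration, not Osprey's code: the project name "demo", DemoSpecs, DemoPlugin, and UppercaseUDF are made up, and Osprey defines its own markers and hook specifications.

```python
import pluggy

# "demo" is a made-up project name for this sketch; Osprey uses its own markers.
hookspec = pluggy.HookspecMarker("demo")
hookimpl = pluggy.HookimplMarker("demo")


class DemoSpecs:
    """Hook specifications: the extension points the host application offers."""

    @hookspec
    def register_udfs(self):
        """Return a sequence of UDF classes contributed by a plugin."""


class UppercaseUDF:
    """Stand-in for a real UDF class (hypothetical)."""


class DemoPlugin:
    """A plugin implements a hook by defining a function with the same name."""

    @hookimpl
    def register_udfs(self):
        return [UppercaseUDF]


def collect_udfs():
    """Register the plugin and gather UDF classes from every implementation."""
    manager = pluggy.PluginManager("demo")
    manager.add_hookspecs(DemoSpecs)
    manager.register(DemoPlugin())
    # Calling a hook returns one result per registered implementation.
    results = manager.hook.register_udfs()
    return [udf for plugin_udfs in results for udf in plugin_udfs]
```

The host application never imports the plugin's functions directly; it calls the hook by name and pluggy dispatches to whatever implementations are registered, which is what keeps UDFs and sinks portable.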
Rules
Rules are written in SML. Some examples are provided in example_rules/, along with their YAML config. When the containers start, the rules are mounted into the worker processes via environment variables. For example:
OSPREY_RULES=./example_rules uv run python3.11 osprey_worker/src/osprey/worker/cli/sinks.py run-rules-sink
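Before launching the worker by hand, it can help to confirm that the rules directory exists and to build the environment explicitly. The sketch below makes several assumptions: summarize_rules_dir and worker_env are hypothetical helpers, and the .sml and .yaml file extensions mentioned in the comments are a guess about how the example rules are named.

```python
import os
from collections import Counter
from pathlib import Path


def summarize_rules_dir(rules_dir):
    """Count files under rules_dir by extension, e.g. {".sml": 3, ".yaml": 1}."""
    root = Path(rules_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"rules directory not found: {root}")
    # Extensions such as ".sml" and ".yaml" are assumptions, not confirmed names.
    return dict(Counter(p.suffix for p in root.rglob("*") if p.is_file()))


def worker_env(rules_dir, base_env=None):
    """Return a copy of the environment with OSPREY_RULES pointing at rules_dir."""
    env = dict(os.environ if base_env is None else base_env)
    env["OSPREY_RULES"] = str(rules_dir)
    return env
```

The dictionary returned by worker_env("./example_rules") can be passed as the env argument of subprocess.run when starting the command shown above.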
Test Data
Generate sample JSON actions:
docker compose --profile test_data up osprey-kafka-test-data-producer -d
This produces user login events with timestamps, user IDs, and IP addresses to the osprey.actions_input topic.
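To give a rough idea of what such an event might look like, here is a sketch that builds and serializes one. The actual schema emitted by the test data producer is not documented here, so every field name below (action_name, timestamp, user_id, ip_address) and the timestamp format are assumptions for illustration only.

```python
import json
from datetime import datetime, timezone


def make_login_event(user_id: str, ip_address: str, when: datetime) -> dict:
    """Build a login event; every field name here is hypothetical."""
    return {
        "action_name": "user_login",    # hypothetical action name
        "timestamp": when.isoformat(),  # ISO 8601 is an assumed format
        "user_id": user_id,
        "ip_address": ip_address,
    }


def encode_event(event: dict) -> bytes:
    """Serialize an event to UTF-8 JSON bytes, as a Kafka message value might be."""
    return json.dumps(event, sort_keys=True).encode("utf-8")
```

Inspect the messages on the osprey.actions_input topic to see the real structure before writing rules against it.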