Setup Guide

This guide provides comprehensive instructions for setting up a development environment for Osprey.

Prerequisites

  • Operating System: macOS, Linux, or Windows (with WSL recommended)
  • Python 3.11 or higher (check with python --version)
  • Git for version control
  • uv for Python package management
  • npm
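The prerequisites above can be checked with a short script. This is a minimal sketch, not part of Osprey: it assumes the tools are installed under their usual executable names (git, uv, npm), and it inspects the interpreter running the script rather than whatever python --version resolves to. The function names are illustrative.

```python
import shutil
import sys

# Executable names are assumptions based on the prerequisites list.
REQUIRED_TOOLS = ("git", "uv", "npm")
MIN_PYTHON = (3, 11)


def python_is_supported(version=None):
    """Return True if a (major, minor, ...) version meets the 3.11 minimum."""
    version = sys.version_info[:2] if version is None else tuple(version[:2])
    return version >= MIN_PYTHON


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def report():
    """Print a short summary of the prerequisite check."""
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if python_is_supported() else "too old, need 3.11+"
    print(f"Python {current}: {status}")
    for tool in REQUIRED_TOOLS:
        found = shutil.which(tool)
        print(f"{tool}: {found or 'not found on PATH'}")
```

Calling report() prints one line per prerequisite, so a missing tool is visible before running uv sync.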

Project Setup

1. Clone the Repository

git clone git@github.com:roostorg/osprey.git
cd osprey

2. Install Dependencies

# Install all dependencies including development tools
uv sync

This command will:

  • Create a virtual environment automatically
  • Install all production dependencies
  • Install development dependencies (ruff, mypy, pre-commit) automatically
  • Use the locked versions from uv.lock for reproducible builds

Note: uv sync includes development dependencies by default. Use uv sync --no-dev if you only want production dependencies.

3. Set Up Pre-commit Hooks

uv run pre-commit install

This installs git hooks that automatically run code quality checks before each commit.

4. Verify Setup

Run these commands to ensure everything is working correctly:

# Run the linter
uv run ruff check

# Check formatting
uv run ruff format --diff

# Run type checking
uv run mypy .

# Test pre-commit hooks
uv run pre-commit run --all-files

Expected Results:

  • Ruff should report “All checks passed!” or show specific issues to fix
  • MyPy should run without errors
  • Pre-commit should run all hooks successfully
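The four verification commands can also be run in one pass. The sketch below is an illustration under stated assumptions, not an Osprey tool: it treats a zero exit code as success for every command and treats a missing uv executable as a failure. The names CHECKS, run_checks, and failed_checks are made up for this example.

```python
import subprocess

# The verification commands from this step, in the same order.
CHECKS = [
    ("ruff lint", ["uv", "run", "ruff", "check"]),
    ("ruff format", ["uv", "run", "ruff", "format", "--diff"]),
    ("mypy", ["uv", "run", "mypy", "."]),
    ("pre-commit", ["uv", "run", "pre-commit", "run", "--all-files"]),
]


def run_checks(checks=CHECKS, runner=subprocess.run):
    """Run each command and map its name to True (exit code 0) or False."""
    results = {}
    for name, command in checks:
        try:
            completed = runner(command)
            results[name] = completed.returncode == 0
        except FileNotFoundError:
            # The executable (for example uv) is not installed or not on PATH.
            results[name] = False
    return results


def failed_checks(results):
    """Return the names of the checks that did not succeed, in run order."""
    return [name for name, ok in results.items() if not ok]
```

Running failed_checks(run_checks()) from the repository root returns an empty list when all four commands exit cleanly.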

5. Getting Started

docker compose up -d

Or use the wrapper script:

./start.sh

This starts several services, including:

  • Osprey Worker: The main engine that processes input events against the rules and UDFs
    • Test Data Producer: Optional; enabled with --profile test_data
  • Osprey UI: Frontend service that hosts the React code for the web interface and communicates with the UI API
  • Osprey UI API: Backend service that provides data and functionality to the web interface
  • Kafka (KRaft mode): Message streaming for user-generated events
  • Postgres: A database shared by the Worker, UI API, and Druid; for example, it backs the Postgres-backed Labels Service in the example plugins
  • Druid: A database that consumes Osprey Worker outputs to power the UI API for real-time querying
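For scripted startups, the Docker Compose invocation can be assembled programmatically. This is a small sketch that only builds the argument list from the options mentioned in this guide (detached mode and the test_data profile); the function name compose_up_command is hypothetical.

```python
def compose_up_command(with_test_data=False, detached=True):
    """Build the `docker compose ... up` argument list used in this guide."""
    command = ["docker", "compose"]
    if with_test_data:
        # Enables the optional Test Data Producer service.
        command += ["--profile", "test_data"]
    command.append("up")
    if detached:
        command.append("-d")
    return command
```

The resulting list can be passed to subprocess.run(..., check=True) from the repository root, where docker-compose.yaml lives.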

Alternatively, you can start Osprey with osprey-coordinator. Refer to the Coordinator README for more information.

6. (Optional) Open ports for the UI/UI API

By default, docker-compose.yaml binds running services to 127.0.0.1. If you are running Docker Compose on a headless machine, you may need to modify this configuration, change your firewall rules, or both, specifically for ports 5002 and 5004.

For example, if you use Tailscale to access your Osprey instance, you may change 127.0.0.1:5002:5002 to <Tailscale IP>:5002:5002. Alternatively, if you wish for your instance to be accessible from the public internet, you may set it simply to 5002:5002 to bind to 0.0.0.0.
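To make the difference between these port mappings concrete, the sketch below extracts the host bind address from a Compose short-syntax ports entry. It is an illustration only: it handles IPv4 addresses, not bracketed IPv6, and the address 100.64.0.1 in the usage note is a made-up stand-in for a Tailscale IP.

```python
import ipaddress


def host_bind_address(mapping):
    """Return the host address a short-syntax 'ports' entry binds to.

    'IP:HOST_PORT:CONTAINER_PORT' binds to IP; entries without an IP
    bind to all interfaces, reported here as 0.0.0.0.
    """
    parts = mapping.split(":")
    if len(parts) == 3:
        return parts[0]
    if len(parts) in (1, 2):
        return "0.0.0.0"
    raise ValueError(f"unsupported port mapping: {mapping!r}")


def is_loopback_only(mapping):
    """Return True if the entry is reachable only from the local machine."""
    return ipaddress.ip_address(host_bind_address(mapping)).is_loopback
```

With this helper, is_loopback_only("127.0.0.1:5002:5002") is True, while both "100.64.0.1:5002:5002" and "5002:5002" return False because they are reachable from other machines.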

Be aware that host firewalls such as UFW do not, by default, block ports published by Docker, because Docker writes its own iptables rules that take effect independently of UFW's. If UFW is your only firewall and you publish a port without an explicit bind address, the port remains reachable from the public internet unless the firewall is configured to account for Docker.
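One way to confirm what is actually exposed is to attempt a TCP connection to the ports from another machine. The following is a minimal sketch using only the standard library; port_is_open is a hypothetical helper name, and the host you probe is whatever address your instance is reachable at.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Running port_is_open(host, 5002) and port_is_open(host, 5004) from outside the machine shows whether the firewall and bind address behave as intended.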

7. Access the Application

The UI will automatically connect to the backend services running in Docker containers.

Plugins

In Osprey, UDFs and output sinks are designed to be easily portable through a plugin system based on pluggy. An example plugin package is provided for reference; see example_plugins/register_plugins.py:

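from typing import Any, Sequence, Type

# hookimpl_osprey, UDFBase, Config, and BaseOutputSink are provided by Osprey;
# see example_plugins/register_plugins.py for the exact imports.

@hookimpl_osprey
def register_udfs() -> Sequence[Type[UDFBase[Any, Any]]]:
    # Register custom user-defined functions
    return []  # placeholder: list your UDF classes here

@hookimpl_osprey
def register_output_sinks(config: Config) -> Sequence[BaseOutputSink]:
    # Define output destinations
    # By default it prints the execution results to the console
    return []  # placeholder: list your output sink instances here

@hookimpl_osprey
def register_ast_validators() -> None:
    # Register AST validators
    ...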

Rules

Rules are written in SML. Some examples, along with their YAML config, are provided in example_rules/. The rules are mounted to the worker processes via environment variables when the containers start. For example:

OSPREY_RULES=./example_rules uv run python3.11 osprey_worker/src/osprey/worker/cli/sinks.py run-rules-sink
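The same invocation can be launched from Python by setting OSPREY_RULES in the child process environment. This is a sketch that reuses the exact command shown above; rules_env and RULES_SINK_COMMAND are names invented for this example.

```python
import os

# The command from the example above, split into arguments.
RULES_SINK_COMMAND = [
    "uv", "run", "python3.11",
    "osprey_worker/src/osprey/worker/cli/sinks.py",
    "run-rules-sink",
]


def rules_env(rules_dir, base_env=None):
    """Return a copy of the environment with OSPREY_RULES set to rules_dir."""
    env = dict(os.environ if base_env is None else base_env)
    env["OSPREY_RULES"] = rules_dir
    return env
```

For example, subprocess.run(RULES_SINK_COMMAND, env=rules_env("./example_rules"), check=True) mirrors the shell one-liner without modifying the parent process environment.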

More about rules →

Test Data

Generate sample JSON actions:

docker compose --profile test_data up osprey-kafka-test-data-producer -d

This produces user login events with timestamps, user IDs, and IP addresses to the osprey.actions_input topic.
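To give a feel for what such an event might contain, the sketch below builds a JSON-serializable login event. The field names (event_type, timestamp, user_id, ip_address) are assumptions chosen for illustration and are not taken from the producer; consult the test data producer's source for the real schema.

```python
import json
from datetime import datetime, timezone


def make_login_event(user_id, ip_address, when=None):
    """Build a login event dict; all field names here are hypothetical."""
    when = when or datetime.now(timezone.utc)
    return {
        "event_type": "user_login",
        "timestamp": when.isoformat(),
        "user_id": user_id,
        "ip_address": ip_address,
    }


def to_json(event):
    """Serialize an event to a compact JSON string."""
    return json.dumps(event, separators=(",", ":"))
```

The address 192.0.2.10 used in examples comes from the documentation-only TEST-NET-1 range, so to_json(make_login_event("user-1", "192.0.2.10")) never refers to a real host.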