cozymori

Simpler, Easier, For Developers. Open-source frameworks for AI observability.

© 2026 cozymori. All rights reserved.


Getting Started

Quick Start

Get up and running with cozymori tools in under 5 minutes.

Prerequisites

  • Python 3.10–3.13
  • Docker (to run Weaviate)

1. Start Weaviate

VectorWave uses Weaviate as its vector database. Start it with Docker:

docker run -d \
  --name weaviate \
  -p 8080:8080 \
  -p 50051:50051 \
  cr.weaviate.io/semitechnologies/weaviate:latest

For production deployments, consider Weaviate Cloud (WCS), a fully managed Weaviate service.
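
Before moving on, you can confirm the container is accepting requests. Here is a minimal sketch that polls Weaviate's standard REST readiness endpoint (the polling helper itself is our own, not part of VectorWave):

```python
import time
import urllib.request
import urllib.error

def wait_for_weaviate(url: str = "http://localhost:8080/v1/.well-known/ready",
                      timeout_s: float = 30.0) -> bool:
    """Poll Weaviate's readiness endpoint until it responds or we time out."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=2) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # container still starting; retry
        time.sleep(1)
    return False
```

A 200 response means the instance is ready for the steps below.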

2. Install VectorWave

pip install vectorwave

3. Initialize and Start Tracing

from vectorwave import vectorize, initialize_database

# Create Weaviate collections
initialize_database()

# Decorate any function to trace and cache it
@vectorize(
    semantic_cache=True,
    cache_threshold=0.95,
    capture_return_value=True,
    team="my-team"  # custom tag
)
async def generate_response(query: str):
    # Your LLM call here
    return await llm.complete(query)

That's it. Every call to generate_response is now:

  • Vectorized — input is converted to an embedding
  • Traced — execution is logged with trace_id/span_id
  • Cached — similar queries return instantly from cache
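
To make the cache_threshold value concrete, here is an illustrative sketch of the comparison a semantic cache performs (plain cosine similarity; this is not VectorWave's internal code):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def is_cache_hit(query_vec, cached_vec, threshold: float = 0.95) -> bool:
    """A cached answer is reused only when similarity meets the threshold."""
    return cosine_similarity(query_vec, cached_vec) >= threshold

print(is_cache_hit([1.0, 0.0], [1.0, 0.0]))  # identical inputs → True
print(is_cache_hit([1.0, 0.0], [0.0, 1.0]))  # unrelated inputs → False
```

A higher threshold (like the 0.95 above) trades cache hit rate for stricter matching.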

4. Self-Healing (Optional)

Self-healing requires environment variables in your .env:

# Required for diagnosis
OPENAI_API_KEY=sk-...

# Optional: needed only for automatic PR creation
GITHUB_TOKEN=ghp_...
GITHUB_REPO_NAME=org/repo-name

On-Demand Healing

Call VectorWaveHealer to diagnose errors when needed:

from vectorwave import VectorWaveHealer

healer = VectorWaveHealer(model="gpt-4-turbo")

# Diagnose recent errors for a specific function
result = healer.diagnose_and_heal(
    function_name="generate_response",
    lookback_minutes=60,
    create_pr=True
)

Scheduled Healing (AutoHealerBot)

VectorWave includes a built-in scheduler that automatically scans for errors and creates fix PRs:

from vectorwave.utils.scheduler import start_scheduler
import threading

# Start the auto-healer in a background thread
# Scans every 5 minutes, with 60-minute cooldown per function
scheduler_thread = threading.Thread(
    target=start_scheduler,
    args=(5,),  # check interval in minutes
    daemon=True
)
scheduler_thread.start()

The AutoHealerBot automatically:

  • Scans for ERROR status executions in the last 60 minutes
  • Deduplicates by function name
  • Skips functions in cooldown (60 min after last PR)
  • Skips errors listed in .vtwignore (marked as FAILURE, not ERROR)
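
The scan, dedupe, and cooldown flow above can be sketched in plain Python (illustrative only; the log records and cooldown store below are stand-ins, not the library's internals):

```python
from datetime import datetime, timedelta

COOLDOWN = timedelta(minutes=60)

def select_functions_to_heal(error_logs, last_pr_at, ignored, now=None):
    """Return function names that should get a healing PR this scan."""
    now = now or datetime.now()
    seen = set()
    to_heal = []
    for log in error_logs:           # each log: {"function": ..., "error_type": ...}
        fn = log["function"]
        if fn in seen:               # deduplicate by function name
            continue
        seen.add(fn)
        if log["error_type"] in ignored:   # .vtwignore match → FAILURE, skipped
            continue
        last = last_pr_at.get(fn)
        if last and now - last < COOLDOWN: # still in cooldown after last PR
            continue
        to_heal.append(fn)
    return to_heal
```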

VectorSurfer: Error diagnosis is also available through the VectorSurfer dashboard with a visual interface — browse errors, view diagnoses, and create PRs with one click.

5. Golden Dataset (Optional)

Promote verified executions to your Golden Dataset for better caching and drift detection:

from vectorwave import VectorWaveDatasetManager

dm = VectorWaveDatasetManager()

# Promote a verified execution to "Golden" status
dm.register_as_golden(
    log_uuid="abc-123",
    note="Verified by QA",
    tags=["v2", "production"]
)

# Get recommendations for new golden candidates
candidates = dm.recommend_candidates(
    function_name="generate_response",
    limit=5
)

Golden data is prioritized in semantic caching — when a cache hit matches golden data, it's returned first. It also serves as the baseline for drift detection.
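
The "golden first" ordering can be illustrated with a small sketch (hypothetical candidate records; not the library's actual lookup code):

```python
def rank_cache_candidates(candidates):
    """Order cache hits so golden entries outrank non-golden ones,
    then by similarity within each group."""
    return sorted(
        candidates,
        key=lambda c: (c["is_golden"], c["similarity"]),
        reverse=True,
    )

hits = [
    {"id": "a", "similarity": 0.97, "is_golden": False},
    {"id": "b", "similarity": 0.96, "is_golden": True},
]
print(rank_cache_candidates(hits)[0]["id"])  # "b": golden wins over a slightly closer non-golden hit
```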

VectorSurfer: Golden Dataset management is built into the VectorSurfer dashboard — browse, promote, and manage golden entries visually.

6. Drift Detection (Optional)

Monitor when inputs start drifting from known-good patterns. VectorWave uses a KNN-based approach to compare new inputs against past successful executions. When the average distance exceeds the threshold, a webhook alert is sent (Discord, Slack, etc.).
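
The idea behind the KNN check can be sketched as follows (illustrative logic and threshold values, not VectorWave's implementation):

```python
import math

def knn_drift_score(new_vec, baseline_vecs, k: int = 3) -> float:
    """Average distance from a new input embedding to its k nearest
    neighbors among past successful (known-good) embeddings."""
    dists = sorted(math.dist(new_vec, v) for v in baseline_vecs)
    return sum(dists[:k]) / min(k, len(dists))

def is_drifting(new_vec, baseline_vecs, k: int = 3, threshold: float = 0.5) -> bool:
    """Flag drift (and trigger a webhook in the real system) when the
    average neighbor distance exceeds the threshold."""
    return knn_drift_score(new_vec, baseline_vecs, k) > threshold
```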

See Drift Detection for setup and configuration.

7. Launch the Dashboard

VectorSurfer

Open the VectorSurfer dashboard to monitor all your AI functions:


Features include:

  • Real-time KPI cards (executions, success rate, avg duration)
  • Execution timeline charts
  • Distributed trace waterfall
  • AI-powered error diagnosis
  • Golden Dataset management

See VectorSurfer docs for details.

VectorSurferSTL (Lightweight)

For a lightweight Python-only alternative:

pip install vectorsurferstl
vectorsurferstl

8. Error Suppression with .vtwignore (Optional)

Create a .vtwignore file in your project root to exclude specific error types from healing and alerts:

# Ignore expected exceptions
ValueError
KeyError
DeprecationWarning

Errors matching these exception names are marked as FAILURE instead of ERROR: the AutoHealerBot skips them, and no webhook alerts are sent.
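
A sketch of how such a file might be parsed and applied (illustrative; the real parser and matching rules may differ):

```python
from pathlib import Path

def load_vtwignore(path: str = ".vtwignore") -> set[str]:
    """Read ignored exception names, skipping blanks and '#' comments."""
    p = Path(path)
    if not p.exists():
        return set()
    return {
        line.strip()
        for line in p.read_text().splitlines()
        if line.strip() and not line.strip().startswith("#")
    }

def classify_error(exc: BaseException, ignored: set[str]) -> str:
    """Ignored exception types become FAILURE; everything else stays ERROR."""
    return "FAILURE" if type(exc).__name__ in ignored else "ERROR"
```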

9. Regression Testing with VectorCheck

pip install vectorcheck

# Run tests against all tracked functions
vw test --target all

# Semantic comparison mode (for generative AI)
vw test --target all --semantic --threshold 0.85

# Export execution logs for offline analysis
vw export --target my_module.my_function --output data.jsonl
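
The --semantic flag exists because exact string comparison is too strict for generative output. The mechanics can be sketched as follows; a real run would compare embeddings, which we approximate here with simple token overlap purely for illustration (this is not VectorCheck's code):

```python
def exact_match(expected: str, actual: str) -> bool:
    """Strict comparison: any wording change fails the test."""
    return expected == actual

def jaccard_similarity(expected: str, actual: str) -> float:
    """Token-overlap stand-in for embedding similarity."""
    a, b = set(expected.lower().split()), set(actual.lower().split())
    return len(a & b) / len(a | b) if a | b else 1.0

def semantic_match(expected: str, actual: str, threshold: float = 0.85) -> bool:
    """Pass when similarity meets the threshold, as with --threshold 0.85."""
    return jaccard_similarity(expected, actual) >= threshold
```

Raising the threshold makes the regression check stricter; lowering it tolerates more paraphrasing.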

Configuration

Environment Variables

  • OPENAI_API_KEY (required for caching/healing): OpenAI API key for embeddings and LLM features
  • WEAVIATE_HOST (optional): Weaviate hostname (default: localhost)
  • WEAVIATE_PORT (optional): Weaviate HTTP port (default: 8080)
  • WEAVIATE_GRPC_PORT (optional): Weaviate gRPC port (default: 50051)
  • VECTORIZER (optional): Embedding provider: weaviate_module, openai_client, huggingface, or none (default: weaviate_module)
  • GITHUB_TOKEN (required for auto-PR): GitHub token for self-healing PR creation
  • GITHUB_REPO_NAME (required for auto-PR): Target repository (e.g. org/repo-name)

@vectorize Options

  • semantic_cache (bool, default False): Enable semantic caching
  • cache_threshold (float, default 0.9): Cosine similarity threshold for cache hits
  • capture_return_value (bool, default False): Store return values in Weaviate
  • capture_inputs (bool, default False): Auto-capture all function parameters
  • replay (bool, default False): Enable replay-based regression testing
  • auto (bool, default False): Auto-generate metadata via LLM
  • search_description (str, default None): Description for vector search
  • **execution_tags (any): Custom tags defined in .weaviate_properties (e.g. team="ml-team")

What's Next?

  • Explore the VectorWave product page for detailed features
  • Check out the VectorSurfer dashboard
  • Browse GitHub for source code and issues