Installation

Prerequisites

Python 3.10 to 3.13
Docker (to run the Weaviate vector database)
(Optional) OpenAI API key for AI auto-documentation, high-performance embedding, and self-healing

1. Start Weaviate

VectorWave uses Weaviate as its vector database. Start it with Docker:

Quick Start (Single Container)

docker run -d \
  --name weaviate \
  -p 8080:8080 \
  -p 50051:50051 \
  cr.weaviate.io/semitechnologies/weaviate:latest

For production deployments, consider Weaviate Cloud (WCS) — a fully managed Weaviate instance.

Docker Compose (Recommended)

Create a docker-compose.yml:

version: '3.4'
services:
  weaviate:
    command:
    - --host
    - 0.0.0.0
    - --port
    - '8080'
    - --scheme
    - http
    image: semitechnologies/weaviate:1.26.1
    ports:
    - 8080:8080
    - 50051:50051
    restart: on-failure:0
    environment:
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      DEFAULT_VECTORIZER_MODULE: 'none'
      ENABLE_MODULES: 'text2vec-openai,generative-openai'
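      CLUSTER_HOSTNAME: 'node1'
    volumes:
    # Named volume so data survives container removal (e.g. docker-compose down)
    - weaviate_data:/var/lib/weaviate
volumes:
  weaviate_data: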

Start the stack in the background (on Compose v2, the command is docker compose up -d):

docker-compose up -d
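Weaviate can take a few seconds to accept requests after the container starts. The following is a minimal sketch of a readiness check, assuming the default port mapping used above and Weaviate's /v1/.well-known/ready endpoint. The function name wait_until_ready is illustrative and is not part of VectorWave.

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(url: str = "http://localhost:8080/v1/.well-known/ready",
                     timeout: float = 30.0,
                     interval: float = 1.0) -> bool:
    """Poll the readiness endpoint until it answers 200 or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # not accepting connections yet; retry after a short sleep
        time.sleep(interval)
    return False
```

Calling wait_until_ready() at the top of a setup script, before any VectorWave call, turns a confusing connection error into a clear "Weaviate is not up yet" signal.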

2. Install VectorWave

pip install vectorwave

3. Environment Variables

Create a .env file in your project root. Below is a complete reference with all available options:

# === Weaviate Connection ===
WEAVIATE_HOST=localhost              # Weaviate hostname (default: localhost)
WEAVIATE_PORT=8080                   # Weaviate REST API port (default: 8080)
WEAVIATE_GRPC_PORT=50051             # Weaviate gRPC port (default: 50051)
# WEAVIATE_API_KEY=your-wcs-key     # Required for Weaviate Cloud (WCS)

# === Vectorizer ===
VECTORIZER=weaviate_module           # weaviate_module | openai_client | huggingface | none
# HF_MODEL_NAME=sentence-transformers/all-MiniLM-L6-v2  # Model for huggingface vectorizer

# === OpenAI ===
OPENAI_API_KEY=sk-...                # Required for openai_client vectorizer, healing, RAG

# === GitHub (Self-Healing Auto-PR) ===
# GITHUB_TOKEN=ghp_...              # GitHub personal access token
# GITHUB_REPO_NAME=org/repo-name    # Target repository (format: org/repo)
# GITHUB_BASE_BRANCH=main           # Base branch for PRs (default: main)

# === Drift Detection ===
# DRIFT_DETECTION_ENABLED=True      # Enable drift monitoring (default: False)
# DRIFT_DISTANCE_THRESHOLD=0.25     # KNN distance threshold (default: 0.25)
# DRIFT_NEIGHBOR_AMOUNT=5           # Number of KNN neighbors (default: 5)

# === Alerting ===
# ALERTER_STRATEGY=webhook          # Alert strategy: none | webhook (default: none)
# ALERTER_WEBHOOK_URL=https://discord.com/api/webhooks/...
# ALERTER_MIN_LEVEL=ERROR           # Minimum alert level (default: ERROR)

# === Performance Tuning ===
# BATCH_THRESHOLD=20                # Batch flush threshold in objects (default: 20)
# FLUSH_INTERVAL_SECONDS=2.0        # Batch flush interval in seconds (default: 2.0)
# ASYNC_LOGGING=False               # Enable async DB logging (default: False)

# === Custom Properties & Error Handling ===
# CUSTOM_PROPERTIES_FILE_PATH=.weaviate_properties  # Custom schema file (default: .weaviate_properties)
# IGNORE_ERROR_FILE_PATH=.vtwignore                 # Error suppression file (default: .vtwignore)
# FAILURE_MAPPING_FILE_PATH=.vectorwave_errors.json  # Error code mapping file

# === Data Masking ===
# SENSITIVE_FIELD_NAMES=password,api_key,token,secret,auth_token  # Fields to mask in logs

# === Global Tags ===
# VECTORWAVE_TAGS_ENVIRONMENT=production   # Applied to all executions
# VECTORWAVE_TAGS_REGION=us-east-1

Minimal setup: For most projects, you only need VECTORIZER and optionally OPENAI_API_KEY. All other variables have sensible defaults.
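To see how a minimal .env combines with the documented defaults, here is a standard-library sketch. It is not VectorWave's own loader, which may parse the file differently; parse_env and rest_url are hypothetical helper names, and the http scheme is an assumption that fits a local Docker setup.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks, comment lines, and inline comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Treat " #" as the start of an inline comment (unquoted values only).
        value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


# Connection defaults as documented in the reference above.
DEFAULTS = {
    "WEAVIATE_HOST": "localhost",
    "WEAVIATE_PORT": "8080",
    "WEAVIATE_GRPC_PORT": "50051",
}


def rest_url(env: dict[str, str]) -> str:
    """Build the REST base URL, falling back to the documented defaults."""
    merged = {**DEFAULTS, **env}
    return f"http://{merged['WEAVIATE_HOST']}:{merged['WEAVIATE_PORT']}"


sample = """
VECTORIZER=huggingface               # local embeddings, no API key
# WEAVIATE_PORT=8080
"""
env = parse_env(sample)
print(env)            # {'VECTORIZER': 'huggingface'}
print(rest_url(env))  # http://localhost:8080
```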

Vectorizer Options

Vectorizer	Cost	Quality	Speed	Setup
`weaviate_module`	Depends on module	Excellent	Server-side	Weaviate module configured (default)
`openai_client`	About $0.0001/1K tokens (varies by model)	Excellent	API call	Requires `OPENAI_API_KEY`
`huggingface`	Free	Good	Local CPU/GPU	No API key needed
`none`	Free	—	—	No vectorization
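The table implies two easy mistakes: a misspelled vectorizer name, and choosing openai_client without a key. A small pre-flight check could catch both before startup. This is a sketch under the assumption that settings are available as a plain dictionary; check_vectorizer is a hypothetical helper, not a VectorWave function.

```python
VALID_VECTORIZERS = {"weaviate_module", "openai_client", "huggingface", "none"}


def check_vectorizer(env: dict[str, str]) -> list[str]:
    """Return human-readable problems; an empty list means the choice looks usable."""
    problems = []
    # weaviate_module is the documented default when VECTORIZER is unset.
    choice = env.get("VECTORIZER", "weaviate_module")
    if choice not in VALID_VECTORIZERS:
        problems.append(f"unknown VECTORIZER {choice!r}")
    if choice == "openai_client" and not env.get("OPENAI_API_KEY"):
        problems.append("openai_client requires OPENAI_API_KEY")
    return problems


print(check_vectorizer({"VECTORIZER": "openai_client"}))
# ['openai_client requires OPENAI_API_KEY']
```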

4. Initialize Database

Before first use, initialize the Weaviate collections:

from vectorwave import initialize_database

initialize_database()

This creates four collections in Weaviate:

VectorWaveFunctions — Stores function metadata (source code, descriptions, module paths)
VectorWaveExecutions — Stores execution logs (inputs, outputs, errors, embeddings)
VectorWaveGoldenDataset — Stores verified high-quality executions for cache priority and drift detection
VectorWaveTokenUsage — Tracks LLM token usage for cost monitoring
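One way to confirm that the four collections exist is to compare their names against Weaviate's /v1/schema REST response, which lists each collection under a "classes" array. The sketch below assumes the local, unauthenticated instance from step 1; missing_collections and fetch_schema are illustrative names rather than VectorWave APIs.

```python
import json
import urllib.request

EXPECTED = (
    "VectorWaveFunctions",
    "VectorWaveExecutions",
    "VectorWaveGoldenDataset",
    "VectorWaveTokenUsage",
)


def missing_collections(schema: dict, expected=EXPECTED) -> list[str]:
    """Compare a /v1/schema response against the expected collection names."""
    present = {c["class"] for c in schema.get("classes", [])}
    return [name for name in expected if name not in present]


def fetch_schema(base_url: str = "http://localhost:8080") -> dict:
    """Fetch the schema over REST (requires a running Weaviate instance)."""
    with urllib.request.urlopen(f"{base_url}/v1/schema", timeout=5) as resp:
        return json.load(resp)
```

With Weaviate running, missing_collections(fetch_schema()) should return an empty list once initialize_database() has completed.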

Schema Migration

If you add new custom properties via .weaviate_properties or upgrade VectorWave, update the schema:

from vectorwave import update_database_schema

update_database_schema()

This adds new properties to existing collections without data loss. Existing properties are never modified.
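The additive behavior described above can be pictured as a set difference: only names that are not already on the collection are candidates for creation. The sketch below illustrates that idea only. It is not VectorWave's implementation, and the property names are made up for the example.

```python
def properties_to_add(existing: list[str], desired: list[str]) -> list[str]:
    """Return desired property names not yet on the collection, in order."""
    seen = set(existing)
    additions = []
    for name in desired:
        if name not in seen:
            additions.append(name)
            seen.add(name)  # also drops duplicates within `desired`
    return additions


# Hypothetical property names, for illustration only.
existing = ["function_name", "status"]
desired = ["status", "team", "priority"]
print(properties_to_add(existing, desired))  # ['team', 'priority']
```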

5. Verify Installation

from vectorwave import vectorize, initialize_database

initialize_database()

@vectorize(auto=True)
def hello(name: str):
    return f"Hello, {name}!"

result = hello("World")
print(result)  # "Hello, World!"

If this runs without errors, VectorWave is ready. To confirm that Weaviate itself is reachable, open http://localhost:8080/v1/meta in a browser; it should return a JSON document that includes the server version.
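The /v1/meta response can also be read from Python, which is handy for checking that the modules listed in ENABLE_MODULES were actually loaded. This sketch assumes the response carries "version" and "modules" keys, as current Weaviate releases do; summarize_meta and fetch_meta are hypothetical helper names.

```python
import json
import urllib.request


def summarize_meta(meta: dict) -> str:
    """Format the version and enabled modules from a /v1/meta response."""
    version = meta.get("version", "unknown")
    modules = sorted(meta.get("modules") or {})
    return f"Weaviate {version}; modules: {', '.join(modules) or 'none'}"


def fetch_meta(base_url: str = "http://localhost:8080") -> dict:
    """Fetch server metadata over REST (requires a running Weaviate instance)."""
    with urllib.request.urlopen(f"{base_url}/v1/meta", timeout=5) as resp:
        return json.load(resp)
```

With the container from step 1 running, print(summarize_meta(fetch_meta())) gives a one-line summary of the server.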

Troubleshooting

Weaviate connection refused

ConnectionError: Connection to Weaviate failed

Make sure Docker is running and Weaviate is accessible:

curl http://localhost:8080/v1/meta

Module not found

ModuleNotFoundError: No module named 'vectorwave'

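Ensure VectorWave is installed in the same Python environment that runs your code. Invoking pip through the interpreter avoids checking a different environment by accident:

python -m pip show vectorwave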
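The same check can be made from inside Python, which shows exactly which interpreter your script is using. A minimal sketch with the standard library follows; install_report is an illustrative name, and it assumes the distribution and the import package are both called vectorwave, as the install and import commands above suggest.

```python
import importlib.util
import sys
from importlib import metadata


def install_report(package: str = "vectorwave") -> dict:
    """Report which interpreter is running and whether `package` is visible to it."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        version = None  # no installed distribution with this name
    return {
        "python": sys.executable,
        "importable": importlib.util.find_spec(package) is not None,
        "version": version,
    }


print(install_report())
```

If "importable" is False, the path under "python" tells you which environment needs the pip install.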

External Resources

GitHub: Cozymori/VectorWave — Source code and issues
Weaviate Documentation — Vector database docs
Weaviate Cloud Console — Managed Weaviate instances
Weaviate Python Client — Python client library

Next Steps

@vectorize Core — Learn the core decorator
Semantic Caching — Enable caching for cost savings
Quick Start — Full end-to-end walkthrough