VectorWave
Installation
Install VectorWave and set up the Weaviate vector database.
Prerequisites
- Python 3.10 to 3.13
- Docker (to run the Weaviate vector database)
- (Optional) OpenAI API Key for AI auto-documentation, high-performance embedding, and self-healing
1. Start Weaviate
VectorWave uses Weaviate as its vector database. Start it with Docker:
Quick Start (Single Container)
docker run -d \
--name weaviate \
-p 8080:8080 \
-p 50051:50051 \
cr.weaviate.io/semitechnologies/weaviate:latest
For production deployments, consider Weaviate Cloud (WCS) — a fully managed Weaviate instance.
Docker Compose (Recommended)
Create a docker-compose.yml:
version: '3.4'
services:
  weaviate:
    command:
      - --host
      - 0.0.0.0
      - --port
      - '8080'
      - --scheme
      - http
    image: semitechnologies/weaviate:1.26.1
    ports:
      - 8080:8080
      - 50051:50051
    restart: on-failure:0
    environment:
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      DEFAULT_VECTORIZER_MODULE: 'none'
      ENABLE_MODULES: 'text2vec-openai,generative-openai'
      CLUSTER_HOSTNAME: 'node1'
docker-compose up -d
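The container can take a few seconds to accept requests after the command returns. The following is a minimal sketch of a readiness poll, assuming the REST port mapping shown above and Weaviate's `/v1/.well-known/ready` endpoint. The helper name `wait_for_weaviate` is illustrative and is not part of VectorWave.

```python
import http.client
import time
import urllib.request


def wait_for_weaviate(base_url="http://localhost:8080", timeout=30.0, interval=1.0):
    """Poll Weaviate's readiness endpoint until it answers 200 or the timeout expires."""
    deadline = time.monotonic() + timeout
    url = f"{base_url}/v1/.well-known/ready"
    while True:
        try:
            with urllib.request.urlopen(url, timeout=interval) as response:
                if response.status == 200:
                    return True
        except (OSError, http.client.HTTPException):
            # Connection refused, timeout, or an HTTP error status: not ready yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Calling `wait_for_weaviate()` in a setup script before the database initialization step avoids a race with a container that is still starting.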
2. Install VectorWave
pip install vectorwave
3. Environment Variables
Create a .env file in your project root. Below is a complete reference with all available options:
# === Weaviate Connection ===
WEAVIATE_HOST=localhost # Weaviate hostname (default: localhost)
WEAVIATE_PORT=8080 # Weaviate REST API port (default: 8080)
WEAVIATE_GRPC_PORT=50051 # Weaviate gRPC port (default: 50051)
# WEAVIATE_API_KEY=your-wcs-key # Required for Weaviate Cloud (WCS)
# === Vectorizer ===
VECTORIZER=weaviate_module # weaviate_module | openai_client | huggingface | none
# HF_MODEL_NAME=sentence-transformers/all-MiniLM-L6-v2 # Model for huggingface vectorizer
# === OpenAI ===
OPENAI_API_KEY=sk-... # Required for openai_client vectorizer, healing, RAG
# === GitHub (Self-Healing Auto-PR) ===
# GITHUB_TOKEN=ghp_... # GitHub personal access token
# GITHUB_REPO_NAME=org/repo-name # Target repository (format: org/repo)
# GITHUB_BASE_BRANCH=main # Base branch for PRs (default: main)
# === Drift Detection ===
# DRIFT_DETECTION_ENABLED=True # Enable drift monitoring (default: False)
# DRIFT_DISTANCE_THRESHOLD=0.25 # KNN distance threshold (default: 0.25)
# DRIFT_NEIGHBOR_AMOUNT=5 # Number of KNN neighbors (default: 5)
# === Alerting ===
# ALERTER_STRATEGY=webhook # Alert strategy: none | webhook (default: none)
# ALERTER_WEBHOOK_URL=https://discord.com/api/webhooks/...
# ALERTER_MIN_LEVEL=ERROR # Minimum alert level (default: ERROR)
# === Performance Tuning ===
# BATCH_THRESHOLD=20 # Batch flush threshold in objects (default: 20)
# FLUSH_INTERVAL_SECONDS=2.0 # Batch flush interval in seconds (default: 2.0)
# ASYNC_LOGGING=False # Enable async DB logging (default: False)
# === Custom Properties & Error Handling ===
# CUSTOM_PROPERTIES_FILE_PATH=.weaviate_properties # Custom schema file (default: .weaviate_properties)
# IGNORE_ERROR_FILE_PATH=.vtwignore # Error suppression file (default: .vtwignore)
# FAILURE_MAPPING_FILE_PATH=.vectorwave_errors.json # Error code mapping file
# === Data Masking ===
# SENSITIVE_FIELD_NAMES=password,api_key,token,secret,auth_token # Fields to mask in logs
# === Global Tags ===
# VECTORWAVE_TAGS_ENVIRONMENT=production # Applied to all executions
# VECTORWAVE_TAGS_REGION=us-east-1
Minimal setup: For most projects, you only need VECTORIZER and optionally OPENAI_API_KEY. All other variables have sensible defaults.
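To make the connection defaults concrete, here is a small sketch that resolves the Weaviate connection variables with the defaults listed in the reference. It is not VectorWave's own settings loader. The function name `weaviate_settings` and the `http://` scheme in `rest_url` are assumptions for a local Docker setup.

```python
import os


def weaviate_settings(env=None):
    """Resolve the Weaviate connection variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    host = env.get("WEAVIATE_HOST", "localhost")
    port = int(env.get("WEAVIATE_PORT", "8080"))
    grpc_port = int(env.get("WEAVIATE_GRPC_PORT", "50051"))
    return {
        "host": host,
        "port": port,
        "grpc_port": grpc_port,
        "api_key": env.get("WEAVIATE_API_KEY"),  # None unless set (Weaviate Cloud)
        "rest_url": f"http://{host}:{port}",  # assumed scheme for a local container
    }
```

With an empty environment, `weaviate_settings({})` resolves to `localhost`, REST port 8080, and gRPC port 50051, which matches the Docker setup in step 1.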
Vectorizer Options
| Vectorizer | Cost | Quality | Speed | Setup |
|---|---|---|---|---|
| weaviate_module | Depends on module | Excellent | Server-side | Weaviate module configured (default) |
| openai_client | $0.0001/1K tokens | Excellent | API call | Requires OPENAI_API_KEY |
| huggingface | Free | Good | Local CPU/GPU | No API key needed |
| none | Free | n/a | n/a | No vectorization |
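The table implies two checks that can be run before starting an application: the VECTORIZER value must be one of the four listed options, and openai_client needs OPENAI_API_KEY. A hedged sketch of that pre-flight check follows. `check_vectorizer` is a hypothetical helper, not a VectorWave function, and it treats weaviate_module as the default as the table indicates.

```python
VECTORIZERS = {"weaviate_module", "openai_client", "huggingface", "none"}


def check_vectorizer(env):
    """Return a list of configuration problems for the VECTORIZER setting (empty if none)."""
    problems = []
    choice = env.get("VECTORIZER", "weaviate_module")
    if choice not in VECTORIZERS:
        problems.append(
            f"unknown VECTORIZER {choice!r}; expected one of {sorted(VECTORIZERS)}"
        )
    if choice == "openai_client" and not env.get("OPENAI_API_KEY"):
        problems.append("VECTORIZER=openai_client requires OPENAI_API_KEY")
    return problems
```

Passing `os.environ` (after the .env file has been loaded) and failing fast on a non-empty result surfaces a misconfiguration before the first vectorized call.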
4. Initialize Database
Before first use, initialize the Weaviate collections:
from vectorwave import initialize_database
initialize_database()
This creates four collections in Weaviate:
- VectorWaveFunctions: Stores function metadata (source code, descriptions, module paths)
- VectorWaveExecutions: Stores execution logs (inputs, outputs, errors, embeddings)
- VectorWaveGoldenDataset: Stores verified high-quality executions for cache priority and drift detection
- VectorWaveTokenUsage: Tracks LLM token usage for cost monitoring
Schema Migration
If you add new custom properties via .weaviate_properties or upgrade VectorWave, update the schema:
from vectorwave import update_database_schema
update_database_schema()
This adds new properties to existing collections without data loss. Existing properties are never modified.
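The additive behavior described above can be pictured as a simple set difference. The sketch below is only an illustration of that rule, not VectorWave's migration code. The function name and the property names and types in the usage note are hypothetical.

```python
def plan_additive_migration(existing, desired):
    """Split desired properties into ones to add and ones left untouched.

    Both arguments map property name -> data type.
    """
    # Only names missing from the collection are scheduled for creation.
    to_add = {name: dtype for name, dtype in desired.items() if name not in existing}
    # Names already present keep their current definition, even if `desired` differs.
    untouched = {name: existing[name] for name in desired if name in existing}
    return to_add, untouched
```

For example, with `existing = {"status": "text"}` and `desired = {"status": "int", "team": "text"}`, only `team` is planned for addition and `status` keeps its existing `text` type.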
5. Verify Installation
from vectorwave import vectorize, initialize_database

initialize_database()

@vectorize(auto=True)
def hello(name: str):
    return f"Hello, {name}!"

result = hello("World")
print(result)  # "Hello, World!"
If this runs without errors, VectorWave is ready. To confirm the Weaviate connection separately, open http://localhost:8080/v1/meta, which returns the server version and enabled modules as JSON.
Troubleshooting
Weaviate connection refused
ConnectionError: Connection to Weaviate failed
Make sure Docker is running and Weaviate is accessible:
curl http://localhost:8080/v1/meta
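If curl is not available, or to check the gRPC port as well, a plain TCP probe can show which published port is unreachable. This is a sketch using only the standard library. The helper names are illustrative, and the ports are the ones mapped in step 1.

```python
import socket


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out.
        return False


def check_weaviate_ports(host="localhost", rest_port=8080, grpc_port=50051):
    """Report reachability of the REST and gRPC ports published by the container."""
    return {"rest": port_open(host, rest_port), "grpc": port_open(host, grpc_port)}
```

A result where `rest` is reachable but `grpc` is not usually points to a missing `-p 50051:50051` mapping rather than a stopped container.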
Module not found
ModuleNotFoundError: No module named 'vectorwave'
Ensure you're using the correct Python environment:
pip show vectorwave
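A common cause is that pip installed the package into a different interpreter than the one running your code. The sketch below reports which interpreter is active and whether it can see the package. `package_status` is an illustrative helper built on the standard library, and it assumes the import name and distribution name are both `vectorwave`, as in the commands above.

```python
import importlib.metadata
import importlib.util
import sys


def package_status(name="vectorwave"):
    """Describe whether `name` is importable from the interpreter running this code."""
    spec = importlib.util.find_spec(name)
    try:
        version = importlib.metadata.version(name)
    except importlib.metadata.PackageNotFoundError:
        version = None
    return {
        "interpreter": sys.executable,
        "importable": spec is not None,
        "version": version,
        "location": spec.origin if spec is not None else None,
    }
```

If `importable` is false, installing with `python -m pip install vectorwave` using the interpreter path shown in the result targets the correct environment.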
External Resources
- GitHub: Cozymori/VectorWave — Source code and issues
- Weaviate Documentation — Vector database docs
- Weaviate Cloud Console — Managed Weaviate instances
- Weaviate Python Client — Python client library
Next Steps
- @vectorize Core — Learn the core decorator
- Semantic Caching — Enable caching for cost savings
- Quick Start — Full end-to-end walkthrough