# @vectorize Core

The core decorator that powers function vectorization, tracing, and storage.

## The @vectorize Decorator
@vectorize is VectorWave's core primitive. A single decorator transforms any Python function into a vectorized, traced, and searchable unit of execution.
```python
from vectorwave import vectorize

@vectorize(
    semantic_cache=True,
    cache_threshold=0.95,
    capture_return_value=True,
    team="ml-team"  # custom tag
)
async def generate_response(query: str):
    return await llm.complete(query)
```
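The `cache_threshold=0.95` option sets how similar a new input must be to a stored one before the cache answers. Assuming the threshold is a cosine-similarity cutoff over the input embeddings (a conceptual sketch, not VectorWave's internals), the decision looks like this:

```python
# Conceptual sketch: deciding a semantic-cache "hit" with a
# cosine-similarity threshold. Not VectorWave's actual implementation.
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def is_cache_hit(query_vec, cached_vec, threshold=0.95) -> bool:
    # A hit means the new input is "close enough" to a stored one
    return cosine_similarity(query_vec, cached_vec) >= threshold

print(is_cache_hit([1.0, 0.0], [1.0, 0.0]))  # True: identical vectors
print(is_cache_hit([1.0, 0.0], [0.0, 1.0]))  # False: orthogonal vectors
```

A higher threshold means fewer, safer cache hits; a lower one trades accuracy for hit rate.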
## How It Works

When a decorated function is called, VectorWave performs the following:
### 1. Static Registration

On first import, VectorWave captures the function's static metadata and stores it in `VectorWaveFunctions`:

- Function name, module path, and fully qualified name
- Source code (via `inspect.getsource`)
- Docstring
- AI-generated description (if `auto=True`)
- Search description (if provided)
- Custom properties (if defined via `.weaviate_properties`)
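The static metadata above can all be read from the function object itself. The following is an illustrative sketch of that collection step using the standard `inspect` module; the helper name and dict shape are hypothetical, not VectorWave's internals:

```python
# Hypothetical sketch of collecting static function metadata at import
# time, using only the standard library.
import inspect

def collect_static_metadata(fn) -> dict:
    try:
        source = inspect.getsource(fn)  # may be unavailable in a REPL
    except (OSError, TypeError):
        source = None
    return {
        "name": fn.__name__,
        "module": fn.__module__,
        "qualified_name": f"{fn.__module__}.{fn.__qualname__}",
        "source": source,
        "docstring": inspect.getdoc(fn),
    }

def greet(name: str) -> str:
    """Return a greeting for the given name."""
    return f"Hello, {name}!"

meta = collect_static_metadata(greet)
print(meta["name"])       # greet
print(meta["docstring"])  # Return a greeting for the given name.
```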
### 2. Execution Interception

On each call, the decorator:

- Generates a `trace_id` and `span_id` for distributed tracing
- Converts inputs to an embedding vector
- Checks for semantic cache hits (if `semantic_cache=True`)
- If cache hit → returns cached result in ~0.02s
- If cache miss → executes the function normally
- Stores the execution log in `VectorWaveExecutions`
### 3. Error Handling

If the function raises an exception:

- The error is captured with the full stack trace
- The execution is stored in Weaviate with `status: "ERROR"`
- The record is available for self-healing diagnosis and automated fixes
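Steps 2 and 3 can be sketched together as a plain-Python decorator. This is a minimal, hypothetical illustration of the interception flow (the in-memory `EXECUTION_LOG` list stands in for the `VectorWaveExecutions` store; VectorWave's real implementation also handles embeddings and caching):

```python
# Minimal sketch of execution interception and error capture.
# Hypothetical names; not VectorWave's real implementation.
import functools
import time
import traceback
import uuid

EXECUTION_LOG = []  # stands in for the VectorWaveExecutions collection

def intercept(fn):
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        record = {
            "function_name": fn.__name__,
            "trace_id": str(uuid.uuid4()),  # new root trace per call
            "span_id": str(uuid.uuid4()),
        }
        start = time.perf_counter()
        try:
            result = fn(*args, **kwargs)
            record["status"] = "SUCCESS"
            return result
        except Exception:
            record["status"] = "ERROR"
            record["stack_trace"] = traceback.format_exc()
            raise  # the exception still propagates to the caller
        finally:
            record["duration_ms"] = (time.perf_counter() - start) * 1000
            EXECUTION_LOG.append(record)
    return wrapper

@intercept
def divide(a, b):
    return a / b

print(divide(10, 2))  # 5.0
try:
    divide(1, 0)
except ZeroDivisionError:
    pass
print(EXECUTION_LOG[-1]["status"])  # ERROR
```

Note that the exception is re-raised after logging, so callers see normal Python error behavior.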
## Sync and Async Support

VectorWave supports both sync and async functions:

```python
# Sync function
@vectorize(auto=True)
def sync_function(query: str):
    return process(query)

# Async function
@vectorize(auto=True)
async def async_function(query: str):
    return await async_process(query)
```
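A single decorator can serve both kinds of function by checking whether the target is a coroutine function and returning a matching wrapper. The sketch below illustrates that pattern in plain Python (hypothetical helper, not VectorWave's code):

```python
# Sketch: one decorator that wraps both sync and async functions.
import asyncio
import functools
import inspect

def vectorize_like(fn):
    if inspect.iscoroutinefunction(fn):
        @functools.wraps(fn)
        async def async_wrapper(*args, **kwargs):
            # instrumentation (tracing, caching) would go here
            return await fn(*args, **kwargs)
        return async_wrapper

    @functools.wraps(fn)
    def sync_wrapper(*args, **kwargs):
        # same instrumentation, synchronous path
        return fn(*args, **kwargs)
    return sync_wrapper

@vectorize_like
def sync_fn(x):
    return x + 1

@vectorize_like
async def async_fn(x):
    return x + 1

print(sync_fn(1))                # 2
print(asyncio.run(async_fn(1)))  # 2
```

Returning an async wrapper for async targets matters: wrapping a coroutine function in a plain sync wrapper would hand callers an un-awaited coroutine.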
## Distributed Tracing

### Root Span

Every @vectorize call creates a root span with a unique `trace_id`:

```python
@vectorize(auto=True)
def parent_function(data):
    result_a = child_function_a(data)
    result_b = child_function_b(result_a)
    return result_b
```
### Child Spans with @trace_span

Use @trace_span to trace internal functions as child spans:

```python
from vectorwave import vectorize, trace_span

@vectorize(auto=True)
def pipeline(query: str):
    embedding = embed(query)
    result = search(embedding)
    return format_result(result)

@trace_span
def embed(query: str):
    return model.encode(query)

@trace_span
def search(embedding):
    return db.query(embedding, limit=10)

@trace_span
def format_result(result):
    return {"items": result, "count": len(result)}
```
@trace_span also accepts optional parameters for fine-grained control:

```python
@trace_span(capture_return_value=True, attributes_to_capture=["score"])
def rank_results(results, score=0.5):
    return sorted(results, key=lambda x: x["score"], reverse=True)
See the API Reference for all @trace_span parameters.
All child spans share the parent's `trace_id`, creating a hierarchical trace:

```text
pipeline (trace_id: abc-123, span_id: 001)
├── embed (trace_id: abc-123, span_id: 002)
├── search (trace_id: abc-123, span_id: 003)
└── format_result (trace_id: abc-123, span_id: 004)
```
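Reassembling that hierarchy from flat execution records is a simple grouping exercise. In the sketch below, a `parent_span_id` field is assumed for linking child to parent (the document shows only `trace_id` and `span_id`; the field name is illustrative):

```python
# Sketch: rebuilding a trace tree from flat span records.
# The parent_span_id field is an assumed, illustrative linking field.
spans = [
    {"trace_id": "abc-123", "span_id": "001", "parent_span_id": None, "function_name": "pipeline"},
    {"trace_id": "abc-123", "span_id": "002", "parent_span_id": "001", "function_name": "embed"},
    {"trace_id": "abc-123", "span_id": "003", "parent_span_id": "001", "function_name": "search"},
    {"trace_id": "abc-123", "span_id": "004", "parent_span_id": "001", "function_name": "format_result"},
]

def children_of(parent_id, spans):
    # all spans whose parent link points at the given span
    return [s for s in spans if s["parent_span_id"] == parent_id]

root = next(s for s in spans if s["parent_span_id"] is None)
print(root["function_name"])  # pipeline
for child in children_of(root["span_id"], spans):
    print("  -", child["function_name"])
```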
VectorSurfer: Distributed traces are visualized as interactive waterfall diagrams in the VectorSurfer dashboard — click any span to see inputs, outputs, and duration.
### Querying Traces

Trace data can be queried through VectorWave's execution search:

```python
from vectorwave import search_executions

# Get all executions for a specific trace
spans = search_executions(
    filters={"trace_id": "abc-123"},
    sort_by="timestamp_utc",
)

for span in spans:
    print(f"{span['function_name']} — {span['duration_ms']}ms")
```
## Automatic Behavior

Some @vectorize options automatically enable other features:

- `semantic_cache=True` automatically enables `capture_return_value=True`, because cached results need stored return values.
- `replay=True` automatically enables `capture_return_value=True` and captures all function arguments, because replay testing needs full input/output data.

This means you don't need to set `capture_return_value=True` manually when using caching or replay.
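The implication rule above amounts to a small normalization pass over the decorator's options. A hypothetical sketch of that logic (the helper name and dict shape are illustrative, not VectorWave's API):

```python
# Sketch of the option-implication rule: caching and replay both
# force capture_return_value on. Hypothetical helper.
def resolve_options(**opts) -> dict:
    resolved = {"capture_return_value": False, **opts}
    if resolved.get("semantic_cache") or resolved.get("replay"):
        # cached results and replays both need stored return values
        resolved["capture_return_value"] = True
    return resolved

print(resolve_options(semantic_cache=True)["capture_return_value"])  # True
print(resolve_options(replay=True)["capture_return_value"])          # True
print(resolve_options()["capture_return_value"])                     # False
```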
## Execution Status Values

Each execution is stored with one of these status values:

| Status | Description |
|---|---|
| `SUCCESS` | Function completed normally |
| `ERROR` | Exception raised, alert sent (if configured) |
| `FAILURE` | Exception raised, but suppressed via `.vtwignore` |
| `CACHE_HIT` | Semantic cache returned a cached result |
| `ANOMALY` | Drift detected — input is far from known-good patterns |
Use these values to filter executions:

```python
from vectorwave import search_executions

errors = search_executions(filters={"status": "ERROR"}, limit=10)
anomalies = search_executions(filters={"status": "ANOMALY"}, limit=10)
```
## Input Capture Modes

### Manual (Default)

By default, VectorWave stores only execution metadata, not your function's arguments:

```python
@vectorize(auto=True)
def process(query: str, context: dict):
    # Only the function name and execution metadata are stored
    return llm.complete(query, context=context)
```
### Auto-Capture

With `capture_inputs=True`, all function parameters are automatically stored:

```python
@vectorize(capture_inputs=True, auto=True)
def process(query: str, context: dict):
    # Both 'query' and 'context' are stored in Weaviate
    return llm.complete(query, context=context)
```
### Return Value Capture

With `capture_return_value=True`, the function's return value is also stored:

```python
@vectorize(capture_return_value=True, auto=True)
def process(query: str):
    result = llm.complete(query)
    return result  # This value is stored in Weaviate
```
## AI Auto-Documentation

With `auto=True`, VectorWave uses an LLM to automatically generate:

- A human-readable description of the function
- Search-optimized metadata for RAG queries

```python
@vectorize(auto=True)
def calculate_risk_score(portfolio: dict, market_data: dict):
    # VectorWave will auto-generate:
    # "Calculates a risk score for a given investment portfolio
    #  based on current market data conditions."
    ...
```
## Custom Execution Tags

Tag functions with custom metadata for filtering and monitoring. Tags are passed as `**execution_tags` and must be defined in your `.weaviate_properties` file:

```python
@vectorize(team="ml-team", auto=True)  # custom tag
def ml_function():
    ...

@vectorize(team="backend-team", priority="high", auto=True)  # custom tags
def backend_function():
    ...
```
Custom tags are stored in both function metadata and execution logs, enabling filtering across the VectorSurfer dashboard and search APIs. See Advanced Configuration for .weaviate_properties setup.
## Next Steps

- Semantic Caching — Deep dive into caching
- Self-Healing — Automatic error diagnosis and fixes
- API Reference — Complete parameter reference