digitalocean/llama-index-retrievers-digitalocean-gradientai
Description: LlamaIndex retriever integration for DigitalOcean Gradient Knowledge Base
Language: Python
License: MIT
Stars: 0
Forks: 0
Open issues: 0
Created: 2026-01-27T10:50:44Z
Pushed: 2026-01-29T11:49:31Z
Default branch: main
Fork: no
Archived: no
README:
LlamaIndex Retrievers Integration: DigitalOcean Gradient

Native LlamaIndex retriever integration for DigitalOcean Gradient Knowledge Base as a Service (KBaaS). This package connects Gradient's knowledge base retrieval to the LlamaIndex ecosystem.
Features
- 🔌 Native LlamaIndex Integration - Works seamlessly with RetrieverQueryEngine and other LlamaIndex components
- 📦 Automatic Format Conversion - Converts Gradient KB results to NodeWithScore objects
- 🎯 Preserves Metadata - Maintains document IDs, chunk IDs, sources, and relevance scores
- ⚡ Async Support - Full support for both synchronous and asynchronous retrieval
- 🔄 Simple API - Clean, intuitive interface following LlamaIndex patterns
Installation
pip install llama-index-retrievers-digitalocean-gradientai
Quick Start
Basic Usage
from llama_index.retrievers.digitalocean.gradientai import GradientKBRetriever
# Initialize retriever
retriever = GradientKBRetriever(
    knowledge_base_id="kb-your-uuid-here",
    api_token="your-digitalocean-access-token",  # DIGITALOCEAN_ACCESS_TOKEN
    num_results=5
)
# Direct retrieval
nodes = retriever.retrieve("What is machine learning?")
# Access results
for node in nodes:
    print(f"Score: {node.score}")
    print(f"Content: {node.node.text}")
    print(f"Metadata: {node.node.metadata}")
End-to-End RAG with Gradient LLM
Build a complete RAG pipeline using both the retriever and LLM packages from DigitalOcean Gradient.
Install both packages:
pip install llama-index-retrievers-digitalocean-gradientai llama-index-llms-digitalocean-gradientai
Full example:
from llama_index.retrievers.digitalocean.gradientai import GradientKBRetriever
from llama_index.llms.digitalocean.gradientai import GradientAI
from llama_index.core.query_engine import RetrieverQueryEngine
# Initialize retriever (uses DIGITALOCEAN_ACCESS_TOKEN)
retriever = GradientKBRetriever(
    knowledge_base_id="kb-your-uuid-here",
    api_token="your-digitalocean-access-token",
    num_results=5
)
# Initialize LLM (uses MODEL_ACCESS_KEY)
llm = GradientAI(
    model="llama3.3-70b-instruct",
    model_access_key="your-model-access-key"
)
# Create query engine - retrieves relevant docs and generates a response
query_engine = RetrieverQueryEngine.from_args(
    retriever=retriever,
    llm=llm
)
# Query: retriever fetches context from KB, LLM generates the answer
response = query_engine.query("Explain quantum computing")
print(response)
This gives you a full RAG pipeline where:
1. The retriever searches your Gradient Knowledge Base for relevant documents
2. The LLM uses those documents as context to generate a grounded response
Async Usage
import asyncio
from llama_index.core import QueryBundle
from llama_index.retrievers.digitalocean.gradientai import GradientKBRetriever
async def async_retrieve():
    retriever = GradientKBRetriever(
        knowledge_base_id="kb-your-uuid-here",
        api_token="your-digitalocean-access-token"  # DIGITALOCEAN_ACCESS_TOKEN
    )
    query = QueryBundle(query_str="What are neural networks?")
    nodes = await retriever.aretrieve(query)
    return nodes
nodes = asyncio.run(async_retrieve())
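When several queries need answering at once, the async method can be fanned out with `asyncio.gather`. The sketch below assumes only that the retriever exposes an `aretrieve` coroutine, as in the async example above. `StubRetriever` and `retrieve_many` are hypothetical names, introduced here so the example runs without network access.

```python
import asyncio


class StubRetriever:
    """Hypothetical stand-in exposing the same `aretrieve` coroutine shape."""

    async def aretrieve(self, query):
        await asyncio.sleep(0)  # yield control, as a real network call would
        return [f"result for: {query}"]


async def retrieve_many(retriever, queries):
    """Run one `aretrieve` call per query concurrently, preserving input order."""
    tasks = [retriever.aretrieve(q) for q in queries]
    return await asyncio.gather(*tasks)


results = asyncio.run(retrieve_many(StubRetriever(), ["alpha", "beta"]))
print(results)
```

Passing a real `GradientKBRetriever` instance in place of `StubRetriever()` should work the same way, provided the service tolerates the resulting number of concurrent requests.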
Configuration Options
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| knowledge_base_id | str | *Required* | Gradient Knowledge Base UUID |
| api_token | str | *Required* | DigitalOcean access token (DIGITALOCEAN_ACCESS_TOKEN) |
| num_results | int | 5 | Number of results to retrieve (1-100) |
| alpha | float | None | Hybrid search weight: 0=keyword/BM25, 1=semantic/vector |
| filters | dict | None | Metadata filters (see below) |
| base_url | str | None | Custom API base URL (optional) |
| timeout | float | 60.0 | Request timeout in seconds |
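To keep the token out of source code, the constructor arguments can be assembled from the environment and checked before use. This is a minimal sketch: `build_retriever_kwargs` is a hypothetical helper, not part of the package. The `num_results` bounds come from the table above, and the 0 to 1 range for `alpha` is an assumption based on its description.

```python
import os


def build_retriever_kwargs(knowledge_base_id, num_results=5, alpha=None,
                           timeout=60.0, env=None):
    """Assemble keyword arguments for GradientKBRetriever (hypothetical helper)."""
    env = os.environ if env is None else env
    token = env.get("DIGITALOCEAN_ACCESS_TOKEN")
    if not token:
        raise ValueError("DIGITALOCEAN_ACCESS_TOKEN is not set")
    # Bounds taken from the configuration table.
    if not 1 <= num_results <= 100:
        raise ValueError("num_results must be between 1 and 100")
    # Assumed range: 0 = keyword only, 1 = vector only.
    if alpha is not None and not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    kwargs = {
        "knowledge_base_id": knowledge_base_id,
        "api_token": token,
        "num_results": num_results,
        "timeout": timeout,
    }
    if alpha is not None:
        kwargs["alpha"] = alpha  # omit the key entirely when unset
    return kwargs
```

The result can then be unpacked into the constructor, for example `GradientKBRetriever(**build_retriever_kwargs("kb-your-uuid-here"))`.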
Hybrid Search (alpha)
Control the balance between keyword and semantic search:
# Pure keyword/BM25 search (good for exact matches, technical terms)
retriever = GradientKBRetriever(..., alpha=0.0)
# Balanced hybrid search
retriever = GradientKBRetriever(..., alpha=0.5)
# Pure semantic/vector search (good for conceptual queries)
retriever = GradientKBRetriever(..., alpha=1.0)
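For intuition about what a weight like `alpha` does, here is a sketch of the convex combination commonly used in hybrid search. This README does not document how Gradient actually combines scores, so treat the formula and the `blend_scores` name as illustrative assumptions rather than the service's implementation.

```python
def blend_scores(keyword_score, vector_score, alpha):
    """Convex combination: alpha=0 keeps only keyword, alpha=1 only vector.

    Illustrative only; not confirmed as Gradient's scoring formula.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return (1.0 - alpha) * keyword_score + alpha * vector_score


# Show how the blended score shifts from the keyword score to the vector score.
for alpha in (0.0, 0.5, 1.0):
    print(alpha, blend_scores(keyword_score=0.2, vector_score=0.8, alpha=alpha))
```

Under this assumption, a document that matches an exact term but is semantically distant ranks higher as `alpha` approaches 0, and lower as it approaches 1.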
Metadata Filtering
Filter results based on document metadata:
# Only retrieve from documents with source="docs"
retriever = GradientKBRetriever(
    ...,
    filters={
        "must": [{"key": "source", "operator": "eq", "value": "docs"}]
    }
)
# Exclude certain document types
retriever = GradientKBRetriever(
    ...,
    filters={
        "must_not": [{"key": "type", "operator": "eq", "value": "draft"}]
    }
)
Supported filter operators: eq, ne, gt, gte, lt, lte, in, not_in, contains
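One way to reason about a filter before sending it is to evaluate it locally against a sample metadata dict. The sketch below is an assumed reading of the `must` / `must_not` structure and of each operator name. The actual filtering happens on the Gradient side and may differ in edge cases such as missing keys. `OPERATORS`, `condition_holds`, and `matches` are hypothetical names.

```python
import operator

# Assumed meaning of each operator name listed above.
OPERATORS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
    "in": lambda actual, expected: actual in expected,
    "not_in": lambda actual, expected: actual not in expected,
    "contains": lambda actual, expected: expected in actual,
}


def condition_holds(metadata, condition):
    """Check one {"key", "operator", "value"} condition against a metadata dict."""
    if condition["key"] not in metadata:
        return False  # assumption: a missing key never satisfies a condition
    check = OPERATORS[condition["operator"]]
    return check(metadata[condition["key"]], condition["value"])


def matches(metadata, filters):
    """True if every `must` condition holds and no `must_not` condition holds."""
    must = filters.get("must", [])
    must_not = filters.get("must_not", [])
    return all(condition_holds(metadata, c) for c in must) and not any(
        condition_holds(metadata, c) for c in must_not
    )


sample = {"source": "docs", "type": "draft"}
print(matches(sample, {"must": [{"key": "source", "operator": "eq", "value": "docs"}]}))
print(matches(sample, {"must_not": [{"key": "type", "operator": "eq", "value": "draft"}]}))
```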
Why Use This Instead of Manual SDK Calls?
Before (Manual SDK Integration):
# ❌ Manual approach - lots of boilerplate
response = gradient_client.retrieve.documents(
    knowledge_base_id=kb_id,
    num_results=5,
    query=query
)
# Extract text manually
docs = [result.text_content for result in response.results if hasattr(result, 'text_content')]
# ❌ Loses scores, metadata, and can't use with LlamaIndex components
After (Native Retriever):
# ✅ Clean, native integration
retriever = GradientKBRetriever(knowledge_base_id=kb_id, api_token=token)
nodes = retriever.retrieve(query)
# ✅ Full NodeWithScore objects with metadata and scores
# ✅ Works with all LlamaIndex retrieval patterns
# ✅ Supports re-ranking, filtering, composition
What Gets Preserved
The retriever automatically captures and preserves:
- Text Content - The retrieved document/chunk text
- Relevance Score - Similarity/relevance score from Gradient
- Document ID - Source document identifier
- Chunk ID - Specific chunk identifier
- Source - Document source/origin
- Custom Metadata - Any additional metadata from Gradient
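The preserved fields can be flattened into plain dicts for logging or display. The sketch below relies only on the `.score`, `.node.text`, and `.node.metadata` attributes shown in Basic Usage. The metadata key names (`document_id`, `chunk_id`, `source`) are assumptions, so inspect `node.node.metadata` on a real result to confirm them. `summarize` is a hypothetical helper, and a `SimpleNamespace` object stands in for a real result.

```python
from types import SimpleNamespace


def summarize(node_with_score):
    """Flatten one result into a plain dict using the attributes from Basic Usage."""
    metadata = dict(node_with_score.node.metadata)
    return {
        "text": node_with_score.node.text,
        "score": node_with_score.score,
        # Key names below are assumptions; check real metadata for the actual ones.
        "document_id": metadata.get("document_id"),
        "chunk_id": metadata.get("chunk_id"),
        "source": metadata.get("source"),
    }


# Stand-in object mimicking the `.score` / `.node.text` / `.node.metadata` shape.
fake = SimpleNamespace(
    score=0.87,
    node=SimpleNamespace(
        text="Example chunk",
        metadata={"source": "docs", "chunk_id": "c1"},
    ),
)
print(summarize(fake))
```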
Advanced Usage
Combining with Other Retrievers
from llama_index.core.retrievers import BaseRetriever class…
Excerpt shown — open the source for the full document.
Notability
Notability 3.0/10: minor integration repo, low traction.