# Custom Model Support
Deeptrain is designed to be model-agnostic. While it natively supports over 200 popular open-source and private models, you can integrate specialized, proprietary, or fine-tuned models into the ecosystem by implementing a custom model adapter.
This allows your unique models to leverage Deeptrain’s multi-modal data pipeline—including real-time video transcription, image analysis, and localized embedding retrieval.
## The Custom Model Interface
To integrate a custom model, wrap your model's inference logic in a class that inherits from Deeptrain's `BaseModelAdapter`. This ensures compatibility with Deeptrain's multi-modal processing engine.
### Implementation Example
```python
from deeptrain.models import BaseModelAdapter
from deeptrain.registry import register_model

@register_model("my-proprietary-llm")
class MyCustomModel(BaseModelAdapter):
    def __init__(self, api_key, model_endpoint, **kwargs):
        self.api_key = api_key
        self.endpoint = model_endpoint
        self.config = kwargs

    def generate(self, prompt, context=None, multi_modal_data=None):
        """
        Main inference method.

        Inputs:
            prompt (str): The user query or instruction.
            context (list): Retrieved data from Deeptrain's localized embedding DB.
            multi_modal_data (dict): Processed data (e.g., image tags, video transcripts).

        Returns:
            str: The generated response.
        """
        # Logic to call your model API or local weights
        response = self.call_my_model(prompt, context, multi_modal_data)
        return response

    def embed(self, text):
        """
        Optional: Implement if using your model for custom vector embeddings.
        """
        pass
```
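If you implement `embed`, your model can also back the localized embedding database. The sketch below illustrates only the general contract assumed here — text in, fixed-length float vector out. The hash-based vector and the 8-dimension size are purely illustrative stand-ins for a call to a real embedding endpoint, and the exact return type Deeptrain expects should be checked against your deployment:

```python
import hashlib

def embed(text, dim=8):
    """Toy embedding: hash the text into a fixed-length float vector.

    A real implementation would call your model's embedding endpoint.
    The contract sketched here (a fixed-length list of floats) is an
    assumption for illustration, not Deeptrain's documented return type.
    """
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    # Map the first `dim` bytes into floats in [0, 1]
    return [b / 255.0 for b in digest[:dim]]

vector = embed("hello world")
```

Because the output is deterministic per input, the same text always maps to the same vector, which makes the stub handy for unit-testing retrieval plumbing before wiring in a real model.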
## Registering and Using Your Model
Once defined, you can initialize your AI agent by referencing your custom model identifier. Deeptrain handles the routing of multi-modal inputs (like video transcriptions from the Transcribe API) directly into your model's `generate` method.
```python
from deeptrain import Agent

# Initialize the agent with your custom model
agent = Agent(
    model_type="my-proprietary-llm",
    api_key="your_secret_key",
    model_endpoint="https://api.yourdomain.ai/v1"
)

# Deeptrain will automatically process the video and pass
# the multi-modal context to your custom model.
agent.query(
    "Analyze the behavior shown in this video.",
    video_source="https://www.youtube.com/watch?v=example"
)
```
## Handling Multi-modal Inputs
When a custom model is active, Deeptrain's engine pre-processes non-textual data into a structured format before passing it to the `multi_modal_data` parameter.
| Input Type | Data Structure Passed to Model |
| :--- | :--- |
| Images | Dictionary containing OCR text, detected objects, and visual descriptions. |
| Audio/Video | Full transcripts processed via the Transcribe API, including timestamps. |
| Flowcharts | Graph nodes and edge relationships represented as structured JSON. |
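For illustration, here is a hedged sketch of how a `generate` implementation might fold such a payload into a single text prompt before calling the underlying model. The helper name `flatten_multi_modal` and the exact key names (`images`, `video_transcripts`) are assumptions for the example; verify the payload shape your deployment actually receives:

```python
def flatten_multi_modal(prompt, multi_modal_data=None):
    """Fold a pre-processed multi-modal payload into a plain-text prompt.

    The payload shape used here (image descriptions, timestamped video
    transcript segments) mirrors the table above, but the key names are
    illustrative assumptions, not a documented schema.
    """
    parts = [prompt]
    if multi_modal_data:
        for image in multi_modal_data.get("images", []):
            parts.append(f"[Image] {image.get('description', '')}")
        for segment in multi_modal_data.get("video_transcripts", []):
            parts.append(f"[{segment['timestamp']}] {segment['text']}")
    return "\n".join(parts)

payload = {
    "video_transcripts": [
        {"timestamp": "00:12", "text": "A dog jumps over the fence."},
    ],
}
flattened = flatten_multi_modal(
    "Analyze the behavior shown in this video.", payload
)
```

A text-only fallback like this is useful when the wrapped model has no native multi-modal input; models that accept structured inputs can pass the payload through unchanged instead.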
## API Reference: `BaseModelAdapter`
If you are building a custom integration, ensure your adapter adheres to the following input/output specifications:
### `generate(prompt, context, multi_modal_data)`

- `prompt` (`str`): The raw text input from the user.
- `context` (`list[str]`, optional): A list of relevant document snippets retrieved from the localized embedding database.
- `multi_modal_data` (`dict`, optional): A payload containing keys for `images`, `video_transcripts`, or `audio_metadata`, based on the user's input.
- Returns: `str` or `StreamingResponse`.
### `register_model(model_name)`

- Internal Role: A decorator used to map a string identifier to your custom class within the Deeptrain registry.
- Usage: Required to allow the `Agent` class to resolve your model name during instantiation.