# Custom Model Support
Deeptrain is designed to be model-agnostic. While it natively supports over 200 popular open-source and private models, you can integrate specialized, proprietary, or fine-tuned models into the ecosystem by implementing a custom model adapter.
This allows your unique models to leverage Deeptrain’s multi-modal data pipeline—including real-time video transcription, image analysis, and localized embedding retrieval.
## The Custom Model Interface
To integrate a custom model, wrap your model's inference logic in a class that inherits from Deeptrain's `BaseModelAdapter`. This ensures compatibility with Deeptrain's multi-modal processing engine.
### Implementation Example
```python
from deeptrain.models import BaseModelAdapter
from deeptrain.registry import register_model

@register_model("my-proprietary-llm")
class MyCustomModel(BaseModelAdapter):
    def __init__(self, api_key, model_endpoint, **kwargs):
        self.api_key = api_key
        self.endpoint = model_endpoint
        self.config = kwargs

    def generate(self, prompt, context=None, multi_modal_data=None):
        """
        Main inference method.

        Inputs:
            prompt (str): The user query or instruction.
            context (list): Retrieved data from Deeptrain's localized embedding DB.
            multi_modal_data (dict): Processed data (e.g., image tags, video transcripts).

        Returns:
            str: The generated response.
        """
        # Logic to call your model API or local weights
        response = self.call_my_model(prompt, context, multi_modal_data)
        return response

    def embed(self, text):
        """
        Optional: Implement if using your model for custom vector embeddings.
        """
        pass
```
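If you implement `embed`, your model can also back the localized embedding database. The sketch below illustrates only the general contract assumed here — text in, fixed-length float vector out. The hash-based vector and the 8-dimension size are purely illustrative stand-ins for a call to a real embedding endpoint, and the exact return type Deeptrain expects should be checked against your deployment:

```python
import hashlib

def embed(text, dim=8):
    """Toy embedding: hash the text into a fixed-length float vector.

    A real implementation would call your model's embedding endpoint.
    The contract sketched here (a fixed-length list of floats) is an
    assumption for illustration, not Deeptrain's documented return type.
    """
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    # Map the first `dim` bytes into floats in [0, 1]
    return [b / 255.0 for b in digest[:dim]]

vector = embed("hello world")
```

Because the output is deterministic per input, the same text always maps to the same vector, which makes the stub handy for unit-testing retrieval plumbing before wiring in a real model.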
## Registering and Using Your Model
Once defined, you can initialize your AI agent by referencing your custom model identifier. Deeptrain handles the routing of multi-modal inputs (like video transcriptions from the Transcribe API) directly into your model's `generate` method.
```python
from deeptrain import Agent

# Initialize the agent with your custom model
agent = Agent(
    model_type="my-proprietary-llm",
    api_key="your_secret_key",
    model_endpoint="https://api.yourdomain.ai/v1"
)

# Deeptrain will automatically process the video and pass
# the multi-modal context to your custom model.
agent.query(
    "Analyze the behavior shown in this video.",
    video_source="https://www.youtube.com/watch?v=example"
)
```
## Handling Multi-modal Inputs
When a custom model is active, Deeptrain's engine pre-processes non-textual data into a structured format before passing it to the `multi_modal_data` parameter.
| Input Type | Data Structure Passed to Model |
| :--- | :--- |
| Images | Dictionary containing OCR text, detected objects, and visual descriptions. |
| Audio/Video | Full transcripts processed via the Transcribe API, including timestamps. |
| Flowcharts | Graph nodes and edge relationships represented as structured JSON. |
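For illustration, here is a hedged sketch of how a `generate` implementation might fold such a payload into a single text prompt before calling the underlying model. The helper name `flatten_multi_modal` and the exact key names (`images`, `video_transcripts`) are assumptions for the example; verify the payload shape your deployment actually receives:

```python
def flatten_multi_modal(prompt, multi_modal_data=None):
    """Fold a pre-processed multi-modal payload into a plain-text prompt.

    The payload shape used here (image descriptions, timestamped video
    transcript segments) mirrors the table above, but the key names are
    illustrative assumptions, not a documented schema.
    """
    parts = [prompt]
    if multi_modal_data:
        for image in multi_modal_data.get("images", []):
            parts.append(f"[Image] {image.get('description', '')}")
        for segment in multi_modal_data.get("video_transcripts", []):
            parts.append(f"[{segment['timestamp']}] {segment['text']}")
    return "\n".join(parts)

payload = {
    "video_transcripts": [
        {"timestamp": "00:12", "text": "A dog jumps over the fence."},
    ],
}
flattened = flatten_multi_modal(
    "Analyze the behavior shown in this video.", payload
)
```

A text-only fallback like this is useful when the wrapped model has no native multi-modal input; models that accept structured inputs can pass the payload through unchanged instead.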
## API Reference: `BaseModelAdapter`
If you are building a custom integration, ensure your adapter adheres to the following input/output specifications:
### `generate(prompt, context, multi_modal_data)`

- `prompt` (`str`): The raw text input from the user.
- `context` (`list[str]`, optional): A list of relevant document snippets retrieved from the localized embedding database.
- `multi_modal_data` (`dict`, optional): A payload containing keys for `images`, `video_transcripts`, or `audio_metadata`, based on the user's input.
- Returns: `str` or `StreamingResponse`.
### `register_model(model_name)`

- Internal Role: A decorator used to map a string identifier to your custom class within the Deeptrain registry.
- Usage: Required to allow the `Agent` class to resolve your model name during instantiation.