Model-Agnostic Integration
Deeptrain is designed to be model-agnostic, acting as a bridge between your multi-modal data and more than 200 private and open-source Large Language Models (LLMs). This architecture lets you swap the underlying model without reconfiguring your entire data pipeline.
Connecting to Language Models
To integrate Deeptrain with your preferred model, you configure a connector through the primary configuration interface. The system supports major providers, including OpenAI, Anthropic, and Google, as well as open-source models hosted on Hugging Face or in self-hosted environments (e.g., Ollama, vLLM).
Basic Initialization
The following example demonstrates how to initialize Deeptrain with a specific model provider:
```python
from deeptrain import DeeptrainConnector

# Initialize the connector for a specific model
connector = DeeptrainConnector(
    provider="openai",      # e.g., 'anthropic', 'huggingface', 'ollama'
    model="gpt-4-vision",   # The specific model identifier
    api_key="your_api_key"  # Your provider API key
)

# Connect multi-modal data (e.g., a video) to the model
response = connector.process(
    source="https://www.youtube.com/watch?v=example",
    prompt="Analyze the sequence of events in this video."
)

print(response.text)
```
Configuration Parameters
When initializing the connection, the following parameters define the interface between Deeptrain and the target LLM:
| Parameter | Type | Description |
| :--- | :--- | :--- |
| `provider` | string | The hosting provider or framework (e.g., `openai`, `anthropic`, `local`). |
| `model` | string | The specific model ID (e.g., `claude-3-opus`, `llama-3`). |
| `api_key` | string | Required for private providers; can be omitted for local models. |
| `base_url` | string | (Optional) The endpoint for self-hosted or proxy models. |
| `config` | dict | (Optional) Additional model-specific parameters such as `temperature` or `top_p`. |
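As an illustration, the optional `config` dictionary passes model-specific sampling parameters through to the target model. The values shown below are examples, not recommended defaults:

```python
from deeptrain import DeeptrainConnector

# The sampling values below are illustrative, not defaults.
connector = DeeptrainConnector(
    provider="anthropic",
    model="claude-3-opus",
    api_key="your_api_key",
    config={"temperature": 0.2, "top_p": 0.9},  # forwarded to the model
)
```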
Supported Model Categories
Deeptrain’s integration layer is categorized into three main implementation paths:
1. Private Managed Models
Direct integration for API-based models. Deeptrain handles the formatting of multi-modal inputs (like image embeddings or video transcriptions) into the specific format required by the provider's API.
- Supported: OpenAI (GPT-4o), Anthropic (Claude 3.5), Google (Gemini 1.5 Pro).
2. Open-Source & Self-Hosted
For developers running models locally or on private clouds. By specifying a `base_url`, Deeptrain can communicate with any OpenAI-compatible server.
- Supported: Llama 3, Mistral, Mixtral, Phi-3.
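For example, a connector can target a local Ollama instance through its OpenAI-compatible endpoint by setting `base_url`. The port shown is Ollama's default; adjust it for your deployment:

```python
from deeptrain import DeeptrainConnector

# No api_key is needed for a local model; base_url points Deeptrain
# at the OpenAI-compatible server.
connector = DeeptrainConnector(
    provider="ollama",
    model="llama-3",
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
)
```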
3. Non-Vision Models (Vision Enhancement)
One of Deeptrain's core strengths is enabling vision capabilities for models that do not natively support image or video processing. Deeptrain processes the visual data into a format (such as detailed spatial descriptions or temporal transcriptions) that a standard text-based LLM can interpret.
Handling Multi-modal Inputs
When using Deeptrain's model-agnostic layer, the input types are automatically handled based on the target model's capabilities:
```python
# Example: Sending an image to a non-vision model.
# Deeptrain will automatically process the image into a context-rich
# description before sending it to the text-only model.
connector = DeeptrainConnector(provider="local", model="llama-3-8b")

result = connector.process(
    image_path="./diagram.png",
    prompt="Explain this flowchart step by step."
)
```
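Conceptually, this vision-enhancement path turns the image into text and splices it into the prompt for the text-only model. A minimal sketch of that pattern, assuming a simple description-plus-question format (Deeptrain's internal prompt template is not documented here):

```python
def build_text_prompt(description: str, user_prompt: str) -> str:
    """Combine a generated visual description with the user's question
    so a text-only model can answer it. The exact format Deeptrain
    uses internally is an assumption; this is illustrative only."""
    return (
        "The following is a detailed description of an image:\n"
        f"{description}\n\n"
        f"Question: {user_prompt}"
    )

prompt = build_text_prompt(
    "A flowchart with three boxes connected by arrows: Start -> Process -> End.",
    "Explain this flowchart step by step.",
)
```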
Custom Model Integration (Internal Interface)
While Deeptrain supports 200+ models out of the box, it also provides an internal `BaseModelAdapter` class.
- Note: This is considered an internal component. It maps custom model response schemas to Deeptrain's standard output. If you are using a proprietary model with a unique API structure, you can extend this adapter to ensure compatibility with Deeptrain's multi-modal data retrieval.
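As an illustration of the kind of mapping such an adapter performs, the sketch below converts a hypothetical proprietary response schema into a standard text-plus-usage shape. The class name, method name, and field names here are all assumptions for illustration; a real adapter would extend the internal `BaseModelAdapter` rather than stand alone:

```python
class CustomModelAdapter:
    """Stand-in sketch of a custom adapter. In practice this would
    extend deeptrain's internal BaseModelAdapter; that interface is
    internal, so a self-contained example is shown instead."""

    def parse_response(self, raw: dict) -> dict:
        # Hypothetical proprietary schema:
        #   {"output": {"content": "...", "tokens_used": N}}
        output = raw.get("output", {})
        return {
            "text": output.get("content", ""),
            "usage": {"total_tokens": output.get("tokens_used", 0)},
        }

adapter = CustomModelAdapter()
standard = adapter.parse_response(
    {"output": {"content": "A flowchart with three steps.", "tokens_used": 42}}
)
```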