# Connector Registry
The Connector Registry is the central orchestration layer within VMTP (Deeptrain) that manages how multi-modal data is ingested, processed, and served to LLMs. It provides a unified interface for registering new data sources and retrieving processed content from diverse modalities including text, audio, video, and visual diagrams.
## Supported Modalities
The registry categorizes connectors into specific modalities to ensure the correct preprocessing pipelines are applied:
| Modality | Description | Typical Sources |
| :--- | :--- | :--- |
| TEXT | Unstructured or structured text for context injection. | Local files, Live data streams, DBs. |
| IMAGE | Visual data for computer vision tasks. | Local storage, Web URLs. |
| GRAPH | Flowcharts, diagrams, and structural visuals. | Technical documentation, SVG, PNG. |
| AUDIO | Sound files for transcription and analysis. | MP3, WAV, Live streams. |
| VIDEO | Video content for frame analysis and transcription. | YouTube, Vimeo, Local Storage. |
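The modality categories above map naturally to an enumeration that the registry can use to select a preprocessing pipeline. The sketch below mirrors the table; the enum and pipeline names are illustrative assumptions, not Deeptrain's actual internals:

```python
from enum import Enum

class Modality(Enum):
    """Hypothetical enumeration of the modality categories in the table above."""
    TEXT = "text"    # unstructured or structured text
    IMAGE = "image"  # visual data for computer vision tasks
    GRAPH = "graph"  # flowcharts, diagrams, structural visuals
    AUDIO = "audio"  # sound files for transcription and analysis
    VIDEO = "video"  # frame analysis and transcription

# A registry could key preprocessing pipelines off the modality:
PIPELINES = {
    Modality.TEXT:  "embed",
    Modality.IMAGE: "caption",
    Modality.GRAPH: "caption",
    Modality.AUDIO: "transcribe",
    Modality.VIDEO: "transcribe",
}
```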
## Registering a Connector
To integrate a new data source, you must register it with the registry. This allows the VMTP engine to track the source and apply the necessary embeddings or transcription logic.
```python
from deeptrain import ConnectorRegistry, VideoConnector

# Initialize the registry
registry = ConnectorRegistry()

# Register a video source (e.g., YouTube)
youtube_connector = VideoConnector(
    source="https://www.youtube.com/watch?v=example",
    provider="youtube",
    config={"transcribe": True}
)
registry.register("marketing_video_01", youtube_connector)
```
## Retrieving and Processing Content
The registry provides a standardized method to pull data into your AI agent's context window or localized embedding database.
### Using the Transcribe API
For audio and video connectors, the registry interfaces with the Transcribe API to convert raw media into model-interpretable text.
```python
# Process a registered connector through the Transcribe API
content = registry.process("marketing_video_01")
print(content.transcription)
print(content.metadata)
```
### Multi-modal Data Retrieval
The registry allows models to query data that isn't natively supported (e.g., non-vision models accessing images).
```python
# Retrieve visual data for a non-vision model
graph_data = registry.get_context(
    modality="GRAPH",
    query="Show me the system architecture flow"
)

# Pass the result directly to the LLM agent
agent.query(f"Based on this graph context: {graph_data}, describe the bottleneck.")
```
## Connector Configuration
Each registered connector requires a configuration object that defines its behavior:
- `source`: The URI or file path to the raw data.
- `modality`: The type of data (Text, Image, Audio, Video, Graph).
- `sync_interval`: (Optional) For live sources, how often the registry should refresh the data.
- `embedding_model`: (Optional) Specific model to use for generating localized embeddings.
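A configuration object carrying these fields could be modeled as a small dataclass. The field names come from the list above; the validation logic and defaults are illustrative assumptions:

```python
from dataclasses import dataclass
from typing import Optional

VALID_MODALITIES = {"TEXT", "IMAGE", "GRAPH", "AUDIO", "VIDEO"}

@dataclass
class ConnectorConfig:
    """Hypothetical sketch of a connector configuration object."""
    source: str                            # URI or file path to the raw data
    modality: str                          # one of the supported modalities
    sync_interval: Optional[int] = None    # refresh period in seconds, for live sources
    embedding_model: Optional[str] = None  # model used for localized embeddings

    def __post_init__(self):
        if self.modality not in VALID_MODALITIES:
            raise ValueError(f"Unsupported modality: {self.modality}")

cfg = ConnectorConfig(source="./docs/spec.txt", modality="TEXT", sync_interval=300)
```

Both optional fields default to `None`, so a minimal connector only needs a source and a modality.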
## Localized Embedding Database
The registry interacts directly with a localized embedding database. This ensures that even when dealing with massive datasets that exceed the LLM's context window, the registry can perform a similarity search and inject only the most relevant snippets into the prompt.
```python
# Configure the registry to use a localized embedding store
registry.configure_storage(
    storage_type="local",
    db_path="./data/embeddings",
    vector_dim=1536  # match your LLM's embedding dimension
)
```
## Internal Registry Methods
While the public methods above handle most use cases, the registry performs several internal tasks:
- Normalization: Standardizing different video/audio formats before processing.
- Token Management: Ensuring injected text fits within the target model's limits.
- Provider Mapping: Routing requests to the correct 3rd-party API (Vimeo, YouTube, etc.) based on the source URL.
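Provider mapping can be sketched as a hostname lookup on the source URL, with non-URL sources falling back to local storage. The mapping table and fallback behavior below are illustrative assumptions, not Deeptrain's actual routing logic:

```python
from urllib.parse import urlparse

# Hypothetical host-to-provider table
PROVIDER_HOSTS = {
    "www.youtube.com": "youtube",
    "youtube.com": "youtube",
    "youtu.be": "youtube",
    "www.vimeo.com": "vimeo",
    "vimeo.com": "vimeo",
}

def map_provider(source: str) -> str:
    """Route a source to its provider; file paths have no hostname and map to local storage."""
    host = urlparse(source).netloc.lower()
    return PROVIDER_HOSTS.get(host, "local")

print(map_provider("https://www.youtube.com/watch?v=example"))  # → youtube
print(map_provider("./media/keynote.mp4"))                      # → local
```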