Connector Endpoints
Connector Overview
Deeptrain Connectors act as the bridge between raw multi-modal data sources and your AI models. These endpoints allow you to ingest, process, and retrieve data in various formats—text, images, audio, and video—converting it into representations compatible with 200+ supported LLMs.
Transcribe API (Audio & Video)
The Transcribe API is the primary interface for processing temporal media. It handles extraction and transcription from local files or third-party hosting platforms.
Endpoint: /v1/transcribe
Method: POST
Description: Accepts a video or audio source and returns a structured transcription optimized for LLM context windows.
Request Parameters
| Parameter | Type | Required | Description |
| :--- | :--- | :--- | :--- |
| source_url | string | Yes* | The URL of the video (YouTube, Vimeo, or self-hosted). |
| file | binary | Yes* | Local file upload (mp4, mp3, wav, etc.) if source_url is not provided. |
| mode | string | No | Processing mode: fast, accurate, or multi-dimensional. |
| chunk_size | integer | No | Defines the segment length for transcription (default: 1000 tokens). |
Example Usage
```python
import requests

url = "https://api.deeptrain.ai/v1/transcribe"
payload = {
    "source_url": "https://www.youtube.com/watch?v=example",
    "mode": "multi-dimensional"
}
headers = {"Authorization": "Bearer YOUR_API_KEY"}

response = requests.post(url, json=payload, headers=headers)
print(response.json())
```
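When source_url is omitted, the file parameter from the table above carries a local upload instead. The sketch below assumes the endpoint accepts a standard multipart/form-data request with field names matching the parameter table; this wire format is an assumption, not confirmed by the docs.

```python
import requests

API_KEY = "YOUR_API_KEY"
URL = "https://api.deeptrain.ai/v1/transcribe"

def transcribe_file(path, mode="accurate", chunk_size=1000):
    """Upload a local media file (mp4, mp3, wav, ...) for transcription.

    Assumption: the multipart field names mirror the documented
    request parameters (file, mode, chunk_size).
    """
    with open(path, "rb") as fh:
        resp = requests.post(
            URL,
            files={"file": fh},
            data={"mode": mode, "chunk_size": chunk_size},
            headers={"Authorization": f"Bearer {API_KEY}"},
        )
    resp.raise_for_status()
    return resp.json()

# transcript = transcribe_file("meeting.mp3", mode="fast")
```

Because source_url and file are mutually exclusive, send exactly one of the two per request.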
Vision & Diagram Connectors
These endpoints enable non-vision-supported models to interpret visual data such as flowcharts, graphs, and standard images by converting visual features into descriptive embeddings or structured text.
Endpoint: /v1/vision/process
Method: POST
Description: Analyzes images or diagrams to provide a textual or vector-based representation.
Request Body
- image: Base64 encoded string or image URL.
- type: One of general, flowchart, graph, or diagram.
Response Structure
```json
{
  "status": "success",
  "data": {
    "description": "A flowchart showing a logical sequence for a login system...",
    "nodes": [...],
    "edges": [...],
    "embedding_ref": "uuid-v4-reference"
  }
}
```
Real-Time Text & Embedding Connectors
To bypass context window limitations, Deeptrain utilizes a localized embedding database. This allows models to "query" live data sources in real-time.
Endpoint: /v1/connect/text
Method: POST
Description: Connects a live text source (RSS feed, live document, or web scraper) to your model's retrieval context.
Configuration Example
```javascript
const deeptrain = require('deeptrain-sdk');

const connector = deeptrain.Connector({
  type: 'live-text',
  source: 'https://api.live-news-feed.com/v1/updates',
  refreshInterval: '5m',
  embeddingModel: 'localized-vector-01'
});

// Connect the data stream to an LLM instance
connector.pipe(myAiAgent);
```
Supported Input Types & Specifications
| Connector Type | Supported Formats | Native Support |
| :--- | :--- | :--- |
| Video | MP4, AVI, MOV, YouTube URL, Vimeo URL | Yes |
| Audio | MP3, WAV, FLAC, AAC | Yes |
| Visual | PNG, JPEG, WEBP, SVG (Flowcharts/Graphs) | Yes |
| Data/Text | PDF, TXT, HTML, Live JSON Streams | Yes |
Error Handling
Deeptrain connectors return standard HTTP status codes. When a connector fails (e.g., a YouTube video is private or a file format is unsupported), the response follows this schema:
```json
{
  "error": {
    "code": 422,
    "message": "Unprocessable Entity: Video source is restricted or unreachable.",
    "type": "SourceAccessError"
  }
}
```
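Client code can branch on that schema directly. The helper below is a minimal sketch; the set of codes treated as transient (429 and 5xx) is our assumption about typical HTTP semantics, not part of the documented API contract:

```python
def classify_connector_error(body):
    """Map a Deeptrain connector response to a coarse retry decision.

    Assumption: 429/5xx errors are transient; anything else (e.g. the
    422 SourceAccessError above) will not succeed on retry.
    """
    err = body.get("error")
    if err is None:
        return "ok"
    return "retry" if err["code"] in (429, 500, 502, 503) else "fail"

resp = {
    "error": {
        "code": 422,
        "message": "Unprocessable Entity: Video source is restricted or unreachable.",
        "type": "SourceAccessError",
    }
}
print(classify_connector_error(resp))  # → fail
```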
Note: For high-volume processing, it is recommended to use the asynchronous webhook callback by providing a callback_url in your request headers to receive processing results once transcription or vision analysis is complete.
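As a sketch of that asynchronous pattern: the note says the callback travels in the request headers, so we use a header literally named callback_url; the exact header name and the immediate-return behavior are assumptions for illustration.

```python
# Assumption: the callback URL is passed as a header named "callback_url",
# alongside the usual bearer-token auth.
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "callback_url": "https://example.com/webhooks/deeptrain",
}
payload = {
    "source_url": "https://www.youtube.com/watch?v=example",
    "mode": "fast",
}

# requests.post("https://api.deeptrain.ai/v1/transcribe",
#               json=payload, headers=headers)
# The POST returns promptly; the finished transcription is later
# delivered to the callback URL instead of the HTTP response body.
```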