Security & Privacy
Deeptrain is designed with a "Privacy-First" architecture, ensuring that your data remains under your control while leveraging powerful multi-modal processing. This section outlines how to manage sensitive information, secure your API credentials, and utilize localized storage for maximum data sovereignty.
API Key Management
To interact with supported large language models (LLMs) and Deeptrain’s internal features like the Transcribe API, you must provide valid API keys.
Best Practices:
- Environment Variables: Never hardcode API keys directly into your scripts. Use a `.env` file or environment variables.
- Least Privilege: Ensure your provider keys (e.g., OpenAI, Anthropic, or custom providers) have only the permissions necessary for the specific task.
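A simple way to enforce the environment-variable practice is to fail fast at startup when a required key is missing, rather than discovering it mid-request. The sketch below uses only the Python standard library; the helper name `require_env` and the placeholder value are illustrative, not part of the Deeptrain API:

```python
import os

def require_env(name: str) -> str:
    """Return the value of an environment variable, raising immediately if unset."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value

# For demonstration only -- in practice the key comes from your shell or .env file.
os.environ.setdefault("DEEPTRAIN_API_KEY", "example_key")
api_key = require_env("DEEPTRAIN_API_KEY")
```

Calling `require_env` for every key at application startup surfaces configuration problems before any request is made.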
Configuration Example
When initializing Deeptrain, ensure your environment is configured to read from a secure source:
```python
import os
from deeptrain import Deeptrain

# Set your keys in the environment before running the application
os.environ["DEEPTRAIN_API_KEY"] = "your_deeptrain_key"
os.environ["PROVIDER_API_KEY"] = "your_llm_provider_key"

# Initialize with security in mind
dt = Deeptrain(
    api_key=os.getenv("DEEPTRAIN_API_KEY"),
    model_provider="openai"
)
```
Localized Data & Storage
Deeptrain supports localized embedding databases. This allows you to store vector representations of your private documents, images, and audio files on your own infrastructure rather than on external cloud servers.
- Real-time Retrieval: When using the localized embedding database, Deeptrain fetches content from your live data sources and injects it into the model's context window on the fly.
- Data Residency: Your raw source files (local storage, self-hosted videos) stay within your defined storage parameters.
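To make the retrieval pattern concrete, here is a minimal in-memory sketch of a local vector store with cosine-similarity lookup. This is purely illustrative of how localized embedding retrieval works conceptually; it is not Deeptrain's actual storage format or API, and the class and method names are assumptions:

```python
import math

class LocalVectorStore:
    """Illustrative local store: vectors and documents never leave this process."""

    def __init__(self):
        self._entries = []  # list of (vector, document) pairs

    def add(self, vector, document):
        self._entries.append((vector, document))

    def query(self, vector, top_k=1):
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
            return dot / norm if norm else 0.0
        # Rank stored documents by similarity to the query vector.
        ranked = sorted(self._entries, key=lambda e: cosine(vector, e[0]), reverse=True)
        return [doc for _, doc in ranked[:top_k]]

store = LocalVectorStore()
store.add([1.0, 0.0], "quarterly report")
store.add([0.0, 1.0], "meeting notes")
results = store.query([0.9, 0.1])  # nearest document, retrieved entirely on-premises
```

The key property is that both the vectors and the source documents stay on infrastructure you control; only the retrieved text is later injected into a model's context window.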
Transcribe API Security
The Transcribe API processes video and audio inputs to generate text for AI training.
| Feature | Data Handling | Privacy Note |
| :--- | :--- | :--- |
| Local Video | Processed directly from your filesystem. | No third-party platform access required. |
| Platform Video | Metadata and audio streams are fetched from YouTube/Vimeo. | Subject to platform-specific privacy terms. |
| Transcription | Audio data is converted to text via the Transcribe API. | Text is returned to your application for local context injection. |
Example: Processing a Local Video Privately
```python
# Processing a video file from a secure local directory
transcription_result = dt.transcribe(
    source_type="local",
    file_path="/path/to/private/meeting_recording.mp4",
    output_format="text"
)

# The result is stored in memory for use in your LLM context
print(transcription_result["text"])
```
Model Agnosticism and Private LLMs
Deeptrain supports over 200 models, including private and open-source models that can be hosted on-premises (e.g., via Ollama or vLLM). By using a private model with Deeptrain, you ensure that:
- Prompt Data never leaves your internal network.
- Multi-modal inputs (images, audio) are processed by your self-hosted inference engine.
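One way to sanity-check the "never leaves your internal network" guarantee is to verify at startup that your inference endpoint resolves to a private or loopback address. The sketch below uses only the Python standard library; the endpoint URL reflects Ollama's default local port (11434) and is an assumption about your setup, not a Deeptrain configuration value:

```python
import ipaddress
import socket
from urllib.parse import urlparse

def is_internal_endpoint(url: str) -> bool:
    """Check whether an inference endpoint resolves to a private or loopback address."""
    host = urlparse(url).hostname
    addr = ipaddress.ip_address(socket.gethostbyname(host))
    return addr.is_private or addr.is_loopback

# Ollama's default local endpoint (assumed deployment):
ok = is_internal_endpoint("http://localhost:11434")
```

Running a check like this before sending any prompt data guards against a misconfigured endpoint silently routing traffic to an external host.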
Data Transmission
When interacting with external LLM providers through Deeptrain:
- Only the necessary context (retrieved embeddings and user queries) is transmitted.
- Connections are secured via standard TLS/SSL protocols.
- Deeptrain does not store your model inputs/outputs unless you explicitly configure a local logging or database connector.
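To illustrate the first point, the sketch below assembles a request payload containing only the user query and the locally retrieved context text. This is a conceptual sketch of the minimization principle, not Deeptrain's actual wire format; the function and field names are assumptions:

```python
import json

def build_request_payload(query: str, retrieved_chunks: list[str]) -> str:
    """Assemble only what a provider needs: the query and retrieved context.
    Raw source files (videos, audio, documents) never enter the payload."""
    payload = {
        "query": query,
        "context": retrieved_chunks,  # embeddings already resolved to text locally
    }
    return json.dumps(payload)

payload = build_request_payload(
    "Summarize the meeting.",
    ["Key decision: ship v2 next sprint."],
)
```

Keeping the payload this narrow, combined with TLS on the connection itself, limits exposure to exactly the text the model must see.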