Security & Privacy
Deeptrain is designed with a "Privacy-First" architecture, ensuring that your data remains under your control while leveraging powerful multi-modal processing. This section outlines how to manage sensitive information, secure your API credentials, and utilize localized storage for maximum data sovereignty.
API Key Management
To interact with supported large language models (LLMs) and Deeptrain’s internal features like the Transcribe API, you must provide valid API keys.
Best Practices:
- Environment Variables: Never hardcode API keys directly into your scripts. Use a `.env` file or environment variables.
- Least Privilege: Ensure your provider keys (e.g., OpenAI, Anthropic, or custom providers) have only the permissions necessary for the specific task.
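A simple way to enforce the environment-variable practice is to fail fast at startup when a required key is missing, rather than discovering it mid-request. The sketch below uses only the Python standard library; the helper name `require_env` and the placeholder value are illustrative, not part of the Deeptrain API:

```python
import os

def require_env(name: str) -> str:
    """Return the value of an environment variable, raising immediately if unset."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value

# For demonstration only -- in practice the key comes from your shell or .env file.
os.environ.setdefault("DEEPTRAIN_API_KEY", "example_key")
api_key = require_env("DEEPTRAIN_API_KEY")
```

Calling `require_env` for every key at application startup surfaces configuration problems before any request is made.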
Configuration Example
When initializing Deeptrain, ensure your environment is configured to read from a secure source:
```python
import os
from deeptrain import Deeptrain

# Set your keys in the environment before running the application
os.environ["DEEPTRAIN_API_KEY"] = "your_deeptrain_key"
os.environ["PROVIDER_API_KEY"] = "your_llm_provider_key"

# Initialize with security in mind
dt = Deeptrain(
    api_key=os.getenv("DEEPTRAIN_API_KEY"),
    model_provider="openai"
)
```
Localized Data & Storage
Deeptrain supports localized embedding databases. This allows you to store vector representations of your private documents, images, and audio files on your own infrastructure rather than on external cloud servers.
- Real-time Retrieval: When using the localized embedding database, Deeptrain fetches content from your live data sources and injects it into the model's context window on the fly.
- Data Residency: Your raw source files (local storage, self-hosted videos) stay within your defined storage parameters.
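To make the retrieval pattern concrete, here is a minimal in-memory sketch of a local vector store with cosine-similarity lookup. This is purely illustrative of how localized embedding retrieval works conceptually; it is not Deeptrain's actual storage format or API, and the class and method names are assumptions:

```python
import math

class LocalVectorStore:
    """Illustrative local store: vectors and documents never leave this process."""

    def __init__(self):
        self._entries = []  # list of (vector, document) pairs

    def add(self, vector, document):
        self._entries.append((vector, document))

    def query(self, vector, top_k=1):
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
            return dot / norm if norm else 0.0
        # Rank stored documents by similarity to the query vector.
        ranked = sorted(self._entries, key=lambda e: cosine(vector, e[0]), reverse=True)
        return [doc for _, doc in ranked[:top_k]]

store = LocalVectorStore()
store.add([1.0, 0.0], "quarterly report")
store.add([0.0, 1.0], "meeting notes")
results = store.query([0.9, 0.1])  # nearest document, retrieved entirely on-premises
```

The key property is that both the vectors and the source documents stay on infrastructure you control; only the retrieved text is later injected into a model's context window.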
Transcribe API Security
The Transcribe API processes video and audio inputs to generate text for AI training.
| Feature | Data Handling | Privacy Note |
| :--- | :--- | :--- |
| Local Video | Processed directly from your filesystem. | No third-party platform access required. |
| Platform Video | Metadata and audio streams are fetched from YouTube/Vimeo. | Subject to platform-specific privacy terms. |
| Transcription | Audio data is converted to text via the Transcribe API. | Text is returned to your application for local context injection. |
Example: Processing a Local Video Privately
```python
# Processing a video file from a secure local directory
transcription_result = dt.transcribe(
    source_type="local",
    file_path="/path/to/private/meeting_recording.mp4",
    output_format="text"
)

# The result is stored in memory for use in your LLM context
print(transcription_result["text"])
```
Model Agnosticism and Private LLMs
Deeptrain supports over 200 models, including private and open-source models that can be hosted on-premises (e.g., via Ollama or vLLM). By using a private model with Deeptrain, you ensure that:
- Prompt Data never leaves your internal network.
- Multi-modal inputs (images, audio) are processed by your self-hosted inference engine.
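One way to sanity-check the "never leaves your internal network" guarantee is to verify at startup that your inference endpoint resolves to a private or loopback address. The sketch below uses only the Python standard library; the endpoint URL reflects Ollama's default local port (11434) and is an assumption about your setup, not a Deeptrain configuration value:

```python
import ipaddress
import socket
from urllib.parse import urlparse

def is_internal_endpoint(url: str) -> bool:
    """Check whether an inference endpoint resolves to a private or loopback address."""
    host = urlparse(url).hostname
    addr = ipaddress.ip_address(socket.gethostbyname(host))
    return addr.is_private or addr.is_loopback

# Ollama's default local endpoint (assumed deployment):
ok = is_internal_endpoint("http://localhost:11434")
```

Running a check like this before sending any prompt data guards against a misconfigured endpoint silently routing traffic to an external host.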
Data Transmission
When interacting with external LLM providers through Deeptrain:
- Only the necessary context (retrieved embeddings and user queries) is transmitted.
- Connections are secured via standard TLS/SSL protocols.
- Deeptrain does not store your model inputs/outputs unless you explicitly configure a local logging or database connector.
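To illustrate the first point, the sketch below assembles a request payload containing only the user query and the locally retrieved context text. This is a conceptual sketch of the minimization principle, not Deeptrain's actual wire format; the function and field names are assumptions:

```python
import json

def build_request_payload(query: str, retrieved_chunks: list[str]) -> str:
    """Assemble only what a provider needs: the query and retrieved context.
    Raw source files (videos, audio, documents) never enter the payload."""
    payload = {
        "query": query,
        "context": retrieved_chunks,  # embeddings already resolved to text locally
    }
    return json.dumps(payload)

payload = build_request_payload(
    "Summarize the meeting.",
    ["Key decision: ship v2 next sprint."],
)
```

Keeping the payload this narrow, combined with TLS on the connection itself, limits exposure to exactly the text the model must see.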