Installation Guide
Prerequisites
Before installing Deeptrain, ensure your environment meets the following requirements:
- Python: Version 3.9 or higher.
- Package Manager: pip (standard with Python) or poetry.
- System Dependencies:
- FFmpeg: Required for video and audio processing (Transcribe API).
- Tesseract OCR: Required for processing flowcharts, graphs, and images.
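The prerequisites above can be checked quickly from Python before you install anything. This is a generic sketch using only the standard library, not a Deeptrain utility:

```python
import shutil
import sys

def check_prereqs(min_version=(3, 9), tools=("ffmpeg", "tesseract")):
    """Return a dict mapping each prerequisite to True (met) or False (missing)."""
    results = {"python": sys.version_info >= min_version}
    for tool in tools:
        # shutil.which returns the binary's path if it is on PATH, else None
        results[tool] = shutil.which(tool) is not None
    return results

if __name__ == "__main__":
    for name, ok in check_prereqs().items():
        print(f"{name}: {'OK' if ok else 'MISSING'}")
```

If ffmpeg or tesseract reports MISSING, follow the system-level installation steps for your platform below.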
System-Level Installation
Depending on your operating system, you may need to install external binary dependencies to handle multi-modal data.
macOS
Use Homebrew to install the necessary media processing libraries:
brew install ffmpeg tesseract
Ubuntu/Debian
Update your package list and install the required utilities:
sudo apt update
sudo apt install -y ffmpeg tesseract-ocr libtesseract-dev
Windows
- FFmpeg: Download the builds from ffmpeg.org, extract them, and add the bin folder to your System PATH.
- Tesseract: Download the installer from the UB Mannheim repository and add the installation path to your System PATH.
Python Package Installation
Install the Deeptrain core library using pip. It is recommended to use a virtual environment (venv or conda) to manage your dependencies.
# Create a virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install Deeptrain
pip install deeptrain
Optional Dependencies
If you plan to use local embedding databases or specialized vision models, install the optional extras:
pip install "deeptrain[vision,audio]"
Configuration
Deeptrain requires specific environment variables to interface with your chosen LLMs and the Transcribe API. Create a .env file in your project root:
# Core Deeptrain Configuration
DEEPTRAIN_API_KEY=your_deeptrain_key_here
# Model Provider Keys (Example for OpenAI/Anthropic)
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
# Storage Configuration (Optional)
LOCAL_DB_PATH=./data/embeddings
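The library reads these values from environment variables, so the .env file must be loaded into the process environment first. Many projects use the python-dotenv package (`load_dotenv()`) for this; the equivalent behavior can be sketched with the standard library alone. This is a minimal illustrative loader, not part of Deeptrain:

```python
import os
from pathlib import Path

def load_env_file(path=".env"):
    """Minimal .env loader: parses KEY=VALUE lines, skipping blanks and '#' comments.

    Existing environment variables are not overwritten (same default
    behavior as python-dotenv). Returns the parsed key/value pairs.
    """
    loaded = {}
    env_path = Path(path)
    if not env_path.exists():
        return loaded
    for line in env_path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        loaded[key.strip()] = value.strip()
        os.environ.setdefault(key.strip(), value.strip())
    return loaded
```

Note that this simple parser does not handle quoted values or `export` prefixes; for anything beyond plain KEY=VALUE lines, prefer python-dotenv.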
Initializing the Client
Once installed, you can initialize the Deeptrain connector in your Python script:
from deeptrain import DeeptrainConnector
# Initialize the connector
# The library automatically loads credentials from environment variables
connector = DeeptrainConnector()
print("Deeptrain successfully initialized.")
Verifying the Installation
To ensure that the multi-modal components (specifically video and audio) are working correctly, run the following verification command:
python -m deeptrain.verify
This internal utility checks for:
- FFmpeg accessibility: Confirms video frames can be extracted.
- Tesseract availability: Confirms text can be read from images/graphs.
- API connectivity: Validates your connection to the Deeptrain processing backend.
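The first two checks can also be approximated by hand, which is useful when diagnosing PATH problems on Windows. This is a generic sketch that invokes the binaries directly and is independent of Deeptrain:

```python
import subprocess

def binary_version(cmd, flag):
    """Run '<cmd> <flag>' and return the first line of its output, or None if missing."""
    try:
        out = subprocess.run([cmd, flag], capture_output=True, text=True)
    except FileNotFoundError:
        return None
    # tesseract historically printed its version to stderr, so check both streams
    text = (out.stdout or out.stderr).strip()
    return text.splitlines()[0] if text else None

if __name__ == "__main__":
    # ffmpeg uses a single-dash version flag; tesseract uses a double-dash flag
    for tool, flag in (("ffmpeg", "-version"), ("tesseract", "--version")):
        print(f"{tool}: {binary_version(tool, flag) or 'not found'}")
```

If either tool prints "not found" here but is installed, the installation directory is most likely missing from your PATH.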
Advanced Environment Setup
Docker Installation
For consistent behavior across environments, you can use the official Deeptrain Docker image, which comes pre-configured with all system-level dependencies:
docker pull uditakhourii/deeptrain:latest
docker run -it --env-file .env uditakhourii/deeptrain
Private Model Support
If you are using one of the 200+ supported private or open-source models, ensure your local inference server (such as Ollama or LocalAI) is running before initializing Deeptrain, and point the BASE_URL in your configuration to your local endpoint.
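As an illustration, a local Ollama server could be targeted with a .env entry like the following. The port 11434 is Ollama's documented default; the exact variable name Deeptrain reads (BASE_URL here, taken from the text above) should be confirmed against your Deeptrain version's documentation:

```shell
# Point Deeptrain at a local OpenAI-compatible endpoint (Ollama's default port)
BASE_URL=http://localhost:11434/v1
```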