Flowcharts and Graphs
Overview
Deeptrain's Flowcharts and Graphs module provides a specialized pipeline for converting visual structural data into LLM-readable formats. Unlike standard OCR, which treats text as a flat stream, this module identifies nodes, edges, and hierarchical relationships, transforming diagrams into structured schemas or descriptive representations that preserve the original logic.
Processing Visual Diagrams
To process a flowchart or graph, use the flowcharts.process method. This method accepts various visual formats and outputs structured data that can be injected directly into your model's context or stored in your localized embedding database.
Supported Formats
- Input: PNG, JPG, PDF, SVG.
- Output: Mermaid.js syntax, JSON (Node-Link), or Semantic Textual Descriptions.
Basic Usage
```python
from deeptrain import MultiModalConnector

# Initialize the connector
connector = MultiModalConnector(api_key="your_api_key")

# Process a local flowchart image
diagram_data = connector.flowcharts.process(
    source="./assets/workflow_v1.png",
    output_format="mermaid",
    detail_level="high",
)

print(diagram_data.content)
```
Configuration Parameters
The process method accepts several parameters to fine-tune how visual structures are interpreted:
| Parameter | Type | Description |
| :--- | :--- | :--- |
| source | str | Path to a local file or a URL to an image/PDF. |
| output_format | str | The desired representation: "mermaid", "json", or "text". |
| ocr_engine | str | (Optional) Specify a preferred engine for text extraction within nodes. |
| detect_colors | bool | Whether to include color-coding logic in the resulting graph data. |
| detail_level | str | "low" (summary) vs "high" (captures every sub-label and edge weight). |
Structured Output Examples
Mermaid.js Output
Ideal for developers who want to re-render the logic or pass it to models trained on code and markup.
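As an illustration only (not actual module output), a simple authorization flow like the JSON example below might come back as Mermaid markup along these lines:

```mermaid
flowchart TD
    n1[Start] --> n2{Is Authorized?}
    n2 -->|Yes| n3[Grant Access]
    n2 -->|No| n4[Deny Access]
```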
JSON Node-Link Schema
Recommended for applications that need to perform programmatic graph analysis or custom RAG (Retrieval-Augmented Generation) logic.
```json
{
  "nodes": [
    {"id": "n1", "label": "Start", "type": "process"},
    {"id": "n2", "label": "Is Authorized?", "type": "decision"}
  ],
  "edges": [
    {"source": "n1", "target": "n2", "label": "next"}
  ]
}
```
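The node-link schema maps directly onto standard graph tooling. A minimal sketch of loading it for programmatic analysis, using only plain Python and no Deeptrain dependency:

```python
import json

# The node-link JSON shown above
raw = """
{
  "nodes": [
    {"id": "n1", "label": "Start", "type": "process"},
    {"id": "n2", "label": "Is Authorized?", "type": "decision"}
  ],
  "edges": [
    {"source": "n1", "target": "n2", "label": "next"}
  ]
}
"""

graph = json.loads(raw)

# Map node ids to human-readable labels
labels = {node["id"]: node["label"] for node in graph["nodes"]}

# Build an adjacency list for traversal or sub-graph queries
adjacency = {}
for edge in graph["edges"]:
    adjacency.setdefault(edge["source"], []).append(edge["target"])

print(labels["n2"])     # → Is Authorized?
print(adjacency["n1"])  # → ['n2']
```

From here the graph can be handed to any analysis library that accepts node-link data.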
Integration with LLM Context
Once a flowchart is processed, the resulting data is typically passed to the LLM to provide environmental context or logic rules.
```python
# Integrating with an agent's prompt
response = connector.agents.query(
    prompt="Based on this flowchart, what happens if the user is not authorized?",
    context=diagram_data.content,  # The Mermaid or JSON output
)
```
Best Practices
- Image Clarity: For complex architectural diagrams, ensure a minimum resolution of 1024px on the shortest side to maintain node label accuracy.
- Contextual Anchoring: When using the "text" output format, Deeptrain provides a semantic narrative. This is often more effective for smaller context windows than raw JSON.
- Vector Storage: For large-scale graph datasets, use the Deeptrain localized embedding database to store the structured JSON output, allowing your agent to query specific sub-graphs via RAG.