Flowcharts and Graphs
Overview
Deeptrain's Flowcharts and Graphs module provides a specialized pipeline for converting visual structural data into LLM-readable formats. Unlike standard OCR, which treats text as a flat stream, this module identifies nodes, edges, and hierarchical relationships, transforming diagrams into structured schemas or descriptive representations that preserve the original logic.
Processing Visual Diagrams
To process a flowchart or graph, use the flowcharts.process method. This method accepts various visual formats and outputs structured data that can be injected directly into your model's context or stored in your localized embedding database.
Supported Formats
- Input: PNG, JPG, PDF, SVG.
- Output: Mermaid.js syntax, JSON (Node-Link), or Semantic Textual Descriptions.
Basic Usage
```python
from deeptrain import MultiModalConnector

# Initialize the connector
connector = MultiModalConnector(api_key="your_api_key")

# Process a local flowchart image
diagram_data = connector.flowcharts.process(
    source="./assets/workflow_v1.png",
    output_format="mermaid",
    detail_level="high",
)

print(diagram_data.content)
```
Configuration Parameters
The process method accepts several parameters to fine-tune how visual structures are interpreted:
| Parameter | Type | Description |
| :--- | :--- | :--- |
| source | str | Path to a local file or a URL to an image/PDF. |
| output_format | str | The desired representation: "mermaid", "json", or "text". |
| ocr_engine | str | (Optional) Specify a preferred engine for text extraction within nodes. |
| detect_colors | bool | Whether to include color-coding logic in the resulting graph data. |
| detail_level | str | "low" (summary) vs "high" (captures every sub-label and edge weight). |
Structured Output Examples
Mermaid.js Output
Ideal for developers who want to re-render the logic or pass it to models trained on code and markup.
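As an illustration only (not actual module output), a simple authorization flow like the JSON example below might come back as Mermaid markup along these lines:

```mermaid
flowchart TD
    n1[Start] --> n2{Is Authorized?}
    n2 -->|Yes| n3[Grant Access]
    n2 -->|No| n4[Deny Access]
```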
JSON Node-Link Schema
Recommended for applications that need to perform programmatic graph analysis or custom RAG (Retrieval-Augmented Generation) logic.
```json
{
  "nodes": [
    {"id": "n1", "label": "Start", "type": "process"},
    {"id": "n2", "label": "Is Authorized?", "type": "decision"}
  ],
  "edges": [
    {"source": "n1", "target": "n2", "label": "next"}
  ]
}
```
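The node-link schema maps directly onto standard graph tooling. A minimal sketch of loading it for programmatic analysis, using only plain Python and no Deeptrain dependency:

```python
import json

# The node-link JSON shown above
raw = """
{
  "nodes": [
    {"id": "n1", "label": "Start", "type": "process"},
    {"id": "n2", "label": "Is Authorized?", "type": "decision"}
  ],
  "edges": [
    {"source": "n1", "target": "n2", "label": "next"}
  ]
}
"""

graph = json.loads(raw)

# Map node ids to human-readable labels
labels = {node["id"]: node["label"] for node in graph["nodes"]}

# Build an adjacency list for traversal or sub-graph queries
adjacency = {}
for edge in graph["edges"]:
    adjacency.setdefault(edge["source"], []).append(edge["target"])

print(labels["n2"])     # → Is Authorized?
print(adjacency["n1"])  # → ['n2']
```

From here the graph can be handed to any analysis library that accepts node-link data.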
Integration with LLM Context
Once a flowchart is processed, the resulting data is typically passed to the LLM to provide environmental context or logic rules.
```python
# Integrating with an agent's prompt
response = connector.agents.query(
    prompt="Based on this flowchart, what happens if the user is not authorized?",
    context=diagram_data.content,  # The Mermaid or JSON output
)
```
Best Practices
- Image Clarity: For complex architectural diagrams, ensure a minimum resolution of 1024px on the shortest side to maintain node label accuracy.
- Contextual Anchoring: When using the "text" output format, Deeptrain provides a semantic narrative. This is often more effective for smaller context windows than raw JSON.
- Vector Storage: For large-scale graph datasets, use the Deeptrain localized embedding database to store the structured JSON output, allowing your agent to query specific sub-graphs via RAG.