code2flow

Python Code Flow Analysis Tool - Static analysis for control flow graphs (CFG), data flow graphs (DFG), and call graph extraction.

Performance Optimization

For large projects (>1000 functions), use Fast Mode:

# Ultra-fast analysis (5-10x faster)
code2flow /path/to/project --fast

# Custom performance settings
code2flow /path/to/project \
    --parallel-workers 8 \
    --max-depth 3 \
    --skip-data-flow \
    --cache-dir ./.cache

Performance Tips

Technique	Speedup	Use Case
`--fast` mode	5-10x	Initial exploration
Parallel workers	2-4x	Multi-core machines
Caching	3-5x	Repeated analysis
Depth limiting	2-3x	Large codebases
Skip private methods	1.5-2x	Public API analysis

Benchmarks

Project Size	Functions	Time (fast)	Time (full)
Small (<100)	~50	0.5s	2s
Medium (1K)	~500	3s	15s
Large (10K)	~2000	15s	120s
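
The speedup implied by each row is the full-mode time divided by the fast-mode time. A small sketch of that arithmetic, using only the figures from the table above (the dictionary layout is just for illustration and is not part of the tool):

```python
# (fast seconds, full seconds) per row, copied from the benchmark table.
benchmarks = {
    "Small": (0.5, 2.0),
    "Medium": (3.0, 15.0),
    "Large": (15.0, 120.0),
}

# Speedup = time in full mode / time in fast mode.
speedups = {name: full / fast for name, (fast, full) in benchmarks.items()}

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.1f}x")
```

On these figures the ratio is 4x, 5x, and 8x, so the gain from fast mode grows with project size.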

Features

Control Flow Graph (CFG): Extract execution paths from Python AST
Data Flow Graph (DFG): Track variable definitions and dependencies
Call Graph Analysis: Map function calls and dependencies
Pattern Detection: Identify design patterns (state machines, factories, recursion)
Compact Output: Deduplicated flow diagrams with pattern recognition
Multiple Output Formats: YAML, JSON, Mermaid diagrams, PNG visualizations
LLM-Ready Output: Generate prompts for reverse engineering
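
To give a feel for what call graph extraction from the Python AST involves, here is a minimal sketch built on the standard `ast` module. It is not code2flow's implementation; the function name `build_call_graph` and the sample source are assumptions made for illustration.

```python
import ast


def build_call_graph(source: str) -> dict[str, set[str]]:
    """Map each function name to the names it calls (hypothetical helper)."""
    graph: dict[str, set[str]] = {}
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        callees = graph.setdefault(func.name, set())
        # Calls inside nested functions are also attributed to the outer one.
        for node in ast.walk(func):
            if not isinstance(node, ast.Call):
                continue
            if isinstance(node.func, ast.Name):         # plain call: f(...)
                callees.add(node.func.id)
            elif isinstance(node.func, ast.Attribute):  # method call: obj.f(...)
                callees.add(node.func.attr)
    return graph


SAMPLE = '''
def load(path):
    return open(path).read()

def main():
    data = load("x.txt")
    print(data.strip())
'''

graph = build_call_graph(SAMPLE)
for caller, callees in graph.items():
    print(caller, "->", sorted(callees))
```

The sketch resolves calls by name only, so two methods with the same name on different classes are not distinguished.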

Installation

# Install from source
pip install -e .

# Or with development dependencies
pip install -e ".[dev]"

Quick Start

# Analyze a Python project
code2flow /path/to/project

# With verbose output
code2flow /path/to/project -v

# Specify output directory and formats
code2flow /path/to/project -o ./analysis --format yaml,json,mermaid,png

# Use different analysis modes
code2flow /path/to/project -m static    # Fast static analysis only
code2flow /path/to/project -m hybrid     # Combined analysis (default)

Usage

Basic Analysis

code2flow /path/to/project

Analysis Modes

# Static analysis only (fastest)
code2flow /path/to/project -m static

# Dynamic analysis with tracing
code2flow /path/to/project -m dynamic

# Hybrid analysis (recommended)
code2flow /path/to/project -m hybrid

# Behavioral pattern focus
code2flow /path/to/project -m behavioral

# Reverse engineering ready
code2flow /path/to/project -m reverse

Custom Output

code2flow /path/to/project -o my_analysis

Output Files

File	Description
`analysis.yaml`	Complete structured analysis data
`analysis.json`	JSON format for programmatic use
`flow.mmd`	Full Mermaid flowchart (all nodes)
`compact_flow.mmd`	Compact flowchart - deduplicated nodes, grouped by function
`calls.mmd`	Function call graph
`cfg.png`	Control flow visualization
`call_graph.png`	Call graph visualization
`llm_prompt.md`	LLM-ready analysis summary
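
Since `analysis.json` is meant for programmatic use, a first step is often just to see what it contains. The schema is not described here, so the sketch below assumes only that the file is valid JSON; `summarize_analysis` is a hypothetical helper, not part of the tool.

```python
import json
from pathlib import Path


def summarize_analysis(path) -> dict[str, str]:
    """Return {top-level key: type name} for a JSON analysis file."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        # Root is a list or scalar: report its type under a placeholder key.
        return {"<root>": type(data).__name__}
    return {key: type(value).__name__ for key, value in data.items()}
```

Calling `summarize_analysis("my_analysis/analysis.json")` after a run with `-o my_analysis` would list the top-level sections without assuming their names.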

Compact Flow Format

The compact_flow.mmd file provides optimized output:

Deduplication: Identical node patterns are merged (e.g., x = 1, x = 2 → x = N)
Function Subgraphs: Nodes grouped by function in subgraphs
Pattern Preservation: Control flow structure maintained while reducing file size
Import Reuse: Common patterns linked rather than duplicated

Example compact output:

flowchart TD
    %% Function subgraphs
    subgraph F12345["process_data"]
        N1["x = N"]  
        N2{"if x > 0"}
        N3[/"return x"/]
    end
    
    %% Edges reference deduplicated nodes
    N1 --> N2
    N2 -->|"true"| N3

Understanding the Output

LLM Prompt Structure

The generated prompt includes:

System overview with metrics
Call graph structure
Behavioral patterns with confidence scores
Data flow insights
State machine definitions
Reverse engineering guidelines

Behavioral Patterns

Each pattern includes:

Name: Descriptive identifier
Type: sequential, conditional, iterative, recursive, state_machine
Entry/Exit points: Key functions
Decision points: Conditional logic locations
Data transformations: Variable dependencies
Confidence: Pattern detection certainty
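
When consuming these patterns in your own scripts, it can help to model them as a typed record. The sketch below mirrors the fields listed above; the class and attribute names are assumptions, as is the choice to express confidence as a value between 0 and 1. The actual serialized layout may differ.

```python
from dataclasses import dataclass, field
from enum import Enum


class PatternType(Enum):
    SEQUENTIAL = "sequential"
    CONDITIONAL = "conditional"
    ITERATIVE = "iterative"
    RECURSIVE = "recursive"
    STATE_MACHINE = "state_machine"


@dataclass
class BehavioralPattern:
    """Hypothetical container for one detected pattern."""
    name: str
    type: PatternType
    entry_points: list[str] = field(default_factory=list)
    exit_points: list[str] = field(default_factory=list)
    decision_points: list[str] = field(default_factory=list)
    data_transformations: list[str] = field(default_factory=list)
    confidence: float = 0.0  # assumed to lie in [0, 1]

    def __post_init__(self) -> None:
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")


pattern = BehavioralPattern(
    name="retry_loop",
    type=PatternType("iterative"),  # look up the enum member by its string value
    entry_points=["fetch"],
    confidence=0.8,
)
print(pattern.type.name, pattern.confidence)
```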

Reverse Engineering Guidelines

The analysis provides specific guidance for:

Preserving call graph structure
Implementing identified patterns
Maintaining data dependencies
Recreating state machines
Preserving decision logic

Advanced Features

State Machine Detection

Automatically identifies:

State variables
Transition methods
Source and destination states
State machine hierarchy
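
As an illustration of the idea, a very simple detector can look for methods that assign a constant to a state attribute on `self`. This is only a sketch under that assumption; the attribute name `state` and the function `find_state_transitions` are hypothetical, and code2flow's own detection may work differently.

```python
import ast


def find_state_transitions(source: str, state_attr: str = "state"):
    """Return (class, method, new_state) for each `self.<state_attr> = <constant>`."""
    transitions = []
    for cls in ast.walk(ast.parse(source)):
        if not isinstance(cls, ast.ClassDef):
            continue
        for method in cls.body:
            if not isinstance(method, ast.FunctionDef):
                continue
            for node in ast.walk(method):
                if not isinstance(node, ast.Assign):
                    continue
                if not isinstance(node.value, ast.Constant):
                    continue
                for target in node.targets:
                    if (
                        isinstance(target, ast.Attribute)
                        and isinstance(target.value, ast.Name)
                        and target.value.id == "self"
                        and target.attr == state_attr
                    ):
                        transitions.append((cls.name, method.name, node.value.value))
    return transitions


SAMPLE = '''
class Door:
    def __init__(self):
        self.state = "closed"

    def open(self):
        self.state = "open"

    def close(self):
        self.state = "closed"
'''

transitions = find_state_transitions(SAMPLE)
for cls_name, method, new_state in transitions:
    print(f"{cls_name}.{method} -> {new_state}")
```

The sketch records only destination states. Recovering source states would additionally require inspecting the conditions that guard each assignment.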

Data Flow Tracking

Maps:

Variable dependencies
Data transformations
Information flow paths
Side effects
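
Variable dependencies can be approximated by recording, for every assignment, which names appear on the right-hand side. The sketch below does this with the standard `ast` module; it is flow-insensitive (it ignores statement order and branches) and the function name `variable_dependencies` is an assumption for illustration.

```python
import ast


def variable_dependencies(source: str) -> dict[str, set[str]]:
    """Map each assigned variable to the names its value expressions read."""
    deps: dict[str, set[str]] = {}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Assign):
            continue
        # Every Name inside the right-hand side, including called functions.
        used = {n.id for n in ast.walk(node.value) if isinstance(n, ast.Name)}
        for target in node.targets:
            if isinstance(target, ast.Name):
                deps.setdefault(target.id, set()).update(used)
    return deps


SAMPLE = """
a = 1
b = a + 2
c = f(a, b)
"""

deps = variable_dependencies(SAMPLE)
for name, sources in deps.items():
    print(name, "<-", sorted(sources))
```

Attribute and subscript targets, augmented assignments, and side effects are outside the scope of this sketch.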

Dynamic Tracing

When using dynamic mode:

Function entry/exit timing
Call stack reconstruction
Exception tracking
Performance profiling
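
Function entry/exit timing of this kind can be collected in plain Python with `sys.setprofile`. The following is a minimal sketch of that technique, not the tool's tracer; `trace_calls` and the event tuple layout are assumptions for illustration.

```python
import sys
import time


def trace_calls(func, *args, **kwargs):
    """Run func and record (kind, function name, elapsed seconds or None)."""
    events = []
    starts = []  # stack of start times, one per active Python call

    def profiler(frame, event, arg):
        if event == "call":
            starts.append(time.perf_counter())
            events.append(("call", frame.f_code.co_name, None))
        elif event == "return" and starts:
            elapsed = time.perf_counter() - starts.pop()
            events.append(("return", frame.f_code.co_name, elapsed))
        # c_call / c_return / c_exception events are ignored in this sketch.

    sys.setprofile(profiler)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.setprofile(None)  # always remove the hook, even on error
    return result, events


def inner():
    return 1


def outer():
    return inner() + 1


result, events = trace_calls(outer)
for kind, name, elapsed in events:
    print(kind, name, "" if elapsed is None else f"{elapsed:.6f}s")
```

Because calls and returns are recorded in order, the event list is enough to reconstruct the call stack at any point in the run.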

Integration with LLMs

The generated LLM prompt (listed as llm_prompt.md under Output Files) is designed to be:

Comprehensive: Contains all necessary system information
Structured: Organized for easy parsing
Actionable: Includes specific implementation guidance
Language-agnostic: Describes behavior, not implementation

Example usage with an LLM:

"Based on the system analysis provided, implement this system in Go,
preserving all behavioral patterns and data flow characteristics."

Limitations

Dynamic analysis requires test files
Complex inheritance hierarchies may need manual review
External library calls are treated as black boxes
Runtime reflection and metaprogramming are not fully captured

Contributing

The analyzer is designed to be extensible. Key areas for enhancement:

Additional pattern types
Language-specific optimizations
Improved visualization
Real-time analysis mode

License

Apache License 2.0 - see LICENSE for details.

Author

Created by Tom Sapletta - tom@sapletta.com
