cocoTwosun/cresco

Cresco - DART Financial Data Pipeline

A data pipeline that fetches and processes Korean corporate financial data from DART (Data Analysis, Retrieval and Transfer System), the electronic disclosure system run by Korea's Financial Supervisory Service.

Quick Start

1. Setup Environment

cd pipelines

# Copy environment file
cp .env.example .env

# Add your DART API key to .env
# DART_API_KEY=your_actual_api_key
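
A minimal sketch of how a script could read the key configured above, using only the standard library. The helper names `load_env_file` and `get_dart_api_key` are hypothetical and are not taken from the project; the actual pipeline may load `.env` differently (for example through a dotenv library).

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_dart_api_key(path: str = ".env") -> str:
    """Prefer the process environment, then fall back to the .env file."""
    key = os.environ.get("DART_API_KEY") or load_env_file(path).get("DART_API_KEY", "")
    # "your_actual_api_key" is the placeholder shown in the setup step.
    if not key or key == "your_actual_api_key":
        raise RuntimeError("DART_API_KEY is missing or still set to the placeholder")
    return key
```

Failing early on the placeholder value gives a clearer error than a rejected API request later in the run.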

2. Start PostgreSQL (Docker)

# Start PostgreSQL and pgAdmin (run from the project root; cd .. first if you are still in pipelines/)
docker-compose up -d

# Check status
docker-compose ps

# View logs
docker-compose logs -f postgres

The database is initialized with the schema automatically.

Access:

  • PostgreSQL: localhost:5432
  • pgAdmin: http://localhost:5050 (admin@admin.com / admin)

3. Install Dependencies

cd pipelines
uv sync

4. Run Data Pipeline

# Initial data load (first time - fetches all historical data)
uv run main.py --mode initial

# Incremental update (regular updates - fetches only new/latest data)
uv run main.py --mode update

# Custom options
uv run main.py --mode initial --start-year 2020 --end-year 2024
uv run main.py --mode update --year 2025
uv run main.py --mode initial --no-db  # Fetch only, skip database load
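
The flags above could be parsed along these lines. This is a sketch built only from the commands shown in this README, not a copy of `main.py`: the function names, help texts, and the choice to make `--mode` required are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that accepts the flags used in the commands above."""
    parser = argparse.ArgumentParser(description="DART financial data pipeline (sketch)")
    # Assumption: --mode is mandatory; the real CLI may define a default.
    parser.add_argument("--mode", choices=["initial", "update"], required=True)
    parser.add_argument("--start-year", type=int, help="first year for --mode initial")
    parser.add_argument("--end-year", type=int, help="last year for --mode initial")
    parser.add_argument("--year", type=int, help="target year for --mode update")
    parser.add_argument("--no-db", action="store_true", help="fetch only, skip database load")
    return parser


def describe(args: argparse.Namespace) -> str:
    """Summarize what a run with these arguments would do."""
    if args.mode == "initial":
        span = f"{args.start_year or 'earliest'}-{args.end_year or 'latest'}"
        action = f"initial load for {span}"
    else:
        action = f"incremental update for {args.year or 'latest year'}"
    return action + (" (no database load)" if args.no_db else "")


if __name__ == "__main__":
    print(describe(build_parser().parse_args()))
```

Keeping the year flags optional matches the plain `--mode initial` and `--mode update` invocations, which pass no year at all.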

Docker Management

# Start services
docker-compose up -d

# Stop services
docker-compose down

# Stop and remove volumes (deletes data)
docker-compose down -v

# View logs
docker-compose logs -f postgres
docker-compose logs -f pgadmin

Database Access

psql (Command Line)

# Connect to database
docker exec -it dart_postgres psql -U postgres -d dart_financial

# Or from host (if psql installed)
psql -h localhost -U postgres -d dart_financial

pgAdmin (Web UI)

  1. Open http://localhost:5050
  2. Login: admin@admin.com / admin
  3. Add Server:
    • Name: DART Financial
    • Host: postgres (the Compose service name, reachable from the pgAdmin container) or host.docker.internal (reaches the port published on the host machine)
    • Port: 5432
    • Username: postgres
    • Password: postgres
    • Database: dart_financial
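
For scripts that connect from the host, the settings listed above can be assembled into a PostgreSQL connection URI. The sketch below uses only the standard library; `build_dsn` and `port_is_open` are hypothetical helpers, and the defaults assume that the credentials shown for pgAdmin also apply to direct connections.

```python
import socket
from urllib.parse import quote


def build_dsn(
    host: str = "localhost",
    port: int = 5432,
    user: str = "postgres",
    password: str = "postgres",
    database: str = "dart_financial",
) -> str:
    """Return a postgresql:// URI, percent-encoding user, password, and database."""
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(database, safe='')}"
    )


def port_is_open(host: str = "localhost", port: int = 5432, timeout: float = 1.0) -> bool:
    """Check whether a TCP connection to the database port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print(build_dsn())
    print("PostgreSQL port reachable:", port_is_open())
```

A quick port check like this helps tell "container not running" apart from "wrong credentials" before opening a real database connection.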

Project Structure

cresco/
├── docker-compose.yml          # PostgreSQL + pgAdmin
├── pipelines/
│   ├── main.py                # CLI entry point for data pipeline
│   ├── lib/                   # Core library modules
│   │   ├── storage.py        # Storage abstraction (local/S3)
│   │   ├── initial_load.py   # Initial data load pipeline
│   │   ├── incremental_update.py  # Incremental update pipeline
│   │   ├── data_load.py      # PostgreSQL data loader
│   │   ├── fss_corp_list.py  # Corporate list fetcher
│   │   ├── company.py        # Company info fetcher
│   │   ├── key_accounts.py   # Financial accounts fetcher
│   │   └── key_indexes.py    # Financial indexes fetcher
│   ├── sql/                   # SQL scripts
│   │   ├── schema.sql        # Database schema
│   │   └── cleanup_duplicates.sql
│   ├── .env.example           # Environment template
│   └── data/                  # Collected data (local mode)
│       ├── corp_list_*.csv
│       ├── company_info_*.json
│       ├── key_accounts_*/
│       └── key_indexes_*/
└── CLAUDE.md                  # Development guide

More Information

See CLAUDE.md for detailed documentation on:

  • Pipeline architecture
  • API documentation
  • Database schema
  • Development guide
