Korean financial data pipeline that fetches and processes corporate financial data from the DART (Data Analysis, Retrieval and Transfer) system.
cd pipelines
# Copy environment file
cp .env.example .env
# Add your DART API key to .env
# DART_API_KEY=your_actual_api_key

# Start PostgreSQL and pgAdmin (from project root)
docker-compose up -d
# Check status
docker-compose ps
# View logs
docker-compose logs -f postgres

The database schema is initialized automatically.

Access:
- PostgreSQL: localhost:5432
- pgAdmin: http://localhost:5050 (admin@admin.com / admin)
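
Before running the pipeline, it can help to confirm that the API key has been filled in and that PostgreSQL is accepting connections. The following is a minimal pre-flight sketch using only the standard library. The helper names `read_env_value` and `port_is_open` are hypothetical and not part of this repository, and the script assumes it is run from the `pipelines/` directory where `.env` lives.

```python
import socket
from pathlib import Path
from typing import Optional

PLACEHOLDER = "your_actual_api_key"  # placeholder value shown in the setup step


def read_env_value(env_text: str, key: str) -> Optional[str]:
    """Return the value assigned to `key` in dotenv-style text, or None."""
    for raw_line in env_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == key:
            return value.strip().strip("'\"")
    return None


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    env_file = Path(".env")  # assumption: current directory is pipelines/
    api_key = None
    if env_file.exists():
        api_key = read_env_value(env_file.read_text(), "DART_API_KEY")
    print("DART_API_KEY set:", bool(api_key) and api_key != PLACEHOLDER)
    print("PostgreSQL reachable:", port_is_open("localhost", 5432))
```

A successful TCP connection only shows that something is listening on port 5432. It does not confirm that the schema has finished loading, so `docker-compose logs -f postgres` remains the more reliable check.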
cd pipelines
uv sync

# Initial data load (first time - fetches all historical data)
uv run main.py --mode initial
# Incremental update (regular updates - fetches only new/latest data)
uv run main.py --mode update
# Custom options
uv run main.py --mode initial --start-year 2020 --end-year 2024
uv run main.py --mode update --year 2025
uv run main.py --mode initial --no-db  # Fetch only, skip database load

# Start services
docker-compose up -d
# Stop services
docker-compose down
# Stop and remove volumes (deletes data)
docker-compose down -v
# View logs
docker-compose logs -f postgres
docker-compose logs -f pgadmin

# Connect to database
docker exec -it dart_postgres psql -U postgres -d dart_financial
# Or from host (if psql installed)
psql -h localhost -U postgres -d dart_financial

- Open http://localhost:5050
- Login: admin@admin.com / admin
- Add Server:
  - Name: DART Financial
  - Host: postgres (container name) or host.docker.internal (from host)
  - Port: 5432
  - Username: postgres
  - Password: postgres
  - Database: dart_financial
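
Client libraries that accept libpq-style connection URIs can reuse the same settings. The sketch below assembles such a URI from the values listed above. `build_postgres_uri` is a hypothetical helper, not something this repository provides, and `localhost` assumes the connection is made from the host machine rather than from another container.

```python
from urllib.parse import quote


def build_postgres_uri(user: str, password: str, host: str, port: int, database: str) -> str:
    """Assemble a libpq-style connection URI, percent-encoding user, password, and database."""
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(database, safe='')}"
    )


# Values taken from the connection settings above.
uri = build_postgres_uri("postgres", "postgres", "localhost", 5432, "dart_financial")
print(uri)  # postgresql://postgres:postgres@localhost:5432/dart_financial
```

Percent-encoding matters once the default password is replaced with one containing characters such as `@` or `/`.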
cresco/
├── docker-compose.yml # PostgreSQL + pgAdmin
├── pipelines/
│ ├── main.py # CLI entry point for data pipeline
│ ├── lib/ # Core library modules
│ │ ├── storage.py # Storage abstraction (local/S3)
│ │ ├── initial_load.py # Initial data load pipeline
│ │ ├── incremental_update.py # Incremental update pipeline
│ │ ├── data_load.py # PostgreSQL data loader
│ │ ├── fss_corp_list.py # Corporate list fetcher
│ │ ├── company.py # Company info fetcher
│ │ ├── key_accounts.py # Financial accounts fetcher
│ │ └── key_indexes.py # Financial indexes fetcher
│ ├── sql/ # SQL scripts
│ │ ├── schema.sql # Database schema
│ │ └── cleanup_duplicates.sql
│ ├── .env.example # Environment template
│ └── data/ # Collected data (local mode)
│ ├── corp_list_*.csv
│ ├── company_info_*.json
│ ├── key_accounts_*/
│ └── key_indexes_*/
└── CLAUDE.md # Development guide
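
To illustrate how the CLI entry point might relate to the modules in `lib/`, here is a minimal `argparse` sketch covering the flags shown in the usage examples. This is an assumption about the structure, not the actual contents of `main.py`: whether `--mode` is required, the help texts, and the mapping from mode to module are all guesses based on the file comments in the tree above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the flags that appear in the usage examples."""
    parser = argparse.ArgumentParser(prog="main.py", description="DART financial data pipeline")
    # Assumption: --mode is mandatory and limited to the two documented values.
    parser.add_argument("--mode", choices=["initial", "update"], required=True)
    parser.add_argument("--start-year", type=int, help="first year for an initial load")
    parser.add_argument("--end-year", type=int, help="last year for an initial load")
    parser.add_argument("--year", type=int, help="year to fetch in update mode")
    parser.add_argument("--no-db", action="store_true", help="fetch only, skip database load")
    return parser


def pipeline_module(mode: str) -> str:
    """Map a mode to the lib/ module that presumably implements it (hypothetical mapping)."""
    return {
        "initial": "lib/initial_load.py",
        "update": "lib/incremental_update.py",
    }[mode]


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["--mode", "update", "--year", "2025"])
print(args.mode, args.year, args.no_db, pipeline_module(args.mode))
```

`argparse` converts `--start-year` to the attribute `start_year`, so the parsed values are read as `args.start_year` and `args.end_year`.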
See CLAUDE.md for detailed documentation on:
- Pipeline architecture
- API documentation
- Database schema
- Development guide