guifrs/1brc

1 Billion Row Challenge - Python Solutions

This repository contains my attempts to solve the 1 Billion Row Challenge using Python. The challenge consists of processing a large text file of temperature measurements from weather stations and calculating statistics for each station.

Project Status 🚧

This project is currently under development.

About the Challenge

The 1 Billion Row Challenge (#1BRC) is a programming challenge that involves processing a text file containing one billion rows of temperature measurements from weather stations. Each row contains a station name and a temperature value separated by a semicolon (e.g., Hamburg;12.5).
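
As an illustration of the row format described above, here is a minimal sketch of parsing a single row. The function name `parse_row` is hypothetical and is not taken from this repository's scripts. It assumes the temperature is always the text after the last semicolon.

```python
def parse_row(line: str) -> tuple[str, float]:
    """Split one 'station;temperature' row into its two fields."""
    # rpartition splits on the LAST semicolon, so the numeric field is
    # always whatever follows it.
    station, _, value = line.rstrip("\n").rpartition(";")
    return station, float(value)


print(parse_row("Hamburg;12.5\n"))  # ('Hamburg', 12.5)
```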

The goal is to calculate the min, mean, and max temperature for each weather station as efficiently as possible.
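
A straightforward baseline for this goal might look like the sketch below. It is not the code in scripts/01_first_try.py; the function name `aggregate` and the in-memory layout (one small list per station) are assumptions made for illustration. It keeps a running min, max, sum, and count so the mean can be computed at the end without storing every value.

```python
from collections.abc import Iterable


def aggregate(rows: Iterable[str]) -> dict[str, tuple[float, float, float]]:
    """Return {station: (min, mean, max)} for rows of 'station;temperature'."""
    # station -> [min, max, sum, count]
    stats: dict[str, list[float]] = {}
    for line in rows:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines, e.g. a trailing newline at end of file
        station, _, raw = line.rpartition(";")
        value = float(raw)
        entry = stats.get(station)
        if entry is None:
            stats[station] = [value, value, value, 1]
        else:
            if value < entry[0]:
                entry[0] = value
            if value > entry[1]:
                entry[1] = value
            entry[2] += value
            entry[3] += 1
    return {
        name: (mn, total / count, mx)
        for name, (mn, mx, total, count) in stats.items()
    }


sample = ["Hamburg;12.5", "Hamburg;7.5", "Oslo;-3.0"]
print(aggregate(sample))
```

Because `aggregate` accepts any iterable of lines, an open file handle can be passed directly, for example `with open("data/measurements.txt", encoding="utf-8") as f: aggregate(f)`, which reads the file line by line rather than loading it all at once.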

Project Structure

.
├── data/                     # Data files
│   ├── measurements.txt      # Generated measurements file
│   └── weather_stations.csv  # Weather stations data
├── scripts/                  # Solution attempts
│   └── 01_first_try.py       # First implementation
├── create_measurements.py    # Script to generate test data
├── pyproject.toml            # Project dependencies
└── README.md

Getting Started

Prerequisites

  • Python 3.12+

Generating Test Data

Use the create_measurements.py script to generate test data:

python create_measurements.py <number_of_rows>

Example:

python create_measurements.py 1_000_000
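
The example above writes the row count with underscores. Python's built-in `int()` accepts underscores as digit separators in strings, so if the script converts its argument with `int()` (an assumption, since the script's argument handling is not shown here), `1_000_000` is read as one million:

```python
# int() accepts single underscores between digits.
rows = int("1_000_000")
print(rows)  # 1000000
```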

Solutions & Performance

I will be documenting different approaches and their performance metrics here as I implement them.

Stay tuned for updates!

Contributing

This is a personal challenge project, but feel free to fork it and try your own solutions!

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
