PytoTune is an open-source pitch-correction library with a C++ backend, Python bindings, and a command-line interface. It was built as an Algorithm Engineering project and focuses on making the full Auto-Tune-style pipeline transparent, reproducible, and fast.
The project supports two correction modes:
- MIDI-guided correction: match an input WAV file to the notes of a MIDI file
- Scale-guided correction: quantize detected pitches to a musical scale such as C major, E minor, or more exotic tunings
Under the hood, PytoTune combines:
- YIN-based pitch detection
- Scale or MIDI target pitch generation
- Fourier-based pitch shifting
- OpenMP parallelization
- SIMD acceleration via Google Highway
Contents:
- Features
- Project Status
- How It Works
- Repository Layout
- Requirements
- Setup and Build
- Python Usage
- CLI Usage
- Testing
- Benchmarking
- Limitations
- Further Reading
- License
Features:
- C++20 core implementation for the heavy DSP work
- Python extension module built with pybind11
- Standalone CLI executable for file-based workflows
- Two pitch-correction modes:
  - WAV + MIDI → corrected WAV
  - WAV + musical scale → corrected WAV
- Built-in scale parsing, including common western modes and non-standard scales
- Configurable pitch-detection ranges, including presets for common vocal ranges
- Optimized YIN detector with:
  - OpenMP parallelization
  - SIMD vectorization with Highway
  - signal decimation before detection
- Unit tests with GoogleTest
- Linux benchmarking target and helper scripts
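To make the YIN step more concrete, here is a minimal pure-Python sketch of the textbook algorithm (difference function, cumulative mean normalized difference, absolute threshold, parabolic interpolation). It is not PytoTune's C++ implementation: the function name `yin_pitch`, its parameters, and the threshold of 0.1 are assumptions chosen for illustration, and the sketch leaves out the decimation, OpenMP, and SIMD optimizations listed above.

```python
import math


def yin_pitch(frame, sample_rate, fmin=80.0, fmax=1000.0, threshold=0.1):
    """Estimate the pitch of one frame in Hz with a basic YIN (illustrative only).

    Returns None when no lag falls below the threshold (e.g. silence or noise).
    """
    tau_min = max(2, int(sample_rate / fmax))
    tau_max = min(len(frame) // 2, int(sample_rate / fmin))
    n = len(frame) - tau_max  # number of sample pairs compared at every lag

    # Step 1: difference function d(tau) = sum_j (x[j] - x[j + tau])^2
    diff = [0.0] * (tau_max + 1)
    for tau in range(1, tau_max + 1):
        diff[tau] = sum((frame[j] - frame[j + tau]) ** 2 for j in range(n))

    # Step 2: cumulative mean normalized difference d'(tau)
    cmnd = [1.0] * (tau_max + 1)
    running = 0.0
    for tau in range(1, tau_max + 1):
        running += diff[tau]
        cmnd[tau] = diff[tau] * tau / running if running > 0 else 1.0

    # Step 3: first lag below the threshold, then walk down to its local minimum
    tau = tau_min
    while tau < tau_max:
        if cmnd[tau] < threshold:
            while tau + 1 < tau_max and cmnd[tau + 1] < cmnd[tau]:
                tau += 1
            break
        tau += 1
    else:
        return None  # loop finished without finding a candidate lag

    # Step 4: parabolic interpolation around the chosen lag for sub-sample accuracy
    a, b, c = cmnd[tau - 1], cmnd[tau], cmnd[tau + 1]
    denom = a - 2 * b + c
    shift = 0.5 * (a - c) / denom if denom != 0 else 0.0
    return sample_rate / (tau + shift)
```

The inner loop over all lags is what makes YIN expensive, which is presumably why the detector is the part of the project that receives parallelization, vectorization, and decimation.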
PytoTune is currently a research/project codebase rather than a polished end-user package. The core library, CLI,
tests, and Python bindings are present in this repository. However, the repository does not currently ship PyPI
packaging metadata such as a pyproject.toml.
That means local usage is currently centered around building with CMake.
The processing pipeline is:
- Load input data from a WAV file and, optionally, a MIDI file
- Split audio into overlapping windows (default: window size 4096, stride 1024)
- Detect pitch per window using the YIN algorithm
- Compute target pitches either from:
  - a MIDI note sequence, or
  - the nearest note in a chosen musical scale
- Apply pitch shifting with a Fourier-transform-based algorithm
- Overlap-add corrected windows into the final output WAV file
This architecture keeps the full signal-processing chain explicit and easy to inspect.
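The windowing and overlap-add steps of the pipeline can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the project's code: the helper names `frame_starts`, `hann`, and `overlap_add` are hypothetical, a periodic Hann window is assumed, and the per-window pitch correction is left out so that the round trip simply reconstructs the input.

```python
import math


def frame_starts(num_samples, window_size=4096, stride=1024):
    """Start indices of all full windows that fit into the signal."""
    if num_samples < window_size:
        return []
    return list(range(0, num_samples - window_size + 1, stride))


def hann(window_size):
    """Periodic Hann window (an assumed choice of analysis window)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / window_size)
            for i in range(window_size)]


def overlap_add(frames, starts, num_samples, window):
    """Sum windowed frames into one signal and normalize by the summed window weight."""
    out = [0.0] * num_samples
    weight = [0.0] * num_samples
    for frame, start in zip(frames, starts):
        for i, w in enumerate(window):
            out[start + i] += frame[i] * w
            weight[start + i] += w
    # Samples that no window covers with non-zero weight are left at 0.0
    return [o / wt if wt > 1e-9 else 0.0 for o, wt in zip(out, weight)]


# Round trip with a small window; a real pipeline would correct each frame
# between the split and the overlap-add.
signal = [math.sin(0.3 * i) for i in range(64)]
starts = frame_starts(len(signal), window_size=16, stride=4)
frames = [signal[s:s + 16] for s in starts]
restored = overlap_add(frames, starts, len(signal), hann(16))
```

With the default window size of 4096 and stride of 1024, consecutive windows overlap by 75%, so every interior sample is covered by four windows.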
.
├── include/ # Public headers
├── src/ # C++ implementation, CLI, Python bindings
├── tests/ # GoogleTest test suite and sample test assets
├── benchmarks/ # Benchmark executable and helper scripts
├── paper/ # Project paper and figures
├── highway/ # SIMD dependency vendored into the repo
├── CMakeLists.txt # Main build configuration
└── README.md
The most reliable local setup is to follow the CI toolchain choices.
Linux:
- build-essential
- cmake
- ninja-build
- python3
Install command:
sudo apt-get update
sudo apt-get install -y build-essential cmake ninja-build
macOS:
- Homebrew
- cmake
- ninja
- gcc (using gcc-15/g++-15)
- python3
Install command:
brew update
brew install cmake ninja gcc
Windows:
- Visual Studio C++ toolchain
- cmake
- python
The following dependencies are downloaded automatically during configuration:
- pybind11
- googletest
The project also builds against the bundled highway/ directory.
- Linux/macOS: builds use the Ninja generator
- macOS: uses Homebrew GCC (gcc-15/g++-15) via CC and CXX
- Windows: use the Visual Studio generator defaults with -A x64
- Benchmarking: only enabled on Linux
git clone https://github.com/brokkoli71/PytoTune.git
cd PytoTune
Linux:
mkdir -p build
cd build
cmake -S .. -B . -G Ninja -DCMAKE_BUILD_TYPE=Release
ninja -v
macOS:
Use Homebrew GCC 15:
export CC=gcc-15
export CXX=g++-15
mkdir -p build
cd build
cmake -S .. -B . -G Ninja -DCMAKE_BUILD_TYPE=Release
ninja -v
Windows:
cmake -S . -B build -A x64 -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release -- /m
This builds:
- the core static library
- the Python extension module pytotune
- the CLI executable pytotune_cli
- the GoogleTest binary pytotune_tests
The following CMake options are available:
- PYTOTUNE_USE_HWY=ON|OFF: enable/disable Highway SIMD in pitch detection
- PYTOTUNE_USE_PREDEFINED_TWIDDLES=ON|OFF: enable/disable predefined twiddle optimization in the FFT
- PYTOTUNE_REIMPLEMENTED_WINDOWING=ON|OFF: enable/disable the alternative pitch-shifter windowing implementation
Example:
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DPYTOTUNE_USE_HWY=OFF
cmake --build build -j
If the compiler name does not exist (for example gcc-15 is missing), check available versions:
ls -1 /opt/homebrew/bin/gcc-* /usr/local/bin/gcc-*
Then set CC and CXX to matching binaries before running CMake.
After building, the Python extension module is available in the build directory.
export PYTHONPATH="$PWD/build:$PYTHONPATH"
Scale-guided correction:
import pytotune
scale = pytotune.Scale.fromName("C major")
pytotune.roundToScale("input.wav", scale, "output_scale.wav")
MIDI-guided correction:
import pytotune
pytotune.matchMidi("input.wav", "reference.mid", "output_midi.wav")
Custom pitch-detection range:
import pytotune
pitch_range = pytotune.PitchRange(82.41, 1046.50)
scale = pytotune.Scale.fromName("E minor")
pytotune.roundToScale("input.wav", scale, "output.wav", pitch_range)
Singer preset range:
import pytotune
pitch_range = pytotune.singerToPitchRange("tenor")
pytotune.matchMidi("input.wav", "reference.mid", "output.wav", pitch_range)
Available range presets:
- hearable
- piano
- human
- man
- woman
- bass
- tenor
- bariton
- alto
- soprano
- cat
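One reason the pitch-detection range matters: in a YIN-style detector, the frequency bounds translate into the range of lags (candidate periods, in samples) that must be searched, so a narrower range means less work per window. The sketch below shows that conversion. The function name `lag_bounds` and the 44.1 kHz sample rate are assumptions for illustration and are not part of the pytotune API.

```python
import math


def lag_bounds(fmin_hz, fmax_hz, sample_rate):
    """Convert a pitch range in Hz into the lag range (in samples) to search."""
    if not 0 < fmin_hz < fmax_hz:
        raise ValueError("expected 0 < fmin_hz < fmax_hz")
    tau_min = math.floor(sample_rate / fmax_hz)  # shortest period: highest pitch
    tau_max = math.ceil(sample_rate / fmin_hz)   # longest period: lowest pitch
    return tau_min, tau_max


# The custom range used above (82.41 Hz to 1046.50 Hz) at an assumed 44.1 kHz
print(lag_bounds(82.41, 1046.50, 44100))  # (42, 536)
```

Lowering the minimum frequency raises the largest lag, which is the main cost driver; a preset matched to the singer keeps the search short.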
You can create scales from names such as:
- C major
- A minor
- E blues
- D dorian
- chromatic
- whole-tone
- quarter-tone
- edo19
- bohlen-pierce
By default, Scale.fromName(...) uses A4 = 442 Hz unless you pass a different tuning explicitly.
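To show what the reference tuning changes, here is a small sketch of snapping a detected frequency to the nearest note of a scale. It is not how pytotune implements scales: it assumes 12-tone equal temperament (so it does not cover scales such as edo19 or bohlen-pierce), and the names `MAJOR` and `nearest_scale_frequency` are hypothetical.

```python
import math

MAJOR = (0, 2, 4, 5, 7, 9, 11)  # semitone offsets of a major scale from its root


def nearest_scale_frequency(freq_hz, root_pitch_class=0, intervals=MAJOR, a4_hz=442.0):
    """Snap a frequency to the nearest note of a 12-tone equal-tempered scale.

    root_pitch_class uses 0 = C, 1 = C#, ..., 11 = B.
    """
    midi = 69 + 12 * math.log2(freq_hz / a4_hz)  # fractional MIDI note number
    allowed = {(root_pitch_class + i) % 12 for i in intervals}
    center = round(midi)
    # A window of +-6 semitones covers every pitch class, so it always
    # contains at least one note of the scale.
    candidates = [n for n in range(center - 6, center + 7) if n % 12 in allowed]
    best = min(candidates, key=lambda n: abs(n - midi))
    return a4_hz * 2 ** ((best - 69) / 12)


# The same input lands on a different target depending on the reference tuning
print(nearest_scale_frequency(455.0, a4_hz=442.0))  # 442.0
print(nearest_scale_frequency(455.0, a4_hz=440.0))  # 440.0
```

Because every target frequency is derived from the A4 reference, a 2 Hz difference in tuning shifts all notes of the scale proportionally, which matters when the corrected track has to match an accompaniment tuned to 440 Hz.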
Example:
import pytotune
scale = pytotune.Scale.fromName("C major", 440)
The repository builds an executable named pytotune_cli.
./build/pytotune_cli midi <wav_input> <midi_input> <wav_output> [<singer> | <fmin> <fmax>]
./build/pytotune_cli scale <wav_input> <scale_name> <wav_output> [<singer> | <fmin> <fmax>]
Tune a vocal track to a MIDI file using the default human range:
./build/pytotune_cli midi input.wav reference.mid output.wav
Tune a file to C major using a preset vocal range:
./build/pytotune_cli scale input.wav "C major" output.wav soprano
Tune a file to E minor using a custom pitch-detection range:
./build/pytotune_cli scale input.wav "E minor" output.wav 80 900
C++ tests on Linux/macOS:
cd build
ctest --output-on-failure
C++ tests on Windows:
cd build
ctest -C Release --output-on-failure
Python binding tests on Linux/macOS. Run from the repository root:
export PYTHONPATH=$PYTHONPATH:$(pwd)/build
python test_bindings.py
Python binding tests on Windows (PowerShell):
$env:PYTHONPATH = "$env:PYTHONPATH;$(Get-Location)\build\Release"
python test_bindings.py
The test data used by the suite lives in tests/data/.
Benchmark support is enabled only on Linux.
The helper scripts in benchmarks/ expect a build directory named cmake-build-relwithdebinfo.
cmake -S . -B cmake-build-relwithdebinfo -DCMAKE_BUILD_TYPE=RelWithDebInfo
cmake --build cmake-build-relwithdebinfo --target pytotune_benchmarks -j
Examples:
bash benchmarks/benchmark_pitch_detection_scale.sh
bash benchmarks/benchmark_pipeline_scale.sh
Generated CSV outputs are written into the benchmarks/ directory.
- The project is primarily designed for offline processing, not real-time use
- The current workflow is centered around WAV input/output and MIDI reference files
- The pitch detector is best suited to monophonic or voice-like material
- The repository currently provides CMake-based local builds, not a ready-to-publish PyPI package
- Audio quality was not the sole optimization target; the main focus of the project was algorithm engineering and performance analysis
Further reading:
- The full project write-up is available in paper/paper.tex. The PDF builds are available as workflow artifacts.
- Test assets and naming conventions are documented in tests/data/README.md
This project is licensed under the terms of the LICENSE file in the repository.