
DPDFNet: Boosting DeepFilterNet2 via Dual-Path RNN



Links: Project Page · arXiv Paper · Hugging Face Models · Hugging Face Dataset · Hugging Face Space

Official repository for the DPDFNet paper.

[Figure: noisy → enhanced spectrogram slideshow]

Install the PyPI Package

For CPU-only ONNX inference using the packaged CLI and Python API:

pip install dpdfnet

CLI Example

# Enhance one file
dpdfnet enhance noisy.wav enhanced.wav --model dpdfnet4

# Enhance a directory
dpdfnet enhance-dir ./noisy_wavs ./enhanced_wavs --model dpdfnet2

# Download models
dpdfnet download
dpdfnet download dpdfnet8
dpdfnet download dpdfnet4 --force

Python API Example

import soundfile as sf
import dpdfnet

# In-memory enhancement:
audio, sr = sf.read("noisy.wav")
enhanced = dpdfnet.enhance(audio, sample_rate=sr, model="dpdfnet4")
sf.write("enhanced.wav", enhanced, sr)

# Enhance one file:
out_path = dpdfnet.enhance_file("noisy.wav", model="dpdfnet2")
print(out_path)

# Model listing:
for row in dpdfnet.available_models():
    print(row["name"], row["ready"], row["cached"])

# Download models:
dpdfnet.download()				# All models
dpdfnet.download("dpdfnet4")	# Specific model

Run From Source

1) Install dependencies

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

2) Download models

Model files are not bundled in this repository.
Download PyTorch checkpoints, TFLite, and ONNX models from Hugging Face:

pip install -U "huggingface_hub[cli]"

# create target dirs
mkdir -p model_zoo/{checkpoints,onnx,tflite}

# PyTorch checkpoints (HF path: checkpoints/* -> local: model_zoo/checkpoints/*)
hf download Ceva-IP/DPDFNet \
  --include "checkpoints/*.pth" \
  --local-dir model_zoo

# ONNX models (& states) (HF path: onnx/* -> local: model_zoo/onnx/*)
hf download Ceva-IP/DPDFNet \
  --include "onnx/*.onnx" \
  --local-dir model_zoo

hf download Ceva-IP/DPDFNet \
  --include "onnx/*.npz" \
  --local-dir model_zoo

# TFLite models (HF path: *.tflite at repo root -> local: model_zoo/tflite/*)
hf download Ceva-IP/DPDFNet \
  --include "*.tflite" \
  --local-dir model_zoo/tflite
3) Run offline enhancement

Put one or more *.wav files in ./noisy_wavs, then choose one:

Option A: TFLite

python -m tflite_model.infer_dpdfnet_tflite \
  --noisy_dir ./noisy_wavs \
  --enhanced_dir ./enhanced_wavs \
  --model_name dpdfnet4

Option B: ONNX

python -m onnx_model.infer_dpdfnet_onnx \
  --noisy_dir ./noisy_wavs \
  --enhanced_dir ./enhanced_wavs \
  --model_name dpdfnet4

Enhanced files are written as:

<original_stem>_<model_name>.wav
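For reference, the naming rule above can be reproduced with pathlib; `enhanced_name` is a hypothetical helper for illustration, not part of the repo:

```python
from pathlib import Path

def enhanced_name(noisy_path: str, model_name: str) -> str:
    """Build the output filename <stem>_<model_name>.wav from an input path."""
    stem = Path(noisy_path).stem  # filename without directory or extension
    return f"{stem}_{model_name}.wav"

print(enhanced_name("noisy_wavs/street_01.wav", "dpdfnet4"))
# → street_01_dpdfnet4.wav
```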

Audio Samples & Demo

Real-Time Demo

[Figure: real-time DPDFNet demo screenshot]

Run:

python -m real_time_demo

How it works:

  • Captures microphone audio in streaming hops.
  • Enhances each hop frame-by-frame with ONNX.
  • Displays live noisy vs enhanced spectrograms.
  • Lets you control the noise-reduction level during playback: 0 plays the raw stream, 1 the fully enhanced stream.
  • Lets you enable automatic gain control (AGC) during playback.

To change the model, edit MODEL_NAME near the top of real_time_demo.py.
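The noise-reduction slider behaves like a dry/wet mix between the raw and enhanced streams. A minimal sketch of that idea in plain Python (`mix_hop` is a hypothetical helper, not the demo's actual code):

```python
def mix_hop(raw, enhanced, level):
    """Blend one audio hop: level 0.0 -> raw passthrough, 1.0 -> fully enhanced."""
    if not 0.0 <= level <= 1.0:
        raise ValueError("level must be in [0, 1]")
    return [(1.0 - level) * r + level * e for r, e in zip(raw, enhanced)]

raw_hop = [0.5, -0.2, 0.1]
enhanced_hop = [0.4, -0.1, 0.0]
print(mix_hop(raw_hop, enhanced_hop, 0.0))  # identical to raw_hop
print(mix_hop(raw_hop, enhanced_hop, 1.0))  # identical to enhanced_hop
```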

Model Profile

16 kHz models

| Model    | Params [M] | MACs [G] | TFLite Size [MB] | ONNX Size [MB] | Intended Use                    |
|----------|------------|----------|------------------|----------------|---------------------------------|
| baseline | 2.31       | 0.36     | 8.5              | 8.5            | Fastest / lowest resource usage |
| dpdfnet2 | 2.49       | 1.35     | 10.7             | 9.9            | Real-time / embedded devices    |
| dpdfnet4 | 2.84       | 2.36     | 12.9             | 11.2           | Balanced performance            |
| dpdfnet8 | 3.54       | 4.37     | 17.2             | 14.1           | Best enhancement quality        |

48 kHz model

| Model             | Params [M] | MACs [G] | TFLite Size [MB] | ONNX Size [MB] | Intended Use                 |
|-------------------|------------|----------|------------------|----------------|------------------------------|
| dpdfnet2_48khz_hr | 2.58       | 2.42     | 11.6             | 10.3           | High-resolution 48 kHz audio |

Troubleshooting / FAQ

Q: Model files are missing (TFLite / ONNX / checkpoints)

  • Run the Hugging Face download commands from the Run From Source section.
  • Confirm files are in:
    • model_zoo/tflite/
    • model_zoo/onnx/
    • model_zoo/checkpoints/

Q: No .wav files found

  • Both offline scripts scan only the exact folder given by --noisy_dir (non-recursive).
  • Ensure input files use .wav extension.
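The flat scan can be illustrated with pathlib: glob matches only the given folder, while rglob would descend into subfolders (`list_wavs` is an illustrative helper, not repo code):

```python
import tempfile
from pathlib import Path

def list_wavs(noisy_dir, recursive=False):
    """List .wav filenames; the offline scripts behave like recursive=False."""
    root = Path(noisy_dir)
    files = root.rglob("*.wav") if recursive else root.glob("*.wav")
    return sorted(p.name for p in files)

# Demonstrate on a throwaway tree: one top-level wav, one nested wav.
with tempfile.TemporaryDirectory() as d:
    (Path(d) / "top.wav").touch()
    (Path(d) / "sub").mkdir()
    (Path(d) / "sub" / "deep.wav").touch()
    print(list_wavs(d))                  # flat scan finds only ['top.wav']
    print(list_wavs(d, recursive=True))  # recursive scan finds both files
```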

Q: Real-time demo has audio device errors

  • Check microphone permissions and default input/output device settings.
  • Install host audio dependencies for sounddevice (PortAudio packages on your OS).

Q: Real-time GUI does not open

  • Ensure Qt dependencies from requirements.txt installed successfully.
  • On headless servers, run offline enhancement instead.

Q: I get import/module errors when running commands

  • Run from repo root and use module form exactly as documented (python -m ...).
  • Activate your virtual environment before running commands.

Q: CPU is too slow for my target

  • Try smaller models (baseline, dpdfnet2).
  • Benchmark with ONNX Runtime using python -m onnx_model.infer_dpdfnet_onnx ... and compare the real-time factor (RTF).
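RTF here is processing time divided by audio duration, so values below 1.0 mean faster than real time. A trivial helper for comparing benchmark runs (hypothetical, not part of the repo):

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = processing time / audio duration; < 1.0 means real-time capable."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return processing_seconds / audio_seconds

# Hypothetical numbers: 2.5 s to enhance a 10 s clip.
print(real_time_factor(2.5, 10.0))  # → 0.25
```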

Evaluation Metrics

To compute intrusive and non-intrusive metrics on our DPDFNet EvalSet, we use the tools listed below. For aggregate quality reporting, we rely on PRISM, the scale‑normalized composite metric introduced in the DPDFNet paper.

Intrusive metrics: PESQ, STOI, SI-SNR

We provide a dedicated script, pesq_stoi_sisnr_calc.py, which computes PESQ, STOI, and SI-SNR for paired reference and enhanced audio. The script includes a built-in auto-alignment step that corrects small start-time offsets and drift between the reference and the enhanced signals before scoring, to ensure fair comparisons.
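For reference, SI-SNR follows the standard scale-invariant definition; a pure-Python sketch is below (the repo's pesq_stoi_sisnr_calc.py may differ in details such as the alignment step):

```python
import math

def si_snr(reference, estimate, eps=1e-8):
    """Scale-invariant SNR in dB between a reference and an estimate (zero-mean)."""
    mr = sum(reference) / len(reference)
    me = sum(estimate) / len(estimate)
    s = [a - mr for a in reference]                 # zero-mean reference
    x = [b - me for b in estimate]                  # zero-mean estimate
    scale = sum(a * b for a, b in zip(s, x)) / (sum(a * a for a in s) + eps)
    target = [scale * a for a in s]                 # projection of x onto s
    noise = [b - t for b, t in zip(x, target)]
    t_energy = sum(t * t for t in target)
    n_energy = sum(n * n for n in noise) + eps
    return 10.0 * math.log10(t_energy / n_energy + eps)

clean = [0.1, -0.2, 0.3, -0.1]
# A merely rescaled copy scores near-perfectly: SI-SNR ignores overall gain.
print(si_snr(clean, [2.0 * v for v in clean]))
```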

Non-intrusive metrics

  • DNSMOS (P.835 & P.808) - We use the official DNSMOS local inference script from the DNS Challenge repository: dnsmos_local.py. Please follow their installation and model download instructions in that project before running.
  • NISQA v2 - We use the official NISQA project: https://github.com/gabrielmittag/NISQA. Refer to their README for environment setup, pretrained model weights, and inference commands (e.g., running nisqa_predict.py on a folder of WAVs).

Citation

@article{rika2025dpdfnet,
 title = {DPDFNet: Boosting DeepFilterNet2 via Dual-Path RNN},
 author = {Rika, Daniel and Sapir, Nino and Gus, Ido},
 year = {2025},
}

License

Apache License 2.0. See LICENSE.

About

DPDFNet: causal single-channel speech enhancement that boosts DeepFilterNet2 with dual-path RNN blocks for stronger long-range temporal and cross-band modeling. Repo includes PyTorch implementation + checkpoints, ONNX & TFLite models with inference code, and a real-time demo.
