Skip to content

Viterbi algorithm does not apply to activation probabilities #59

@sannawag

Description

@sannawag

I would like to use the output of Crepe to determine whether singer is active versus silent at the perceptual level. That should change at the level of seconds, not milliseconds. Setting a hard threshold based on confidence, though, results in a quick alternation between the two states. The alternation shows in the thick vertical lines in the plots below.

Viterbi would be a straightforward approach to smoothing this out. The current version, though, only applies smoothing to the pitch. I wrote an extension and added it to a pull request in case it would be useful for others: #26.

Screen Shot 2020-06-02 at 4 59 50 PM

Screen Shot 2020-06-02 at 5 15 31 PM

Code for this plot:

import csv
import matplotlib.pyplot as plt
import numpy as np

f0 = []
conf = []
thresh = 0.5

with open('MUSDB18HQ/train/Music Delta - Hendrix/vocals.f0.csv') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    line_count = 0
    for row in csv_reader:
        if line_count == 0:
            print(f'Column names are {", ".join(row)}')
            line_count += 1
        else:
            f0.append(float(row[1]))
            conf.append(float(row[2]))
            line_count += 1
    print(f'Processed {line_count} lines.')

voiced = [1 if c > thresh else 0 for c in conf]
# plt.plot(np.array(f0) * np.array(voiced))
plt.plot(np.array(voiced))
plt.show()

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions