
gdyxml2000/tensorflow-speech-recognition

 
 


Tensorflow Speech Recognition

Speech recognition using Google's TensorFlow deep learning framework and sequence-to-sequence neural networks.

Replaces caffe-speech-recognition; see that project for some background.

Ultimate goal

Create a decent standalone speech recognition system for Linux etc. Some people say we have the models but not enough training data. We disagree: there is plenty of training data (100GB here and 21GB here on openslr.org, synthetic text-to-speech snippets, movies with transcripts, Gutenberg, YouTube with captions, etc.). We just need a simple yet powerful model. It's only a question of time...

Sample spectrogram, That's what she said, too laid?

Sample spectrogram, Karen uttering 'zero' at 160 words per minute.

Getting started

Toy examples: ./number_classifier_tflearn.py ./speaker_classifier_tflearn.py

Some less trivial architectures: ./densenet_layer.py

Later: ./train.sh ./record.py

Sample spectrogram or record.py

Partners + collaborators wanted

We are in the process of tackling this project in earnest. If you want to join the party, just start with a small pull request.

Update: Nervana demonstrated that it is possible for 'independents' to build models that are state of the art. Unfortunately they didn't open source the software.

Fun tasks for newcomers

  • Data augmentation: create on-the-fly modulations of the data, e.g. increase the speech frequency, add background noise, alter the pitch, etc.
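
As a starting point for the data augmentation task, here is a minimal sketch that uses only the Python standard library. It is not code from this repository: the function names `change_speed`, `add_noise` and `augment`, the 0.9 to 1.1 speed range and the 10 to 30 dB noise range are all assumptions chosen for illustration. Note that plain resampling changes speed and pitch together; altering one without the other needs a more elaborate method.

```python
import math
import random


def change_speed(samples, rate):
    """Resample by linear interpolation (hypothetical helper).

    rate > 1 shortens the clip, which raises tempo and pitch together;
    rate < 1 lengthens it.
    """
    n_out = max(1, int(round(len(samples) / rate)))
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * rate                  # fractional position in the input
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


def add_noise(samples, snr_db, rng):
    """Add white Gaussian noise at the given signal-to-noise ratio in dB."""
    power = sum(s * s for s in samples) / len(samples)
    noise_std = math.sqrt(power / (10 ** (snr_db / 10)))
    return [s + rng.gauss(0.0, noise_std) for s in samples]


def augment(samples, rng):
    """Draw a random speed factor and noise level for one training example.

    The ranges below are illustrative assumptions, not tuned values.
    """
    rate = rng.uniform(0.9, 1.1)
    snr_db = rng.uniform(10.0, 30.0)
    return add_noise(change_speed(samples, rate), snr_db, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    sample_rate = 16000  # assumed sample rate in Hz
    tone = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]
    augmented = augment(tone, rng)
    print(len(tone), len(augmented))
```

Calling `augment` inside the batch generator, rather than writing augmented files to disk, gives the network a slightly different version of each clip on every epoch, which is what "on-the-fly" means here.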

Extensions

Extensions to current TensorFlow which are probably needed:

Even though this project is far from finished, we hope it gives you some starting points.

Looking for a tensorflow consultant / deep learning contractor? Reach out to info@pannous.com
