Let’s talk about Google DeepMind’s WaveNet! This piece of work is about generating raw audio waveforms for text-to-speech and more. Text-to-speech means having a synthetic voice read out whatever we have written down. What sets this work apart is that it can synthesize these samples in a particular person’s voice, provided that we have training samples of that person speaking.
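To give a rough feel for the idea, here is a toy sketch (my own illustration, not DeepMind’s code): WaveNet generates raw audio autoregressively, one sample at a time, drawing each new sample from a distribution over 256 µ-law quantized amplitude levels conditioned on everything generated so far. The `toy_next_sample_logits` function below is a hypothetical placeholder standing in for the trained network of dilated causal convolutions.

```python
import numpy as np

QUANT_LEVELS = 256  # 8-bit mu-law quantization, as used in the WaveNet paper

def mu_law_encode(x, mu=QUANT_LEVELS - 1):
    """Map a waveform in [-1, 1] to integer levels 0..mu (mu-law companding),
    which is how the paper prepares targets for the softmax output."""
    y = np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)
    return ((y + 1) / 2 * mu).astype(np.int64)

def mu_law_decode(q, mu=QUANT_LEVELS - 1):
    """Inverse of mu_law_encode: integer levels back to floats in [-1, 1]."""
    y = 2 * q.astype(np.float64) / mu - 1
    return np.sign(y) * ((1 + mu) ** np.abs(y) - 1) / mu

def toy_next_sample_logits(history):
    """Hypothetical stand-in for the trained model: a real WaveNet would run
    stacked dilated causal convolutions over `history` and return logits
    over the 256 quantization levels. Here we just return random logits."""
    rng = np.random.default_rng(len(history))
    return rng.normal(size=QUANT_LEVELS)

def generate(num_samples):
    """Autoregressive loop: predict a distribution for the next sample,
    draw one value, append it to the history, and repeat."""
    history = []
    for _ in range(num_samples):
        logits = toy_next_sample_logits(history)
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        history.append(np.random.choice(QUANT_LEVELS, p=probs))
    return mu_law_decode(np.array(history))

waveform = generate(16000)  # roughly one second of audio at 16 kHz
print(waveform.shape, waveform.min(), waveform.max())
```

Because every sample depends on all the previous ones, generation proceeds one value at a time, which is also why producing even a second of 16 kHz audio this way is slow.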
__________________________
The paper “WaveNet: A Generative Model for Raw Audio” is available here:
The blog post about this with the sound samples is available here:
The machine learning Reddit thread about this paper is available here:
Recommended for you:
Every Two Minute Papers episode on deep learning:
WE WOULD LIKE TO THANK OUR GENEROUS PATREON SUPPORTERS WHO MAKE TWO MINUTE PAPERS POSSIBLE:
Sunil Kim, Julian Josephs, Daniel John Benton, Dave Rushton-Smith, Benjamin Kang.
We also thank Experiment for sponsoring our series. –
Thanks so much to JulioC EA for the Spanish captions! 🙂
Subscribe if you would like to see more of these! –
Music: Dat Groove by Audionautix is licensed under a Creative Commons Attribution license.
Artist:
The thumbnail background image was found on Pixabay –
Splash screen/thumbnail design: Felícia Fehér –
Károly Zsolnai-Fehér’s links:
Facebook →
Twitter →
Web →