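Electronics and Information Research Information Center: a comprehensive collection of specialized ICT convergence research information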

Recommended Information

Introducing websites and books related to ICT and convergence fields.

  • Speech2Face: Learning the Face Behind a Voice
  • Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019


Speech2Face: Learning the Face Behind a Voice


Authors

Tae-Hyun Oh, Tali Dekel, Changil Kim, Inbar Mosseri, William T. Freeman, Michael Rubinstein, Wojciech Matusik 


Abstract

How much can we infer about a person's looks from the way they speak? In this paper, we study the task of reconstructing a facial image of a person from a short audio recording of that person speaking. We design and train a deep neural network to perform this task using millions of natural Internet/YouTube videos of people speaking. During training, our model learns voice-face correlations that allow it to produce images that capture various physical attributes of the speakers such as age, gender and ethnicity. This is done in a self-supervised manner, by utilizing the natural co-occurrence of faces and speech in Internet videos, without the need to model attributes explicitly. We evaluate and numerically quantify how, and in what manner, our Speech2Face reconstructions, obtained directly from audio, resemble the true face images of the speakers.


Review

Proposes an artificial neural network that uses machine learning to imagine a speaker's face from a given voice recording.

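Through this network, the paper investigates what facial information scientists can infer from a voice beyond the characteristics of a particular speaker's vocal production structure, and how well artificial intelligence can make such inferences.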