My research centers on Machine Learning and Artificial Intelligence. Specifically, I leverage machine learning techniques to build "human-like" machine intelligence across multiple modalities (e.g., visual, textual, and auditory data). Such multimodal intelligence has been applied to address individual, industrial, and societal needs.
My current research interests lie in large-scale pretraining and multimodal learning that incorporates additional sensory signals.
I received my Ph.D. from the State University of New York at Buffalo, where I worked with Prof. Chang Wen Chen. I completed several wonderful internships at Microsoft Research (Redmond and Beijing). I was very fortunate to work with Daniel McDuff, Yale Song, and Mary Czerwinski at MSR, Tao Mei at JD AI Research, and Jianlong Fu at MSRA.
One paper on multi-reference neural TTS stylization was accepted by Interspeech 2020 (Jul. 2020)
Joined Microsoft Cognition (Feb. 2020)
One paper on characterizing bias in visual classifiers was accepted by NeurIPS 2019 (Sep 3, 2019)
One paper on multimodal representation learning was accepted by ICCV 2019 (July 22, 2019)
Zhaoyang Zeng, Sun Yat-sen University 2020
Mingzhi Yu, University of Pittsburgh 2020