
PUBLICATIONS

Selected Conference Publications

For the full list of publications, please see my Google Scholar profile.

                                                                                    2023

  • Y. Wei, Y. Sun, R. Zheng, S. Vemprala, R. Bonatti, S. Chen, R. Madaan, Z. Ba, A. Kapoor, S. Ma. Is Imitation All You Need? Generalized Decision-Making with Dual-Phase Training. ICCV 2023

  • R. Zheng, X. Wang, Y. Sun, S. Ma, J. Zhao, H. Xu, H. Daumé III, F. Huang. TACO: Temporal Latent Action-Driven Contrastive Loss for Visual Reinforcement Learning. Preprint

  • Y Sun, S Ma, R Madaan, R Bonatti, F Huang, A Kapoor. SMART: Self-supervised Multi-task pretrAining with contRol Transformers. ICLR 2023 Notable top 25% (Spotlight)

  • A Bucker, L Figueredo, S Haddadin, A Kapoor, S Ma, R Bonatti. LaTTe: Language Trajectory TransformEr. ICRA 2023

  • R Bonatti, S Vemprala, S Ma, F Frujeri, S Chen, A Kapoor. Pact: Perception-action causal transformer for autoregressive robotics pre-training. IROS 2023

                                                                                       

                                                                                    2022

  • J Lei, S Ma, Z Ba, S Vemprala, A Kapoor, K Ren. Masked Autoencoders for Egocentric Video Understanding @ Ego4D Challenge 2022. ECCV 2022

  • R Bonatti, S Vemprala, S Ma, F Frujeri, S Chen, A Kapoor. Pact: Perception-action causal transformer for autoregressive robotics pre-training. NeurIPS Workshop 2022

  • D McDuff, Y Song, J Lee, V Vineet, S Vemprala, N A Gyde, H Salman, S Ma, K Sohn, A Kapoor. CausalCity: Complex simulations with agency for causal discovery and reasoning. CLeaR 2022

  • A Bucker, L Figueredo, S Haddadin, A Kapoor, S Ma, R Bonatti. Reshaping Robot Trajectories Using Natural Language Commands: A Study of Multi-Modal Data Alignment Using Transformers. IROS 2022

  • S Ma, S Vemprala, W Wang, JK Gupta, Y Song, D McDuff, A Kapoor. COMPASS: Contrastive Multimodal Pretraining for Autonomous Systems. IROS 2022

                                                                                       

                                                                                    2021

  • Shuang Ma, Zhaoyang Zeng, Daniel McDuff, Yale Song. Contrastive learning of global and local audio-visual representations. NeurIPS 2021

  • Shuang Ma, Zhaoyang Zeng, Daniel McDuff, Yale Song. Active contrastive learning of audio-visual video representations. ICLR 2021

                                                                                        

                                                                                    2020

  • Matt Whitehill, Shuang Ma, Daniel McDuff, Yale Song. Multi-Reference Neural TTS Stylization with Adversarial Cycle Consistency. Interspeech 2020

                                                                                       

                                                                                    2019

  • Shuang Ma, Daniel McDuff, Yale Song. M3D-GAN: Multi-Modal Multi-Domain Translation with Universal Attention. arXiv 2019

  • Daniel McDuff, Shuang Ma, Yale Song, Ashish Kapoor. Characterizing bias in classifiers using generative models. NeurIPS 2019

  • Shuang Ma, Daniel McDuff, Yale Song. Unpaired image-to-speech synthesis with multimodal information bottleneck. ICCV 2019

  • Shuang Ma, Daniel McDuff, Yale Song. Neural TTS Stylization with Adversarial and Collaborative Games. ICLR 2019

                                                                                    

                                                                                    2017-2018

  • Shuang Ma, Jianlong Fu, Chang Wen Chen. DA-GAN: Instance-level Image Translation by Deep Attention Generative Adversarial Networks. CVPR 2018

  • Shuang Ma, Jing Liu, Chang Wen Chen. A-Lamp: Adaptive Layout-Aware Multi-Patch Deep Convolutional Neural Network for Photo Aesthetics Assessment. CVPR 2017

                                                                                                    

                                                                                                  2013-2016

  • Shuang Ma, Chang Wen Chen. D-Sempre: Learning deep semantic-preserving embedding for user interests-social contents modeling. arXiv

  • Shuang Ma, Chang Wen Chen. Automatic Creation of Magazine-Page-Like Social Media Visual Summary for Mobile Browsing. ICIP 2016.

  • Shuang Ma, Yangyu Fan, Chang Wen Chen. Finding Your Spot: A Photography Suggestion System for Placing Human in the Scene. ICIP 2014.

  • Shuang Ma, Yangyu Fan, Chang Wen Chen. Pose Maker: A Pose Recommendation System for Person in the Landscape Photographing. ACM MM 2014.

  • Shuang Ma, Jiangming Chen, Hu Lu. Approach for License Plate Location using Edge Features Filter and Multi-Decision Mechanism. Computer Engineering and Applications, 2014.

  • Shuang Ma, Yangyu Fan, Tao Lei, Peng Wu. Practical License Plate Recognition Method based on Multi-Feature Extraction. Application Research for Computers, 2013.

Invention Patent

  • Shuang Ma, Yangyu Fan. A License Plate Recognition Method based on Video Streaming. 2014 
