Academic Lectures by Jianlong Fu and Bei Liu, Microsoft Research Asia

Source: School of Computing and Artificial Intelligence | Published: October 22, 2021

 

Talk 1 Title: Image and Video Transformation

Speaker: Jianlong Fu, Senior Research Manager, Microsoft Research Asia (MSRA)

Time: Saturday, October 23, 2021, 15:30–16:10

Venue: Conference Room 31524, Building 3, Xipu Campus, Southwest Jiaotong University

Host: Zhaoquan Yuan (袁召全)

Speaker Bio:

Jianlong Fu is currently a Senior Research Manager with the Multimedia Search and Mining Group, Microsoft Research Asia (MSRA). He received his Ph.D. degree in pattern recognition and intelligent systems from the Institute of Automation, Chinese Academy of Sciences, in 2015. His current research interests include computer vision and multimedia analysis. He has authored or coauthored more than 100 papers in top journals and conferences, as well as one book chapter. He served as lead organizer and guest editor of the special issue on Fine-Grained Categorization in IEEE TPAMI from 2018 to 2021. He served as an Industry-Track co-chair of ACM Multimedia 2021 and as an area chair of ACM Multimedia 2018–2021 and IEEE ICME 2019–2021. He received the Best Paper Award at ACM Multimedia 2018. He has shipped core technologies to many Microsoft products, including Windows, Office, Bing, and XiaoIce. In addition, his team published the first AI-created poetry book in 2018.

 

Abstract:

Images and videos are becoming the language people use to communicate on the Internet. Multimedia content connects people and appeals to the young. This project aims at deep image and video transformation: generating high-quality image and video content automatically and creating more engaging experiences for modern work and life. Our vision is broad, focusing on developing state-of-the-art AI technology for fast, reliable, and cost-effective content creation, communication, and consumption. Our technology can benefit multiple experiences across M365, including enterprise, education, consumer, and device-specific experiences.


Talk 2 Title: Multimodal Learning and Pretraining

Speaker: Bei Liu, Researcher, Microsoft Research Asia (MSRA)

Time: Saturday, October 23, 2021, 16:10–16:50

Venue: Conference Room 31524, Building 3, Xipu Campus, Southwest Jiaotong University

Host: Zhaoquan Yuan (袁召全)

Speaker Bio:

Bei Liu is a Researcher in the Multimedia Search and Mining Group, Microsoft Research Asia (MSRA). Before joining Microsoft, she received her Ph.D. and master's degrees from the Department of Social Informatics, Kyoto University, Japan, in 2018 and 2014, respectively. She received her B.S. degree from the Institute of Software, Nanjing University, China, in 2011. Her current research interests include vision and language, visual creation, and object detection. She received the Best Paper Award at ACM Multimedia 2018 and the IEEE Transactions on Multimedia 2020 Outstanding Reviewer Award. She serves as a reviewer or meta-reviewer for IEEE Transactions on Multimedia, ACM Multimedia, IEEE International Conference on Multimedia and Expo (ICME), ACM International Conference on Multimedia Retrieval (ICMR), and AAAI. She also organized challenges at ACM Multimedia 2019 and 2020 and a workshop at ICMR 2021.

 

Abstract:

With the success of single-modality learning (e.g., computer vision, natural language processing, speech recognition), multimodal learning (e.g., vision-language, speech-language, vision-speech), which involves multiple modalities in a single task, has attracted increasing attention in recent years. Representation learning has always been a key challenge in many multimodal tasks. Pretraining has emerged as a way to learn strong representations in many single-modality fields. In the last few years, we have witnessed many research works on multimodal pretraining that have achieved state-of-the-art performance on many vision-and-language tasks (e.g., image-text retrieval, visual question answering). In this talk, I will introduce our research on multimodal learning tasks and our recent work on multimodal pretraining.