Announcements

12/16 [Invited Talk] Dr. Jonghyun Choi, Seoul National University, Korea: “Understanding Sequences of Visual Data”

Body

Topic:

Understanding Sequences of Visual Data

 

Abstract:
I will present some of our work on vision-and-language understanding, spanning video understanding, embodied AI, and continual learning.

In the video understanding task, we explore methods that localize moments in videos, generate image sequences for a story, and answer questions about a given video. Our work addresses several of the underlying challenges, including spatiotemporal attention mechanisms that encode long-horizon information.

Transitioning to embodied AI, I will discuss advancements in developing agents capable of performing goal-directed tasks in simulated environments.

By integrating video understanding with action sequence prediction and imitation learning, we enhance the agent's ability to adapt and plan effectively in dynamic scenarios.

If time permits, the talk will also highlight the role of continual learning and environmental adaptability, showcasing their importance for robust and interactive AI systems.


Speaker:

Dr. Jonghyun Choi

Bio:
Dr. Jonghyun Choi is an associate professor in the Department of Electrical and Computer Engineering at Seoul National University (SNU).

He was an associate professor at Yonsei University (2022-2024), an assistant professor at GIST (2018-2022), a researcher at the Allen Institute for Artificial Intelligence (AI2) (2016-2018), and a researcher at Comcast Labs, DC (2015).

He received his Ph.D. from the University of Maryland, College Park, under the supervision of Prof. Larry S. Davis, and his B.S. and M.S. degrees from Seoul National University, under the supervision of Prof. Kyoung-Mu Lee in the SNU Computer Vision Lab.

During his Ph.D., he worked as a research intern at Microsoft Research, Redmond (Summer 2014), Disney Research, Pittsburgh (Spring 2014), Adobe Research, San Jose (Summer 2013), and the U.S. Army Research Lab, Adelphi, MD (Summer 2011).

His research focuses on developing multi-modal perception models, algorithms, and systems that are accurate yet efficient in terms of labeling cost and the computational complexity of training and inference.

URL: https://ppolon.github.io/