I am a Ph.D. student in the Department of Artificial Intelligence at Korea University, advised by Prof. Sangpil Kim. Previously, I spent one year as a visiting researcher at the Samsung Advanced Institute of Technology (SAIT), collaborating with the Computer Vision Lab (advisor: Sujin Jang). My research interests lie at the intersection of computer vision, multi-modal generative models, and embodied AI. My work explores how multi-modal generative models can interpret and synthesize complex sensory inputs, such as language, vision, and audio, within unified frameworks. I am particularly interested in developing 3D visual perception and domain adaptation methods that enable embodied AI systems to operate reliably in real-world environments. 🚗🤖💬
Most recent publications on Google Scholar.
* indicates equal contribution.
Test-Time Adaptation for Online Vision-Language Navigation with Feedback-based Reinforcement Learning
Sungjune Kim*, Gyeongrok Oh*, Heeju Ko, Daehyun Ji, Dongwook Lee, Byung-Jun Lee, Sujin Jang, Sangpil Kim
ICML, 2025.
3D Occupancy Prediction with Low-Resolution Queries via Prototype-aware View Transformation
Gyeongrok Oh*, Sungjune Kim*, Heeju Ko, Hyung-gun Chi, Jinkyu Kim, Dongwook Lee, Daehyun Ji, Sungjoon Choi, Sujin Jang, Sangpil Kim
CVPR, 2025.
FPANet: Frequency-based Video Demoireing using Frame-level Post Alignment
Gyeongrok Oh, Sungjune Kim, Heon Gu, Sang Ho Yoon, Jinkyu Kim*, Sangpil Kim*
Neural Networks, 2024.
MEVG: Multi-event Video Generation with Text-to-Video Models
Gyeongrok Oh, Jaehwan Jeong, Sieun Kim, Wonmin Byeon, Jinkyu Kim, Sungwoong Kim, Sangpil Kim
ECCV, 2024.
Sound-Guided Semantic Video Generation
Seunghyun Lee, Gyeongrok Oh, Wonmin Byeon, Wonjeong Ryoo, Sang Ho Yoon, Jinkyu Kim*, Sangpil Kim*
ECCV, 2022.
LVMark: Robust Watermark for Latent Video Diffusion Models
MinHyuk Jang*, Youngdong Jang*, JaeHyeok Lee, Feng Yang, Gyeongrok Oh, Jongheon Jeong, Sangpil Kim
Pre-print, 2025.
Robust Sound-Guided Image Manipulation
Seunghyun Lee*, Hyung-gun Chi*, Gyeongrok Oh, Wonmin Byeon, Sang Ho Yoon, Hyunje Park, Wonjun Cho, Jinkyu Kim, Sangpil Kim
Neural Networks, 2024.
Audio-Guided Implicit Neural Representation for Local Image Stylization
Seung Hyun Lee*, Sieun Kim*, Wonmin Byeon, Gyeongrok Oh, Sumin In, Hyeongcheol Park, Sang Ho Yoon, Sung-hee Hong, Jinkyu Kim, Sangpil Kim
Computational Visual Media, 2024.
CMDA: Cross-Modal and Domain Adversarial Adaptation for LiDAR-based 3D Object Detection
Gyusam Chang*, Wonseok Roh*, Sujin Jang, Dongwook Lee, Daehyun Ji, Gyeongrok Oh, Jinsun Park, Jinkyu Kim, Sangpil Kim
AAAI, 2024.
Functional Hand Type Prior for 3D Hand Pose Estimation and Action Recognition from Egocentric View Monocular Videos
Wonseok Roh, Seung Hyun Lee, Wonjeong Ryoo, Gyeongrok Oh, Soo Yeon Hwang, Hyung-gun Chi, Sangpil Kim
BMVC (Oral), 2023.
Full resume in PDF (last updated: Mar. 2025).