I'm an AI researcher in the Vision Intelligence Lab, led by Jaechul Kim, at AI Lab, LG Electronics.
At LG Electronics, I've worked on Large-Scale Generative Datasets, Vision Foundation Models (e.g., Object Detection, Panoptic Segmentation, Depth Estimation, Pose Estimation, and Face Recognition), and On-Device AI (e.g., Lightweight Modeling and Quantization).
I completed my Master's degree at Sogang University, advised by Suk-Ju Kang, and collaborated closely with Kyeongbo Kong.
At Sogang University, I've worked on Diffusion Models, Large Language Models, Egocentric Vision, Hand-Object Interaction, Pose/Gaze Estimation, Image Restoration, and Machine Learning.
Additionally, I'm independently pursuing research on AR/VR, Embodied AI, and Robot Learning with Taein Kwon.
[Sep. 2025] Our papers Replace-in-Ego and GenEgo are accepted to ICCV 2025 Workshop.
[Jan. 2025] Our paper Programmable-Room is accepted to IEEE TMM.
[Sep. 2024] Our paper IRP is selected for an oral presentation at the ECCV 2024 Workshop.
[Sep. 2024] Our papers IRP and IHPT are accepted to ECCV 2024 Workshop.
[Aug. 2024] Our paper AttentionHand is selected for an oral presentation at ECCV 2024.
[Jul. 2024] Our paper AttentionHand is accepted to ECCV 2024.
[Mar. 2024] I will start as an AI researcher at AI Lab, LG Electronics.
[Feb. 2024] Our paper SEMixup is accepted to IEEE TIM.
[Aug. 2023] Our paper HANDiffusion is accepted to ICCV 2023 Workshop.
[Jun. 2023] Our paper SAAF is accepted to IEEE Access.
We introduce EgoWorld, a novel two-stage framework that reconstructs the egocentric view from rich exocentric observations, including depth maps, 3D hand poses, and textual descriptions.
We introduce TransHOI, a novel framework for implicit 3D-aware image translation of hand-object interaction, which generates images from different perspectives while preserving appearance details, guided by the user's description of the camera viewpoint.
We propose a novel method that explicitly encodes bounding-box and keypoint locations in a single query and learns their interactions through multi-head attention and a feed-forward network.
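A minimal sketch of this single-query idea (all shapes, names, and the decoder-layer choice below are my illustrative assumptions, not the paper's implementation):

```python
# Illustrative sketch only: one query jointly encodes a person's bounding box
# and keypoints; their interactions are learned via multi-head attention + FFN.
import torch
import torch.nn as nn

class SingleQueryHead(nn.Module):
    def __init__(self, dim=256, num_kpts=17):
        super().__init__()
        # Embed box (4) and keypoint (num_kpts * 2) coordinates into one query.
        self.embed = nn.Linear(4 + num_kpts * 2, dim)
        # Self-/cross-attention plus a feed-forward network model interactions.
        self.layer = nn.TransformerDecoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.box_head = nn.Linear(dim, 4)
        self.kpt_head = nn.Linear(dim, num_kpts * 2)

    def forward(self, boxes, kpts, memory):
        # boxes: (B, Q, 4), kpts: (B, Q, K, 2), memory: image features (B, HW, dim)
        query = self.embed(torch.cat([boxes, kpts.flatten(2)], dim=-1))
        query = self.layer(query, memory)
        return self.box_head(query), self.kpt_head(query)

head = SingleQueryHead()
box, kpt = head(torch.rand(2, 10, 4), torch.rand(2, 10, 17, 2), torch.rand(2, 100, 256))
```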
We introduce a text-guided object replacement framework, Replace-in-Ego, which integrates a vision-language model (VLM)-based segmentation model with a diffusion transformer (DiT).
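A hedged sketch of how such a segment-then-replace pipeline can be wired together (every name below is a placeholder for illustration, not Replace-in-Ego's actual API):

```python
# Placeholder pipeline: a VLM-based segmenter locates the target object,
# then a DiT-based inpainter synthesizes the replacement inside the mask.
from typing import Callable
import numpy as np

def replace_object(
    image: np.ndarray,
    target_text: str,
    replacement_text: str,
    segment: Callable[[np.ndarray, str], np.ndarray],              # VLM segmenter (stub)
    inpaint: Callable[[np.ndarray, np.ndarray, str], np.ndarray],  # DiT inpainter (stub)
) -> np.ndarray:
    mask = segment(image, target_text)             # where to replace
    return inpaint(image, mask, replacement_text)  # what to paint there

# Toy stand-ins so the sketch runs end to end
image = np.zeros((64, 64, 3), dtype=np.float32)
segment = lambda im, txt: np.ones(im.shape[:2], dtype=bool)
inpaint = lambda im, mask, txt: im
result = replace_object(image, "mug", "red cup", segment, inpaint)
```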
We introduce GenEgo, a novel two-stage framework that generates an egocentric view from multimodal exocentric observations, including projected point clouds, 3D hand poses, and textual descriptions.
Programmable-Room interactively creates and edits textured 3D meshes given user-specified language instructions. Using pre-defined modules, it translates each instruction into Python code that is executed in order.
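A toy sketch of this translate-then-execute loop (the module names and scene representation are invented for illustration; they are not Programmable-Room's real modules):

```python
# Invented modules standing in for Programmable-Room's pre-defined ones.
def create_room(scene, width, depth, height):
    scene["room"] = {"size": (width, depth, height)}
    return scene

def apply_texture(scene, target, prompt):
    scene.setdefault("textures", {})[target] = prompt
    return scene

MODULES = {"create_room": create_room, "apply_texture": apply_texture}

def run_program(steps):
    """Execute generated (module_name, kwargs) steps in order on a shared scene."""
    scene = {}
    for name, kwargs in steps:
        scene = MODULES[name](scene, **kwargs)
    return scene

# Steps an instruction-to-code translator might emit for
# "make a 4 x 5 m room with white brick walls"
steps = [
    ("create_room", {"width": 4.0, "depth": 5.0, "height": 2.5}),
    ("apply_texture", {"target": "wall", "prompt": "white brick"}),
]
print(run_program(steps))
```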
We propose AttentionHand, a novel method for text-driven controllable hand image generation. Additionally training on hand images generated by AttentionHand further improved the performance of 3D hand mesh reconstruction.
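A rough sketch of that augmentation recipe (the datasets and shapes below are dummies, not the released training code): mix real hand images with AttentionHand-generated ones and train the reconstructor on the union.

```python
# Dummy tensors stand in for real and AttentionHand-generated hand datasets.
import torch
from torch.utils.data import ConcatDataset, DataLoader, TensorDataset

real = TensorDataset(torch.rand(100, 3, 224, 224), torch.rand(100, 21, 3))
generated = TensorDataset(torch.rand(100, 3, 224, 224), torch.rand(100, 21, 3))

loader = DataLoader(ConcatDataset([real, generated]), batch_size=32, shuffle=True)
for images, mesh_targets in loader:
    pass  # train the 3D hand mesh reconstructor on the mixed batches
```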
We introduce a novel framework, Interactive Room Programmer (IRP), which allows users to conveniently create and modify 3D indoor scenes using natural language.
We propose IHPT, a new diffusion-based model that transfers interacting hand poses between source and target images.
We present a new SEM dataset and a two-stage deep learning method (including SEMixup and SEM-SPNet) that achieve state-of-the-art performance in SEM image restoration and structure prediction under diverse conditions.
We propose a novel gaze tracking method for large screens using a symmetric angle amplifying function and center gravity correction to improve accuracy without personalized calibration, with applications in autonomous vehicles.
AI Researcher @ LG Electronics, South Korea
Mar. 2024 ~ Present
[Project] Large-Scale Generative Datasets for Vision Tasks
[Project] Vision Foundation Model for On-Device
[Under Review] Single Query to Bind Them: Unified Representations for Efficient Human Pose Estimation