🧍♂️ Biography
Ge-Peng Ji (🔊pronounced: /ɡə pɛŋ dʒiː/) is a final-year PhD candidate at the School of Computing, Australian National University, advised by Professor Nick Barnes. Prior to this, he received his Master’s degree from the School of Computer Science at Wuhan University in 2021. He has published over 30 peer-reviewed academic papers, with more than 10,000 citations and an h-index of 24, and holds six Chinese technical patents. He regularly serves as a reviewer for top-tier AI journals and conferences, including TPAMI, IJCV, TMI, CVPR, ICCV, and MICCAI. He was named to the Stanford/Elsevier Top 2% Scientists List in both 2024 and 2025.
👨💻 Research Interests
His research centers on subtle visual perception (微视觉感知), aiming to model hard-to-detect patterns - often imperceptible to human vision yet semantically meaningful - in complex environments using advanced AI techniques, including computer vision, multimodal learning, and reinforcement learning. Empowering such perceptual capabilities in intelligent systems has broad real-world implications, including 1️⃣ healthcare AI, where early identification of subtle anomalies in medical imaging can enable timely and potentially life-saving interventions; 2️⃣ camouflaged scene understanding, where objects blend into their surroundings due to low contrast and unclear boundaries; and 3️⃣ high-precision vision applications, where identifying tiny objects or subtle structures requires fine-grained representation and accurate localization.
🔥 News
I am currently on the academic job market and open to postdoctoral and research positions. Please feel free to reach out if you think my background may be a good fit for your group. I am also open to interdisciplinary collaborations and welcome connections with researchers from diverse backgrounds, especially in medical applications.
2026.03 🎉 We are excited to launch the Colon-X project, an open initiative aimed at advancing intelligent colonoscopy toward clinical reasoning.
2026.01 Our paper “Frontiers in Intelligent Colonoscopy” is now officially available on the Springer platform. Code can be found on our GitHub repository.
📝 Representative Publications
^ indicates equal contribution, and * denotes corresponding author. For a full list of publications, please visit my 🎓Google Scholar profile.
🚩 [Research Topic #1] Healthcare AI
In healthcare AI, subtle visual perception is critical, as many clinically significant cues manifest as faint, low-contrast, and hard-to-distinguish visual patterns. Examples include early-stage polyps in colonoscopy, ambiguous lesion boundaries, and subtle tissue texture variations in medical imaging. Our goal is to move beyond merely detecting observable abnormalities toward mining subtle clinically-meaningful signals and supporting clinical decision-making, ultimately enabling more reliable early screening, diagnosis, and precision medicine.
[MICCAI 2020] PraNet: Parallel Reverse Attention Network for Polyp Segmentation
Deng-Ping Fan, Ge-Peng Ji, Tao Zhou, Geng Chen, Huazhu Fu*, Jianbing Shen*, Ling Shao, and Ali Borji
Medical Image Computing and Computer Assisted Intervention, Virtual, October 4-8, 2020
Links: 📄Paper (arXiv; Springer; Chinese Translation) | 📦Project Page | 📦Huawei Ascend ModelZoo | 🎬Technical Video at MedicoEval'2020 Workshop | 💡Personal Blog (in Chinese): How to Use “Reverse Thinking” to Find Hidden Early-Stage Lesions
Keywords: #polyp-segmentation, #intelligent-colonoscopy
TL;DR: A pioneering work that introduced a reverse attention mechanism for early lesion detection (particularly effective for weak boundaries), now widely recognized as a standard baseline in medical image segmentation.
Impact: Early acceptance & Oral Presentation (Top 13%) | MICCAI2025 Young Scientist Publication Impact Award (see certificate & award ceremony photo) | #1 most-cited MICCAI paper (by Google Scholar Metrics 2025) | Ranked #1 in accuracy at the MediaEval 2020 Medico Task | Most Influential Application Paper Award at the Jittor Developer Conference 2021 | Featured in the Stanford AI Index Report 2022
Follow-up Work (PraNet-V2): We extend the concept of reverse attention - originally introduced for binary segmentation in PraNet - to a broader range of multi-class segmentation tasks in medical imaging. Please read our CVM 2026 journal publication (IF: 18.3; Most Internationally Influential Chinese Academic Journal; IEEE Xplore; Project Page).
[MICCAI 2021] Progressively Normalized Self-Attention Network for Video Polyp Segmentation
Ge-Peng Ji^, Yu-Cheng Chou^, Deng-Ping Fan*, Geng Chen, Huazhu Fu*, Debesh Jha, Ling Shao
Medical Image Computing and Computer Assisted Intervention, Strasbourg, France, September 27-October 1, 2021
Links: 📄Paper (arXiv; Springer; 中译文) | 📦Project Page | 🎬Introduction Video (YouTube)
Keywords: #video-polyp-segmentation, #intelligent-colonoscopy
TL;DR: We propose a new self-attention mechanism for video polyp segmentation - an extensible, plug-and-play design that delivers ultra-fast inference (170+ FPS on a single RTX 2080 GPU).
Impact: Early acceptance (Top 13%) | MICCAI 2021 Student Travel Award (Top 5.8% = 95/1630) | Widely adopted as a standard baseline with over 200 citations
[MIR 2022] Video Polyp Segmentation: A Deep Learning Perspective (extended version of MICCAI 2021)
Ge-Peng Ji^, Guobao Xiao^, Yu-Cheng Chou^, Deng-Ping Fan*, Kai Zhao, Geng Chen, and Luc Van Gool
Machine Intelligence Research, 2022, 19 (6): 531-549. (IF: 8.7; CiteScore: 13.2; JCR Q1; Most Internationally Influential Chinese Academic Journal)
Keywords: #video-benchmark, #video-polyp-segmentation, #intelligent-colonoscopy
Links: 📄Paper (arXiv; Springer; Chinese Translation) | 📦Project Page | 🎬Introduction Video (~2min) | 📰Featured by the official WeChat account of Machine Intelligence Research
TL;DR: We introduce SUN-SEG, the first large-scale, densely-annotated video polyp segmentation dataset, comprising 1,106 colonoscopy videos with 158.7K human-annotated masks. SUN-SEG has established itself as a standard benchmark for medical video analysis and has been adopted for building the first video foundation model for endoscopy (Endo-FM by Prof. Qi Dou's group at CUHK).
[MIR 2026] Frontiers in Intelligent Colonoscopy
Ge-Peng Ji, Jingyi Liu, Peng Xu, Nick Barnes, Fahad Shahbaz Khan, Salman Khan, Deng-Ping Fan*
Machine Intelligence Research, 2026, 23 (1), 70-114. (IF: 8.7; CiteScore: 13.2; JCR Q1; Most Internationally Influential Chinese Academic Journal)
Keywords: #survey, #multimodal-benchmark, #multimodal-large-language-model, #intelligent-colonoscopy
Links: 📄Paper (arXiv; Springer; Chinese Translation) | 📦Project Page | 📰Featured by the official WeChat account of Machine Intelligence Research
TL;DR: Three key initiatives advancing multimodal intelligence in colonoscopy:
1. 📖ColonSurvey: The most comprehensive survey to date, analyzing 63 datasets and 137 deep models since 2015, and uncovering foundational resources and transferable insights for the next multimodal era of colonoscopy.
2. 🤗ColonINST: The first large-scale multimodal dataset (450K+ VQA entries), enabling instruction-following capabilities for colonoscopy systems.
3. 🤗ColonGPT: A colonoscopy-specific MLLM with a token-efficient design, reducing visual tokens to ~34% without compromising performance, while remaining efficient enough for training and deployment on consumer-grade GPUs.
[arXiv 2026] Colon-X: Advancing Intelligent Colonoscopy toward Clinical Reasoning
Ge-Peng Ji, Jingyi Liu, Deng-Ping Fan*, Huazhu Fu, Nick Barnes
Keywords: #multimodal-benchmark, #clinical-reasoning, #reinforcement-learning, #multimodal-large-language-model, #intelligent-colonoscopy
Links: 📄Paper (arXiv) | 📦Project Page
TL;DR: We launch the Colon-X project, an open initiative aimed at advancing multimodal intelligence in colonoscopy toward clinical reasoning. Three research highlights include:
1. 🤗ColonVQA: The most extensive database ever built for multimodal colonoscopy analysis, distinguished by its scale (212.7K images and 1.1M+ VQA entries), diversity (76 clinical findings), and coverage (18 tasks).
2. Multimodal Understanding: Benchmarking the generalizability (📦ColonEval) and reliability (📦ColonPert) of MLLMs in colonoscopy. The results reveal that clinical outputs from leading MLLMs remain far from robust and trustworthy.
3. Clinical Reasoning: To bring reasoning into colonoscopy AI, we design a multi-agent debating workflow to curate a clinically grounded reasoning dataset (📦ColonReason), and propose an R1-style model (🤗ColonR1) that sets a reproducible SOTA baseline for the community.
🚩 [Research Topic #2] Camouflaged Scene Understanding
Camouflaged scenarios refer to environments in which objects blend seamlessly into their surroundings, making them inherently difficult to perceive due to low contrast and ambiguous boundaries (Cuthill et al., Nature 2005). Understanding such scenarios is crucial for a range of real-world applications, including bushfire detection and wildlife monitoring, search, and rescue. Our goal is to develop models that capture subtle visual cues and contextual dependencies, enabling robust detection of camouflaged objects in complex environments.
[TPAMI 2021] Concealed Object Detection (extended version of CVPR 2020)
Deng-Ping Fan, Ge-Peng Ji, Ming-Ming Cheng*, and Ling Shao
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44 (10): 6024-6042. (IF: 18.6)
Keywords: #image-benchmark, #camouflaged-object-detection, #concealed-object-detection
Links: 📄TPAMI 2021 Journal Paper (arXiv; IEEE Xplore; GitHub; Supplementary Material; Chinese Translation) | 📄CVPR 2020 Conference Paper (CVF Open Access; GitHub; Chinese Translation) | 📦Project Page | 📦Download COD10K dataset | 🎮Online Playground
TL;DR: Introducing COD10K, a large-scale benchmark for camouflaged object detection, featuring comprehensive datasets and evaluation protocols for advancing the field of camouflage-aware computer vision.
Impact: CVPR Oral Paper (Top 5.7% = 335/5865) | ESI Hot Paper (Top 0.1%) | Distinguished Paper at the Jittor Developer Conference 2021 | Spotlight Paper at VALSE 2021 | Covered by "New Scientist" magazine (see the media snapshot) | Helped advance the domestic Jittor deep-learning ecosystem
[MIR 2023] Deep Gradient Learning for Efficient Camouflaged Object Detection
Ge-Peng Ji, Deng-Ping Fan*, Yu-Cheng Chou, Dengxin Dai, Alexander Liniger, Luc Van Gool
Machine Intelligence Research, 2023, 20 (1): 92-108. (IF: 8.7; CiteScore: 13.2; JCR Q1; Most Internationally Influential Chinese Academic Journal)
Keywords: #efficient-camouflaged-object-segmentation, #polyp-segmentation, #industrial-defect-segmentation
Links: 📄Paper (arXiv; Springer; Chinese Translation) | 📦Project Page (supporting PyTorch/Jittor/Ascend platforms) | 🎬Introduction Video (~2min)
TL;DR: Introducing an efficient model (DGNet) for camouflaged object detection that leverages subtle internal texture cues to improve segmentation performance while significantly reducing computational overhead.
[SCIS 2023] SAM Struggles in Concealed Scenes - Empirical Study on Segment Anything
Ge-Peng Ji, Deng-Ping Fan, Peng Xu*, Ming-Ming Cheng, Bowen Zhou, Luc Van Gool
SCIENCE CHINA Information Sciences, 2023, 66: 226101. (IF: 7.6; CiteScore: 12.6; CCF-A; Most Internationally Influential Chinese Academic Journal)
Keywords: #image-benchmark, #promptable-segmentation, #camouflaged-object-segmentation, #polyp-segmentation, #industrial-defect-segmentation
Links: 📄Paper (arXiv; Springer) | 📰Featured by the official WeChat account of SCIENCE CHINA Information Sciences
TL;DR: We evaluate the Segment Anything Model (SAM) on concealed scenes, revealing substantial performance gaps and offering empirical insights for future research.
🚩 [Research Topic #3] High-Precision Vision Applications
In real-world vision systems such as autonomous driving (e.g., distant traffic signs and lane markings), image matting (e.g., hair-level boundary extraction), and remote sensing (e.g., tiny object detection in high-resolution imagery), even pixel-level inaccuracies can lead to critical failures. Our goal is to unlock high-precision representations in complex vision systems, enabling precise structure delineation and reliable tiny object localization.
[ICCV 2021] Full-Duplex Strategy for Video Object Segmentation
Ge-Peng Ji, Deng-Ping Fan*, Keren Fu, Zhe Wu, Jianbing Shen, Ling Shao
International Conference on Computer Vision, Virtual, October 11-17, 2021
Keywords: #video-object-segmentation, #video-salient-object-detection
Links: 📄ICCV 2021 Conference Paper (arXiv; IEEE Xplore; Chinese Translation) | 📄CVMJ 2023 Journal Extension (IF: 18.3; Most Internationally Influential Chinese Academic Journal; arXiv; IEEE Xplore) | 📦Project Page
TL;DR: We introduce a full-duplex learning strategy that enforces mutual spatiotemporal constraints, enabling precise segmentation of moving objects in dynamic scenes.
Impact: Honorable Mention Award at CVMJ 2023 (see the certificate)
[ICCV 2025] LawDIS: Language-Window-based Controllable Dichotomous Image Segmentation
Xinyu Yan, Meijun Sun, Ge-Peng Ji, Fahad Shahbaz Khan, Salman Khan, Deng-Ping Fan*
International Conference on Computer Vision, Honolulu, Hawaii, USA, October 19-23, 2025
Keywords: #promptable-segmentation, #stable-diffusion, #multimodal-model, #dichotomous-image-segmentation
Links: 📄Paper (arXiv; CVF Open Access; Chinese Translation) | 📦Project Page | 🎬Introduction Video (~1min)
TL;DR: Describe, segment, and post-refine! LawDIS enables human users to precisely segment objects with simple language instructions, and interactively post-refine user-specified regions of arbitrary size.
💻 Working Experience
2023.03 - 2023.09 Visiting Scholar, Machine Learning Department, Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), Abu Dhabi, UAE.
2022.04 - 2022.07 Research Assistant, Sensing Intelligence and Machine Learning Laboratory (SIGMA Lab), Wuhan University, Wuhan, China.
2021.06 - 2022.04 Research Intern (Talent Program), Alibaba Group (ICBU Technology Department), Hangzhou, China.
2020.11 - 2021.04 Research Intern, Inception Institute of Artificial Intelligence (IIAI-CV&Med Team), Abu Dhabi, UAE.
💬 Invited Talks
2026.03.19 ANU Seminar Talk: “Multimodal Intelligence in Colonoscopy: Perception, Understanding, and Reasoning”
2023.09.23 Keynote at the Multimodal Data Perception and Learning Workshop, International Conference on Image and Graphics (ICIG 2023): “Towards AI-Powered Colonoscopy” (Page1/Page2)
2023.07.06 Frontiers of Machine Intelligence Forum, 2nd Session: “Camouflaged Scene Perception and Multimodal Applications: An Efficient Camouflaged Object Segmentation Method Based on Gradient Prior Learning” (Page & Recorded Video)
2023.03.31 The 3rd Student Member Sharing Forum of the China Society of Image and Graphics (CSIG): “Towards AI-Powered Colonoscopy” (Page)
2023.01.06 Shenzhen University Talk 2023: “Towards AI-Powered Colonoscopy”
2022.12.07 Anhui University Talk 2022: “Colonoscopy in the AI Era”
2022.10.10 VALSE 2021 Workshop 6 - Spotlight Talk: “Camouflaged Object Detection and its Applications” (Poster)
2021.11.15 Synced-ICCV 2021: “Full-Duplex Strategy for Video Object Segmentation”
2021.08.28 CSIG-ICCV 2021: “Full-Duplex Strategy for Video Object Segmentation” (Page)
2021.05.15 Alibaba Group ICBU Talk 2021: “Camouflaged Object Detection in the Deep Learning Era”
🏆 Honors and Awards
2025 MICCAI Young Scientist Publication Impact Award (Top ~0.2% among 2020-2024 accepted papers, Link)
2025 CVPR 2025 Outstanding Reviewer Award (Top 5.6% = 711/12000+, Link)
2025 Stanford/Elsevier Top 2% Scientists List 2025 (Link)
2024 Stanford/Elsevier Top 2% Scientists List 2024 (Link)
2024 CVM Honorable Paper Mention Award (for 2023 publications, see Journal Message & Certificate)
2024 MIR Outstanding Reviewer Award for 2023 (MIR Journal News)
2023 IEEE Transactions on Medical Imaging Distinguished Reviewer (Bronze Level, 2022 – 2023)
2021 Jittor Developer Conference Distinguished Paper & Most Influential Application Paper
2021 MICCAI Student Travel Award (MICCAI link)
2020 Multimedia Evaluation Benchmark Workshop (Ranked #1 in accuracy metric)
2019 Sparse Representation and Intelligent Analysis Competition of Remote Sensing Images (Co-organized by Huawei & Wuhan University), Object Detection Track (Top 6%)
📃 Academic Services
- Conference Reviewer: ICLR (2022–2023), ICML (2022–2023), NeurIPS (2022–2023), CVPR (2022–2025), ICCV (2023), ECCV (2022), IJCAI (2021–2023), MICCAI (2020, 2022–2023), ICASSP (2023), ICIP (2022–2023), ISMAR (2021), PRCV (2021, 2023), AJCAI (2023), and others.
- Journal Reviewer: TPAMI, IJCV, TIP, TMI, TVCG, TCSVT, TMM, JBHI, CVM, MIR, Information Fusion, Neurocomputing, CVIU, Scientific Reports, ESWA, DSP, Sensors, SPIC, The Visual Computer, IJIST, Diagnostics, BBE, and others.
🔗 Useful Resources
Research
- China Computer Federation (CCF) Recommended List of International Conferences and Journals & Official Announcement (2026.03)
- AI conference deadlines
- Best Paper Awards in Computer Science (1996-2023)
Self-improvement
- Some Collected Resources for New PhD Students – from Philip Torr’s Homepage
- University of Cambridge: How Language and Writing Shape One's Potential in Life
- AI Interview Prep: Transformers and Attention Mechanisms
- GitHub - Tech Interview Handbook
Tools
- awesome-ai-research-writing: Make AI Writing Better for Everyone
- Find any emoji you need at EmojiDB beta!
- New LLM Architecture Gallery by Sebastian Raschka - an illustrated tour of 42 LLM architectures in one place
AI Blogs
2026.03.29 JEPA (Joint-Embedding Predictive Architecture) 👉 Key Takeaways
- What is JEPA (Joint Embedding Predictive Architecture)? from Turing Post
- 14 JEPA Milestones as a Map of AI Progress from Turing Post
- Yann LeCun talks: Munich presentation (September 29, 2023) & Harvard presentation (March 28, 2024)
2026.03.26 From “Reasoning” Thinking to “Agentic” Thinking by Junyang Lin (@JustinLin610) 👉 Key Takeaways
2026.02.03 International AI Safety Report 2026 by Prof. Yoshua Bengio (Board Chair) 👉 Key Takeaways