Computer Vision and Pattern Recognition

Authors and titles for recent submissions

  • Thu, 23 Apr 2026
  • Wed, 22 Apr 2026
  • Tue, 21 Apr 2026
  • Mon, 20 Apr 2026
  • Fri, 17 Apr 2026

Total of 768 entries

Thu, 23 Apr 2026 (showing first 50 of 106 entries)

[1] arXiv:2604.20841 [pdf, html, other]
Title: DeVI: Physics-based Dexterous Human-Object Interaction via Synthetic Video Imitation
Hyeonwoo Kim, Jeonghwan Kim, Kyungwon Cho, Hanbyul Joo
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[2] arXiv:2604.20822 [pdf, html, other]
Title: Global Offshore Wind Infrastructure: Deployment and Operational Dynamics from Dense Sentinel-1 Time Series
Thorsten Hoeser, Felix Bachofer, Claudia Kuenzer
Comments: 25 pages, 16 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[3] arXiv:2604.20813 [pdf, html, other]
Title: Adapting TrOCR for Printed Tigrinya Text Recognition: Word-Aware Loss Weighting for Cross-Script Transfer Learning
Yonatan Haile Medhanie, Yuanhua Ni
Comments: Code and models available at this https URL Pre-trained models: this https URL, this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[4] arXiv:2604.20806 [pdf, html, other]
Title: OMIBench: Benchmarking Olympiad-Level Multi-Image Reasoning in Large Vision-Language Model
Qiguang Chen, Chengyu Luan, Jiajun Wu, Qiming Yu, Yi Yang, Yizhuo Li, Jingqi Tong, Xiachong Feng, Libo Qin, Wanxiang Che
Comments: ACL 2026 Camera Ready
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[5] arXiv:2604.20800 [pdf, other]
Title: LEXIS: LatEnt ProXimal Interaction Signatures for 3D HOI from an Image
Dimitrije Antić, Alvaro Budria, George Paschalidis, Sai Kumar Dwivedi, Dimitrios Tzionas
Comments: 26 pages, 11 figures, 4 tables. Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[6] arXiv:2604.20796 [pdf, other]
Title: LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model
Inclusion AI, Tiwei Bie, Haoxing Chen, Tieyuan Chen, Zhenglin Cheng, Long Cui, Kai Gan, Zhicheng Huang, Zhenzhong Lan, Haoquan Li, Jianguo Li, Tao Lin, Qi Qin, Hongjun Wang, Xiaomei Wang, Haoyuan Wu, Yi Xin, Junbo Zhao
Comments: LLaDA2.0-Uni Technical Report
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[7] arXiv:2604.20784 [pdf, html, other]
Title: GeoRect4D: Geometry-Compatible Generative Rectification for Dynamic Sparse-View 3D Reconstruction
Zhenlong Wu, Zihan Zheng, Xuanxuan Wang, Qianhe Wang, Hua Yang, Xiaoyun Zhang, Qiang Hu, Wenjun Zhang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[8] arXiv:2604.20760 [pdf, html, other]
Title: Exploring High-Order Self-Similarity for Video Understanding
Manjin Kim, Heeseung Kwon, Karteek Alahari, Minsu Cho
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[9] arXiv:2604.20748 [pdf, html, other]
Title: Amodal SAM: A Unified Amodal Segmentation Framework with Generalization
Bo Zhang, Zhuotao Tian, Xin Tao, Songlin Tang, Jun Yu, Wenjie Pei
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[10] arXiv:2604.20730 [pdf, html, other]
Title: Render-in-the-Loop: Vector Graphics Generation via Visual Self-Feedback
Guotao Liang, Zhangcheng Wang, Juncheng Hu, Haitao Zhou, Ziteng Xue, Jing Zhang, Dong Xu, Qian Yu
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[11] arXiv:2604.20715 [pdf, html, other]
Title: GeoRelight: Learning Joint Geometrical Relighting and Reconstruction with Flexible Multi-Modal Diffusion Transformers
Yuxuan Xue, Ruofan Liang, Egor Zakharov, Timur Bagautdinov, Chen Cao, Giljoo Nam, Shunsuke Saito, Gerard Pons-Moll, Javier Romero
Comments: CVPR 2026 Highlight; Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[12] arXiv:2604.20705 [pdf, html, other]
Title: SSL-R1: Self-Supervised Visual Reinforcement Post-Training for Multimodal Large Language Models
Jiahao Xie, Alessio Tonioni, Nathalie Rauschmayr, Federico Tombari, Bernt Schiele
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[13] arXiv:2604.20696 [pdf, html, other]
Title: R-CoV: Region-Aware Chain-of-Verification for Alleviating Object Hallucinations in LVLMs
Jiahao Xie, Alessio Tonioni, Nathalie Rauschmayr, Federico Tombari, Bernt Schiele
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[14] arXiv:2604.20665 [pdf, html, other]
Title: The Expense of Seeing: Attaining Trustworthy Multimodal Reasoning Within the Monolithic Paradigm
Karan Goyal, Dikshant Kukreja
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[15] arXiv:2604.20650 [pdf, html, other]
Title: MAPRPose: Mask-Aware Proposal and Amodal Refinement for Multi-Object 6D Pose Estimation
Yang Luo, Yan Gong, Yongsheng Gao, Xiaoying Sun, Jie Zhao
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[16] arXiv:2604.20623 [pdf, html, other]
Title: RSRCC: A Remote Sensing Regional Change Comprehension Benchmark Constructed via Retrieval-Augmented Best-of-N Ranking
Roie Kazoom, Yotam Gigi, George Leifman, Tomer Shekel, Genady Beryozkin
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[17] arXiv:2604.20606 [pdf, html, other]
Title: Beyond ZOH: Advanced Discretization Strategies for Vision Mamba
Fady Ibrahim, Guangjun Liu, Guanghui Wang
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[18] arXiv:2604.20594 [pdf, html, other]
Title: Physics-Informed Conditional Diffusion for Motion-Robust Retinal Temporal Laser Speckle Contrast Imaging
Qian Chen, Yuehao Chen, Qiang Wang, Lei Zhu, Yanye Lu, Qiushi Ren
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[19] arXiv:2604.20591 [pdf, html, other]
Title: Structure-Augmented Standard Plane Detection with Temporal Aggregation in Blind-Sweep Fetal Ultrasound
Keli Niu, He Zhao, Qianhui Men
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[20] arXiv:2604.20585 [pdf, html, other]
Title: On the Impact of Face Segmentation-Based Background Removal on Recognition and Morphing Attack Detection
Eduarda Caldeira, Guray Ozgur, Fadi Boutros, Naser Damer
Comments: Accepted at FG 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[21] arXiv:2604.20574 [pdf, html, other]
Title: Where are they looking in the operating room?
Keqi Chen, Séraphin Baributsa, Lilien Schewski, Vinkle Srivastav, Didier Mutter, Guido Beldi, Sandra Keller, Nicolas Padoy
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[22] arXiv:2604.20570 [pdf, html, other]
Title: Exploring Spatial Intelligence from a Generative Perspective
Muzhi Zhu, Shunyao Jiang, Huanyi Zheng, Zekai Luo, Hao Zhong, Anzhou Li, Kaijun Wang, Jintao Rong, Yang Liu, Hao Chen, Tao Lin, Chunhua Shen
Comments: Accepted by CVPR 2026. Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[23] arXiv:2604.20544 [pdf, html, other]
Title: Evian: Towards Explainable Visual Instruction-tuning Data Auditing
Zimu Jia, Mingjie Xu, Andrew Estornell, Jiaheng Wei
Comments: Accepted at ACL 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[24] arXiv:2604.20543 [pdf, html, other]
Title: RefAerial: A Benchmark and Approach for Referring Detection in Aerial Images
Guyue Hu, Hao Song, Yuxing Tong, Duzhi Yuan, Dengdi Sun, Aihua Zheng, Chenglong Li, Jin Tang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[25] arXiv:2604.20486 [pdf, html, other]
Title: ProMMSearchAgent: A Generalizable Multimodal Search Agent Trained with Process-Oriented Rewards
Wentao Yan, Shengqin Wang, Huichi Zhou, Yihang Chen, Kun Shao, Yuan Xie, Zhizhong Zhang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[26] arXiv:2604.20474 [pdf, html, other]
Title: Random Walk on Point Clouds for Feature Detection
Yuhe Zhang, Zhikun Tu, Zhi Li, Jian Gao, Bao Guo, Shunli Zhang
Comments: 20 pages, 11 figures. Published in Information Sciences
Journal-ref: Information Sciences 709 (2025) 122082
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[27] arXiv:2604.20473 [pdf, html, other]
Title: Video-ToC: Video Tree-of-Cue Reasoning
Qizhong Tan, Zhuotao Tian, Guangming Lu, Jun Yu, Wenjie Pei
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[28] arXiv:2604.20470 [pdf, html, other]
Title: DynamicRad: Content-Adaptive Sparse Attention for Long Video Diffusion
Yongji Long, Shijun Liang, Jintao Li, Yun Li
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[29] arXiv:2604.20460 [pdf, html, other]
Title: CCTVBench: Contrastive Consistency Traffic VideoQA Benchmark for Multimodal LLMs
Xingcheng Zhou, Hao Guo, Rui Song, Walter Zimmer, Mingyu Liu, André Schamschurko, Hu Cao, Alois Knoll
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[30] arXiv:2604.20429 [pdf, html, other]
Title: Fast-then-Fine: A Two-Stage Framework with Multi-Granular Representation for Cross-Modal Retrieval in Remote Sensing
Xi Chen, Xu Chen, Xiangyang Jia, Xu Zhang, Shuquan Wei, Wei Wang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[31] arXiv:2604.20395 [pdf, html, other]
Title: SpaCeFormer: Fast Proposal-Free Open-Vocabulary 3D Instance Segmentation
Chris Choy, Junha Lee, Chunghyun Park, Minsu Cho, Jan Kautz
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[32] arXiv:2604.20393 [pdf, html, other]
Title: MLG-Stereo: ViT Based Stereo Matching with Multi-Stage Local-Global Enhancement
Haoyu Zhang, Jingyi Zhou, Peng Ye, Jiakang Yuan, Lin Zhang, Feng Xu, Tao Chen
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[33] arXiv:2604.20392 [pdf, html, other]
Title: Self-supervised pretraining for an iterative image size agnostic vision transformer
Nedyalko Prisadnikov, Danda Pani Paudel, Yuqian Fu, Luc Van Gool
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[34] arXiv:2604.20368 [pdf, html, other]
Title: LaplacianFormer: Rethinking Linear Attention with Laplacian Kernel
Zhe Feng, Sen Lian, Changwei Wang, Muyang Zhang, Tianlong Tan, Rongtao Xu, Weiliang Meng, Xiaopeng Zhang
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[35] arXiv:2604.20366 [pdf, html, other]
Title: Mitigating Hallucinations in Large Vision-Language Models without Performance Degradation
Xingyu Zhu, Junfeng Fang, Shuo Wang, Beier Zhu, Zhicai Wang, Yonghui Yang, Xiangnan He
Comments: ACL 2026 (Oral)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[36] arXiv:2604.20361 [pdf, html, other]
Title: Object Referring-Guided Scanpath Prediction with Perception-Enhanced Vision-Language Models
Rong Quan, Yantao Lai, Dong Liang, Jie Qin
Comments: ICMR 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[37] arXiv:2604.20358 [pdf, html, other]
Title: ConeSep: Cone-based Robust Noise-Unlearning Compositional Network for Composed Image Retrieval
Zixu Li, Yupeng Hu, Zhiwei Chen, Mingyu Zhang, Zhiheng Fu, Liqiang Nie
Comments: Accepted by CVPR 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[38] arXiv:2604.20357 [pdf, html, other]
Title: SignDATA: Data Pipeline for Sign Language Translation
Kuanwei Chen, Tingyi Lin
Comments: 7 pages, 1 figure
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[39] arXiv:2604.20354 [pdf, html, other]
Title: Hallucination Early Detection in Diffusion Models
Federico Betti, Lorenzo Baraldi, Lorenzo Baraldi, Rita Cucchiara, Nicu Sebe
Comments: 21 pages, 6 figures, 4 tables. Published in International Journal of Computer Vision (IJCV)
Journal-ref: Int. J. Comput. Vis. 134, 35 (2026)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[40] arXiv:2604.20350 [pdf, html, other]
Title: X-PCR: A Benchmark for Cross-modality Progressive Clinical Reasoning in Ophthalmic Diagnosis
Gui Wang, Zehao Zhong, YongSong Zhou, Yudong Li, Ende Wu, Wooi Ping Cheah, Rong Qu, Jianfeng Ren, Linlin Shen
Comments: Accepted by CVPR 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[41] arXiv:2604.20336 [pdf, html, other]
Title: Stability-Driven Motion Generation for Object-Guided Human-Human Co-Manipulation
Jiahao Xu, Xiaohan Yuan, Xingchen Wu, Chongyang Xu, Kun Li, Buzhen Huang
Comments: CVPR 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[42] arXiv:2604.20329 [pdf, html, other]
Title: Image Generators are Generalist Vision Learners
Valentin Gabeur, Shangbang Long, Songyou Peng, Paul Voigtlaender, Shuyang Sun, Yanan Bao, Karen Truong, Zhicheng Wang, Wenlei Zhou, Jonathan T. Barron, Kyle Genova, Nithish Kannen, Sherry Ben, Yandong Li, Mandy Guo, Suhas Yogin, Yiming Gu, Huizhong Chen, Oliver Wang, Saining Xie, Howard Zhou, Kaiming He, Thomas Funkhouser, Jean-Baptiste Alayrac, Radu Soricut
Comments: Project Page: this http URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[43] arXiv:2604.20328 [pdf, html, other]
Title: Hybrid Latent Reasoning with Decoupled Policy Optimization
Tao Cheng, Shi-Zhe Chen, Hao Zhang, Yixin Qin, Jinwen Luo, Zheng Wei
Comments: Tech report
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[44] arXiv:2604.20319 [pdf, html, other]
Title: SurgCoT: Advancing Spatiotemporal Reasoning in Surgical Videos through a Chain-of-Thought Benchmark
Gui Wang, YongSong Zhou, Kaijun Deng, Wooi Ping Cheah, Rong Qu, Jianfeng Ren, Linlin Shen
Comments: Accepted by CVPR 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[45] arXiv:2604.20318 [pdf, html, other]
Title: UniCVR: From Alignment to Reranking for Unified Zero-Shot Composed Visual Retrieval
Haokun Wen, Xuemeng Song, Haoyu Zhang, Xiangyu Zhao, Weili Guan, Liqiang Nie
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[46] arXiv:2604.20317 [pdf, html, other]
Title: MD-Face: MoE-Enhanced Label-Free Disentangled Representation for Interactive Facial Attribute Editing
Xuan Cui, Yunfei Zhao, Bo Liu, Wei Duan, Xingrong Fan
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[47] arXiv:2604.20307 [pdf, html, other]
Title: Improving Facial Emotion Recognition through Dataset Merging and Balanced Training Strategies
Serap Kırbız
Journal-ref: Journal of the Franklin Institute 362.7 (2025): 107659
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[48] arXiv:2604.20306 [pdf, html, other]
Title: Dual Causal Inference: Integrating Backdoor Adjustment and Instrumental Variable Learning for Medical VQA
Zibo Xu, Qiang Li, Ke Lu, Jin Wang, Weizhi Nie, Yuting Su
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[49] arXiv:2604.20291 [pdf, html, other]
Title: Efficient INT8 Single-Image Super-Resolution via Deployment-Aware Quantization and Teacher-Guided Training
Pham Phuong Nam Nguyen, Nam Tien Le, Thi Kim Trang Vo, Nhu Tinh Anh Nguyen
Comments: 10 pages, 4 figures. Accepted at the Mobile AI (MAI) 2026 Workshop at CVPR 2026
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[50] arXiv:2604.20289 [pdf, html, other]
Title: X-Cache: Cross-Chunk Block Caching for Few-Step Autoregressive World Models Inference
Yixiao Zeng, Jianlei Zheng, Chaoda Zheng, Shijia Chen, Mingdian Liu, Tongping Liu, Tengwei Luo, Yu Zhang, Boyang Wang, Linkun Xu, Siyuan Lu, Bo Tian, Xianming Liu
Comments: Technical Report
Subjects: Computer Vision and Pattern Recognition (cs.CV)