Oğuzhan Fatih Kar

I am a Machine Learning Researcher at Apple. My research interests are in building generalist multimodal agents that can perceive, reason, and act in physical and digital worlds.

I received my Ph.D. in Computer Science from EPFL, where I was advised by Amir Zamir. My Ph.D. thesis was on building scalable multimodal foundation models that can process diverse inputs such as images, text, 3D, semantics, and other sensory data to solve a wide variety of real-world tasks. In 2023/2024, I interned at Google working on vision-language models with Federico Tombari. I received my M.S. and B.S. in Electrical Engineering from METU, where I was advised by Figen Oktem.

Email  /  Google Scholar  /  GitHub  /  LinkedIn  /  Twitter

Honors
Recent Work
How Well Does GPT-4o Understand Vision? Evaluating Multimodal Foundation Models on Standard Computer Vision Tasks

R. Ramachandran, A. Garjani, R. Bachmann, A. Atanov*, O.F. Kar*, A. Zamir*
ICLR, 2026
[Website] [Code]

FlexTok: Resampling Images into 1D Token Sequences of Flexible Length

R. Bachmann*, J. Allardice*, D. Mizrahi*, E. Fini, O.F. Kar, E. Amirloo, A. El-Nouby, A. Zamir, A. Dehghan
ICML, 2025
[Website] [Code] [Demo]

4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities

O.F. Kar*, R. Bachmann*, D. Mizrahi*, A. Garjani, M. Gao, D. Griffiths, J. Hu, A. Dehghan, A. Zamir
NeurIPS, 2024
[Website] [Code] [Demo]

BRAVE: Broadening the visual encoding of vision-language models

O.F. Kar, A. Tonioni, P. Poklukar, A. Kulshrestha, A. Zamir, F. Tombari
ECCV, 2024 [Oral, Top 2%]
[Website]

Unraveling the Key Components of OOD Generalization via Diversification

H. Benoit*, L. Jiang*, A. Atanov*, O.F. Kar, M. Rigotti, A. Zamir
ICLR, 2024
[arXiv]

4M: Massively Multimodal Masked Modeling

D. Mizrahi*, R. Bachmann*, O.F. Kar, T. Yeo, M. Gao, A. Dehghan, A. Zamir
NeurIPS, 2023 [Spotlight, Top 4%]
[Website]

Rapid Network Adaptation: Learning to Adapt Neural Networks Using Test-Time Feedback

T. Yeo, O.F. Kar, Z. Sodagar, A. Zamir
ICCV, 2023
[Website]

3D Common Corruptions and Data Augmentation

O.F. Kar, T. Yeo, A. Atanov, A. Zamir
CVPR, 2022 [Oral, Top 4%]
[Website] [Code] [Video] [Live Demo] [TrustML Talk]

Robustness via Cross-domain Ensembles

O.F. Kar*, T. Yeo*, A. Zamir
ICCV, 2021 [Oral, Top 3%]
[Website] [Code] [Video] [Slides]

Robust Learning Through Cross-task Consistency

A. Zamir*, A. Sax*, T. Yeo, O.F. Kar, N. Cheerla, R. Suri, Z. Cao, J. Malik, L. Guibas
arXiv, 2020. CVPR, 2020 [Best Paper Award Nominee, Oral]
[Live Demo] [Visuals] [Website] [Code] [ECCV 2020 Demo Video]

M.S. Work (2018-2021)

(Complete list on Google Scholar)

High-resolution Multi-spectral Imaging with Diffractive Lenses and Learned Reconstruction

F.S. Oktem, O.F. Kar, C. D. Bezek, F. Kamalabadi
IEEE Transactions on Computational Imaging, 2021
[arXiv]

Compressive Spectral Imaging with Diffractive Lenses

O.F. Kar, F.S. Oktem
Optics Letters, 2019
[arXiv]

Real-time Compressive Video Reconstruction for Spatial Multiplexing Cameras

O.F. Kar, A. Gungor, H.E. Guven
IEEE GlobalSIP, 2019
[Visuals]

Learning-based Regularization for Spatial Multiplexing Cameras

O.F. Kar, A. Gungor, H.E. Guven
IEEE GlobalSIP, 2019

A Transform Learning-based Deconvolution Technique with Super-resolution and Microscanning Applications

A. Gungor*, O.F. Kar*
IEEE ICIP, 2019

A Matrix-free Reconstruction Method for Compressive Focal Plane Array Imaging

A. Gungor, O.F. Kar, H.E. Guven
IEEE ICIP, 2018



Last Update: March 2026