Loris Bazzani

Hi, I'm Loris, an AI Research Leader with 15+ years of experience in AI across academia and big tech. My research focuses on adaptive and collaborative multimodal learning and generation.

~10 yrs at Amazon

8+ Products Launched

$100M+ Business Impact

50+ Publications

6800+ Citations

H-index

The Data Space: Controllable Multimodal Data Generation & Privacy

The Model Space: Multimodal Adaptation & Specialization

The Interaction Space: Human-AI Co-Design

The Industry Space: Clear Impact on Real-world Use Cases

2026-now

I am looking for a new home, to join or to build together, where I can lead high-stakes innovation and drive the next generation of AI for industry.

2025-now

Adjunct Professor and Honorary Research Fellow @ University of Verona

Part-time. Driving research on adaptive and collaborative multimodal intelligence, focusing on the intersection of LMMs and interactive systems. Contributing to joint research projects and co-authoring papers with a network of international institutions. Teaching the Data Visualization course as part of the Master’s degree in Data Science.

2016-2025

Principal Scientist @ Amazon

Led high-impact core research and launched 8+ products across Prime Video, Alexa, and mobile/.com shopping, generating $100M+ in business impact and used by millions of users worldwide. The products span auto-tagging of movies and TV series, image accessibility, live sports highlights, virtual try-on, interactive product recommendations, and shopping assistants. I led scientists and developed novel models and architectures for video understanding, vision-language representation, Large Multimodal Models, and diffusion models.

2014-2015

Postdoc @ Dartmouth College

Research on video understanding, saliency in videos, and object localization and detection. Collaborated with Prof. Lorenzo Torresani and Prof. Hugo Larochelle.

2011-2013

Postdoc @ Italian Institute of Technology

Research on video understanding, object recognition, Bayesian networks, and kernel-based methods. Collaborated with Prof. Vittorio Murino.

2009-2012

PhD in Computer Vision and Machine Learning @ University of Verona

Research on person re-identification, video understanding, tracking, and attentional models. Supervised by Prof. Vittorio Murino and Prof. Marco Cristani.

2010

Visiting Student @ University of British Columbia

Collaborated with Prof. Nando de Freitas and Prof. Hugo Larochelle.

📢 News

Jun 22, 2026: Invited speaker at the 2026 summer school on Advanced Topics in AI at Zhejiang Normal University. Thanks Marcello!
Jun 12, 2026: 1/1 paper accepted at MICCAI 2026!
May 21, 2026: Call for papers for the workshop on Human-AI Co-Creation at ECCV 2026.
May 19, 2026: Invited speaker on Adaptive and Collaborative Multimodal Intelligence at the University of Verona. Thanks Alessandro!
May 15, 2026: I will serve as Program Chair at ANAIS 2026.
May 7, 2026: Obtained the Honorary Research Fellow position at the University of Verona.

📝 Research, Publications and Patents

Interactive Episodic Memory with User Feedback

N. Subedi, L. Bazzani, Z. Al-Halah

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2026

Project PDF Code (TBA) Dataset

ProMoE-FL: Prototype-conditioned Mixture of Experts for Multimodal Federated Learning with Missing Modalities

A. Chhetri, B. Niroula, E. Vasquez, Y. R. Shrestha, P. K. Gyawali, L. Bazzani, B. Bhattarai

International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), 2026

PDF (TBA) Code (TBA)

Multi-Level Conditioning by Pairing Localized Text and Sketch for Fashion Image Generation

Z. Liu, D. Talon, F. Girella, Z. Ruan, M. Mondo, L. Bazzani, Y. Wang, M. Cristani

Arxiv, 2026

Project PDF

Med-MMFL: A Multimodal Federated Learning Benchmark in Healthcare

A. Chhetri, B. Niroula, P. Shrestha, Y. R. Shrestha, L. A. Anderson, P. K. Gyawali, L. Bazzani, B. Bhattarai

Arxiv, 2026

PDF Code

Benchmarking Interaction, Beyond Policy: a Reproducible Benchmark for Collaborative Instance Object Navigation

E. Zorzi, F. Taioli, Y. Wang, M. Cristani, A. Farinelli, A. Castellini, L. Bazzani

Arxiv, 2026

Project PDF Code (TBA)

Learning Visual Hierarchies in Hyperbolic Space for Image Retrieval

Z. Wang, S. Ramasinghe, C. Xu, J. Monteil, L. Bazzani, T. Ajanthan

International Conference on Computer Vision (ICCV), 2025

PDF

LatteCLIP: Unsupervised CLIP Fine-Tuning via LMM-Synthetic Texts

A. Cao, M. Jaritz, M. Guillaumin, R. de Charette, L. Bazzani

IEEE Winter Conference on Applications of Computer Vision (WACV), 2025

PDF Code

UniCoRN: Unified Commented Retrieval Network with LMMs

M. Jaritz, M. Guillaumin, S. Sternig, L. Bazzani

Arxiv, 2025

PDF

ToFu: Visual Tokens Reduction via Fusion for Multi-modal, Multi-patch, Multi-image Task

V. Pippi, M. Guillaumin, S. Casciarelli, R. Cucchiara, M. Jaritz, L. Bazzani

Arxiv, 2025

PDF

ViewFusion: Towards Multi-View Consistency via Interpolated Denoising

X. Yang, Y. Zuo, S. Ramasinghe, L. Bazzani, G. Avraham, A. van den Hengel

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024

Project PDF Code

iEdit: Localised Text-guided Image Editing with Weak Supervision

R. Bodur, E. Gundogdu, B. Bhattarai, T-K Kim, M. Donoser, L. Bazzani

Computer Vision and Pattern Recognition (CVPR) Workshops, 2024

PDF

[Patent] Interactive Retrieval Using Visual Semantic Matching

US-11720942, 2023

PDF

[Patent] Localized Visual Similarity

US-11809520, 2023

PDF

[Patent] Attribute-based Interactive Product Recommendations

US-11829445, 2023

PDF

[Patent] Visual Blending of Content

US-11416910, 2022

PDF

[Patent] Machine Learning System to Score Alt-text in Image Data

US-11361212, 2022

PDF

Contrastive Language-Action Pre-training for Temporal Localization

M. Xu, E. Gundogdu, M. Lapin, B. Ghanem, M. Donoser, L. Bazzani

Arxiv, 2022

PDF

Learning Attribute-driven Disentangled Representations for Interactive Fashion Retrieval

Y. Hou, E. Vig, M. Donoser, L. Bazzani

International Conference on Computer Vision (ICCV), 2021

PDF Code

Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning

A. Salvador, E. Gundogdu, L. Bazzani, M. Donoser

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021

PDF Code

Localized Triplet Loss for Fine-Grained Fashion Image Retrieval

A. D’Innocente, N. Garg, Y. Zhang, L. Bazzani, M. Donoser

Computer Vision and Pattern Recognition (CVPR) Workshops, 2021

PDF

Learning Joint Visual Semantic Matching Embeddings for Language-guided Retrieval

Y. Chen, L. Bazzani

European Conference on Computer Vision (ECCV), 2020

PDF

Image Search with Text Feedback by Visiolinguistic Attention Learning

Y. Chen, S. Gong, L. Bazzani

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020

Project PDF Code

[Patent] Automated Video Ratings

US-10643074, 2020

PDF

Image Captioning as Neural Machine Translation Task in SOCKEYE

L. Bazzani, T. Domhan, F. Hieber

Arxiv, 2018

PDF Code

Recurrent Mixture Density Network for Spatiotemporal Visual Attention

L. Bazzani, H. Larochelle, L. Torresani

International Conference on Learning Representations (ICLR), 2017

Project PDF Video

Group Detection and Tracking using Sociological Features

S. Vascon and L. Bazzani

Group and Crowd Behavior for Computer Vision, 2017

PDF

Image and Video Understanding in Big Data

V. Murino, S. Gong, C. C. Loy and L. Bazzani

Special Issue in CVIU, 2017

PDF

Approximate Log-Hilbert-Schmidt Distances Between Covariance Operators for Image Classification

H. Q. Minh, M. San Biagio, L. Bazzani, V. Murino

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016

PDF

Self-taught Object Localization with Deep Networks

L. Bazzani, A. Bergamo, D. Anguelov, L. Torresani

IEEE Winter Conference on Applications of Computer Vision (WACV), 2016

Project PDF Code

A Unifying Framework in Vector-valued Reproducing Kernel Hilbert Spaces for Manifold Regularization and Co-Regularized Multi-view Learning

H. Q. Minh, L. Bazzani, V. Murino

Journal of Machine Learning Research (JMLR), 2016

Project PDF Code

Kernel Methods on Approximate Infinite-Dimensional Covariance Operators for Image Classification

H. Q. Minh, M. San Biagio, L. Bazzani, V. Murino

Arxiv, 2016

PDF

Joint Individual-Group Modeling for Tracking

L. Bazzani^*, M. Zanotto^*, M. Cristani, V. Murino

IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 2015

PDF Video Dataset

SDALF: Modeling Human Appearance with Symmetry-driven Accumulation of Local Features

L. Bazzani, M. Cristani, V. Murino

Person Re-identification, 2014

Project PDF Code Video

Weighted Bag of Visual Words for Object Recognition

L. Bazzani^*, M. San Biagio^*, M. Cristani, V. Murino

IEEE International Conference on Image Processing (ICIP), 2014

PDF

A Unifying Framework for Vector-valued Manifold Regularization and Multi-view Learning

H. Q. Minh, L. Bazzani, V. Murino

The 30th International Conference on Machine Learning (ICML), 2013

Project PDF Code

Semi-Supervised Multi-Feature Learning for Person Re-Identification

D. Figueira, L. Bazzani, H. Q. Minh, M. Cristani, A. Bernardino, V. Murino

International Conference on Advanced Video and Signal-based Surveillance (AVSS), 2013

PDF

Person Re-Identification with a PTZ Camera: An Introductory Study

P. Salvagnini, L. Bazzani, M. Cristani, V. Murino

International Conference on Image Processing (ICIP), 2013

PDF

Symmetry-Driven Accumulation of Local Features for Human Characterization and Re-Identification

L. Bazzani, M. Cristani, V. Murino

Computer Vision and Image Understanding (CVIU), 2013

Project PDF Code Video

Social Interactions by Visual Focus of Attention in a Three-Dimensional Environment

L. Bazzani, D. Tosato, M. Cristani, M. Farenzena, G. Paggetti, G. Menegaz, and V. Murino

Expert Systems, 2013

Project PDF Code Video

Decentralized Particle Filter for Joint Individual-Group Tracking

L. Bazzani, M. Cristani, V. Murino

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012

PDF Video Dataset

Learning Where to Attend with Deep Architectures for Image Tracking

M. Denil, L. Bazzani, H. Larochelle, and N. de Freitas

Neural Computation, 2012

Project PDF Code Video Dataset

Re-Identification with RGB-D Sensors

I. B. Barbosa, M. Cristani, A. Del Bue, L. Bazzani, V. Murino

European Conference on Computer Vision (ECCV) Workshops, 2012

PDF Dataset

Online Bayesian Non-Parametrics for Social Group Detection

M. Zanotto, L. Bazzani, M. Cristani, V. Murino

British Machine Vision Conference (BMVC), 2012

PDF

Analyzing Groups: A Social Signaling Perspective

L. Bazzani, M. Cristani, G. Paggetti, D. Tosato, G. Menegaz, and V. Murino

Video Analytics for Business Intelligence, 2012

Project PDF Code Video

Multiple-Shot Person Re-Identification by Chromatic and Epitomic Analyses

L. Bazzani, M. Cristani, A. Perina, and V. Murino

Pattern Recognition Letters (PRL), 2012

PDF

Learning Attentional Policies for Object Tracking and Recognition in Video with Deep Networks

L. Bazzani, N. de Freitas, H. Larochelle, V. Murino, J-A Ting

The 28th International Conference on Machine Learning (ICML), 2011

Project PDF Code Video Dataset

Custom Pictorial Structures for Re-Identification

D. S. Cheng, M. Cristani, M. Stoppa, L. Bazzani, V. Murino

British Machine Vision Conference (BMVC), 2011

Project PDF Video Dataset

Social Interaction Discovery by Statistical Analysis of F-Formations

M. Cristani, L. Bazzani, G. Paggetti, A. Fossati, D. Tosato, A. Del Bue, G. Menegaz, V. Murino

British Machine Vision Conference (BMVC), 2011

Project PDF Dataset

Towards Computational Proxemics: Inferring Social Relations from Interpersonal Distances

M. Cristani, G. Paggetti, A. Vinciarelli, L. Bazzani, G. Menegaz, V. Murino

International Conference on Social Computing (SocialCom), 2011

PDF

Multiple-Shot Person Re-Identification by HPE Signature

L. Bazzani, M. Cristani, A. Perina, M. Farenzena, V. Murino

International Conference on Pattern Recognition (ICPR), 2010

PDF

Person Re-Identification by Symmetry-Driven Accumulation of Local Features

M. Farenzena, L. Bazzani, A. Perina, M. Cristani, V. Murino

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010

Project PDF Code Video

Collaborative Particle Filters for Group Tracking

L. Bazzani, M. Cristani, V. Murino

International Conference on Image Processing (ICIP), 2010

PDF

Multimodal Learning with Interaction/User Feedback

Interactive Episodic Memory with User Feedback

N. Subedi, L. Bazzani, Z. Al-Halah

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2026

Project PDF Code (TBA) Dataset

Benchmarking Interaction, Beyond Policy: a Reproducible Benchmark for Collaborative Instance Object Navigation

E. Zorzi, F. Taioli, Y. Wang, M. Cristani, A. Farinelli, A. Castellini, L. Bazzani

Arxiv, 2026

Project PDF Code (TBA)

Learning Visual Hierarchies in Hyperbolic Space for Image Retrieval

Z. Wang, S. Ramasinghe, C. Xu, J. Monteil, L. Bazzani, T. Ajanthan

International Conference on Computer Vision (ICCV), 2025

PDF

UniCoRN: Unified Commented Retrieval Network with LMMs

M. Jaritz, M. Guillaumin, S. Sternig, L. Bazzani

Arxiv, 2025

PDF

ToFu: Visual Tokens Reduction via Fusion for Multi-modal, Multi-patch, Multi-image Task

V. Pippi, M. Guillaumin, S. Casciarelli, R. Cucchiara, M. Jaritz, L. Bazzani

Arxiv, 2025

PDF

[Patent] Interactive Retrieval Using Visual Semantic Matching

US-11720942, 2023

PDF

[Patent] Localized Visual Similarity

US-11809520, 2023

PDF

[Patent] Attribute-based Interactive Product Recommendations

US-11829445, 2023

PDF

[Patent] Visual Blending of Content

US-11416910, 2022

PDF

Learning Attribute-driven Disentangled Representations for Interactive Fashion Retrieval

Y. Hou, E. Vig, M. Donoser, L. Bazzani

International Conference on Computer Vision (ICCV), 2021

PDF Code

Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning

A. Salvador, E. Gundogdu, L. Bazzani, M. Donoser

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021

PDF Code

Localized Triplet Loss for Fine-Grained Fashion Image Retrieval

A. D’Innocente, N. Garg, Y. Zhang, L. Bazzani, M. Donoser

Computer Vision and Pattern Recognition (CVPR) Workshops, 2021

PDF

Learning Joint Visual Semantic Matching Embeddings for Language-guided Retrieval

Y. Chen, L. Bazzani

European Conference on Computer Vision (ECCV), 2020

PDF

Image Search with Text Feedback by Visiolinguistic Attention Learning

Y. Chen, S. Gong, L. Bazzani

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020

Project PDF Code

Multimodal Federated Learning

ProMoE-FL: Prototype-conditioned Mixture of Experts for Multimodal Federated Learning with Missing Modalities

A. Chhetri, B. Niroula, E. Vasquez, Y. R. Shrestha, P. K. Gyawali, L. Bazzani, B. Bhattarai

International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), 2026

PDF (TBA) Code (TBA)

Med-MMFL: A Multimodal Federated Learning Benchmark in Healthcare

A. Chhetri, B. Niroula, P. Shrestha, Y. R. Shrestha, L. A. Anderson, P. K. Gyawali, L. Bazzani, B. Bhattarai

Arxiv, 2026

PDF Code

Multimodal Generative Models

Multi-Level Conditioning by Pairing Localized Text and Sketch for Fashion Image Generation

Z. Liu, D. Talon, F. Girella, Z. Ruan, M. Mondo, L. Bazzani, Y. Wang, M. Cristani

Arxiv, 2026

Project PDF

LatteCLIP: Unsupervised CLIP Fine-Tuning via LMM-Synthetic Texts

A. Cao, M. Jaritz, M. Guillaumin, R. de Charette, L. Bazzani

IEEE Winter Conference on Applications of Computer Vision (WACV), 2025

PDF Code

ViewFusion: Towards Multi-View Consistency via Interpolated Denoising

X. Yang, Y. Zuo, S. Ramasinghe, L. Bazzani, G. Avraham, A. van den Hengel

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024

Project PDF Code

iEdit: Localised Text-guided Image Editing with Weak Supervision

R. Bodur, E. Gundogdu, B. Bhattarai, T-K Kim, M. Donoser, L. Bazzani

Computer Vision and Pattern Recognition (CVPR) Workshops, 2024

PDF

[Patent] Machine Learning System to Score Alt-text in Image Data

US-11361212, 2022

PDF

Image Captioning as Neural Machine Translation Task in SOCKEYE

L. Bazzani, T. Domhan, F. Hieber

Arxiv, 2018

PDF Code

Video Understanding

Contrastive Language-Action Pre-training for Temporal Localization

M. Xu, E. Gundogdu, M. Lapin, B. Ghanem, M. Donoser, L. Bazzani

Arxiv, 2022

PDF

[Patent] Automated Video Ratings

US-10643074, 2020

PDF

Recurrent Mixture Density Network for Spatiotemporal Visual Attention

L. Bazzani, H. Larochelle, L. Torresani

International Conference on Learning Representations (ICLR), 2017

Project PDF Video

Image and Video Understanding in Big Data

V. Murino, S. Gong, C. C. Loy and L. Bazzani

Special Issue in CVIU, 2017

PDF

Joint Individual-Group Modeling for Tracking

L. Bazzani^*, M. Zanotto^*, M. Cristani, V. Murino

IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 2015

PDF Video Dataset

Decentralized Particle Filter for Joint Individual-Group Tracking

L. Bazzani, M. Cristani, V. Murino

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012

PDF Video Dataset

Learning Where to Attend with Deep Architectures for Image Tracking

M. Denil, L. Bazzani, H. Larochelle, and N. de Freitas

Neural Computation, 2012

Project PDF Code Video Dataset

Learning Attentional Policies for Object Tracking and Recognition in Video with Deep Networks

L. Bazzani, N. de Freitas, H. Larochelle, V. Murino, J-A Ting

The 28th International Conference on Machine Learning (ICML), 2011

Project PDF Code Video Dataset

Collaborative Particle Filters for Group Tracking

L. Bazzani, M. Cristani, V. Murino

International Conference on Image Processing (ICIP), 2010

PDF

Structured Methods for Images

Approximate Log-Hilbert-Schmidt Distances Between Covariance Operators for Image Classification

H. Q. Minh, M. San Biagio, L. Bazzani, V. Murino

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016

PDF

Self-taught Object Localization with Deep Networks

L. Bazzani, A. Bergamo, D. Anguelov, L. Torresani

IEEE Winter Conference on Applications of Computer Vision (WACV), 2016

Project PDF Code

A Unifying Framework in Vector-valued Reproducing Kernel Hilbert Spaces for Manifold Regularization and Co-Regularized Multi-view Learning

H. Q. Minh, L. Bazzani, V. Murino

Journal of Machine Learning Research (JMLR), 2016

Project PDF Code

Kernel Methods on Approximate Infinite-Dimensional Covariance Operators for Image Classification

H. Q. Minh, M. San Biagio, L. Bazzani, V. Murino

Arxiv, 2016

PDF

Weighted Bag of Visual Words for Object Recognition

L. Bazzani^*, M. San Biagio^*, M. Cristani, V. Murino

IEEE International Conference on Image Processing (ICIP), 2014

PDF

A Unifying Framework for Vector-valued Manifold Regularization and Multi-view Learning

H. Q. Minh, L. Bazzani, V. Murino

The 30th International Conference on Machine Learning (ICML), 2013

Project PDF Code

Behavioral and Interaction Analysis in Videos

Group Detection and Tracking using Sociological Features

S. Vascon and L. Bazzani

Group and Crowd Behavior for Computer Vision, 2017

PDF

Social Interactions by Visual Focus of Attention in a Three-Dimensional Environment

L. Bazzani, D. Tosato, M. Cristani, M. Farenzena, G. Paggetti, G. Menegaz, and V. Murino

Expert Systems, 2013

Project PDF Code Video

Online Bayesian Non-Parametrics for Social Group Detection

M. Zanotto, L. Bazzani, M. Cristani, V. Murino

British Machine Vision Conference (BMVC), 2012

PDF

Analyzing Groups: A Social Signaling Perspective

L. Bazzani, M. Cristani, G. Paggetti, D. Tosato, G. Menegaz, and V. Murino

Video Analytics for Business Intelligence, 2012

Project PDF Code Video

Social Interaction Discovery by Statistical Analysis of F-Formations

M. Cristani, L. Bazzani, G. Paggetti, A. Fossati, D. Tosato, A. Del Bue, G. Menegaz, V. Murino

British Machine Vision Conference (BMVC), 2011

Project PDF Dataset

Towards Computational Proxemics: Inferring Social Relations from Interpersonal Distances

M. Cristani, G. Paggetti, A. Vinciarelli, L. Bazzani, G. Menegaz, V. Murino

International Conference on Social Computing (SocialCom), 2011

PDF

Cross-Camera Identification

SDALF: Modeling Human Appearance with Symmetry-driven Accumulation of Local Features

L. Bazzani, M. Cristani, V. Murino

Person Re-identification, 2014

Project PDF Code Video

Semi-Supervised Multi-Feature Learning for Person Re-Identification

D. Figueira, L. Bazzani, H. Q. Minh, M. Cristani, A. Bernardino, V. Murino

International Conference on Advanced Video and Signal-based Surveillance (AVSS), 2013

PDF

Person Re-Identification with a PTZ Camera: An Introductory Study

P. Salvagnini, L. Bazzani, M. Cristani, V. Murino

International Conference on Image Processing (ICIP), 2013

PDF

Symmetry-Driven Accumulation of Local Features for Human Characterization and Re-Identification

L. Bazzani, M. Cristani, V. Murino

Computer Vision and Image Understanding (CVIU), 2013

Project PDF Code Video

Re-Identification with RGB-D Sensors

I. B. Barbosa, M. Cristani, A. Del Bue, L. Bazzani, V. Murino

European Conference on Computer Vision (ECCV) Workshops, 2012

PDF Dataset

Multiple-Shot Person Re-Identification by Chromatic and Epitomic Analyses

L. Bazzani, M. Cristani, A. Perina, and V. Murino

Pattern Recognition Letters (PRL), 2012

PDF

Custom Pictorial Structures for Re-Identification

D. S. Cheng, M. Cristani, M. Stoppa, L. Bazzani, V. Murino

British Machine Vision Conference (BMVC), 2011

Project PDF Video Dataset

Multiple-Shot Person Re-Identification by HPE Signature

L. Bazzani, M. Cristani, A. Perina, M. Farenzena, V. Murino

International Conference on Pattern Recognition (ICPR), 2010

PDF

Person Re-Identification by Symmetry-Driven Accumulation of Local Features

M. Farenzena, L. Bazzani, A. Perina, M. Cristani, V. Murino

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010

Project PDF Code Video

🦄 Personal Projects

Music Album: Accepted Insanity

As a long-time keyboardist, I am currently composing and producing original synthwave music. This album explores the insanity that surrounds us but that we are somehow programmed to accept.

Project

Personal Hybrid Training Planner

As a runner who is passionate about strength training, I created a planner that leverages AI to build highly personalized, multi-week training plans tailored to your specific requirements.

Project
