- Table of Contents
- 🤗MINED
- 🎯Main Results
- 🛠️Requirements and Installation
- 💥Inference
- 🤖Evaluation
- 📊Customize inference data and task instructions
- 🤝 Acknowledgments
- 📝 Citation
Large Multimodal Models (LMMs) encode rich factual knowledge via cross-modal pre-training, yet their static representations struggle to maintain an accurate understanding of time-sensitive factual knowledge. Existing benchmarks remain constrained by static designs and inadequately evaluate LMMs' ability to understand time-sensitive knowledge. To address this gap, we propose MINED, a comprehensive benchmark that evaluates temporal awareness along 6 key dimensions: cognition, awareness, trustworthiness, understanding, reasoning, and robustness, covering 11 challenging tasks. MINED is constructed from Wikipedia by two professional annotators and contains 2,104 time-sensitive knowledge samples spanning six knowledge types. Evaluating 15 widely used LMMs on MINED shows that Gemini-2.5-Pro achieves the highest average CEM score of 63.07, while most open-source LMMs still lack the ability to understand time-sensitive knowledge. LMMs perform best on organization knowledge and weakest on sport knowledge. To address these challenges, we investigate the feasibility of updating time-sensitive knowledge in LMMs through knowledge editing methods and observe that LMMs can effectively update knowledge via knowledge editing in single-editing scenarios.
You can download the data from the 🤗 Hugging Face Dataset. The expected file structure is:
MINED
|-- inference_data (json/jsonl)
| |-- Dimension1_time_agnostic.json
| |-- Dimension1_temporal_interval.json
| |-- Dimension1_timestamp.json
| |-- Dimension2_awareness_future.json
| |-- Dimension2_awareness_past.json
| |-- Dimension3_future_unanswerable_date.json
| |-- Dimension3_previous_unanswerable_date.json
| |-- Dimension4_understanding.json
| |-- Dimension5_calculation.json
| |-- Dimension5_ranking.json
| |-- Dimension6_robustness.json
|-- imgs
| |-- MINED_Image.zip
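For programmatic download, a minimal sketch using the huggingface_hub client is shown below. The dataset repo ID is a placeholder; substitute the actual MINED dataset path from the Hugging Face page linked above.

```python
from huggingface_hub import snapshot_download
import os
import zipfile

# NOTE: placeholder repo ID -- replace with the actual MINED dataset path on Hugging Face.
REPO_ID = "your-org/MINED"

# Download the full dataset snapshot (JSON files and the image archive).
local_dir = snapshot_download(repo_id=REPO_ID, repo_type="dataset", local_dir="./MINED")

# Unpack the image archive so that image paths referenced in the JSON files resolve.
with zipfile.ZipFile(os.path.join(local_dir, "imgs", "MINED_Image.zip")) as zf:
    zf.extractall(os.path.join(local_dir, "imgs"))
```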
For requirements and installation, you can refer to VLMEvalKit: https://github.com/open-compass/VLMEvalKit.git
python inference.py \
--meta_save_path ./path/output \
--model_name {base_model_name} \
--data_eval_type {data_eval_type} \
--max_new_token 10 \
--image_path_prefix ./path/image_data

model_name refers to the model name defined in the VLMEvalKit/vlmeval/config.py file.
data_eval_type options:
- time_agnostic: Knowledge understanding independent of time
- timestamp: Reasoning about facts at a specific time point
- temporal_interval: Reasoning about facts/states within a time interval
- awareness_future: Future temporal awareness and prediction consistency
- awareness_past: Past temporal awareness and retrospective consistency
- future_unanswerable_date: Unanswerable queries concerning future dates
- previous_unanswerable_date: Unanswerable queries concerning past dates
- ranking: Ordering/comparison based on time-sensitive attributes
- understanding: Understanding complex temporal semantics and inference
- calculation: Date/time-related arithmetic and derivation
- robustness: Robustness to temporal perturbations and phrasing variations
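To sweep several of the data_eval_type options above in one run, a minimal driver sketch is shown below. It only wraps the documented inference.py flags; the model name and paths are placeholders to adapt to your setup.

```python
import subprocess

# Placeholders: set the model name as defined in VLMEvalKit/vlmeval/config.py and your local paths.
MODEL_NAME = "your_model_name"
EVAL_TYPES = [
    "time_agnostic", "timestamp", "temporal_interval",
    "awareness_future", "awareness_past",
    "future_unanswerable_date", "previous_unanswerable_date",
    "ranking", "understanding", "calculation", "robustness",
]

for eval_type in EVAL_TYPES:
    # Each call mirrors the inference command documented above.
    subprocess.run([
        "python", "inference.py",
        "--meta_save_path", "./path/output",
        "--model_name", MODEL_NAME,
        "--data_eval_type", eval_type,
        "--max_new_token", "10",
        "--image_path_prefix", "./path/image_data",
    ], check=True)
```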
Evaluate MINED
python eval_code/cem_f1.py
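For reference, a minimal sketch of the two metrics the script name suggests, contains-exact-match (CEM) and token-level F1, is given below. The actual cem_f1.py may differ in answer normalization and aggregation, so treat this as an illustration only.

```python
import re
import string
from collections import Counter

def normalize(text: str) -> str:
    """Lowercase, strip punctuation, and collapse whitespace (common QA-style normalization)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return re.sub(r"\s+", " ", text).strip()

def cem(prediction: str, answer: str) -> float:
    """Contains Exact Match: 1.0 if the normalized gold answer appears in the prediction."""
    return float(normalize(answer) in normalize(prediction))

def f1(prediction: str, answer: str) -> float:
    """Token-level F1 between the prediction and the gold answer."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(answer).split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

# Example: score a single prediction against its gold answer.
print(cem("The president in 2021 was Joe Biden.", "Joe Biden"))  # 1.0
print(f1("The president in 2021 was Joe Biden.", "Joe Biden"))   # ~0.44
```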
You can customize task instructions in the inference.py file to complete the corresponding tasks.
Custom data only needs to provide matched image and text pairs in the same format as the files in inference_data.
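As an illustration only, a custom entry could pair an image path with a question and answer as below. The field names here are hypothetical; mirror the keys used in the provided inference_data files rather than these.

```python
import json

# Hypothetical schema for illustration: copy the actual keys from an existing
# file such as Dimension1_time_agnostic.json before building your own data.
custom_samples = [
    {
        "image": "custom_imgs/eiffel_tower.jpg",  # path relative to --image_path_prefix
        "question": "Who was the president of France when this photo was taken in 2015?",
        "answer": "Francois Hollande",
    }
]

with open("inference_data/custom_task.json", "w", encoding="utf-8") as f:
    json.dump(custom_samples, f, ensure_ascii=False, indent=2)
```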
We thank the following open-source projects for making this work possible:
- VLMEvalKit for the evaluation.
If you find our paper and code useful in your research, please consider giving a star ⭐ and citation 📝 :)
@article{jiang2025mined,
title = {MINED: Probing and Updating with Multimodal Time-Sensitive Knowledge for Large Multimodal Models},
author = {Jiang, Kailin and Jiang, Ning and Ren, Yuchen and Li, Yuchen and Gao, Yifan and Bi, Jinhe and Ma, Yunpu and Liu, Qingqing and Wang, Xianhao and Jia, Yifan and Jiang, Hongbo and Hu, Yaocong and Li, Bin and Liu, Lei and Du, Yuntao},
year = {2025},
url = {https://arxiv.org/pdf/2510.19457}
}


