A Comprehensive Benchmark for Multimodal Large Language Model Reasoning over Image Sequences

This post is part of the LLM series. It is a translation of "Mementos: A Comprehensive Benchmark for Multimodal Large
Language Model Reasoning over Image Sequences".

Mementos: A Comprehensive Benchmark for Multimodal Large Language Model Reasoning over Image Sequences

Abstract
1 Introduction
2 Mementos
3 Experiments
4 Related Work
5 Conclusion and Future Work

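Abstract

Multimodal large language models (MLLMs) have demonstrated proficiency in handling a variety of vision-language tasks. However, current MLLM benchmarks are designed mainly to evaluate reasoning over the static information in a single image. The ability of modern MLLMs to reason over image sequences, which is essential for understanding our ever-changing world, has been studied far less. To address this challenge, the paper introduces Mementos, a new benchmark designed to assess the sequential image reasoning abilities of MLLMs. Mementos contains 4,761 diverse image sequences of varying lengths. The authors also use a GPT-4-assisted procedure to evaluate MLLM reasoning performance. A careful evaluation of nine recent MLLMs on Mementos, including GPT-4V and Gemini, shows that they struggle to describe the dynamic information in a given image sequence accurately, which often leads to hallucinations or misrepresentations of objects and their corresponding behaviors. Quantitative analysis and case studies identify three key factors that affect sequential image reasoning in MLLMs: the correlation between object hallucinations and behavior hallucinations, the influence of co-occurring behaviors, and the compounding impact of behavior hallucinations. The dataset is available at https://github.com/umd-huanglab/Mementos.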
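The abstract says only that evaluation is "GPT-4-assisted" and that the failures of interest are hallucinated objects and behaviors; it does not spell out the scoring. As a rough illustration of how hallucinations of this kind could be quantified, the sketch below assumes that keyword lists for objects (or behaviors) have already been extracted from a model's description and from a human reference description, and then compares the two sets. The function name `keyword_scores`, the sample keywords, and the choice of precision, recall, and F1 are assumptions made for this example, not details taken from the text above.

```python
def keyword_scores(predicted, reference):
    """Compare predicted keywords against reference keywords.

    Both arguments are iterables of strings (hypothetical inputs, e.g. object
    or behavior keywords extracted from a description). Returns precision,
    recall, F1, and the predicted keywords absent from the reference.
    """
    # Normalize case and whitespace so trivial variations do not count as misses.
    pred = {k.strip().lower() for k in predicted if k.strip()}
    ref = {k.strip().lower() for k in reference if k.strip()}

    matched = pred & ref
    precision = len(matched) / len(pred) if pred else 0.0
    recall = len(matched) / len(ref) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0

    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        # Keywords the model mentioned that the reference does not contain.
        "hallucinated": sorted(pred - ref),
    }


if __name__ == "__main__":
    # Made-up example: the model mentions a frisbee that is not in the reference.
    scores = keyword_scores(["man", "dog", "frisbee"], ["man", "dog", "ball"])
    print(scores)
```

Under this scheme, a low precision means many mentioned keywords are unsupported by the reference (hallucination), while a low recall means the description misses things the reference contains.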

1 Introduction

2 Mementos

3 Experiments

4 Related Work
