
Exploring the Reasoning Abilities of Multimodal Large Language Models (MLLMs): A Comprehensive Survey on Emerging Trends in Multimodal Reasoning

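Abstract

Strong Artificial Intelligence (Strong AI), or Artificial General Intelligence (AGI), with abstract reasoning ability is the goal of next-generation AI. Recent advances in Large Language Models (LLMs), together with the emerging field of Multimodal Large Language Models (MLLMs), have demonstrated impressive capabilities across a wide range of multimodal tasks and applications. In particular, various MLLMs, each with a distinct model architecture, training data, and training stages, have been evaluated on a broad range of MLLM benchmarks. These studies have revealed, to varying degrees, different aspects of the current capabilities of MLLMs. However, the reasoning abilities of MLLMs have not yet been systematically studied. In this survey, we comprehensively review existing evaluation protocols for multimodal reasoning, categorize and illustrate the frontiers of MLLMs, introduce recent trends in applying MLLMs to reasoning-intensive tasks, and finally discuss current practices and future directions. We believe our survey lays a solid foundation for, and sheds light on, the important topic of multimodal reasoning.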

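Introduction

Limitations

Language models (LMs) still perform poorly in some reasoning domains, for example math problems.

Both MLLMs and LMs suffer from hallucination.

Definition and Classification of Reasoning Tasks for MLLMs

Definition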

Reasoning is one of the fundamental intelligent behaviors of human beings, which requires understanding and analyzing given conditions and background knowledge to derive a new conclusion logically and rationally.
Reasoning has to respect two things: inference rules and domain knowledge.

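Classification

Classification Method 1

Formal reasoning: as long as the premises are true, the conclusion is guaranteed to be true.
Informal reasoning: the truth of the conclusion cannot be guaranteed, especially when the available information is incomplete or ambiguous. Informal reasoning is usually carried out in natural language.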
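
The guarantee behind formal reasoning can be made concrete with a small sketch. This is an illustration added to these notes, not something taken from the survey: the helper names `is_valid` and `implies` are hypothetical, and propositional logic is assumed as the formal system. An argument counts as formally valid when no truth assignment makes every premise true and the conclusion false.

```python
from itertools import product


def implies(a, b):
    """Material implication: 'a implies b' is false only when a is true and b is false."""
    return (not a) or b


def is_valid(premises, conclusion, num_vars):
    """Return True if the conclusion holds under every truth assignment
    that satisfies all premises (hypothetical helper for illustration)."""
    for values in product([False, True], repeat=num_vars):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False  # counterexample: premises true, conclusion false
    return True


# Modus ponens: from "P implies Q" and "P", conclude "Q".
modus_ponens = is_valid(
    premises=[lambda p, q: implies(p, q), lambda p, q: p],
    conclusion=lambda p, q: q,
    num_vars=2,
)

# Affirming the consequent: from "P implies Q" and "Q", conclude "P".
affirming_consequent = is_valid(
    premises=[lambda p, q: implies(p, q), lambda p, q: q],
    conclusion=lambda p, q: p,
    num_vars=2,
)

print(modus_ponens)          # True
print(affirming_consequent)  # False
```

The second argument form may sound plausible in everyday language, but it has a counterexample (P false, Q true), so its conclusion is not guaranteed. Informal reasoning offers no comparable mechanical check, which is why its conclusions can fail when information is incomplete or ambiguous.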

Classification Method 2


