multimodal - Search News


16 days ago

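Recently, researchers from Alibaba's Tongyi Lab, together with the University of Sydney, DeepGlint, and Imperial College London, released a study proposing the **UniME (Universal Multimodal Embedding)** framework, which aims to overcome the limitations of multimodal AI in image-text understanding. The study was published in April 2025 under the title "Breaking the Modality Barrier: Universal Embedding Learning ...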

腾讯网 (Tencent), 16 days ago

Breaking Down the Barriers in Multimodal AI: How an Alibaba Team Gets Machines to Truly "Understand" Image-Text Combinations

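To address this problem, the research team proposed a framework called UniME (Universal Multimodal Embedding). Much like a specially trained multilingual expert, it not only understands the content of images and text in depth but also accurately judges how closely they are related.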

From MSN, 5 months ago

Microsoft develops its own multimodal model, Phi-4-multimodal, with 5.6 billion parameters supporting ...

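Microsoft has released Phi-4-multimodal, a small language model (SLM) that can process speech, images, and text. It is now available on Azure AI Foundry, Hugging Face, and the Nvidia API Catalog. Compared with ...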

10 days ago

Sunrise Raises $139 Million in Pre-A Round as China Ramps Up GPU Independence Push

AsianFin— Sunrise, a domestic AI chipmaker spun off from SenseTime’s core semiconductor division, has raised nearly $139 ...

20 days ago

A Revival of the Encoder-Decoder Architecture? Google Releases 32 T5Gemma Models at Once

First, Google released MedGemma, a series of multimodal models for health AI development, comprising several models in two sizes, 4B and 27B: MedGemma 4B Multimodal, MedGemma 27B Text, and MedGemma 27B Multimodal.

China Internet Information Center, 1 month ago

Multimodal LLMs can develop human-like object concept representations ...

BEIJING, June 10 (Xinhua) -- A group of Chinese scientists confirmed that multimodal large language models (LLMs) can spontaneously develop human-like object concept representatio ...
