Multimodal - News Search

Chinese Academy of Engineering unveils list of key emerging AI technologies

BEIJING, July 31 (Xinhua) -- The Chinese Academy of Engineering (CAE) on Thursday released a list of next-generation information engineering and emerging artificial intelligence (AI) technologies that ...

13 days ago

Sunrise Raises $139 Million in Pre-A Round as China Ramps Up GPU Independence Push

AsianFin— Sunrise, a domestic AI chipmaker spun off from SenseTime’s core semiconductor division, has raised nearly $139 ...

19 days ago

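Alibaba team launches UniME, further upgrading multimodal AI understanding

Recently, a research team from Alibaba's Tongyi Lab, together with the University of Sydney, DeepGlint, and Imperial College London, published a new study proposing the **UniME (Universal Multimodal Embedding)** framework, which aims to overcome the limitations of multimodal AI in image-text understanding. The study was published in April 2025 under the title "Breaking the Modality Barrier: Universal Embedding Learning ...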

科技行者 on MSN, 19 days ago

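Breaking down barriers in multimodal AI: how an Alibaba team gets machines to truly "understand" image-text combinations

This work, completed by a research team from Alibaba's Tongyi Lab together with the University of Sydney, DeepGlint, and Imperial College London, was published in April 2025 under the title "Breaking the Modality Barrier: Universal Embedding Learning with Multimodal ...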

unite, 22 days ago

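The Rise of Multimodal AI: Are These Models Really Intelligent? - Unite.AI

Following the success of large language models (LLMs), the AI industry is moving toward multimodal systems. In 2023, the multimodal AI market reached $1.2 billion, and it is projected to grow at more than 30% annually through 2032. Unlike traditional LLMs, which process only text, multimodal AI can handle text, images, audio, and video at the same time. For example, when […] ...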

23 days ago

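A revival of the encoder-decoder architecture? Google releases 32 T5Gemma models at once

First, Google released MedGemma, a family of multimodal models for health AI development, which comes in 4B and 27B sizes: MedGemma 4B Multimodal, MedGemma 27B Text, and MedGemma 27B Multimodal.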

科技行者 on MSN, 23 days ago

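When AI learns to "read" math diagrams through code: a Chinese University of Hong Kong team lets machines solve geometry problems too

This breakthrough study, developed by a research team at the Multimedia Laboratory of the Chinese University of Hong Kong that includes Ke Wang, Junting Pan, and Linda Wei, was published in May 2025 under the title "MathCoder-VL: Bridging Vision and Code for Enhanced Multimodal Mathematical ...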

生物通, 1 month ago

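Imaging-based single-cell transcriptomics and multimodal profiling (STAMP ...

Imaging-based single-cell transcriptomics and multimodal profiling (STAMP): a new strategy for building high-throughput cell atlases that overcomes the limits of sequencing. Cell: "STAMP: Single-cell transcriptomics analysis and multimodal profiling through imaging". Date: June 20, 2025. Source: Cell 45.5 ...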

IT之家, 1 month ago

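Google launches Search Live voice search: find what you need just by chatting ...

IT之家, June 19: Google today officially launched its new Search Live voice search feature in the United States, available in its iOS and Android apps. The feature is built on the Gemini model and can be tried by enabling AI Mode in Labs. Users can then hold natural voice conversations with the search engine.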

China Internet Information Center, 1 month ago

Multimodal LLMs can develop human-like object concept representations ...

With the advent of LLMs such as ChatGPT, scientists have started to wonder whether these models can develop human-like object concept representations from linguistic and multimodal data.

GitHub, 4 months ago

MuYuU0/CCAC2025-Chinese-multimodal-sarcasm-calculation

Contribute to MuYuU0/CCAC2025-Chinese-multimodal-sarcasm-calculation development by creating an account on GitHub.

GitHub, 4 months ago

Emotion Change Reasoning in Multimodal Dialogues - GitHub

Emotion Change Reasoning in Multimodal Dialogues. Contribute to AIM3-RUC/MECR_CCAC2025 development by creating an account on GitHub.
