Page 06 - Staying Committed to the "Long-Termism" of Innovation

Source: tutorial在线

Indonesia: from March 28, children under 16 will be restricted from using platforms such as YouTube and Facebook

Douyin Mall's "38 Good Finds Festival" officially opens on March 4

Five key focuses of China's "Two Sessions"; see the newly indexed material for more details

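On the evening of March 7 local time, Israel's Home Front Command said that, based on its latest assessment of the fighting, nationwide restrictions remain unchanged even though the frequency of Iranian missile launches has declined. The current restrictions will stay in effect until 20:00 local time on March 9.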

A growing countertrend towards smaller models aims to boost efficiency through careful model design and data curation, a goal pioneered by the Phi family of models and furthered by Phi-4-reasoning-vision-15B. We build specifically on learnings from the Phi-4 and Phi-4-Reasoning language models and show how a multimodal model can be trained to cover a wide range of vision and language tasks without relying on extremely large training datasets, extremely large architectures, or excessive inference-time token generation. The model is intended to be lightweight enough to run on modest hardware while remaining capable of structured reasoning when that is beneficial. It was trained with far less compute than many recent open-weight VLMs of similar size: we used just 200 billion tokens of multimodal data, building on Phi-4-reasoning (trained with 16 billion tokens), which is in turn based on the core model Phi-4 (400 billion unique tokens). By comparison, multimodal models such as Qwen 2.5 VL, Qwen 3 VL, Kimi-VL, and Gemma3 were trained on more than 1 trillion tokens. The model therefore offers a compelling alternative to existing models, pushing the Pareto frontier of the tradeoff between accuracy and compute cost.

