License: Apache
Development language: Python
Operating system: Cross-platform
Software type: Open-source software
Open-source organization: None
Region: China
Submitter: game
Intended for: Unknown
Recorded: 2023-05-19

Software Introduction

VisualGLM-6B is an open-source multimodal dialog language model that supports images, Chinese, and English. The language model is based on ChatGLM-6B and has 6.2 billion parameters; the image part bridges the visual model and the language model by training BLIP2-Qformer, bringing the overall model to 7.8 billion parameters.
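
The Q-Former bridge mentioned above can be understood as a small set of learnable query vectors that cross-attend to the vision encoder's output features and produce a fixed-length sequence of soft prompt tokens for the language model. A minimal NumPy sketch of that idea follows; all names, shapes, and weights here are illustrative assumptions, not VisualGLM-6B's actual implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def qformer_bridge(image_feats, queries, W_k, W_v, W_proj):
    """One cross-attention step: learnable queries attend to image features,
    then the result is projected into the language model's embedding space."""
    d = queries.shape[-1]
    K = image_feats @ W_k                         # (n_patches, d)
    V = image_feats @ W_v                         # (n_patches, d)
    attn = softmax(queries @ K.T / np.sqrt(d))    # (n_query, n_patches)
    attended = attn @ V                           # (n_query, d)
    return attended @ W_proj                      # (n_query, d_lm)

rng = np.random.default_rng(0)
n_patches, d, d_lm, n_query = 257, 64, 128, 32    # toy sizes, not the real ones
image_feats = rng.standard_normal((n_patches, d))  # frozen vision-encoder output
queries = rng.standard_normal((n_query, d))        # learnable in the real model
W_k, W_v = rng.standard_normal((d, d)), rng.standard_normal((d, d))
W_proj = rng.standard_normal((d, d_lm))

prompt_tokens = qformer_bridge(image_feats, queries, W_k, W_v, W_proj)
print(prompt_tokens.shape)  # (32, 128): a fixed-length image prefix for the LM
```

However many patches the vision encoder emits, the language model always receives the same fixed number of image tokens, which is what lets the pretrained text model consume images without architectural changes.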

VisualGLM-6B is pre-trained on 30M high-quality Chinese image-text pairs from the CogView dataset together with 300M filtered English image-text pairs, with Chinese and English weighted equally. This training approach better aligns visual information to the semantic space of ChatGLM. In the subsequent fine-tuning stage, the model is trained on long visual question-answering data to generate answers that match human preferences.

Related News
2023/06/21 14:28

Ant Group confirms it is developing a language and multimodal large model, internally named "Zhenyi"

According to an exclusive report by the Science and Technology Innovation Board Daily, Ant Group's technology R&D team is developing a language and multimodal large model, internally named "Zhenyi". The project is highly valued by Ant Group's management and has been underway for several months. A multimodal large model is one trained on a combination of modalities such as text, images, video, and audio. OpenAI co-founder Ilya Sutskever has previously said, "The long-term goal of AI is to build a multimodal neural network, that is, AI that can learn the concepts between different modes, so as to better understand the world."
