OpenBuddy releases a new-generation cross-lingual dialogue model based on Llama 2, open source and available for commercial use

Source: OSCHINA
Editor: game
2023-07-26 10:36:00

OpenBuddy LLaMA2-13B is a new cross-lingual dialogue model based on Llama 2.

Llama 2 is the latest base model released by Meta. It was trained on more data than the previous generation, and its license permits commercial use by products with fewer than 700 million monthly active users. This means more companies and teams can build commercial products on the model, encouraging wider adoption and innovation across applications.

However, like the previous version, Llama 2 still has limitations: the LLaMA base models are trained mainly on English data, and cross-lingual use cases were not a design consideration. The model performs well in English, but its output in Chinese and other non-English languages is unsatisfactory.

In addition, the generalization and multi-turn dialogue abilities of the LLaMA2 Chat model are themselves limited.

To address these limitations, the OpenBuddy team drew on its fine-tuning experience to design and test a variety of fine-tuning schemes. The team ultimately chose 13B as the base for the first version, because it considers this the best model size that individual users can still deploy while retaining the potential for emergent abilities.
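As a rough illustration of why a 13B model is within reach of individual users, the weight memory can be estimated from the parameter count and the numeric precision. The sketch below is not from the OpenBuddy team; it assumes a nominal 13 billion parameters, counts weights only (no activations, KV cache, or framework overhead), and uses the hypothetical helper name weight_memory_gb.

```python
# Rough weight-memory estimate for a language model.
# Assumptions (not from the article): exactly 13e9 parameters, weights only,
# no activations, KV cache, or framework overhead.

BYTES_PER_PARAM = {
    "fp32": 4.0,  # 32-bit floats
    "fp16": 2.0,  # 16-bit floats
    "int8": 1.0,  # 8-bit quantization
    "int4": 0.5,  # 4-bit quantization
}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Return the approximate weight memory in GB (1 GB = 10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    for precision in BYTES_PER_PARAM:
        gb = weight_memory_gb(13e9, precision)
        print(f"13B weights at {precision}: about {gb:.1f} GB")
```

Under these assumptions, half-precision weights need about 26 GB and 4-bit quantized weights about 6.5 GB, which suggests why 13B is often treated as a practical upper bound for consumer hardware. Real memory use will be higher once runtime overhead is included.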

After several rounds of fine-tuning and repeated experiments, the OpenBuddy team completed training of the first version of OpenBuddy LLaMA2-13B.

OpenBuddy-LLaMA2-13B

While testing OpenBuddy LLaMA2-13B, the team found that the model exhibited strong generalization and inference abilities, making it the most satisfactory 13B model in their R&D work so far.

The model shows some capacity for critical thinking: it does not blindly trust information supplied by users and can point out their errors or missing information.

The model also has some ability to analyze and generalize. In certain scenarios it can identify underlying patterns in the input and report its analysis.

In addition, the model's content creation and instruction-following abilities have improved further, so it can produce content that meets user needs.

The OpenBuddy LLaMA2-13B model was also reportedly submitted to Hugging Face's Open LLM Leaderboard, where it achieved very high scores. In overall English-ability scoring, it exceeds many other 13B models, including Vicuna, WizardLM 1.0, and Meta's official Llama2 Chat, and even comes close to models several times its size, such as MPT-30B.
