Xiaomi's large model will not be a "ChatGPT"

Source: Shenran, Authors: Jin Yufan, He Shulong, Editor: He Shulong

Image source: Generated by Unbounded AI tool

Half a year after the launch of ChatGPT, the race to build large models continues on both sides of the Pacific.

The alliance formed by OpenAI, Microsoft, and Nvidia is charging ahead on the eastern shore of the Pacific. Since March this year, Chinese technology companies have hurried to follow. Baidu, Alibaba, SenseTime, and iFlytek have successively launched "ChatGPT-like" products. Tencent, Huawei, and JD.com are also in the picture, and this era is being called a "ten times bigger" opportunity.

In the midst of the "Hundred Models War", Xiaomi, a major Chinese technology company, has appeared unusually calm.

Lei Jun, the head of Xiaomi, said that Xiaomi is developing some technologies and products and will show them to everyone once they are polished. Lu Weibing, president of Xiaomi Group, said that Xiaomi currently has an AI team of more than 1,200 people and will actively embrace large models and integrate them deeply with its businesses, but will not build a general-purpose large model the way OpenAI does.

These statements have deepened the outside world's doubts: will Xiaomi join the "Hundred Models War"?

Dr. Wang Bin, director of the AI Lab of Xiaomi Group, told Shenran that Xiaomi will develop its own general-purpose model but will not release a standalone ChatGPT-like product, "nor will it release a PPT or demonstrate a few examples just to say 'we have a large model'". Instead, the self-developed large model will eventually be brought out through products.

This is the first time Xiaomi has disclosed the route and progress of its large model since officially announcing its large model team. On April 14 this year, Xiaomi announced that the large model team would be led by Luan Jian and report to Wang Bin. Wang Bin worked on NLP (Natural Language Processing) research and development at the Chinese Academy of Sciences for more than 20 years. He joined Xiaomi in 2018 and has been in charge of the AI Lab since 2019. The AI Lab is the core department of Xiaomi's AI strategy.

Having already built a large dialogue model, Xiaomi is a rare rationalist when it comes to general-purpose pre-trained large language models. Wang Bin revealed that the large model team currently has more than 30 full-time members and will not expand rapidly in the near term; the team's goal remains a large language model, the first-step target for the base model is tens of billions of parameters, and the next step will be decided based on the results of that earlier climb.

"There is still a long way to go from developing a large model to deploying it. Whether they can find suitable, important scenarios is a pain point for many large model companies." In Wang Bin's view, Xiaomi's advantage is that it has plenty of ready-made scenarios for deploying large models, including Xiao Ai, IoT, autonomous driving, and robots, and these rich application scenarios can in turn feed back into the large model's capabilities.

Xiaomi has no shortage of scenarios, but to train a large model, the accumulation of data, computing power, and talents is indispensable. Wang Bin said that Xiaomi has a certain reserve of talents, and the challenges in terms of computing power and data volume are relatively large. On the one hand, the computing power needs to overcome system-level challenges, and the training cost must be controllable; on the other hand, it takes a lot of time and cost to obtain and clean high-quality data.

In the new wave of AI large models, why doesn't the Xiaomi AI team release "ChatGPT-like products"? How does Xiaomi judge the technical route and technical difficulty of the large model? A few days ago, He Shulong, editor-in-chief of Shenran, had a dialogue with Wang Bin, director of the AI Laboratory of Xiaomi Technical Committee. The following is the core content:

Xiaomi's large model: a team of 30, no "ChatGPT-like" product

**Shenran: On April 14th, Xiaomi appointed Luan Jian as head of the large model team, reporting to you. Can you tell us how the Xiaomi large model team came about?**

**Wang Bin:** The large model team was announced in April, but it had already started operating before that.

On November 30 last year, after OpenAI released ChatGPT, a group of us quickly registered accounts and started trying it out. ChatGPT is indeed disruptive. We have been working on AI for many years, and many of its capabilities exceeded the expectations of developers like us.

Soon, we organized a number of internal large model discussion groups to talk about large model technology and its disruptive impact on machine translation, human-machine dialogue, intelligent question answering, and customer service. **Many of the people who took part in those early discussions later became key members of the full-time large model team.**

**Shenran: Is the Xiaomi large model team a bit late to the game?**

Wang Bin: For large models, we belong to the rational school.

Before ChatGPT appeared, Xiaomi had already done internal research, development, and application of large models, mainly in the form of pre-training plus supervised fine-tuning on downstream tasks for human-machine dialogue, with model parameters in the billions. Of course, this type of model is not what is now called a general-purpose large model.

We are very clear that developing and applying a general-purpose large model is long-term work, not something achieved overnight. We have been proceeding according to our own schedule and pace. When we felt the time was right, we announced the team.

**Shenran: How many people are on the large model team? Are there plans to keep expanding?**

**Wang Bin:** The core team currently has more than 30 people. We are preparing along six dimensions (talent, data, models, computing power, evaluation, and products) and will gradually adjust or expand after reaching a certain stage.

We will not expand headcount right away, for example by recruiting 100 people at once. In the climbing stage, while we are still building up capability, we might not know how to deploy that many people, and it would be a waste.

With the continuous disclosure of information about large models and the continuous influx of capital and talent, the field has developed very fast, and views have changed greatly. When ChatGPT first came out, everyone felt it was basically impossible to build a similar large model. Gradually, many people came to feel it was quite possible, and some came to believe that many product needs could be met without such a large model. Investment intensity also varies widely. Some people think a team needs at least a few hundred people, and some think that is unnecessary.

**Shenran: Are there any phased plans for the future? When will it be tested internally and released externally?**

Wang Bin: Unlike other companies, Xiaomi is a product company by nature. I believe that when Xiaomi's large model comes out, it will come out through products.

We may test internally before Q3. However, this is not a fixed milestone.

**Shenran: In other words, Xiaomi will not release a ChatGPT-like product?**

Wang Bin: Yes, we will not release a PPT or stage a demo just to show that we have a large model. Rich application scenarios are our biggest advantage. **Xiaomi's large model will be more closely integrated with scenarios, and the release plan must follow the rhythm of the products.**

**Shenran: In addition to manpower, what is the cost of computing power for Xiaomi to build a large model?**

Wang Bin: Ours is a medium-scale investment, and we will decide on the next step of investment based on the results of the previous climb.

Our basic judgment is that the model suited to Xiaomi's products and businesses may have parameters in the tens of billions, below the 100 billion scale, and the investment in training machines is roughly tens of millions of RMB.

**Shenran: How was the model with billions of parameters that Xiaomi built earlier made?**

**Wang Bin:** ChatGPT, released last year, is one kind of large model, known as a general-purpose pre-trained large language model. But large models themselves appeared much earlier, and everyone has different routes and methods.

We started following large models relatively early. At that time, we built a dialogue-specific model with about 2.8 billion to 3 billion parameters, made by fine-tuning a pre-trained base model on dialogue data. It is not a general-purpose large model in today's sense; it is dedicated to human-machine dialogue and meant to keep the conversation going. Later, this model was deployed to Xiao Ai for a small-scale online test.

So AIGC is already in use in Xiao Ai. At the product level, however, we do not rely on this large model alone; we use the traditional model and the large dialogue model together, because the two complement each other.

Xiaomi's general-purpose large model is likely to take this same hybrid form when it is launched in products. Problems that the traditional model handles very well are left to the traditional model. The large model handles the problems it is good at, such as small-probability events or long-tail dialogues.
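The hybrid arrangement described above can be pictured as a simple router. The sketch below is only an illustration and not Xiaomi's implementation: `traditional_model`, `large_model`, and `hybrid_reply` are hypothetical names, the rule table is invented, and the large model is a stub that calls no real API.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Reply:
    text: str
    source: str  # "traditional" or "large_model"


def traditional_model(utterance: str) -> Optional[str]:
    """Hypothetical rule-based handler: answers only the commands it knows."""
    known = {
        "turn on the light": "OK, the light is on.",
        "what time is it": "It is 9:00.",
    }
    return known.get(utterance.strip().lower())


def large_model(utterance: str) -> str:
    """Stand-in for a large dialogue model (assumption: no real API is called)."""
    return f"[large-model reply to: {utterance}]"


def hybrid_reply(
    utterance: str,
    primary: Callable[[str], Optional[str]] = traditional_model,
    fallback: Callable[[str], str] = large_model,
) -> Reply:
    """Try the traditional model first; hand long-tail input to the large model."""
    answer = primary(utterance)
    if answer is not None:
        return Reply(answer, "traditional")
    return Reply(fallback(utterance), "large_model")
```

With these stubs, `hybrid_reply("Turn on the light")` is answered by the traditional path, while a long-tail request such as `hybrid_reply("Plan a weekend trip")` falls through to the large model.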

The dialogue quality of the general-purpose large models now available is significantly higher than that of the earlier dialogue-specific large model, so this part of the team has also moved over to the general-purpose large model. This team has run through the entire training process of a large dialogue model and worked through some pitfalls, and together with its accumulated data, that gives it certain advantages.

Xiaomi's large model: scenarios are the advantage, data is the problem

**Shenran: Technological progress has been very rapid recently, and domestic large models are being released one after another. Are you anxious about slow progress?**

Wang Bin: I was quite anxious for a while. When you have not yet gotten started yourself, you panic a little, and you think, "How are others progressing so fast and getting it done all at once?" Now that we are doing it ourselves, I am no longer so anxious.

People say China is now in a "Hundred Models War", and more than 80 large models have been released. Some offer internal testing, and some have been released only as PPTs. Some of the models perform quite well. Judging by what has been released, our existing self-developed large model does not seem to be worse than many of them. But we are in no rush to do an external release. First, for a company like Xiaomi, it does not make much sense. Second, we still hope to improve the self-developed model around products and then release the two together.

**Shenran: Do you think the large models of domestic companies have a chance to catch up with OpenAI? How big is the gap? People like to describe it as three months or six months.**

Wang Bin: At present, OpenAI is certainly very advanced. It invested early and has very strong accumulation in talent, data, computing power, engineering, and products. Looking at the domestic situation, I feel there is still a certain gap between domestic players and OpenAI. Some people say it is three months or six months, while others say it is one year or two years. In terms of time, it is hard to say.

That is because how to evaluate a large model is itself a very difficult problem. There are now various rankings of large models, but none has been unanimously recognized. **There is no real evaluation standard, so talk of catching up in three months or six months is just a guess off the top of one's head.**

As for whether it is possible for China to catch up with OpenAI, I was pessimistic in the early days and thought it was almost impossible, but with the influx of various open source solutions, various teams and capital, my judgment is more optimistic. I think that China has the opportunity to narrow the distance with OpenAI, to approach or even surpass it in many scenarios.

**Large models do not seem to have as high a barrier to entry as chips do. Through continuous accumulation and optimization of talent, data, computing power, and so on, it is possible to keep narrowing the gap.**

**Shenran: Which types of domestic companies have more of an advantage in large models? Where is the opportunity for Xiaomi?**

Wang Bin: Whether large companies or small and medium-sized start-ups, each has its own room to survive. Large models form an ecosystem, and no single giant can take it all. All companies in the ecosystem, including those in computing power, data, and applications, as well as those that actually build large models, have their own opportunities.

For large models, a company like Xiaomi has the advantage of application scenarios. We believe that the combination of large models and scenarios will be a huge opportunity.

If you just release a large model and no one uses it, it may not be able to improve quickly through rolling iteration. We, by contrast, can land in real scenarios right away and, through continuous iteration, bring out the full power of the large model in those scenarios.

Although the core team we have brought together has only 30-plus people, there are actually many more people around it. Across the AI Lab, more than 100 people have an NLP background and work on specific applications, including knowledge graphs, machine translation, human-machine dialogue, intelligent customer service, and intelligent question answering. They all have basic large model thinking and the related technical skills, and they are pushing forward the exploration of large models from the perspective of their own applications.

[Photo: Wang Bin]

**Shenran: How valuable is Xiaomi's accumulation in NLP research for large models?**

Wang Bin: There are two views in the industry. One is that people like us may be out of work, that AI has put us out of business, and that NLP people in particular may lose their jobs. The other is that large models grew out of NLP after all, so NLP people have an inherent advantage.

Both views have some truth to them, but since my own job is involved, I lean toward the latter.

Large models were originally explored in various fields, including vision, speech, and NLP. But I believe there are essential reasons why the first breakthrough came in NLP. I see at least two: first, language data is rich and easy to obtain; second, a great deal of knowledge reflecting the human thinking process is hidden behind language data.

So I believe that people who have accumulated in the NLP field for many years have certain innate advantages in understanding and transforming large models. Many members of Xiaomi's large-scale model team originally worked in the direction of NLP. Several start-up companies that are very good at making large-scale models in China also came out of the NLP field.

**Shenran: What difficulties does Xiaomi currently face in tackling the large model? How will it overcome them?**

**Wang Bin:** First of all, I still want to say that the large model itself poses enormous challenges.

A huge challenge is the uncertainty of technology. We have seen some reports, and even the OpenAI team themselves are not very clear about the real principles behind the large model, and if they do it again, they are not sure whether the same "emergent" results will occur. I believe that OpenAI is telling the truth on this point. Due to the great uncertainty in technology, investment cannot guarantee that a large model that meets expectations can be trained.

The accumulation of high-quality data is also a challenge. It is generally believed that large models require extremely large, high-quality training data. The quality of data publicly available on the Internet is generally poor, so **the acquisition and cleaning of data are relatively big challenges.**

Another challenge is of course computing power. First, having that many GPU cards does not by itself mean you can train a model. Making good use of those cards is a system-level challenge in its own right. Second, mistakes can be made during training, and the money may be burned with nothing to show for it, so it depends on whether you can train a large model at a controllable cost.

Practically speaking, **the current challenges of data and computing power are still relatively large, especially large-scale high-quality data**. After the previous period of climbing, we are now basically sure that, as long as the data is in place, we can estimate roughly how many days it will take to train a good base model on our existing computing power.

**Shenran: Has the cost of large model training come down now?**

Wang Bin: On the one hand, the cost of trial and error is lower than before. Because large model training may take detours and fail, but with the disclosure of various information, it is possible to quickly find the correct direction of training. On the other hand, many cloud computing, chip and other companies, as well as many start-up companies, are providing lower-cost large model training and inference services. With the further development of the entire ecology, I believe that the cost of training will continue to decrease.

How does the large model affect Xiaomi's business?

**Shenran: Can you introduce the Xiaomi AI Lab you are responsible for in detail?**

Wang Bin: After the birth of "AlphaGo" in 2016, Mr. Lei immediately promoted the construction of the AI team. The AI Lab was formally established in 2016, and I have been in charge since 2019.

Originally, the AI Lab was part of the Artificial Intelligence Department. Later, the Artificial Intelligence Department was merged into the Group Technical Committee, and now the AI Lab sits directly under the Technical Committee.

The current team size of the AI Lab is about 350 people, and it has six directions, namely machine learning, natural language processing (NLP), computer vision, acoustics, speech and knowledge graphs.

After the big model came out, the AI Lab set up a full-time big model team. We are now focusing on the language big model, but we are also paying attention to the cross-modal big model.

**Shenran: Mr. Lu (Lu Weibing, President of Xiaomi Group) said that the Xiaomi AI team currently has more than 1,200 people. Besides the AI Lab, which other departments within Xiaomi are strongly related to AI?**

**Wang Bin:** Besides the AI Lab, there is also the Xiao Ai team; both are under the Technical Committee.

Outside the Technical Committee, many departments have relatively large AI teams, including the autonomous driving team in the automotive department, the mobile phone camera department, and the software department. In addition, user growth and advertising recommendation in the Internet business department are all related to AI.

In short, some AI-related teams are in the business department, and some are in the technical committee. The total number is about 1,200. If you consider some small teams, I personally think this number is even larger.

**Shenran: What is the role of Xiaomi AI Lab in Xiaomi's AI strategy?**

**Wang Bin:** The AI Lab is the group-level department for AI technology research, development, and output. In plain terms, we supply AI technology to the whole company.

We once compared the AI laboratory to the "experimental field" and "ammunition depot" of AI technology at the group level. Because of the rapid development of AI, the AI laboratory will develop some medium and long-term cutting-edge technologies, make reserves around Xiaomi's business, and output "ammunition" when the group needs it.

In terms of AI technology, we certainly have the most complete reserves in the company, and we are also strong within the industry.

**Shenran: What are the important research achievements of Xiaomi AI Lab?**

Wang Bin: Our AI Lab emphasizes the combination of technology and scenarios. Publishing papers is not currently an OKR. So since I came to Xiaomi from the Chinese Academy of Sciences, I feel the greatest achievement has been not progress in any single technology but the ingenious integration of technology and products.

Xiaomi is a To C company. Our AI capability output is not exported directly to the outside world for the time being, but through the company's products. We have made a lot of achievements, including many camera and photo album processing algorithms in Xiaomi mobile phones, voice and NLP algorithms involved in Xiao Ai, and AI algorithms in the recommendation, search, and customer service systems of Xiaomi Mall.

Let me give you an example. We have developed an offline translation function for our mobile phones. When you go abroad, for example, the network is often poor. In that case you can turn on the translation function on a Xiaomi phone without using the cloud. Even offline, its real-time performance, privacy, and translation quality are all good. Implementing and deploying this function was not easy. We did a lot of optimization work on translation quality and performance.

**At Xiaomi, a technology is not used first just because it is our own. Internal technology must also compete fairly with external technology. Only the winner survives and is applied to products.**

**Shenran: Which of Xiaomi's businesses will be affected by the large model technology represented by ChatGPT?**

**Wang Bin:** The greatest strength of the large model, simply put, is that it understands people better, so it can clearly improve human-computer interaction. Xiaomi's Xiao Ai assistant, the MIUI mobile operating system, the car cockpit, IoT, and robots are all typical scenarios for applying large models.

**Shenran: Could you use Xiao Ai as an example?**

Wang Bin: Applied to Xiao Ai, it can do two things at once. One is to make the impossible possible, which amounts to adding new functions. For example, if I ask Xiao Ai to make a travel plan or order a meal, the original technology cannot do it, and if the user phrases the request differently, it gets confused. But with the support of large models, it understands human speech more deeply, so complex tasks can be completed and this type of application becomes feasible.

The other category is enhancing existing functions, which is icing on the cake. Because human expression is so varied and jumps around so much, the biggest problem in Xiao Ai's human-computer interaction is small-probability events, which we call corner cases. We usually adopt a conservative strategy and have Xiao Ai say "I can't answer" or "I'm still learning". This kind of fallback answer can keep the conversation going, but the experience is not good. Large model technology can sustain the dialogue for longer and greatly improve user satisfaction.

**Shenran: Does the large model have a big impact on the smart home?**

**Wang Bin:** In my personal understanding, the large model can at least improve the smart home user experience in terms of interaction.

Although many devices claim to be "smart", they often behave as anything but, and their usage rate is low. Take turning on the air conditioner or adjusting its temperature: if the user's wording differs from the standard command, the IoT device may not respond.

With the arrival of large models, there is a deeper understanding of human language. People often phrase the same request in many different ways, and the large model can translate the user's wording into instructions that the machine can understand. This will drive more people to use smart devices and allow the entire ecosystem to grow faster.
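To make the idea of turning varied wording into machine instructions concrete, here is a minimal sketch. Everything in it is an assumption for illustration: `ALLOWED`, `fake_interpreter`, and `to_command` are hypothetical names, and the keyword-based `fake_interpreter` merely stands in for a large model that would map free-form speech to a structured command.

```python
from typing import Callable, Dict, Optional

# Hypothetical whitelist of devices and the actions each one accepts.
ALLOWED = {"air_conditioner": {"power_on", "power_off", "set_temperature"}}


def fake_interpreter(utterance: str) -> Dict[str, object]:
    """Stand-in for a large model: maps free-form text to a candidate command."""
    text = utterance.lower()
    if "hot" in text or "stuffy" in text:
        return {"device": "air_conditioner", "action": "power_on"}
    if "degrees" in text:
        # Pick the first purely numeric token as the target temperature.
        number = next((int(tok) for tok in text.split() if tok.isdigit()), None)
        return {"device": "air_conditioner", "action": "set_temperature", "value": number}
    return {}


def to_command(
    utterance: str,
    interpret: Callable[[str], Dict[str, object]] = fake_interpreter,
) -> Optional[Dict[str, object]]:
    """Return a validated device command, or None if nothing usable was produced."""
    candidate = interpret(utterance)
    device = candidate.get("device")
    action = candidate.get("action")
    if device in ALLOWED and action in ALLOWED[device]:
        return candidate
    return None
```

The validation step is a design choice worth noting: whatever the interpreter returns is checked against a fixed whitelist before it reaches a device, so an unexpected model output results in no action rather than a wrong one.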

**Shenran: Beyond improving existing businesses, are there things Xiaomi could not do before that become possible with a large model?**

Wang Bin: We will have the large model work closely with these businesses. Beyond that, of course, we are also looking for more possibilities.

Our team has written a lot of articles to promote large models within the company, including the concept and technology development of large models, and to teach everyone how to use ChatGPT to solve business problems. Mr. Lei has asked every department to learn large-scale models, and requires everyone to have basic large-scale model thinking and think about how to integrate with business.
