Huawei’s new Atlas 900 AI training cluster ran the ResNet-50 benchmark in 59.8 seconds; Ken Hu claims that is 10 seconds faster than the previous record. At Huawei Connect 2019, Hu made this offer:

“We’re making it available at a great discount for universities and research institutes around the world. If you’re interested, go ahead and apply now – we’d love to have you try it out.”

I’m sure that’s a genuine offer; no company has been more generous with basic research than Huawei and its US$17B R&D budget. The in-house-developed operating system, a major achievement, will go open source in a few months. Google, Facebook, Microsoft, and Baidu have comparable capability; few other non-classified entities do.

The big boxes in the cluster contain thousands of Ascend 910 chips, each delivering hundreds of teraflops. Power is now often the limiting factor in AI: the models require massive compute, and the problem is getting worse. The box pictured could draw a megawatt of power, costing thousands of dollars per day in electricity.
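As a rough sanity check on that figure, here is the arithmetic, assuming a purely illustrative industrial electricity price of US$0.10 per kWh (actual rates vary widely by region):

```python
# Back-of-the-envelope cost of running a 1 MW cluster around the clock,
# at an assumed (hypothetical) price of $0.10 per kWh.
power_mw = 1.0          # assumed draw of the pictured cluster
price_per_kwh = 0.10    # assumed electricity price; varies by region
hours_per_day = 24

energy_kwh_per_day = power_mw * 1000 * hours_per_day    # 24,000 kWh
cost_per_day = energy_kwh_per_day * price_per_kwh       # about $2,400

print(f"{energy_kwh_per_day:,.0f} kWh/day ~ ${cost_per_day:,.0f}/day")
```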

Residual neural networks were developed in 2015 at Microsoft Research by Kaiming He 何恺明, Jian Sun, and their team. (Abstract below) This is state-of-the-art work being advanced at Google, Facebook, and other leading AI firms. That’s the level Huawei is working at today. Huawei is spending billions of dollars to build two major lines of business, cloud and AI.

Professor Jürgen Schmidhuber writes about the relationship between ResNets and Highway Networks in the comments at the end.

Sun is now at MEGVII Technology, a fast-growing facial-recognition company serving Alibaba, the Chinese government, and others. MEGVII is moving toward a US$4B IPO. That’s the kind of customer Huawei hopes to win.

Here’s the ResNet abstract, followed by Huawei’s initial announcement of the Ascend 910.

Abstract: Residual Neural Networks

Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers—8× deeper than VGG nets [41] but still having lower complexity. An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers. The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.
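To make the abstract’s central idea concrete, here is a minimal NumPy sketch of the residual formulation y = F(x) + x: the block learns a residual function of its input rather than an unreferenced mapping. The layer sizes and random weights are illustrative placeholders, not anything taken from the paper:

```python
import numpy as np

def residual_block(x, w1, w2):
    """Two-layer residual function F(x), followed by the identity shortcut."""
    h = np.maximum(0.0, x @ w1)    # first layer + ReLU
    f = h @ w2                     # second layer: the residual branch F(x)
    return np.maximum(0.0, f + x)  # add the shortcut connection, then ReLU

rng = np.random.default_rng(0)
x  = rng.normal(size=(4, 64))              # a small batch of 64-d features
w1 = rng.normal(scale=0.1, size=(64, 64))  # placeholder weights, not trained
w2 = rng.normal(scale=0.1, size=(64, 64))
print(residual_block(x, w1, w2).shape)     # (4, 64): same shape, so blocks stack
```

Because the shortcut preserves the input’s shape, these blocks can be stacked very deep, which is what allows the 152-layer and 1000-layer experiments described above.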

Huawei launches Ascend 910, the world’s most powerful AI processor, and MindSpore, an all-scenario AI computing framework

Eric Xu: We promised a full-stack, all-scenario AI portfolio. And today we delivered.

[Shenzhen, China, August 23, 2019] Huawei officially launched the world’s most powerful AI processor – the Ascend 910 – as well as an all-scenario AI computing framework, MindSpore.

“We have been making steady progress since we announced our AI strategy in October last year,” said Eric Xu, Huawei’s Rotating Chairman. “Everything is moving forward according to plan, from R&D to product launch. We promised a full-stack, all-scenario AI portfolio. And today we delivered, with the release of Ascend 910 and MindSpore. This also marks a new stage in Huawei’s AI strategy.”

Eric Xu, Huawei’s Rotating Chairman, announcing the release of the Ascend 910 AI processor and MindSpore AI computing framework on August 23, 2019

Ascend 910: More computing power than any other AI processor in the world

The Ascend 910 is a new AI processor that belongs to Huawei’s series of Ascend-Max chipsets. Huawei announced the processor’s planned specs at its 2018 flagship event, Huawei Connect. After a year of ongoing development, test results now show that the Ascend 910 processor delivers on its performance goals with much lower power consumption than originally planned.

For half-precision floating point (FP16) operations, Ascend 910 delivers 256 TeraFLOPS. For integer precision calculations (INT8), it delivers 512 TeraOPS. Despite its unrivaled performance, Ascend 910’s max power consumption is only 310W, much lower than its planned specs (350W).
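A quick calculation of the energy efficiency implied by those published numbers (only the figures stated above are used):

```python
# Efficiency implied by the stated specs: 256 TFLOPS (FP16) and 512 TOPS (INT8)
# at a maximum power draw of 310 W.
fp16_tflops, int8_tops, max_watts = 256, 512, 310
print(f"FP16: {fp16_tflops / max_watts:.2f} TFLOPS per watt")  # ~0.83
print(f"INT8: {int8_tops / max_watts:.2f} TOPS per watt")      # ~1.65
```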

“Ascend 910 performs much better than we expected,” said Xu. “Without a doubt, it has more computing power than any other AI processor in the world.”

Ascend 910 is used for AI model training. In a typical training session based on ResNet-50, the combination of Ascend 910 and MindSpore is about two times faster at training AI models than other mainstream training cards using TensorFlow.

Moving forward, Huawei will continue investing in AI processors to deliver more abundant, affordable, and adaptable computing power that meets the needs of a broad range of scenarios (e.g., edge computing, on-vehicle computing for autonomous driving, and training).

MindSpore: All-scenario AI computing framework

Huawei also launched MindSpore today, an AI computing framework that supports development for AI applications in all scenarios.

AI computing frameworks are critical to making AI application development easier, making AI applications more pervasive and accessible, and ensuring privacy protection.

In 2018, Huawei announced the three development goals for its AI framework:

  • Easy development: Dramatically reduces training time and costs
  • Efficient execution: Uses the least amount of resources with the highest possible OPS/W
  • Adaptable to all scenarios: Including device, edge, and cloud applications

MindSpore marks significant progress towards these goals. As privacy protection grows more important than ever, support for all scenarios is essential for enabling secure, pervasive AI. This is a key capability of the MindSpore framework, which readily adapts to different deployment needs: environments and resource budgets can be as large and complicated, or as small and simple, as needed – MindSpore supports them all.

MindSpore helps ensure user privacy because it only deals with gradient and model information that has already been processed. It doesn’t process the data itself, so private user data can be effectively protected even in cross-scenario environments. In addition, MindSpore has built-in model protection technology to ensure that models are secure and trustworthy.
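The general principle – exchanging gradients or processed model information instead of raw data – can be sketched roughly as follows. This is only an illustration of the idea; the function and variable names are hypothetical, and this is not MindSpore’s actual API:

```python
# Illustrative sketch: each device computes a gradient locally, and only that
# gradient leaves the device -- the raw data never does.
# NOTE: hypothetical names; not MindSpore's API.
import numpy as np

def local_gradient(weights, features, labels):
    """Gradient of mean squared error for a linear model, computed on-device."""
    preds = features @ weights
    return 2.0 * features.T @ (preds - labels) / len(labels)

rng = np.random.default_rng(1)
weights = np.zeros(3)

# Two devices, each holding private data that stays on the device.
device_data = [(rng.normal(size=(8, 3)), rng.normal(size=8)) for _ in range(2)]

# The coordinator only ever sees the (already processed) gradients.
grads = [local_gradient(weights, X, y) for X, y in device_data]
weights -= 0.1 * np.mean(grads, axis=0)   # aggregate and update the shared model
print(weights)
```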

The MindSpore AI framework is adaptable to all scenarios – across all devices, edge, and cloud environments – and provides on-demand cooperation between them. Its “AI Algorithm As Code” design concept allows developers to develop advanced AI applications with ease and train their models more quickly.

In a typical neural network for natural language processing (NLP), MindSpore has 20% fewer lines of core code than leading frameworks on the market, and it helps developers raise their efficiency by at least 50%.

Through framework innovation, as well as co-optimization of MindSpore and Ascend processors, Huawei’s solution can help developers more effectively address complex AI computing challenges and the need for a diverse range of computing power for different applications. This results in stronger performance and more efficient execution. In addition to Ascend processors, MindSpore also supports GPUs, CPUs, and other types of processors.

When introducing MindSpore, Xu emphasized Huawei’s commitment to helping build a more robust and vibrant AI ecosystem. “MindSpore will go open source in the first quarter of 2020. We want to drive broader AI adoption and help developers do what they do best.”

Enabling truly pervasive AI

Before announcing the release of Ascend 910 and MindSpore, Xu revisited Huawei’s AI strategy:

  • Invest in AI research: Develop fundamental machine learning capabilities in computer vision, natural language processing, decision and inference, etc. Focus on:
      o Data and power-efficiency (i.e., use less data, computing, and energy)
      o Security and trustworthiness
      o Automation / autonomy
  • Build a full-stack AI portfolio:
      o Adaptive to all scenarios, including both standalone and cooperative scenarios between cloud, edge, and device
      o Abundant and affordable computing power
      o Efficient and easy-to-use AI platform with full-pipeline services
  • Cultivate talent and an open ecosystem: Collaborate widely with global academia, industries, and partners
  • Strengthen existing portfolio: Bring an AI mindset and techniques into existing products and solutions to create greater value and enhance competitive strengths
  • Drive operational efficiency: Use AI to automate high-volume, repetitive tasks for better efficiency and quality

Huawei’s AI portfolio covers all deployment scenarios, including public cloud, private cloud, edge computing, IoT industry devices, and consumer devices. The portfolio is also full-stack: it includes the Ascend IP and chip series, the chip enablement layer CANN, the training and inference framework MindSpore, and the application enablement platform ModelArts.

Huawei defines AI as a new general purpose technology, like railroads and electricity in the 19th century, and cars, computers, and the Internet in the 20th century. The company believes that AI will be used in almost every sector of the economy.

According to Xu, AI is still in its early stages of development, and there are a number of gaps to close before AI can become a true general purpose technology. Huawei’s AI strategy is designed to bridge these gaps and speed up adoption on a global scale. Specifically, Huawei wants to drive change in ten areas:

  1. Provide stronger computing power to increase the speed of complex model training from days and months to minutes – even seconds.
  2. Provide more affordable and abundant computing power. Right now, computing power is both costly and scarce, which limits AI development.
  3. Offer an all-scenario AI portfolio, meeting the different needs of businesses while ensuring that user privacy is well protected. This portfolio will allow AI to be deployed in any scenario, not just public cloud.
  4. Invest in basic AI algorithms. Algorithms of the future should be data-efficient, meaning they can deliver the same results with less data. They should also be energy-efficient, producing the same results with less computing power and less energy.
  5. Use MindSpore and ModelArts to help automate AI development, reducing reliance on human effort.
  6. Continue to improve model algorithms to produce industrial-grade AI that performs well in the real world, not just in tests.
  7. Develop a real-time, closed-loop system for model updates, making sure that enterprise AI applications continue to operate in their optimal state.
  8. Maximize the value of AI by driving synergy with other technologies like cloud, IoT, edge computing, blockchain, big data, and databases.
  9. Provide a one-stop development platform built on the full-stack AI portfolio to help AI become a basic skill for all application developers and ICT workers; today, only highly skilled experts can work with AI.
  10. Invest more in an open AI ecosystem and build the next generation of AI talent to meet the growing demand for people with AI capabilities.

Wide adoption of Ascend 310 and ModelArts

At Huawei Connect 2018, Huawei announced its AI strategy and full-stack, all-scenario AI portfolio, including the Ascend 310 AI processor and ModelArts, a platform that provides full-pipeline model production services.

Ascend 310 is Huawei’s first commercial AI System on a Chip (SoC) in the Ascend-Mini series. With a maximum power consumption of 8W, Ascend 310 delivers 16 TeraOPS in integer precision (INT8) and 8 TeraFLOPS in half precision (FP16), making it the most powerful AI SoC for edge computing. It also comes with a 16-channel FHD video decoder.
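Taking only the specs stated here and the Ascend 910 figures above, a quick comparison of the implied INT8 efficiency of the two chips:

```python
# INT8 throughput per watt implied by the stated maximums of the two chips.
chips = {"Ascend 310 (edge)": (16, 8), "Ascend 910 (training)": (512, 310)}
for name, (tops, watts) in chips.items():
    print(f"{name}: {tops / watts:.2f} INT8 TOPS per watt")
# Ascend 310: 2.00 TOPS/W in an 8 W envelope; Ascend 910: ~1.65 TOPS/W at 310 W.
```

The edge part trades raw throughput for a much smaller power envelope, which is what makes it suitable for in-vehicle and on-premises deployments.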

Since its launch, Ascend 310 has already seen wide adoption in a broad range of products and cloud services. For example, Huawei’s Mobile Data Center (MDC), which employs Ascend 310, has been used by many leading automakers in shuttle buses, new-energy vehicles, and autonomous driving.

The Ascend 310-powered Atlas series acceleration card and server are now part of dozens of industry solutions (e.g., smart transportation and smart grid) developed by dozens of partners.

Ascend 310 also enables Huawei Cloud services like image analysis, optical character recognition (OCR), and intelligent video analysis. There are more than 50 APIs for these services. At present, the number of API calls per day has exceeded 100 million, and this figure is estimated to hit 300 million by the end of 2019. More than 100 companies are using Ascend 310 to develop their own AI algorithms.

Huawei’s ModelArts provides model development services spanning the full pipeline, from data collection and model development to model training and deployment. At present, more than 30,000 developers are using ModelArts to handle 4,000+ training tasks per day (for a total of 32,000 training hours). Among these tasks, 85% are related to visual processing, 10% are for processing audio data, and 5% are related to machine learning.
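A rough look at what those daily figures imply on average (assuming the stated numbers describe the same period):

```python
# Averages implied by the stated daily figures for ModelArts.
tasks_per_day = 4000
training_hours_per_day = 32000
print(training_hours_per_day / tasks_per_day)  # 8.0 hours of training per task

task_shares = {"visual processing": 0.85, "audio": 0.10, "machine learning": 0.05}
for kind, share in task_shares.items():
    print(f"{kind}: ~{share * tasks_per_day:.0f} tasks/day")
```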

With today’s launch of Ascend 910 and MindSpore, Huawei has unveiled all the key components of its full-stack, all-scenario AI portfolio. “Everything is moving forward according to plan. We promised a full-stack, all-scenario AI portfolio. And today we delivered,” said Xu. This launch is a new milestone in Huawei’s AI roadmap; it’s also a new beginning.

At the end of his presentation, Xu added that Huawei will debut more AI products at its upcoming conference, Huawei Connect 2019, which will be held between September 18 and 20 in Shanghai. Huawei is working closely with its partners to make AI more pervasive and accessible, and help bring the benefits of digital technology to every person, home, and organization.