Microsoft Unveils Breakthrough in Conversational AI with Advanced Multi-Task Neural Network

Robust, universal language representations are important for obtaining good results across a range of Natural Language Processing (NLP) tasks. Ensemble learning is one of the most effective approaches to improving model generalization, and developers have used it to obtain state-of-the-art results on a variety of natural language understanding (NLU) tasks, from machine reading comprehension to question answering.

However, such ensembles can contain hundreds of deep neural network (DNN) models and are expensive to run. Pretrained models such as GPT and BERT are also costly to deploy: GPT, for instance, consists of 48 transformer layers with 1.5 billion parameters, while BERT has 24 transformer layers with 344 million parameters.

In 2019, Microsoft introduced its own natural language processing algorithm, the Multi-Task DNN (MT-DNN). The team has now updated this algorithm and obtained impressive results.

Extending Knowledge Distillation

The research team compressed several ensemble models into a single MT-DNN using knowledge distillation. The ensemble was run offline to generate soft targets (predicted class distributions) for every task in the training dataset. Compared to hard targets, soft targets carry more useful information per training sample.
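The offline step described above can be sketched in PyTorch. This is a minimal illustration, not the paper's actual implementation: the stand-in "teachers" are hypothetical linear classifiers, and the soft targets are simply the average of each teacher's softmax output.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def ensemble_soft_targets(teachers, inputs):
    """Average the predicted class distributions of an ensemble of teachers.

    Runs offline with gradients disabled; each teacher maps inputs of shape
    (batch, features) to logits of shape (batch, num_classes).
    """
    with torch.no_grad():
        probs = [F.softmax(t(inputs), dim=-1) for t in teachers]
    return torch.stack(probs).mean(dim=0)  # (batch, num_classes)

# Hypothetical stand-in teachers: three tiny linear classifiers.
teachers = [nn.Linear(8, 3) for _ in range(3)]
soft = ensemble_soft_targets(teachers, torch.randn(4, 8))
```

Because the averaged output is itself a probability distribution over classes, each row of `soft` sums to one; these rows serve as the soft targets for the student.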

Consider the sentence “I had a good chat with John last evening”: its sentiment is unlikely to be negative. By contrast, “We had an intriguing conversation last evening” could be either positive or negative, depending on the context. A soft target can express this ambiguity, whereas a hard label forces a single class.

Reference: arXiv:1904.09482 | Microsoft Research Blog 

The researchers trained a single MT-DNN using both the correct (hard) targets and the soft targets across the various tasks. They used the cuDNN-accelerated PyTorch deep learning framework and trained and tested the new model on NVIDIA Tesla V100 GPUs.
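A common way to combine the two signals is a weighted sum of a soft-target cross-entropy and the ordinary hard-label loss. The sketch below assumes this simple blend with a hypothetical weight `alpha`; the paper's exact loss formulation and weighting may differ.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, soft_targets, hard_labels, alpha=0.5):
    """Blend cross-entropy against ensemble soft targets with the
    standard cross-entropy against the correct (hard) labels."""
    # Soft part: -sum_c q(c) * log p(c), averaged over the batch.
    soft_loss = -(soft_targets * F.log_softmax(student_logits, dim=-1)).sum(dim=-1).mean()
    # Hard part: ordinary cross-entropy against integer class labels.
    hard_loss = F.cross_entropy(student_logits, hard_labels)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss

# Toy batch: 4 examples, 3 classes.
logits = torch.randn(4, 3, requires_grad=True)
soft = F.softmax(torch.randn(4, 3), dim=-1)   # e.g. from an ensemble
labels = torch.tensor([0, 2, 1, 0])
loss = distillation_loss(logits, soft, labels)
loss.backward()
```

Setting `alpha=0` recovers plain supervised training, while `alpha=1` trains purely on the ensemble's soft targets.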

Results

They compared the distilled MT-DNN with the standard MT-DNN and with BERT. The results show that the distilled MT-DNN outperforms both models by a significant margin in overall score on the General Language Understanding Evaluation (GLUE) benchmark, which tests system performance on a wide range of linguistic phenomena.

GLUE benchmark score 

The benchmark comprises nine NLU tasks, including text similarity, textual entailment, sentiment analysis, and question answering. The data contains sentence pairs drawn from different sources, such as academic and encyclopedic text, news, and social media.

All experiments in this research show that the language representations learned by the distilled MT-DNN are more universal and robust than those of the standard MT-DNN and BERT.


In the coming years, the researchers plan to find better ways of combining correct hard targets and soft targets for multi-task learning. And rather than only compressing a complicated model into a simpler one, they will explore using knowledge distillation to improve model performance regardless of its complexity.
