Google: AI can automate text summaries

Google's artificial intelligence achieves state-of-the-art text summarization performance


Automatic text summarization is one of the areas machine learning researchers are actively working on, and a recent paper from Google reflects this trend. This is good news for workers who read large volumes of text every day: surveys have shown that such workers spend roughly 2.6 hours a day just reading information.


Correspondingly, Google Brain and a team at Imperial College London built a system called PEGASUS (Pre-training with Extracted Gap-sentences for Abstractive Summarization Sequence-to-sequence), which uses Google's Transformer architecture combined with pre-training objectives tailored to text summarization. It is said to have reached state-of-the-art performance on 12 benchmark tasks, spanning science, stories, email, patents and legislation. Not only that, it also performed surprisingly well in low-resource tests where training material was scarce.
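The pre-training objective named in the acronym, extracting gap sentences, can be sketched roughly as: pick the most "important" sentences in a document, mask them out, and train the model to regenerate them from the remaining text. The toy Python sketch below illustrates the idea under loose assumptions; a crude unigram-overlap score stands in for the paper's ROUGE-based importance selection, and `gap_sentence_example` and the `<mask_1>` placeholder handling are illustrative, not the paper's actual implementation:

```python
import re

MASK = "<mask_1>"

def gap_sentence_example(text, n_gaps=1):
    """Build one (source, target) pre-training pair in the spirit of
    PEGASUS's Gap Sentence Generation: mask the sentence(s) that
    overlap most with the rest of the document, and use them as the
    generation target. Unigram overlap is a crude proxy for the
    ROUGE-based selection described in the paper."""
    sents = [s.strip() for s in re.split(r'(?<=[.!?])\s+', text) if s.strip()]

    def words(s):
        return set(re.findall(r'\w+', s.lower()))

    def overlap(i):
        # How many of this sentence's words also appear elsewhere?
        rest = set().union(*(words(s) for j, s in enumerate(sents) if j != i))
        return len(words(sents[i]) & rest)

    gaps = sorted(range(len(sents)), key=overlap, reverse=True)[:n_gaps]
    src = ' '.join(MASK if i in gaps else s for i, s in enumerate(sents))
    tgt = ' '.join(sents[i] for i in sorted(gaps))
    return src, tgt

doc = ("Pegasus reached state-of-the-art results on twelve summarization tasks. "
       "The summarization tasks covered news, science, stories and patents. "
       "It also performed well when little training data was available.")
src, tgt = gap_sentence_example(doc)
```

Because the masked sentence must be reconstructed from the surrounding document, the objective resembles the downstream summarization task itself, which is the intuition the paper's name encodes.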


As the researchers point out, the goal of text summarization is to distill an input document into an accurate and concise summary.


Abstractive summarization does not simply copy and paste fragments of the input text; instead, it generates new words and condenses the important information so that the output remains fluent.
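The contrast with the simpler extractive approach can be made concrete. The sketch below is a toy extractive baseline (the function name and frequency-based scoring are illustrative assumptions, not anything from the paper): it can only ever copy sentences verbatim, whereas an abstractive system like PEGASUS generates wording that need not appear in the input at all.

```python
import re
from collections import Counter

def extractive_summary(text, k=1):
    """Toy extractive baseline: copy the k sentences whose words are
    most frequent in the document, verbatim. An abstractive model
    would instead generate new phrasing not present in the input."""
    sentences = [s.strip() for s in re.split(r'(?<=[.!?])\s+', text) if s.strip()]
    freq = Counter(re.findall(r'\w+', text.lower()))

    def score(s):
        toks = re.findall(r'\w+', s.lower())
        # Average document frequency of the sentence's words.
        return sum(freq[w] for w in toks) / max(len(toks), 1)

    best = sorted(sentences, key=score, reverse=True)
    return ' '.join(best[:k])

doc = ("Pegasus reached state-of-the-art results on twelve summarization tasks. "
       "The summarization tasks covered news, science, stories and patents. "
       "It also performed well when little training data was available.")
summary = extractive_summary(doc, k=1)
```

Note the defining limitation: every sentence the extractive baseline emits is a substring of the original document.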


The Transformer is a neural architecture introduced by researchers at Google Brain, Google's artificial intelligence research unit.


It extracts features and learns to make predictions the same way all deep neural networks do: neurons arranged in interconnected layers pass along signals from the input data and adjust the weight of each connection.


But the Transformer architecture is unique in that every output element is connected to every input element, and the weights between those connections are calculated dynamically.
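The dynamic weighting described above is the attention mechanism at the heart of the Transformer. A minimal NumPy sketch of scaled dot-product attention (variable names and the random toy inputs are my own; this is a simplified single-head version, without the multi-head projections a full Transformer uses):

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: every output position attends to
    every input position, with connection weights computed dynamically
    from the data via a softmax over pairwise affinities."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # pairwise affinities
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V, weights

# 3 output positions attending over 4 input positions, dimension 8
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out, w = attention(Q, K, V)
```

Each row of `w` sums to 1, so every output is a data-dependent weighted average of the inputs, which is exactly the "dynamically calculated weights" the article refers to.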


In testing, the research team selected the best-performing PEGASUS model, which contains 568 million parameters. It was trained on two corpora: one consisting of 750GB of text extracted from 350 million web pages, and another covering 1.5 billion news articles, totaling 3.8TB. The researchers said that in the latter case they seeded the web crawlers with whitelisted domains, so the corpus covers content of uneven quality.


According to the researchers, PEGASUS's summaries are of excellent linguistic quality, with high fluency and coherence. Moreover, in low-resource settings, even with only 100 example articles, it produced summaries of quality comparable to models trained on full datasets of 20,000 to 200,000 articles.


Source: NetEase Smart, translated by Google Translate

Statement: this article is reprinted from authoritative news media for the purpose of conveying more information and academic exchange. It is not used for commercial purposes and does not imply agreement with its views or confirmation of its descriptions. The content of this article is for reference only. If it infringes the rights and interests of any third party, please contact us and we will deal with it as soon as possible.
