
A team of researchers from OpenAI recently published a paper describing GPT-3, a deep-learning model for natural-language processing with 175 billion parameters, 100x more than the previous version, GPT-2. The model is pre-trained on nearly half a trillion words and achieves state-of-the-art performance on several NLP benchmarks without fine-tuning.

In a paper published on arXiv, a team of over 30 co-authors described the model and several experiments. The researchers’ goal was to produce an NLP system that performs well on a variety of tasks with little or no fine-tuning, and previous work had indicated that larger models might be the solution. To test that hypothesis, the team increased the size of their previous model, GPT-2, from 1.5 billion parameters to 175 billion. For training, the team collected several datasets, including the Common Crawl dataset and the English-language Wikipedia. The model was evaluated against several NLP benchmarks, matching state-of-the-art performance on “closed-book” question-answering tasks and setting a new record for the LAMBADA language modeling task.

OpenAI made headlines last year with GPT-2 and their decision not to release the 1.5 billion parameter version of the trained model due to “concerns about malicious applications of the technology.” GPT-2 is one of many large-scale NLP models based on the Transformer architecture. These models are pre-trained on large text corpora, such as the contents of Wikipedia, using self-supervised learning. In this scenario, instead of using a dataset containing inputs paired with expected outputs, the model is given a sequence of text with words “masked,” and it must learn to predict the masked words based on the surrounding context. After this pre-training, the models are then fine-tuned with a labelled benchmark dataset for a particular NLP task, such as question-answering.
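
To make the self-supervised setup concrete, the following sketch is a hypothetical Python illustration, not the actual GPT training code (which uses learned tokenizers and neural networks rather than string operations). It shows how a masked-word training pair can be built from raw text with no human labeling.

```python
import random

MASK = "[MASK]"  # placeholder mask token, purely for illustration

def make_masked_example(sentence: str, mask_prob: float = 0.15):
    """Build a (masked input, targets) training pair from raw text.

    No human labels are needed: the targets are simply the original
    words that were hidden, which is why any large text corpus can be
    used for pre-training.
    """
    tokens = sentence.split()
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if random.random() < mask_prob:
            masked.append(MASK)
            targets[i] = tok          # the word the model must learn to predict
        else:
            masked.append(tok)
    if not targets:                   # guarantee at least one hidden word
        i = random.randrange(len(tokens))
        targets[i] = tokens[i]
        masked[i] = MASK
    return " ".join(masked), targets

text = "The model is given a sequence of text and must predict the hidden words"
masked_input, targets = make_masked_example(text)
print(masked_input)  # the sentence with some words replaced by [MASK]
print(targets)       # mapping from masked positions to the hidden words
```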

This paper describes the design, implementation, and evaluation of VanarSena, an automated fault finder for mobile applications (“apps”). The techniques in VanarSena are driven by a study of 25 million real-world crash reports of Windows Phone apps reported in 2012. Our analysis indicates that a modest number of root causes are responsible for many observed failures, but that they occur in a wide range of places in an app, requiring a wide coverage of possible execution paths. VanarSena adopts a “greybox” testing method, instrumenting the app binary to achieve both coverage and speed. VanarSena runs on cloud servers: the developer uploads the app binary; VanarSena then runs several app “monkeys” in parallel to emulate user, network, and sensor data behavior, returning a detailed report of crashes and failures. We have tested VanarSena with 3000 apps from the Windows Phone store, finding that 1108 of them had failures; VanarSena uncovered 2969 distinct bugs in existing apps, including 1227 that were not previously reported. Because we anticipate VanarSena being used in regular regression tests, testing speed is important. VanarSena uses two techniques to improve speed. First, it uses a “hit testing” method to quickly emulate an app by identifying which user interface controls map to the same execution handlers in the code. Second, it generates a ProcessingCompleted event to accurately determine when to start the next interaction. These features are key benefits of VanarSena’s greybox philosophy.

Published 2014-06: http://hdl.handle.net/1721.1/110759
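
The “hit testing” optimization can be illustrated with a short, hypothetical sketch; VanarSena itself instruments Windows Phone binaries, but the underlying idea is simply that controls wired to the same event handler only need to be exercised once.

```python
from collections import defaultdict

def pick_representative_controls(controls):
    """Group UI controls by the event handler they invoke and keep one per group.

    `controls` is a list of (control_id, handler_name) pairs discovered by
    instrumentation; exercising one control per handler is enough to cover
    the distinct code paths reachable from this page.
    """
    by_handler = defaultdict(list)
    for control_id, handler in controls:
        by_handler[handler].append(control_id)
    return {handler: ids[0] for handler, ids in by_handler.items()}

# Hypothetical app page: three buttons share one handler, so two taps suffice.
page_controls = [
    ("btn_share_email", "OnShareClicked"),
    ("btn_share_sms",   "OnShareClicked"),
    ("btn_share_tweet", "OnShareClicked"),
    ("btn_refresh",     "OnRefreshClicked"),
]
print(pick_representative_controls(page_controls))
# {'OnShareClicked': 'btn_share_email', 'OnRefreshClicked': 'btn_refresh'}
```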

Roboticists at the University of California San Diego have developed flexible feet that can help robots walk up to 40 percent faster on uneven terrain such as pebbles and wood chips. The work has applications for search-and-rescue missions as well as space exploration.

“Robots need to be able to walk fast and efficiently on natural, uneven terrain so they can go everywhere humans can go, but maybe shouldn’t,” said Emily Lathrop, the paper’s first author and a Ph.D. student at the Jacobs School of Engineering at UC San Diego.

The researchers will present their findings at the RoboSoft conference, which takes place virtually from May 15 to July 15, 2020.

Is it possible some instances of artificial intelligence are not as intelligent as we thought?

Call it artificial artificial intelligence.

A team of computer science graduate students reports that, on closer examination, several dozen information retrieval algorithms hailed as milestones in artificial intelligence research were in fact nowhere near as revolutionary as claimed. In many cases, the AI techniques used in those algorithms were merely minor tweaks of previously established routines.

EPFL researchers have developed electronic fibers that, when embedded in textiles, can be used to collect data about our bodies by measuring fabric deformation. Their technology employs flexible transmission lines and offers a host of applications, including in the medical industry.

Professor Fabien Sorin and doctoral assistant Andreas Leber, at the Laboratory of Photonic Materials and Fibre Devices (FIMAP) in EPFL’s School of Engineering, have developed a technology that can be used to detect a body’s movements—and a whole lot more.

“Imagine clothing or hospital bed sheets capable of monitoring your breathing and physical gestures, or AI-powered textiles that allow humans to interact more safely and intuitively with robots,” says Leber. “The flexible transmission lines that we’ve developed can do all of this.”

As artificial intelligence (AI) is increasingly used for critical applications such as diagnosing and treating diseases, delivering predictions and results about medical care that practitioners and patients can trust will require more reliable deep-learning models.

In a recent preprint (available through Cornell University’s open-access website arXiv), a team led by a Lawrence Livermore National Laboratory (LLNL) computer scientist proposes a novel approach aimed at improving the reliability of classifier models designed for predicting disease types from diagnostic images, with an additional goal of enabling interpretability by a medical expert without sacrificing accuracy. The approach uses a concept called confidence calibration, which systematically adjusts the model’s predictions to match the human expert’s expectations in real-world settings.

“Reliability is an important yardstick as AI becomes more commonly used in high-risk applications, where there are real adverse consequences when something goes wrong,” explained lead author and LLNL computational scientist Jay Thiagarajan. “You need a systematic indication of how reliable the model can be in the real setting it will be applied in. If something as simple as changing the diversity of the population can break your system, you need to know that, rather than deploy it and then find out.”
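
The preprint’s exact procedure is not detailed in this summary; as a generic illustration of what confidence calibration means, the hypothetical sketch below applies temperature scaling, a common post-hoc calibration method, so that a classifier’s predicted probabilities better reflect how often it is actually right. All names and numbers here are illustrative, not drawn from the paper.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def nll(logits, labels, T):
    """Negative log-likelihood of the true labels at temperature T."""
    probs = softmax(logits / T)
    return -np.mean(np.log(probs[np.arange(len(labels)), labels] + 1e-12))

def fit_temperature(val_logits, val_labels):
    """Pick the temperature that best calibrates held-out validation logits."""
    candidates = np.linspace(0.5, 5.0, 91)
    return min(candidates, key=lambda T: nll(val_logits, val_labels, T))

# Hypothetical over-confident 3-class classifier on a small validation set.
rng = np.random.default_rng(0)
val_labels = rng.integers(0, 3, size=200)
val_logits = rng.normal(size=(200, 3)) * 4.0     # large logits -> overconfident
val_logits[np.arange(200), val_labels] += 1.0    # but only mildly accurate

T = fit_temperature(val_logits, val_labels)
calibrated_probs = softmax(val_logits / T)
print(f"fitted temperature: {T:.2f}")  # a fitted T > 1 softens overconfident predictions
```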

Researchers in Italy have melded the emerging science of convolutional neural networks (CNNs) with deep learning, a discipline within artificial intelligence, to achieve a system of market forecasting with the potential for greater gains and fewer losses than previous attempts to use AI methods to manage stock portfolios. The team, led by Prof. Silvio Barra at the University of Cagliari, published their findings in the IEEE/CAA Journal of Automatica Sinica.

The University of Cagliari-based team set out to create an AI-managed “buy and hold” (B&H) strategy, a system that decides which of three possible actions to take: a long action (buying a stock and selling it before the market closes), a short action (selling a stock, then buying it back before the market closes), or a hold (deciding not to invest in a stock that day). At the heart of their proposed system is an automated cycle of analyzing layered images generated from current and past market data. Older B&H systems based their decisions on machine learning, a discipline that leans heavily on predictions based on past performance.

By letting their proposed network analyze current data layered over past data, the researchers take market forecasting a step further, allowing for a type of learning that more closely mirrors the intuition of a seasoned investor than the rigid rules of a robot. Their proposed network can adjust its buy/sell thresholds based on what is happening both in the present moment and in the past. Taking present-day factors into account increases the yield over both random guessing and trading algorithms that are not capable of real-time learning.
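
As a rough, hypothetical sketch of this kind of pipeline (not the authors’ implementation), the example below stacks windows of recent closing prices into a small 2-D array, the sort of layered “image” a CNN could consume, and maps a model score to the three B&H actions using adjustable thresholds.

```python
import numpy as np

def prices_to_image(closes, window=20, layers=3):
    """Stack normalized windows of recent closes into a (layers, window) array.

    Each row is the same price window shifted one day further into the past,
    so the resulting "image" layers current data over older data.
    """
    rows = []
    for lag in range(layers):
        w = closes[len(closes) - window - lag : len(closes) - lag]
        rows.append((w - w.mean()) / (w.std() + 1e-9))
    return np.stack(rows)

def decide_action(score, buy_thresh=0.6, sell_thresh=0.4):
    """Map a model score in [0, 1] to one of the three B&H actions."""
    if score > buy_thresh:
        return "long"    # buy now, sell before the market closes
    if score < sell_thresh:
        return "short"   # sell now, buy back before the market closes
    return "hold"        # stay out of the market today

closes = np.cumsum(np.random.default_rng(1).normal(0, 1, 200)) + 100  # fake prices
image = prices_to_image(closes)
print(image.shape)          # (3, 20) input for a small CNN
print(decide_action(0.72))  # 'long' under the default thresholds
```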

The Navy is also developing a family of unmanned surface vessels that are intended to increase the offensive punch for less money, while increasing the number of targets the Chinese military would have to locate in a fight.

That’s a push that earned the endorsement of Chief of Naval Operations Adm. Michael Gilday in comments late last year.

“I know that the future fleet has to include a mix of unmanned,” Gilday said. “We can’t continue to wrap $2 billion ships around 96 missile tubes in the numbers we need to fight in a distributed way, against a potential adversary that is producing capability and platforms at a very high rate of speed. We have to change the way we are thinking.”

The new stealth U.S. Air Force B-21 bomber has taken yet another key technological step toward being ready for war, through integrated computer automation designed to streamline data processing, improve targeting and offer pilots organized warzone information in real time.

Air Force and Northrop Grumman developers recently completed an essential software-empowered process intended to bring greater levels of information processing, data management and new measures of computerized autonomy, according to published statements from Air Force Acquisition Executive Dr. William Roper.