Groundbreaking observations by the James Webb Space Telescope of an early galaxy merger indicate faster and more efficient star formation than previously understood, revealing complex stellar populations and challenging current cosmological theories. Galaxies and stars appear to have developed faster after the Big Bang than previously thought.
Language models see too few fruitful mistakes during training, which hinders their ability to anticipate consequences beyond the next token. LMs must improve their capacity for complex decision-making, planning, and reasoning. Transformer-based models struggle with planning due to error snowballing and difficulty with lookahead tasks. While some efforts have integrated symbolic search algorithms to address these issues, they merely supplement language models during inference. Yet teaching language models to search during training could enable self-improvement, fostering more adaptable strategies for tackling challenges like error compounding and lookahead.
Researchers from Stanford University, MIT, and Harvey Mudd have devised a method to teach language models to search and backtrack by representing the search process as a serialized string, called a Stream of Search (SoS). They proposed a unified language for search, demonstrated on the game of Countdown. Pretraining a transformer-based language model on streams of search increased search accuracy by 25% over a model trained only on optimal solution trajectories, while further finetuning with policy improvement methods led to solving 36% of previously unsolved problems. This shows that language models can learn to solve problems via search, self-improve, and discover new strategies autonomously.
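To make the core idea concrete, here is a minimal Python sketch of how a Countdown search trajectory, including dead ends and backtracking, might be serialized into a single training string. The trace markers (State/Explore/Backtrack/Goal) and the join format are illustrative placeholders, not the paper's actual search language.

```python
from itertools import combinations

# Operations for combining two numbers; subtraction is shown in one
# order only to keep the sketch short.
OPS = [("+", lambda a, b: a + b),
       ("-", lambda a, b: a - b),
       ("*", lambda a, b: a * b)]

def search(nums, target, trace):
    """Depth-first Countdown search that logs every step, including dead ends."""
    trace.append(f"State: {sorted(nums)}")
    if target in nums:
        trace.append(f"Goal: reached {target}")
        return True
    if len(nums) == 1:
        trace.append("Backtrack: no combinations left")
        return False
    for a, b in combinations(nums, 2):
        rest = list(nums)
        rest.remove(a)
        rest.remove(b)
        for name, fn in OPS:
            trace.append(f"Explore: {a} {name} {b} = {fn(a, b)}")
            if search(rest + [fn(a, b)], target, trace):
                return True
    trace.append("Backtrack: exhausted this branch")
    return False

trace = []
search([3, 5, 7], 26, trace)            # e.g. 3 * 7 = 21, then 21 + 5 = 26
stream_of_search = " -> ".join(trace)   # one serialized string, dead ends included
```

Training on strings like `stream_of_search`, rather than only on the final winning sequence of moves, is what exposes the model to exploration and backtracking behavior.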
Recent studies integrate language models into search and planning systems, employing them to generate and assess candidate actions or states. These methods rely on symbolic search algorithms such as BFS or DFS for the exploration strategy; however, the LM is used only at inference time, so its underlying reasoning ability is not improved. Conversely, in-context demonstrations can illustrate search procedures in language, enabling the LM to conduct a tree search accordingly, but such methods are limited to the demonstrated procedures. Process supervision trains an external verifier model to provide detailed, step-level feedback for LM training; it outperforms outcome supervision but requires extensive labeled data.
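For contrast with the SoS approach, the inference-time pattern described above can be sketched as a symbolic best-first search in which a language model only ranks candidate states. `lm_score` below is a hypothetical stand-in for an actual LM call, not any particular system's API.

```python
import heapq
from itertools import count

def lm_score(state):
    # Hypothetical stand-in: in the systems described, a language model
    # would rate how promising a partial solution looks (higher is better).
    # Here, a placeholder heuristic simply favors longer partial solutions.
    return len(state)

def best_first_search(start, expand, is_goal, max_steps=100):
    """Symbolic best-first search; the LM is consulted only to rank states."""
    tiebreak = count()
    frontier = [(-lm_score(start), next(tiebreak), start)]
    for _ in range(max_steps):
        if not frontier:
            return None
        _, _, state = heapq.heappop(frontier)
        if is_goal(state):
            return state
        for nxt in expand(state):  # symbolic successor function
            heapq.heappush(frontier, (-lm_score(nxt), next(tiebreak), nxt))
    return None

# Toy usage: grow a string one character at a time until it matches a target.
result = best_first_search("", lambda s: [s + "a"], lambda s: s == "aaaa")
```

The key limitation the SoS work targets is visible here: the search loop and scoring live outside the model, so the model itself never learns to explore or backtrack.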
An enzyme in a cyanobacterium can take the unusual form of a triangle containing ever-smaller triangular gaps, making a fractal pattern.
Researchers have created an “atlas” of the human ovary, which could lead to the development of artificial ovaries and help restore fertility in patients.
Researchers at Tohoku University have developed a paper-based magnesium-air battery that is eco-friendly and powerful.
A multidisciplinary team is teaching dog-like robots to navigate the moon’s craters and other challenging planetary surfaces.
As part of the research funded by NASA, researchers from various universities and NASA Johnson Space Center tested a quadruped named Spirit at Palmer Glacier on Oregon’s Mount Hood.
During five days of testing in the summer of 2023, Spirit traversed various terrains, ambling over, across, and around shifting earth, mushy snow, and stones on its spindly metal legs.
Generative AI companies are dominant on this year’s list of the most promising AI startups, heralding a coming productivity revolution.
In a move that directly challenges Nvidia in the lucrative AI training and inference markets, Intel announced its long-anticipated new Intel Gaudi 3 AI accelerator at its Intel Vision event.
The new accelerator offers significant improvements over the previous-generation Gaudi 2 processor, promising to bring new competitiveness to training and inference for LLMs and multimodal models.
Gaudi 3 dramatically increases AI compute capabilities, delivering substantial improvements over Gaudi 2 and competitors, particularly in processing BF16 data types, which are crucial for AI workloads.
Databricks’ New Open Source LLM
Data analytics company Databricks says its mission is to deliver data intelligence to every enterprise by allowing organizations to understand and use their unique data to build their own AI systems. Central to that mission is the ability to use a large language model tailored to the needs of the enterprise.
Databricks addresses the need for open LLMs with the release of DBRX, a new open, general-purpose large language model that sets new benchmarks for performance and efficiency. The announcement continues the recent trend of open large language models adapted for the needs of the enterprise.
The open-source DBRX large language model was developed by Databricks’ Mosaic Research team, which joined the company through its June 2023 acquisition of MosaicML.
The OpenAI team illegally used more than one million hours of YouTube videos; here is why.