Alaska’s capital orders evacuations as record glacial flooding threatens homes.
Juneau braces for record Mendenhall Glacier flooding as climate change drives faster glacial melt and billions of gallons surge toward homes.
New AI method speeds up calculations to protect fusion reactors from plasma heat.
Scientists in the US have introduced a novel artificial intelligence (AI) approach that can protect fusion reactors from the extreme heat generated by plasma.
The new method, which is called HEAT-ML, was developed by researchers from Commonwealth Fusion Systems (CFS), the US Department of Energy’s (DOE) Princeton Plasma Physics Laboratory (PPPL), and Oak Ridge National Laboratory.
It can reportedly identify magnetic shadows, which are critical areas shielded from the intense heat of the plasma, quickly enough to help prevent potential problems before they start.
Chinese researchers unveil 10x larger atom array for next-gen quantum processors.
Scientists in China have achieved a significant breakthrough in advancing quantum physics.
A team of researchers has developed the largest array of atoms for quantum computing.
The array, a key component of a quantum computer, is reportedly 10 times larger than those in previous systems.
AI-powered brain implant restores speech in paralysis patient after 18 years.
UC Berkeley and UCSF use AI-driven brain-computer interface to restore near real-time speech in paralysis patient.
A new software tool developed by Cornell researchers can model a small city’s building energy use within minutes on a standard laptop, then run simulations to help policymakers prioritize the most cost-effective approaches to decarbonization.
Using the City of Ithaca, New York, as a case study, the urban building energy model quickly mapped more than 5,000 residential and commercial buildings and their baseline energy use. Simulated investments in weatherization, electric heat pumps and rooftop solar panels, while also factoring in financial incentives, generated insights that are informing city efforts to achieve carbon neutrality by 2030.
The tool’s automated workflow, accessibility and accuracy—without advanced computing power—could be particularly valuable for smaller cities that lack resources and expertise dedicated to decarbonization, the researchers said. But they said the new model—now also supporting the county that surrounds Ithaca—could be further scaled up to serve big cities or an entire state.
Information tasks such as writing surveys or analytical reports require complex search and reasoning, and have recently been grouped under the umbrella of "deep research", a term also adopted by recent models targeting these capabilities. Despite growing interest, the scope of the deep research task remains underdefined, and its distinction from other reasoning-intensive problems is poorly understood. In this paper, we propose a formal characterization of the deep research (DR) task and introduce a benchmark to evaluate the performance of DR systems. We argue that the core defining feature of deep research is not the production of lengthy report-style outputs, but rather the high fan-out over concepts required during the search process, i.e., broad and reasoning-intensive exploration. To enable objective evaluation, we define DR using an intermediate output representation that encodes key claims uncovered during search, separating the reasoning challenge from surface-level report generation. Based on this formulation, we propose a diverse benchmark, LiveDRBench, with 100 challenging tasks over scientific topics (e.g., datasets, materials discovery, prior art search) and public interest events (e.g., flight incidents, movie awards). Across state-of-the-art DR systems, F1 scores range between 0.02 and 0.72 across sub-categories. OpenAI’s model performs the best with an overall F1 score of 0.55. Analysis of reasoning traces reveals the distribution over the number of referenced sources, branching, and backtracking events executed by current DR systems, motivating future directions for improving their search mechanisms and grounding capabilities. The benchmark is available at https://github.com/microsoft/LiveDRBench.
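To illustrate how an F1 score over key claims can be computed, here is a minimal sketch. It assumes claims are compared by exact string match, which is a simplification made for illustration; the abstract does not describe how LiveDRBench matches claims, and the function name `claim_f1` and the example claims are hypothetical.

```python
def claim_f1(predicted, reference):
    """Return (precision, recall, F1) between two collections of claims.

    Assumption: two claims match only if their strings are identical.
    A real benchmark may use a more lenient matching procedure.
    """
    predicted, reference = set(predicted), set(reference)
    if not predicted or not reference:
        return 0.0, 0.0, 0.0

    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    if true_positives == 0:
        return precision, recall, 0.0

    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: four reference claims, three predicted, two correct.
reference_claims = {"claim A", "claim B", "claim C", "claim D"}
predicted_claims = {"claim A", "claim B", "claim X"}
precision, recall, f1 = claim_f1(predicted_claims, reference_claims)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Because F1 is a harmonic mean, a system that returns many claims but few correct ones scores low, as does one that returns only a handful of correct claims while missing most of the reference set.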