A new theorem shows that no universal entanglement purification protocol exists for all two-qubit entangled states if one is limited to just standard local operations and classical communication.
View a PDF of the paper titled J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning, by Chenxi Whitehouse and 6 other authors
The progress of AI is bottlenecked by the quality of evaluation, and powerful LLM-as-a-Judge models have proved to be a core solution. Improved judgment ability is enabled by stronger chain-of-thought reasoning, motivating the need to find the best recipes for training such models to think. In this work we introduce J1, a reinforcement learning approach to training such models. Our method converts both verifiable and non-verifiable prompts to judgment tasks with verifiable rewards that incentivize thinking and mitigate judgment bias. In particular, our approach outperforms all other existing 8B or 70B models when trained at those sizes, including models distilled from DeepSeek-R1. J1 also outperforms o1-mini, and even R1 on some benchmarks, despite training a smaller model. We provide analysis and ablations comparing Pairwise-J1 vs Pointwise-J1 models, offline vs online training recipes, reward strategies, seed prompts, and variations in thought length and content. We find that our models make better judgments by learning to outline evaluation criteria, comparing against self-generated reference answers, and re-evaluating the correctness of model responses.
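The abstract's core idea, converting prompts into judgment tasks with verifiable rewards that also mitigate judgment bias, can be sketched as follows. This is only an illustrative reconstruction, not the paper's actual reward; the function name and exact scoring rule are assumptions. One standard way to make a pairwise reward both verifiable and bias-mitigating is to score the judge on a pair with a known better response, evaluated in both presentation orders:

```python
# Illustrative sketch of a verifiable pairwise-judgment reward, in the spirit
# of the J1 recipe described above. All names are hypothetical; the paper's
# actual reward shaping may differ.

def pairwise_reward(verdict_ab: str, verdict_ba: str, gold: str) -> float:
    """Score a judge's verdicts on a response pair with a known preference.

    verdict_ab: judge's pick ("A" or "B") when responses are shown as (A, B)
    verdict_ba: judge's pick ("A" or "B") when the same pair is shown swapped,
                i.e. slot "A" now holds the original response B
    gold:       the verifiably better response, "A" or "B"
    """
    # Map the swapped-order verdict back to the original labels.
    unswapped = "A" if verdict_ba == "B" else "B"
    correct = (verdict_ab == gold)
    # Position-consistency: the verdict should not flip when the order is
    # swapped; rewarding only consistent verdicts mitigates position bias.
    consistent = (verdict_ab == unswapped)
    return 1.0 if (correct and consistent) else 0.0

print(pairwise_reward("A", "B", "A"))  # correct and order-consistent -> 1.0
print(pairwise_reward("A", "A", "A"))  # verdict flips with order -> 0.0
```

Because the gold preference is known (e.g. from a verifiable task), this reward needs no human labeling at training time, which is what lets RL "incentivize thinking" at scale.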
Three-year-old Space Solar has completed an 18-month engineering design and development project for key parts of its modular space-based solar power system.
Tesla is developing a terawatt-level supercomputer at Giga Texas to enhance its self-driving technology and AI capabilities, positioning the company as a leader in the automotive and renewable energy sectors despite current challenges.

## Questions to inspire discussion
Tesla’s Supercomputers.
💡 Q: What is the scale of Tesla’s new supercomputer project?
A: Tesla’s Cortex 2 supercomputer at Giga Texas aims for 1 terawatt of compute with 1.4 billion GPUs, making it 3,300x bigger than today’s top system.
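Taking the quoted figures at face value (note that a terawatt is a unit of power, not of compute), a quick arithmetic sanity check shows the numbers are at least internally consistent with modern datacenter GPUs:

```python
# Sanity check on the quoted figures (illustrative arithmetic only; the
# figures themselves are the article's claims, not verified specs).
total_power_w = 1e12   # 1 terawatt, as quoted
gpu_count = 1.4e9      # 1.4 billion GPUs, as quoted

watts_per_gpu = total_power_w / gpu_count
print(round(watts_per_gpu))  # ~714 W per GPU
```

Roughly 714 W per GPU is in the range of a current high-end datacenter accelerator, so the two quoted numbers at least cohere with each other.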
💡 Q: How does Tesla’s compute power compare to Chinese competitors?
A: Tesla’s Full Self-Driving (FSD) uses 3x more compute than Huawei, Xpeng, Xiaomi, and Li Auto combined, with BYD not yet a significant competitor.
This JAMA Patient Page describes the condition of chronic migraine, its risk factors, and preventive and acute treatment options.
Neurons in the rodent dorsomedial prefrontal cortex encode a flexible internal model of emotion by linking directly experienced and inferred associations with aversive experiences.
We know that earlier cancer diagnosis leads to better survival rates, but we aren’t using the data we already have to guide faster testing and treatment, says a University of Melbourne expert.
Scientists at CERN’s Large Hadron Collider successfully transformed lead into gold atoms, achieving an ancient alchemists’ dream through modern physics.
In a new study published in Physical Review Letters, scientists have estimated a new lower bound on the mass of ultra-lightweight bosonic dark matter particles.
Thought to make up about 85% of the matter content in the universe, dark matter has eluded direct observation; its existence is inferred only from its gravitational effects on cosmic structures.
Because of this, scientists have been unable to pin down the nature of dark matter and, therefore, its mass. According to quantum mechanics, every fundamental particle must be either a fermion or a boson.
Ask scientists which gene-editing tool is most needed to advance gene therapy, and they’d probably describe a system that’s now close to realization in the labs of Samuel Sternberg at Columbia University Vagelos College of Physicians and Surgeons and David Liu at the Broad Institute of MIT and Harvard.
The gene editor—called evoCAST—goes a long way toward solving a problem that has confounded the development of gene therapies from the field’s beginnings: how to add long stretches of DNA to defined locations in the human genome without creating unwanted modifications.
The latest iteration of the editor, which utilizes complex enzymes found in bacteria, can be programmed to insert an entire gene—or multiple genes—into a specific location in the human genome with an efficiency suitable for gene therapy. Details of the editor are described in a paper published in Science.