
The progress of AI is bottlenecked by the quality of evaluation, and powerful LLM-as-a-Judge models have proved to be a core solution. Improved judgment ability is enabled by stronger chain-of-thought reasoning, motivating the need to find the best recipes for training such models to think. In this work we introduce J1, a reinforcement learning approach to training such models. Our method converts both verifiable and non-verifiable prompts to judgment tasks with verifiable rewards that incentivize thinking and mitigate judgment bias. In particular, our approach outperforms all other existing 8B or 70B models when trained at those sizes, including models distilled from DeepSeek-R1. J1 also outperforms o1-mini, and even R1 on some benchmarks, despite training a smaller model. We provide analysis and ablations comparing Pairwise-J1 vs Pointwise-J1 models, offline vs online training recipes, reward strategies, seed prompts, and variations in thought length and content. We find that our models make better judgments by learning to outline evaluation criteria, comparing against self-generated reference answers, and re-evaluating the correctness of model responses.
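One way to picture a "judgment task with a verifiable reward" is the minimal sketch below. It is not the authors' implementation. The `<verdict>` tag format, the function names, and the rule of rewarding only verdicts that stay correct in both response orderings are all assumptions made for illustration. The both-orderings rule is one plausible way to discourage position bias.

```python
import re
from typing import Callable, Optional

# Hypothetical output format: the judge ends its reasoning with <verdict>A</verdict> or <verdict>B</verdict>.
VERDICT_RE = re.compile(r"<verdict>\s*([AB])\s*</verdict>")

# A judge takes (prompt, response_a, response_b) and returns its full text output.
Judge = Callable[[str, str, str], str]


def parse_verdict(judge_output: str) -> Optional[str]:
    """Return 'A' or 'B' from the last verdict tag, or None if no tag is present."""
    matches = VERDICT_RE.findall(judge_output)
    return matches[-1] if matches else None


def verdict_reward(judge_output: str, preferred: str) -> float:
    """Verifiable reward: 1.0 if the parsed verdict matches the known better response."""
    return 1.0 if parse_verdict(judge_output) == preferred else 0.0


def position_consistent_reward(judge: Judge, prompt: str, better: str, worse: str) -> float:
    """Reward the judge only if it picks the better response in both orderings (assumed rule)."""
    first = verdict_reward(judge(prompt, better, worse), "A")
    second = verdict_reward(judge(prompt, worse, better), "B")
    return 1.0 if first == 1.0 and second == 1.0 else 0.0
```

Under this sketch, a judge that always answers "A" regardless of content earns no reward, because it is wrong in one of the two orderings.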

Tesla is developing a terawatt-level supercomputer at Giga Texas to enhance its self-driving technology and AI capabilities, positioning the company as a leader in the automotive and renewable energy sectors despite current challenges.

Questions to inspire discussion.

Tesla’s Supercomputers.

💡 Q: What is the scale of Tesla’s new supercomputer project?

A: Tesla’s Cortex 2 supercomputer at Giga Texas aims for 1 terawatt of compute with 1.4 billion GPUs, making it 3,300x bigger than today’s top system.

💡 Q: How does Tesla’s compute power compare to Chinese competitors?

A: Tesla’s FSD uses 3x more compute than Huawei, Xpeng, Xiaomi, and Li Auto combined, with BYD not yet a significant competitor.
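The figures quoted in the first answer above can be sanity-checked with simple arithmetic. The sketch below assumes that "1 terawatt" means total electrical power shared evenly across all GPUs, and that "3,300x bigger" compares power draw. Neither assumption is stated in the source.

```python
# Figures as quoted in the answer above.
TOTAL_POWER_W = 1e12   # 1 terawatt (assumed to mean total power draw)
GPU_COUNT = 1.4e9      # 1.4 billion GPUs
SCALE_FACTOR = 3300    # "3,300x bigger than today's top system"

# Implied average power budget per GPU, if power is split evenly.
watts_per_gpu = TOTAL_POWER_W / GPU_COUNT

# Implied size of today's top system, if the ratio refers to power.
implied_top_system_w = TOTAL_POWER_W / SCALE_FACTOR

print(f"{watts_per_gpu:.0f} W per GPU")                       # 714 W per GPU
print(f"{implied_top_system_w / 1e6:.0f} MW for top system")  # 303 MW for top system
```

Roughly 700 W per GPU is in the range of current data-center accelerators, so the quoted GPU count and power target are at least consistent with each other under these assumptions.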

In a new study published in Physical Review Letters, scientists have estimated a new lower bound on the mass of ultra-lightweight bosonic dark matter particles.

Purported to make up about 85% of the matter content in the universe, dark matter has eluded direct observation. Its existence is only inferred by its gravitational effects on cosmic structures.

Because of this, scientists have been unable to identify the nature of dark matter and, therefore, its mass. According to our current model of quantum mechanics, all fundamental particles must be either fermions or bosons.

Ask scientists which gene-editing tool is most needed to advance gene therapy, and they’d probably describe a system that’s now close to realization in the labs of Samuel Sternberg at Columbia University Vagelos College of Physicians and Surgeons and David Liu at the Broad Institute of MIT and Harvard.

The gene editor, called evoCAST, goes a long way toward solving a problem that has confounded the development of gene therapies from the field’s beginnings: how to add long stretches of DNA to defined locations in the human genome without creating unwanted modifications.

The latest iteration of the editor, which utilizes complex enzymes found in bacteria, can be programmed to insert an entire gene—or multiple genes—into a specific location in the human genome with an efficiency suitable for gene therapy. Details of the editor are described in a paper published in Science.

Konstantin Vodopyanov, a professor at the College of Sciences and CREOL, the College of Optics and Photonics, recently co-authored a study published in the journal Optica. This research examines electro-optic sampling (EOS), a technique that advances fields such as quantum physics, molecular spectroscopy and biomedical sensing.

As a professor at the two colleges, Vodopyanov shows how working across different fields can lead to new ideas. The Optica Fellow’s interdisciplinary research is shaping the future of quantum physics and other areas of science.

His new study explores EOS, in which ultrashort laser pulses pass through crystals whose optical properties change in response to an applied electric field. This technique allows researchers to accurately capture the shape and timing of electric fields across a broad range of frequencies.