This CU Cancer Center symposium was presented by Sandra McAllister, PhD, assistant professor of medicine at the Harvard Medical School, and associate scientist at Brigham & Women’s Hospital.
Audio samples for the paper “BASE TTS: Lessons from building a billion-parameter text-to-speech model on 100K hours of data”
Abstract: We introduce a text-to-speech (TTS) model called BASE TTS, which stands for Big Adaptive Streamable TTS with Emergent abilities. BASE TTS is the largest TTS model to-date, trained on 100K hours of public domain speech data, achieving a new state-of-the-art in speech naturalness. It deploys a 1-billion-parameter autoregressive Transformer that converts raw texts into discrete codes (“speechcodes”) followed by a convolution-based decoder which converts these speechcodes into waveforms in an incremental, streamable manner. Further, our speechcodes are built using a novel speech tokenization technique that features speaker ID disentanglement and compression with byte-pair encoding. Echoing the widely-reported “emergent abilities” of large language models when trained on increasing volume of data, we show that BASE TTS variants built with 10K+ hours and 500M+ parameters begin to demonstrate natural prosody on textually complex sentences.
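The abstract mentions compressing speechcodes with byte-pair encoding. As a rough illustration of that idea only, the sketch below applies generic BPE merges to a sequence of integer codes. The function names, the integer code values, and the stopping rule are assumptions made for this example; the abstract does not describe the paper's actual tokenizer at this level of detail.

```python
from collections import Counter


def merge_pair(seq, pair, new_id):
    """Replace every non-overlapping occurrence of `pair` in `seq` with `new_id`."""
    out = []
    i = 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def bpe_compress(seq, max_merges, first_new_id):
    """Greedy byte-pair encoding over a list of integer codes.

    Hypothetical helper: repeatedly merges the most frequent adjacent pair
    into a fresh id, stopping when no pair occurs at least twice.
    """
    seq = list(seq)
    merges = {}
    for step in range(max_merges):
        counts = Counter(zip(seq, seq[1:]))
        if not counts:
            break
        pair, count = counts.most_common(1)[0]
        if count < 2:
            break  # nothing left worth merging
        new_id = first_new_id + step
        merges[pair] = new_id
        seq = merge_pair(seq, pair, new_id)
    return seq, merges


# Toy "speechcode" sequence; real codes would come from a speech tokenizer.
codes = [1, 2, 1, 2, 3, 1, 2]
compressed, merges = bpe_compress(codes, max_merges=5, first_new_id=100)
print(compressed, merges)
```

In this toy run the pair `(1, 2)` is merged into a single new code, so the sequence the autoregressive model has to predict becomes shorter, which is the general motivation for applying BPE to discrete codes.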
* All-In Podcast: E166: Mind-blowing AI Video: OpenAI launches Sora, Biden too old? more
* AI Explained: Sora – Full Analysis (with new details)
* AI Explained: Gemini 1.5 and The Biggest Night in AI
* V-JEPA: The next step toward advanced machine intelligence
* Marc Raibert: Boston Dynamics and the Future of Robotics | Lex Fridman Podcast #412
* Two Minute Papers: OpenAI Sora: The Age Of AI Is Here!
* David Shapiro: AGI in 7 Months! Gemini, Sora, Optimus, & Agents – It’s about to get REAL WEIRD out there!
* Cube: 47. Zuck and Hock, MWC Preview, the Battle for Enterprise AI
* Elon Musk on X: What matters w Powerwall 3 is that it can handle peak power of ~30kW, which is enough to handle dryers and air-conditioners.
We explore large-scale training of generative models on video data. Specifically, we train text-conditional diffusion models jointly on videos and images of variable durations, resolutions and aspect ratios. We leverage a transformer architecture that operates on spacetime patches of video and image latent codes. Our largest model, Sora, is capable of generating a minute of high fidelity video. Our results suggest that scaling video generation models is a promising path towards building general purpose simulators of the physical world.
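To make "spacetime patches" more concrete, here is a minimal sketch that cuts a video-shaped array into non-overlapping blocks spanning a few frames and a small spatial window, then flattens each block into one token vector. It uses NumPy on a raw array purely for illustration; the text says the model works on latent codes, and the function name, patch sizes, and array layout `(frames, height, width, channels)` are assumptions, not details from the report.

```python
import numpy as np


def to_spacetime_patches(video, pt, ph, pw):
    """Split a (T, H, W, C) array into flattened spacetime patches.

    Hypothetical helper: each patch covers `pt` frames and a `ph` x `pw`
    spatial window. Returns an array of shape (num_patches, pt * ph * pw * C).
    """
    t, h, w, c = video.shape
    if t % pt or h % ph or w % pw:
        raise ValueError("video dimensions must be divisible by the patch sizes")
    # Expose the patch grid and the within-patch axes separately.
    x = video.reshape(t // pt, pt, h // ph, ph, w // pw, pw, c)
    # Bring the grid axes to the front: (grid_t, grid_h, grid_w, pt, ph, pw, c).
    x = x.transpose(0, 2, 4, 1, 3, 5, 6)
    # One row per patch, suitable as a transformer token sequence.
    return x.reshape(-1, pt * ph * pw * c)


# Toy input: 4 frames of 8x8 with 3 channels.
video = np.arange(4 * 8 * 8 * 3, dtype=np.float32).reshape(4, 8, 8, 3)
patches = to_spacetime_patches(video, pt=2, ph=4, pw=4)
print(patches.shape)
```

Because the number of patches simply scales with the input's duration and resolution, a patch-based representation of this kind can accommodate inputs of varying size, which fits the description of training on videos and images with variable durations, resolutions and aspect ratios.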
Apple is working on a new feature for its Xcode development environment that could make coding a breeze for developers. The feature, reportedly similar to Microsoft’s GitHub Copilot, will use artificial intelligence to suggest and complete blocks of code in real time.
Generative AI for Xcode
According to Bloomberg, Apple is also looking into using AI to generate code for testing applications, which can be a time-consuming and tedious task for developers. The company is testing the new tools internally before launching them to the public, possibly later this year.
Vacuums, tire-pressure tools, and car washes aren’t typical at Tesla Supercharger stops or other charging stations, but a survey suggests EV drivers want them.
Scientists use attosecond X-ray pulses to freeze atomic motion. Discover the immediate electronic response to X-rays on matter.
The US Federal Trade Commission moved to put new rules into place around impersonation, citing the rising threat of scams enabled by generative artificial intelligence.