
Meta has created a system that can embed hidden signals, known as watermarks, in AI-generated audio clips, which could make it easier to detect AI-generated content online.

The tool, called AudioSeal, is the first that can pinpoint which segments of audio in, say, a full hour-long podcast might have been generated by AI. It could help tackle the growing problem of misinformation and scams that use voice-cloning tools, says Hady Elsahar, a research scientist at Meta. Malicious actors have used generative AI to create audio deepfakes of President Joe Biden, and scammers have used deepfakes to blackmail their victims. Watermarks could, in theory, help social media companies detect and remove unwanted content.
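Meta has released AudioSeal as open source, so the embed-and-detect workflow can be tried directly. The sketch below is based on the `audioseal` package's published interface at the time of writing; the model identifiers, tensor shapes, and return values are assumptions drawn from the project's README and may differ in current releases.

```python
import torch
from audioseal import AudioSeal  # pip install audioseal (assumed package name)

# Load the watermark generator and detector. The model card names below
# are assumed from the project README; check the repo for current ones.
generator = AudioSeal.load_generator("audioseal_wm_16bits")
detector = AudioSeal.load_detector("audioseal_detector_16bits")

# Dummy one-second mono clip at 16 kHz, shaped (batch, channels, samples).
sample_rate = 16_000
wav = torch.randn(1, 1, sample_rate)

# The watermark is an additive, imperceptible signal layered onto the audio.
watermark = generator.get_watermark(wav, sample_rate)
watermarked = wav + watermark

# The detector reports whether a watermark is present; per-frame scoring is
# what lets the system flag individual segments of a longer recording.
result, message = detector.detect_watermark(watermarked, sample_rate)
print(result)
```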

OpenAI has acquired Rockset, which builds tools to drive real-time search and data analytics.

In a post on its official blog, OpenAI said that it would integrate Rockset’s technology to “power [its] infrastructure across products.” Members of Rockset’s team will join OpenAI, and Rockset’s existing customers will be transitioned off the platform “gradually.”

The financial terms weren’t disclosed.

A high school robotics team has built what may be the world’s smallest and cheapest network switch.

The device was created by Murex Robotics, a team formed by students at Phillips Exeter Academy in New Hampshire.

They built it because they could not find an affordable embedded Ethernet switch for the remotely operated vehicle (ROV) they were constructing for an underwater drone competition.

Reconstructing the hidden parts of a scene from a single camera viewpoint is challenging. Researchers have applied generative artificial intelligence (AI) to the problem, but such models can hallucinate objects when deciding what is obscured.

An alternative approach is to use the shadows in a color image to infer the shape of a hidden object, but that method falls short when the shadows are faint or hard to see.

To overcome these limitations, researchers at MIT turned to single-photon LiDAR. A LiDAR emits pulses of light, and the time each pulse takes to bounce back to the sensor is used to build a 3D map of the scene.
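To make the time-of-flight idea concrete, here is a minimal sketch (not the MIT system itself) of the arithmetic every LiDAR relies on: a photon’s round-trip time converts to depth via the speed of light.

```python
# Speed of light in meters per second.
C = 299_792_458.0

def depth_from_round_trip(t_seconds: float) -> float:
    """Depth = c * t / 2: the pulse travels to the surface and back,
    so half the round-trip distance is the range to the surface."""
    return C * t_seconds / 2.0

# A photon returning after ~20 nanoseconds indicates a surface ~3 m away.
print(depth_from_round_trip(20e-9))  # ≈ 2.998 meters
```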

Large language models (LLMs) have emerged as a transformative technology, revolutionizing AI with their ability to generate human-like text with unprecedented fluency and apparent comprehension. Trained on vast datasets of human-generated text, LLMs have unlocked innovations across industries, from content creation and language translation to data analytics and code generation. Recent developments, like OpenAI’s GPT-4o, showcase multimodal capabilities, processing text, vision, and audio inputs in a single neural network.

Despite their potential to drive productivity and enable new forms of human-machine collaboration, LLMs are still at a nascent stage. They face limitations such as factual inaccuracies, biases inherited from training data, a lack of common-sense reasoning, and data privacy concerns. Techniques like retrieval-augmented generation (RAG) aim to ground an LLM’s outputs in external sources and improve accuracy, as sketched below.
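As a rough illustration of the retrieval step in RAG, the toy sketch below substitutes a bag-of-words embedding and cosine similarity for a real embedding model; the corpus, the `embed` function, and the prompt format are all hypothetical stand-ins for illustration only.

```python
import re
import numpy as np

# Toy corpus of reference documents the model could ground its answers in.
documents = [
    "AudioSeal embeds localized watermarks in AI-generated audio.",
    "Rockset builds real-time search and analytics infrastructure.",
    "Single-photon LiDAR maps scenes by timing reflected light pulses.",
]

def tokenize(text: str) -> list[str]:
    """Lowercase word tokens, stripping punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())

vocab = sorted({w for d in documents for w in tokenize(d)})

def embed(text: str) -> np.ndarray:
    """Hypothetical stand-in for a real embedding model: word counts."""
    words = tokenize(text)
    return np.array([words.count(w) for w in vocab], dtype=float)

doc_vecs = np.stack([embed(d) for d in documents])

def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the k documents most similar to the query (cosine similarity)."""
    q = embed(query)
    denom = np.linalg.norm(doc_vecs, axis=1) * max(np.linalg.norm(q), 1e-9)
    scores = (doc_vecs @ q) / np.maximum(denom, 1e-9)
    return [documents[i] for i in np.argsort(scores)[::-1][:k]]

query = "How does audio watermarking work?"
context = "\n".join(retrieve(query))

# Retrieved text is prepended to the prompt so the model answers from
# grounded sources rather than relying solely on its training weights.
prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
print(prompt)
```

In production systems, the bag-of-words stand-in would be replaced by a learned embedding model and a vector database, but the pipeline shape (embed, retrieve, prepend, generate) is the same.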

To explore these issues, I spoke with Amir Feizpour, CEO and founder of AI Science, an expert-in-the-loop business workflow automation platform. We discussed the transformative impacts, applications, risks, and challenges of LLMs across different sectors, as well as the implications for startups in this space.