Researchers extract audio from still images and silent videos

What if you could hear photos? Impossible, right? Not anymore – with the help of artificial intelligence (AI) and machine learning, researchers can now extract audio from photos and silent videos.

Academics from four US universities have teamed up to develop a technique called Side Eye that can extract audio from static photos and silent – or muted – videos.

The technique targets the image stabilization technology that is now virtually standard on modern smartphones.

AI language models can exceed PNG and FLAC in lossless compression, says study

Effective compression is about finding patterns to make data smaller without losing information. When an algorithm or model can accurately guess the next piece of data in a sequence, it shows it’s good at spotting these patterns. This links the idea of making good guesses—which is what large language models like GPT-4 do very well—to achieving good compression.
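
To make that prediction-compression link concrete, here is a minimal sketch. It stands in a toy adaptive character-frequency model for a large language model (an assumption purely for illustration) and sums the ideal code length of −log2(p) bits per symbol that an entropy coder could achieve given those predictions.

```python
# Minimal sketch of the prediction-compression link: a predictor that
# assigns probability p to the next symbol lets an ideal entropy coder
# spend about -log2(p) bits on it. The toy adaptive frequency model
# below stands in for a real LLM (illustrative assumption only).
import math
from collections import Counter

def ideal_compressed_bits(text: str) -> float:
    """Sum of -log2 p(next char) under a simple adaptive model."""
    counts = Counter()
    seen = 0
    total_bits = 0.0
    for ch in text:
        # Laplace-smoothed probability of this character given the history,
        # assuming a 256-symbol alphabet.
        p = (counts[ch] + 1) / (seen + 256)
        total_bits += -math.log2(p)
        counts[ch] += 1
        seen += 1
    return total_bits

if __name__ == "__main__":
    sample = "the quick brown fox jumps over the lazy dog " * 50
    raw_bits = len(sample.encode("utf-8")) * 8
    model_bits = ideal_compressed_bits(sample)
    print(f"raw size:         {raw_bits} bits")
    print(f"ideal model size: {model_bits:.0f} bits "
          f"({100 * model_bits / raw_bits:.1f}% of original)")
```

A stronger predictor assigns higher probability to what actually comes next, which shrinks the total bit count; that is the effect the paper measures when it swaps in a model like Chinchilla 70B.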

In an arXiv research paper titled “Language Modeling Is Compression,” researchers detail their discovery that the DeepMind large language model (LLM) called Chinchilla 70B can perform lossless compression on image patches from the ImageNet image database to 43.4 percent of their original size, beating the PNG algorithm, which compressed the same data to 58.5 percent. For audio, Chinchilla compressed samples from the LibriSpeech audio data set to just 16.4 percent of their raw size, outdoing FLAC compression at 30.3 percent.

In this case, lower numbers in the results mean more compression is taking place. And lossless compression means that no data is lost during the compression process. It stands in contrast to a lossy compression technique like JPEG, which discards some data and reconstructs it with approximations during decoding to significantly reduce file sizes.
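
For a concrete sense of “lossless,” the short sketch below round-trips a buffer through zlib (a DEFLATE-based lossless codec, chosen here only as a familiar example) and checks that the decompressed bytes are identical to the input, something a lossy codec like JPEG does not guarantee.

```python
# Lossless round-trip: compress, decompress, and verify the original
# bytes come back exactly. zlib is used here purely as an example codec.
import zlib

data = b"example payload " * 1024
compressed = zlib.compress(data, 9)
restored = zlib.decompress(compressed)

assert restored == data  # lossless: byte-for-byte identical
print(f"compressed to {100 * len(compressed) / len(data):.1f}% of original size")
```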

How to Prepare for a GenAI Future You Can’t Predict

Given the staggering pace of generative AI development, it’s no wonder that so many executives are tempted by the possibilities of AI, concerned about finding and retaining qualified workers, and humbled by recent market corrections or missed analyst expectations. They envision a future of work without nearly as many people as today. But this is a miscalculation. Leaders, understandably concerned about missing out on the next wave of technology, are unwittingly making risky bets on their companies’ futures. Here are steps every leader should take to prepare for an uncertain world where generative AI and human workforces coexist but will evolve in ways that are unknowable.

A framework for making plans in the midst of great uncertainty.

Sam Altman Says He Intends to Replace Normal People With AI

That’s one way to talk about other human beings.

As writer Elizabeth Weil notes in a new profile of OpenAI CEO Sam Altman in New York Magazine, the powerful AI executive has a disconcerting penchant for using the term “median human,” a phrase that seemingly equates to a robotic tech bro version of “Average Joe.”

Altman’s hope is that artificial general intelligence (AGI) will have roughly the same intelligence as a “median human that you could hire as a co-worker.”

Will AI make us crazy?

Coverage of the risks and benefits of AI has paid scant attention to how chatbots might affect public health at a time when depression, suicide, anxiety, and mental illness are epidemic in the United States. But mental health experts and the healthcare industry view AI mostly as a promising tool, rather than a potential threat to mental health.

Meta putting AI in smart glasses, assistants and more

People will laugh and dismiss it and make comparisons to Google's clown glasses. But around 2030 Augmented Reality glasses will come out. Basically, it will be a pair of normal-looking sunglasses w/ smartphone-type features, AI, AND… VR stuff.

Meta chief Mark Zuckerberg on Wednesday said the tech giant is putting artificial intelligence into digital assistants and smart glasses as it seeks to regain lost ground in the AI race.

Zuckerberg made his announcements at the Connect developers conference at Meta’s headquarters in Silicon Valley, the company’s main annual product event.

“Advances in AI allow us to create different (applications) and personas that help us accomplish different things,” Zuckerberg said as he kicked off the gathering.

Tim Cook confirms Apple is researching ChatGPT-style AI

Apple CEO Tim Cook has told UK press that the company is “of course” working on generative AI, and that he expects to hire more artificial intelligence staff in that country.

Just hours after Apple put a spotlight on how it supports over half a million jobs in the UK, Tim Cook has been talking about increasing that by hiring more staff working in AI.

According to London’s Evening Standard, Cook was asked by the PA news agency about AI and hiring in the UK. Cook said: “We’re hiring in that area, yes, and so I do expect [recruitment] to increase.”

Autonomous Racing Drones Are Starting To Beat Human Pilots

Even with all the technological advancements in recent years, autonomous systems have never been able to keep up with top-level human racing drone pilots. However, it looks like that gap has been closed with Swift – an autonomous system developed by the University of Zurich’s Robotics and Perception Group.

Previous research projects have come close, but they relied on optical motion-capture setups in a tightly controlled environment. In contrast, Swift is completely independent of remote inputs and relies only on an onboard computer, IMU, and camera for real-time navigation and control. It does, however, require a pretrained machine learning model for the specific track, which maps the drone’s estimated position, velocity, and orientation directly to control inputs. The details of how the system works are well explained in the video after the break.
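
That description amounts to a learned policy that maps a state estimate to low-level commands. The sketch below is only a schematic illustration, not the authors’ code: the layer sizes, the state layout, and the output convention (collective thrust plus body rates) are assumptions made for the example.

```python
# Schematic sketch (not the authors' code) of a policy that maps the
# drone's estimated position, velocity, and orientation directly to
# control commands. All dimensions and conventions are illustrative.
import numpy as np

class RacingPolicy:
    def __init__(self, state_dim=10, hidden=64, control_dim=4, seed=0):
        rng = np.random.default_rng(seed)
        # In the real system these weights come from training on the specific track.
        self.w1 = rng.standard_normal((state_dim, hidden)) * 0.1
        self.b1 = np.zeros(hidden)
        self.w2 = rng.standard_normal((hidden, control_dim)) * 0.1
        self.b2 = np.zeros(control_dim)

    def act(self, state: np.ndarray) -> np.ndarray:
        """state: [position(3), velocity(3), orientation quaternion(4)]."""
        h = np.tanh(state @ self.w1 + self.b1)
        # e.g. [collective thrust, roll rate, pitch rate, yaw rate]
        return h @ self.w2 + self.b2

# One control step: fuse IMU + camera into a state estimate, then query the policy.
policy = RacingPolicy()
state_estimate = np.concatenate([np.zeros(3), np.zeros(3), [1.0, 0.0, 0.0, 0.0]])
command = policy.act(state_estimate)
print(command)
```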

The paper linked above contains a few more interesting details. Swift was able to win 60% of the time, and its lap times were significantly more consistent than those of the human pilots. While human pilots were often faster on certain sections of the course, Swift was faster overall. It picked more efficient trajectories across multiple gates, where the human pilots seemed to plan at most one gate ahead. On the other hand, human pilots could recover quickly from a minor crash, whereas Swift did not include crash recovery.