One of the significant challenges in AI research is the computational inefficiency in processing visual tokens in Vision Transformer (ViT) and Video Vision Transformer (ViViT) models. These models process all tokens with equal emphasis, overlooking the inherent redundancy in visual data, which results in high computational costs. Addressing this challenge is crucial for the deployment of AI models in real-world applications where computational resources are limited and real-time processing is essential.
Current methods such as ViTs and Mixture-of-Experts (MoE) models are effective at processing large-scale visual data but come with significant limitations. ViTs treat all tokens equally, leading to unnecessary computation on redundant ones. MoEs improve scalability by conditionally activating parts of the network, keeping inference-time cost roughly constant even as parameter count grows. However, they carry a larger parameter footprint and do not actually reduce computational cost unless tokens are skipped entirely. Additionally, these models typically use experts with uniform computational capacity, limiting their ability to allocate resources dynamically based on token importance.
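The idea of allocating compute by token importance can be sketched in a few lines. The snippet below is a toy illustration, not any published model's method: it scores tokens with a stand-in importance measure (L2 norm, where a real system would use a learned router), sends the top-scoring fraction through a "heavy" expert, and passes the rest through unchanged. The function names (`route_tokens`, `heavy_expert`) and the `keep_ratio` parameter are hypothetical.

```python
import numpy as np

def heavy_expert(x):
    # Placeholder for an expensive transformation (e.g. a full MLP expert).
    return x * 2.0

def route_tokens(tokens, keep_ratio=0.5):
    """Toy importance-based token routing.

    tokens: (num_tokens, dim) array. Tokens judged important get the
    heavy path; the rest take a cheap identity path, so compute scales
    with keep_ratio rather than with the total token count.
    """
    scores = np.linalg.norm(tokens, axis=-1)   # stand-in for a learned router score
    k = max(1, int(len(tokens) * keep_ratio))  # how many tokens get full compute
    keep = np.argsort(scores)[-k:]             # indices of the top-k tokens
    out = tokens.copy()                        # light path: pass-through
    out[keep] = heavy_expert(tokens[keep])     # heavy path: full computation
    return out, keep
```

With `keep_ratio=0.5`, half the tokens skip the expensive expert entirely, which is the kind of saving uniform-capacity MoEs cannot deliver without skipping tokens.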
And this shows one of the many ways in which the Economic Singularity is rushing at us. The 🦾🤖 Bots are coming soon to a job near you.
NVIDIA unveiled a suite of services, models, and computing platforms designed to accelerate the development of humanoid robots globally. Key highlights include:
- NVIDIA NIM™ Microservices: These containers, powered by NVIDIA inference software, streamline simulation workflows and reduce deployment times. New AI microservices, MimicGen and Robocasa, enhance generative physical AI in Isaac Sim™, built on @NVIDIAOmniverse.
- NVIDIA OSMO Orchestration Service: A cloud-native service that simplifies and scales robotics development workflows, cutting cycle times from months to under a week.
- AI-Enabled Teleoperation Workflow: Demonstrated at #SIGGRAPH2024, this workflow generates synthetic motion and perception data from minimal human demonstrations, saving time and cost in training humanoid robots.
NVIDIA’s comprehensive approach includes building three computers to empower the world’s leading robot manufacturers: NVIDIA AI and DGX to train foundation models, Omniverse to simulate and enhance AIs in a physically-based virtual environment, and Jetson Thor, a robot supercomputer. The introduction of NVIDIA NIM microservices for robot simulation generative AI further accelerates humanoid robot development.
Nvidia’s upcoming artificial intelligence chips will be delayed by three months or more due to design flaws, a snafu that could affect customers such as Meta Platforms, Google, and Microsoft, which have collectively ordered tens of billions of dollars’ worth of the chips, according to two people who help produce the chip and the server hardware for it.
Nvidia this week told Microsoft, one of its biggest customers, and another large cloud provider about a delay involving the most advanced AI chip in its new Blackwell series of chips, according to a Microsoft employee and another person with direct knowledge.
Electrically powered artificial muscle fibers (EAMFs) are emerging as a revolutionary power source for advanced robotics and wearable devices. Renowned for their exceptional mechanical properties, integration flexibility, and functional versatility, EAMFs are at the forefront of cutting-edge innovation.
A recent review article on this topic was published online in the National Science Review (“Emerging Innovations in Electrically Powered Artificial Muscle Fibers”).
Schematic of electrically powered artificial muscle fibers categorized from the mechanism, material components, and configurations, as well as their application fields. (Image: Science China Press)
Weather and climate experts are divided on whether AI or more traditional methods are most effective. In this new model, Google’s researchers bet on both.
Baidu’s novel self-reasoning AI framework aims to enhance language model reliability, potentially eliminating ‘hallucinations’ and setting new standards for AI accuracy and transparency.