
LaViDa: A Large Diffusion Language Model for Multimodal Understanding

Abstract: Modern Vision-Language Models (VLMs) can solve a wide range of tasks requiring visual reasoning. In real-world scenarios, desirable properties for VLMs include fast inference and controllable generation (e.g., constraining outputs to adhere to a desired format). However, existing autoregressive (AR) VLMs like LLaVA struggle in these aspects. Discrete diffusion models (DMs) offer a promising alternative, enabling parallel decoding for faster inference and bidirectional context for controllable generation through text-infilling. While effective in language-only settings, DMs’ potential for multimodal tasks is underexplored. We introduce LaViDa, a family of VLMs built on DMs. We build LaViDa by equipping DMs with a vision encoder and jointly fine-tuning the combined parts for multimodal instruction following.
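The abstract’s key mechanism, parallel decoding of masked tokens with bidirectional context, can be sketched generically. The snippet below is a minimal illustration of a MaskGIT-style confidence-based unmasking loop, not LaViDa’s actual decoder; the `model` callable and the unmasking schedule are assumptions for illustration.

```python
import torch

def diffusion_decode(model, prompt_ids, answer_len=32, steps=8, mask_id=0):
    """Toy parallel decoder for a masked discrete diffusion LM.

    All answer positions start as [MASK]. Each step, the model predicts
    every masked token at once (bidirectional context), and the most
    confident predictions are committed; the rest stay masked.
    """
    ids = torch.cat([prompt_ids,
                     torch.full((answer_len,), mask_id, dtype=torch.long)])
    masked = torch.zeros_like(ids, dtype=torch.bool)
    masked[len(prompt_ids):] = True

    for step in range(steps):
        logits = model(ids.unsqueeze(0))[0]        # (seq_len, vocab_size)
        conf, pred = logits.softmax(-1).max(-1)
        # Commit roughly an equal share of the remaining masks per step.
        k = max(1, int(masked.sum().item() / (steps - step)))
        cand = torch.where(masked, conf, torch.full_like(conf, -1.0))
        keep = cand.topk(k).indices
        ids[keep] = pred[keep]
        masked[keep] = False
        if not masked.any():
            break
    return ids
```

Because every masked position is predicted in each forward pass, a 32-token answer can finish in 8 model calls instead of 32, which is where the speedup over autoregressive decoding comes from.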

Digital Frontier: Where Brain-computer Interfaces & AR/VR Could One Day Meet

Whenever I used to think about brain-computer interfaces (BCI), I typically imagined a world where the Internet was served up directly to my mind through cyborg-style neural implants—or basically how it’s portrayed in Ghost in the Shell. In that world, you can read, write, and speak to others without needing to lift a finger or open your mouth. It sounds fantastical, but the more I learn about BCI, the more I’ve come to realize that this wish list of functions is really only the tip of the iceberg. And when AR and VR converge with the consumer-ready BCI of the future, the world will be much stranger than fiction.

Be it Elon Musk’s latest company Neuralink, which is creating “minimally invasive” neural implants to suit a wide range of potential future applications, or Facebook directly funding research on decoding speech from the human brain, BCI seems to be taking an important step forward in its maturity. And while these well-funded companies can only push the technology forward as a medical device today, thanks to the regulatory hoops governing implants and their relative safety, eventually the technology will reach a point where it’s both safe and cheap enough to land in the brainpans of neurotypical consumers.

Although there’s really no telling when you or I will be able to pop into an office for an outpatient implant procedure (much like how corrective laser eye surgery is done today), we know at least that this particular future will undoubtedly come alongside significant advances in augmented and virtual reality. But before we consider where that future might lead us, let’s take a look at where things are today.

The Rise of Cyborgs: Merging Man with Machine | Terrifying Future of Human Augmentation

Human cyborgs are individuals who integrate advanced technology into their bodies, enhancing their physical or cognitive abilities. This fusion of man and machine blurs the line between science fiction and reality, raising questions about the future of humanity, ethics, and the limits of human potential. From bionic limbs to brain-computer interfaces, cyborg technology is rapidly evolving, pushing us closer to a world where humans and machines become one.


Motion artifact–controlled micro–brain sensors between hair follicles for persistent augmented reality brain–computer interfaces

Modern brain–computer interfaces (BCI), utilizing electroencephalograms for bidirectional human–machine communication, face significant limitations from movement-vulnerable rigid sensors, inconsistent skin–electrode impedance, and bulky electronics, diminishing the system’s continuous use and portability. Here, we introduce motion artifact–controlled micro–brain sensors between hair strands, enabling ultralow impedance density on skin contact for long-term usable, persistent BCI with augmented reality (AR). An array of low-profile microstructured electrodes with a highly conductive polymer is seamlessly inserted into the space between hair follicles, offering high-fidelity neural signal capture for up to 12 h while maintaining the lowest contact impedance density (0.03 kΩ·cm⁻²) reported to date. The implemented wireless BCI, detecting steady-state visually evoked potentials, offers 96.4% accuracy in signal classification with a train-free algorithm even during the subject’s excessive motions, including standing, walking, and running. A demonstration captures this system’s capability, showing AR-based video calling with hands-free controls using brain signals, transforming digital communication. Collectively, this research highlights the pivotal role of integrated sensors and flexible electronics technology in advancing BCI’s applications for interactive digital environments.
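The abstract mentions a train-free classifier for steady-state visually evoked potentials (SSVEPs). A standard train-free approach is canonical correlation analysis (CCA) against sine/cosine reference templates; the sketch below shows that generic technique and is not necessarily the authors’ exact algorithm.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

def ssvep_classify(eeg, fs, stim_freqs, harmonics=2):
    """Train-free SSVEP classification via canonical correlation analysis.

    eeg: (n_samples, n_channels) EEG segment; fs: sampling rate in Hz.
    Returns the stimulus frequency whose sine/cosine reference set is
    most correlated with the recorded signals.
    """
    t = np.arange(eeg.shape[0]) / fs
    scores = []
    for f in stim_freqs:
        # Reference: sin/cos at the stimulus frequency and its harmonics.
        ref = np.column_stack(
            [fn(2 * np.pi * (h + 1) * f * t)
             for h in range(harmonics) for fn in (np.sin, np.cos)])
        x, y = CCA(n_components=1).fit_transform(eeg, ref)
        scores.append(np.corrcoef(x[:, 0], y[:, 0])[0, 1])
    return stim_freqs[int(np.argmax(scores))]
```

Because the reference signals are synthesized analytically, no per-user calibration data is needed, which is what makes such classifiers “train-free.”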

Tiny brain sensor shows 96.4% accuracy in identifying neural signals

Researchers from Georgia Institute of Technology (Georgia Tech) have developed a microscopic brain sensor so small that it fits in the gap between hair follicles on the scalp, sitting slightly under the skin. The sensor is discreet enough not to be noticed and minuscule enough to be worn comfortably all day.

Brain sensors offer high-fidelity signals, allowing your brain to communicate directly with devices like computers, augmented reality (AR) glasses, or robotic limbs. This is part of what’s known as a Brain-Computer Interface (BCI).

3D streaming gets leaner by seeing only what matters

A new approach to streaming technology may significantly improve how users experience virtual reality and augmented reality environments, according to a study from NYU Tandon School of Engineering.

The research—presented in a paper at the 16th ACM Multimedia Systems Conference (ACM MMSys 2025) on April 1, 2025—describes a method for directly predicting visible content in immersive 3D environments, potentially reducing bandwidth requirements by up to 7-fold while maintaining visual quality.
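As a rough illustration of why predicting visible content saves bandwidth, the sketch below keeps only the point-cloud cells that fall inside a predicted viewing frustum and streams those; the function and its parameters are hypothetical and much simpler than the prediction method described in the paper.

```python
import numpy as np

def select_visible_cells(cell_centers, cam_pos, cam_dir,
                         fov_deg=90.0, max_dist=10.0):
    """Keep only point-cloud cells inside a predicted viewing frustum.

    cell_centers: (n, 3) cell centroids in world coordinates.
    cam_dir: unit vector for the predicted view direction.
    Returns indices of cells worth streaming; everything else is
    skipped, which is where the bandwidth savings come from.
    """
    to_cell = cell_centers - cam_pos
    dist = np.linalg.norm(to_cell, axis=1)
    cos_angle = (to_cell @ cam_dir) / np.maximum(dist, 1e-9)
    in_view = cos_angle >= np.cos(np.radians(fov_deg / 2))
    return np.nonzero(in_view & (dist <= max_dist))[0]
```

A real system would additionally cull cells occluded by nearer geometry, shrinking the streamed set further.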

The technology is being applied in an ongoing NYU Tandon project to bring point cloud video to dance education, making 3D dance instruction streamable on standard devices with lower bandwidth requirements.

Programmable pixels could advance infrared light applications

Without the ability to control infrared light waves, autonomous vehicles wouldn’t be able to quickly map their environment and keep “eyes” on the cars and pedestrians around them; augmented reality couldn’t render realistic 3D imagery; doctors would lose an important tool for early cancer detection. Dynamic light control allows for upgrades to many existing systems, but the complexities associated with fabricating programmable thermal devices hinder availability.

A new active metasurface, the electrically programmable graphene field-effect transistor (Gr-FET), from the labs of Sheng Shen and Xu Zhang in Carnegie Mellon University’s College of Engineering, enables the control of mid-infrared states across a wide range of wavelengths, directions, and polarizations. This enhanced control enables advancements in applications ranging from infrared camouflage to personalized health monitoring.
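To make “addressable pixelated imaging” concrete, here is a purely illustrative sketch of how a target emission pattern might map to per-pixel gate biases; the voltage levels and the on/off emissivity model are assumptions, not the device’s published drive scheme.

```python
import numpy as np

def pattern_to_gate_biases(target, v_on=2.0, v_off=0.0):
    """Map a binary target emission pattern to per-pixel gate voltages.

    Assumes each metasurface pixel toggles between a high- and a
    low-emissivity state under gate bias, so a 2D image becomes a 2D
    array of bias values to write out row by row.
    """
    return np.where(np.asarray(target, dtype=bool), v_on, v_off)

# Example: a 4x4 checkerboard infrared pattern.
checker = (np.indices((4, 4)).sum(axis=0) % 2).astype(bool)
biases = pattern_to_gate_biases(checker)
```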

“For the first time, our active metasurface devices exhibited the monolithic integration of the rapidly modulated temperature, addressable pixelated imaging, and resonant infrared spectrum,” said Xiu Liu, postdoctoral associate in mechanical engineering and lead author of the paper published in Nature Communications. “This breakthrough will be of great interest to a wide range of infrared photonics, biophysics, and thermal engineering audiences.”

How to Remote Control a Human Being | Misha Sra | TEDxBeaconStreetSalon

For over a century, galvanic vestibular stimulation (GVS) has been used to stimulate the inner ear nerves by passing a small electrical current through the skin behind the ears.

We use GVS in a two-player, escape-the-room-style VR game set in a dark virtual world. The VR player is remote-controlled like a robot by a non-VR player, who uses GVS to alter the VR player’s walking trajectory. We also use GVS to induce the physical sensations of virtual motion and to mitigate motion sickness in VR.
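To make the remote-control loop concrete, here is a minimal sketch of how a heading error might map to a bilateral GVS current; the gain, the safety limit, and the electrode convention are assumptions for illustration, not details from Sra’s system.

```python
def gvs_command(desired_heading_deg, current_heading_deg,
                gain_ma_per_deg=0.02, max_ma=1.5):
    """Map heading error to a bilateral GVS current (illustrative only).

    Positive output: anode on the right mastoid, biasing the walker's
    sway to the right; negative is the mirror image. The current is
    clamped to a conservative safety limit.
    """
    # Wrap the error into [-180, 180) so the turn takes the short way.
    error = (desired_heading_deg - current_heading_deg + 180) % 360 - 180
    return max(-max_ma, min(max_ma, gain_ma_per_deg * error))
```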

Brain hacking has been a futurist fascination for decades. Turns out, we may be able to make it a reality as research explores the impact of GVS on everything from tactile sensation to memory.

Misha graduated in June 2018 from the MIT Media Lab, where she worked in the Fluid Interfaces group with Prof. Pattie Maes. Misha works in the area of human-computer interaction (HCI), specifically related to virtual, augmented, and mixed reality. The goal of her work is to create systems that use the entire body for input and output and automatically adapt to each user’s unique state and context. Misha calls her concept perceptual engineering, i.e., immersive systems that alter the user’s perception (or, more specifically, the input signals to their perception) and influence or manipulate it in subtle ways. For example, they modify a user’s sense of balance or orientation, manipulate their visual attention, and more, all without the user’s explicit awareness, in order to assist or guide their interactive experience in an effortless way.

The systems Misha builds use the entire body for input and output; i.e., they can use movement, like walking, or a physiological signal, like breathing, as input, and can output signals that actuate the user’s vestibular system with electrical pulses, causing the individual to move or turn involuntarily. HCI up to now has relied upon deliberate, intentional usage, both for input (e.g., touch, voice, typing) and for output (interpreting what the system tells you, shows you, etc.). In contrast, Misha develops techniques and builds systems that do not require this deliberate, intentional user interface but are able to use the body as the interface for more implicit and natural interactions.

Misha’s perceptual engineering approach has been shown to increase the user’s sense of presence in VR/MR, provide novel ways to communicate between the user and the digital system using proprioception and other sensory modalities, and serve as a platform to question the boundaries of our sense of agency and trust.
