
Large language models (LLMs) have enabled a new data-efficient learning paradigm wherein they can be used to solve unseen new tasks via zero-shot or few-shot prompting. However, LLMs are challenging to deploy for real-world applications due to their sheer size. For instance, serving a single 175 billion parameter LLM requires at least 350GB of GPU memory using specialized infrastructure, not to mention that today’s state-of-the-art LLMs are composed of over 500 billion parameters. Such computational requirements are inaccessible for many research teams, especially for applications that require low-latency performance.
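For a rough sense of scale, the 350GB figure follows from simple arithmetic. The sketch below estimates serving memory assuming 16-bit (2-byte) weights and counting only the parameters themselves, ignoring activations and other runtime overhead; the function name is illustrative.

```python
# Back-of-the-envelope GPU memory estimate for serving an LLM.
# Assumes 16-bit (2-byte) weights; ignores activation and KV-cache overhead.

def serving_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Return the minimum GPU memory (in GB) needed just to hold the weights."""
    return num_params * bytes_per_param / 1e9

print(serving_memory_gb(175e9))  # 175B params -> ~350 GB
print(serving_memory_gb(540e9))  # 540B params -> ~1080 GB
print(serving_memory_gb(770e6))  # 770M params -> ~1.5 GB
```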

To circumvent these deployment challenges, practitioners often choose to deploy smaller specialized models instead. These smaller models are trained using one of two common paradigms: fine-tuning or distillation. Fine-tuning updates a pre-trained smaller model (e.g., BERT or T5) using downstream manually-annotated data. Distillation trains the same smaller models with labels generated by a larger LLM. Unfortunately, to achieve comparable performance to LLMs, fine-tuning methods require human-generated labels, which are expensive and tedious to obtain, while distillation requires large amounts of unlabeled data, which can also be hard to collect.
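The two paradigms can be summarized in a few lines of pseudocode. The sketch below is illustrative only; `small_model`, `llm`, and their methods are hypothetical placeholders rather than APIs from the paper or any particular library.

```python
# Minimal sketch contrasting the two paradigms described above.

def finetune(small_model, labeled_data):
    # Standard fine-tuning: learn from human-annotated (input, label) pairs.
    for x, y in labeled_data:
        small_model.train_step(x, y)

def distill(small_model, llm, unlabeled_data):
    # Standard distillation: the large LLM provides the labels instead.
    for x in unlabeled_data:
        y_pseudo = llm.predict(x)       # pseudo-label from the teacher LLM
        small_model.train_step(x, y_pseudo)
```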

In “Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes”, presented at ACL 2023, we set out to tackle this trade-off between model size and training data collection cost. We introduce distilling step-by-step, a simple new mechanism that trains smaller task-specific models to outperform few-shot prompted LLMs while using much less training data than standard fine-tuning or distillation requires. We demonstrate that distilling step-by-step enables a 770M parameter T5 model to outperform the few-shot prompted 540B PaLM model using only 80% of the examples in a benchmark dataset, a more than 700x reduction in model size achieved with far less training data than standard approaches require.
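At a high level, distilling step-by-step prompts the LLM for a rationale alongside each label and trains the small model on both. The sketch below is a loose illustration of that multi-task setup under stated assumptions; the function names, the prompt prefix, and the loss weight `lam` are placeholders for illustration, not the paper’s actual implementation.

```python
# Hedged sketch of the distilling step-by-step idea: the teacher LLM produces a
# rationale alongside each label, and the small model is trained with a
# multi-task objective (label prediction + rationale generation).

def distill_step_by_step(small_model, llm, unlabeled_data, lam=1.0):
    for x in unlabeled_data:
        # Hypothetical call: few-shot chain-of-thought prompt returning both outputs.
        rationale, label = llm.generate_with_cot(x)
        loss_label = small_model.loss(x, target=label)                       # predict the label
        loss_rationale = small_model.loss("explain: " + x, target=rationale) # generate the rationale
        small_model.update(loss_label + lam * loss_rationale)
```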

It lets researchers extract pixel-by-pixel information at the nanoscale.

The nanoscale refers to a length scale that is extremely small, typically on the order of nanometers (nm), which is one billionth of a meter. At this scale, materials and systems exhibit unique properties and behaviors that are different from those observed at larger length scales. The prefix “nano-” is derived from the Greek word “nanos,” which means “dwarf” or “very small.” Nanoscale phenomena are relevant to many fields, including materials science, chemistry, biology, and physics.

For the first time, a team from the University of Minnesota Twin Cities has synthesized a thin film of a unique topological semimetal material that has the potential to generate more computing power and memory storage while using significantly less energy. Additionally, the team’s close examination of the material yielded crucial insights into the physics behind its unique properties.

The study was recently published in the journal Nature Communications.

Nature Communications is a peer-reviewed, open-access, multidisciplinary, scientific journal published by Nature Portfolio. It covers the natural sciences, including physics, biology, chemistry, medicine, and earth sciences. It began publishing in 2010 and has editorial offices in London, Berlin, New York City, and Shanghai.

“They’re not announcing like, ‘We have created a model that does a particular thing.’ Instead, they’re saying, ‘We are planning to create a resource that is going to be available for biologists to create new models,’” Carpenter said.

The Chan Zuckerberg Initiative, the couple’s LLC, told The Register that it plans to have its product running by 2024. The organization also declined to tell The Register how much it will have to spend to build the product.

It could be a hefty bill, considering that the computer parts it wants to use are in high demand and low supply, The Register reported.

The newly upgraded Linac Coherent Light Source (LCLS) X-ray free-electron laser (XFEL) at the Department of Energy’s SLAC National Accelerator Laboratory successfully produced its first X-rays, and researchers around the world are already lined up to kick off an ambitious science program.

The upgrade, called LCLS-II, creates unparalleled capabilities that will usher in a new era in research with X-rays.

Scientists will be able to examine the details of quantum materials with unprecedented resolution to drive new forms of computing and communications; reveal unpredictable and fleeting chemical events to teach us how to create more sustainable industries and clean energy technologies; study how biological molecules carry out life’s functions to develop new types of pharmaceuticals; and study the world on the fastest timescales to open up entirely new fields of scientific investigation.

During the IEEE International Electron Devices Meeting (IEDM), Intel claimed that by 2030 there would be chips with a trillion transistors, roughly ten times the number of transistors available on modern CPUs.

At the meeting, Intel’s Components Research Group laid out its prediction for the future of chip manufacturing (via sweclockers) and how new packaging technologies and materials will allow chipmakers to build chips with 10x the transistor density, keeping pace with Moore’s Law.
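As a rough sanity check on that projection, the snippet below computes the implied annual growth rate. The starting transistor count (~10^11 on today’s largest chips) and the 2022 IEDM announcement as the starting year are assumptions based on the article’s round numbers, not figures from Intel.

```python
import math

# Rough check: growing ~10x (from ~100 billion transistors to a trillion)
# by 2030 implies a doubling time close to the classic Moore's Law cadence.

current = 100e9          # ~10^11 transistors today (order-of-magnitude assumption)
target = 1e12            # 1 trillion by 2030
years = 2030 - 2022      # horizon assumed from the 2022 IEDM announcement

growth_per_year = (target / current) ** (1 / years)
doubling_time = math.log(2) / math.log(growth_per_year)
print(f"{growth_per_year:.2f}x per year, doubling every {doubling_time:.1f} years")
# -> ~1.33x per year, i.e., doubling roughly every 2.4 years
```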

Researchers have developed a method of “wiring up” graphene nanoribbons (GNRs), a class of one-dimensional materials that are of interest in the scaling of microelectronic devices. Using a direct-write scanning tunneling microscopy (STM)-based process, nanometer-scale metal contacts were fabricated on individual GNRs, and those contacts could control the electronic character of the GNRs.

The researchers say that this is the first demonstration of making metal contacts to specific GNRs with certainty and that those contacts induce device functionality needed for transistor function.

The results of this research, led by electrical and computer engineering (ECE) professor Joseph Lyding, along with ECE graduate student Pin-Chiao Huang and graduate student Hongye Sun, were recently published in the journal ACS Nano.

Data centre energy consumption could be cut with new ‘breakthrough’ photonic chips that are more efficient than today’s chips.

Data centres can consume up to 50 times more energy per square foot of floor space than a typical office building and account for roughly 2 per cent of all electricity use in the US.

In recent years, the number of data centres has risen rapidly due to soaring demand from the likes of Facebook, Amazon, Microsoft and Google.

Researchers at the MESA+ Institute for Nanotechnology developed a tool that can simultaneously measure the size of a plasma source and the color of the light it emits. “Measuring both at the same time enables us to further improve lithography machines for smaller, faster and improved chips.” The article is highlighted as an Editor’s pick in Optics Letters.

Lithography machines are central to the process of making the microchips that are needed for almost all our electronic devices. To produce the smallest chips, these machines need precision-engineered lenses, mirrors and light sources. “Traditionally, we could only look at the amount of light produced, but to further improve the chipmaking process, we also want to study the colors of that light and the size of its source,” explains Muharrem Bayraktar, assistant professor at the XUV Optics Group.

The extreme ultraviolet light is emitted by a plasma source, produced by aiming lasers at metal droplets. With sets of special mirrors, this light is aimed at a silicon wafer to create the smallest microchips imaginable. “We want to make the plasma as small as possible. Too large and you ‘waste’ a lot of light because the mirrors cannot catch all the light,” says Bayraktar.