For the average person, the term Artificial Intelligence (AI) alone can be confusing, as it seems to cover everything that feels 'unnatural'. The confusion may begin with the difficulty of differentiating between Information Technology (IT) and AI.
If electronics tried selling themselves by speaking to you, would you have a greater urge to buy them? This is the question a recent study published in Decision Support Systems set out to address, as a research duo investigated how artificial intelligence (AI) could be used as a productive marketing and retail tool. The study holds the potential to help researchers, businesses, and consumers better understand how AI can sell products through anthropomorphism (the attribution of human qualities).
“Companies have long used cartoon-like characters to sell products. We are familiar with the ‘M&M spokescandies’, for example,” said Dr. Alan Dennis, who is a Professor of Information Systems in the Kelley School of Business at Indiana University and co-author on the study. “But adding human features to a product can be a powerful way to influence consumers’ perceptions and decision making, because it can trigger anthropomorphism.”
For the study, the researchers enlisted approximately 50 undergraduate students and asked them to pretend they were new master’s degree students who needed a new television, camera, or laptop for their studies. Using an eBay-style auction website, the students then bid on the products after watching a two-minute video in which a speaker with human attributes described the product. The goal of the study was to ascertain how much more the students were willing to bid on products presented with the video compared to products without it, all while an Emotiv EPOC EEG headset gathered data on their brain activity.
Just a few weeks back, I wrote that we are probably still some way from being able to create a movie from a natural language prompt.
OpenAI’s Sora, a groundbreaking text-to-video model, has catapulted the AI community years ahead, offering near-photorealistic videos from text prompts.
Gemma 2B and Gemma 7B are smaller open-source AI models for language tasks in English.
Google has released Gemma 2B and 7B, a pair of open-source AI models that let developers use the research that went into its flagship Gemini more freely.
Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models.
The company plans to release a subscription plan for the tool after it is out of beta.
Adobe’s new AI Assistant instantly generates summaries and insights from long documents, answers questions and formats information for sharing in emails, reports and presentations.
Fusion powers the Sun, and, by extension, makes life on Earth possible.
Researchers are using AI to predict and prevent plasma instabilities in fusion reactors, averting disruptions to the reaction. Experiments show the AI can forecast instabilities 300 milliseconds in advance, allowing real-time adjustments that keep the plasma stable.
The Tokenizer is a necessary and pervasive component of Large Language Models (LLMs), translating between strings and tokens (text chunks). Tokenizers are a completely separate stage of the LLM pipeline: they have their own training sets and training algorithms (Byte Pair Encoding), and after training they implement two fundamental functions: encode(), from strings to tokens, and decode(), back from tokens to strings. In this lecture we build from scratch the Tokenizer used in the GPT series from OpenAI. In the process, we will see that a lot of weird behaviors and problems of LLMs actually trace back to tokenization. We’ll go through a number of these issues, discuss why tokenization is at fault, and why, ideally, someone out there finds a way to delete this stage entirely.
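The core BPE training loop the lecture describes (start from raw UTF-8 bytes, repeatedly count consecutive pairs, and merge the most frequent pair into a new token id) can be sketched in a few lines of Python. This is a simplified illustration, not the lecture's actual minbpe code; function names like `train` and `merge` are my own:

```python
from collections import Counter

def get_pair_counts(ids):
    # Count occurrences of each consecutive token pair.
    return Counter(zip(ids, ids[1:]))

def merge(ids, pair, new_id):
    # Replace every occurrence of `pair` in `ids` with `new_id`.
    out, i = [], 0
    while i < len(ids):
        if i < len(ids) - 1 and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out

def train(text, vocab_size):
    # Token ids 0..255 are the raw UTF-8 bytes; each merge adds one
    # new token id, so (vocab_size - 256) merges are performed.
    ids = list(text.encode("utf-8"))
    merges = {}  # (id, id) -> merged id, in order of creation
    for new_id in range(256, vocab_size):
        counts = get_pair_counts(ids)
        if not counts:
            break
        pair = max(counts, key=counts.get)
        ids = merge(ids, pair, new_id)
        merges[pair] = new_id
    return ids, merges
```

For example, training on `"aaabdaaabac"` with three merges first fuses `(97, 97)` (the pair "aa"), then `(256, 97)` ("aaa"), then `(257, 98)` ("aaab"), compressing the 11-byte input to 5 tokens. Encoding new text then means replaying `merges` in order; decoding means expanding each merged id back to its byte sequence.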
Chapters:
00:00:00 intro: Tokenization, GPT-2 paper, tokenization-related issues
00:05:50 tokenization by example in a Web UI (tiktokenizer)
00:14:56 strings in Python, Unicode code points
00:18:15 Unicode byte encodings, ASCII, UTF-8, UTF-16, UTF-32
00:22:47 daydreaming: deleting tokenization
00:23:50 Byte Pair Encoding (BPE) algorithm walkthrough
00:27:02 starting the implementation
00:28:35 counting consecutive pairs, finding most common pair
00:30:36 merging the most common pair
00:34:58 training the tokenizer: adding the while loop, compression ratio
00:39:20 tokenizer/LLM diagram: it is a completely separate stage
00:42:47 decoding tokens to strings
00:48:21 encoding strings to tokens
00:57:36 regex patterns to force splits across categories
01:11:38 tiktoken library intro, differences between GPT-2/GPT-4 regex
01:14:59 GPT-2 encoder.py released by OpenAI walkthrough
01:18:26 special tokens, tiktoken handling of, GPT-2/GPT-4 differences
01:25:28 minbpe exercise time! write your own GPT-4 tokenizer
01:28:42 sentencepiece library intro, used to train Llama 2 vocabulary
01:43:27 how to set vocabulary set? revisiting gpt.py transformer
01:48:11 training new tokens, example of prompt compression
01:49:58 multimodal [image, video, audio] tokenization with vector quantization
01:51:41 revisiting and explaining the quirks of LLM tokenization
02:10:20 final recommendations
02:12:50 ??? :)
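The early chapters on Unicode code points and byte encodings can be previewed in a few lines of Python. A quick sketch (the byte values shown assume Python's standard codecs, where "utf-16" and "utf-32" prepend a byte-order mark):

```python
s = "héllo"  # mixed ASCII and non-ASCII

# Code points: the abstract integer identity of each character.
print([ord(c) for c in s])      # [104, 233, 108, 108, 111]

# UTF-8 is variable-length: 'é' (U+00E9) becomes two bytes (0xC3 0xA9),
# which is why LLM tokenizers start from bytes, not characters.
print(list(s.encode("utf-8")))  # [104, 195, 169, 108, 108, 111]

# UTF-16/UTF-32 use fixed 2- and 4-byte units plus a 2/4-byte BOM,
# so this 5-character string occupies 12 and 24 bytes respectively.
print(len(s.encode("utf-16")), len(s.encode("utf-32")))  # 12 24
```

The lecture's tokenizer operates on the UTF-8 byte stream, so the 256 possible byte values form the base vocabulary before any BPE merges.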
Exercises: - Advised flow: reference this document and try to implement the steps yourself before I give away the partial solutions in the video. If you're getting stuck, the full solutions are in the minbpe code: https://github.com/karpathy/minbpe/bl…
“With the world growing more crowded, the great powers strive to conquer other planets. The race is on. The interplanetary sea has been charted; the first caravelle of space is being constructed. Who will get there first? Who will be the new Columbus?” A robot probe is being readied to explore the secrets of the red planet, Mars. The only component lacking: a human brain. No body. Just the brain. It is needed to deal with unexpected crises in the cold, dark depths of space. The perfect volunteer is found in Colonel Barham, a brilliant but hot-tempered astronaut dying of leukemia. But all goes awry as, stripped of his mortal flesh, Barham — or rather his disembodied brain — is consumed with a newly-found power to control…or destroy. Project psychiatrist Major McKinnon (Grant Williams) diagnoses the brain as having delusions of grandeur…but, just perhaps, Col. Barham has achieved grandeur.