
These days, we don’t have to wait long until the next breakthrough in artificial intelligence impresses everyone with capabilities that previously belonged only in science fiction.

In 2022, AI art generation tools such as OpenAI’s DALL-E 2, Google’s Imagen, and Stable Diffusion took the internet by storm, with users generating high-quality images from text descriptions.

Unlike previous developments, these text-to-image tools quickly found their way from research labs to mainstream culture, leading to viral phenomena such as the “Magic Avatar” feature in the Lensa AI app, which creates stylized images of its users.

Microsoft’s Kosmos-1 can take image and audio prompts, paving the way for the next stage beyond ChatGPT’s text prompts.

Microsoft has unveiled Kosmos-1, which it describes as a multimodal large language model (MLLM) that can respond not only to language prompts but also to visual cues. The model can be used for an array of tasks, including image captioning and visual question answering.

OpenAI’s ChatGPT has helped popularize the concept of LLMs, such as the GPT (Generative Pre-trained Transformer) model, and the possibility of transforming a text prompt or input into an output.

Morgan Stanley has been testing the artificial intelligence tool with 300 advisors and plans to roll it out widely in the coming months, according to Jeff McMillan, head of analytics, data and innovation at the firm’s wealth management division.

Morgan Stanley’s move is one of the first announcements by a financial incumbent after the success of OpenAI’s ChatGPT, which went viral late last year by generating human-sounding responses to questions. The bank is a juggernaut in wealth management with more than $4.2 trillion in client assets. The promise and perils of artificial intelligence have been written about for years, but seemingly only after ChatGPT did mainstream users understand the ramifications of the technology.

The idea behind the tool, which has been in development for the past year, is to help the bank’s 16,000 or so advisors tap the bank’s enormous repository of research and data, said McMillan.

Deep Learning (DL) advances have cleared the way for intriguing new applications and are influencing the future of Artificial Intelligence (AI) technology. However, a typical concern for DL models is their explainability, as experts commonly agree that Neural Networks (NNs) function as black boxes. We know that a given input is processed and that an output is produced, but we do not know precisely what happens in between. For this reason, DL models can be difficult to understand or interpret, and it can be challenging to work out why a model makes certain predictions or how to improve it.

This article will introduce and emphasize the importance of NN explainability, provide insights into how to achieve it, and suggest tools that could improve your DL model’s performance.

It’s Up!


We’ve created GPT-4, the latest milestone in OpenAI’s effort in scaling up deep learning. GPT-4 is a large multimodal model (accepting image and text inputs, emitting text outputs) that, while less capable than humans in many real-world scenarios, exhibits human-level performance on various professional and academic benchmarks. For example, it passes a simulated bar exam with a score around the top 10% of test takers; in contrast, GPT-3.5’s score was around the bottom 10%. We’ve spent 6 months iteratively aligning GPT-4 using lessons from our adversarial testing program as well as ChatGPT, resulting in our best-ever results (though far from perfect) on factuality, steerability, and refusing to go outside of guardrails.

Over the past two years, we rebuilt our entire deep learning stack and, together with Azure, co-designed a supercomputer from the ground up for our workload. A year ago, we trained GPT-3.5 as a first “test run” of the system. We found and fixed some bugs and improved our theoretical foundations. As a result, our GPT-4 training run was (for us at least!) unprecedentedly stable, becoming our first large model whose training performance we were able to accurately predict ahead of time. As we continue to focus on reliable scaling, we aim to hone our methodology to help us predict and prepare for future capabilities increasingly far in advance—something we view as critical for safety.

We are releasing GPT-4’s text input capability via ChatGPT and the API (with a waitlist). To prepare the image input capability for wider availability, we’re collaborating closely with a single partner to start. We’re also open-sourcing OpenAI Evals, our framework for automated evaluation of AI model performance, to allow anyone to report shortcomings in our models to help guide further improvements.

Google will soon offer ways to generate text and images using machine learning in its Workspace products as part of a scramble to catch up with rivals in the new AI race.

Google has announced a suite of upcoming generative AI features for its various Workspace apps, including Google Docs, Gmail, Sheets, and Slides.


Google is pumping its productivity apps full of AI.