
4 insightful thoughts on deep learning in 2022



We are leaving behind another year of exciting developments in artificial intelligence (AI) and deep learning, one filled with notable advances, controversies and, of course, disputes. As we wrap up 2022 and prepare to embrace what 2023 has in store, here are some of the more notable overall trends that marked this year in deep learning.

1. Scale is still an important factor

One theme that has remained constant in deep learning for the past few years is the drive to create larger neural networks. This scaling has been made possible by the availability of computing resources, specialized AI hardware, large datasets, and the development of scalable architectures such as the transformer model.

At the moment, companies are getting better results by scaling neural networks to larger sizes. In the past year, DeepMind announced Gopher, a large language model (LLM) with 280 billion parameters; Google announced the Pathways Language Model (PaLM), with 540 billion parameters, and the Generalist Language Model (GLaM), with up to 1.2 trillion parameters; and Microsoft and Nvidia released Megatron-Turing NLG, an LLM with 530 billion parameters.

One of the interesting aspects of scale is emergent abilities, where larger models manage to perform tasks that were impossible with smaller ones. This phenomenon has been especially evident in LLMs, which show promising results on a wider range of tasks and benchmarks as they grow in size.


However, it’s worth noting that some of the fundamental problems of deep learning remain unsolved, even in the largest models (more on that in a bit).

2. Unsupervised learning still pays off

Many successful deep learning applications rely on humans to label training examples, an approach known as supervised learning. But most of the data available on the internet does not come with the clean labels that supervised learning requires, and data annotation is expensive and time-consuming, creating bottlenecks. That's why researchers have long sought advances in unsupervised learning, where deep learning models are trained without the need for human-annotated data.

There has been a lot of progress in this field in recent years, especially in LLMs, which are mostly trained on large raw datasets collected from the internet. While LLMs continued to make progress in 2022, we also saw other unsupervised learning techniques gaining traction.
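To make the "training without human labels" idea concrete, here is a minimal sketch of the next-token objective that most LLMs are built on, where the raw text supplies its own labels. The toy corpus and whitespace "tokenizer" below are illustrative stand-ins, not a real training pipeline.

    # Raw text provides its own training signal: every prefix is an input,
    # and the token that follows it is the target, so no human labeling
    # is needed. (Toy example; real LLMs use learned subword tokenizers.)
    corpus = "deep learning models keep getting larger and more capable"
    tokens = corpus.split()  # toy "tokenizer": whitespace split

    training_pairs = [
        (tokens[:i], tokens[i])   # (context, next token)
        for i in range(1, len(tokens))
    ]

    for context, target in training_pairs[:3]:
        print(f"input: {context!r} -> target: {target!r}")

Scaled up to billions of documents, this is what lets a model learn language patterns from the raw web without anyone annotating a single example.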

For example, there have been phenomenal advances in text-to-image models this year. Models like OpenAI's DALL-E 2, Google's Imagen, and Stability AI's Stable Diffusion have shown the power of unsupervised learning. Unlike older text-to-image models, which required well-annotated pairs of images and descriptions, these models use large datasets of loosely captioned images that already exist on the internet. The sheer size of their training datasets (which is only possible because there is no need for manual labeling) and the variability of the caption schemes allow these models to find all sorts of intricate patterns between textual and visual information. As a result, they are much more flexible in generating images for various descriptions.
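As a rough illustration of how accessible these models have become, the sketch below generates an image from a text prompt. It assumes the open-source Hugging Face diffusers library, a publicly released Stable Diffusion checkpoint, and a CUDA GPU; the checkpoint name and prompt are illustrative choices, not the only way to run these models.

    # Minimal text-to-image sketch with Stable Diffusion via the
    # `diffusers` library (assumed installed, GPU assumed available).
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",   # an assumed public checkpoint
        torch_dtype=torch.float16,
    )
    pipe = pipe.to("cuda")

    # The loosely captioned, web-scale training data is what lets the
    # model handle free-form descriptions like this one.
    image = pipe("an astronaut riding a horse in the style of a woodcut").images[0]
    image.save("astronaut.png")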

3. Multimodality is advancing by leaps and bounds

Text-to-image generators have another cool feature: they combine multiple data types into a single model. Being able to process multiple modalities allows deep learning models to take on much more complicated tasks.

Multimodality is very important to the kind of intelligence found in humans and animals. For example, when you see a tree and hear the wind rustling in its branches, your mind can quickly associate them. Likewise, when you see the word “tree,” you may quickly conjure up an image of a tree, remember the smell of pine after a rain, or remember other experiences you’ve had before.

Evidently, multimodality has played an important role in making deep learning systems more flexible. This was perhaps best shown by DeepMind's Gato, a deep learning model trained on a variety of data types, including images, text, and proprioception data. Gato showed decent performance in multiple tasks, including image captioning, interactive dialogue, controlling a robotic arm, and playing games. This is in contrast to classic deep learning models, which are designed to perform a single task.

Some researchers have taken the notion so far as to propose that a system like Gato is all we need to achieve artificial general intelligence (AGI). Although many scientists disagree with this opinion, what is certain is that multimodality has brought important advances to deep learning.

4. Fundamental deep learning problems remain

Despite the impressive achievements of deep learning, some of the field's problems remain unsolved. Among them are causality, compositionality, common sense, reasoning, planning, intuitive physics, and abstraction and analogy-making.

These are some of the mysteries of intelligence that are still being studied by scientists in different fields. Pure data and scale-based deep learning approaches have helped make incremental progress on some of these problems without providing a definitive solution.

For example, larger LLMs can maintain coherence and consistency across longer stretches of text. But they fail at tasks that require meticulous step-by-step reasoning and planning.

Similarly, text-to-image generators create amazing graphics but make basic mistakes when asked to draw images that require compositionality or have complex descriptions.

These challenges are being discussed and explored by different scientists, including some of the pioneers of deep learning. Prominent among them is Yann LeCun, the Turing Award-winning inventor of convolutional neural networks (CNNs), who recently wrote a long essay on the limits of LLMs that learn only from text. LeCun is investigating a deep learning architecture that learns world models, which may address some of the challenges the field currently faces.

Deep learning has come a long way. But the more we progress, the more we realize the challenges of creating truly intelligent systems. Next year is sure to be as exciting as this one.




