What Do Neural Networks Really Learn? Exploring the Brain of an AI Model

What Do Neural Networks Really Learn? Exploring the Brain of an AI Model (www.youtube.com)
from UraniumBlazer@lemm.ee to technology@lemmy.world on 14 Jun 2024 18:36
https://lemm.ee/post/34609635

Neural networks have become increasingly impressive in recent years, but there’s a big catch: we don’t really know what they are doing. We give them data and ways to get feedback, and somehow, they learn all kinds of tasks. It would be really useful, especially for safety purposes, to understand what they have learned and how they work after they’ve been trained. The ultimate goal is not only to understand in broad strokes what they’re doing but to precisely reverse engineer the algorithms encoded in their parameters. This is the ambitious goal of mechanistic interpretability. As an introduction to this field, we show how researchers have been able to partly reverse-engineer how InceptionV1, a convolutional neural network, recognizes images.

#technology

PipedLinkBot@feddit.rocks on 14 Jun 2024 18:37

Here is an alternative Piped link:

https://piped.video/jGCvY4gNnA8?si=S4koY5QBcuSFEfbP

Piped is a privacy-respecting open-source alternative frontend to YouTube.

I’m open-source; check me out at GitHub.

QuarterSwede@lemmy.world on 15 Jun 2024 13:59

They absolutely do not learn and we absolutely do know how they work. It’s pretty simple.

Generative AI needs massive training sets that represent the kinds of things it’s asked to represent. Through the process of training, the AI learns the patterns in the data and can generate new data that fits within those patterns. It’s statistics all the way down. In the case of a Large Language Model (LLM), it’s always asking itself, “what’s the most likely next word to come after the previous one, and does that next word make sense within the context of the other words in the sentence?” The LLMs don’t necessarily understand a text as a text, that is, as a sequence of ideas unfolding logically, but rather as a set of tokens that carry statistical weights.

jasonheppler.org/2024/05/23/i-made-this/
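
To make the "next most likely word" idea concrete, here is a minimal sketch of a bigram model, which predicts the next word purely from counts of what followed the previous word. This is a deliberately simplified stand-in, not how an LLM is implemented: a real LLM conditions on the whole context with a neural network rather than a lookup table. The corpus and the function names (`train_bigram`, `next_word_probs`) are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def next_word_probs(counts, prev):
    """Turn the raw follow-counts for `prev` into probabilities."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    if total == 0:
        return {}  # word never seen, so the model has no prediction
    return {word: n / total for word, n in followers.items()}


# Toy corpus, invented for this example.
corpus = "the cat sat on the mat the cat ate the fish"
model = train_bigram(corpus)
probs = next_word_probs(model, "the")

# Pick the single most likely follower of "the".
best = max(probs, key=probs.get)
print(best, probs)
```

Even this toy version shows the "statistics all the way down" point: the prediction is nothing more than normalized counts from the training text.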

GamingChairModel@lemmy.world on 16 Jun 2024 21:45

Yes, but the tokens are more than just a stream of letters, and aren’t saved in the form of words. The information itself is organized by conceptual proximity to other concepts (distinct from the text itself), and weighted in a way consistent with the model’s training.

That’s why these models can use analogies and metaphors in a persuasive way, in certain contexts. Mix concepts in a combination the model was never shown during training, and these LLMs can still output something consistent with those concepts.
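
A rough sketch of what "conceptual proximity" can look like: concepts as vectors, with cosine similarity measuring closeness and vector arithmetic standing in for an analogy. The four-dimensional vectors below are hand-picked for illustration (roughly: royalty, male, female, fruit); real models learn embeddings with hundreds or thousands of dimensions that have no such tidy labels.

```python
import math


def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, invented for this example.
emb = {
    "king":   [1.0, 1.0, 0.0, 0.0],
    "queen":  [1.0, 0.0, 1.0, 0.0],
    "man":    [0.0, 1.0, 0.0, 0.0],
    "woman":  [0.0, 0.0, 1.0, 0.0],
    "banana": [0.0, 0.0, 0.0, 1.0],
}

# Analogy by arithmetic: king - man + woman should land near queen.
target = [k - m + w for k, m, w in zip(emb["king"], emb["man"], emb["woman"])]

# Search every word except the three used to build the target.
candidates = [w for w in emb if w not in ("king", "man", "woman")]
nearest = max(candidates, key=lambda w: cosine(target, emb[w]))
print(nearest)
```

In this toy space "king" sits closer to "queen" than to "banana", and the analogy resolves to a word that was never explicitly paired with the inputs, which is the kind of behavior the comment describes.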

Anthropic played around with their own model, emphasizing or deemphasizing particular concepts, and observed some unexpected behavior.
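
One simple way to picture "emphasizing or deemphasizing" a concept is to nudge an internal activation vector along a direction associated with that concept. The sketch below is only an illustration under that assumption and is not Anthropic's actual procedure; the vectors, the `steer` and `projection` helpers, and the strength values are all hypothetical.

```python
import math


def unit(v):
    """Scale a vector to length 1."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def projection(activation, direction):
    """How strongly the activation points along the concept direction."""
    return sum(a * d for a, d in zip(activation, unit(direction)))


def steer(activation, direction, strength):
    """Shift the activation by `strength` along the concept direction.

    Positive strength emphasizes the concept; negative deemphasizes it.
    """
    return [a + strength * d for a, d in zip(activation, unit(direction))]


# Made-up numbers: a 3-d "activation" and a "concept direction".
activation = [0.2, -0.5, 1.0]
concept = [0.0, 3.0, 4.0]

boosted = steer(activation, concept, 2.0)
damped = steer(activation, concept, -2.0)

print(projection(activation, concept))
print(projection(boosted, concept))
print(projection(damped, concept))
```

The shift changes only the component along the chosen direction, so everything downstream that reads that component sees the concept as more or less present.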

And we’d have trouble saying whether a model “knows” something if we don’t have a robust definition of when and whether a human brain “knows” something.