Advertisement

"Self-awareness" of LLM - Andrej Karpathy's in-depth explanation of LLM (Part 5)

On the Internet, users often ask LLM questions like"What model are you?"or"Who created you?"kinds of questions. However, these questions are essentiallyMeaningless, because LLMdoes not possess a continuous self-awareness like humans. LLM is not a human being and does not exist persistently in any way. Each time, it processes Tokens and then shuts itself down.

1. Why LLMs do not have a true "self"

  • LLM does not have a persistent existence: Each time a user interacts with the model, the model startsfrom zeroand runs. It has no memoryof previous conversations, nor doesa persistent sense of identity
  • The model is just a statistical machine for processing tokens: it followsthe statistical patterns of the training dataGenerate the most probable output rather than answering based on real "self-awareness".

Why do LLMs provide false self-introductions?

If a user asks an LLM"Who are you?"its response is often inaccurate. For example:

  • User:"Who are you?"
  • Old Falcon 7B model response:

Why is this?

  1. The LLM has not been explicitly informed of its true identity.so it willinfer answers based on training data
  2. bias in the training data
  • A large number of discussions about AI models on the Internet have mentioned“ChatGPT by OpenAI”the model mayLearn this formatand incorrectly apply it to itself.
  • Lack of correct self-definition data
    • If the model is not provided with information about its own identity during training, itcan only guess by statistical patterns,which may lead to hallucinations.

    3. Solution: How to enable LLMs to have the correct "self-awareness"

    Developers can do this in two waysto make the LLM state the correct identity information

    Method 1: Manually providing identity information in the training data

    • For example, the model from Allen AIAlMothrough"hard-coded" dialogue examplessolved this problem:
      • user"Who are you?"
      • Model"I am an open language model developed by Allen AI."
      • In OlMo's training dataset, it includes240 preset question-and-answer pairs
      • Since LLMs will learn these fixed-format answers, in the future when users ask similar questions, the model willparrotthese answers.

    Method 2: Use System Message

    • System message(System Message) isan instruction hidden in the dialogue contextthat allows LLMs to remember their roles. For example:
      • In ChatGPT, OpenAI may add aninvisible system message to the model.
        You are ChatGPT 4.0, trained by OpenAI with a knowledge cutoff date of September 2023.
      • In this way,the model receives this information at the start of each conversationand thus provides consistent responses.