Supply Chain Market Research - SCMR LLC

Nth Dimension

4/29/2025


AI is unusual in that while we (humans) develop the architectures and algorithms that make models work, we are not really sure how they do what they do.  But when we ask a model, in this case ChatGPT (OpenAI), to explain how it works, the model seems able to step back a bit in order to explain the details.  This step back puts the LLM we are talking with in the unusual position of describing how it works as if it were an observer rather than a model.  It can seem odd when a model describes itself by saying “models do this…”, sort of ignoring the fact that it is a model, but we digress…
What we were trying to understand when we started our conversation with ChatGPT was how a model represents information for each token as it learns.  We understand that software called a tokenizer, which runs ahead of the model itself, breaks text down into tokens.  A token is often a whole word, although in many cases it is a sub-word piece, such as a syllable or even a single character.  Each unique token is assigned an ID number, which goes into a master token ID list.
Example:
“The cat was running away from the dog.”
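To make the ID assignment concrete, here is a minimal sketch in Python.  It assumes a toy word-level tokenizer that lowercases the text and numbers tokens in order of first appearance.  The function names and the resulting ID values are our own illustration, not the IDs a real tokenizer such as ChatGPT's would assign.

```python
import re

def tokenize(text):
    """Split text into lowercase words and punctuation marks (toy rule, an assumption)."""
    return re.findall(r"\w+|[^\w\s]", text.lower())

def build_vocab(text):
    """Assign an ID to each unique token, in order of first appearance."""
    vocab = {}
    for tok in tokenize(text):
        if tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab

def encode(text, vocab):
    """Turn text into the list of token IDs the model would actually see."""
    return [vocab[tok] for tok in tokenize(text)]

sentence = "The cat was running away from the dog."
vocab = build_vocab(sentence)
ids = encode(sentence, vocab)
print(vocab)
print(ids)
```

Note that “the” appears twice in the sentence but gets only one entry in the token ID list, so the same ID shows up twice in the encoded sequence.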
The list of unique tokens is fixed before training, at roughly 100,000 tokens for a large model.  No matter how much data the model sees, it only uses tokens from this list, breaking unknown words down into smaller known sub-word pieces, even though the corpus of data the model sees could be 300 billion tokens.  The token ID list remains with the model after training, but the large body of tokens processed during training does not need to be stored, as the model learns from those tokens but does not need them later.
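The idea of breaking an unknown word into known pieces can be sketched as follows.  This is a simplified greedy longest-match split against a made-up vocabulary (TOY_VOCAB and split_into_subwords are hypothetical names).  Production tokenizers, such as those based on byte-pair encoding, learn their pieces from data and differ in detail, so treat this only as an illustration of the fixed-list principle.

```python
def split_into_subwords(word, vocab):
    """Greedy longest-match split of a word into pieces from a fixed vocabulary."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        # Shrink the candidate piece until it matches something in the vocabulary.
        while end > start and word[start:end] not in vocab:
            end -= 1
        if end == start:
            # No known piece starts here; emit a placeholder and move on one character.
            pieces.append("<unk>")
            start += 1
        else:
            pieces.append(word[start:end])
            start = end
    return pieces

# Made-up vocabulary for illustration only.
TOY_VOCAB = {"run", "ning", "cat", "dog", "s", "un", "known"}

for word in ["running", "unknown", "cats", "dogz"]:
    print(word, "->", split_into_subwords(word, TOY_VOCAB))
```

The vocabulary never grows: a word the list has never seen is still expressed using pieces that were already on the list.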
The part that is difficult to visualize comes as the tokens are first encountered by the model.  The model looks up each token's ID in the token list and matches it to a second list that contains that token's vector.  Think of a vector as a string of numbers (768 numbers for each token in a small model).
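In code, that second list is just a table with one row of numbers per token ID.  The sketch below assumes a toy vocabulary of 8 tokens and hypothetical token IDs for the example sentence.  The 768 comes from the text, while the random starting values and their scale (0.02) are our assumption for illustration.

```python
import random

EMBED_DIM = 768   # numbers per token, as in the small model described above
VOCAB_SIZE = 8    # toy vocabulary size (assumption; a real list is ~100,000)

rng = random.Random(0)  # fixed seed so the example is repeatable

# One row per token ID; every row starts out as random 'noise'.
embedding_table = [
    [rng.gauss(0.0, 0.02) for _ in range(EMBED_DIM)]
    for _ in range(VOCAB_SIZE)
]

def lookup(token_ids, table):
    """Replace each token ID with its row of numbers from the table."""
    return [table[i] for i in token_ids]

# Hypothetical IDs for "The cat was running away from the dog."
token_ids = [0, 1, 2, 3, 4, 5, 0, 6, 7]
vectors = lookup(token_ids, embedding_table)

print(len(vectors), "tokens, each with", len(vectors[0]), "numbers")
```

Both occurrences of “the” (ID 0) pull the same row from the table, so at this stage the model has no way of telling them apart.  That distinction only emerges in the later layers.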
On the first run through, the numbers in each token's vector are set to random values, essentially ‘noise’, and then the token sequence is passed to the first layer of the model.  Over the course of training, these vectors come to ‘classify’ each token.  If the model ‘sees’ that ‘cat’ and ‘dog’ often appear in similar contexts, training will nudge the vectors for both tokens slightly: after each pass through the layers, the model's prediction error is used to adjust the numbers, and the adjustment repeats over many training examples.  As a simplification, picture one particular dimension, which we might call the “animal” dimension, moving in the same direction for both the cat and dog tokens.  By the end of training, the ‘animal’ values for dog and cat will be close to each other, but not exactly the same, and the rest of their vectors will still differ.  That is how the model ‘knows’ that both dog and cat have the ‘animal’ relationship but are still different from each other.  If the whole vector were the same for both, the model would not know that while both are animals, they are different animals.
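The ‘close but not the same’ idea can be illustrated with a deliberately crude sketch.  Real models adjust vectors through backpropagation from a prediction error.  Here we stand in for that with a made-up rule (nudge_together, our own hypothetical helper) that moves two vectors a small fraction of the way toward each other at every step, and we measure closeness with cosine similarity.

```python
import math
import random

def cosine(a, b):
    """Cosine similarity: near 0 for unrelated vectors, 1.0 for identical directions."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nudge_together(a, b, rate):
    """Toy stand-in for a training step: move each vector slightly toward the other."""
    new_a = [x + rate * (y - x) for x, y in zip(a, b)]
    new_b = [y + rate * (x - y) for x, y in zip(a, b)]
    return new_a, new_b

DIM = 768
rng = random.Random(0)  # fixed seed so the example is repeatable

# Both tokens start as random 'noise'.
cat = [rng.gauss(0.0, 1.0) for _ in range(DIM)]
dog = [rng.gauss(0.0, 1.0) for _ in range(DIM)]

before = cosine(cat, dog)
for _ in range(20):  # 20 small nudges, standing in for many training examples
    cat, dog = nudge_together(cat, dog, rate=0.05)
after = cosine(cat, dog)

print("similarity before:", round(before, 3))
print("similarity after: ", round(after, 3))
```

The similarity climbs toward 1.0 but never quite reaches it in this sketch, which mirrors the point above: the two tokens end up related, yet still distinguishable.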
While this is a very simplistic look at how an LLM learns, one should understand that the model is always looking at the relationships between tokens, particularly in a sequence, and with over 700 dimensions of ‘characteristics’ for each token, even a small model can develop lots of connections between tokens.  It is tempting to think of each dimension as having a specific ‘name’, but the semantic information the dimensions contain is quite subtle, and it is usually spread across many dimensions at once.  It is all based on the relationships that the tokens have to each other, which are captured in the token vectors.
All in, this is just the tip of the iceberg in terms of understanding how models work and their positives and negatives, although even the best LLMs still have difficulty explaining how things work internally when the questions are highly specific.  Sometimes we think it’s because the model doesn’t really know how it works, and other times it seems that it just doesn’t want to give out proprietary detail.  But we will continue to dig and pass on what we find out and how it affects AIs and their use in current society.  More to come…
 

    Author

    We publish daily notes to clients.  We archive selected notes here, please contact us at: ​[email protected] for detail or subscription information.
