Supply Chain Market Research - SCMR LLC

The Physical World

5/28/2025


As we have noted previously, AI systems fundamentally operate as matching systems. During training, they establish a vast array of statistical relationships, which they then use to construct responses to queries. Each response is the one the model finds statistically most likely given its training data. This is a simplification, but models learn billions of relationships between tokens (such as letters or words) during training, which enables them to appear to "reason." Based on this training, a model might analyze a query by evaluating each token in relation to the tokens around it (e.g., the token before or after it, the last two tokens of the previous word, or even the last ten tokens), at almost any level of complexity, both forward and backward, to predict the most likely next token.
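To make the matching idea concrete, here is a deliberately tiny sketch. It assumes a made-up one-sentence corpus and looks only at the single preceding token (a bigram count), which is far simpler than what any real model does; the function names are ours, chosen for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each token follows each preceding token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, prev):
    """Return the most frequent follower of `prev`, or None if unseen."""
    followers = counts.get(prev)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical toy corpus, split on whitespace into word tokens.
corpus = "the ball rolls down the ramp and the ball stops".split()
model = train_bigram(corpus)

print(predict_next(model, "the"))    # "ball" follows "the" twice, "ramp" once
print(predict_next(model, "stops"))  # never seen as a preceding token
```

The toy model answers "ball" after "the" only because that pairing was the most frequent in its training text, not because it knows anything about balls or ramps.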
The use of the word ‘reason’ is really a stretch: the definition, “think, understand, and form judgments by a process of logic,” is correct in the ‘logic’ part but far off on ‘think’ and ‘understand’. There is no understanding, just the ability to use the relationships learned during training to come up with the most logical answer. This becomes very apparent in physics, where much of the work requires physical reasoning, and it seems that big models have considerable difficulty correctly solving physics problems. A group of researchers at the University of Michigan, the University of Toronto, and the University of Hong Kong decided to create a set of 3,000 physics questions to see if models were up to the task, even though the same models were able to solve Olympiad mathematical problems with human-level accuracy on standard benchmarking platforms.
The researchers used six physics domains: Mechanics, Electromagnetism, Thermodynamics, Wave/Acoustics, Optics, and Modern Physics. Before we go further, we should admit that we were quickly humbled upon seeing even the simplest of the 3,000 questions. That said, physical problem-solving differs fundamentally from pure mathematical reasoning or science-knowledge question answering because it requires models to decode implicit conditions in the questions (e.g., interpreting "smooth surface" as meaning the coefficient of friction equals zero) and to maintain physical consistency, as the laws of physics don’t change with different reasoning pathways. Physics also requires visual perception in a way that mathematics does not, which presents a challenge for large models. The new benchmark that the researchers developed is not only 50% open-ended questions but also has 3,000 unique images that the model must decipher.
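As a rough illustration of what decoding an implicit condition involves, the sketch below turns the phrase "smooth" into a friction coefficient of zero and then applies the standard result for a block sliding down an incline, a = g(sin θ − μ cos θ). The question text, the keyword table, and the function names are our own assumptions; a keyword lookup is only a stand-in and is not how the benchmark or any model works.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)

# Hypothetical lookup: wording in a question -> the physical condition it implies.
IMPLICIT_CONDITIONS = {"smooth": {"mu": 0.0}}


def decode_conditions(question):
    """Collect the parameters implied by phrases found in the question text."""
    params = {}
    for phrase, implied in IMPLICIT_CONDITIONS.items():
        if phrase in question.lower():
            params.update(implied)
    return params


def incline_acceleration(angle_deg, mu):
    """Acceleration of a block sliding down an incline with kinetic friction mu."""
    theta = math.radians(angle_deg)
    return G * (math.sin(theta) - mu * math.cos(theta))


# Made-up example question; the friction coefficient is never stated outright.
question = "A block slides down a smooth surface inclined at 30 degrees. Find its acceleration."
params = decode_conditions(question)
a = incline_acceleration(30.0, params["mu"])
print(f"{a:.3f} m/s^2")
```

The arithmetic is the easy part. The step a model has to get right is recognizing that "smooth" supplies a number the question never states.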
New models are foisted into public scrutiny almost daily, invariably along with benchmarks that prove the new model is not only better than previous models but makes us humans look like we had trouble with 5th grade math. When it comes to physics, however, we will just show the scores and let the table below do the talking. The experts indicated below were undergraduate and graduate physics students who were divided into 3 groups based on their answers to 18 classification questions.
[Table: benchmark scores for human experts and models]
Shen, Hui, et al. “PhyX: Does Your Model Have the ‘Wits’ for Physical Reasoning?” arXiv, 2025.
We cannot say why the models did not fare well, but we can describe the types of errors that were found:
  • Visual Reasoning Errors (39.6%) – Failure of the model to correctly extract visual information.
  • Text Reasoning Errors (13.5%) – Incorrect processing or interpretation of textual content.
  • Lack of Knowledge (38.5%) – Incomplete understanding of specific domain knowledge.
  • Calculation Errors (8.3%) – Mistakes in arithmetic operations or unit conversions.
 
All in, typical benchmarks overlook physical reasoning, which requires integrating domain knowledge and real-world constraints, difficult tasks for models that don’t live in the real world. Relying on memorized information, superficial visual patterns, and mathematical formulas does not generate real understanding. The researchers note that while schematics and textbook-style illustrations might be suitable for evaluating conceptual reasoning, they might not capture the complexity of perception in natural environments. You have to live it to understand it.