Supply Chain Market Research - SCMR LLC

Will AI Cause the End of Social Media?

1/14/2025


We have written a number of times about deepfakes, those seemingly real images created by AI systems.  They are, at the least, annoying, but they also erode what confidence in social media’s ‘validity’ currently exists.  If you cannot believe what you see, why bother to look, especially as AI systems improve their ability to create ever more convincing images?  There will always be those who don’t care whether an image is real, as long as it provides a few moments of entertainment and perhaps generates some attention for those who circulate it, but while deepfakes are a problem, and a difficult one to solve, there is a bigger one.
In order for models to increase their accuracy, they need more examples.  This could mean more text from famous authors; more annotated images of gazebos, dogs, ships, or flagpoles; or examples of even more specific data, such as court cases or company financial information.  Current large language models (LLMs) are trained on text and code datasets containing trillions of words, but the constantly expanding sum total of human text is loosely estimated in the quadrillions, so even a massive training dataset would represent only about a tenth of a percent of the corpus of human text.  It would seem that model builders running out of training data should be a concern only for the far future, but that is not the case.
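To make the scale concrete, here is the back-of-the-envelope arithmetic; both figures are loose assumptions on the order the estimates above suggest, not measured values:

```python
# Back-of-the-envelope only; both figures below are rough assumptions.
training_words = 10 * 10**12      # a large training set: ~10 trillion words
human_corpus_words = 10 * 10**15  # total human text: ~10 quadrillion words

share = training_words / human_corpus_words
print(f"Training data covers {share:.3%} of the corpus")  # prints ~0.100%
```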
Models are now able to scrape information from the internet, which is eventually added to their training data when they are fine-tuned or updated.  The next iteration of the model is trained recursively, using the previous model’s expanded dataset: Model V.2 generates output based on Model V.1’s original training data plus what it found on the internet, Model V.3 uses Model V.2’s expanded dataset, including what it finds on the internet, to derive its own output, and subsequent models continue that process.  This means that while Model V.1 was originally trained on ‘reality’, adding data from the internet, which we might loosely call ‘mostly realistic’, taints that model’s output slightly, say from ‘realistic’ to ‘almost completely realistic’.  Model V.2’s input is now ‘almost completely realistic’ but its output is only ‘mostly realistic’, and with that as input, the next iteration, Model V.3, produces output that is merely ‘somewhat realistic’ (a toy version of the chain appears below).
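As a toy version of that chain, suppose each model reproduces its training data with some fixed loss of fidelity, and each new version trains on its predecessor’s output.  The loss factor here is an invented number, purely to make the ‘realistic’ to ‘somewhat realistic’ slide concrete:

```python
fidelity = 1.0        # Model V.1's training data: original, human 'reality'
PER_GEN_LOSS = 0.9    # hypothetical share of realism each generation retains

for version in range(1, 6):
    print(f"Model V.{version} trains on data at fidelity {fidelity:.2f}")
    fidelity *= PER_GEN_LOSS  # its output becomes the next version's input
```

With these made-up numbers, Model V.5 is already training on data that retains only about two-thirds of the original realism; any anchor of fresh human data slows the slide but does not stop it once synthetic data dominates.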
Of course, these labels only illustrate the concept, but they do represent the degenerative process that can occur when models are trained on ‘polluted’ data, particularly data created by other models.  The result is model collapse, which can happen relatively quickly as the model loses information about the distribution of its own data over time; rare cases in the tails of the distribution disappear first, and each generation’s output grows narrower.  Google (GOOG) and other model builders have noted the risk and have tried to limit internet articles and data to more trustworthy sources, although that is a subjective classification, but as the scale of LLMs continues to increase, the need for more training data will inevitably lead to the inclusion of data generated by other models, and some of that data will come without provenance.
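The statistical core of model collapse can be seen in a few lines: fit a distribution to some samples, then fit the next generation only to samples drawn from that fit, and repeat.  This sketch is our own illustration using a simple Gaussian, not anything Google or another model builder has described, but it shows how the estimated spread wanders away from the original and tends to shrink, losing the tails:

```python
import random
import statistics

random.seed(42)
mu, sigma = 0.0, 1.0   # generation 0: the 'real' data distribution
n = 50                 # a small sample per generation exaggerates the drift

for generation in range(301):
    # Each generation sees only the previous generation's output...
    samples = [random.gauss(mu, sigma) for _ in range(n)]
    # ...and the next generation is fit to those samples alone.
    mu = statistics.fmean(samples)
    sigma = statistics.pstdev(samples)
    if generation % 60 == 0:
        print(f"generation {generation:3d}: mu={mu:+.3f}, sigma={sigma:.3f}")
```

Run with different seeds the estimates drift differently each time, but the bias is always in one direction: the fitted spread is, on average, slightly narrower every generation, which is the telephone-game effect in miniature.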
There is the possibility that the AI community will coordinate efforts to certify the data being used for training, or the data being scraped from the internet, and will share that information.  But at least at this point in the AI cycle, model builders cannot even agree on what data needs to be licensed and what does not, so it would seem that adding internet data will only hasten the degradation of LLM effectiveness.
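If such coordination ever happened, one building block might look like the sketch below: a trusted curator signs each document’s hash so that model builders can verify, before training, that a record is what the curator vouched for.  Everything here, the key handling, the record format, even the assumption that a curator exists, is our illustration, not an existing industry standard:

```python
import hashlib
import hmac

CURATOR_KEY = b"stand-in-signing-key"  # hypothetical; a real scheme would use
                                       # proper key management, not a constant

def certify(document: bytes) -> str:
    """Return 'digest:signature' vouching for this exact document."""
    digest = hashlib.sha256(document).hexdigest()
    signature = hmac.new(CURATOR_KEY, digest.encode(), hashlib.sha256).hexdigest()
    return f"{digest}:{signature}"

def verify(document: bytes, certificate: str) -> bool:
    """Check that the document is unmodified and was signed by the curator."""
    digest, signature = certificate.split(":")
    if hashlib.sha256(document).hexdigest() != digest:
        return False  # content changed after certification
    expected = hmac.new(CURATOR_KEY, digest.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(signature, expected)

cert = certify(b"A human-written article.")
print(verify(b"A human-written article.", cert))    # True
print(verify(b"A model-generated rewrite.", cert))  # False
```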
How does this affect social media?  It doesn’t.  Social media has a low common denominator.  The point of social media is not to inform and educate; it is to entertain and communicate, so there will always be a large community of social media users who don’t care whether what they see is accurate or even real, as long as it can capture attention for some finite period of time and possibly generate status for the poster, regardless of its accuracy.  Case in point: the ‘Child under the Rubble’ photo we showed recently, or the image of the Pentagon on fire that circulated months ago, both of which were deepfakes.
In fact, we believe it is easier to spot deepfakes than it is to spot inaccuracies or incorrect information in text from LLMs, as specific subject knowledge is required to check each LLM output statement.  It is a scary thought that while the industry predicts model accuracy will continue to improve until models far surpass human response accuracy, there is the potential that models will slowly (or rapidly) lose sight of the hard data on which they were originally trained as it becomes polluted with less accurate and less realistic data, rather like the old grade-school game of telephone.  If that is the case, social media will continue, but the value of LLMs will diminish.
 
Figure 5 - The Telephone Game - Source: Norman Rockwell

    Author

    We publish daily notes to clients.  We archive selected notes here; please contact us at [email protected] for details or subscription information.

