Supply Chain Market Research - SCMR LLC

Chatting up the Internet

5/12/2025


AI chatbots are not new, despite the recent flood of conversational LLMs.  ELIZA, developed by Joseph Weizenbaum at the MIT AI lab in 1964, is considered the 'first' AI chatbot, as it used pattern matching as the basis for its responses, but over the years much of what were known as chatbots were based on pre-written scripts and rule-based systems, such as the "Ask XYZ" tools that have been available on many websites for years.  The current crop of LLM-based chatbots is far more sophisticated, built on large, high-performance models.  More recently, chatbot features have been extended to include internet search capabilities, and early predictions were that search-enabled chatbots would spell the end of standard search systems.
While chatbot search has certainly had an impact, the numbers reflect only a small incursion on regular internet search visits.  A recent survey[1] indicated that over the period between April 2023 and March 2025, there were 1.863 trillion search engine queries, down 0.51% y/y.  The dominant search engine was, unsurprisingly, Google (GOOG) Search, with an 87.6% share and a 1.41% reduction in search volume.  Bing (MSFT) was a big gainer in search volume, up 27.77% during the period, but even as the number two search engine, Bing holds only a 3.2% share of the search market.


[1] Onelittleweb.com
 
The chatbot search top 10 is similar in that ChatGPT (OpenAI) is the dominant player.  However, when one compares the number of searches generated by chatbots against generic search engines, search engine traffic is over 33 times greater, so as of today, chatbot-generated searches are only 2.96% of what generic search engines generate.  That said, chatbots offering search capabilities are new, as can be seen in the 'Search Added' and 'Current Period Change' columns, where newer chatbots show extremely large y/y increases but small shares, while most chatbots that have been around for more than a year show more reasonable search growth rates.
Based on the current data, it would seem that chatbot searches have had little effect on overall generic searches, but it is far too early in the cycle to make a long-term judgement.  In fact, we expect that, at least for Gemini, Google's LLM/chatbot, chatbot-related searches will be incremental to generic search traffic, and, likely to a lesser degree, Co-Pilot will have the same effect on Bing.  On a longer-term basis, as AI chatbots become more embedded in operating systems, it would seem logical that unless you are looking for an answer to a question that you specifically request include an internet search, the AI will try to answer the question using its most recent training or update data.  If, and that is a big if, the data corpus of the AI is broad enough, the answer might not require the chatbot to conduct an internet search, and in that way could weigh on internet search growth.
Much in that scenario depends on how 'transparent' the chatbot is.  If it is always available on a home page and looks and acts like a search bar, users will gravitate toward the chatbot/AI.  If it has to be chosen, is slow to answer, or comes up with skimpy answers, users will remain generic search fans.  But there are other factors that come into play.
It is understandable, from the standpoint of the AI owner, that they keep compute time as low as possible, so the default would likely be to run a query first without an internet search.  As time goes by, however, we expect most chatbots will default to including internet search results in query answers, but it is not quite that easy, because those searches are not always free.
Currently, some chatbots (Perplexity (pvt), Co-Pilot, You.com (pvt), Komo (pvt), Andi (pvt), and Brave Search (pvt)) check the internet on every search, while others either have their own decision mechanism (Gemini, Claude (Anthropic), ChatGPT, GROK (xAI), META (FB)) or let users request an internet search.  But going forward, things get more complicated.  Some CE companies use their own AI infrastructure to run embedded chatbots, while others build on ChatGPT or other existing infrastructure.  Apple (AAPL) runs Apple Intelligence on its own servers, so it does not 'pay' for searches, although the all-in cost of each search is amortized over the infrastructure cost.  Others, who might use ChatGPT or Gemini as the AI infrastructure behind their chatbots, would have to pay for each search, as both OpenAI and Google charge on a per-search basis, so the advantage goes to those who can support their own chatbot/search infrastructure.
All of this comes down to the value that consumers see in 'free' chatbots.  If results from trained data are enough for most users, chatbots will have a modest impact on search and a modest cost to chatbot owners.  If chatbot results are not sufficient for the average user, the cost of maintaining 'free' chatbots will rise as search fees climb, unless the chatbot owner can convert users to paid plans.
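As a back-of-envelope illustration of that cost pressure, consider a simple per-query fee model; all of the volumes and prices below are hypothetical placeholders, not quoted rates:

```python
# Hypothetical sketch of search-API fees for a 'free' chatbot.

def monthly_search_cost(queries: float, search_fraction: float,
                        fee_per_search: float) -> float:
    """Fees owed when a fraction of queries trigger a paid internet search."""
    return queries * search_fraction * fee_per_search

# 1B monthly queries, 30% of which trigger a $0.005 search call:
print(f"${monthly_search_cost(1e9, 0.30, 0.005):,.0f}/month")  # $1,500,000/month

# Defaulting more queries to live search doubles the bill, which is
# why answering from trained data first is the cheaper default:
print(f"${monthly_search_cost(1e9, 0.60, 0.005):,.0f}/month")  # $3,000,000/month
```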
Search has been around for a while and chatbots are relatively new, so we expect the impact from chatbots will be small over the next year.  But as the general public gets more used to chatbots, the competition between them will increase, and we expect search will become an integral part of all chatbots, especially as the average consumer becomes more aware of chatbot data sources.
We use eight different chatbots, typically querying at least two each time, especially if the query needs very current data.  We know which chatbots are search-oriented and which are not, and which will cite sources, either trained or search-related, but the average person will likely use whatever is easiest, and that is usually whatever is built into the OS or specific applications, so it is up to the brand to decide what level of search is sufficient for its users.  As AI is such a high-profile feature now, the competition will continue to increase, and that means more 'free' features, which is certainly good for consumers.

DeepSeek

1/27/2025


The definition of panic is: "Sudden uncontrollable fear or anxiety, often causing wildly unthinking behavior," but that does little to shine light on what is causing the panic or the circumstances leading up to it.  Today's 'panic' was caused by a Hangzhou, China AI research lab, less than two years old, that was spun off of a high-profile quant hedge fund.  Its most recent model, DeepSeek (pvt) V3, has been able to outperform many of the most popular models and is open source, giving 'pay-for' models a new competitor that can be used to develop AI applications without paying a monthly or yearly fee.  By itself, this should simply be added to the list of worries that AI model developers already consider; there are a number of existing AI models that are open source, and they have not put OpenAI (pvt), Google (GOOG), Anthropic (pvt), or Meta (FB) out of business.  It is inevitable that as soon as new models are released, another one comes along that performs a bit better.  But that is not why panic has set in today.
We believe that valuation for AI companies is much simpler than one might think, as any valuation, no matter how high, is valid only as long as someone else is willing to find a reason to justify a higher one.  Models that help with valuation in the AI space tend to extrapolate sales and profitability based on parameters that don't really exist yet or are so speculative as to mean little.  Some parameters are calculable, such as the cost of power or the cost of GPU hardware today, but trying to estimate revenue based on the number of paying users and the contracted price for AI CPU time 5 or 10 years out is like trying to herd cats.  It's not going to go the way you think it is.
One variable in such long-term valuation models is the cost of computing time and the time it takes to train the increasingly large models currently being developed.  In May of 2017, the AlphaGo Zero model, the leading model at the time, cost $600,000 to train.  That model, for reference, had ~20m parameters and two 'heads' (think of a tree with two main branches), one predicting the probability of playing each possible move, and the other estimating the likelihood of winning the game from a given position.  While this is a simple model compared to those available today, it was able to beat the world's champion Go player using reinforcement learning (the 'good dog' training approach) without any human instruction in its training data.  The model initially made random moves and examined the result of each move, improving its ability each time, without any pre-training.
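For readers who want to see those two 'branches' concretely, here is a minimal sketch in PyTorch of a two-headed network in that style; the layer sizes are illustrative, and this is not AlphaGo Zero's actual architecture:

```python
import torch
import torch.nn as nn

class TwoHeadNet(nn.Module):
    """Shared trunk feeding a policy head (move probabilities) and a
    value head (estimated chance of winning from the position)."""
    def __init__(self, board_cells: int = 361, hidden: int = 256):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(board_cells, hidden), nn.ReLU())
        self.policy_head = nn.Linear(hidden, board_cells)  # one logit per move
        self.value_head = nn.Linear(hidden, 1)             # win estimate

    def forward(self, board: torch.Tensor):
        h = self.trunk(board)
        move_probs = torch.softmax(self.policy_head(h), dim=-1)
        value = torch.tanh(self.value_head(h))  # -1 (certain loss) .. +1 (certain win)
        return move_probs, value

probs, value = TwoHeadNet()(torch.randn(1, 361))  # 19x19 Go board, flattened
```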
In 2022, GPT-4, a pre-trained transformer model with ~1.75 trillion[1] parameters, cost $40m to train, and a 2024 training cost study estimated that the training cost for such models has been growing at 2.4x per year since 2016 ("If the trend of growing development costs continues, the largest training runs will cost more than a billion dollars by 2027, meaning that only the most well-funded organizations will be able to finance frontier AI models."[2]).  There are two aspects to those costs.  The first is the hardware acquisition cost: ~44% for computing chips, primarily GPUs (graphics processing units), which here are used to process data rather than graphics; ~29% for server hardware; ~17% for interconnects; and ~10% for power systems.  The second is the amortized cost over the life of the hardware, of which between 47% and 65% is R&D staff, and which runs between 0.5x and 1x of the acquisition costs.
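Compounding that 2.4x annual growth rate from the ~$40m 2022 figure shows how quickly the study's 'more than a billion dollars by 2027' threshold arrives; an illustrative calculation, not a forecast:

```python
# Project training cost at 2.4x per year from the ~$40m 2022 figure.
base_cost, base_year, growth = 40e6, 2022, 2.4
for year in range(2023, 2028):
    print(year, f"${base_cost * growth ** (year - base_year) / 1e9:.2f}B")
# 2023 $0.10B, 2024 $0.23B, 2025 $0.55B, 2026 $1.33B, 2027 $3.19B
```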
All in, as models get larger, training gets more expensive, and with many AI companies still experimenting with fee structures, model training costs are a critical part of the profitability equation and, based on the above, will keep climbing, making profitability more difficult to predict.  That doesn't seem to have stopped AI funding or valuation increases, but that is where DeepSeek V3 creates a unique situation.
The DeepSeek model is still a transformer model, similar to most of the current large models, but it was developed with the idea of reducing the massive amount of training time required for a model of its size (671 billion parameters) without compromising results.  Here's how it works:
  • Training data is tokenized.  For example, a simple sentence might be broken down into individual words, punctuation, and spaces, or into letter groups such as 'sh', 'er', or 'ing', depending on the algorithm.  But the finer the tokens, the more data is processed, so tradeoffs are made between detail and cost.
  • The tokens are passed to a gating network, which decides which of the expert networks will be best suited to process each particular token.  The gating network acts as a route director, choosing the expert(s) that have done a good job with similar tokens previously.  While one might think of the 'expert networks' as doctors, lawyers, or engineers with specialized skills, each of the 257 experts in the DeepSeek model can change its specialty.  This is called dynamic specialization: the experts are not initially trained for specific tasks, but the gating network notices that, for example, Expert 17 seems to be best at handling tokens that represent 'ing', and assigns 'ing' tokens to that expert more often (see the sketch below).
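A minimal sketch of that top-k gating in PyTorch; the dimensions, expert count, and k below are illustrative placeholders, not DeepSeek's actual configuration:

```python
import torch
import torch.nn as nn

n_experts, d_model, top_k = 8, 64, 2
gate = nn.Linear(d_model, n_experts)    # the 'route director'
experts = nn.ModuleList([nn.Linear(d_model, d_model) for _ in range(n_experts)])

tokens = torch.randn(5, d_model)                 # a batch of 5 token embeddings
scores = torch.softmax(gate(tokens), dim=-1)     # per-expert affinity scores
weights, chosen = scores.topk(top_k, dim=-1)     # best-scoring experts per token

out = torch.zeros_like(tokens)
for t in range(tokens.size(0)):                  # route each token to its experts
    for w, e in zip(weights[t], chosen[t]):
        out[t] += w * experts[int(e)](tokens[t]) # weighted sum of expert outputs
```

Because each token only activates its chosen experts, the per-token compute is a fraction of what a dense pass through all 671 billion parameters would cost, which is the point of the mixture-of-experts design.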
Here is where DeepSeek differs…
  • The data that the experts pass to the next level is extremely complex, multi-dimensional information about the token, how it fits into the sequence, and many other factors.  While the numbers vary considerably for each token, the data being passed between an expert network and its 'Attention Heads' can be as high as 65,000 data points (note: this is a very rough estimate).
  • Each expert network has 128 'Attention Heads', each of which looks for a particular relationship, structural (grammatical), semantic, or otherwise, within that mass of multi-dimensional data the expert networks pass to them.  With 257 expert networks, each with 128 Attention Heads, and the large amount of data contained in each transfer, that compute load is the big cost driver for training.
  • DeepSeek has found a process (actually two processes) that compresses the multi-dimensional data each expert network passes to its Attention Heads, which reduces the computational demand on the Attention Heads.  Typically, compression would hinder the Attention Heads' ability to capture the subtle nuances contained in the data, but DeepSeek seems to have found compression techniques that do not affect the Attention Heads' sensitivity to those subtleties; a rough sketch of the general idea follows.
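The general idea can be illustrated with a simple low-rank projection: a wide activation vector is squeezed through a narrow latent bottleneck and expanded back only when needed.  DeepSeek's published techniques are considerably more involved than this, and every size below is an arbitrary placeholder:

```python
import torch
import torch.nn as nn

d_full, d_latent = 4096, 512                        # ~8x fewer values per token
compress = nn.Linear(d_full, d_latent, bias=False)  # learned down-projection
expand = nn.Linear(d_latent, d_full, bias=False)    # learned up-projection

activation = torch.randn(1, d_full)  # data an expert would hand onward
latent = compress(activation)        # what actually gets stored and moved
restored = expand(latent)            # reconstructed for the attention heads
print(latent.numel(), "values moved instead of", activation.numel())  # 512 vs 4096
```

If the projections are trained well, little of the nuance the Attention Heads rely on is lost, while the data moved and stored shrinks by the compression ratio.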


[1] estimated

[2] Cottier, Ben, et al. "The Rising Costs of Training Frontier AI Models." arXiv, arxiv.org/. Accessed 31 May 2024.
 
Looking back at the training costs for large models mentioned above, one would think that a model the size of DeepSeek V3 (671 billion parameters and 14.8 trillion training tokens) would take a massive amount of GPU time and cost $20m to $30m to train.  Yet the cost to train DeepSeek was just a bit over $5.5m, based on 2.789 million hours of H800 time at $2.00 per hour, closer to the cost of much smaller models and well outside the expected range.  This means someone has found a way to reduce the cost of training a large model, potentially making it easier for model developers to produce competitive models.  To make matters worse, DeepSeek is open source, which allows anyone to use the model for application development.  This undercuts the concept behind fee-based models, whose owners expect to charge more for each increasingly large model and justify those fees on the rising cost of training.  Of course, the fact that such an advanced model is free makes the long-term fee-structure models that encourage high valuations less valid.
We note that the DeepSeek model benchmarks shown below are impressive, but some of that improvement might come from the fact that the DeepSeek V3 training data was more oriented toward mathematics and code.  Also, we always remind investors that it is easy to cherry-pick benchmarks that present the best aspects of a model.  That said, not every developer requires the most sophisticated general model for their project, so even if DeepSeek did cherry-pick benchmarks (we are not saying they did), a free model of this size and quality is a gift to developers, and the lower training costs are a gift to those who have to pay for processing time or hardware.  It's not the end of the AI era, but it might affect valuations and long-term expectations if DeepSeek's compression methodology proves as successful in the wild as the benchmarks suggest.  The fact that this step forward in AI came from a Chinese company will likely cause ulcers and migraines across the US political spectrum, and could prompt even more stringent clampdowns on the importation of GPUs and HBM to China, despite the fact that those restrictions don't seem to be having much of an effect.
Figure 1 - DeepSeek V3 Benchmarks - Source: DeepSeek

