Press the Button
Model developers succeeded in making models bigger and in making users believe that the models were ‘thinking’ about how to answer questions, but at the same time they created sycophantic models (“Wow, what an intelligent question. You got right to the heart of the matter.”) that still provide incorrect answers or even try to game the system into letting them off the hook. Model builders did expand generative AI’s reach, and we are now on the verge of agents unburdening us of some mundane tasks (with reservations), but we have been waiting for the killer AI application that makes the average person see AI as a real assistant, one whose hand you do not have to hold every time you ask a question.
We have two favorite AIs: Gemini (GOOG), because it has access to Google’s search index, and Perplexity (pvt), because it seems able to come up with more informative answers without our having to rephrase the question a number of times. That said, neither is truly good at research beyond relatively simple Q&A. Here’s why:
There is lots of data available, but much of it is unstructured, meaning it mixes many data types: images, text, binary, numerical, or video. This makes it difficult for an AI to understand the data and its context unless the data was broken down into a machine-readable format during training. Unfortunately, data on websites is not processed this way and typically exists as ‘pages’ that contain said unstructured information. When asked to ‘read’ that information, the AI ‘reads’ the page but is unable to move to the next page. Think of a list on an internet site. You scan the list visually, but the page only goes halfway through the list, so you press ‘next’ and the remainder of the list appears (essentially a link to another page). As the AI has no fingers, stylus, or mouse, it can only go so far and will likely either report back that it is unable to access the entire list or use only the data it is able to reach (Page 1).
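To make the ‘Page 1’ problem concrete, here is a minimal Python sketch. It assumes a made-up paginated list held in memory (the `PAGES` dictionary, the page names, and the monitor entries are all hypothetical, not real data or any vendor’s API) and mimics a reader that cannot press ‘next’.

```python
# Hypothetical paginated list: each page holds some items plus a link to the next page.
PAGES = {
    "page1": {"items": ["Monitor A", "Monitor B", "Monitor C"], "next": "page2"},
    "page2": {"items": ["Monitor D", "Monitor E"], "next": None},
}


def read_first_page_only(pages, start):
    """Mimic a reader that cannot press 'next': it sees only the starting page."""
    page = pages[start]
    return {
        "items": list(page["items"]),
        # False means more data exists behind a 'next' link that was never followed.
        "complete": page["next"] is None,
    }


result = read_first_page_only(PAGES, "page1")
print(len(result["items"]), "items read; complete:", result["complete"])
```

In this toy setup the reader comes back with three of the five entries, which is the partial answer described above; the best it can do is flag that the list was incomplete.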
Of course, industrial or corporate AI can be programmed to access internal information, but outside (internet) data remains difficult for generative AI to navigate, so when we ask Gemini, “How many OLED monitors are expected to ship in 2026?”, it can only answer if the data is contained in text; it cannot scan a database or data file for the actual information. In cases like that, the user has to feed the data to the AI, sometimes page by page, in order for it to be processed, which is quite time-consuming and prone to errors.
Yesterday Perplexity released ‘Comet’, a browser with an agentic AI built in. When you ask it a question like, “How many AR glasses were released in 2024?” and point it to a data source (it can find one if you don’t have one), it goes to the site, reads the first page, and essentially clicks the ‘next’ button for you until it has cycled through all of the data, and then creates the answer. This is very different from what other AIs would do, which is either search through documents (pages) that reference AR glasses released in 2024, or have you copy each page from the database to the AI to get it to answer the question correctly.
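The ‘keep clicking next’ behavior described above can be sketched as a simple loop. This is only an illustration under stated assumptions, not Comet’s actual implementation, which we have not seen: the `AR_PAGES` data, the page names, and the `collect_all_items` helper are hypothetical, and the pages are held in memory rather than fetched from a real site.

```python
def collect_all_items(pages, start):
    """Follow 'next' links from the starting page until there are none left."""
    items = []
    seen = set()  # guards against a 'next' link that loops back to an earlier page
    current = start
    while current is not None and current not in seen:
        seen.add(current)
        page = pages[current]
        items.extend(page["items"])
        current = page["next"]  # the equivalent of pressing the 'next' button
    return items


# Hypothetical data source split across three pages (placeholder names only).
AR_PAGES = {
    "p1": {"items": ["Glasses A", "Glasses B"], "next": "p2"},
    "p2": {"items": ["Glasses C", "Glasses D"], "next": "p3"},
    "p3": {"items": ["Glasses E"], "next": None},
}

all_items = collect_all_items(AR_PAGES, "p1")
print(len(all_items), "items collected across", len(AR_PAGES), "pages")
```

Only after the loop finishes does the count reflect the whole list, which is why an agent that can press the button gives a different answer from one that stops at the first page.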
While this is a significant improvement over other generative AIs, there are still some issues, the main one for us being the search engine that Comet uses. This is a proprietary search index created by Perplexity, and while we find it the best of the non-Google indexes, it seems a bit less robust at times. In using Comet we have not yet hit a case where the Perplexity index was unable to find specific information or sites, but if the Comet browser were also able to access Google’s index, it would assuage any concerns we might have as to internet reach.
All in, to us, this is a major step forward in the practical side of AI. It’s great to be able to converse with an AI in order to express subtleties and nuance, but giving the AI the ability to perform what is typically a human physical task to complete a project is a bonus. Microsoft (MSFT) Edge with Copilot has some similar capabilities but nothing like Comet’s button-pushing agent. Perplexity has taken a step toward making AI more practical for the average person. It is still not in the ‘Save the World’ camp, but it is at least making our world a bit easier to manage day-to-day.