ChatGPT will unlock a 4th component of the "GUI" - long-form explanation w/example
The QWERTY keyboard was invented in the late 1800s. The mouse was invented in the 1960s. The computer screen was invented in the early 1970s. Essentially, humans have been interacting with technology in mostly the same way for the past fifty-plus years.

Siri, Alexa, Cortana, etc. do not count because they offer terrible user experiences outside of a limited range of tasks that usually involve very obvious keywords/instructions (e.g. “set a timer”).

Think about the answer to this question for just a moment: why can’t you simply ask your computer to do things?

As an experiment, I asked Cortana (Microsoft’s equivalent to Siri) to do five tasks:

I asked it to open Microsoft Excel (it succeeded, opening a new blank file)

I asked it to open a specific Excel file by name (it failed, and just opened a new Excel file again)

I asked Cortana to select a specific cell in an Excel file (it failed and said “I’m sorry, but I can’t help with that.”)

I asked Cortana to open a new Edge browser window (it failed, and provided an irrelevant list of instructions showing me how to turn on Cortana in Edge)

I asked Cortana to open a new Chrome browser window (it failed, and provided a list of instructions on “How to Make Cortana Use Chrome or Your Default Browser”)

Back to our question: why can’t we talk to our computers today?

The most fundamental answer is: because they don’t understand what we want. They don’t understand everyday language to a degree that would enable a pleasant voice-based user experience.

Secondarily, the answer is: because the software hasn’t been updated to connect voice commands to apps.

Crucially - the answer is NOT because the actual connection of voice commands to the operating system/software is difficult. The connection already exists. We can already ask computers to do things like set timers.

The second answer only holds because companies hadn’t solved the first problem: teaching computers to understand our “intent”.

You can imagine how poor a user experience it would be to give commands to your computer and have it act as you desired only 25-50% of the time.

Contrast this with how a mouse and keyboard work. The mouse is able to accurately translate your “intent” into computer language 100% of the time.

You: Click on app

Computer knows: You want to open app

You: Click on new cell in Excel

Computer knows: You want to edit that cell

You: Click and drag a square in PowerPoint

Computer knows: You want to move the square

There is no ambiguity. Every time you use the mouse the computer knows exactly what it is you want to do.

The same is true with the keyboard.

You type: www.google.com

Computer knows: You want to go to google.com

You type (at least on my computer): Function + F9

Computer knows: You want to turn on/off Bluetooth
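
To make the contrast concrete, here is a toy sketch in Python (the event names are made up for illustration; this is not how any real operating system is written): input events are a deterministic lookup, so the computer never has to guess.

# Every mouse/keyboard event maps to exactly one action: a deterministic
# lookup, never a guess about what the user meant.
EVENT_TO_ACTION = {
    ("click", "app icon"): "open the app",
    ("click", "Excel cell"): "edit that cell",
    ("type", "www.google.com"): "go to google.com",
    ("key", "Fn+F9"): "toggle Bluetooth",
}

def handle(event):
    # Same input, same action, 100% of the time.
    return EVENT_TO_ACTION.get(event, "do nothing")

print(handle(("click", "app icon")))  # -> open the app
print(handle(("key", "Fn+F9")))       # -> toggle Bluetooth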

LLMs like ChatGPT are going to unlock voice as a new component of the “GUI”. I put GUI into quotes because it stands for graphical user interface, and voice doesn’t perfectly fit under that definition. Still, when people think of the GUI they think of the “mechanics through which humans interact with technology”. Using this broader definition voice is an obvious fit.

The vast majority of things we need to interact with on our computer are text-based. There may be some icons without text, but image recognition software can already interpret common icons/images with near-perfect accuracy.

There is nothing magical about the mouse or keyboard. They work because they can interpret signals with 100% accuracy and translate them to our computers. This makes for a pleasant user experience.

ChatGPT and other LLMs have largely solved the problem of understanding “intent”. ChatGPT is just entering its third month since launch, and subsequent versions will be able to understand intent as well as or better than the vast majority of humans.

All that’s left is to connect them to all parts of the operating system and apps.
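
To make the idea concrete, here is a hypothetical sketch in Python of that missing plumbing. Everything in it is made up for illustration: parse_intent stands in for a real LLM call, the launcher table stands in for real OS integration, and the dispatch step only prints what it would run.

# Hypothetical pipeline: natural language -> structured intent -> app action.
def parse_intent(command):
    # Stand-in for a real LLM call that maps free-form language to a
    # structured intent; this stub only recognizes one pattern.
    if "excel" in command.lower():
        return {"action": "open_app", "app": "excel"}
    return {"action": "unknown"}

APP_LAUNCHERS = {
    # Made-up launch command; real ones depend on the OS and install paths.
    "excel": ["start", "excel.exe"],
}

def dispatch(intent):
    if intent.get("action") == "open_app" and intent.get("app") in APP_LAUNCHERS:
        # This sketch only prints the command it would run.
        print("would run:", " ".join(APP_LAUNCHERS[intent["app"]]))
    else:
        print("I'm sorry, but I can't help with that.")  # today's experience

dispatch(parse_intent("open Microsoft Excel"))  # -> would run: start excel.exe

Natural language replaces the lookup table’s rigid keys; the LLM’s job is to produce the structured intent that the rest of the system already knows how to execute.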

It won’t be long before you can just tell Excel: “Write a formula that shows a ‘1’ if the figure in a cell is positive, ‘0’ if it’s negative, and prints ‘n/a’ if the figure is text.”

You won’t have to know how to write a nested “if” statement - you will just be able to describe what you want and Excel will do the rest.
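
For the curious, the nested “if” statement Excel would write for you looks something like this (assuming the figure is in cell A1; the request doesn’t specify what to do with zero, which this version treats like a negative):

=IF(ISTEXT(A1),"n/a",IF(A1>0,1,0))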

Based on the rate of progress I’m seeing, I expect we’ll have something like this by year’s end.