14 Feb

The current state of AI Chat

The cool thing about AI is that it’s reached the state of general consciousness. Oh, not the robots; they’re as dumb as ever. I just mean that everyone’s started talking about it.

That is: Generative Pre-trained Transformers, specifically ChatGPT and Davinci-3.

Set aside the “I worship our new overlords” opinions: this is a new, interesting, useful and maturing technology. It’s tempting for users to anthropomorphise it. Those users are, in a sense, failing the Turing test.

The “Generative Pre-trained Transformer” language models known as Davinci-3, and the canned conversational version ChatGPT, are algorithms (software programs) that use probability to determine the most likely output for a given input text. They ‘consider’ which words are more likely to follow certain others in order to create a more natural-sounding sentence.

The algorithm is pre-trained on a huge set of data (much of the internet) and uses it to complete responses. The training helps the language model learn the context of words in a sentence, and the model then generates text similar in style and content to the original.
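As a toy illustration of that “which word probably comes next” idea, here is a tiny count-based bigram model in Python. This is emphatically not how a GPT works internally (those use neural transformers, not count tables), just a minimal sketch of next-word prediction from data:

```python
from collections import Counter, defaultdict

# A tiny "training corpus" and a table of which word follows which.
corpus = "the cat sat on the mat the cat ate the fish".split()
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def most_likely_next(word):
    """Pick the word that most often followed `word` in training."""
    return following[word].most_common(1)[0][0]

print(most_likely_next("the"))  # "cat" followed "the" most often in the corpus
```

A real model does this over billions of parameters and whole contexts rather than single words, but the principle, predicting the likeliest continuation of the input, is the same.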

These language models can be used to summarise, answer questions, translate, and perform many other tasks.

ChatGPT is not ‘writing a poem’; it is looking at the probability that a group of words go together, given a very large set of data. It is trained on what the output should probably be. ChatGPT, for example, takes an input sentence and generates a response by predicting the next word or phrase in the conversation. ChatGPT has also been further ‘rigged’ to give more predictable answers.

If you give ChatGPT a man’s name and a woman’s name and ask which one is the doctor, ChatGPT will say it can’t tell, because it has been set up that way. The raw Davinci-3 model will say the man is the doctor, because that is the dominant bias in the raw data of the internet.

The responses aren’t perfect. GPT models can be unpredictable, unstable and sometimes downright incorrect. 

Q: What is heavier, 1kg of beef or 1kg of compressed air?
A: 1kg of beef is heavier than 1kg of compressed air.

Please don’t put answers like that into the machine that’s targeting someone’s cancer.

I asked the Davinci-3 Model: “What can you tell me about the trumpet player Virgil Reality?”

“Virgil Reality is an American jazz trumpet player and composer based in New York City. He has performed with a variety of renowned artists, including John Scofield, Wynton Marsalis, Dave Holland, and many others. He has released several albums as a leader and co-leader, most notably his 2019 album “The Virgil Reality Project”. In addition to his work as a performer and composer, he also serves as an educator at the New School for Jazz and Contemporary Music in Manhattan.”

Now, every part of this is patently untrue. Davinci-3 can put words together, but it has no actual context; it doesn’t actually ‘know’ anything, especially if the answer doesn’t fall within its training data. That means even small variances in the context of a problem will lead to a wildly incorrect or false result.

The dive on this is pretty deep. I found a great article by Colin Fraser, a data scientist at Meta. His article “ChatGPT: Automatic expensive BS at scale”, from January 2023, is enlightening to say the least.

It’s simply a matter of Scale

There is nothing to stop anyone from developing their own language model or GPT, given time and money. Currently the problem is scale. The service is so popular that even paid API accounts have started to see more and more ‘sorry, try again later’ responses when trying to interact.
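For what it’s worth, the usual client-side workaround for those ‘sorry, we’re overloaded’ responses is to retry with exponential backoff. A generic sketch in Python, not tied to any real SDK’s error types (the `RuntimeError` here is just a stand-in for whatever error the service raises):

```python
import time

def with_backoff(call, max_retries=5, base_delay=1.0):
    """Retry a flaky call, doubling the wait after each failure."""
    for attempt in range(max_retries):
        try:
            return call()
        except RuntimeError:  # stand-in for an 'overloaded' API error
            time.sleep(base_delay * (2 ** attempt))
    raise RuntimeError("service still overloaded after retries")

# Simulate a service that fails twice, then succeeds.
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("overloaded")
    return "answer"

print(with_backoff(flaky, base_delay=0.01))  # prints "answer" after two retries
```

The doubling delay keeps a busy service from being hammered by everyone retrying at once; it doesn’t fix the underlying scale problem, it just makes clients politer about it.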

Scaling up to billions of daily user requests is going to take some doing. Currently an answer costs, in compute terms, approximately 10 to 100 times as much as a regular Google search. (Sources: University of California, Berkeley, “Scaling Up to Billions of Daily User Requests: Cost-Effective Solutions for Large-Scale Web Search”, and Sam Altman, CEO of OpenAI.)

If the cost of a response is USD $0.05 and there are 8.5 billion searches per day, that works out to USD $425 million per day to deliver the service in its current form. Of course, the AI itself could theoretically help optimise the model.
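The back-of-envelope arithmetic behind that figure:

```python
cost_per_response = 0.05   # USD, assumed compute cost per answer
searches_per_day = 8.5e9   # roughly the global daily search volume

daily_cost = cost_per_response * searches_per_day
print(f"USD ${daily_cost:,.0f} per day")  # USD $425,000,000 per day
```

Both inputs are rough estimates, so the output is an order-of-magnitude figure rather than a forecast.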

The Search

One of the first big ticket disruptions is in the search market. Principally between Microsoft and Google. I’ve put Microsoft first here as they are firing the first shots. 

Microsoft invested early in OpenAI. They invested USD $1 Billion in 2019 and have now shovelled in another multi-year investment in January 2023, said to be USD $10 billion. (Source: Bloomberg). 

This makes sense, as Bing has only 9% of the search market globally while Google has 90%.

According to Investopedia, Google Services generated $69.4 billion, or about 92% of total revenue, in Q4 FY 2021. Advertising revenue, at $61.2 billion, comprised 88% of the segment’s revenue. That is Google Ads (mostly search ads and some display).

GPT is an existential threat to Google’s search business as it could disrupt the way searches are performed and Microsoft hopes to integrate GPT into their Bing search engine to gain market share. Thus the large investment in OpenAI.    

Google has been working on AI for a while now, though the latest moves from OpenAI, with backing from Microsoft, have floored them. In fact, it has caused Google to issue an internal Code Red.

Google’s product is called LaMDA (Language Model for Dialogue Applications) and the first generation was released in May of 2021 based on Google research from 2017. LaMDA 2 was released in 2022. 

This is the model over which Google employee Blake Lemoine was placed on administrative leave, after claiming it was sentient; reportedly mostly because he didn’t agree with the morals in its responses. I think we are going to see a lot of this. Reading some of the articles on ChatGPT is frustrating, even if it is fantastic to see the range of opinions and thought on the subject.

Google is working on a number of other AI projects, including image-generation tools, prototype app testing, and MusicLM for generating music. As an aside, all of these, and the OpenAI projects, are mired in ethical issues around repurposing everyone’s copyrighted content. It’s not going well for Google, given OpenAI’s big leap forward.

Recently two presentations were given: one by Microsoft and one by Google. The Microsoft presentation was a triumph, presenting the new AI-powered Bing and Edge browser.

In Google’s presentation, the presenter was flustered when the demo phone was not there for the live demonstration. Google’s parent company, Alphabet, lost $100 billion in market value during and after the presentation.

It might look a bit like this: https://www.perplexity.ai/, which integrates search and chat to show what searching using AI might be like.

Some of the issues

Control: GPT models are trained on large datasets, and the output is not necessarily grounded in the input. It may not be accurate, relevant or appropriate.

Too generalised: GPT models may produce outputs that are too general or vague. When using questions to generate ‘scaffolding’ for text, I noticed the output felt ‘hokey’, ‘corny’ and sometimes clichéd. It has helped me mine the dataset to gather ideas together, though it isn’t thinking those ideas up; it’s presenting them from the data.

Bias: GPT models can inherit biases from the data they are trained on, which can lead to inappropriate and offensive outputs. They are also limited by the quality of the data, which may contain errors.

Limited scope: GPT models are limited in their ability to understand complex contexts and relationships within text, so they may miss nuances or overlook important details in the output they generate.

Originality: GPT models often produce results that lack originality and creativity, as they generate responses based on what was seen in the training data. In other words, how good is the training? OpenAI is making strides for the work put in, and still to be put in, though it is that training work that determines the usefulness.

Scale: GPT models require a large amount of computing power and time for training. ChatGPT exists now and is useful in its current release, but further GPT models and training are still to come. ChatGPT is already so popular that many users are getting “come back later” messages. (Source: OpenAI.)

Statistics: Due to reliance on statistical methods, GPT models are vulnerable to adversarial attacks where malicious actors can manipulate input data in order to produce misleading or incorrect results.

It’s very useful though

Demand is already high. In just five days, ChatGPT crossed 1 million users. For comparison, Netflix took 41 months, Facebook 10 months, and Instagram 2.5 months. (Sources: Statista, “Netflix, Facebook and Instagram: How long did it take to reach the first million users?”, and OpenAI.)

If the output is any kind of text or binary data (audio, image, maybe video in the future), then we’ve all been creating more than enough data for an algorithm to mine and recombine.

I’ve integrated the Davinci-3 model into my workflow, using it for scaffolding, ideation, getting an alternative way of doing something, summarising, and looking for errors. I don’t use the output in production without editing it and applying the usual amount of quality control and testing.

It is great for scaffolding: creating frameworks and outlines, then refining them using experience. It can pick up errors, though it can just as easily create them.

I started about a year ago with an invite to the DALL·E 2 image generator and then got a business API account. I can write programs that use the API or use the web interface. It has been very consistent, though as of February 2023 it is obviously having issues due to scale, given the speed of growth mentioned earlier.

Your results also depend on how good you are at asking and structuring the right questions, and on your career experience. If you’re an experienced copywriter, you’ll be able to use the tool to ‘scaffold’ work and mine the data set for responses, while rejecting the nonsense using your skills. Tip: you can also use the interface to refine your questions.

Research, generating and debugging code, creating a weight-loss plan, ideation and creative work, fine-tuning, discussing strategy, scaffolding scripts and copy, generating images: I’ve used them all in my workflow.

McDonald’s or à la carte: your quality control and results may differ. The usual question of “how much does it matter to you?” comes into play when judging the output.

The bottom line: it will generally get you closer to your result, much quicker, if you can ask the right questions. 

Are we ready?

I don’t think we are, even slightly. There are (technical) issues that are going to slow the rollout, though the usefulness of the tool (and the hype) has already changed so much.

Already, China has tried to enact laws on generative content.

Removing your data from the model (as the GDPR requires) may prove as difficult as removing things from the internet already is, maybe more so.

As stated previously, a cheeseburger is just as relevant as à la carte dining, depending on your point of view. Desktop publishing killed typesetting overnight, because the business flyer didn’t have to go to the expensive printer. It wasn’t as good. It did the job.

Students are already cheating on their essays, and will keep doing so. I’m sure the results are mixed; yet, there goes human nature.

Fire. It’ll warm your house or burn it down.

The next wave is here.
