ChatGPT: Analysis and implications for conversational technology

By Sergio Soage, Machine Learning Engineer at Aivo

As we saw in the second issue of our series of articles on conversational AI, ChatGPT has gained, and continues to gain, a lot of attention, not only from the artificial intelligence community but from pretty much the rest of the world as well.

That's why, at Aivo, we decided to comprehensively analyze its capabilities and limitations (as we did with the GPT-3 releases and other foundational models) to determine possible use cases that could enhance our conversational pipeline.

Limitations of ChatGPT

In short, ChatGPT (for more details, see the previous article) is a new model created by OpenAI and based on the GPT-3 family (the GPT-3.5 series). This foundational model was fine-tuned on dialogues using supervised learning and then refined with reinforcement learning from human feedback. The training achieved promising results, and ChatGPT impresses users with its ability to generate text from a prompt. However, our extensive testing has shown that the model is not entirely accurate, and there is much we can and should learn about how it works.

The results of our tests indicate that, as a conversational technology, it is not entirely reliable and could even cause multiple problems for our clients.

Take this simple example:

- Two children, Chloe and Alexander, went for a walk. They both saw a dog and a tree. Alexander also saw a cat and pointed it out to Chloe. She went to pet the cat. Who saw the cat first?

- Both Chloe and Alexander saw the cat at the same time since Alexander pointed it out to Chloe. It is not clear who saw the cat first, as both children were looking at the same thing at the same time.

(Extracted from ChatGPT)

In this case, the model clearly does not understand the meaning of what it reads and writes: the text states that Alexander saw the cat and pointed it out to Chloe, so Alexander saw it first, yet the answer is ambiguous and contradicts itself. This and many other examples explain why ChatGPT is not a recommended solution for applications where accuracy and consistency are essential.

In its current form, ChatGPT cannot serve as a reliable core service for applications that are also domain-specific. Even with a backend that could validate its responses, it is unclear whether it could provide accurate information, or be corrected, without retraining. As we saw in the previous post, even OpenAI makes it clear that the system is unreliable. In summary, the disadvantages of models such as ChatGPT are:

It does not reliably provide factually correct information: OpenAI itself acknowledges that "ChatGPT sometimes writes plausible-sounding but incorrect or nonsensical answers."
Lack of domain knowledge: It does not have the knowledge needed to solve specific business queries.
Lack of transactional and even conversational skills: It has no access to customer databases, nor can it maintain a conversation beyond the number of tokens that fit in the prompt.
Reliance on a third-party app for all customer data processing: Data privacy and security are big issues in this case.

OpenAI wrote a very good report on these issues and potential mitigations. You can read it here and expand the information in this paper.
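
To make the conversational limitation concrete, here is a minimal sketch of what happens when a conversation outgrows the prompt. It assumes a hypothetical budget of 20 tokens and uses a crude whitespace word count in place of a real tokenizer; the function names and the sample dialogue are illustrative only, not part of any OpenAI or Aivo API.

```python
def count_tokens(text):
    """Crude token count: whitespace-separated words (an assumption, not a real tokenizer)."""
    return len(text.split())


def fit_history(turns, max_tokens):
    """Keep the most recent turns whose combined token count fits the budget."""
    kept = []
    used = 0
    # Walk backwards so the newest turns are kept first.
    for turn in reversed(turns):
        cost = count_tokens(turn)
        if used + cost > max_tokens:
            break
        kept.append(turn)
        used += cost
    kept.reverse()
    return kept


# Hypothetical conversation, for illustration only.
history = [
    "User: My order number is 1234.",
    "Bot: Thanks, I found your order.",
    "User: Can you change the delivery address?",
    "Bot: Sure, what is the new address?",
    "User: Same street, but number 58.",
]
prompt_turns = fit_history(history, max_tokens=20)
print("\n".join(prompt_turns))
```

With this budget, only the last three turns fit, so the order number given at the start is no longer visible to the model. That is the kind of context loss the point above refers to.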

ChatGPT: Successful use cases

Does this imply that models such as GPT should not be used in conversational technologies? Clearly, the answer is no. These models are incredibly useful for improving our own models and elevating the user experience. At Aivo, we use GPT-3 and other models in multiple ways, among them:

Data generation

Training data: Several papers use foundational models to generate data, such as:

GPT3Mix: https://arxiv.org/abs/2104.08826
HyDE: https://arxiv.org/abs/2212.10496
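
As a rough sketch of the general idea behind this kind of augmentation, the snippet below mixes a few labeled examples into a prompt and leaves the last item open so that a language model can complete a new example. The template, the intents, and the helper name are hypothetical; this is neither the exact prompt used in GPT3Mix nor Aivo's pipeline, and no model is actually called.

```python
import random


def build_augmentation_prompt(examples, labels, k=2, seed=0):
    """Build a few-shot prompt asking a language model for one more labeled example.

    `examples` is a list of (text, label) pairs; `labels` is the set of allowed labels.
    """
    rng = random.Random(seed)
    shots = rng.sample(examples, k)
    lines = [
        "Each item in the following list contains a customer message and its intent.",
        "Intent is one of: " + ", ".join(sorted(labels)) + ".",
        "",
    ]
    for text, label in shots:
        lines.append(f"Message: {text} (Intent: {label})")
    # Leave the last item open so the model completes a new message and intent.
    lines.append("Message:")
    return "\n".join(lines)


# Hypothetical seed data, for illustration only.
seed_examples = [
    ("I was charged twice this month.", "billing"),
    ("Where is my package?", "shipping"),
    ("My invoice shows the wrong amount.", "billing"),
]
prompt = build_augmentation_prompt(seed_examples, {"billing", "shipping"})
print(prompt)
```

The text the model returns after the final "Message:" would then be parsed and reviewed before being added to a training set.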

Dialogue generation:

Synthetic dialogue generation allows us to improve our dialogue state tracking models and to cover changes in intents or any other kind of shift in the conversation (sentiment, etc.).

Answer generation

By influencing how responses are generated, either via prompts or via model parameters such as temperature, we can produce variations in the answers to improve the user experience and adapt them to the characteristics of the conversation.
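
The effect of temperature can be illustrated with a small, self-contained sketch. It assumes a toy setting in which a model has already assigned raw scores (logits) to three candidate answers; the candidates, scores, and function names are hypothetical. Real language models apply the same scaling token by token.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Convert raw scores into probabilities, scaled by a temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [score / temperature for score in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def sample_answer(candidates, logits, temperature, rng=random):
    """Pick one candidate answer according to the temperature-scaled distribution."""
    probabilities = softmax_with_temperature(logits, temperature)
    return rng.choices(candidates, weights=probabilities, k=1)[0]


# Hypothetical candidate phrasings and scores, for illustration only.
candidates = [
    "Sure, I can help with that.",
    "Of course! Let's sort this out.",
    "Happy to help.",
]
logits = [2.0, 1.0, 0.5]

cold = softmax_with_temperature(logits, 0.2)  # sharp: the top candidate dominates
warm = softmax_with_temperature(logits, 1.5)  # flatter: more varied answers
print(sample_answer(candidates, logits, temperature=1.5))
```

A low temperature concentrates almost all of the probability on the highest-scoring answer, which suits flows where consistency matters. A higher temperature spreads the probability across candidates, which gives more varied phrasing.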

Accelerating development times to improve our models

ChatGPT is not, and will not be, the only model available. We can obtain data from multiple sources, controlling and choosing where and how to use such content. This allows us to significantly shorten the time it takes to improve our models by reusing the knowledge acquired by foundational models in a selective and controlled way.

These are just some of the ways in which we at Aivo use the latest technologies, and we will go deeper into them later.

This is the third article in a series on foundational models, ChatGPT, and conversational technologies. If you haven't already, we invite you to read the previous ones on our blog. In next month's issue, we will go deeper and offer concrete examples of how we use these models at Aivo and what improvements we get with real-world data.

In the meantime, if you want to learn more about the technology we use at Aivo, you can start by learning about our Suite.

See you in the next article!

Disclaimer: This article was NOT generated by ChatGPT but by a human :)
