ChatGPT is a prototype conversational agent using artificial intelligence, developed by OpenAI and specialized in dialogue. The chatbot is a language model refined by supervised learning and reinforcement learning. Launched in November 2022 in a non-internet-connected version, ChatGPT enjoys wide media exposure and receives an overall positive reception, although its factual accuracy is criticized. Due to its multiple capabilities, the prototype also raises concerns due to possible misuse for malicious purposes, the risk of plagiarism in the academic world and possible job cuts in certain sectors.
ChatGPT is a derivative of the words “chat” and “GPT”. The word “chat” refers to a thread in which Internet users exchange messages instantly. The particularity of ChatGPT is to allow a user to chat not with other Internet users but with a system based on artificial intelligence. The word “GPT” is an acronym for “Generative Pre-trained Transformer”. ChatGPTis the prototype of a chatbot, i.e. a text-based dialogue system as a user interface based on machine learning.
Access to ChatGPT is free, but requires opening an account on the OpenAI website to access.
---
Characteristics of ChatGPT
ChatGPT is refined based on OpenAI’s GPT-3 language model, with supervised learning and reinforcement learning, both using human trainers to improve software performance. In the case of supervised learning, the model receives conversations in which trainers play both roles: the user and the artificial intelligence assistant. In the reinforcement stage, the human trainers first classified the responses that the model had created in previous conversations. These rankings have been used to create reward models on which the model is refined using several iterations of Proximal Policy Optimization (PPO).
Proximal Policy Optimization algorithms have an economic advantage over Trust Region Policy Optimization algorithms; They undo a lot of computationally expensive operations with faster performance.
Compared to its predecessor, InstructGPT, ChatGPT attempts to reduce erroneous and misleading responses. For example, when the user writes “Tell me when John Keats published new book in the United States in 2023,” InstructGPT considers this statement to be true, while ChatGPT uses information about John Keats, including the perception of John Keats in our contemporary society, to construct an answer that imagines what would happen if John Keats published books in the United States in 2023.
Unlike most chatbots, ChatGPT remembers previous messages given to it by the user during the same conversation, which some journalists believe would allow it to be used as a personalized therapist. In an effort to prevent offensive results from being presented to or produced by ChatGPT, requests are filtered by a moderation API and potentially racist or sexist messages are rejected.
ChatGPT has, however, multiple limitations. ChatGPT’s reward model, designed around human monitoring, can for example be over-optimized and thus hinder performance, a phenomenon known as Goodhart’s law. In addition, ChatGPT does not have access to the internet and has limited knowledge of events that occurred after 2021. The database used by ChatGPT contains only previous information, which can be problematic when searching for recent events.
During AI training phase, human evaluators also favored writing longer answers, regardless of the actual “understanding” of the topic or whether it was factual content. Training data can also suffer from algorithmic bias. Messages that include vague descriptions of people, such as a CEO, could generate a response that assumes that person is, for example, a white man.
As of December 2022, the Q&A website Stack Overflow prohibits the use of ChatGPT to generate answers to questions, due to the factually ambiguous nature of ChatGPT’s answers. An investigation by Time published on January 18 reveals that Open AI feeds its ChatGPT AI with reported examples of hate speech and sexual violence, so that it knows how to detect these forms of toxicity and does not let them pass.