GPT-4 vs ChatGPT: Everything you need to know

In March 2023, OpenAI unveiled GPT-4, the new version of the language model that powers ChatGPT, its famous chatbot. Presented as more accurate and reliable, it can even interpret images! Read on to discover what else GPT-4 can do and why it is more powerful than ChatGPT.

After months of rumors and speculation, OpenAI officially announced GPT-4, the new version of the language model behind ChatGPT, the chatbot that has been the talk of the Internet since its public release in November 2022. The company rolled out the new version as an update that improves the AI's capabilities while introducing promising new features. Subscribers to the paid ChatGPT Plus plan can already use it.

"GPT-4 is a large multimodal model (accepting image and text inputs, emitting text outputs) that, while less capable than humans in many real-world scenarios, exhibits human-level performance on various professional and academic benchmarks," says OpenAI in its statement.

The startup promises that with GPT-4, its chatbot will become "more creative and collaborative than ever." And, surprise, Microsoft's AI in Bing is already based on it! So, judging from these first glimpses, is the new version of the conversational AI really that good? Does it mark a substantial improvement over its predecessor? And, above all, does it fix the shortcomings of artificial intelligence and the abuses it enables?

Is GPT-4 the most advanced language model?

GPT (Generative Pre-trained Transformer) is a generative language model built on a neural network, a type of algorithm loosely inspired by the human nervous system. It is trained through deep learning on huge volumes of data, drawn largely from the Internet in the case of GPT. This combination allows it to generate text that appears to "reason" and reads as if it were written by a human being.

GPT-3, the third generation of this technology, was one of the most advanced AI text generation models of its time. Its predecessors were much smaller: GPT-1 had about 117 million parameters and GPT-2 had 1.5 billion. Parameters are the values a model adjusts during training, and they shape both its learning process and the results it produces. The number of parameters is often used as a rough proxy for capability: in general, the more parameters, the more powerful and fluent the model, although size alone does not guarantee quality. GPT-3 was a real leap forward in this regard, as it grew to 175 billion parameters. For GPT-4, on the other hand, OpenAI has not disclosed the size of its new model.
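To put that leap in perspective, here is a minimal Python sketch of the arithmetic. It uses only the two figures quoted above (1.5 billion and 175 billion parameters); the variable names are purely illustrative.

```python
# Parameter counts quoted in the article.
GPT2_PARAMS = 1.5e9    # 1.5 billion
GPT3_PARAMS = 175e9    # 175 billion

# How many times larger GPT-3 is, measured by parameter count alone.
growth = GPT3_PARAMS / GPT2_PARAMS

print(f"GPT-3 has roughly {growth:.0f} times as many parameters as GPT-2")
```

In other words, GPT-3 is more than a hundred times larger than a 1.5-billion-parameter model, which says something about scale but not, on its own, about the quality of the answers.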

What is the difference between GPT-4 and GPT-3?

GPT-4 takes the basics of GPT-3 and can generate, translate and summarize texts, answer questions, serve as a chatbot, and generate content on demand. It brings a promising new feature and many improvements, as OpenAI explains on its website.

Accepts images

One of the most interesting new features is that the language model becomes "multimodal." GPT-4 can analyze and respond to queries containing both text and images, whereas GPT-3 was limited to the written word. The start-up Be My Eyes is the first partner to test this capability. "It can flexibly accept inputs that intersperse images and text arbitrarily, kind of like a document," OpenAI co-founder Greg Brockman summarized to The Guardian. Simply put, the user can submit an image along with a question to the new model. For example, if the user gives the chatbot a hand-drawn sketch of a website project, GPT-4 produces a detailed response explaining the steps to build that site, but it still only generates text.

Announcing GPT-4, a large multimodal model, with our best-ever results on capabilities and alignment: https://t.co/TwLFssyALF pic.twitter.com/lYWwPjZbSg
— OpenAI (@OpenAI) March 14, 2023

The New York Times conducted several tests with GPT-4. The reporter submitted a photo of the contents of his refrigerator to the AI, asking what he could cook with the food on hand. It suggested several recipes using the available ingredients. Only one of the answers, a wrap, required an ingredient that did not appear to be present. In another example, a visually impaired person submitted a photo of two shirts of the same design but in different colors, and the AI told him which one was red. According to OpenAI, GPT-4 can "generate the same level of context and understanding as a human being," for example by explaining the world around the user, summarizing information-heavy web pages, or answering questions about what it "sees." This option is not yet available to the public: it is still being tested by Be My Eyes, which uses GPT-4 for a visual accessibility product, but it should be available in a few weeks.

A more creative and useful AI

According to OpenAI, GPT-4 is more creative and collaborative than its predecessor and any other existing AI system. First, the new language model is said to produce more accurate answers faster, without buckling under too many simultaneous queries from users. In addition, the maximum input size has increased: GPT-4 can now analyze texts of up to 25,000 words, compared to about 3,000 words for GPT-3.5. This means it can take in much longer documents in one go (a short novel, a short story, a scientific article, etc.) and tackle more demanding writing and summarization tasks.
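To see what the larger input size means in practice, here is a minimal Python sketch that splits a long text into pieces that fit under a word limit. It relies on two assumptions: the limits are the approximate word counts quoted above (the models actually measure input in tokens, not words), and the helper names count_words and split_into_chunks are hypothetical, not part of any OpenAI tool.

```python
from typing import List

# Approximate limits quoted in the article, in words.
# Assumption: real limits are counted in tokens, so these are rough guides.
GPT4_WORD_LIMIT = 25_000
GPT35_WORD_LIMIT = 3_000


def count_words(text: str) -> int:
    """Count whitespace-separated words in a text."""
    return len(text.split())


def split_into_chunks(text: str, word_limit: int) -> List[str]:
    """Split a text into consecutive chunks of at most word_limit words."""
    if word_limit <= 0:
        raise ValueError("word_limit must be positive")
    words = text.split()
    return [
        " ".join(words[start:start + word_limit])
        for start in range(0, len(words), word_limit)
    ]


# A stand-in for a 7,000-word article.
article = "word " * 7_000

chunks_for_gpt35 = split_into_chunks(article, GPT35_WORD_LIMIT)
chunks_for_gpt4 = split_into_chunks(article, GPT4_WORD_LIMIT)

print(count_words(article), len(chunks_for_gpt35), len(chunks_for_gpt4))
```

Under these assumptions, a 7,000-word article has to be cut into three pieces for the smaller limit but fits in a single piece for the larger one, which is what makes whole-document summaries possible.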

OpenAI claims that "GPT-4 is more reliable, creative, and able to handle much more nuanced instructions than GPT-3.5." This version of the language model should therefore be better at tasks that require creativity or advanced reasoning. For example, during his demonstration, co-founder Greg Brockman asked it to summarize a section of a blog post using only words beginning with "g." The AI could be used for tasks such as music composition, screenwriting (books written by ChatGPT in its GPT-3.5 version have already been pouring into the publishing market for a few weeks), and reproducing an author's style.

Better test results

According to the results published by OpenAI, GPT-4 marks an important step forward in the accuracy of its answers, reducing the gross errors and illogical reasoning found in ChatGPT with GPT-3.5. The company put its new language model through tests in biology, law, economics and literature. GPT-4 far outperforms its predecessor, as seen on the graph (the results are in blue for GPT-3.5 and in green for GPT-4).

We note, however, that while there are clear improvements, the AI still struggles with exams that require creativity, such as languages and English literature. It did, however, pass the US Bar exam with a score close to the top 10% of candidates, where GPT-3.5 was around the bottom 10%. GPT-4 also performs very well in many languages besides English, its "native" language and the one it uses as a base: on the MMLU benchmark it reaches an accuracy of 84.1% in Italian, 84.0% in Spanish, and 83.6% in French. These results suggest that users will get higher-quality responses.

A more secure language model

OpenAI has worked hard to make GPT-4 more "secure" and to limit misuse as much as possible. According to the company, it is 82% less likely than GPT-3.5 to respond to requests for disallowed content, such as writing malware. Its accuracy has also improved: it is 40% more likely than the previous version to give a factual response to a request.
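Percentages like these are relative to GPT-3.5, not absolute rates, which is easy to misread. The Python sketch below works through the arithmetic under loudly stated assumptions: the baseline figures (100 disallowed requests answered out of 1,000, and a 50% factual-response rate) are invented for illustration and do not come from OpenAI, and the function name apply_relative_change is hypothetical.

```python
def apply_relative_change(baseline: float, relative_change: float) -> float:
    """Return the new value after a relative change (-0.82 means 82% less)."""
    return baseline * (1 + relative_change)


# Hypothetical baselines for GPT-3.5, made up for this example only.
baseline_disallowed_answered = 100   # out of 1,000 disallowed requests
baseline_factual_rate = 0.50         # share of factual responses

# Figures quoted in the article: 82% less likely, 40% more likely.
gpt4_disallowed_answered = apply_relative_change(baseline_disallowed_answered, -0.82)
gpt4_factual_rate = apply_relative_change(baseline_factual_rate, 0.40)

print(round(gpt4_disallowed_answered), round(gpt4_factual_rate, 2))
```

With these made-up baselines, "82% less likely" would mean about 18 answered requests instead of 100, and "40% more likely" would move a 50% factual rate to about 70%, not to 90%.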

Not all problems are solved, though! The AI still tends to "hallucinate," inventing false information and asserting it with aplomb. That's why OpenAI reminds us that "one should be very careful when using the results of a linguistic model, especially in high-stakes contexts," adding that "GPT-4 presents similar risks as previous models, such as generating harmful advice, malicious code or inaccurate information."

How to use GPT-4?

OpenAI has already worked with several partners to create new services and applications integrating GPT-4. These include Duolingo, Be My Eyes, Stripe, Morgan Stanley, Khan Academy, and the government of Iceland. Developers can sign up for a waiting list to access the company's API.

As for the general public, they have already had a glimpse of GPT-4 with... the chatbot integrated into Bing by Microsoft! When it announced its Prometheus AI, Microsoft had not said precisely which version of the OpenAI language model it was based on, only explaining that it uses "the learnings and key advances of ChatGPT and GPT-3.5"; it turns out to have been GPT-4. Microsoft explains that Bing will benefit from improvements as OpenAI "brings updates to GPT-4 and beyond," thanks to which "we'll have multimodal models that will offer completely different possibilities, for example, videos." GPT-3.5, by comparison, can only generate content in the form of text, tables and computer code. In addition to the OpenAI enhancements, Microsoft will add its "own updates based on community feedback." Hopefully, their integration will cause fewer clashes this time!