Even as users are still figuring out how to use ChatGPT-3.5, 4.0 and other types of generative AI, OpenAI has released ChatGPT-4o. No, that’s not a typo, just a stupid way to make a name more complicated. It’s a lowercase ‘o’, not a zero, and it stands for ‘omni’.
With the latest version of this genAI product, OpenAI is introducing new ways to interact with its knowledge base and merging several of its modules, changes the company says will make it faster and less prone to certain kinds of mistakes.
To the user, the most noticeable enhancements in 4o include:
Image understanding: users can upload images and receive detailed descriptions, answers to questions about the images, and even modifications to those images (see the code sketch after this list).
Longer context retention — that is, it can have longer conversations with you and still retain the information that was in the earlier parts of the conversation. This should make it more accurate because you can continue to refine your questions and requests to get closer to what you really want.
Advanced error handling, including better disclaimers when it’s not sure about the answer.
Better code generation for software developers.
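For software developers, the image-understanding feature in the first item above is also exposed through OpenAI’s API. Here is a minimal sketch of sending a question about an image to GPT-4o; it assumes the official OpenAI Python SDK (v1+), an API key in the OPENAI_API_KEY environment variable, and a placeholder file name.

```python
import base64

from openai import OpenAI  # official OpenAI Python SDK, v1+

client = OpenAI()  # reads the API key from OPENAI_API_KEY

# Read a local image and encode it as a base64 data URL.
# "promo-image.png" is a placeholder file name.
with open("promo-image.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Describe this image and flag any misspelled words in it."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
            ],
        }
    ],
)

# The model's description of the image, as plain text.
print(response.choices[0].message.content)
```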
I’ve been experimenting to see just how far the new version takes the product compared with older versions, and compared with doing the work myself. Frankly, most of the enhancements in 4o are subtle, in many cases too subtle to notice unless you look for them with a keen eye. And they still haven’t broken through the barriers that keep many answers far from what you actually want.
For example, before 4o was released, I used ChatGPT to create images to promote the Outlook add-in discussed in this issue. In a series of prompts, I asked ChatGPT to produce an image that included a Microsoft Outlook logo and a Salesforce logo and that illustrated email moving seamlessly from Outlook to Salesforce.
My first output did a pretty good job of illustrating the point, but among its flaws, it spelled Outlook wrong. (This is an issue I have found consistently in ChatGPT and some other genAI models: even when you give them the exact spelling to include in an image, they spell the word differently.)
For my second output, I requested only that ChatGPT fix the spelling of Outlook, and it did come up with a better spelling. But it also changed the entire concept of the image, which no longer illustrated the idea as well.
Then, through additional iterations, it came up with other concepts on consecutive tries, most of which had one flaw or another. One showed a laptop keyboard without the rest of the laptop; when I showed it to a colleague, he told me that was the first thing he noticed.
Is ChatGPT-4o any better?
After ChatGPT-4o was released, with its promise of better image ingestion and better results, I tried again, this time starting by uploading the initial image (the one with the nice illustration of the concept but the misspelling) and asking it to fix the spelling.
Again, it created a whole new image instead of working from the original. So I asked it to use the original concept and try again. The result: a third image.
Finally, I decided to ask the chatbot directly why it wasn’t giving me the result I sought. Its response was that it cannot modify just a section of an image.
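As an aside, region-limited edits are available through OpenAI’s image API, just not through the ChatGPT interface I was using: the images.edit endpoint (backed by the DALL·E 2 model) regenerates only the transparent region of a supplied mask. A minimal sketch, with placeholder file names and a hypothetical prompt:

```python
from openai import OpenAI  # official OpenAI Python SDK, v1+

client = OpenAI()

# Regenerate only the masked (transparent) region of the image.
# Both file names are placeholders; the mask must be a PNG with the
# same dimensions as the image, transparent where the edit should go.
result = client.images.edit(
    model="dall-e-2",
    image=open("original.png", "rb"),
    mask=open("mask.png", "rb"),
    prompt="The word 'Outlook' spelled correctly on the logo",
    n=1,
    size="1024x1024",
)

# URL of the edited image.
print(result.data[0].url)
```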
The bottom line on this experiment (which produced results similar to my other attempts): the disclaimer was better than those I got from previous versions, but it didn’t appear until after I asked why. More important, although the creators say context retention is better, it’s still not good. The model could not remember the image it had created for me even one iteration earlier.
This won’t keep me from using ChatGPT or other AIs. It will, though, force me to be vigilant at a level I typically don’t have to be when editing what I create myself or what I get from colleagues.