Google Gemini vs ChatGPT: Generating Alt-Text. Who Wins?

An illustrative comparison between Google Gemini and ChatGPT for generating alt-text, depicted through a split design. On the left, Google Gemini is symbolized by futuristic, abstract shapes in blues and greens, embodying innovation and digital intelligence. On the right, ChatGPT is represented through a network of interconnected nodes and lines in warm orange and red tones, highlighting advanced AI technology and communication. A bold 'VS' symbol in the center accentuates the competitive analysis between the two technologies, against a backdrop that merges both designs, suggesting a blend of competition and collaboration in the field of alt-text generation.

Introduction

I currently subscribe to both ChatGPT and Google Gemini so I can test them side by side. One of the tasks I often use chatbots for is creating alt-text for my images.

For the importance of using alt-text in your images, see this post.

The prompt given to Gemini and ChatGPT was: ‘Alt text for this image’.

Here is the image I used. Note that it includes text instructions and images.

The image contains a dialogue with a request and a response:

Request: "Create a nice neutral soothing background for a Windows computer with 4K resolution."

Response: "Sure, here is a nice neutral soothing background for a Windows computer with 4K resolution:"

Following this exchange, there are four images displayed as potential backgrounds that meet the criteria of being neutral and soothing, suitable for use as a wallpaper on a Windows computer with a 4K resolution screen.

The top-left image shows a stylized watercolor landscape with layered hills or mountains in shades of purple and pink, with a sky filled with fluffy clouds.

The top-right image portrays a calm beach scene at dusk or dawn with soft orange light reflecting on the water and wet sand.

The bottom-left image captures a sunrise or sunset view over a tranquil ocean with gentle waves, featuring rich orange and blue hues.

The bottom-right image depicts tall grass or reeds in the foreground with a sun setting behind them, casting a warm glow and long shadows.

Google Gemini Output

Sure, here are some alt-text options for the image you sent:

  • A collage of four nature scenes: a beach, a forest, mountains, and a waterfall.
  • Four landscape images arranged in a square: a beach with white sand and blue waves, a lush green forest with tall trees, snow-capped mountains, and a cascading waterfall.
  • A collection of scenic photos: a tropical beach, a dense forest, majestic mountains, and a powerful waterfall.
  • Four calming nature images: a serene beach, a peaceful forest, tranquil mountains, and a breathtaking waterfall.

You can choose the alt text that best describes the image and is most relevant to your content.

ChatGPT Output

The image consists of a collage of four different landscapes, likely intended as options for a computer wallpaper with a soothing and neutral theme suitable for a 4K resolution display:

  1. The top-left image shows a stylized watercolor landscape with layered hills or mountains in shades of purple and pink, with a sky filled with fluffy clouds.
  2. The top-right image portrays a calm beach scene at dusk or dawn with soft orange light reflecting on the water and wet sand.
  3. The bottom-left image captures a sunrise or sunset view over a tranquil ocean with gentle waves, featuring rich orange and blue hues.
  4. The bottom-right image depicts tall grass or reeds in the foreground with a sun setting behind them, casting a warm glow and long shadows.

These images evoke a sense of peace and tranquility, with soft color palettes and natural settings that are often chosen to create a relaxing environment on a computer desktop.

The Winner Is… (Before Text Clarification)

ChatGPT, obviously.

Gemini didn’t even acknowledge the text in the image, which explained that the pictures were intended to be used as a background for a 4K monitor.

Gemini gave me four options, which was nice, I suppose, but each of them was very short and not detailed enough to work as alt-text.

ChatGPT’s response was much more detailed and better suited to someone with a visual impairment who wants a description of the image read to them.

Neither of them quoted the text in the image directly, which could be my fault, as I didn’t ask for it in the prompt. So I followed up with this prompt:

That’s excellent, but you did not acknowledge the text in the image. Please fix that.

ChatGPT did a good job of responding:

The image contains a dialogue with a request and a response:

Request: “Create a nice neutral soothing background for a Windows computer with 4K resolution.”

Response: “Sure, here is a nice neutral soothing background for a Windows computer with 4K resolution:”

Following this exchange, there are four images displayed as potential backgrounds that meet the criteria of being neutral and soothing, suitable for use as a wallpaper on a Windows computer with a 4K resolution screen.

…while Gemini misunderstood the prompt and described text that does not appear in the image:

Absolutely! You’re right, I apologize for the oversight. To ensure your image is best understood for accessibility and usability, here’s how I’d adjust those alt text options to include the text:

Options Focusing on the Images:

  • Option 1: A collage of four nature scenes with the overlaid text “Find Your Serenity”: a beach, a forest, mountains, and a waterfall.
  • Option 2: Four landscape images arranged in a square with the text “Find Your Serenity” in the center: a beach with white sand and blue waves, a lush green forest with tall trees, snow-capped mountains, and a cascading waterfall.
  • Option 3: A collection of scenic photos including the words “Find Your Serenity”: a tropical beach, a dense forest, majestic mountains, and a powerful waterfall.
  • Option 4: Four calming nature images with the caption “Find Your Serenity”: a serene beach, a peaceful forest, tranquil mountains, and a breathtaking waterfall.

Option Focusing on the Text:

  • “Find Your Serenity” text with supporting background images of a beach, forest, mountains, and waterfall.
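The difference between the two follow-up responses comes down to one check: does the alt-text quote the words that actually appear in the image? Here is a minimal Python sketch of that check. The function names `normalize` and `quotes_visible_text` are made up for illustration, and the sample strings are copied from the responses above. Neither chatbot offers anything like this; it is just a way to make the comparison concrete.

```python
import re


def normalize(text: str) -> str:
    """Lowercase, straighten curly quotes, and collapse whitespace."""
    quotes = str.maketrans({"“": '"', "”": '"', "‘": "'", "’": "'"})
    return re.sub(r"\s+", " ", text.translate(quotes)).strip().lower()


def quotes_visible_text(alt_text: str, visible_text: str) -> bool:
    """Return True if the alt-text contains the image's visible text verbatim."""
    return normalize(visible_text) in normalize(alt_text)


# The request shown in the test image (copied from the description above).
VISIBLE = (
    "Create a nice neutral soothing background for a "
    "Windows computer with 4K resolution."
)

# One line from each chatbot's follow-up response.
chatgpt_followup = (
    "Request: “Create a nice neutral soothing background for a "
    "Windows computer with 4K resolution.”"
)
gemini_followup = (
    "A collage of four nature scenes with the overlaid text "
    "“Find Your Serenity”: a beach, a forest, mountains, and a waterfall."
)

print(quotes_visible_text(chatgpt_followup, VISIBLE))  # True
print(quotes_visible_text(gemini_followup, VISIBLE))   # False
```

A verbatim substring match is deliberately strict. It would also reject a response that paraphrases the text correctly, so treat it as a rough first filter rather than a full quality measure.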

The Winner Is…

Based on this test, there is no question that ChatGPT is better than Google Gemini at generating alt-text for an image.
