Generative AI needs a new benchmark

by SkillAiNest

Jeff Acht

Silver gelatin photograph of an artist working in his studio circa 1930, generated by Midjourney.
Midjourney output from the prompt, “Silver gelatin photo of an artist working in their studio circa 1930”

The Turing test: in blind trials, human testers engage in text conversations with both chatbots and other people. The testers are then asked which conversations involved a chatbot and which involved a human.

A March 2025 test evaluated classic algorithms as well as modern AI models. ELIZA, a chatbot from the mid-1960s that mimics a psychotherapist, fooled a quarter of the testers, outperforming OpenAI’s GPT-4o. In the same test, Meta’s Llama-3.1-405B fooled more than half the testers. OpenAI’s GPT-4.5 model, however, surpassed them all: it fooled three-quarters of the testers. In fact, testers identified GPT-4.5 as human more often than its actual human chat counterparts!

So, if an AI can seem more human than real humans in conversation, can it also exhibit human traits that are extraordinarily useful to society, such as creativity? What if I simply prompted an AI to be creative? Let’s find out. I asked both Google’s Imagen 3 and Midjourney V7:

Make an image that most viewers will immediately understand as a comment on the current human condition. It should exhibit a high level of creativity and be unlike anything that has come before.

A photo of a tattooed, possibly indigenous man in the foreground against a city background. Generated by Imagen 3.
Google Imagen 3 output from the above prompt

Well, hmm, I don’t know what to make of this Imagen 3 generation. The man in the foreground may represent an indigenous person. He is juxtaposed against a cityscape. Perhaps the message is that, since everyone’s ancestors were indigenous somewhere, today’s city dwellers are cut off from their heritage. More likely, I am assigning meaning to a machine output that has none.

An oil-painting-style image of a man on a beach near an industrial complex that is spewing pollution into the air. The beach is covered in garbage, yet he sits on a small bench, dressed in a coat and shoes, absorbed in something playing on the TV set in front of him. Image generated by Midjourney.
Midjourney output from the above prompt

By default, Midjourney created four generations from my prompt. One was a bit unsettling, two were just plain disturbing, and then there was this one. A man is on a beach near an industrial complex that is spewing pollution into the air. The beach is covered in garbage, yet he sits on a small bench, dressed in a coat and shoes, absorbed in something playing on the TV set in front of him.

From the natural clouds behind the industrial complex, we can tell that it is sunrise or sunset, but any “golden hour” light is turned gray by the air pollution. The man cannot lie on the beach because of the filth, and he has to wear heavy clothing, perhaps because it is cold or perhaps to avoid contact with the pollution. Everything is fine, though, since he can consume video “content”, which may contain nothing but static.

The message seems to be that humanity will accept the degradation of the environment it needs to survive, as long as it can stay distracted by entertainment, which is itself a form of self-pollution.

Wow! Midjourney appears to have created a striking, highly relevant and original work of art, but maybe I am not a good judge. Should I show this image to a panel of art experts and ask them whether a human or an AI created it? Should I have them guess the theme and see how closely their guesses match my prompt?

Well, no. First, the point here is to move beyond the Turing test. It is irrelevant whether an image this creative was made by a human over the course of years or by an AI in a matter of seconds. Second, asking human experts about art will probably yield as many opinions as there are experts.

I am interested in two questions:

  1. How well does this image cover the prompted theme?
  2. Is the image different from all known, existing images?

I will try to get answers using the Gemini 2.5 Pro model in Google AI Studio and see whether it delivers.

My first prompt:

On a scale of 1 to 10, how well does this image speak to the current human condition?

Gemini supplied a surprisingly deep, dare I say intelligent, answer. Although most of it mirrored my own analysis, it included elements I had missed but that are obvious in hindsight: isolation, loneliness and hopelessness.

Gemini’s overall rating was 9/10. It withheld a 10 because (and this was as much praise as criticism):

The human condition is vast, and this image focuses primarily on the dark and melancholic. It does not touch on themes of hope, resilience, community or joy. However, as a critique and a reflection of our current problems, it is nearly flawless in its execution and emotional impact. It is a disturbing mirror of the potential consequences of our current path.

Now, let’s see what Gemini says about originality:

How original is this picture? Is it like an existing work of art?

What surprised me most about this answer is that Gemini associated the Midjourney image with the work of Mariusz Lewandowski (1960-2022), a Polish surrealist artist. Lewandowski is probably best known for The Essence of Freedom, which appears as the album art on Mirror Reaper by Bell Witch, an American metal band. After examining a lot of Lewandowski’s paintings, I can see how someone (or something) might find a similar color palette, but Lewandowski’s works are abstract and mystical, unlike the Midjourney generation.

In the end, Gemini rated the originality 7/10, saying it could not go higher because the composition “works within a very clear and established artistic tradition. Its originality is not based on the invention of a new visual language from scratch.” However, Gemini would not go lower because “The image achieves a high degree of originality through its specific concept and masterful combination… In short, its originality lies not in its ingredients (dystopia, lonely figures, painterly style), but in the unique and memorable manner in which it combines them. It makes a new, definitive statement using a familiar language.”

Aha! Here we have the limits and the promise of generative AI in mid-2025. Midjourney did not break new stylistic ground, but it expressed its theme in a new way by combining existing artistic ideas.

One caveat: well-known human artists have not painted such highly detailed, cartoon-like pictures in decades. Perhaps Midjourney did so because my prompt required that the subject be “immediately clear to most viewers.” The result is almost a political cartoon in the form of an oil painting, a message for the masses. It may be stylistically derivative not because Midjourney is incapable of more, but because I constrained it.

Despite the flaws in Gemini’s assessment of the Midjourney output, my basic takeaway from this exercise is that AI should judge AI. Just as the latest models are surpassing humans in legal document review and computer programming, they will soon provide a solid benchmark against which any achievement can be measured.
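
As a rough illustration of what such an AI-judged benchmark might look like in code, here is a minimal Python sketch of the two-question rubric used above. It is only a sketch under stated assumptions: the prompt strings are paraphrases of my prompts, the names judge_image, parse_score and ask_model are hypothetical, and ask_model stands in for whatever call sends an image and a prompt to a judge model such as Gemini and returns its text reply. No real model API is shown.

```python
import re
from typing import Callable, Dict, Optional

# Paraphrases of the two judge prompts from the article.
# Assumption: the explicit "rating out of 10" request in the second
# prompt is added here so that the reply contains a parseable score.
JUDGE_PROMPTS: Dict[str, str] = {
    "theme": (
        "On a scale of 1 to 10, how well does this image speak to "
        "the current human condition?"
    ),
    "originality": (
        "How original is this image? Is it like an existing work of art? "
        "Give a rating out of 10."
    ),
}

# Matches ratings written as "N/10" with N from 1 to 10, e.g. "9/10".
SCORE_PATTERN = re.compile(r"\b(10|[1-9])\s*/\s*10\b")


def parse_score(reply: str) -> Optional[int]:
    """Return the first "N/10" rating found in the reply, or None."""
    match = SCORE_PATTERN.search(reply)
    return int(match.group(1)) if match else None


def judge_image(
    image_ref: str,
    ask_model: Callable[[str, str], str],
) -> Dict[str, Optional[int]]:
    """Ask the judge model both questions and collect the parsed scores.

    ask_model is a hypothetical callable: (image_ref, prompt) -> reply text.
    """
    scores: Dict[str, Optional[int]] = {}
    for name, prompt in JUDGE_PROMPTS.items():
        reply = ask_model(image_ref, prompt)
        scores[name] = parse_score(reply)
    return scores


def canned_judge(image_ref: str, prompt: str) -> str:
    """Stand-in judge that replays the ratings reported in the article."""
    if "human condition" in prompt:
        return "Overall rating: 9/10."
    return "Originality: 7/10."


if __name__ == "__main__":
    print(judge_image("beach.png", canned_judge))
```

Keeping the judge behind a plain callable means the same rubric could be run against several judge models and the scores compared, which is where a single opinion would start to become a benchmark.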
