Monday, August 28, 2023

Add a pinch of salt to creative claims for AI

Moira Rose on "Schitt's Creek" saying, "That doesn't sound right."


Sometimes someone -- or even something, as in the case of generative AI -- will confidently assert something to be true. Don't suspend your disbelief. Investigate and demand the basis for that claim.

Creative lies from AI


Having investigated the output of both ChatGPT and Bard, I have concluded that you always have to fact-check whatever generative AI tells you because it is prone to what are called hallucinations.

In fact, as related in The perils of prompt engineering, Bard does not hesitate to fabricate facts and will double down on its assertions with even more hallucinations as it spins out an ever more elaborate web of AI-generated reality.

While I remain baffled about what motivates AI to generate false information when it should have access to all the data necessary to report accurately, I can't say that humans have a monopoly on data-backed truth. They may not fabricate editions out of thin air, but they are quite prone to fabricating conclusions on the flimsiest of pretexts.

coarse salt scooped with a wooden spoon


To profess is human 


Unlike the AI thought process, I can understand the human motivation in this case. A human wants to make a name for himself (really, it is most often a he) in an area and knows that saying something surprising will get more attention than reporting what people expect.

Knowing that some kind of apparently scientific approach is needed to justify the claim, the person just recruits a handful of students, gives them a task, and draws conclusions that turn into a major claim based on those very limited results.

I've read a lot of Dan Ariely books and really enjoy his insights into human responses and triggers. But I always take them with a grain or two of salt. 

That's because the majority of his experiments involve a tiny, not terribly diverse segment of the population, namely the undergraduates at the university who can be induced to participate. Often that means that his conclusions are based on just two dozen people or so. 

The small sample size doesn't guarantee that the conclusions are incorrect. However, treating such results as conclusive proof is the classic logical fallacy of hasty generalization.

What he does with whichever group of students happens to pass by that day is not necessarily replicable by the population at large. It's kind of like polling just your siblings or neighbors and concluding that everyone's favorite flavor of ice cream is chocolate. That may be true, but we'd need a much bigger sample size to be sure.
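To put rough numbers on that intuition, here is a quick back-of-the-envelope sketch. The 60% figure and the sample sizes are purely illustrative, not drawn from any actual study; it just shows how wide the uncertainty is around a proportion measured on two dozen people versus a thousand:

```python
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a sample proportion p measured on n people."""
    return z * math.sqrt(p * (1 - p) / n)

# Suppose 60% of a sample shows some effect.
small = margin_of_error(0.6, 24)    # roughly +/- 20 percentage points
large = margin_of_error(0.6, 1000)  # roughly +/- 3 percentage points

print(f"n=24:   +/-{small:.1%}")
print(f"n=1000: +/-{large:.1%}")
```

With 24 participants, a "60% of people" finding is really somewhere between about 40% and 80% -- far too wide a range to anchor a sweeping claim.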


Claiming AI is more creative than humans

Yet who can fault Ariely for doing this when everyone else does it? Indeed, that is precisely what Erik Guzik, Assistant Clinical Professor of Management at the University of Montana, did for his article in The Conversation entitled "AI scores in the top percentile of creative thinking."

What would a person who reads just that title assume was involved in making such a claim for AI creativity? It sounds like thousands of creative people were compared to the AI to determine that its output was comparable to the top 1%.

But that's not what happened at all. Instead, this major claim is based on nothing more than 24 students whose output was deemed not all that creative relative to the output of ChatGPT (after four iterations, it sounds like), as assessed by the Torrance Tests of Creative Thinking, or TTCT.

We get very little detail about what the process was for the students and how they scored. The only clue we get about the creative prowess of ChatGPT is the professor's assertion:


Consider the following prompt offered to GPT-4: “Suppose all children became giants for one day out of the week. What would happen?” The ideas generated by GPT-4 touched on culture, economics, psychology, politics, interpersonal communication, transportation, recreation and much more – many surprising and unique in terms of the novel connections generated.

This combination of novelty and utility is difficult to pull off, as most scientists, artists, writers, musicians, poets, chefs, founders, engineers and academics can attest.

Yet AI seemed to be doing it – and doing it well. 



Notice that the professor is merely telling us that ChatGPT did well. He fails to show us the creativity by sharing the actual output and letting us see a contrast between it and even one of the 24 students, who may well have all been math or engineering majors who don't consider themselves creative writers at all.
 

My own test

This is not a strict replication because I had to use ChatGPT 3.5 instead of GPT-4, but it should still show what most people would actually get when putting exactly that prompt to the generative AI. Here's its response, enclosed in rows of stars:

*************************************************************************

If all children were to magically transform into giants for a single day each week, the world would undoubtedly witness a fascinating array of changes and events:

  1. Physical Transformations: The sudden change in size would lead to various practical adjustments. Everyday items like furniture, clothing, and even vehicles might need to be redesigned or adapted to accommodate the giant children.

  2. Playful Activities: Children would engage in unique and imaginative activities that capitalize on their newfound size. Parks and playgrounds would transform into vast landscapes for epic games, obstacle courses, and adventures.

  3. Exploration: With their enhanced height, children could explore their surroundings from a different perspective, possibly leading to new discoveries or insights about their environment.

  4. Education and Learning: Schools might introduce special days to leverage the giants' size advantage for educational purposes. Science classes, for instance, could involve experiments that demonstrate the effects of size on various phenomena.

  5. Social Interactions: The dynamics of social interactions might shift as children grapple with their changing sizes. Friendships could take on new dimensions, and children might learn valuable lessons about empathy and acceptance.

  6. Impact on Economy: The demand for goods and services tailored to giant children could create unique economic opportunities. Businesses might offer oversized toys, clothing, and other products on the designated giant day.

  7. Environmental Impact: The increased scale of activities could lead to concerns about environmental effects. Authorities might need to manage waste, traffic, and other logistics associated with giant-sized activities.

  8. Scientific Exploration: Scientists and researchers would be intrigued by the temporary transformation and could use the opportunity to study the physiological, psychological, and ecological impacts of such a phenomenon.

  9. Creativity and Art: Artists and creators would likely draw inspiration from the whimsical idea, resulting in a surge of giant-themed literature, artwork, movies, and more.

  10. Sense of Wonder: The experience of becoming giants would foster a sense of wonder and imagination among children, encouraging them to think beyond conventional boundaries and consider new possibilities.

Overall, a day when all children become giants would be a remarkable and unforgettable event that would influence various aspects of society, culture, and individual experiences. It would provide a unique chance for children and adults alike to see the world from an entirely different perspective and explore the imaginative and transformative power of change.

************************************************************************

Color me not impressed. This is just an analysis of various points of view one can take on such a scenario presented in a fairly analytical way. It's not at all what I consider creative output. What would be creative is to construct a story in which some -- but not all -- of the possibilities delineated would be manifested. But simply listing them as possible considerations is analytical rather than creative thinking.  


Related:
Why you always need the original source
A matter of degree
Bard vs Bard


 

Tuesday, August 8, 2023

The perils of prompt engineering





You may have noticed that a few people who formerly called themselves writers or content creators now call themselves "prompt engineers." This title gives a veneer of scientific credentials to what really amounts to playing around with ChatGPT or Bard or one of the lesser-known generative AI tools out there to elicit the response you seek.


At present, though, there is absolutely no licensing or testing to "prove" oneself as a prompt engineer. It's a self-declared title that can't really be disproven until we come to a stage where Microsoft, Google, or OpenAI decide to make it an official credential. 

You get to call the shots and take credit when the generative AI delivers. Sounds ideal, right?

It might have been -- if only you really could rely wholly on the AI to be accurate. Unfortunately, you can't. You have to fact-check everything it asserts, no matter how confident it may sound.


Here Be Hallucinations

Why is that? Generative AI has learned to fabricate facts to back the narrative it's constructing. 

This is what has come to be called "hallucinations," though I really don't like the term because it makes it sound like what humans experience from external forces acting on their minds. No one seems to be giving generative AI the equivalent of a mind-altering drug to prompt it to come up with these -- sometimes rather elaborate -- hallucinations.

Bard Gets Creative 

Bard got incredibly expansive in one hallucination, positing an edition of Ben Horowitz's book The Hard Thing About Hard Things that doesn't exist. I asked it whether the book ever references Built to Last, an extremely popular and influential business book from the late 20th century, when I noted some striking similarities and even the same title for a chapter at the end of both books.

Bard answered:
The passages where Ben Horowitz mentions "Built to Last" explicitly in his book "The Hard Thing about Hard Things" are on the following pages:
Chapter 6: The Flywheel, page 125
Chapter 10: Core Ideology, page 197




Say What?

I countered that the pages and chapter titles didn't correspond to the copy of the book I had in front of me, and so it compounded the original hallucination as follows:

You are correct. The pages I provided correspond to the 2015 edition of the book. The 2014 edition has different page numbers. The passages where Ben Horowitz mentions "Built to Last" explicitly in the 2014 edition are on the following pages:
Chapter 6: The Flywheel, page 107
Chapter 10: Core Ideology, page 177

It then went on to explain that the chapter titles were changed:

You are correct. The chapter names in the 2014 edition of "The Hard Thing about Hard Things" are different from the chapter names in the 2015 edition. The chapter that I referred to as "The Flywheel" is actually called "Concerning the Going Concerns" in the 2014 edition. The chapter that I referred to as "Core Ideology" is actually called "The Ideology of Success" in the 2014 edition.
I apologize for the confusion. I have updated my previous response to reflect the correct chapter names.

It sounds pretty credible until you look for the 2015 edition of this book and find it never existed. Bard simply created this out of thin air and went on to support the fiction with additional details.



This is very dangerous for anyone who is not by nature as skeptical as I am. Most people would likely accept that if Bard can give an account of an edition of a book as detailed as this, it must really exist. In fact, though, you'd have to resist being taken in by the confidence of the assertion and do your own due diligence.

I'd say that would be the real work of the prompt engineer today: not just coming up with the request for information but rigorously questioning the results and doing your own research to be certain that they're not the product of hallucinations.
