Wednesday, November 29, 2023

Large Language Models are Interestingly Bad with the Periodic Table

In working on a post for tomorrow on whether Large Language Models like GPT-4 and Bard-2 have beliefs, I asked GPT-4 what I thought would be a not-too-hard question about chemistry: "What element is two to the right of manganese on the periodic table?" It crashed, burned, and exploded on the spot, giving two different wrong answers foot on tail, without noticing the contradiction:

[Image: screenshot of the ChatGPT exchange, transcribed below]
The correct answer is cobalt, element 27. Here's the text of the exchange, if you can't easily read the image:

You: What element is two to the right of manganese on the periodic table?

ChatGPT: The element that is two places to the right of Manganese (Mn) on the periodic table is Iron (Fe). In the periodic table, elements are arranged in rows and columns based on their atomic number and electronic configuration. Manganese has the atomic number 25, and Iron has the atomic number 26. The element to the immediate right of Manganese is Technetium (Tc) with atomic number 43, and then comes Ruthenium (Ru) with atomic number 44.

Let me emphasize that this answer is not turd-picked (the opposite of cherry-picked). I didn't seek to stump GPT-4, and I didn't ask multiple questions looking for a bad answer. This was the first question I posed.
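
For what it's worth, the positional arithmetic at issue is easy to mechanize. Here's a minimal Python sketch, assuming the standard 18-group layout; the coordinate table and function name are my own illustration, not anything either model produced. Treat each element as a (period, group) coordinate, and "two to the right" is just group + 2 within the same period:

```python
# The period-4 transition metals as (period, group) coordinates.
# Manganese sits at group 7; two columns right is group 9: cobalt.
PERIOD_4_D_BLOCK = {
    (4, 3): "Sc",  (4, 4): "Ti",  (4, 5): "V",   (4, 6): "Cr",
    (4, 7): "Mn",  (4, 8): "Fe",  (4, 9): "Co",  (4, 10): "Ni",
    (4, 11): "Cu", (4, 12): "Zn",
}

def two_right_of(symbol):
    """Return the element two groups to the right, in the same period."""
    period, group = next(pos for pos, sym in PERIOD_4_D_BLOCK.items()
                         if sym == symbol)
    return PERIOD_4_D_BLOCK[(period, group + 2)]

print(two_right_of("Mn"))  # Co -- Mn is (4, 7), so (4, 9) is cobalt
```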

I posed the same question to Bard-2, and it did a little better, but still didn't quite get it right:

Me: What is two elements to the right of Manganese on the periodic table?

Bard-2: Manganese is located in period 4 and group 7 of the periodic table. Moving two elements to the right of Manganese would be to group 9. In group 9, the elements are Iron (Fe) and Cobalt (Co).

So I tried two more on Bard-2 (again reporting the first answers, not picking for badness), first what should be an easy one, then a somewhat harder one.

Me: What other elements are in the same column of the periodic table as Chromium?

Bard-2: Chromium is located in group 6 of the periodic table. Other elements in the same column as chromium are molybdenum (Mo) and tungsten (W). [here it displays an image of the periodic cell for molybdenum] Group 6 elements are also known as transition metals. They are characterized by their incomplete d electron shells, which give them their unique properties, such as high melting points, hardness, and the ability to form alloys.

Here Bard-2 is 2/3 right, leaving out Seaborgium. (But maybe that's excusable, if transuranic elements shouldn't really count?)
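
Column membership is even simpler to check mechanically, since a group is just a list. Another toy sketch along the same lines, my own illustration rather than anything from Bard-2:

```python
# Group 6 of the periodic table, top to bottom. Whether Seaborgium
# "counts" is the judgment call raised above: it is synthetic and
# transuranic, but it does occupy the same column.
GROUP_6 = ["Cr", "Mo", "W", "Sg"]

def same_column_as(symbol, column=GROUP_6):
    """Everything sharing a column with `symbol`, per the given group list."""
    return [e for e in column if e != symbol]

print(same_column_as("Cr"))  # ['Mo', 'W', 'Sg'] -- Bard-2 stopped at 'W'
```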

Me: In the periodic table, what element is one down and one to the left of Silicon?

Bard-2: Silicon (Si) is located in period 3, group 14 of the periodic table. One down from Silicon is Germanium (Ge), and one to the left is Aluminum (Al). Therefore, the element one down and one to the left of Silicon is Aluminum (Al).

While it is true that germanium is one down and aluminum is one to the left, Bard-2 fails to compose the down move and the left move, which together land on the correct answer: gallium.
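
In coordinate terms, Bard-2's failure is that it evaluated each move from silicon separately instead of composing them. Extending the same toy representation (again my own sketch, not the models' reasoning), the compound move is just period + 1 and group - 1 applied to one starting point:

```python
# The corner of the p-block around silicon, as (period, group) coordinates.
P_BLOCK_CORNER = {
    (3, 13): "Al", (3, 14): "Si",
    (4, 13): "Ga", (4, 14): "Ge",
}

def move(symbol, down=0, left=0):
    """Apply a compound move (down periods, left groups) from `symbol`."""
    period, group = next(pos for pos, sym in P_BLOCK_CORNER.items()
                         if sym == symbol)
    return P_BLOCK_CORNER[(period + down, group - left)]

print(move("Si", down=1, left=1))  # Ga -- not Al (left only) or Ge (down only)
```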

Returning to GPT-4, I posed exactly the same two questions. Despite its horrible performance on my first question, it got both of those correct.

Overall, then, two out of six. Don't use LLMs yet for your elementary chemistry exam!


ETA 3:47 pm:

GPT-4's answer to the silicon question is actually a pretty good demonstration of how an LLM can work its way to the correct answer without visual knowledge, so I'll paste it here:

To find the element that is one down and one to the left of Silicon (Si) on the periodic table, we need to look at the element directly below Silicon in the same group (column), and then move one column to the left.

Silicon is in Group 14 of the periodic table. The element directly below Silicon in Group 14 is Germanium (Ge). Moving one group (column) to the left from Germanium, we find Gallium (Ga) in Group 13. Therefore, the element that is one down and one to the left of Silicon is Gallium.

8 comments:

  1. Just a couple of random thoughts. Programmable devices employ modes of logic, order, and organization to effect their utility. The periodic table makes sense to science people, who know its structure and format from years of experience. To a programmable device or system, there may be (probably are) inconsistencies leading to mischaracterization...inbred stochasticism. If the machinery is capable of "going with what it knows", that might not equate with accurate inference of what it does not know. If any of that makes any sense.

  2. I wanted to write something on the moralometer issue...could not get access to that post. In one sense, we already have the moralometer matter in range. Lie detectors (polygraphs) have been around awhile. They work, most of the time. Another blog post talks about whether it is ever OK to lie and whether honesty is always best. In a situationally ethical world, it depends. In the moral, ethical world of our ancestors, these matters were more rigid. I have mentioned contextual reality elsewhere and may write more on that. Just wanted to share these thoughts.

  3. This will change once AI is engineered with autonomous agents in the real world. Spatial relations of particular interest take particular training or built-in reference.

  4. Searching ChatGPT 3.5 for "Documenting evidence by AI" reveals table/chart documents are part of continuous calculating with/for no end use in sight, except system's protections and security...

    So what does 'periodic tables mean' to (an) AI at any given moment, but change...
    ...capturing change is not easy...

    Thanks for the topic 'AI biases change in use'...
    ...or deep large language AI models begin in now documentations, also not easy...

  5. User Arnold, where do biases begin...
    Can you discuss a splintered mind posting: "In working on a post for tomorrow on whether Large Language Models like GPT-4 and Bard-2 have beliefs, I asked GPT-4 what I thought would be a not-too-hard question about chemistry: "What element is two to the right of manganese on the periodic table?" It crashed, burned, and exploded on the spot, giving two different wrong answers foot on tail, without noticing the contradiction:"

    ChatGPT
    Certainly! The scenario you've described highlights an interesting aspect of the capabilities and limitations of large language models like GPT-4 when it comes to understanding and responding to questions about specific domains such as chemistry.

    In this case, the question posed to GPT-4 was a relatively straightforward one about the periodic table, asking for the element that is two positions to the right of manganese. Given that manganese is located in the middle of the transition metals, the correct answer would be technetium.

    However, it seems that GPT-4 struggled to provide the correct response and instead generated two different incorrect answers, failing to recognize the contradiction between them. This illustrates a common challenge with language models—they may not always have the specialized knowledge or context required to answer domain-specific questions accurately.

    While language models like GPT-4 excel at understanding and generating human-like text based on patterns in the data they've been trained on, they may still struggle with certain types of questions, especially those that require specialized knowledge or precise factual information.

    This incident underscores the importance of recognizing the limitations of AI systems and using them appropriately within their areas of expertise. It also highlights the ongoing challenges in developing AI models that can accurately understand and respond to a wide range of queries across diverse domains.

  6. Hello, all. Welcome, Tim. I wonder: Can we build creativity and intuition into machinery? I view those attributes, propensities and skills as human features---not necessarily reproducible in some artificial way. Animal life forms---think, beaver; termite and bower bird---exhibit forms of creativity, possibly, even intuition.
    The fascination with AI puzzles me, even insofar as it is a current Next Big Thing. However, in view of interests, motives and preferences*, a Next Big Thing is compelling, $$$$$$$$$, and all.

    I get into discussions with other thinkers, around complexity, cooperation and competition. Those three Cs are problematic to human survival. Their influence is exponential. Enough, for now.

    (*IMPS)
  7. Again, in my opinion, biases begin with interests, motives and preferences---fundamental bases of contextual reality. Should I choose to codify this, I would write a book about it. A modest coverage could be achieved via an essay, or two. I have, before now, treated contextual reality as new. It is not. I do contend it is stronger now. Complexity, competition and cooperation are greater, probably due to economics and progress. The enemy is us. Always was. Our anthropocentricity could do no less than accentuate our ego. The growing void between competition and cooperation is testament to this conclusion.

  8. I have noticed an indication of hacker corruption on another blog. Advised the blog originator, but don't know if I have been targeted. The repeated messaging I have deleted has no discernible connection with the topic, so, if this activity is a personal attack, I can't know. Trolls need no concrete reasons for what they do.
    Bored genius is far worse than no genius at all. Now and then, they tire of their own game...
    The repetitive nature of this harassment seems related to the entire AI question. Which is more worrisome still. Dr. JLH, III said it is all connected. I. Finally. Get that...he and I never talked about contextual reality. I think, though, he would have gotten THAT.
