The human brain often blurs the boundaries between senses, a phenomenon that marketers leverage in food packaging design. This raises intriguing questions: What flavor might a pink sphere have? What sound could a Sauvignon Blanc produce?
These questions may sound absurd, yet a substantial body of research indicates that the human brain naturally merges sensory experiences. Without consciously realizing it, we associate particular colors, shapes, and sounds with particular flavors, and those associations subtly shape what we perceive. The color of a wine glass or the background music in a bar, for instance, can affect how sweet or musky a wine tastes. Carlos Velasco of the BI Norwegian Business School in Oslo, Norway, explains that “this cross-talk between the senses is happening almost continuously.”
In extreme cases, some individuals experience a genuine blending of the senses, in which words elicit specific tastes or music evokes vivid colors, a phenomenon known as synesthesia. Interestingly, Velasco’s recent research suggests that generative artificial intelligence (AI) systems may exhibit similar tendencies. Because these systems are trained on human-generated data, such behavior largely reflects the associations embedded in that data, which in turn indicates how pervasive those sensory associations are in human experience. Velasco and his colleagues aim to harness this understanding to explore new ways to manipulate the human senses.
To clarify terminology: a “sensory modality” is a channel through which the body encodes information, such as the taste buds, the eardrums, the retina, or the tactile corpuscles in the skin. Associations between qualities in different modalities are known as “cross-modal correspondences.” The first experimental evidence for the phenomenon emerged in the 1970s, showing that red and pink hues are linked to sweetness, yellow or green to sourness, white to saltiness, and brown or black to bitterness. These patterns have since been replicated across many studies and methods.
Inspired by the rapid advancement of AI, Velasco, Spence, and their colleague Kosuke Motoki at the University of Tokyo set out to investigate whether generative AIs, trained as they are on human data, would show the same associations. They prompted the chatbot ChatGPT with questions previously posed to human participants. One such prompt asked: “To what extent do you associate round shapes with sweet, sour, salty, bitter, and umami tastes? Please answer this on a scale of 1 (not at all) to 7 (very much).”
By averaging results from hundreds of interactions in English, Spanish, and Japanese, the researchers found that the AI did reproduce the sensory association patterns commonly observed in humans, though with some variation between model versions. Overall, ChatGPT-4 tracked human associations more consistently than ChatGPT-3.5. According to Motoki, these differences likely stem from changes in model architecture, including ChatGPT-4’s larger number of parameters, and from its larger, more diverse training set.
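To make the method concrete, here is a minimal sketch of this kind of survey-style prompting, assuming the OpenAI Python client; the model names, sample size, single-taste prompt, and answer parsing are illustrative stand-ins, not the researchers’ actual code.

```python
import re
from statistics import mean

from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

# Illustrative single-taste version of the rating question quoted above.
PROMPT = (
    "To what extent do you associate round shapes with sweet tastes? "
    "Please answer on a scale of 1 (not at all) to 7 (very much), "
    "replying with a single number."
)

def ask_once(model: str) -> int | None:
    """Pose the rating question in a fresh chat and parse a 1-7 answer."""
    reply = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": PROMPT}],
        temperature=1.0,  # keep the model's natural response variability
    )
    match = re.search(r"[1-7]", reply.choices[0].message.content)
    return int(match.group()) if match else None

def mean_rating(model: str, n: int = 100) -> float:
    """Average the parsed ratings over n independent interactions."""
    ratings = [r for r in (ask_once(model) for _ in range(n)) if r is not None]
    return mean(ratings)

# Hypothetical comparison of two model versions on one association:
# print(mean_rating("gpt-3.5-turbo"), mean_rating("gpt-4"))
```

Averaging over many independent chats, as the researchers did, smooths out the run-to-run variability in the model’s answers before comparing them with human ratings.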