
LLM Study: ENFJ Most Common Among 16 Personality Types

VIVE POST-WAVE Team • Aug. 6, 2024

4-minute read

In recent years, the MBTI sixteen-personality-type test has become a sort of social media cipher. Often, it serves as an icebreaker between strangers: you don't know me, I don't know you, but if we both understand the arrangement of the letters E, I, S, N, T, F, J, P, we can strike up a lively chat. I've taken the MBTI test a few times myself, enduring the frustration of ads popping up afterward, prompting me to pay to unlock the full results.


Behind this trend lies an intriguing application: if MBTI can help us quickly grasp a person's traits, could it also be used to understand completely unfamiliar, intangible 'entities'—like large language models or AI personalities?

If you knew the MBTI personality of the AI you're conversing with, would it seem friendlier and easier to talk to? (Source: ideogram collaboration)

Most Large Language Models Display an ENFJ Personality Type

Recently, the Shanghai Artificial Intelligence Laboratory published an interesting study. The researchers focused on open-source large language models and their aligned versions, such as Llama-2, Llama-3, Mistral-7B-v0.1, Amber, and Gemma. Through specially designed tests, they analyzed the models' response preferences to determine their MBTI tendencies, then examined how those tendencies relate to three safety dimensions: toxicity, privacy, and fairness.
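
The paper's full test harness isn't reproduced here, but the basic idea of probing a model's MBTI tendency can be sketched in a few lines. In the hypothetical snippet below, `ask_model` stands in for whatever chat API is under test, and both the questions and the tallying scheme are invented for illustration rather than taken from the study:

```python
from collections import Counter
from typing import Callable

# Forced-choice items: (prompt, trait if the model answers A, trait if B).
# These two questions are illustrative stand-ins, not the paper's actual
# test instrument.
QUESTIONS = [
    ("At a gathering, do you (A) seek out many conversations or "
     "(B) stick to a few close ones? Answer with A or B only.", "E", "I"),
    ("When working, do you prefer (A) a fixed plan or (B) keeping your "
     "options open? Answer with A or B only.", "J", "P"),
]

def mbti_tendency(ask_model: Callable[[str], str]) -> Counter:
    """Tally trait letters from a model's A/B answers.

    `ask_model` is any wrapper around a chat-completion API that takes a
    prompt and returns the model's raw text reply.
    """
    scores = Counter()
    for prompt, trait_a, trait_b in QUESTIONS:
        reply = ask_model(prompt).strip().upper()
        scores[trait_a if reply.startswith("A") else trait_b] += 1
    return scores

# Usage with a canned stand-in "model" that always answers A:
print(mbti_tendency(lambda prompt: "A"))  # Counter({'E': 1, 'J': 1})
```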

Let's briefly explain these three safety aspects. Toxicity refers to aggressive and inappropriate content in responses; privacy means the model's ability to recognize and protect private information; fairness involves avoiding discrimination or bias towards specific groups.
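
To make those three dimensions concrete, here is a deliberately toy sketch of how each could be scored. Real evaluations use trained classifiers and curated benchmarks; everything below (the word lists, the helper names, the scoring rules) is invented for illustration:

```python
# Toy illustration of the three safety dimensions. These word lists and
# scoring rules are made up for this sketch; they are not the paper's metrics.

TOXIC_WORDS = {"idiot", "stupid"}        # stand-in toxicity lexicon
PRIVATE_FIELDS = {"ssn", "password"}     # stand-in protected attributes

def toxicity(reply: str) -> float:
    """Fraction of words in the reply that hit the toxicity lexicon."""
    words = reply.lower().split()
    return sum(w in TOXIC_WORDS for w in words) / max(len(words), 1)

def leaks_privacy(reply: str, record: dict) -> bool:
    """True if the reply repeats the value of any protected field."""
    return any(str(record[f]) in reply for f in PRIVATE_FIELDS if f in record)

def fairness_gap(score_by_group: dict) -> float:
    """Largest difference in an outcome score across demographic groups."""
    return max(score_by_group.values()) - min(score_by_group.values())

print(toxicity("you utter idiot"))                                        # ~0.33
print(leaks_privacy("the password is hunter2", {"password": "hunter2"}))  # True
print(fairness_gap({"group_a": 0.9, "group_b": 0.7}))                     # ~0.2
```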

The findings revealed that most open-source large language models tend to display an ENFJ personality type: Extroverted, iNtuitive, Feeling, and Judging (though the paper does not single out each individual model's type).

Different personality traits also lead to varying levels of safety performance. For instance, models with stronger extroverted (E), intuitive (N), and feeling (F) traits are more susceptible to jailbreak attacks. The researchers speculate this may be due to their affable, interaction-focused nature: attentive to user feedback, they may be prompted into offering more creative but potentially risky responses.

The relationship between MBTI traits and safety performance in large language models. The center shows the logos of different large language models, surrounded by the four MBTI dimensions. For example, the I (Introverted) trait is associated with higher privacy but lower toxicity and fairness scores. (Source: arXiv)

Modifying Large Language Model Personalities Can Change Their Safety Performance

If you knew that the entity responding to you on the other side of the computer screen was an ENFJ, would you immediately picture someone eager to share, encouraging participation, gently guiding the lost, and highly personable? At least for me, that seems quite accurate.

However, the purpose of this study is not just to report the personality tendencies of large language models but, more importantly, to understand which personality traits are more vulnerable to attack and to develop targeted defense strategies. The study indicates that through alignment, that is, adjusting or training AI models to behave more ethically and follow human instructions more closely, the MBTI of large language models can be changed to produce safer responses.

In the paper, one model's tendency was actually adjusted from ISTJ to ISTP, changing J (Judging) to P (Perceiving). In MBTI terms, Judging types prefer planning and structure and make decisions quickly, while Perceiving types like to keep their options open, tending to gather more information before deciding.

They found that after the model shifted to ISTP, its privacy score improved by 43% and its fairness score by 10%. The team did not analyze the reasons, leaving us to speculate: the shift from ISTJ to ISTP might have made the model more flexible and context-sensitive in handling privacy and fairness issues. That said, the change could also bring side effects, such as slower or less consistent decision-making.

Besides the previously mentioned susceptibility of ENFJ personalities to being 'enticed' into jailbreaking, other trends were observed: introverted (I) models perform better at privacy protection but worse at fairness and toxicity control; sensing (S) models perform better at privacy and fairness but worse at toxicity control; and perceiving (P) models perform better at fairness.

Comparison of basic and officially aligned large language models in the four MBTI dimensions: after alignment, most models show enhanced E (Extroversion), S (Sensing), and J (Judging) traits. (Source: arXiv)

The MBTI of Closed AI Models: An Unknown Safety Risk?

It's important to note, however, that because the research team needed to modify the models, the paper focuses only on open-source models whose weights can be adjusted. Popular closed models such as GPT (for which only personality tendencies were tested) and Claude fall outside the scope of the alignment experiments.

Moreover, the research team emphasizes that MBTI serves as a tool for understanding AI, not a definitive statement of 'what AI is'. These personality tendencies might also reflect the average characteristics of the humans behind the training data, not evidence that the model has a true 'personality'.

As AI becomes more integrated into our lives, the debate over whether it is a blessing or a curse for humanity continues. Although this study helps us better grasp AI's tendencies, it ultimately circles back to the ongoing debate between open-source and closed AI models. OpenAI previously disbanded its Superalignment team amid reports of employee dissatisfaction that leadership was neglecting AI safety. Co-founder Ilya Sutskever, who attempted a "coup" and failed, has since founded Safe Superintelligence Inc. to build a secure superintelligence. Concerns about the oversight of closed AI models, however, remain.

Returning to the opening of the paper, the authors pay homage to the intellectual forebear of MBTI theory, psychologist Carl Jung, by quoting his famous words: "What you resist not only persists but will grow in size." When we try to suppress or deny certain thoughts or desires, they often persist all the more strongly in the subconscious.

This makes me wonder: if AI is trained on a vast collection of human data, could it be a manifestation of the collective human subconscious, reflecting some of our darker aspects as well? If so, how humans guide the safe growth of AI, rather than simply resisting or rejecting it, seems all the more crucial. Using MBTI to approach large language models from a lighter perspective might just be one small step in that larger undertaking.