Social media sites and language: an empirical analysis of inclusive language

Executive summary of the master's thesis of Anna Cernecca. Data Analytics for Politics, Society, and Complex Organizations (DAPS&CO) within the MA in Public and Corporate Communication, University of Milan.

Supervisors: Dr. Giulia Dotti Sani and Dr. Federico Vegetti


What is inclusive language and what attitudes do people have toward it? It is a widely discussed topic, but opinions are often confused and misleading.

In the last few years, articles and debates on social media sites about the introduction of inclusive language have also flourished in the Italian context. In light of the largely negative reactions and opinions that emerged in the media, this thesis explores opinions and attitudes toward inclusive language in the general population.

Specifically, the thesis aims to answer the following question: what attitudes do people have toward inclusive language and the symbols that have been proposed for this purpose?

To this end, I adopted a multi-method research design combining text analysis of comments on social media content and quantitative analysis of data collected through a short online questionnaire.


The socio-political importance of language is supported by many studies. For example, some scholars have analyzed the impact that variables like class, race, and gender have on language forms and uses. Of course, society and inequality are not made up exclusively of language. However, the role of power structures in determining norms around gender must be recognized. Since power is not monolithic, there are many concurrent dimensions of power. Here the focus is on the dimension of gender.

The extent to which languages mark gender, and how they do so, varies widely across the world.

There are three types of language in this regard: grammatical gender languages (like Italian), natural gender languages (like English) and genderless languages (like Turkish).

At one extreme there are languages where gender is a grammatical category: when talking about people, it is impossible not to mention gender. At the other extreme, there are languages where gender is not grammatically significant.

Even if the characteristics of grammatical gender vary from language to language, they share some features that seem to influence the mental representations of women and men.

One of these features is the double meaning of the masculine form. In many languages, the masculine form is used either specifically, when it refers exclusively to men, or generically, when it refers to both women and men or when gender is unknown or irrelevant.

In light of these reflections, a number of proposals for language reform have been made.

Inclusive language can be defined as a language that is free from words, phrases, or tones that reflect prejudiced, stereotyped, or discriminatory opinions toward certain groups of people.

The two main guidelines for a more inclusive language are gender specification and gender neutralization.

Gender specification always involves marking the reference as masculine or feminine. To make language truly inclusive, according to many scholars, it is first necessary to implement the feminization process. Only when the collective imagination has an equal perception of sexual identities could a neutralization reform be implemented. Proposals for the feminization of language suggest including both feminine and masculine forms of articles, adjectives, nouns, pronouns, and participial forms.

The other guideline towards a more inclusive language is gender neutralization. Here the focus will be on Italian, but reflections are also ongoing in other languages.

Some strategies to neutralize gender include the use of words that have the same form in the masculine and the feminine, the use of collective terms, and the use of impersonal or passive constructions.

Other proposals involve the use of symbols in place of the masculine ending. The main proposed symbols are the asterisk, the at symbol, the u, the x, and the schwa.

The schwa is a symbol of the International Phonetic Alphabet, the system used to represent the sounds of spoken languages in writing. It denotes a mid central vowel, whose sound lies in between those of the other vowels.

Social media analysis

To analyze opinions and reactions to the introduction of gender-neutralization strategies, I analyzed the comments that users left under a post by the municipality of Castelfranco Emilia.

To be more specific, on 11 April 2021 the official page of the city of Castelfranco Emilia published a post (which can be seen in the outline) stating that, in its public communications on social media, the municipality would begin to use the schwa to replace the universal masculine.

I examined users' reactions in detail, focusing on the 550 comments and replies published in response to the municipality’s post. To do so, I employed text analysis, which processes texts to extract insights and patterns. In particular, I analyzed the main topics, the prevailing sentiment, and the emotions of the comments.

Since the number of comments was limited, the sentiment and the topic of each comment were coded manually. To classify the emotions of the comments I used the NRC dictionary, which associates words with eight basic emotions: anger, fear, anticipation, trust, surprise, sadness, joy, and disgust.
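To make the dictionary-based step concrete, the following is a minimal Python sketch of how words can be matched against an emotion lexicon and tallied. It is an illustration under stated assumptions, not the procedure actually used in the thesis: the lexicon entries, the English example comments, and the function names emotion_counts and emotion_shares are all hypothetical, and a real analysis would load the actual NRC word lists for the language of the comments.

```python
import re
from collections import Counter

# Toy lexicon for illustration only: these word-emotion pairs are hypothetical
# and are NOT taken from the NRC dictionary.
TOY_LEXICON = {
    "respect": {"trust"},
    "threat": {"fear", "anger"},
    "hope": {"anticipation", "joy"},
    "loss": {"sadness"},
}


def emotion_counts(comments, lexicon):
    """Count, across all comments, how many word matches fall under each emotion."""
    counts = Counter()
    for comment in comments:
        # Lowercase and split into word tokens before looking them up.
        for token in re.findall(r"\w+", comment.lower()):
            # A word may be associated with more than one emotion.
            for emotion in lexicon.get(token, ()):
                counts[emotion] += 1
    return counts


def emotion_shares(counts):
    """Turn raw emotion counts into proportions of all matched emotion tags."""
    total = sum(counts.values())
    return {emotion: n / total for emotion, n in counts.items()} if total else {}


# Made-up example comments, not drawn from the analyzed data.
comments = [
    "A sign of respect and hope",
    "This is a threat to our language",
    "What a loss",
]
counts = emotion_counts(comments, TOY_LEXICON)
shares = emotion_shares(counts)
print(shares)
```

Words that do not appear in the lexicon are simply ignored, which is why dictionary methods of this kind describe the vocabulary used rather than the overall stance of a comment.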

Starting with sentiment analysis, the majority of the comments and replies have a negative sentiment. However, it was possible to observe (see Figure 1) that there are important differences across genders, as women were much more likely than men to post comments (as well as replies) with a positive sentiment.

Figure 1. Distribution of negative, neutral, or positive comments by gender
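A distribution like the one in Figure 1 can be obtained from manually coded comments by counting sentiment labels within each gender and dividing by the group total. The sketch below shows one way to do this in Python; the coded rows are invented for illustration and do not reproduce the thesis data, and the function name sentiment_shares_by_group is an assumption.

```python
from collections import Counter, defaultdict


def sentiment_shares_by_group(rows):
    """For each group, return the share of comments carrying each sentiment label."""
    counts = defaultdict(Counter)
    for group, sentiment in rows:
        counts[group][sentiment] += 1
    return {
        group: {label: n / sum(c.values()) for label, n in c.items()}
        for group, c in counts.items()
    }


# Hypothetical hand-coded (gender, sentiment) pairs, one per comment.
coded = [
    ("woman", "positive"), ("woman", "negative"),
    ("woman", "positive"), ("woman", "neutral"),
    ("man", "negative"), ("man", "negative"),
    ("man", "neutral"), ("man", "positive"),
]
shares = sentiment_shares_by_group(coded)
for group, dist in shares.items():
    print(group, dist)
```

Normalizing within each gender, rather than over all comments, keeps the comparison meaningful even when one group posts far more comments than the other.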

Regarding the emotions of the comments, the distribution of the words used is very similar in every case analyzed, with no large differences across genders. The most common emotions are trust, fear, sadness, and anticipation.

Lastly, the topic analysis shows that the most common topics are linguistics, freedom, gender, and inclusivity. Here, some further differences were found with regard to the gender of the commenters (see Figure 2). Men’s comments were more likely to be coded as containing linguistic content, whereas inclusivity was a more prominent topic among women than among men.

Figure 2. Distribution of topics by gender

Survey data analysis

The goal of the questionnaire was to analyze the attitudes that people have toward inclusive language and to see whether these differ among people with different socio-demographic characteristics and political orientations.

The questionnaire was approved by the University’s ethics board and was shared on several Facebook pages over a period of six months. At the end of the six-month period, 975 responses had been collected.

More than 85% of the sample had already heard about inclusive language, and more than 70% stated that inclusive language is an important issue to address.

All respondents were asked which symbol they thought was the best option to make a word neutral. Over 30% of the respondents in the sample answered that no option was good enough. However, almost 40% of the respondents found that the best option to make a word neutral was the asterisk (Figure 3), and the second most chosen symbol was the schwa (22%). Each of the other options was indicated by very few respondents.

As shown in Figure 4, when respondents were asked which symbols they actually used to make a word neutral, the most frequent were the asterisk (over 50%), the schwa (almost 30%), and the at symbol (almost 10%).

Interestingly, no socio-demographic differences emerged in exposure to the symbols, whereas differences did emerge in whether respondents used them.

Figure 3. Distribution of symbols that are considered best to neutralize language
Figure 4. Distribution of symbols that are most used to neutralize language

Limitations and future research

For both the comment analysis and the survey, the design does not allow the results to be generalized to the entire population.

For the questionnaire, this is mainly due to the choice, and practical necessity, of using a self-administered questionnaire distributed on a social media site. Further investigation with a representative sample is therefore desirable.

Regarding the analysis of the comments, it would be fruitful to conduct a subsequent analysis with a larger sample and across different social media, to further investigate variations in reactions and opinions and, crucially, whether these change over time as the social and political discourse on inclusive language evolves.


In conclusion, this research has tried to demonstrate the importance of language in people’s lives. The results show that the changes proposed to make language more inclusive still meet many limits, which can largely be attributed to their novelty. In any case, these changes can already lead to important reflections, both from the point of view of minorities and of the community as a whole.

Photo by Jason Leung on Unsplash