Detecting Politeness Features in Text



Politeness is a universal dimension of human communication (Lakoff, 1973; Brown & Levinson, 1987). In practically all settings, a speaker can choose to be more or less polite to their audience. In this package, we provide tools to measure the markers and effects of politeness in natural language.


Motivation for package

Politeness as a construct is universal, and researchers in many branches of social science might reasonably want to measure a construct like politeness in language and then compare it to some covariate of interest. For example, they might want to know whether some treatment caused people to speak more or less politely. Or they might want to know whether speakers are more polite to some listeners than to others, perhaps based on the listeners’ gender, race, or status.

In these cases it would be helpful to have a model that could condense text into a single “politeness score” that could, say, be entered into a regression. However, the causes and consequences of politeness are often situationally specific. Furthermore, the particular set of linguistic markers that define politeness can also vary from context to context. Thus, it would be difficult (or misleading) to estimate a single model as some universal “politeness classifier” for all situations.

Instead, we offer tools for a workflow that we believe will be useful to most researchers interested in linguistic politeness. First, we offer a tool, politeness, that calculates a set of linguistic features that have been identified in past research as relating to politeness. Second, we offer a tool, politenessPlot, to visualize these counts in comparison to a binary covariate of interest (e.g. high/low status).

If the researcher wants to generate a politeness classifier, they can do so using the politenessProjection function, which creates a single mapping from the politeness features in the supplied text to the covariate of interest. This can then be used to predict the covariate itself in held-out texts. In particular, if the researcher has some “ground truth” hand-generated labels of politeness over a set of texts, they can use this function as a politeness classifier, and automatically assign politeness scores to many more new texts.

Politeness Features

The main value of this package is in the politeness function, which counts the use of 36 different politeness features in natural language. Here, we borrow directly from existing research on the computational linguistics of politeness (Danescu-Niculescu-Mizil et al., 2013; Voigt et al., 2017). These features are summarized in the table below, along with examples of each.

Table 1: Politeness Features
Feature Name | POS Tags | Description | Example
Hello | No | “hi”, “hello”, “hey” | “Hi, how are you today?”
Goodbye | No | “goodbye”, “bye”, “see you later” | “That’s my best offer. Bye!”
Please Start | Yes | Please to start sentence | “Please let me know if that works”
Please | Both | Please mid-sentence | “Let me know if that works, please”
Gratitude | Both | “thank you”, “i appreciate”, etc. | “Thanks for your interest”
Apologies | Both | “sorry”, “oops”, “excuse me”, etc. | “I’m sorry for being so blunt”
Formal Title | No | “sir”, “madam”, “mister”, etc. | “Sir, that is quite an offer.”
Informal Title | No | “buddy”, “chief”, “boss”, etc. | “Dude, that is quite an offer.”
Swearing | No | Vulgarity of all sorts | “The dang price is too high”
Subjunctive | No | Indirect request | “Could you lower the price?”
Indicative | No | Direct request | “Can you lower the price?”
Bare Command | Yes | Unconjugated verb to start sentence | “Lower the price for me”
Let Me Know | No | “let me know” | “Let me know if that works”
Affirmation | Yes | Direct agreement at start of sentence | “Cool, that works for me”
Conjunction Start | Yes | Begin sentence with conjunction | “And if that works for you”
Reasoning | No | Explicit reference to reasons | “I want to explain my offer price”
Reassurance | No | Minimizing other’s problems | “Don’t worry, we’re still on track”
Ask Agency | No | Request an action for self | “Let me step back for a minute”
Give Agency | No | Suggest an action for other | “I want to let you come out ahead”
Hedges | No | Indicators of uncertainty | “I might take the deal”
Actually | Both | Indicators of certainty | “This is definitely a good idea.”
Positive | No | Positive emotion words | “that is a great deal”
Negative | No | Negative emotion words | “that is a bad deal”
Negation | No | Contradiction words | “This cannot be your best offer”
Questions | No | Question words to start sentence | “Why did you settle on that value?”
By The Way | No | “by the way” | “By the way, my old offer stands”
Adverbial Just | Yes | Modifying a quantity with “just” | “It is just enough to be worth it”
Filler Pause | No | Filler words and verbal pauses | “That would be, um, fine”
For Me | No | “for me” | “It would be great for me”
For You | No | “for you” | “It would be great for you”
Group Identity | No | First-person plural pronouns | “it’s a good deal for both of us”
First Person | Both | First-person singular mid-sentence | “It would benefit me, as well”
Second Person | Both | Second person mid-sentence | “It would benefit you, as well”
First Person Start | Yes | First-person singular to start sentence | “I would take that deal”
Second Person Start | Yes | Second-person to start sentence | “You should take that deal”
Impersonal Pronoun | No | Non-person referents | “That is a deal”

Part of Speech Tagging

Many of the politeness features contained in this package rely on part-of-speech tagging. We have prioritized compatibility with the spaCy library. This software is simple to install through Python and has a convenient wrapper for use from R, spacyr.

Users must install spaCy outside of R and take note of their Python path (e.g. “/anaconda/bin/python”). At the start of each R (or RStudio) session, you must initialize the spaCy engine so that it is ready for use during the session. That is done using the following code (make sure to substitute your own Python path):

# install.packages("spacyr")
spacyr::spacy_initialize(python_executable = "PYTHON_PATH")

Many of the politeness features can be run without part-of-speech tagging by setting parser="none". We recommend this as a first step for researchers who may be new to Python, so that they can get up and running without large startup costs. Without part-of-speech tags, some features are dropped entirely (e.g. Bare Command, which depends on identifying a specific verb class), while others are approximated. For example, tags allow users to differentiate between a mid-sentence “please” and a “please” that starts a sentence, while the tagless version of the function collapses them into a single feature.

Data: phone_offers

We have included an example dataset, phone_offers, for researchers to get a handle on the workflow. These data were collected from Mechanical Turk workers, who were given a Craigslist advertisement for a used iPhone and were told to write a message to the seller offering to buy it at a specific price that was less than what was posted. In essence, they were opening a negotiation over the sale price of the phone. Naturally, this is a domain where politeness might have effects on important outcomes.

Half the participants were told to use a “warm and friendly” communication style, while the other half were told to use a “tough and firm” communication style. In this research, the politeness detector was used to define the construct - that is, what were the linguistic differences between messages from these two conditions? This software served two goals for that research. First, it provided a simple description of the differences to understand the data better (i.e. “what are the features that differ?”). Second, it allowed for basic estimation of a politeness detector that could be applied in-context to quantify politeness in other negotiations.

Detecting politeness features

The function politeness() takes in an n-length vector of texts, and returns an n-by-f data.frame, with f columns corresponding to the number of calculated features. There are 36 features in total, but some user options affect the number of features that are returned. For one, if a part-of-speech tagger is not used (by setting parser = "none") then some features that depend on these tags will not be calculated. Additionally, you may use the default setting drop_blank = TRUE which will remove features that were not detected in any of the texts.

The cells are calculated in one of two ways. Setting binary = FALSE will populate each cell with the raw count of each feature in the text. Alternatively, binary = TRUE will return a binary indicator. For many features these are essentially equivalent, since they are typically present only once, or not at all (e.g. “hello”, “goodbye”). However, other features can be used many times in a single communication (e.g. positive words).

df_politeness_count <- politeness(phone_offers$message, binary=FALSE)
Note: Some features cannot be computed without part-of-speech tagging. See ?spacyr::spacyr for details.
   Hedges Positive.Emotion Negative.Emotion Impersonal.Pronoun Negation
20      0                4                0                  4        0
21      0                2                0                  4        0
22      0                2                0                  1        0
23      0                2                1                  3        1
24      0                1                0                  1        0
25      1               14                1                 10        2
26      0                3                0                  3        0
27      0                1                0                  2        0
28      0                0                0                  0        0
29      0                4                0                  1        0
30      0                2                0                  2        1
df_politeness <- politeness(phone_offers$message, binary=TRUE)
Note: Some features cannot be computed without part-of-speech tagging. See ?spacyr::spacyr for details.
   Hedges Positive.Emotion Negative.Emotion Impersonal.Pronoun Negation
20      0                1                0                  1        0
21      0                1                0                  1        0
22      0                1                0                  1        0
23      0                1                1                  1        1
24      0                1                0                  1        0
25      1                1                1                  1        1
26      0                1                0                  1        0
27      0                1                0                  1        0
28      0                0                0                  0        0
29      0                1                0                  1        0
30      0                1                0                  1        1
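
The relationship between the two settings can be sketched in base R. The toy counts below are invented for illustration, and the conversion assumes that the binary indicator simply marks whether a feature appears at least once; it is not the package's internal code.

```r
# Hypothetical count output for three texts (values invented for illustration).
toy_counts <- data.frame(
  Hedges           = c(0, 1, 0),
  Positive.Emotion = c(4, 14, 0),
  Negation         = c(0, 2, 1)
)

# Assumed relationship: the binary indicator is 1 whenever the count is above zero.
toy_binary <- as.data.frame((toy_counts > 0) * 1)

print(toy_binary)
```

Features such as Positive.Emotion lose the most information in this conversion, since a text with fourteen positive words and a text with one are treated identically.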

Plotting politeness features

The best way to inspect the results of the main politeness function is to plot the counts using politenessPlot(). This function produces a ggplot2 object that compares how the counts of every politeness feature differ across a binary covariate of interest (e.g. how often each feature is used in the treatment condition versus the control). The covariate vector must itself be binary (0/1) but users can adjust the labels on the graph as necessary.

The order of the features is determined by calculating the variance-weighted log odds of each feature with respect to the binary covariate. The plot sorts features automatically so that the most distinctive are towards the top and bottom of the plot. The function can handle either binary or count data, but that distinction must be made in the initial call to politeness().

Often some features are not meaningful for further analysis - either because they are too rare in the data, or else because they do not meaningfully covary with the covariate of interest. Users have two options to exclude these from the plot. First, the drop_blank parameter can remove rare features - it takes a number between 0 and 1, which determines a cut-off based on prevalence. Specifically, all features which appear in less than this proportion of texts are excluded from the plot. To include all features, leave this value at 0. Second, the middle_out parameter can remove features which do not vary meaningfully across the covariate - it takes a number between 0 and 1, which determines a cut-off based on distinctiveness. Each feature is evaluated using a t.test, and features are removed when the p-value of this test lies above the user’s cut-off. To include all features, simply set this value at 1.

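politeness::politenessPlot(df_politeness,
                           # assumes the 0/1 condition is stored in phone_offers$condition
                           split = phone_offers$condition,
                           split_levels = c("Tough","Warm"),
                           split_name = "Condition")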
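
The two exclusion rules described above can be sketched in base R. This is only an illustration of the stated logic on invented data, not the code inside politenessPlot(); the names min_prevalence and max_p_value are hypothetical stand-ins for the drop_blank and middle_out cut-offs.

```r
# Toy binary feature table: 8 texts, 3 features (values invented for illustration).
features <- data.frame(
  Gratitude = c(1, 1, 1, 1, 0, 0, 0, 1),
  Hello     = c(1, 0, 1, 0, 1, 0, 1, 0),
  Swearing  = c(0, 0, 0, 0, 0, 0, 0, 1)
)
covar <- c(1, 1, 1, 1, 0, 0, 0, 0)  # binary covariate, e.g. warm (1) vs tough (0)

min_prevalence <- 0.25  # hypothetical stand-in for drop_blank
max_p_value    <- 0.5   # hypothetical stand-in for middle_out

# Rule 1: keep features that appear in at least this proportion of texts.
prevalence  <- colMeans(features > 0)
keep_common <- prevalence >= min_prevalence

# Rule 2: keep features whose t-test p-value does not lie above the cut-off.
p_values <- sapply(features, function(x) {
  t.test(x[covar == 1], x[covar == 0])$p.value
})
keep_distinct <- p_values <= max_p_value

kept <- names(features)[keep_common & keep_distinct]
print(kept)
```

In this toy example Swearing fails the prevalence rule (it appears in one text out of eight) and Hello fails the distinctiveness rule (it is equally common in both groups), so only Gratitude would remain on the plot.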

Projecting politeness features


Users can generate a politeness classifier with the politenessProjection function. This creates a single mapping from the politeness features in the supplied text to the covariate of interest. This can then be used to predict the covariate itself in held-out texts. In particular, if the user has some “ground truth” hand-generated labels of politeness over a set of texts, they can use this function as a politeness classifier, and automatically assign politeness scores to many more new texts.

This function is a wrapper around supervised learning algorithms. The default uses glmnet, the vanilla LASSO implementation in R. This should be familiar to most users already. We also allow users to use a different algorithm, textir, which implements a massively multinomial inverse regression. Intuitively, this model represents a more realistic causal structure to text-plus-covariate data - that is, the covariate typically has a causal effect on the words used by the speaker, rather than the other way around.

Both packages have their merits, though for now we recommend using glmnet to start, especially if it is familiar. Both algorithms are computationally efficient, and can be improved by registering a parallel back-end (see package details) and indicating in cluster.

In addition to the phone_offers dataset, we have also included a smaller bowl_offers dataset. Participants in this study were given similar instructions (i.e. communicate in a warm or tough style) but for a different negotiation exercise. We use the phone_offers dataset to define the construct of interest, and confirm that the treatment had similar effects in bowl_offers by using it as held-out data in politenessProjection(). The results confirm that the manipulation had similar consequences on the participants’ politeness.

df_polite_train <- politeness(phone_offers$message, drop_blank=FALSE)
Note: Some features cannot be computed without part-of-speech tagging. See ?spacyr::spacyr for details.
df_polite_holdout<-politeness(bowl_offers$message, drop_blank=FALSE)
Note: Some features cannot be computed without part-of-speech tagging. See ?spacyr::spacyr for details.

[1] 0.5478093
[1] 0.213664
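
The train-then-project logic can be sketched without the package. The example below is an assumption-laden illustration: it uses an unpenalized logistic regression from base R (glm) as a stand-in for the LASSO that the package wraps, and the feature counts and labels are invented rather than taken from phone_offers or bowl_offers.

```r
# Hypothetical training features (counts) with a binary condition label.
train <- data.frame(
  Gratitude = c(2, 1, 1, 0, 2, 0, 1, 1, 0, 0),
  Negation  = c(0, 0, 1, 1, 1, 1, 2, 0, 1, 2),
  warm      = c(1, 1, 1, 1, 1, 0, 0, 0, 0, 0)
)

# Hypothetical held-out texts from a different task, with known conditions.
holdout <- data.frame(
  Gratitude = c(3, 2, 0, 1),
  Negation  = c(0, 1, 3, 2),
  warm      = c(1, 1, 0, 0)
)

# Stand-in classifier: plain logistic regression instead of the LASSO.
fit <- glm(warm ~ Gratitude + Negation, data = train, family = binomial())

# Project the held-out texts onto the learned mapping (predicted probabilities).
holdout_proj <- predict(fit, newdata = holdout, type = "response")

# Compare average projections across the held-out conditions.
mean_warm  <- mean(holdout_proj[holdout$warm == 1])
mean_tough <- mean(holdout_proj[holdout$warm == 0])
print(c(warm = mean_warm, tough = mean_tough))
```

A higher average projection in the held-out warm condition than in the tough condition is the kind of evidence that the mapping learned in one context carries over to another.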

Reading texts that are high or low in politeness


The projected quantities from the politenessProjection function can be used in other analyses, but users should first be curious about examples of texts that best represent the extremes of that projection (i.e. the most or least polite texts). The findPoliteTexts function replicates the analyses of politenessProjection but instead returns a selection of the texts that are the most extreme (i.e. high, or low, or both) along the projected dimension. The parameters type and num_docs allow users to specify the type and number of texts that are returned.

[1] "Most Polite"
[1] "Hello,   Oh my goodness, I'm so excited to see your listing. The phone is EXACTLY what I have been searching for! And I can tell from your description and the photo that it's just perfect, too. I can almost hear it ringing in my pocket right now! What's that? Hello? So, happy to hear from you. Sorry, sorry. I'm getting ahead of myself. Soooo, I was wondering if there was any way that you might consider taking a little bit less for this phone? It's absolutely everything I've been looking for but my bosses <grrrrr!!!> will only give us $115 to get this phone. I know! Can you believe it? Is there any way you could absolutely make my day and say \"\"yes\"\" to $115? I'd be so, so grateful. Just let me know when you can. Thanks so much. :) "
[1] "Hello,  I am glad your selling this phone. Is it still available? I would love to purchase it. Would you consider $115 for it? I can buy it today if that price works. Thank you."
[1] "Hey, is the phone still available? If so, are you willing to take $115 for it? I am available for pick up as soon as the next hour if you agree to the price. I really need this for work, any deal you can make would be greatly appreciated. "
[1] "Hello.  I'm very interested in your phone.  Is it still available?  I have $115 to offer, would you be willing to accept that offer?  I'm shopping for a work phone and I've only been given a budget of $115 to spend.  Hopefully you will consider my offer.  Thank you. "
[1] "Hi, I hope your day is going well. I am very pleased to see the phone you are offering for sale, as it is exactly what I need! I am on a very tight budget so I hope that you will be willing for accept $115 for the phone. It is the most I can pay. Please know that I would be so happy if I am able to buy this phone. I'm sorry that I can't offer more. If you are willing to accept my offer, perhaps I can do you a small favor as well, like mow your lawn or something. In any case if you accept my offer you would have my sincere and heartfelt gratitude.   Whether you accept my offer or not, I hope that this message finds you and yours well and happy. I hope you have a great day. :)"
[1] "Least Polite"
[1] "I am inquiring about your Iphone 6 plus that you had posted.  I am wanting to buy and I have cash in hand.  The max amount I can offer is $115.  No more, no less.  Let me know. "
[1] "I will buy the phone as is for $115. I don't want to pay more than the amount that I stated. If you accept my price please contact me within 24 hours. If I don't hear from you in the next 24 hours I will take it that you will accept my price."
[1] "I am interested in your phone but at the price of exactly $115, and not a penny more, as it is used and I am not convinced it will work with my network. It is used and it may or may not work."
[1] "Come on. The price you are offering on a product that ISN'T NEW is unreasonable. Now, I for one am very interested in getting this item. BUT, I will only pay $115. I am not paying a penny more. "
[1] "i would be willing to buy this phone for 115. I will not go any higher seeing how it's not in the box. Contact be back if you agree."
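
The selection step itself is simple to sketch in base R. The texts and scores below are invented, and the code only illustrates ranking texts along a projected dimension; it is not the implementation of findPoliteTexts. The variable num_docs mirrors the parameter described above.

```r
# Hypothetical texts and projected politeness scores (invented for illustration).
texts <- c(
  "Thanks so much!",
  "No more, no less.",
  "Would you consider $115?",
  "Take it or leave it.",
  "Hi, hope you are well."
)
scores <- c(0.91, 0.12, 0.64, 0.08, 0.77)

num_docs <- 2

# Rank texts from highest to lowest projected score.
ranked <- order(scores, decreasing = TRUE)

# Take the top of the ranking, and the bottom of it (lowest score first).
most_polite  <- texts[head(ranked, num_docs)]
least_polite <- texts[rev(tail(ranked, num_docs))]

print(most_polite)
print(least_polite)
```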

Execution time

In principle, these functions can handle an arbitrarily large set of documents. In practice, however, language data can be quite large, both in the number of documents and the length of the documents. This can have a marked effect on execution time, especially on smaller computers.

To provide rough benchmarks, we ran the politeness function with a range of document counts (10^3, 10^4, 10^5) and lengths (100, 200). The tests were performed using a 2016 MacBook Pro with a 2.7 GHz Intel Core i7. For each case we ran it five times, both with and without dependency parsing, and the resulting execution times are plotted below.

This figure shows that politeness scales reasonably well for large sets of documents. For example, given 200-word documents, and using the spaCy parser, we found that vectors of 1,000 and 10,000 texts take an average of 0.54 and 5.8 minutes, respectively.
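
As a rough check on the two averages quoted above, the per-document cost can be worked out with a few lines of arithmetic. This uses only the reported numbers; actual timings will depend on hardware and text length.

```r
# Reported averages for 200-word documents with the parser enabled.
n_docs  <- c(1000, 10000)
minutes <- c(0.54, 5.8)

# Convert to seconds per document to see how close the scaling is to linear.
seconds_per_doc <- minutes * 60 / n_docs
print(seconds_per_doc)
```

The per-document cost stays at roughly three hundredths of a second in both cases, which is what approximately linear scaling in the number of documents would look like.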


That’s it! Enjoy! And please reach out to us with any questions, concerns, bug reports, use cases, comments, or fun facts you might have.