Measuring Personality: Learn It 4—Creating the Blirt Test

Creating the Blirt Scale

Imagine a couple—Khalil and Gen. Khalil is usually entertaining, but he never holds back; you always know what he’s thinking. His partner Gen is kind and friendly—they are the first to arrive when help is needed, but they hide their feelings and opinions.

Back in the early 2000s, social psychologist William Swann and his colleagues became interested in the impact of self-disclosure—the process of communicating information about ourselves to other people—on personal relationships. In one paper, the researchers wrote about “blirters” and “brooders”—good labels for Khalil and Gen. Early in their research, the psychologists realized that the story was not going to be simple. Enthusiastic self-disclosure (blirting) is sometimes good for relationships and sometimes bad, and the same is true about reluctance to self-disclose (brooding).

The researchers also realized that they didn’t have a good way to place people on the self-disclosure continuum. Self-descriptions (“I’m very open.” “I’m very private.”) often don’t match how other people, including your friends, see you. And researchers’ first impressions are extremely unreliable (“He seems like a blirter.” “She seems like a brooder.”). They needed a better way to measure people’s willingness to self-disclose.

In this exercise, we’re going to give you a small taste of the process of creating a personality questionnaire—you can help create a “blirtatiousness” test.[1]

Scale Construction: What Questions Should We Use?

The first step in constructing a test or scale is to be clear about what it is you are measuring. In their papers, Dr. Swann and his colleagues define blirtatiousness as the extent to which people respond to friends and partners quickly and effusively. A person is effusive if they express emotion freely and excitedly, with little restraint. One thing to notice about this definition is that it focuses on behavior more than inner feelings. It is the behaviors of our friends and partners that affect us, regardless of their intentions and motivations, so that is what the BLIRT scale aims to measure.

The next step in creating a questionnaire is writing the questions, but this is not as straightforward as it seems. Will they be open-ended (e.g., “How open-minded are you? ___”)? Probably not, as open-ended answers are hard to score. Forced choice, where a person chooses one of several options, is a better format. Some forced-choice questions ask you to rank options, while others have you choose among options, like these questions from the Narcissistic Personality Inventory:

Sample text from the Narcissistic Personality Inventory that has people choose which statement best identifies them: "I have a natural talent for influencing people" or "I am not good at influencing people." and "Modesty doesn't become me" or "I am essentially a modest person."
Figure 1. The questions from Raskin and Terry’s Narcissistic Personality Inventory force participants to choose between two options.

Another common forced-choice format is the Likert[2] scale, which is composed of a statement (not a question) followed by a fixed set of response options, often 5 or 7 numbers, allowing you to indicate your level of agreement with the statement. For example, here is an item from the Rosenberg Self-Esteem Scale:

Sample text from a personality inventory that says "I feel that I am a person of worth, at least on an equal plane with others." Then a person can choose either strongly disagree, disagree, agree, or strongly agree.
Figure 2. Morris Rosenberg’s self-esteem questions use a Likert scale.

Dr. Swann and his team chose a 7-point Likert format to measure blirtatiousness. To do this, they needed to write clear, simple statements that people could agree or disagree with, where different levels of agreement were possible.

We aren’t going to ask you to write any questions, but you can imagine that you have joined the test-development team by looking at the eight statements below. Choose four that you think would be the best items to include in the BLIRT scale.

When they were developing the scale, Dr. Swann and his team wrote dozens of questions and then pared them down to 20. Then they got 237 undergraduates to rate the 20 questions for how well they fit the qualities that the BLIRT scale was trying to measure.[3]

Questionnaire writers have strategies to encourage people to read the statements carefully. For example, they often write “reverse-scored” items, which are worded in the opposite direction from the other items. To show what this means, just below is the 7-point Likert scale used with the Blirtatiousness questionnaire. Below that, you will see two statements. Look at how the statements and the Likert scale fit together.

Likert scale showing 1 as strongly disagree, then counting up so that 4 is neither agree nor disagree and 7 is strongly agree.
Figure 3. A Likert scale.
  1. I speak my mind as soon as a thought enters my head.
    • For this question, 1 means not blirtatious and 7 means very blirtatious.
  2. I don’t speak my mind as soon as a thought enters my head.
    • For this question, 1 means very blirtatious and 7 means not blirtatious.

Dr. Swann and his team chose 8 items for the BLIRT scale. Half were worded so that higher numbers mean more blirtatious, and half so that higher numbers mean less blirtatious. After the test, a process called “reverse scoring” puts all the questions back on the same scale, so that higher numbers mean more blirtatious.[4]
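To make reverse scoring concrete, here is a minimal Python sketch. It assumes a 1-to-7 scale and assumes that a person’s overall score is the simple sum of the eight items after rescoring. The function names, the sample answers, and the positions of the reverse-worded items are all made up for illustration; they are not taken from the actual BLIRT.

```python
def reverse_score(response, low=1, high=7):
    """Flip a response on a low-to-high scale (7 becomes 1, 4 stays 4)."""
    if not low <= response <= high:
        raise ValueError(f"response must be between {low} and {high}")
    return low + high - response


def score_blirt(responses, reversed_positions):
    """Sum the responses after flipping the reverse-worded items.

    responses: one number (1-7) per item, in questionnaire order.
    reversed_positions: zero-based positions of the reverse-worded items
    (hypothetical here; the real positions are not given in this text).
    """
    total = 0
    for position, response in enumerate(responses):
        if position in reversed_positions:
            response = reverse_score(response)
        total += response
    return total


# Made-up answers from one imaginary test-taker.
answers = [6, 2, 7, 3, 5, 1, 6, 2]
# Assumption for illustration: every second item is reverse-worded.
total_score = score_blirt(answers, reversed_positions={1, 3, 5, 7})
print(total_score)
```

Under these assumptions, the lowest possible total is 8 and the highest is 56, and a higher total always means more blirtatious, no matter how an individual item was worded.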

Measuring Personality

Before you go on, now is a good time to measure your blirtatiousness. Follow the link below to find out if you are a blirter or a brooder.

Take the Blirt Test

Checking the Test

At this point in the test-creation process, Dr. Swann and his team had settled on eight statements that seemed to measure blirtatiousness. They were ready to administer the test, but before they could vouch for its effectiveness, they needed to be sure of a few things: the questions must work together as a set, the test must be reliable, and the test must be valid.

  • The questions must work together as a set. In other words, we want to be sure that the 8 items are all giving us responses about the same quality (blirtatiousness) and that the responses people are giving are consistent with one another.
    • You might think that a single question would be enough to measure blirtatiousness. Why ask 8 questions when one would do? But research has shown that asking variations on the same question 8 or 10 different times gives a more stable measure. The questions must be slightly different (enough to make people think carefully), but not too different (so they don’t measure different things).
    • The researchers administered the BLIRT to 1,137 students and used statistical procedures[5] to be sure that the 8 items in the scale worked together. The results indicated that the 8 items on the scale were consistent with each other in measuring the same psychological quality.
  • The test must be reliable. “Reliability” means consistency. We should be able to give you a test of some quality (e.g., how extraverted you are) and then give you that same test again two months later, and your scores should be pretty similar. This is important for what are called “stable traits.” Obviously, some psychological qualities, like moods, change all the time, and we would not expect consistency. But blirtatiousness should be a stable trait.
    • One common way to measure reliability of a test is a process called “test-retest reliability.” It is as simple as it sounds: you give the test, wait some period of time, and give it again to the same people.
  • The test must be valid. Believe it or not, after all this work, we still don’t know if the BLIRT scale is VALID. Validity is a question of whether or not we are measuring the thing we are trying to measure. Reliability doesn’t tell us if a scale is valid; reliability simply means that we get consistent answers. So how can we figure out if our test is valid or not?
    • There is no one way to determine the validity of a scale. Test developers like Dr. Swann usually take several different approaches. They may:
      • compare the test results with other personality tests of similar traits (convergent validity),
      • compare scores from the BLIRT test with tests of dissimilar traits (discriminant validity),
      • compare the results of the BLIRT test to real-world outcomes (criterion validity), or
      • see if the results work to predict people’s behavior in certain situations (predictive validity).
    • Next, we will peek at some studies that try to assess these different aspects of validity.
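For readers who like to see the arithmetic behind these checks, here is a minimal Python sketch of two of them. Cronbach’s alpha is one of the statistics named in the footnotes for checking that items work together. Using a Pearson correlation between two testing sessions is a common way to estimate test-retest reliability, and we assume it here for illustration. All of the numbers are made up, and the function names are ours.

```python
from math import sqrt
from statistics import mean, variance


def cronbach_alpha(rows):
    """Internal consistency of a set of items.

    rows: one list of item responses per person, with reverse-worded
    items already rescored so every item points in the same direction.
    """
    k = len(rows[0])                 # number of items
    item_columns = list(zip(*rows))  # regroup the responses by item
    sum_item_variances = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x, mean_y = mean(x), mean(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariation / (spread_x * spread_y)


# Made-up responses from five people to four items (not real BLIRT data).
responses = [
    [6, 7, 6, 5],
    [2, 1, 2, 3],
    [4, 4, 5, 4],
    [7, 6, 7, 7],
    [3, 2, 3, 2],
]
alpha = cronbach_alpha(responses)

# Made-up total scores for the same five people at two testing sessions.
time_1 = [24, 8, 17, 27, 10]
time_2 = [23, 10, 16, 26, 12]
retest_r = pearson_r(time_1, time_2)

print(f"Cronbach's alpha: {alpha:.2f}")
print(f"Test-retest correlation: {retest_r:.2f}")
```

In both cases, values closer to 1 indicate more consistency. The same correlation function could also be used to compare BLIRT scores with scores on another test: a strong correlation with a test of a similar trait would support convergent validity, and a weak correlation with a test of a dissimilar trait would support discriminant validity.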

 


  1. By the way, even serious psychologists seem to want to give their tests interesting names: BLIRT stands for Brief Loquaciousness and Interpersonal Responsiveness Test.
  2. The man who created the scale pronounced his name as LICK-ert. Many psychologists—maybe even your instructor—pronounce it LIKE-ert. It probably doesn’t matter much which way you say the name.
  3. Notice that the four items from the BLIRT are about what you DO. They aren’t about your beliefs (option 1), how you think other people see you (option 3), opinions about yourself (option 4), or what you think about other people (option 6).
  4. Reverse scoring is simple: 7 becomes 1, 6 becomes 2, 5 becomes 3, 4 stays 4, 3 becomes 5, 2 becomes 6, and 1 becomes 7. Only the 4 items with the reverse wording are rescored this way. The goal is to make it so that higher numbers mean more blirtatious for all the items.
  5. Cronbach’s alpha and factor analysis.