Measuring Personality: Learn It 5—Measuring Validity

Measuring Validity: Does BLIRT Measure What It Claims to Measure?

You’ve learned how a personality test like BLIRT can be designed and refined. The next step is just as important: showing that the test is valid.

A test is valid if it measures what it is supposed to measure. In other words, validity evidence answers the question:

When someone scores high (or low) on BLIRT, does that score actually mean they are more (or less) blirtatious in real life?

validity

Validity is the degree to which evidence supports the interpretation and use of test scores for a particular purpose.

Psychologists often build validity evidence in several ways, including:

  • Convergent validity: BLIRT scores relate to similar traits and measures

  • Discriminant validity: BLIRT scores relate only weakly, or not at all, to conceptually unrelated traits

  • Criterion validity: BLIRT scores relate to a meaningful real-world outcome

  • Predictive validity: BLIRT scores help predict future behavior in relevant situations

Convergent Validity

One way to test validity is to compare your new scale with other measures that already have strong evidence behind them.

For convergent validity, researchers choose traits that are related to blirtatiousness, but not identical. For example, blirtatiousness involves fast, expressive responding—so it makes sense that it might relate to traits like:

  • assertiveness
  • extraversion
  • impulsivity
  • social confidence
  • self-liking

If BLIRT scores were not related to any of these, that would raise concerns that BLIRT is not measuring what it claims to measure.

To test this, Swann’s team gave the BLIRT scale and several related measures (including a measure of assertiveness[1]) to 1,397 college students.
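A convergent-validity check like this usually comes down to a correlation between two sets of scores. The sketch below computes a Pearson correlation for six hypothetical respondents. The scores and variable names are invented for illustration and are not data from Swann's study, and the source does not say exactly which statistic the team reported.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around each mean.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable has no variance")
    return cov / sqrt(var_x * var_y)

# Made-up scores for six hypothetical respondents (NOT data from the study).
blirt_scores = [12, 18, 25, 31, 36, 40]
assertiveness_scores = [10, 22, 20, 35, 30, 41]

r = pearson_r(blirt_scores, assertiveness_scores)
print(f"BLIRT vs. assertiveness: r = {r:.2f}")
```

A clearly positive correlation is what convergent validity predicts. A correlation near 1.0 would suggest the two scales measure the same trait rather than related ones.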

Discriminant Validity

Convergent validity asks, “Does BLIRT relate to what it should relate to?” Discriminant validity asks the flip side:

“Does BLIRT not relate to what it shouldn’t relate to?”

Researchers compare BLIRT scores to traits or outcomes that should have little or no connection to blirtatiousness.

For example, BLIRT measures conversational style—how quickly and emotionally someone responds—not academic skill. So knowing someone’s BLIRT score should not tell you much about their GPA.

Swann’s team compared BLIRT scores to self-reported GPA and other traits expected to be less related to blirtatiousness.[2]

Overall, BLIRT performed the way a valid scale should: it was related to similar traits (good convergent validity) and weakly related—or unrelated—to traits that didn’t conceptually fit (good discriminant validity).
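One simple way to express that pattern is to check that the weakest convergent correlation is still larger in magnitude than the strongest discriminant correlation. The sketch below does this with invented correlation values. The numbers, the trait groupings, and the function name are assumptions for illustration, not figures reported by Swann's team.

```python
def pattern_supports_validity(convergent, discriminant):
    """True if every convergent |r| exceeds every discriminant |r|."""
    weakest_convergent = min(abs(r) for r in convergent.values())
    strongest_discriminant = max(abs(r) for r in discriminant.values())
    return weakest_convergent > strongest_discriminant

# Invented correlations with BLIRT (NOT the values reported in the study).
convergent = {"assertiveness": 0.55, "extraversion": 0.45, "impulsivity": 0.35}
discriminant = {"gpa": 0.02, "conscientiousness": -0.10}

print(pattern_supports_validity(convergent, discriminant))
```

Absolute values are used because a strong negative correlation with a supposedly unrelated trait would be as much of a problem as a strong positive one.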

Librarians or salespeople?

Who do you predict would score higher on blirtatiousness: car salespeople or librarians?

Swann’s team administered the BLIRT to 30 employees at car dealerships and libraries in central Texas (ages 20–66; average age 34.3). Most people expect salespeople to be more blirtatious because that job often rewards fast, expressive talk, while library work may reward reflection and restraint.

Prediction check

Before reading the results, settle on your own prediction: which group do you expect to have the higher average BLIRT score?

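The results matched the real-world expectation: on average, salespeople scored significantly higher on BLIRT than librarians. That supports criterion validity, because BLIRT scores separated two groups that differ on a real-world criterion (occupation) in the expected direction.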
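A comparison between two groups like this is often summarized with a t statistic: the difference between the group means divided by its standard error. The sketch below computes Welch's version for two small hypothetical groups. The scores are invented, and the source does not say which test Swann's team actually used.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Welch's t statistic for the difference between two group means."""
    # Standard error of the difference, using each group's sample variance.
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se

# Made-up BLIRT scores for two hypothetical groups (NOT data from the study).
salespeople = [30, 34, 28, 36, 32]
librarians = [22, 26, 20, 24, 28]

t = welch_t(salespeople, librarians)
print(f"t = {t:.2f}")  # prints "t = 4.00" for these made-up scores
```

Deciding whether a difference is statistically significant also requires the degrees of freedom and a p-value, which this sketch leaves out.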

Predictive Validity

Predictive validity asks whether a test score can help predict behavior in a specific situation.

Swann’s team tested whether BLIRT scores predicted first impressions during short conversations. They recruited college students, paired them up, and had each pair complete a 7-minute “getting acquainted” phone call. The partners were strangers and never saw each other. Each student completed the BLIRT, and after the call each rated their conversation partner.

Because this was a first-impression setting, researchers predicted that high blirters would be seen as more socially engaging.
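A prediction like this can be tested by fitting a line that predicts a partner's rating from the speaker's BLIRT score. The sketch below fits a least-squares line to four hypothetical callers. The data, the "engaging" rating scale, and the function name are invented for illustration and say nothing about what the study found.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Invented data for four hypothetical callers (NOT results from the study).
blirt_scores = [10, 20, 30, 40]
partner_ratings = [3.0, 4.0, 4.5, 6.0]  # hypothetical "engaging" rating

slope, intercept = fit_line(blirt_scores, partner_ratings)
predicted = intercept + slope * 35
print(f"slope = {slope:.3f}, predicted rating at BLIRT 35 = {predicted:.2f}")
```

A positive slope would fit the researchers' prediction, a slope near zero would mean BLIRT scores do little to predict first impressions, and a negative slope would contradict the prediction.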

Prediction check

After the conversations, the students rated their conversation partners on several different qualities. For example, who do you think would be perceived as more responsive—a high blirter or a low blirter?

  • high blirter
  • low blirter
  • no difference

Keeping in mind that this was a first-impression 7-minute conversation, who do you think would be seen as more interesting: a high blirter or a low blirter?

  • high blirter
  • low blirter
  • no difference

measuring personality

You now know more about personality test development than most people do. Scales like BLIRT (and Big Five measures) aren’t just “fun quizzes”: they are used in research, counseling, hiring, and education, where scores can inform high-impact decisions.

That’s why validity matters: if a scale is unreliable or invalid, it can lead to unfair conclusions about people.


  1. The Rathus Assertiveness Schedule.
  2. Other traits included agreeableness, conscientiousness, and affect intensity (how strongly people experience emotions).