Measuring Personality: Learn It 5—Measuring Validity

You learned about personality tests and how to go about developing a BLIRT test, but let’s dig deeper into how you could actually measure the validity of the test. Remember, validity is a measure of whether or not a test is testing what it should.

validity

There are multiple types of validity:

  • convergent validity: compare the test results with other personality tests of similar traits
  • discriminant validity: compare the test results with tests of dissimilar traits
  • criterion validity: compare the results of the BLIRT test to real-world outcomes to see if it measures what it was designed to
  • predictive validity: see if the results work to predict people’s behavior in certain situations

Convergent Validity

One way to test the validity of a new test is to compare its results with results from already-validated tests of other traits.

When testing for convergent validity, the researcher looks for other traits that are similar to (but not identical to) the trait they are measuring. For example, we are studying blirtatiousness. It would be reasonable to think that a person who is blirtatious is also assertive. The two traits—blirtatiousness and assertiveness—are not the same, but they are certainly related. If our blirtatiousness scale is not at all related to assertiveness, then we should be worried that we are not really measuring blirtatiousness successfully.

We can use the correlation between the BLIRT score and a score on a test of assertiveness to measure convergent validity. The researchers gave a set of tests, including the BLIRT scale and a measure of assertiveness,[1] to 1,397 college students. Assertiveness was just one of several traits that were expected to be similar to blirtatiousness.[2]
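To make the idea of "using the correlation" concrete, here is a minimal sketch of a Pearson correlation between two sets of scores. The function name `pearson_r` and the eight pairs of scores are invented for illustration; they are not the researchers' code or data. A coefficient near +1 or -1 indicates a strong relationship, and a coefficient near 0 indicates little or no linear relationship.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products of deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(ss_x * ss_y)

# Made-up scores for eight hypothetical students (NOT data from the study).
blirt_scores = [12, 18, 25, 31, 22, 15, 28, 35]
assertiveness_scores = [40, 47, 55, 60, 49, 45, 58, 66]

r = pearson_r(blirt_scores, assertiveness_scores)
print(f"r = {r:.2f}")
```

For convergent validity we would hope to see a clearly positive coefficient like the one these invented scores produce. The same calculation applied to a trait that should be unrelated to blirtatiousness would be expected to give a value close to zero.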

Discriminant Validity

We want our BLIRT score to have a moderate-to-strong relationship with traits that are similar, but we also want it to be unrelated to traits or abilities that are not similar to blirtatiousness. Tests of discriminant validity compare our BLIRT score to traits that should have weak or no relationship to blirtatiousness. For example, people who are blirtatious may be good students, poor students, or somewhere in between. Knowing how blirtatious you are should not tell us much about how good a student you are.

The researchers compared the BLIRT score of the 1,397 students mentioned earlier to their self-reported GPA.[3]

Dr. Swann’s team compared 21 different traits and abilities to the blirtatiousness scale. Some assessed convergent validity and others tested discriminant validity. The results were generally convincing: BLIRT scores were related to measures of traits that should be related to blirtatiousness (good convergent validity) and unrelated to measures of traits that should have no connection to blirtatiousness (good discriminant validity).

Criterion Validity

Another way to test the validity of a measure is to see if it measures what it was designed to, that is, whether it fits the way people behave in the real world. The BLIRT researchers conducted two studies to see if BLIRT scores fit what we know about people’s personalities. Criterion validity is the relationship between a measure and a real-world outcome.

Librarians or Salespeople?

Who do you think is more likely to be blirtatious, a salesperson or a librarian? The researchers found thirty employees of car dealerships and libraries in central Texas and gave them the BLIRT scale. Their ages ranged from 20 to 66 (average age = 34.3 years).

Before reading on, make your own prediction about which group will be more blirtatious.

Most people expect salespeople to be more blirtatious than librarians. The researchers explained that we assume that high blirters will look for a work environment that rewards “effusive, rapid responding,” while low blirters would prefer a workplace that encourages “reflection and social inhibition.” The results of the study were consistent with this idea: salespeople had significantly higher BLIRT scores (on average) than librarians.
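One common way to check whether two group averages differ by more than chance would suggest is a t-test. The sketch below computes Welch's t statistic, a version that does not assume the two groups have equal variances. The reading does not say which test the researchers used, so the choice of Welch's test, the function name `welch_t`, and the scores are all assumptions made for illustration, not the study's method or data.

```python
import math
import statistics

def welch_t(group_a, group_b):
    """Return (t, df) for Welch's t-test of mean(group_a) - mean(group_b)."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least 2 scores")
    mean_a = statistics.mean(group_a)
    mean_b = statistics.mean(group_b)
    # Sample variance of each group divided by its size
    # (the estimated variance of each group's mean).
    var_mean_a = statistics.variance(group_a) / n_a
    var_mean_b = statistics.variance(group_b) / n_b
    if var_mean_a + var_mean_b == 0:
        raise ValueError("t is undefined when both groups have zero variance")
    t = (mean_a - mean_b) / math.sqrt(var_mean_a + var_mean_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (var_mean_a + var_mean_b) ** 2 / (
        var_mean_a ** 2 / (n_a - 1) + var_mean_b ** 2 / (n_b - 1)
    )
    return t, df

# Made-up BLIRT scores for two small hypothetical groups (NOT study data).
salespeople = [30, 27, 33, 25, 29, 31]
librarians = [20, 24, 18, 22, 26, 21]

t, df = welch_t(salespeople, librarians)
print(f"t = {t:.2f}, df = {df:.1f}")
```

A large positive t means the first group's average is well above the second group's relative to the spread of the scores. Turning t and df into a p-value requires the t distribution, which a statistics library can supply.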

Predictive Validity

Another way to assess the validity of the BLIRT scale is to see if it predicts people’s behavior in specific situations. Based on research about first impressions, the experimenters believed that people who are open and expressive should, in general, make better first impressions than people who are reserved and relatively quiet.

To test this hypothesis, the researchers recruited college students and put them into pairs. The members of each pair had a 7-minute “getting acquainted” telephone conversation. The members of the pairs did not know each other and, in fact, they never saw each other. The participants also completed several personality measures, including the BLIRT scale. Note that they were NOT paired based on their BLIRT scores, so there were different combinations of blirtatiousness across the 32 pairs tested.

After the conversations, the students rated their conversation partners on several different qualities. For example, who do you think would be perceived as more responsive—a high blirter or a low blirter?

  • high blirter
  • low blirter
  • no difference

Keeping in mind that this was a first-impression 7-minute conversation, who do you think would be seen as more interesting: a high blirter or a low blirter?

  • high blirter
  • low blirter
  • no difference

measuring personality

You now know more about creating a personality test than most people do. Scales like the BLIRT or the Big Five test you took at the beginning of this exercise are used for serious purposes. Psychological researchers use them in their studies, of course. But psychological tests are also used by companies in their hiring process, by therapists trying to understand their patients, by school systems assessing the strengths and weaknesses of their students, and even by sports teams trying to identify the best athletes to fit their system.

We hope that this exercise has given you some insight into the characteristics of a good personality test, and the work that goes into developing a useful scale. Next time you take one, consider the process that went into its development.

Think of another way, not mentioned in the reading, that experimenters could test the validity of the BLIRT test. What type of validity would you be testing? What would you expect the results of your validity test to be?

 


  1. The Rathus Assertiveness Schedule.
  2. Others included self-perceived social confidence, extroversion, impulsivity, and self-liking.
  3. Other traits assessed for discriminant validity were agreeableness, conscientiousness, and affect intensity (how strongly people were influenced by their emotions).