- Know what an indicator variable is
- Find and describe an appropriate multiple linear regression model equation with categorical predictors
High School and Beyond

The High School and Beyond data set contains information about [latex]200[/latex] high school students, with [latex]10[/latex] variables recorded for each student. The data collected about each student include whether the student identifies as male or female, their socio-economic status, and their scores on science, math, and reading tests.
Descriptions of the variables are as follows:
| Variable name | Definition |
| --- | --- |
| id | Identification number of the student |
| female | Gender of the student (0 = male, 1 = female) |
| race | Ethnic background of the student (1 = Hispanic, 2 = Asian, 3 = Black, 4 = White) |
| ses | Socio-economic status of the student (1 = low, 2 = medium, 3 = high) |
| schtyp | School type (1 = public, 2 = private) |
| prog | Program type (1 = general, 2 = academic preparatory, 3 = vocational/technical) |
| read | Score from test of reading |
| write | Score from test of writing |
| math | Score from test of math |
| science | Score from test of science |
| socst | Score from test of social studies |
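
Several variables in the table are categorical and stored as numeric codes. As a minimal sketch of how such codes can be turned into indicator (0/1) variables, the Python snippet below uses the codes for `prog` and `schtyp` listed in the table. The helper name `indicators` and the choice of baseline category are illustrative assumptions, not part of the data set.

```python
# Category codes copied from the variable table above.
PROG_LABELS = {1: "general", 2: "academic preparatory", 3: "vocational/technical"}
SCHTYP_LABELS = {1: "public", 2: "private"}


def indicators(code, labels, baseline):
    """Return one 0/1 indicator per non-baseline category.

    `indicators` is a hypothetical helper written for this illustration.
    A variable with k categories needs k - 1 indicators; the baseline
    category is the one for which every indicator equals 0.
    """
    if code not in labels:
        raise ValueError(f"unknown category code: {code}")
    return {labels[c]: int(code == c) for c in sorted(labels) if c != baseline}


# A student in the academic preparatory program (prog = 2), baseline = general.
academic = indicators(2, PROG_LABELS, baseline=1)

# A student in the general program (prog = 1): all indicators are 0.
general = indicators(1, PROG_LABELS, baseline=1)

# A two-category variable needs a single indicator, just like `female`.
private = indicators(2, SCHTYP_LABELS, baseline=1)

print(academic)
print(general)
print(private)
```

The variable `female` is already in this form: it equals 1 for female students and 0 for male students, so it can enter a regression model directly.
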
We want to build a model that predicts science test scores from math test scores and the recorded gender of the student.
Note: In this study, students only identified themselves as female or male.
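
One possible form for such a model, assuming the two predictors enter additively with no interaction, is [latex]\text{science} = \beta_0 + \beta_1 \cdot \text{math} + \beta_2 \cdot \text{female}[/latex], where `female` is the indicator variable from the table. The Python sketch below fits a model of this form by least squares. The rows are synthetic and generated from made-up coefficients purely for illustration. They are not values from the High School and Beyond data set, and the helper names `solve`, `fit_ols`, and `predict` are assumptions of this sketch.

```python
# Synthetic rows for illustration only (NOT from the High School and Beyond data).
# Each row: (math score, female indicator: 0 = male, 1 = female)
rows = [(40, 0), (55, 0), (62, 0), (45, 1), (58, 1), (70, 1)]

# Made-up coefficients used to generate noise-free science scores,
# so the fitted values can be checked against known numbers.
TRUE_B0, TRUE_B1, TRUE_B2 = 10.0, 0.8, 2.0
science = [TRUE_B0 + TRUE_B1 * m + TRUE_B2 * f for m, f in rows]


def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [[float(v) for v in a[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_ols(x_rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    design = [[1.0, *x] for x in x_rows]  # the leading 1 gives the intercept
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


b0, b1, b2 = fit_ols(rows, science)


def predict(math, female):
    """Predicted science score for a given math score and female indicator."""
    return b0 + b1 * math + b2 * female


# For the same math score, the two predictions differ by exactly b2:
# the indicator shifts the intercept and leaves the slope unchanged.
gap = predict(50, 1) - predict(50, 0)
print(round(b0, 4), round(b1, 4), round(b2, 4), round(gap, 4))
```

In this form, the coefficient on `female` is the difference in predicted science score between a female and a male student with the same math score. Geometrically, the model describes two parallel lines, one for each group.
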