Student-Teacher Achievement Ratio Data

A cleaned subset of the Project STAR dataset. This processed version focuses on third-grade test scores (math and reading) and includes key variables for mixed-effects modeling.

Usage

star

Format

A data frame with 4,192 rows and 13 columns:

school_id: Factor indicating unique school ID.
system_id: Factor indicating school system ID.
sctype: Factor indicating school type: "inner-city", "suburban", "rural", or "urban".
gender: Factor indicating student's gender: "female" or "male".
ethnicity: Factor indicating student's ethnicity: "cauc" (Caucasian), "afam" (African-American), "asian" (Asian), "hispanic" (Hispanic), "amindian" (American-Indian), or "other".
cltype: Factor indicating student's class type in 3rd grade: "small", "regular", or "regular-with-aide".
tdegree: Factor indicating highest degree of 3rd grade class teacher: "bachelor", "master", or "specialist".
tyear: Integer giving the 3rd grade class teacher's total years of teaching experience.
lunch: Factor indicating whether the student qualified for free lunch in 3rd grade: "free" or "non-free".
read_old: Total reading scaled score in 2nd grade.
read: Total reading scaled score in 3rd grade.
math_old: Total math scaled score in 2nd grade.
math: Total math scaled score in 3rd grade.
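
A quick way to confirm that a loaded copy matches the layout above is to compare its column names against the documented ones. The following is only a sketch: the helper name `missing_star_cols` is hypothetical and not part of any package, and the usage lines assume `star` is already available in the session.

```r
# Column names as listed in the Format section above.
expected_cols <- c("school_id", "system_id", "sctype", "gender", "ethnicity",
                   "cltype", "tdegree", "tyear", "lunch",
                   "read_old", "read", "math_old", "math")

# Hypothetical helper: returns the documented columns that are missing
# from `data` (an empty character vector when all 13 are present).
missing_star_cols <- function(data) {
  setdiff(expected_cols, names(data))
}

# Usage, assuming `star` is available in the session:
# missing_star_cols(star)
# str(star)
```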

Source

https://search.r-project.org/CRAN/refmans/AER/html/STAR.html

Details

Project STAR was a large-scale experiment conducted in Tennessee in the 1980s to study the effect of class size on student test performance. The original dataset tracked over 7,000 students across 79 schools from kindergarten to third grade. Students were randomly assigned to one of three interventions: small class (13 to 17 students per teacher), regular class (22 to 25 students per teacher), and regular-with-aide class (22 to 25 students with a full-time teacher's aide). In the source analysis (Stock and Watson, 2007), the test score is the sum of the scores on the math and reading portions of the Stanford Achievement Test; this dataset keeps the math and reading scores as separate columns.

This processed version focuses on student performance in third grade and has a hierarchical structure in which students are nested within schools. It includes a subset of key variables covering student demographics, prior-year (2nd grade) and current-year (3rd grade) test scores, class assignment, teacher qualifications, and school-level identifiers. Only students who attended the same school in both 2nd and 3rd grades are retained. This structure makes the dataset well suited to mixed-effects modeling of school effects and treatment impacts.
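
As a minimal sketch of the kind of model this structure supports, the code below fits a random-intercept model for 3rd grade math scores with students nested in schools. It assumes the lme4 package is installed and that `star` is already loaded. The helper name `fit_school_model` and the choice of predictors are illustrative assumptions, not an analysis taken from the original source.

```r
library(lme4)

# Hypothetical helper: random-intercept model, students nested in schools.
# Fixed effects: class type (the treatment) and prior-year math score.
# Random effect: one intercept per school.
fit_school_model <- function(data) {
  lmer(math ~ cltype + math_old + (1 | school_id), data = data)
}

# Usage, assuming `star` is available in the session:
# fit <- fit_school_model(star)
# summary(fit)   # fixed-effect estimates for class type and prior score
# VarCorr(fit)   # between-school and residual variance components
```

The same formula with `read` and `read_old` in place of `math` and `math_old` would give the corresponding reading model.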

References

Stock, J.H. and Watson, M.W. (2007). Introduction to Econometrics, 2nd ed. Boston: Addison Wesley.

Data sourced from the AER package.