# Factors that Predict ACT Science Scores from a Multicultural Perspective

Brazill, S. C. (2019). Factors that Predict ACT Science Scores from a Multicultural Perspective. Educational Research: Theory and Practice, 30(2), 1-16.

Factors that Predict ACT Science Scores from a Multicultural Perspective

Shihua Chen Brazill

Montana State University

Abstract: This study investigated predictors for ACT Science scores, a test used by many universities to rank applicants. This study utilized quantitative research methods using the Montana Office of Public Instruction’s GEMS (Growth and Enhancement of Montana Students) data set. All advanced statistical analysis was conducted using Stata software IC/15. This research is significant for increasing the representation of under-represented groups in STEM education because it helps clarify three important relationships: (1) How well do gender, race, and meal status predict 11th grade ACT Science scores; (2) How well does school size predict 11th grade ACT Science scores while controlling for gender, race, and meal status; and (3) How well does high school GPA predict 11th grade ACT Science scores while controlling for gender, race, meal status, and school size.

Key Words: GEMS (growth and enhancement of Montana students), ACT Science scores, regression analyses, secondary data analysis, multicultural education

Acknowledgment: The author thanks Dr. Carrie Myers from Montana State University for her expertise in advanced educational statistics and providing valuable feedback; and also thanks Dr. Pat Munday for his mentoring and support.

LITERATURE REVIEW

The ACT (American College Test) is a standardized high-school achievement test that colleges use to rank applicants (Bauer & Wise, 2016). The ACT covers four subjects (English, mathematics, reading, and science) and offers an optional writing test (Frey, 2018). Each of the four subject areas is scored on a scale of 1-36, and the composite score is the average of the four scores, rounded to the nearest whole number (Watson & Flamez, 2014). Maruyama (2012) found that, for high school students, “ACT scores in English, math, science, and reading were related respectively to student performance in English composition, college algebra, college biology, and college social studies/humanities” (p. 3).
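The composite-score rule described above can be sketched in a few lines. This is a minimal illustration, not ACT's official scoring code: the function name `act_composite` is hypothetical, the example scores are invented, and the sketch assumes that an average ending in exactly .5 rounds up.

```python
import math


def act_composite(english, mathematics, reading, science):
    """Average four ACT subject scores and round to the nearest whole number.

    Assumption: a mean ending in exactly .5 rounds up, so floor(mean + 0.5)
    is used instead of Python's round(), which rounds halves to even.
    """
    scores = [english, mathematics, reading, science]
    for score in scores:
        if not 1 <= score <= 36:
            raise ValueError("each ACT subject score must lie between 1 and 36")
    mean = sum(scores) / len(scores)
    return math.floor(mean + 0.5)


# Hypothetical subject scores, for illustration only.
print(act_composite(24, 19, 23, 18))
print(act_composite(20, 21, 22, 23))
```

The Science score studied in this paper is one of the four inputs, so a student's composite can mask a comparatively low or high Science score.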

In examining variables that predict ACT scores, researchers have identified achievement gaps based on race, socioeconomic status, and other factors. As Soares (2015) notes, race and ethnicity are important variables in predicting ACT scores. Black and Hispanic students receive lower average ACT scores compared to White and Asian American students. Lotkowski, Robbins, and Noeth (2004) use High School Grade Point Average (HSGPA) and ACT Assessment scores as academic factors that predict postsecondary retention. Socioeconomic status (SES), parents’ educational achievement, and family income are non-academic factors that also predict postsecondary retention. Academic and non-academic factors together best predict postsecondary retention and performance. Inzlicht and Ben-Zeev (2000) argue that gender stereotypes negatively predict females’ test performance. On test scores as well as other measures of science, math, and STEM ability, females positioned in a male-dominated environment generally show lower achievement than males.

Correspondence concerning this article should be addressed to Shihua Chen Brazill, E-mail: [email protected]

Class size is a good predictor of ACT performance. Students in small classes are more likely to take the ACT exam, and they perform better than those in average-size classes, thus increasing their probability of college acceptance (Krueger & Whitmore, 2001; Schanzenbach, 2006). Small classes may not be better for all students, however. Budden & Hsing (2006) found that small class sizes magnify the effect of socioeconomic status, with a larger achievement gap between poor and better-off students. Furthermore, qualified high school instructors play a significant role because such teachers provide higher quality instruction that leads to higher academic success for students. Interestingly, school size acts very differently from class size: results from Lotkowski, Robbins, and Noeth (2004) indicate that school size has no relationship with college retention.

Some researchers question whether educational institutions should use high school students’ standardized test scores to predict college success. Rooney & Schaeffer (1998) claim that standardized test scores are not the best predictors of college student success. Lotkowski, Robbins, and Noeth (2004) counter that claim, finding that ACT scores used in conjunction with other factors are good predictors of college success. Importantly, their study argued that ACT scores and other factors are excellent ways of identifying at-risk students so that colleges can improve retention, a finding that Bettinger, Evans, and Pope (2013) confirmed. Furthermore, as a core finding, Bettinger, Evans, and Pope (2013) show that the ACT composite score obscures the fact that ACT English and Math scores are strongly correlated with college success, whereas ACT Science and Reading scores are not. That study also found little correlation between ACT Science scores and either high school or college GPA.

Despite some criticism of using standardized test scores as a predictor of college success, Hays (2017) points out that Midwestern colleges use ACT tests as an important tool to make admission and financial aid decisions. In other words, whether or not ACT scores are a good predictor of achievement once students are admitted to college, their use in the admissions process means that ACT scores can be a significant factor in limiting achievement. Therefore, this research, using Montana high school students as a case study, is significant and has practical implications for high schools located in Western and Northern Rocky Mountain regions. Little research has been conducted to show what factors predict ACT Science Scores. This paper fills that gap by examining factors that predict 11th Grade ACT Science Scores for Montana high school students. It provides valuable insights for schools to better prepare their students to achieve higher ACT scores needed for college acceptance.

DESIGN AND METHODOLOGY

The purpose of this research is to investigate and determine academic and non-academic factors that predict 11th grade ACT Science scores. It is important to understand these factors so we as educators can help students excel in their studies and become college ready for Science and STEM fields. The overall intent and objectives are to examine the above relationships. The unit of analysis is 11th grade ACT Science scores. The social phenomenon is how well public high schools prepare students for their 11th grade ACT Science tests.

Educational Research: Theory & Practice, Volume 30, Issue 2, ISSN 2637-8965

The independent variables (IVs) (gender, race, meal status, school size, and high school GPA) could predict the dependent variable (DV) (11th grade ACT Science scores) for the following reasons. Gender might predict the DV because males seem to do better with science in general than females. This could be because females receive many social cues that discourage them from pursuing science fields. Race could predict the DV because White students on average have more financial and social resources compared to Non-White students. Meal status could predict the DV because it is a proxy for socioeconomic status. Low-income students in general might not perform as well academically because they lack role models (e.g., no professional in their family) or have parents who place less emphasis on their children’s studies. They also have to struggle with non-academic issues, such as housing, food security, and health care. School size might predict the DV because larger schools have more resources to better support students. GPA could predict the DV because the two are similar as measurements of learning.

As researchers and teachers involved in higher education and STEM fields, we want to better understand student learning and develop strategies to improve student learning. 11th grade ACT Science scores are used in the research questions as a measured outcome of student learning. For the quantitative research design, the research questions, hypotheses, sample/population, variables, coding, level of measurement, and operationalization will be described in detail as follows.

RESEARCH QUESTIONS AND HYPOTHESES

Research question #1: How well do gender, race, and meal status predict 11th grade ACT Science scores? This research question is used to examine the relationship between the three demographic independent variables (gender, race, and meal status) and the dependent variable (11th grade ACT Science scores).

Ha: Gender, race, and meal status predict 11th grade ACT Science scores.

Research question #2: How well does school size predict 11th grade ACT Science scores while controlling for gender, race, and meal status? This research question is used to examine the relationship between the independent variable of interest (school size) and the dependent variable (11th grade ACT Science scores).

Ha: School size predicts 11th grade ACT Science scores while controlling for the three demographic independent variables (gender, race, and meal status).

Research question #3: How well does high school GPA predict 11th grade ACT Science scores while controlling for gender, race, meal status, and school size? This research question is used to examine the relationship between the independent variable (high school GPA) and the dependent variable (11th grade ACT Science scores) while controlling for gender, race, meal status, and school size.

Ha: High school GPA predicts 11th grade ACT Science scores while controlling for gender, race, meal status, and school size.

POPULATION AND SAMPLE

The population is 8887 high school seniors in the 2015-2016 academic year who entered postsecondary education in a Montana institution in 2016-2017. A random sample of 300 participants was selected. Table 1 presents demographic and descriptive statistics for the sample.

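Table 1. Descriptive Data and Demographic Statistics of the Sample Characteristics

| Variable | Category | n (%) |
| --- | --- | --- |
| Gender | Males (Reference) | 144 (48.0%) |
| Gender | Females | 156 (52.0%) |
| Race | Non-White | 35 (11.7%) |
| Race | White (Reference) | 265 (88.3%) |
| Meal Status | F/free (Reference) | 48 (16.0%) |
| Meal Status | N/non-free | 238 (79.3%) |
| Meal Status | R/reduced | 14 (4.7%) |
| School Size/Class | A (Reference) | 70 (23.3%) |
| School Size/Class | AA | 139 (46.3%) |
| School Size/Class | B | 49 (16.3%) |
| School Size/Class | C | 42 (14.0%) |
| GPA | Continuous | 0-4 |
| Total | | 300 |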

As shown in Table 1, this study includes 156 females and 144 males (male is the reference group). A total of 265 students identified themselves as White, and 35 students fall into the categories classified as Non-White (White is the reference group). Forty-eight received free lunch, 238 did not receive free lunch, and 14 received reduced-cost lunch (free lunch is the reference group). Seventy are from class A schools, 139 from class AA, 49 from class B, and 42 from class C (class A is the reference group).

STUDY VARIABLES

As shown in Table 2 below, for research question 1 (How well do gender, race, and meal status predict 11th grade ACT Science scores?), the independent variables are gender, race, and meal status, and they are all categorical/dummy variables. Gender is dummy coded where 0=male and 1=female. For race, White is dummy coded as “0”; all the other races, including Hispanic, American Indian, Asian, and two or more races, are combined as Non-White due to small sample sizes and dummy coded as “1”. Meal status is dummy coded where 0=free, 1=reduced, and 2=non-free. The dependent variable is 11th grade ACT Science scores, which range from 1-36 and form a continuous variable measured at the ratio level. 11th grade ACT Science scores are used as a measure of learning.

For research question 2 (How well does school size predict 11th grade ACT Science scores while controlling for gender, race, and meal status?), the independent variable is school size. A is represented as “1”, B is represented as “2”, C is represented as “3”, and AA is represented as “4”. The control variables are gender, race, and meal status. The dependent variable is 11th grade ACT Science scores, which range from 1-36 and form a continuous variable measured at the ratio level.

For research question 3 (How well does high school GPA predict 11th grade ACT Science scores while controlling for gender, race, meal status, and school size?), the independent variable is GPA, which ranges from 0-4 and is a continuous variable measured at the ratio level. The dependent variable is 11th grade ACT Science scores, which range from 1-36 and form a continuous variable measured at the ratio level.

Table 2: Study Variables

| Research Question | Role | Name | Description | Coding | Level of Measurement |
| --- | --- | --- | --- | --- | --- |
| 1 | Independent variable | Gender | Male & Female | Male = 0 (Reference), Female = 1 | Categorical/Dichotomous (dummy) |
| 1 | Independent variable | Race | White & Non-White | White = 0 (Reference), Non-White = 1 | Categorical/Nominal (dummy) |
| 1 | Independent variable | Meal Status | Free, Reduced & Non-free | Free = 0 (Reference), Reduced = 1, Non-free = 2 | Categorical/Nominal (dummy) |
| 1 | Dependent variable | 11th grade ACT Science scores | Mean 11th grade ACT Science scores | 1-36 | Continuous |
| 2 | Independent variable | School Size | A, B, C, AA | A = 1 (Reference), B = 2, C = 3, AA = 4 | Categorical/Nominal (dummy) |
| 2 | Dependent variable | 11th grade ACT Science scores | Mean 11th grade ACT Science scores | 1-36 | Continuous |
| 3 | Independent variable | HS GPA | Mean high school GPA | 0-4 | Continuous |
| 3 | Dependent variable | 11th grade ACT Science scores | Mean 11th grade ACT Science scores | 1-36 | Continuous |

STATISTICAL STRATEGY

The analytical approach for this research was based on Urdan (2011), Acock (2018), and Mehmetoglu & Jakobsen (2017). Multiple regression estimates the net effect of $X_1$ while holding $X_2$, the control variable or covariate, constant (Mehmetoglu & Jakobsen, 2017, p. 73). Multiple regressions ($Y = a + b_1X_1 + b_2X_2 + b_3X_3 + \dots + b_iX_i$, where $a$ is the intercept and each $b$ is an unstandardized regression coefficient) were used to examine the relationship between a continuous (i.e., interval or ratio scaled) dependent variable (11th grade ACT Science scores) and continuous (high school GPA) and dummy/categorical independent variables (gender, race, meal status, and school size) (Urdan, 2011).

Three Ordinary Least Squares (OLS) regression models were used to evaluate the three research questions and hypotheses. OLS regression is a conceptual model and a mathematical function used to explain the relationships among variables of interest. The three models follow the same logic, and each model builds on the previous one. For model 1, the dependent variable (11th grade ACT Science scores) was regressed on the independent variables (gender, race, and meal status). The purpose of model 1 was to test the net effect of gender, race, and meal status on 11th grade ACT Science scores, that is, to assess how well the demographic variables predict the dependent variable. For model 2, the variable of interest (school size) was added to model 1 to test the relationship between 11th grade ACT Science scores and school size. The additional control variable high school GPA was added for model 3 (the full model).
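As a rough illustration of what an OLS fit does, the sketch below solves the normal equations $(X^\top X)b = X^\top y$ with plain Gaussian elimination. It is not the Stata procedure used in the study: the function names, the toy rows, and the coefficient values used to build the toy outcome are all hypothetical and are not drawn from the GEMS data.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Augment the matrix so row operations update both sides together.
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix (collinear predictors)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - known) / aug[row][row]
    return solution


def ols_fit(predictors, outcome):
    """Return [a, b1, ..., bk] that minimize the sum of squared residuals."""
    design = [[1.0] + list(row) for row in predictors]  # leading 1.0 = intercept
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, outcome)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical toy rows (female, non_white, gpa); NOT the study's data.
toy_predictors = [(0, 0, 3.0), (1, 0, 3.5), (0, 1, 2.5),
                  (1, 1, 4.0), (0, 0, 2.0), (1, 0, 2.8)]
# Outcome constructed exactly as 10 - 1*female - 2*non_white + 4*gpa,
# so the fitted coefficients can be checked against known values.
toy_outcome = [10 - f - 2 * nw + 4 * g for f, nw, g in toy_predictors]

coefficients = ols_fit(toy_predictors, toy_outcome)
print([round(c, 6) for c in coefficients])
```

Because the toy outcome is an exact linear function of the predictors, the fit recovers the constructing coefficients; with real data the residuals would be nonzero and each coefficient would carry a standard error.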

Research Question 1/Hypothesis 1, Model 1:

$$\text{11th grade ACT Science scores} = a + b_1(\text{gender}) + b_2(\text{race}) + b_3(\text{meal status})$$

Research Question 2/Hypothesis 2, Model 2:

$$\text{11th grade ACT Science scores} = a + b_1(\text{gender}) + b_2(\text{race}) + b_3(\text{meal status}) + b_4(\text{school size})$$

Research Question 3/Hypothesis 3, Model 3:

$$\text{11th grade ACT Science scores} = a + b_1(\text{gender}) + b_2(\text{race}) + b_3(\text{meal status}) + b_4(\text{school size}) + b_5(\text{high school GPA})$$
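Because meal status and school size each have more than two categories, a single coefficient per variable is shorthand. One way to write Model 2 in full, assuming each non-reference category enters as its own 0/1 indicator (which would be consistent with the seven model degrees of freedom reported in Table 7), is sketched below. The indicator names and the relabelled coefficients $b_3$ through $b_7$ are illustrative and are not the author's notation.

```latex
\begin{aligned}
\text{ACT Science}_{11} = a
  &+ b_1\,\text{Female} + b_2\,\text{NonWhite} \\
  &+ b_3\,\text{Reduced} + b_4\,\text{NonFree} \\
  &+ b_5\,\text{ClassB} + b_6\,\text{ClassC} + b_7\,\text{ClassAA} + e
\end{aligned}
```

Under this reading, each coefficient is a difference from the reference group (male, White, free lunch, class A) with the other predictors held constant, and $e$ is the residual.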

Descriptive statistics for 11th grade ACT Science scores and high school GPA are shown in Table 3. The mean for high school GPA is 3.30 and the standard deviation is 0.54. The mean for 11th grade ACT Science scores is 21.9 and the standard deviation is 4.77.

Table 3: Descriptive Statistics for 11th Grade ACT Science Scores and High School GPA

Stata command: `tabstat ACTScience11 HS_GPA, stats(mean, sd, range count)`

| stats | ACTSc~11 | HS_GPA |
| --- | --- | --- |
| mean | 21.94333 | 3.309002 |
| sd | 4.771023 | .5441673 |
| range | 34 | 2.43 |
| N | 300 | 300 |

As shown in Table 4, the normality of the variables was evaluated by skewness and kurtosis statistics.

Table 4: Skewness and Kurtosis

Stata command: `sktest gendercoded RaceEthnicityFedCoded MealStatusCoded ClassCoded HS_GPA ACTScience11`

Skewness/Kurtosis tests for Normality (joint test in the last two columns)

| Variable | Obs | Pr(Skewness) | Pr(Kurtosis) | adj chi2(2) | Prob>chi2 |
| --- | --- | --- | --- | --- | --- |
| gendercoded | 300 | 0.5628 | . | . | . |
| RaceEthn~ded | 300 | 0.0000 | 0.0000 | . | 0.0000 |
| MealStatus~d | 300 | 0.0000 | 0.0178 | 60.88 | 0.0000 |
| ClassCoded | 300 | 0.0034 | . | . | . |
| HS_GPA | 300 | 0.0001 | 0.0311 | 16.68 | 0.0002 |
| ACTScience11 | 300 | 0.5879 | 0.0071 | 7.20 | 0.0274 |

Stata command: `tabstat ACTScience11 gendercoded RaceEthnicityFedCoded MealStatusCoded ClassCoded HS_GPA, statistics(Skewness, Kurtosis)`

| stats | ACTSc~11 | gender~d | Race~ded | MealSt~d | ClassC~d | HS_GPA |
| --- | --- | --- | --- | --- | --- | --- |
| skewness | -.0749568 | -.0800641 | 2.388201 | -1.63251 | -.4196527 | -.5748774 |
| kurtosis | 3.975316 | 1.00641 | 6.703504 | 3.805793 | 1.522014 | 2.50809 |

Skewness and kurtosis were tested using procedures described by Acock (2018). The test for skewness was significant (p < 0.05) for high school GPA, indicating that it is negatively skewed (GPA skewness = -0.57, p < 0.001). The test for kurtosis was significant (p < 0.05) for GPA and 11th grade ACT Science scores, indicating that the kurtosis of these variables departs from the value of 3 expected under a normal distribution (GPA kurtosis = 2.51, p = 0.031; 11th grade ACT Science kurtosis = 3.98, p = 0.007).
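For readers who want to see how such statistics are formed, the sketch below computes moment-based skewness and kurtosis. It assumes the usual definitions (third and fourth central moments scaled by the variance, so a normal distribution has kurtosis 3); the function name and the sample values are hypothetical and are not taken from the study's data.

```python
def moment_skewness_kurtosis(values):
    """Return (skewness, kurtosis) from central moments.

    skewness = m3 / m2**1.5 and kurtosis = m4 / m2**2, where m_r is the mean
    of (x - mean)**r. On this scale a normal distribution has kurtosis 3.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("skewness and kurtosis are undefined for constant data")
    return m3 / m2 ** 1.5, m4 / m2 ** 2


# Hypothetical values: a symmetric sample and one with a long left tail.
symmetric = [1, 2, 3, 4, 5]
left_tailed = [1, 4, 4, 5, 5, 5]

print(moment_skewness_kurtosis(symmetric))
print(moment_skewness_kurtosis(left_tailed))
```

A negative skewness, as reported for GPA, means the longer tail lies on the low side of the mean, which is what the second sample illustrates.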

Model specification and regression residuals were tested using a series of diagnostic procedures. As shown in Table 5, no violations were observed for heteroskedasticity, multicollinearity, model specification, functional form, or influential observations. The normality of the residuals was tested using the Shapiro-Wilk W test. The test indicated that the normality assumption was violated, suggesting that the residuals are not normally distributed. This could be due to the categorical independent variables of gender, race, meal status, and school size/class.

Table 5: Statistical Analysis (statistical tests, assumptions, and diagnostic procedures)

Stata commands: `quietly regress ACTScience11 gendercoded RaceEthnicityFedCoded MealStatusCoded ClassCoded HS_GPA`, followed by `regcheck`

| Regression assumption | Test | Result | We seek values |
| --- | --- | --- | --- |
| 1) no heteroskedasticity problem | Breusch-Pagan hettest | Chi2(1): 0.086, p-value: 0.769 | > 0.05 |
| 2) no multicollinearity problem | Variance inflation factor | HS_GPA: 1.09; MealStatusCoded: 1.07; RaceEthnicityFedCoded: 1.05; gendercoded: 1.05; ClassCoded: 1.01 | < 5.00 |
| 3) residuals are not normally distributed | Shapiro-Wilk W normality test | z: 5.178, p-value: 0.000 | > 0.01 |
| 4) no specification problem | Linktest | t: 1.112, p-value: 0.267 | > 0.05 |
| 5) appropriate functional form | Test for appropriate functional form | F(3,291): 1.404, p-value: 0.242 | > 0.05 |
| 6) no influential observations | Cook's distance | no distance is above the cutoff | < 1.00 |
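To make the multicollinearity row of Table 5 more concrete, the sketch below inverts the standard definition VIF = 1 / (1 - R²), where R² comes from regressing one predictor on all the others. The VIF values are copied from Table 5; the function name and the framing as "shared variance" are illustrative additions rather than part of the original analysis.

```python
# Variance inflation factors as reported in Table 5.
vif_by_predictor = {
    "HS_GPA": 1.09,
    "MealStatusCoded": 1.07,
    "RaceEthnicityFedCoded": 1.05,
    "gendercoded": 1.05,
    "ClassCoded": 1.01,
}

VIF_CUTOFF = 5.0  # the "< 5.00" threshold shown in Table 5


def implied_r_squared(vif):
    """Invert VIF = 1 / (1 - R^2) to get the R^2 of a predictor on the others."""
    return 1.0 - 1.0 / vif


shared_variance = {name: implied_r_squared(v) for name, v in vif_by_predictor.items()}
flagged = [name for name, v in vif_by_predictor.items() if v >= VIF_CUTOFF]

for name, r2 in shared_variance.items():
    print(f"{name}: about {r2:.1%} of its variance is shared with the other predictors")
print("Predictors at or above the cutoff:", flagged)
```

Read this way, even the largest VIF (HS_GPA) corresponds to well under a tenth of that predictor's variance being explained by the others, which is why no multicollinearity problem is flagged.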

PREPARING THE DATA

All data preparation used the GEMS dataset in Excel, which was then imported into Stata. Stata IC/15 was used to analyze the data and produce the visual representations (graphs) of the data. The hypotheses were evaluated at the .05 level of significance. We kept a research log that described the analytical procedures. One data preparation step was recoding the gender, race, meal status, and school size (class) groups. Gender is dummy coded where 0=male and 1=female. For race, White is dummy coded as “0”; all the other races, including Hispanic, American Indian, Asian, and two or more races, are combined as Non-White due to small sample sizes and dummy coded as “1”. Meal status is coded where 0=free, 1=reduced, and 2=non-free. School size (class) A is coded as “1”, B is coded as “2”, C is coded as “3”, and AA is coded as “4”.
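The recoding scheme above can be sketched as a small lookup. This is an illustration only: the study did its recoding in Excel and Stata, the raw labels (such as "Female" or "Non-free") and the record keys are assumptions, and only the numeric codes and the coded variable names come from the paper. The second helper shows how a multi-category code can be expanded into 0/1 indicators against a reference group.

```python
# Numeric codes follow the paper; the raw text labels are assumed for illustration.
GENDER_CODES = {"Male": 0, "Female": 1}
MEAL_CODES = {"Free": 0, "Reduced": 1, "Non-free": 2}
CLASS_CODES = {"A": 1, "B": 2, "C": 3, "AA": 4}


def recode_student(raw):
    """Map one student's raw labels to the coded variables used in the models."""
    return {
        "gendercoded": GENDER_CODES[raw["gender"]],
        # Every race other than White is combined into Non-White (coded 1).
        "RaceEthnicityFedCoded": 0 if raw["race"] == "White" else 1,
        "MealStatusCoded": MEAL_CODES[raw["meal_status"]],
        "ClassCoded": CLASS_CODES[raw["school_class"]],
    }


def indicator_columns(name, code, levels, reference):
    """Expand a categorical code into 0/1 indicators, omitting the reference level."""
    return {f"{name}_{level}": int(code == level) for level in levels if level != reference}


# Hypothetical record, not a row from the GEMS data.
example = recode_student(
    {"gender": "Female", "race": "Asian", "meal_status": "Reduced", "school_class": "AA"}
)
example_dummies = indicator_columns(
    "ClassCoded", example["ClassCoded"], levels=(1, 2, 3, 4), reference=1
)
print(example)
print(example_dummies)
```

Keeping the reference level out of the indicator set is what lets each remaining coefficient be read as a difference from that reference group.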

RESULTS

The descriptive statistics are shown in Tables 1 and 3. Please refer to Appendix A for histograms of the independent and dependent variables and Appendix B for the scatter plot of the continuous independent variable with the dependent variable. Three OLS models were estimated to evaluate the three research questions.

Research question 1: How well do gender, race, and meal status predict 11th grade ACT Science scores? In model 1 (11th grade ACT Science scores = gender + race + meal status), multiple regression was conducted to determine whether the linear combination of the independent variables (gender, race, and meal status) has explanatory power for the dependent variable, 11th grade ACT Science scores.

The results in Table 6 of the F-test for 11th grade ACT Science scores, regressed on gender, race, and meal status, show a significant relationship (F(4, 295) = 5.62, p = 0.0002). This suggests that the linear combination of independent variables is significantly associated with 11th grade ACT Science scores. The R² shows that the combination of independent variables explains 7.08% of the variance in 11th grade ACT Science scores. The adjusted R² shows that the combination of independent variables explains 5.82% of the variance in 11th grade ACT Science scores. The difference between the R² and the adjusted R² is 1.26 percentage points.


Rounded to two decimals, the R² for model 1 is 0.07 and the adjusted R² is 0.06. The adjusted R² is a corrected version of the original R² and serves as a conservative adjustment to account for the added predictor variables (Mehmetoglu & Jakobsen, 2016). Therefore, the linear combination of independent variables explains about 6% of the variance in 11th grade ACT Science scores.
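The relationship between the two statistics can be checked by hand. The worked equation below uses the standard adjusted R² formula, with $n = 300$ observations and $k = 4$ estimated predictors (the latter inferred from the F(4, 295) reported for model 1); it is offered as a consistency check, not as output from the original analysis.

```latex
\bar{R}^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - k - 1}
          = 1 - (1 - 0.0708)\,\frac{299}{295}
          \approx 1 - 0.9418
          = 0.0582
```

The result matches the 5.82% reported for model 1, and the formula shows why the penalty grows as more predictors are added to a fixed sample.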

Table 6: Model 1 Result

The significant variable in model 1 is race (b = -2.89, p = 0.001). Compared to White students, Non-White students score 2.89 points lower on the 11th grade ACT Science test.

Research question 2: How well does school size predict 11th grade ACT Science scores while controlling for gender, race, and meal status? In model 2 (11th grade ACT Science scores = gender + race + meal status + school size), we added school size as a variable of interest. Multiple regression was conducted to test research question 2 to determine if school size predicts 11th grade ACT Science scores while controlling for gender, race, and meal status.

The results in Table 7 of the F-test for 11th grade ACT Science scores, regressed on gender, race, meal status, and school size, show a significant result (F(7, 292) = 4.05, p = 0.0003). This suggests that the linear combination of independent variables is associated with 11th grade ACT Science scores. The R² shows that the combination of independent variables (gender, race, meal status, and school size) explains 8.85% of the variance in 11th grade ACT Science scores. The adjusted R² shows that the combination of independent variables explains 6.67% of the variance in 11th grade ACT Science scores. The difference between the R² and the adjusted R² is 2.18 percentage points. The small difference between the R² and adjusted R² suggests that the new predictor variable (school size) added to the model is important.

Table 7: Model 2 Result

Stata command: `reg ACTScience11 i.gendercoded i.RaceEthnicityFedCoded i.MealStatusCoded i.ClassCoded`

| Source | SS | df | MS |
| --- | --- | --- | --- |
| Model | 602.419992 | 7 | 86.0599989 |
| Residual | 6203.61667 | 292 | 21.2452626 |
| Total | 6806.03667 | 299 | 22.7626644 |

Number of obs = 300; F(7, 292) = 4.05; Prob > F = 0.0003; R-squared = 0.0885; Adj R-squared = 0.0667; Root MSE = 4.6093

| ACTScience11 | Coef. | Std. Err. | t | P>\|t\| | [95% Conf. Interval] |
| --- | --- | --- | --- | --- | --- |
| 1.gendercoded | -.8115025 | .5437116 | -1.49 | 0.137 | -1.881593 to .2585879 |
| 1.RaceEthnicityFedCoded | -2.89793 | .8448744 | -3.43 | 0.001 | -4.560746 to -1.235115 |
| MealStatusCoded 1 | -.3984726 | 1.416529 | -0.28 | 0.779 | -3.186373 to 2.389427 |
| MealStatusCoded 2 | 1.284566 | .7500529 | 1.71 | 0.088 | -.1916297 to 2.760761 |
| ClassCoded 2 | -1.288084 | .8651329 | -1.49 | 0.138 | -2.990771 to .4146021 |
| ClassCoded 3 | .1041843 | .9050692 | 0.12 | 0.908 | -1.677102 to 1.88547 |
| ClassCoded 4 | .5426423 | .682891 | 0.79 | 0.427 | -.80137 to 1.886655 |
| _cons | 21.64729 | .9156198 | 23.64 | 0.000 | 19.84524 to 23.44934 |
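The summary statistics in Table 7 follow directly from its sums of squares and degrees of freedom. The sketch below recomputes them from the reported Model and Residual rows; the variable names are illustrative, and the script is a consistency check rather than a re-analysis of the data.

```python
import math

# Sums of squares and degrees of freedom as reported in Table 7.
model_ss, model_df = 602.419992, 7
residual_ss, residual_df = 6203.61667, 292

total_ss = model_ss + residual_ss
total_df = model_df + residual_df

model_ms = model_ss / model_df            # mean square explained by the model
residual_ms = residual_ss / residual_df   # mean squared error

f_statistic = model_ms / residual_ms
r_squared = model_ss / total_ss
# Adjusted R-squared compares mean squares, penalizing extra predictors.
adjusted_r_squared = 1 - residual_ms / (total_ss / total_df)
root_mse = math.sqrt(residual_ms)

print(f"F({model_df}, {residual_df}) = {f_statistic:.2f}")
print(f"R-squared = {r_squared:.4f}")
print(f"Adj R-squared = {adjusted_r_squared:.4f}")
print(f"Root MSE = {root_mse:.4f}")
```

Working from the sums of squares rather than the rounded R² avoids a small rounding discrepancy in the fourth decimal of the adjusted R².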

The significant variable is race (b = -2.90, p = 0.001). Gender, meal status, and school size are not significant. Compared to White students, Non-White students score 2.90 points lower on the 11th grade ACT Science test. Between model 1 and model 2, the unstandardized coefficient for race changed minimally (by 0.01) when school size was added.

Research question 3: How well does high school GPA predict 11th grade ACT Science scores while controlling for gender, race, meal status, and school size? High school GPA was added in model 3 (11th grade ACT Science scores = gender + race + meal status + school size + high school GPA). Multiple regression was conducted to determine whether high school GPA predicts 11th grade ACT Science scores while controlling for gender, race, meal status, and school size.

The results in Table 8 of the F-test for 11th grade ACT Science scores, regressed on gender, race, meal status, school size, and high school GPA, show a significant result (F(8, 291) = 15.49, p < 0.0001). This suggests that the linear combination of independent variables is associated with 11th grade ACT Science scores. The R² shows that the combination of the independent variables (gender, race, meal status, school size, and high school GPA) explains 29.87% of the variance in 11th grade ACT Science scores, an increase of 21.02 percentage points over the 8.85% explained by model 2. The adjusted R² shows that the same combination explains 27.94% of the variance, an increase of 21.27 percentage points over the 6.67% explained by model 2. The difference between the R² and the adjusted R² is 1.93 percentage points.

Both the R² and the adjusted R² increased from model 1 to model 2 to model 3 as the variable of interest and the control variable were added. The main increase in adjusted R² occurred from model 2 to model 3 and was due to GPA. The gap between the R² and the adjusted R² remained small in both model 2 (2.18 percentage points) and model 3 (1.93 percentage points).

The adjusted R² changed from 0.06 in model 1 to 0.07 in model 2; in the full model 3, the adjusted R² is 0.28. This is a marked improvement in explaining the variance in

Educational Research: Theory & Practice, Volume 30, Issue 2, ISSN 2637-8965

Factors that Predict ACT Science Scores from a Multicultural Perspective

Shihua Chen Brazill

Montana State University

Abstract: This study investigated predictors for ACT Science scores, a test used by many universities to rank applicants. This study utilized quantitative research methods using the Montana Office of Public Instruction’s GEMS (Growth and Enhancement of Montana Students) data set. All advanced statistical analysis was conducted using Stata software IC/15. This research is significant for increasing the representation of under-represented groups in STEM education because it helps clarify three important relationships: (1) How well do gender, race, and meal status predict 11th grade ACT Science scores; (2) How well does school size predict 11th grade ACT Science scores while controlling for gender, race, and meal status; and (3) How well does high school GPA predict 11th grade ACT Science scores while controlling for gender, race, meal status, and school size.

Key Words: GEMS (growth and enhancement of Montana students), ACT Science scores, regression analyses, secondary data analysis, multicultural education

Acknowledgment: The author thanks Dr. Carrie Myers from Montana State University for her expertise in advanced educational statistics and providing valuable feedback; and also thanks Dr. Pat Munday for his mentoring and support.

LITERATURE REVIEW

The ACT, i.e. American College Test, is a standardized high-school achievement test that colleges use to rank applicants (Bauer & Wise, 2016). The ACT includes four subjects—English, mathematics, reading, science, and an optional writing test (Frey, 2018). The ACT score is on a scale of 1-36 for each of the four subject areas with a composite score averaging the four scores and rounded to the nearest whole number (Watson & Flamez, 2014). Maruyama (2012) found that, for high school students, “ACT scores in English, math, science, and reading were related respectively to student performance in English composition, college algebra, college biology, and college social studies/humanities” (p.3).

In examining variables that predict ACT scores, researchers have identified achievement gaps based on race, socioeconomic status, and other factors. As Soares (2015) notes, race and ethnicity are important variables in predicting ACT scores. Black and Hispanic students receive

Correspondence concerning this article should be addressed to Shihua Chen Brazill, E-mail: [email protected]

S. C. Brazill

2

lower average ACT scores compare to White and Asian Americans. Lotkowski, Robbins, and Noeth (2004) use High School Grade Point Average (HSGPA) and ACT Assessment scores as academic factors that predict postsecondary retention. Socioeconomic status (SES), parents’ educational achievement, and family income are non-academic factors that also predict postsecondary retention. A combination of academic and non-academic factors, together, best predict postsecondary retention and performance. Inzlicht and Ben-Zeev (2000) argue that gender stereotype negatively predicts females’ test performance. When it comes to test scores as well as other measures of science, math, and STEM ability, females positioned in a male dominated environment generally show lower achievement than males.

Class size is a good predictor of ACT performance. Students in small classes are more likely to take the ACT exam and they perform better than those in average-size classes, thus increasing their probability of college acceptance (Krueger & Whitmore, 2001; Schanzenbach, 2006). Small classes may not be better for all students, however. Budden & Hsing (2006) found that with small class sizes, socioeconomic status is magnified, with a larger achievement gap between poor and better-off students. Furthermore, qualified high school instructors play a significant role because such teachers provide higher quality instruction that leads to students having higher academic success. Interestingly, school size acts very differently from class size, as results from Lotkowski, Robbins, and Noeth (2004) indicate that school size has no relationship on college retention.

Some researchers question whether educational institutions should use standardized test scores for high school students to predict college success. Rooney & Schaeffer (1998) claim that standardized test scores are not the best predictors of college student success. Studies by Lotkowski, Robbins, and Noeth (2004) counter that claim, finding that ACT scores used in conjunction with other factors are good predictors of college success. Importantly, the study argued that ACT scores and other factors are excellent ways of identifying at-risk students so that colleges can improve retention, a finding that Bettinger, Evans, and Pope (2013) confirmed. Furthermore, as a core finding, Bettinger, Evans, and Pope (2013) show that the ACT composite score obscures the fact that ACT English and Math scores are strongly correlated with college success, whereas ACT Science and Reading scores are not. This study also found little correlation between ACT Science scores and either high school or college GPA.

Despite some criticism of using standardized test scores as a predictor of college success, Hays (2017) points out that Midwestern colleges use ACT tests as an important tool to make admission and financial aid decisions. In other words, whether or not ACT scores are a good predictor of achievement once students are admitted to college, their use in the admissions process means that ACT scores can be a significant factor in limiting achievement. Therefore, this research, using Montana high school students as a case study, is significant and has practical implications for high schools located in Western and Northern Rocky Mountain regions. Little research has been conducted to show what factors predict ACT Science Scores. This paper fills that gap by examining factors that predict 11th Grade ACT Science Scores for Montana high school students. It provides valuable insights for schools to better prepare their students to achieve higher ACT scores needed for college acceptance.

DESIGN AND METHODOLOGY

The purpose of this research is to investigate and determine academic and non-academic factors that predict 11th grade ACT Science scores. It is important to understand these factors so we as educators can help students excel in their studies and become college ready for Science and STEM fields. The overall intent and objectives are to examine the above relationships. The unit of analysis is 11th grade ACT Science scores. The social phenomenon is how well public high schools prepare students for their 11th grade ACT Science tests.

The independent variables (IVs) (gender, race, meal status, school size, and high school GPA) could predict the dependent variable (DV) (11th grade ACT Science scores) for the following reasons. Gender might predict the DV because males seem to do better with science in general than females. This could be because females receive many social cues that discourage them from pursuing science fields. Race could predict the DV because White students on average have more financial and social resources compared to Non-White students. Meal status could predict the DV because it is a proxy for socioeconomic status. Low income students in general might not perform as well academically because they lack role models, i.e. no professional in their family, or parents who place less emphasis on their children’s studies. They also have to struggle with non-academic issues, such as housing, food security, and health care. School size might predict the DV because larger schools have more resources to better support students. GPA could predict the DV because the two are similar as measurements of learning.

As researchers and teachers involved in higher education and STEM fields, we want to better understand student learning and develop strategies to improve student learning. 11th grade ACT Science scores are used in the research questions as a measured outcome of student learning. For the quantitative research design, the research questions, hypotheses, sample/population, variables, coding, level of measurement, and operationalization will be described in detail as follows.

RESEARCH QUESTIONS AND HYPOTHESES

Research question #1: How well do gender, race, and meal status predict 11th grade ACT Science scores? This research question is used to examine the relationship between the three demographic independent variables (gender, race, and meal status) and the dependent variable (11th grade ACT Science scores).

Ha: Gender, race, and meal status predict 11th grade ACT Science scores.

Research question #2: How well does school size predict 11th grade ACT Science scores while controlling for gender, race, and meal status? This research question is used to examine the relationship between the independent variable of interest (school size) and the dependent variable (11th grade ACT Science scores).

Ha: School size predicts 11th grade ACT Science scores while controlling for the three demographic independent variables (gender, race, and meal status).

Research question #3: How well does high school GPA predict 11th grade ACT Science scores while controlling for gender, race, meal status, and school size? This research question is used to examine the relationship between the independent variable (high school GPA) and the dependent variable (11th grade ACT Science scores) while controlling for gender, race, meal status, and school size.

Ha: High school GPA predicts 11th grade ACT Science scores while controlling for gender, race, meal status, and school size.

POPULATION AND SAMPLE

The population is 8887 high school seniors in the 2015 – 2016 academic year who entered postsecondary education in a Montana institution in 2016 – 2017. A random sample of 300 participants was selected. Table 1 illustrates relevant demographic statistics and descriptive data of the sample characteristics.

Table 1. Descriptive Data and Demographic Statistics of the Sample Characteristics

| Variable | Category | n (%) |
|---|---|---|
| Gender | Males (Reference) | 144 (48.0%) |
| | Females | 156 (52.0%) |
| Race | Non-White | 35 (11.7%) |
| | White (Reference) | 265 (88.3%) |
| Meal Status | F/free (Reference) | 48 (16.0%) |
| | N/non-free | 238 (79.3%) |
| | R/reduced | 14 (4.7%) |
| School Size/Class | A (Reference) | 70 (23.3%) |
| | AA | 139 (46.3%) |
| | B | 49 (16.3%) |
| | C | 42 (14.0%) |
| GPA | Continuous | Range 0-4 |
| Total | | 300 |

As shown in Table 1, this study includes 156 females and 144 males (male is the reference group). A total of 265 students identified themselves as White, and 35 students fall into the categories classified as Non-White (White is the reference group). Forty-eight received free lunch, 238 did not receive free lunch, and 14 received reduced-cost lunch (free lunch is the reference group). Seventy are from class A schools, 139 from class AA, 49 from class B, and 42 from class C (class A is the reference group).

STUDY VARIABLES

As shown in Table 2 below, for research question 1 (How well do gender, race, and meal status predict 11th grade ACT Science scores?), the independent variables are gender, race, and meal status, and they are all categorical/dummy variables. Gender is dummy coded where 0=male and 1=female. For race, White is dummy coded as “0”; all other races, including Hispanic, American Indian, Asian, and two or more races, are combined as Non-White due to small sample sizes and dummy coded as “1”. Meal status is dummy coded where 0=free, 1=reduced, and 2=non-free. The dependent variable is 11th grade ACT Science scores, which range from 1-36; it is a continuous variable measured at the ratio level. 11th grade ACT Science scores are used as a measure of learning.

For research question 2 (How well does school size predict 11th grade ACT Science scores while controlling for gender, race, and meal status?), the independent variable is school size. A is represented as “1”, B is represented as “2”, C is represented as “3”, and AA is represented as “4”. The control variables are gender, race, and meal status. The dependent variable is 11th grade ACT Science scores, which range from 1-36; it is a continuous variable measured at the ratio level.

For research question 3 (How well does high school GPA predict 11th grade ACT Science scores while controlling for gender, race, meal status, and school size?), the independent variable is GPA, which ranges from 0-4; it is a continuous variable measured at the ratio level. The dependent variable is 11th grade ACT Science scores, which range from 1-36; it is a continuous variable measured at the ratio level.
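The integer codes described above label categories rather than quantities, so a regression needs each multi-category variable expanded into 0/1 indicator columns, with the reference category left out. The Python sketch below illustrates that expansion under stated assumptions: the function name `dummy_code` and the sample student are hypothetical, and the code is not the author's Stata procedure (in Stata, the `i.` prefix plays this role).

```python
# Category codes taken from the coding scheme described in the text.
# The first code listed for each variable is its reference category.
CODES = {
    "gender": [0, 1],        # 0 = male (reference), 1 = female
    "race": [0, 1],          # 0 = White (reference), 1 = Non-White
    "meal": [0, 1, 2],       # 0 = free (reference), 1 = reduced, 2 = non-free
    "school": [1, 2, 3, 4],  # 1 = A (reference), 2 = B, 3 = C, 4 = AA
}


def dummy_code(record):
    """Expand one student's integer codes into 0/1 indicator columns.

    The reference category of each variable gets no column, so a student
    in the reference group has zeros in all of that variable's columns.
    """
    indicators = {}
    for name, levels in CODES.items():
        for level in levels[1:]:  # skip the reference category
            indicators[f"{name}_{level}"] = 1 if record[name] == level else 0
    return indicators


# Hypothetical student: female, White, non-free meal status, class AA school.
student = {"gender": 1, "race": 0, "meal": 2, "school": 4}
row = dummy_code(student)
print(row)
```

The expansion yields seven indicator columns (one for gender, one for race, two for meal status, three for school size), which matches the seven model degrees of freedom reported for model 2, F(7, 292).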

Table 2: Study Variables

| Research Question | Role | Name | Description | Coding | Level of Measurement |
|---|---|---|---|---|---|
| 1 | Independent variable | Gender | Male & Female | Male = 0 (Reference), Female = 1 | Categorical/Dichotomous (dummy) |
| 1 | Independent variable | Race | White & Non-White | White = 0 (Reference), Non-White = 1 | Categorical/Nominal (dummy) |
| 1 | Independent variable | Meal Status | Free, Reduced & Non-free | Free = 0 (Reference), Reduce = 1, Non-free = 2 | Categorical/Nominal (dummy) |
| 1 | Dependent variable | 11th grade ACT Science scores | Mean 11th grade ACT Science scores | 1-36 | Continuous |
| 2 | Independent variable | School Size | A, B, C, AA | A = 1 (Reference), B = 2, C = 3, AA = 4 | Categorical/Nominal (dummy) |
| 2 | Dependent variable | 11th grade ACT Science scores | Mean 11th grade ACT Science scores | 1-36 | Continuous |
| 3 | Independent variable | HS GPA | Mean high school GPA | 0-4 | Continuous |
| 3 | Dependent variable | 11th grade ACT Science scores | Mean 11th grade ACT Science scores | 1-36 | Continuous |

STATISTICAL STRATEGY

The analytical approach for this research was based on Urdan (2011), Acock (2018), and Mehmetoglu & Jakobsen (2017). Multiple regression estimates the net effect of $X_1$ while holding $X_2$ constant, where $X_2$ is a control variable or covariate (Mehmetoglu & Jakobsen, 2017, p. 73). Multiple regressions ($Y = a + b_1X_1 + b_2X_2 + b_3X_3 + \dots + b_iX_i$, where $a$ is the intercept and each $b$ is an unstandardized beta coefficient) were used to examine the relationship between a continuous (i.e., interval or ratio scaled) dependent variable (11th grade ACT Science scores) and continuous (high school GPA) and dummy/categorical independent variables (gender, race, meal status, and school size) (Urdan, 2011).

Three Ordinary Least Squares (OLS) regression models were used to evaluate the three research questions and hypotheses. OLS regression is a conceptual model and a mathematical function used to explain the relationships among variables of interest. The three models follow the same logic, and each model was built on top of the previous one. For model 1, the dependent variable, 11th grade ACT Science scores, was regressed on the independent variables (gender, race, and meal status). The purpose of model 1 was to test the net effect of gender, race, and meal status on 11th grade ACT Science scores, that is, to assess how well the demographic variables predict the dependent variable. For model 2, the variable of interest, school size, was added to model 1 to test its relationship with 11th grade ACT Science scores. The additional control variable, high school GPA, was added for model 3 (the full model).

Research Question 1/Hypothesis 1: Model 1: $\text{11th grade ACT Science scores} = a + b_1(\text{gender}) + b_2(\text{race}) + b_3(\text{meal status})$

Research Question 2/Hypothesis 2: Model 2: $\text{11th grade ACT Science scores} = a + b_1(\text{gender}) + b_2(\text{race}) + b_3(\text{meal status}) + b_4(\text{school size})$

Research Question 3/Hypothesis 3: Model 3: $\text{11th grade ACT Science scores} = a + b_1(\text{gender}) + b_2(\text{race}) + b_3(\text{meal status}) + b_4(\text{school size}) + b_5(\text{high school GPA})$
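The nested structure of the models above, in which each model adds predictors to the previous one, can be sketched in a few lines of Python. This is a minimal illustration, not the author's Stata analysis: the helper `ols` is a hypothetical name, the six "students" are synthetic noise-free values rather than GEMS records, and only two dummy predictors plus GPA are included to keep the example short.

```python
def ols(predictors, outcome):
    """Least-squares coefficients [a, b1, b2, ...] via the normal equations."""
    rows = [[1.0] + [float(v) for v in r] for r in predictors]  # prepend intercept
    k = len(rows[0])
    # Build the augmented system [X'X | X'y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(rows, outcome))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] for i in range(k)]


# Synthetic toy data (NOT the GEMS sample): (female, nonwhite, gpa).
students = [
    (0, 0, 3.0), (1, 0, 3.5), (0, 1, 2.5),
    (1, 1, 3.0), (0, 0, 4.0), (1, 0, 2.0),
]
# Scores generated exactly from a known equation, so the fit can be checked.
scores = [10 - 1 * f - 3 * nw + 3 * gpa for f, nw, gpa in students]

model_demo = ols([(f, nw) for f, nw, _ in students], scores)  # demographics only
model_full = ols(students, scores)                            # adds GPA
print("demographics only:", [round(b, 3) for b in model_demo])
print("with GPA added:   ", [round(b, 3) for b in model_full])
```

Because the toy scores are generated without noise, the full model recovers the generating coefficients, while the demographics-only model has to absorb the omitted GPA effect into its remaining coefficients. That shift is the reason for comparing coefficients across nested models.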

Descriptive statistics for 11th grade ACT Science scores and high school GPA are shown in Table 3. The mean for high school GPA is 3.31 and the standard deviation is 0.54. The mean for 11th grade ACT Science scores is 21.94 and the standard deviation is 4.77.

Table 3: Descriptive Statistics for 11th Grade ACT Science Scores and High School GPA

Stata command: `. tabstat ACTScience11 HS_GPA, stats(mean, sd, range count)`

| stats | ACTSc~11 | HS_GPA |
|---|---|---|
| mean | 21.94333 | 3.309002 |
| sd | 4.771023 | .5441673 |
| range | 34 | 2.43 |
| N | 300 | 300 |

As shown in Table 4, the normality of the variables was evaluated by skewness and kurtosis statistics.

Table 4: Skewness and Kurtosis

Stata command: `. sktest gendercoded RaceEthnicityFedCoded MealStatusCoded ClassCoded HS_GPA ACTScience11`

Skewness/Kurtosis tests for Normality

| Variable | Obs | Pr(Skewness) | Pr(Kurtosis) | adj chi2(2) | Prob>chi2 |
|---|---|---|---|---|---|
| gendercoded | 300 | 0.5628 | . | . | . |
| RaceEthn~ded | 300 | 0.0000 | 0.0000 | . | 0.0000 |
| MealStatus~d | 300 | 0.0000 | 0.0178 | 60.88 | 0.0000 |
| ClassCoded | 300 | 0.0034 | . | . | . |
| HS_GPA | 300 | 0.0001 | 0.0311 | 16.68 | 0.0002 |
| ACTScience11 | 300 | 0.5879 | 0.0071 | 7.20 | 0.0274 |

Stata command: `. tabstat ACTScience11 gendercoded RaceEthnicityFedCoded MealStatusCoded ClassCoded HS_GPA, statistics (Skewness, Kurtosis)`

| stats | ACTSc~11 | gender~d | Race~ded | MealSt~d | ClassC~d | HS_GPA |
|---|---|---|---|---|---|---|
| skewness | -.0749568 | -.0800641 | 2.388201 | -1.63251 | -.4196527 | -.5748774 |
| kurtosis | 3.975316 | 1.00641 | 6.703504 | 3.805793 | 1.522014 | 2.50809 |

Skewness and kurtosis were tested using procedures described by Acock (2018). The test for skewness was significant (p < 0.05) for high school GPA, indicating that it has negative skewness (GPA skewness=-0.57, p=0.000). The test for kurtosis was significant (p < 0.05) for GPA and 11th grade ACT Science scores, indicating that these variables have positive kurtosis (GPA kurtosis=2.51, p=0.031; and 11th grade ACT Science kurtosis=3.98, p=0.007).
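For readers who want to see how statistics of this kind are computed, the following Python sketch calculates skewness and kurtosis from central moments. It assumes the moment-based definitions, under which a normal distribution has skewness 0 and kurtosis 3, so values are read against 3 rather than 0. The function name `shape_stats` and the sample values are hypothetical and are not drawn from the GEMS data.

```python
def shape_stats(values):
    """Return (skewness, kurtosis) computed from central moments.

    skewness = m3 / m2**1.5 and kurtosis = m4 / m2**2, where m_r is the
    mean of (x - mean)**r. On this scale a normal distribution has
    skewness 0 and kurtosis 3.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


# Toy values: a symmetric sample and a sample with a long left tail.
symmetric = [1, 2, 3, 4, 5]
left_tailed = [1, 4, 5, 5, 5]

print(shape_stats(symmetric))
print(shape_stats(left_tailed))
```

A sample with a long left tail, like the second one, produces a negative skewness, which is the pattern the text reports for high school GPA.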

Model specification and regression residuals were tested using a series of diagnostic procedures. As shown in Table 5, no violations were observed for heteroskedasticity, multicollinearity, model specification, functional form, or influential observations. The normality of the residuals was tested using the Shapiro-Wilk W normality test. The findings showed that the normality assumption was violated, suggesting that some of the residuals are not normally distributed. This could be due to the categorical independent variables of gender, race, meal status, and school size/class.

Table 5: Statistical Analysis (statistical tests, assumptions, and diagnostic procedures)

Stata commands: `. quietly regress ACTScience11 gendercoded RaceEthnicityFedCoded MealStatusCoded ClassCoded HS_GPA` followed by `. regcheck`

| Regression assumption | Test | Result | We seek value |
|---|---|---|---|
| 1) no heteroskedasticity problem | Breusch-Pagan hettest | Chi2(1): 0.086, p-value: 0.769 | > 0.05 |
| 2) no multicollinearity problem | Variance inflation factor | HS_GPA: 1.09; MealStatusCoded: 1.07; RaceEthnicityFedCoded: 1.05; gendercoded: 1.05; ClassCoded: 1.01 | < 5.00 |
| 3) residuals are not normally distributed | Shapiro-Wilk W normality test | z: 5.178, p-value: 0.000 | > 0.01 |
| 4) no specification problem | Linktest | t: 1.112, p-value: 0.267 | > 0.05 |
| 5) appropriate functional form | Test for appropriate functional form | F(3,291): 1.404, p-value: 0.242 | > 0.05 |
| 6) no influential observations | Cook's distance | no distance is above the cutoff | < 1.00 |

PREPARING THE DATA

All data preparation used the GEMS dataset in Excel, which was then imported to Stata. Stata IC/15 was used to analyze the data and produce the visual representations (graphs) of the data. The hypotheses were evaluated at the .05 level of significance. We kept a research log that described the analytical procedures. One data preparation strategy was recoding the gender, race, meal status, and school size (class) groups. Gender is dummy coded where 0=male and 1=female. For race, White is dummy coded as “0”; all other races, including Hispanic, American Indian, Asian, and two or more races, are combined as Non-White due to small sample sizes and dummy coded as “1”. Meal status is coded where 0=free, 1=reduced, and 2=non-free. School size (class) A is coded as “1”, B is coded as “2”, C is coded as “3”, and AA is coded as “4”.

RESULTS

The descriptive statistics are shown in Tables 1 and 3. Please refer to Appendix A for histograms of the independent and dependent variables and Appendix B for the scatter plot of the continuous independent variable with the dependent variable. Three OLS models were estimated to evaluate the three research questions.

Research question 1: How well do gender, race, and meal status predict 11th grade ACT Science scores? In model 1 (11th grade ACT Science scores = gender + race + meal status), multiple regression was conducted to test research question 1 to determine whether the linear combination of the independent variables (gender, race, and meal status) has explanatory power for the dependent variable, 11th grade ACT Science scores.

The results in Table 6 of the F-test for 11th grade ACT Science scores, regressed on gender, race, and meal status, show a significant relationship (F (4, 295) = 5.62, p=0.0002). This suggests that the linear combination of independent variables is significantly associated with 11th grade ACT Science scores. The R² shows that the combination of independent variables explains 7.08% of the variance in 11th grade ACT Science scores. The adjusted R² shows that the combination of independent variables explains 5.82% of the variance in 11th grade ACT Science scores. There is a difference of 1.26 percentage points between the R² and the adjusted R².

The R² for model 1 is 0.07. The adjusted R² for this model is 0.06. The adjusted R² is the corrected version of the original R² and serves as a conservative adjustment to account for the added predictor variables (Mehmetoglu & Jakobsen, 2017). Therefore, the total variance explained by the linear combination of independent variables for 11th grade ACT Science scores is 6%.

Table 6: Model 1 Result

The significant variable in model 1 is race (b=-2.89, p=0.001). Compared to White students, Non-White students score 2.89 points lower on the 11th grade ACT Science test.

Research question 2: How well does school size predict 11th grade ACT Science scores while controlling for gender, race, and meal status? In model 2 (11th grade ACT Science scores = gender + race + meal status + school size), we added school size as a variable of interest. Multiple regression was conducted to test research question 2 to determine if school size predicts 11th grade ACT Science scores while controlling for gender, race, and meal status.

The results in Table 7 of the F-test for 11th grade ACT Science scores, regressed on gender, race, meal status, and school size, show a significant result (F (7, 292) = 4.05, p=0.0003). This suggests that the linear combination of independent variables is associated with 11th grade ACT Science scores. The R² shows that the combination of independent variables (gender, race, meal status, and school size) explains 8.85% of the variance in 11th grade ACT Science scores. The adjusted R² shows that the combination of independent variables explains 6.67% of the variance in 11th grade ACT Science scores. There is a difference of 2.18 percentage points between the R² and the adjusted R². The small difference between the R² and adjusted R² suggests that the new predictor variable (school size) added to the model is important.

Table 7: Model 2 Result

Stata command: `. reg ACTScience11 i.gendercoded i.RaceEthnicityFedCoded i.MealStatusCoded i.ClassCoded`

| Source | SS | df | MS |
|---|---|---|---|
| Model | 602.419992 | 7 | 86.0599989 |
| Residual | 6203.61667 | 292 | 21.2452626 |
| Total | 6806.03667 | 299 | 22.7626644 |

| Statistic | Value |
|---|---|
| Number of obs | 300 |
| F(7, 292) | 4.05 |
| Prob > F | 0.0003 |
| R-squared | 0.0885 |
| Adj R-squared | 0.0667 |
| Root MSE | 4.6093 |

| ACTScience11 | Coef. | Std. Err. | t | P>\|t\| | [95% Conf. Interval] |
|---|---|---|---|---|---|
| 1.gendercoded | -.8115025 | .5437116 | -1.49 | 0.137 | -1.881593 to .2585879 |
| 1.RaceEthnicityFedCoded | -2.89793 | .8448744 | -3.43 | 0.001 | -4.560746 to -1.235115 |
| MealStatusCoded 1 | -.3984726 | 1.416529 | -0.28 | 0.779 | -3.186373 to 2.389427 |
| MealStatusCoded 2 | 1.284566 | .7500529 | 1.71 | 0.088 | -.1916297 to 2.760761 |
| ClassCoded 2 | -1.288084 | .8651329 | -1.49 | 0.138 | -2.990771 to .4146021 |
| ClassCoded 3 | .1041843 | .9050692 | 0.12 | 0.908 | -1.677102 to 1.88547 |
| ClassCoded 4 | .5426423 | .682891 | 0.79 | 0.427 | -.80137 to 1.886655 |
| _cons | 21.64729 | .9156198 | 23.64 | 0.000 | 19.84524 to 23.44934 |
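The summary statistics reported with Table 7 follow directly from its sums of squares and degrees of freedom. The short Python sketch below reproduces them as a check on the arithmetic; the variable names are illustrative, and the input figures are the ones printed in the Stata output for model 2.

```python
import math

# Sums of squares and degrees of freedom as reported in Table 7 (model 2).
ss_model, df_model = 602.419992, 7
ss_resid, df_resid = 6203.61667, 292
ss_total = ss_model + ss_resid
df_total = df_model + df_resid

# Mean squares: each sum of squares divided by its degrees of freedom.
ms_model = ss_model / df_model
ms_resid = ss_resid / df_resid

f_stat = ms_model / ms_resid                    # F(7, 292)
r2 = ss_model / ss_total                        # share of variance explained
adj_r2 = 1 - ms_resid / (ss_total / df_total)   # penalized for predictors
root_mse = math.sqrt(ms_resid)                  # residual standard error

print(f"F = {f_stat:.2f}, R2 = {r2:.4f}, adj R2 = {adj_r2:.4f}, Root MSE = {root_mse:.4f}")
```

The adjusted R² divides each sum of squares by its degrees of freedom before taking the ratio, which is why it falls below the R² as predictors are added without a matching gain in explained variance.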

The significant variable is race (b=-2.90, p=0.001). Gender, meal status, and school size are not significant. Compared to White students, Non-White students score 2.90 points lower on the 11th grade ACT Science test. Between model 1 and model 2, the unstandardized coefficient for race changed minimally, by 0.01, when school size was added.

Research question 3: How well does high school GPA predict 11th grade ACT Science scores while controlling for gender, race, meal status, and school size? High school GPA was added in model 3 (11th grade ACT Science scores = gender + race + meal status + school size + high school GPA). Multiple regression was conducted to test research question 3 to determine if high school GPA predicts 11th grade ACT Science scores while controlling for gender, race, meal status, and school size.

The results in Table 8 of the F-test for 11th grade ACT Science scores, regressed on gender, race, meal status, school size, and high school GPA, show a significant result (F (8, 291) = 15.49, p < 0.0001). This suggests that the linear combination of independent variables is associated with 11th grade ACT Science scores. The R² shows that the combination of the independent variables (gender, race, meal status, school size, and high school GPA) explains 29.87% of the variance in 11th grade ACT Science scores. The adjusted R² shows that the combination of independent variables explains 27.94% of the variance in 11th grade ACT Science scores. There is a difference of 1.93 percentage points between the R² and the adjusted R².

Relative to model 2, the R² rose from 8.85% to 29.87%, an increase of 21.02 percentage points, and the adjusted R² rose from 6.67% to 27.94%, an increase of 21.27 percentage points.

Both the R² and the adjusted R² increased from model 1 to model 2 to model 3 as the variable of interest and the additional control variable were added. Most of the increase in adjusted R² occurred from model 2 to model 3 and was due to GPA. The increases in R² and adjusted R² from model 2 to model 3 were nearly identical.

The adjusted R² changed from 0.06 in model 1 to 0.07 in model 2; in the full model 3, the adjusted R² is 0.28. This is a marked improvement in explaining the variance in
