Considering English Language Proficiency within Systems of Educational Accountability under the Every Student Succeeds Act
Susan Lyons, Ph.D. and Nathan Dadey, Ph.D. Center for Assessment March 8, 2017
This paper was written in collaboration with the Latino Policy Forum with significant financial support from the High Quality Assessment Project.
Introduction

The requirement for an indicator of "progress in achieving English language proficiency" (English language proficiency) for English learners (ELs) must now be included in state systems of educational accountability under the Every Student Succeeds Act (ESSA, §1111(c)(4)). Specifically, the statute requires that English language proficiency be addressed in two1 specific ways within systems of accountability—as part of the state's long-term and interim goals, and as part of an annual system that meaningfully differentiates schools. ESSA's inclusion of English language proficiency within Title I accountability systems represents a key juncture in accountability policy that provides states the opportunity to define, or redefine, progress in achieving English language proficiency in a system of accountability that considers all EL students2. The goal of this brief is to first provide an overview of the ESSA requirements around English language proficiency within systems of accountability, and then to offer guidance on the ways in which (a) progress in achieving English language proficiency can be defined, (b) these various definitions can be incorporated into ESSA-compliant state accountability systems, and (c) a state can evaluate the validity of a state ESSA accountability system for meeting EL policy goals.
States must first establish a vision for English learners and English language acquisition, embedded in a coherent theory of action, before engaging in accountability system design. There are a variety of design decisions that must be made in order to create a new school accountability system under ESSA. The new federal law permits wide latitude in the specifics of state accountability systems – allowing for a variety of types of indicators reported, the stated goals and targets, and the rewards or consequences for schools. Therefore, state leaders need to base complex design decisions on a clear state vision. This state vision is foundational. By providing clearly articulated educational goals for all students, and for English learners in particular, the state vision provides the basis for the evaluation of any particular aspect of the accountability system, as well as the role the accountability system plays within the state educational system. That is, a clearly outlined vision and accompanying theory of action is necessary to facilitate the design of a coherent accountability system.
ESSA Requirements

ESSA includes a number of major provisions regarding ELs and English language proficiency, many of which are similar to provisions in the No Child Left Behind Act of 2001. Outside of accountability, these provisions include requirements that states adopt English language proficiency standards aligned with state academic standards, annual administration of an
1 These two uses are mandated by the statute. However, these are not the only two uses for ELP indicators – states may wish to develop additional uses within their systems of accountability, not for federal compliance, but in order to better meet specific policy needs.
2 Under the No Child Left Behind Act of 2001, the achievement of EL students was covered under Title III and thus accountability for EL students only applied to local educational agencies receiving Title III funds.
assessment of English language proficiency for all ELs, and statewide entrance and exit requirements for ELs (cf. CCSSO, 2016). In terms of accountability, the law has two specific requirements around English language proficiency3:
1. Long-Term Goals and Interim Progress. The statewide accountability system must include "State-designed long-term goals, which shall include measures of interim progress towards meeting such goals… for increases in the percentage of such students making progress in achieving English language proficiency, as defined by the State" (ESSA, §1111(b)(4)(A)).
2. Annual Indicator. The statewide accountability system must also include an annual measure of "progress in achieving English language proficiency, as defined by the state… within a State-determined timeline for all English learners" for all public schools in the state, which is to be used as part of a "system of meaningful differentiation" to identify schools for intervention4 (ESSA, §1111(b)(4)(B to D)).
These two requirements can be tightly or loosely coupled. For example, the annual measure of progress towards English language proficiency used to differentiate schools could be defined by working backwards from the state's long-term goals (i.e., tightly coupled), or a state could define its progress towards English language proficiency indicator and long-term goals separately (i.e., loosely coupled).
The final regulations on ESSA accountability are currently suspended and under consideration by Congress under the Congressional Review Act. On February 7, the House of Representatives voted to overturn the accountability regulations. As of the writing of this paper, the vote has yet to go to the Senate, so the status of the final regulations remains uncertain. If the Senate also votes to overturn the regulations, then the Department of Education will not be allowed to release new regulations that are substantially similar to the revoked regulations, meaning that the law will likely need to be implemented by states without regulatory clarification (Ujifusa, 2017). Appendices A and B provide tables that separate the language of the statute from the language of the regulations. While it is likely the regulations will not be legally enforceable, they may still be useful to states in providing additional specificity about statutory intent and are thus referenced throughout this paper when relevant.
Though the regulations provide further detail regarding the ESSA requirements, both the statute and regulations leave a number of decisions regarding the progress towards English language proficiency indicator in the hands of states. For example, what constitutes "progress in achieving English language proficiency"? Should the long-term goals and measures of interim progress define progress in achieving English language proficiency in the same way as it is defined for the
3 Note, ELs are also included as a federal accountability subgroup for all of the other indicators within the accountability system, which means EL performance on each of the indicators must be reported separately for every school.
4 I.e., either Comprehensive Support and Improvement or Targeted Support and Improvement.
annual indicator? What timeline is defensible? The sections below consider these types of questions and detail the requirements of the statute and regulations.
Long-Term Goal and Measures of Interim Progress

The statute and, in particular, the regulations require that a state develop ambitious "long-term goals and measures of interim progress for increases in the percentage of all English learners in the State making annual progress toward attaining English language proficiency" (34 C.F.R. §200.13(c)). There is considerable flexibility in how the state defines "making annual progress toward attaining English language proficiency for each student"—i.e., student-level progress for the required indicator.
Student-level progress. The regulations clarify that states must develop a procedure for calculating research-based, student-level targets for English learners to reach English language proficiency. This does not mean that all English learners need to have the same targets, but instead, that their growth targets must be calculated using a consistent methodology for all students. The regulations state that the procedure may take into account any of the following student-level characteristics: the student's initial level of English language proficiency, time in language instruction, grade level, age, native language proficiency, and limited or interrupted formal education, if any. The regulations also require that the targets be based on research and data. For example, it would be reasonable for a state to expect larger gains for younger students than for older students; by setting targets to reflect the known differences in language acquisition, the state can reasonably expect all English learners to show progress. These targets must also be based on a state-determined maximum number of years by which a student should reach proficiency. As with the targets, this maximum number of years must be based on research and can vary by student demographic factors. Importantly, if an English learner does not attain English language proficiency within the state-determined maximum, that student must be provided English learner services until attainment. There is a body of research that can be leveraged to help states make informed decisions about target setting (see Hakuta, Goto Butler & Witt, 2000; MacSwan & Pray, 2005; Motamed, 2015; Slavin, Madden, Calderón, Chamberlain, & Hennessy, 2011). Additionally, states should use existing data to model language proficiency trends within the state to better understand the likely implications of the target-setting decisions. Using state-specific data helps ensure that the targets and maximum number of years to reach proficiency are reasonable and achievable.
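The target-setting logic described above can be sketched in code. The sketch below is purely hypothetical: the proficiency cut score, the timeline rule, and the grade adjustment are invented assumptions standing in for the research-based values a state would actually adopt.

```python
# Hypothetical sketch: per-student annual ELP growth targets derived from a
# state-determined maximum timeline. The proficiency cut (5.0), the timeline
# lookup, and the grade adjustment are illustrative assumptions only.

PROFICIENT_LEVEL = 5.0  # assumed composite score marking English proficiency

def max_years_to_proficiency(initial_level: float, grade: int) -> int:
    """Illustrative timeline rule: lower starting levels get more years,
    and older students get one extra year, reflecting research showing
    slower language gains for older students."""
    base = 6 - int(initial_level)  # e.g., a student entering at level 1 gets 5 years
    return max(1, base + (1 if grade >= 6 else 0))

def annual_target(initial_level: float, grade: int) -> float:
    """Equal annual increments from the initial level to proficiency,
    spread over the state-determined maximum number of years."""
    years = max_years_to_proficiency(initial_level, grade)
    return (PROFICIENT_LEVEL - initial_level) / years

# A student entering at level 2.0 in grade 3 has 4 years to reach 5.0,
# so the annual growth target is 0.75 levels per year.
print(annual_target(2.0, 3))
```

Because the same formula is applied to every student, the methodology is consistent statewide even though the resulting targets differ by initial level and grade, as the regulations allow.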
Long-term goals and measures of interim progress. The student-level targets describe what is expected from individual students. The statute and regulations also require that the state set a long-term goal for the population of EL students. That is, the state must set a long-term goal that defines the percentage of English learners in the state making progress toward English language proficiency at a given point in the future. States have flexibility in what specific percentage the goal is, the timeframe for achieving the goal, and how measures of interim progress—intermediary goals
that define increases in the percentage of English learners in the state making progress toward ELP—are defined. States must also provide descriptions for how each of these elements is established. As with setting student-level targets, states will want to spend time examining their trend data to understand the historical progress of English learners toward English proficiency within the state.
The long-term goals, and supporting measures of interim progress, should also be aligned with the state's vision for ELs and the theory of action for making progress toward the vision. If the student-level targets are set thoughtfully and appropriately, it may not be unreasonable for a state to set the long-term goal of 100% of ELs making progress toward proficiency annually. However, with such high expectations, careful consideration should be given to what interventions and supports will be provided by both local and state actors. Unless EL students are offered substantially more support, a long-term goal that assumes levels of improvement well beyond those shown in historical trends may be suspect. Ultimately, the long-term goal should be challenging but achievable given the level of support and the time necessary to implement program improvement. Once the goal and timeframe are established, the measures of interim progress may be defined by backwards mapping—with an ambitious yet reasonable trajectory of attainment towards the goal—to set intermediate benchmarks that would indicate progress toward success on the long-term goal. It is worth noting that, like EL language acquisition, program improvement and progress toward the long-term goal may not follow a linear pattern.
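The backwards mapping described above can be illustrated with a short sketch. The baseline percentage, goal, and timeframe below are invented for illustration, and the sketch uses a linear trajectory even though, as noted, real progress need not be linear.

```python
# Hypothetical sketch: backwards-mapping measures of interim progress from a
# long-term goal. Baseline, goal, and timeframe are illustrative assumptions.

def interim_targets(baseline_pct: float, goal_pct: float, n_years: int) -> list:
    """Linear trajectory from the baseline to the long-term goal: one
    interim benchmark per year, ending at the goal itself."""
    step = (goal_pct - baseline_pct) / n_years
    return [round(baseline_pct + step * y, 1) for y in range(1, n_years + 1)]

# If 60% of ELs make annual progress today and the long-term goal is 85%
# in 5 years, the interim benchmarks step up by 5 points per year.
print(interim_targets(60.0, 85.0, 5))
```

A state could replace the linear step with a curved trajectory (smaller gains early, larger gains once program improvements take hold) without changing the overall approach.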
Annual Indicator

It is a common misconception that accountability systems under ESSA represent a U-turn from those under No Child Left Behind (NCLB). The assessment provisions under ESSA are highly similar to NCLB's, and annual, statewide content assessments in math and English language arts (ELA) remain a large part of accountability. However, accountability systems under ESSA require multiple additional indicators, including an indicator of progress toward English proficiency for English learners. In all, there are at least five categories of indicators that comprise accountability systems under ESSA:
1. Academic achievement as measured by annual, statewide assessments in math and ELA in grades 3-8 and high school;
2. Academic progress such as growth or achievement gap for elementary and middle schools (this is optional for high schools);
3. Graduation rate for high schools. This indicator category must include the 4-year cohort-adjusted graduation rate and may also include extended-year graduation rates;
4. Progress in achieving English language proficiency, the topic of the current paper; and,
5. Additional indicator(s) of school quality or student success.
Importantly, consequences for schools are attached to the summative annual determination based on all of the indicators listed above. Identification for targeted and comprehensive support must
be informed by all of the accountability indicators. This is distinct from the long-term goals in that federal accountability does not require school-level consequences or action related to performance on those goals.
The English language proficiency indicator must be reported for at least all English learners in grades 3-8 and those who are assessed in grades 9-12. States may choose to include the assessment results of English learners in earlier grades and may have good reason to do so, given that younger students tend to show the most growth in English language proficiency. This decision is discussed in more detail in the section entitled "Incorporating English language proficiency into Systems of Accountability." The final regulations further outline three requirements related to the indicator of progress towards English language proficiency: 1) it must use objective, valid measures of student progress on the proficiency assessment, comparing results across years; 2) the indicator of progress must be aligned with the applicable timelines for a student to attain English proficiency within the State-determined maximum number of years; and 3) the indicator may also comprise a measure of proficiency, for example, the percentage increase of English learners attaining proficiency on the English language proficiency assessment as compared with prior years. Lastly, all indicators in Title I accountability must be reported individually using at least three levels of performance. This means that the ELP indicator must differentiate among schools by reporting at least three categories of performance. The following section of the paper provides a deep dive into the different options for defining the measure of progress for this accountability indicator.
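The three-level reporting requirement can be sketched as a simple cut-score lookup. The cut scores and level labels below are invented assumptions; in practice a state would set them through a standard-setting process.

```python
# Hypothetical sketch: reporting the ELP indicator in at least three
# performance levels, as the regulations require. The cut scores (70, 50)
# and labels are illustrative assumptions, not actual state policy.

CUTS = [(70.0, "Exceeds Target"), (50.0, "Meets Target")]  # descending order

def performance_level(pct_making_progress: float) -> str:
    """Map a school's percent of ELs making progress to one of three
    reporting levels; anything below the lowest cut is 'Below Target'."""
    for cut, label in CUTS:
        if pct_making_progress >= cut:
            return label
    return "Below Target"

print(performance_level(82.5))  # a school with 82.5% of ELs making progress
print(performance_level(44.0))
```

A state could add more cuts to report four or five levels; the regulation only sets a floor of three.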
Defining and Evaluating an English language proficiency Indicator

We start with a heuristic to help show all of the major pieces that influence a school indicator of progress in achieving English language proficiency. Figure 1 illustrates that the state context, the specific model used to define the English language proficiency indicator, and the business rules around the implementation of the model all play a role in determining school performance classification on the English language proficiency indicator. State context deals with the on-the-ground reality of EL students within the state (e.g., Are ELs concentrated in a small number of schools or spread out across many schools? Are ELs concentrated in particular grades?). The statistical model refers to the methodology used to produce scores based on the English language proficiency assessment, which can then be aggregated to the school level. This area encompasses both the class of model used, as well as the way the model is specified and estimated. For example, does the model control for student characteristics and, if so, which ones? Finally, the business rules specify how the results of the statistical model are aggregated (e.g., How many students are needed before a school receives a score? Will the results be pooled over years? Will reclassified ELs be included in the aggregation?). In addition, the information in any one box can inform decisions in another box. For example, if there are few EL students per school, the state might want to choose a smaller n-size in order to provide ratings to as many schools serving ELs as possible, despite issues with precision caused by small sample sizes. These types of
tradeoffs are common and a state will need to weigh the positives and negatives of any particular approach. We present these categories here as a structure that can be useful for guiding state discussions about this indicator.
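One such business rule, a minimum n-size met by pooling across years, can be sketched as follows. The n-size of 10 and the scores are invented for illustration; a state would choose its own n-size after weighing coverage against precision, as discussed above.

```python
# Hypothetical sketch of one business rule from Figure 1: a minimum n-size,
# with pooling across years when a single year has too few English learners.
# The n-size of 10 and the student scores are invented for illustration.

MIN_N = 10  # assumed minimum number of ELs before a school receives a rating

def school_rating(yearly_scores):
    """yearly_scores maps year -> list of student-level progress scores.
    Pool the most recent years until the minimum n-size is met; return
    the pooled mean, or None if the school never meets the n-size."""
    pooled = []
    for year in sorted(yearly_scores, reverse=True):
        pooled.extend(yearly_scores[year])
        if len(pooled) >= MIN_N:
            return sum(pooled) / len(pooled)
    return None

# 6 ELs in 2016 plus 7 in 2015 meet the n-size only when pooled.
print(school_rating({2016: [0.6] * 6, 2015: [0.4] * 7}))
```

The tradeoff is visible in the code: pooling lets more schools receive ratings, but the rating then mixes cohorts from different years.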
Figure 1. Heuristic of Areas of Concern for English language proficiency Indicator.
Finally, it is worth re-emphasizing that the English language proficiency indicator is one of at least five indicators that will ultimately decide the classification of a school under the full accountability system. Thus, the ultimate impact of the English language proficiency indicator needs to be considered in relation to the other indicators. For example, what role will the English language proficiency indicator play? What weight will the English language proficiency indicator have? Again, such questions need to be considered in light of a state's vision and theory of action. These questions are considered more deeply in the section entitled "Incorporating English language proficiency into Systems of Accountability."
Defining Progress in Achieving English language proficiency

The law requires that an indicator of progress in achieving English language proficiency be used. This requirement has generally been understood as requiring the quantification of across-year changes in individual student performance on the English language proficiency assessment. However, the regulations do allow for the English language proficiency indicator to be a combination of growth and status. Given this understanding, prior work examining growth models for general student populations is applicable (e.g., Castellano & Ho, 2013; Goldschmidt, Choi, & Beaudoin, 2013) but should be re-evaluated in light of the unique characteristics of ELs.
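The growth-plus-status combination the regulations permit can be illustrated with a minimal sketch. The 70/30 weighting and both input measures are invented assumptions; a state would set weights in line with its theory of action.

```python
# Hypothetical sketch: combining a growth measure and a status measure into
# one ELP indicator, which the regulations permit. The 70/30 weighting is an
# illustrative assumption, not a recommendation.

GROWTH_WEIGHT, STATUS_WEIGHT = 0.7, 0.3  # assumed weights; must sum to 1

def combined_elp_indicator(pct_meeting_growth_targets: float,
                           pct_attaining_proficiency: float) -> float:
    """Weighted combination of growth (percent of ELs meeting their growth
    targets) and status (percent attaining proficiency this year)."""
    return (GROWTH_WEIGHT * pct_meeting_growth_targets
            + STATUS_WEIGHT * pct_attaining_proficiency)

# A school where 60% of ELs met growth targets and 20% attained proficiency:
print(combined_elp_indicator(60.0, 20.0))
```

Weighting growth more heavily, as in this sketch, credits schools whose ELs are progressing even when few have yet crossed the proficiency bar.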
In their recent paper, "Incorporating English Learner Progress into State Accountability Systems," Goldschmidt and Hakuta (2017) evaluate options for growth indicators from a predominantly technical perspective. In this paper, we build on their work by integrating their
perspective with additional considerations related to the implementation and evaluation of the English language proficiency indicator within an accountability system.
Some common approaches for characterizing change across years follow (Goldschmidt & Hakuta, 2017)5:
• Transition (or Value) tables: Transition tables describe growth as a student's change in performance level from one year to the next, dependent on the student's prior status. Transition tables often use performance levels that are divided into sub-performance levels to illustrate growth within a performance level (e.g., Level 1A, Level 1B, Level 2A, Level 2B, Level 3A, etc.).
• Proficiency rates: Goldschmidt and Hakuta (2017) offer that the percentage of students reaching English language proficiency is a relevant indicator for monitoring ELs' progress. They argue this method is transparent, but note some challenges in that it will be sensitive to policies regarding reclassification and does not award credit for progress toward proficiency, only counting those students who reach proficiency.
• Gain scores: Gain scores describe a student's growth based on the difference between test scores, calculated by subtracting an earlier score from a later score. Gain scores require the use of a vertical scale (i.e., scale scores that range across grade levels). Gain scores can be in the raw metric of the scale scores or they can be normalized in order to provide a norm-referenced interpretation of relative growth.
• Growth rates: Growth rates characterize the rate at which student scores change over time. This is determined by calculating a best fit line, or a trend line, across a series of data points to estimate a student‘s growth rate. This estimate can be linear or non-linear.
• Student Growth Percentiles (SGP): SGPs are based on the percent of academic peers a student outscores (e.g., growing faster than 35% of one's academic peers). Academic peers are those students who have similar prior test scores. SGPs are reported on a 1-99 scale, with lower numbers indicating lower relative growth and higher numbers indicating higher relative growth. For example, if a student has an SGP of 65, it means the student has demonstrated more growth than 65% of his or her academic peers.
• Value-Added Models: Value-added models describe growth as the impact educators or institutions have on student achievement. While not all VAMs have the same model structure, many are residual models, calculated by comparing how much the performance in a given unit (e.g., class, school, or district) deviates from the average expected change in performance for that unit.
• Growth-to-Target: Each of the above models characterizes growth in terms of magnitude, but does not explicitly account for whether a student has achieved English language proficiency. Each of the approaches can be modified to account for the required growth to meet a particular target or standard (e.g., English language proficiency). For
5 These models are not all mutually exclusive. For example, SGPs can be combined with growth-to-target (i.e., adequate growth percentiles).
example, adequate growth for SGPs is often defined in terms of the growth necessary for a non-proficient student to achieve proficiency within a given number of years ("catch-up" growth; Betebenner, 2011).
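The SGP idea can be made concrete with a deliberately simplified sketch. Operational SGPs are estimated with quantile regression (Betebenner, 2011); the version below only bands students by rounded prior score, and all data are invented.

```python
# Hypothetical, simplified student growth percentile: the percent of academic
# peers (students in the same prior-score band) a student outscores in the
# current year. Real SGPs use quantile regression; this is conceptual only.

def simple_sgp(student, cohort):
    """student and each cohort entry are (prior_score, current_score) pairs.
    Peers share the same prior score rounded to the nearest whole level."""
    prior, current = student
    peers = [curr for pr, curr in cohort if round(pr) == round(prior)]
    if not peers:
        return 50  # no peers: fall back to the median percentile
    outscored = sum(1 for curr in peers if curr < current)
    return round(100 * outscored / len(peers))

# Invented cohort of (prior, current) ELP scores:
cohort = [(2.0, 2.5), (2.1, 2.8), (1.9, 3.1), (2.0, 2.2), (4.0, 4.5)]
print(simple_sgp((2.0, 3.0), cohort))  # outscores 3 of 4 peers near level 2
```

Because the metric is norm-referenced, a school's score depends on how its ELs grow relative to similar ELs statewide, not on an absolute criterion.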
This section of the paper provides a more in-depth discussion of three of the possible growth measures to highlight examples of how states could consider and weigh the various merits of an indicator relative to their state context and policy goals. The three measures considered are: value tables, value-added models, and growth-to-target methods. These measures were chosen because they may be particularly promising for the English language proficiency indicator and each provides a different inference related to student growth.
Value Tables. Value or transition tables allow policy makers to explicitly value growth across the performance categories in a way that aligns with the state's policy goals (Hill et al., 2005). Value tables are simple and transparent, in that they assign numerical values to changes in achievement. Movements across the achievement levels that are considered more desirable (e.g., from non-proficient to proficient) are given higher values, and thus, schools are awarded more credit. The school score resulting from a value table would be the average points for all of the English learners within the school; therefore, the student growth inference resulting from the value table is: How valued is the observed student growth, as measured by progress across the performance levels? The values would be deliberated and decided upon at the state level, with the involvement of key stakeholder groups, to ensure that the numerical values in each cell accurately reflect the state's theory of action. An example value table is provided in Figure 2 using the performance levels from the WIDA ACCESS 2.0 exam, an English language proficiency consortium assessment currently used in 38 U.S. states and territories. In the example provided, more points are awarded for moving into the higher levels of attainment than the lower levels, since growth at the high end of the scale is generally more difficult to achieve. Additionally, schools are awarded no points for students who lose English language skills across years. States may want to consider awarding some points, or even negative points, to these cells, depending on the state's theory of action.
                      Year 2
Year 1          1: Entering   2: Beginning   3: Developing   4: Expanding   5: Bridging   6: Reaching
1: Entering          25             50              75             100           150           200
2: Beginning          0             25              50              75           125           200
3: Developing         0              0              25              50           100           200
4: Expanding          0              0               0              25            75           200
5: Bridging           0              0               0               0            25           200

Figure 2. Example Value Table with WIDA Performance Levels
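Scoring a school against a value table like the one in Figure 2 is a straightforward lookup-and-average. The point values below mirror the figure; the student transitions are invented for illustration.

```python
# A minimal sketch of scoring a school with the value table in Figure 2.
# Point values mirror the figure; the student data are invented.

VALUE_TABLE = {  # (year-1 level, year-2 level) -> points
    (1, 1): 25, (1, 2): 50, (1, 3): 75, (1, 4): 100, (1, 5): 150, (1, 6): 200,
    (2, 1): 0,  (2, 2): 25, (2, 3): 50, (2, 4): 75,  (2, 5): 125, (2, 6): 200,
    (3, 1): 0,  (3, 2): 0,  (3, 3): 25, (3, 4): 50,  (3, 5): 100, (3, 6): 200,
    (4, 1): 0,  (4, 2): 0,  (4, 3): 0,  (4, 4): 25,  (4, 5): 75,  (4, 6): 200,
    (5, 1): 0,  (5, 2): 0,  (5, 3): 0,  (5, 4): 0,   (5, 5): 25,  (5, 6): 200,
}

def school_value_table_score(transitions):
    """Average value-table points across a school's English learners.
    transitions is a list of (year-1 level, year-2 level) pairs."""
    points = [VALUE_TABLE[t] for t in transitions]
    return sum(points) / len(points)

# Three ELs: one moves 1 -> 2, one moves 3 -> 5, and one stays at level 4.
print(school_value_table_score([(1, 2), (3, 5), (4, 4)]))
```

The transparency benefit discussed below is visible here: given the published table, a school can reproduce its own score with nothing more than its students' level transitions.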
One of the primary benefits of value or transition tables is their transparency for schools and other stakeholders. Once schools know how their students have scored on the English language proficiency assessment, they should be able to calculate their score on the English language proficiency indicator easily. Additionally, the values are set in a way that reflects the state's theory of action; for example, schools can be incentivized and rewarded to improve English language proficiency for those students who typically have the most difficulty showing growth. One of the drawbacks of using a value table to measure growth for the English language proficiency indicator is that this methodology will be only loosely related to the state's long-term goals and measures of interim progress, in that the typical use case for value tables does not include the creation of individual student targets aligned to the state's defined timeline for reaching proficiency. This would mean that the state would have to create a separate methodology for calculating student targets in order to track progress on the long-term goals. This could be done relatively easily, for example, by expecting that students progress by one achievement level per year. However, the simplicity of this model for setting student targets may not be reasonable and, as with any target-setting scheme, should be modeled to better understand whether this kind of progress is reasonable to expect given how students have historically progressed toward proficiency. Alternatively, more complex versions of value tables that take into account student characteristics—including time in EL programming—could be created. However, this would require the design and use of multiple value tables and may remove some of the transparency associated with this method.
Value-Added Models. Value-added models are a diverse collection of statistical techniques that are better defined by their use than by their structure. Often, value-added models are regression-based and are used to compare students' predicted growth to actual growth. The difference—the residual—is often attributed to programmatic effectiveness, or the value added by the program to the student growth. Most value-added models are covariate-adjusted, which means they can control for student and school contextual effects that may contribute to explaining student growth trajectories. In this way, value-added models are said to "isolate" the effects of the program on student achievement, regardless of school and student characteristics (a very strong assumption that has rarely been validated). The student growth inference for value-added models is: How effective is the EL program at eliciting student growth compared to other programs in the state? It is important to note here that this inference is inherently norm-referenced, in that EL progress is not measured relative to a criterion, but in comparison to progress made by other ELs in the state. Value-added models will identify those programs making better than average progress with their EL students, average progress, and below average progress.
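A stripped-down residual-style value-added model can be sketched as follows. Real VAMs use many more covariates, larger samples, and often shrinkage estimators; this sketch regresses current scores on prior scores only, and the program data are invented.

```python
# Hypothetical sketch of a residual-style value-added model: fit a statewide
# regression of current ELP scores on prior scores, then average each
# program's residuals. Positive means above-average growth. Data are invented.

def ols_slope_intercept(xs, ys):
    """Simple least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

def value_added(records):
    """records: (program, prior_score, current_score) triples. Returns each
    program's mean residual against the statewide regression line."""
    a, b = ols_slope_intercept([r[1] for r in records], [r[2] for r in records])
    residuals = {}
    for prog, x, y in records:
        residuals.setdefault(prog, []).append(y - (a + b * x))
    return {p: sum(res) / len(res) for p, res in residuals.items()}

records = [("A", 2.0, 3.0), ("A", 3.0, 3.8), ("B", 2.0, 2.4), ("B", 3.0, 3.4)]
print(value_added(records))  # program A above the line, program B below it
```

The norm-referenced nature of the inference is baked into the arithmetic: residuals sum to zero statewide, so some programs must land below average whenever others land above it.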
One of the benefits of the value-added modeling framework for the English language proficiency indicator is that states can easily take into account the student characteristics that research has shown to be relevant for explaining EL language acquisition (e.g., initial level of English language proficiency).
the Every Student Succeeds Act
Susan Lyons, Ph.D. and Nathan Dadey, Ph.D. Center for Assessment March 8, 2017
This paper was written in collaboration with the Latino Policy Forum with significant financial support from the High Quality Assessment Project.
Lyons & Dadey
1
Introduction The requirement for an indicator of ―progress in achieving English language proficiency‖ (English language proficiency) for English learners (ELs) must now be included in state systems of educational accountability under the Every Student Succeeds Act (ESSA, §1111(c)(4)). Specifically, the statute requires that English language proficiency be addressed in two1 specific ways within systems of accountability—as part of the state‘s long-term and interim goals, and as part of an annual system that meaningfully differentiates schools. ESSA‘s inclusion of English language proficiency within Title I accountability systems represents a key juncture in accountability policy that provides states the opportunity to define, or redefine, progress in achieving English language proficiency in a system of accountability that considers all EL students2. The goal of this brief is to first provide an overview the ESSA requirements around English language proficiency within systems of accountability, and then to offer guidance on the ways in which (a) progress in achieving English language proficiency can be defined, (b) these various definitions can be incorporated into ESSA-compliant state accountability systems, and (c) a state can evaluate the validity of a state ESSA accountability system for meeting EL policy goals.
States must first establish a vision for English learners and English language acquisition embedded in a coherent theory of action before engaging in accountability system design. There are a variety of design decisions that must be made in order to create a new school accountability system under ESSA. The new federal law permits a wide latitude in the specifics of state accountability systems – allowing for variety of types of indicators reported, the stated goals and targets, and the rewards or consequences for schools. Therefore, state leaders need to base complex design decisions on a clear state vision. This state vision is foundational. By providing clearly articulated educational goals for all students, and for English learners in particular, the state vision provides the basis for the evaluation of any particular aspect of the accountability system, as well as the role the accountability system plays within the state educational system. That is, a clearly outlined vision and accompanying theory of action is necessary to facilitate the design of a coherent accountability system.
ESSA Requirements ESSA includes a number of major provisions regarding ELs and English language proficiency, many of which are similar to provisions in the No Child Left Behind Act of 2001. Outside of accountability, these provisions include requirements that states have adopted English language proficiency standards aligned with state academic standards, annual administration of an
1 These two uses are mandated by the statute. However, these are not the only two uses for ELP indicators – states
may wish to develop additional uses with their systems of accountability, not for federal compliance, but in order to
better meet specific policy needs. 2 Under the No Child Left Behind Act of 2001, the achievement of EL students was covered under Title III and thus
accountability for EL student only applied to local educational agencies receiving Title III funds.
Lyons & Dadey
2
assessment of English language proficiency for all ELs and statewide entrance and exit requirements for ELs (cf., CCCSO, 2016). In terms of accountability, the law has two specific requirements around English language proficiency3:
1. Long-Term Goals and Interim Progress. The statewide accountability system must include "State-designed long-term goals, which shall include measures of interim progress towards meeting such goals… for increases in the percentage of such students making progress in achieving English language proficiency, as defined by the State" (ESSA, §1111(b)(4)(A)).
2. Annual Indicator. The statewide accountability system must also include an annual measure of "progress in achieving English language proficiency, as defined by the State… within a State-determined timeline for all English learners" for all public schools in the state, which is to be used as part of a "system of meaningful differentiation" to identify schools for intervention4 (ESSA, §1111(b)(4)(B)-(D)).
These two requirements can be tightly or loosely coupled. For example, the annual measure of progress toward English language proficiency used to differentiate schools could be defined by working backwards from the state's long-term goals (i.e., tightly coupled), or a state could define its progress indicator and long-term goals separately (i.e., loosely coupled).
The final regulations on ESSA accountability are currently suspended and under review by Congress under the Congressional Review Act. On February 7, the House of Representatives voted to overturn the accountability regulations. As of this writing, the vote has yet to go to the Senate, so the status of the final regulations remains uncertain. If the Senate also votes to overturn the regulations, then the Department of Education will not be allowed to release new regulations that are substantially similar to the revoked regulations, meaning that the law will likely need to be implemented by states without regulatory clarification (Ujifusa, 2017). Appendices A and B provide tables that separate the language of the statute from the language of the regulations. While it is likely the regulations will not be legally enforceable, they may still be useful to states in providing additional specificity about statutory intent, and are thus referenced throughout this paper when relevant.
Though the regulations provide further detail regarding the ESSA requirements, both the statute and regulations leave a number of decisions regarding the progress toward English language proficiency indicator in the hands of states. For example, what constitutes "progress in achieving English language proficiency"? Should the long-term goals and measures of interim progress define progress in achieving English language proficiency in the same way as it is defined for the
3 Note that ELs are also included as a federal accountability subgroup for all of the other indicators within the accountability system, which means EL performance on each of the indicators must be reported separately for every school.
4 I.e., either Comprehensive Support and Improvement or Targeted Support and Improvement.
annual indicator? What timeline is defensible? The sections below consider these types of questions and detail the requirements of the statute and regulations.
Long-Term Goal and Measures of Interim Progress

The statute and regulations, in particular, require that a state develop ambitious "long-term goals and measures of interim progress for increases in the percentage of all English learners in the State making annual progress toward attaining English language proficiency" (34 C.F.R. §200.13(c)). There is considerable flexibility in how the state defines "making annual progress toward attaining English language proficiency for each student"—i.e., student-level progress for the required indicator.
Student-level progress. The regulations clarify that states must develop a procedure for calculating research-based, student-level targets for English learners to reach English language proficiency. This does not mean that all English learners need to have the same targets, but rather that their growth targets must be calculated using a consistent methodology for all students. The regulations state that the procedure may take into account any of the following student-level characteristics: the student's initial level of English language proficiency, time in language instruction, grade level, age, native language proficiency, and limited or interrupted formal education, if any. The regulations also require that the targets be based on research and data. For example, it would be reasonable for a state to expect larger gains for younger students than for older students; by setting targets to reflect the known differences in language acquisition, the state can reasonably expect all English learners to show progress. These targets must also be based on a state-determined maximum number of years by which a student should reach proficiency. As with the targets, this maximum number of years must be based on research and can vary by student demographic factors. Importantly, if an English learner does not attain English language proficiency within the state-determined maximum, that student must be provided English learner services until attainment. There is a body of research that can be leveraged to help states make informed decisions about target setting (see Hakuta, Goto Butler, & Witt, 2000; MacSwan & Pray, 2005; Motamed, 2015; Slavin, Madden, Calderón, Chamberlain, & Hennessy, 2011). Additionally, states should use existing data to model language proficiency trends within the state to better understand the likely implications of the target-setting decisions. Using state-specific data helps ensure that the targets and maximum number of years to reach proficiency are reasonable and achievable.
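To make the mechanics concrete, the sketch below shows one way student-level targets could be computed from a state-determined maximum number of years to proficiency. Everything here is a hypothetical placeholder, not a research-based value: the 1.0-6.0 composite scale, the exit criterion `PROFICIENT_SCORE`, and the `MAX_YEARS_BY_INITIAL_LEVEL` schedule are all assumptions a state would replace with figures derived from its own research and data.

```python
from typing import Dict

# Hypothetical exit criterion on an assumed 1.0-6.0 composite ELP scale
PROFICIENT_SCORE = 5.0

# Hypothetical state-determined maximum years to proficiency by initial level
MAX_YEARS_BY_INITIAL_LEVEL: Dict[int, int] = {1: 5, 2: 4, 3: 3, 4: 2, 5: 1}

def annual_target(initial_score: float, years_elapsed: int) -> float:
    """Expected composite score after `years_elapsed` years, assuming equal
    annual increments from the initial score up to the proficiency cut."""
    level = min(int(initial_score), 5)
    max_years = MAX_YEARS_BY_INITIAL_LEVEL[level]
    annual_gain = (PROFICIENT_SCORE - initial_score) / max_years
    return min(initial_score + annual_gain * years_elapsed, PROFICIENT_SCORE)

def made_progress(initial_score: float, years_elapsed: int,
                  current_score: float) -> bool:
    """Did the student meet or exceed this year's interpolated target?"""
    return current_score >= annual_target(initial_score, years_elapsed)
```

Because the schedule varies by initial level, a student entering at Level 1 gets five years while one entering at Level 4 gets two, yet the same formula produces every student's target, satisfying the "consistent methodology" requirement.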
Long-term goals and measures of interim progress. The student-level targets describe what is expected from individual students. The statute and regulations also require that the state set a long-term goal for the population of EL students. That is, the state must set a long-term goal that defines the percentage of English learners in the state making progress toward English language proficiency at a given point in the future. States have flexibility in the specific percentage the goal targets, the timeframe for achieving the goal, and how measures of interim progress—intermediary goals
that define increases in the percentage of English learners in the state making progress toward ELP—are defined. States must also provide descriptions for how each of these elements is established. As with setting student-level targets, states will want to spend time examining their trend data to understand the historical progress of English learners toward English proficiency within the state.
The long-term goals, and supporting measures of interim progress, should also be aligned with the state's vision for ELs and the theory of action for making progress toward that vision. If the student-level targets are set thoughtfully and appropriately, it may not be unreasonable for a state to set a long-term goal of 100% of ELs making progress toward proficiency annually. However, with such high expectations, careful consideration should be given to what interventions and supports will be provided by both local and state actors. Unless EL students are offered substantially more support, a long-term goal that assumes levels of improvement well beyond those shown in historical trends may be suspect. Ultimately, the long-term goal should be challenging but achievable given the level of support and the time necessary to implement program improvement. Once the goal and timeframe are established, the measures of interim progress may be defined by backwards mapping—with an ambitious yet reasonable trajectory of attainment toward the goal—to set intermediate benchmarks that would indicate progress toward success on the long-term goal. It is worth noting that, like EL language acquisition, program improvement and progress toward the long-term goal may not follow a linear pattern.
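The backwards-mapping idea can be sketched in a few lines. This example interpolates linearly between a hypothetical baseline and long-term goal; as noted above, a state might well prefer a non-linear trajectory, and all the numbers are illustrative.

```python
def interim_benchmarks(baseline_pct: float, goal_pct: float,
                       n_years: int) -> list:
    """Interim targets: the percentage of ELs making progress expected at
    the end of each year, linearly interpolated from baseline to goal."""
    step = (goal_pct - baseline_pct) / n_years
    return [round(baseline_pct + step * year, 1) for year in range(1, n_years + 1)]

# e.g., 60% of ELs making progress at baseline, long-term goal of 90% in 6 years
print(interim_benchmarks(60.0, 90.0, 6))  # -> [65.0, 70.0, 75.0, 80.0, 85.0, 90.0]
```

A non-linear trajectory (slower early, faster once supports are in place, or vice versa) would simply replace the constant `step` with a year-varying increment.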
Annual Indicator

It is a common misconception that accountability systems under ESSA represent a U-turn from those under No Child Left Behind (NCLB). The assessment provisions under ESSA are highly similar to NCLB's, and annual, statewide content assessments in math and English language arts (ELA) remain a large part of accountability. However, accountability systems under ESSA require multiple additional indicators, including an indicator of progress toward English proficiency for English learners. In all, there are at least five categories of indicators that comprise accountability systems under ESSA:
1. Academic achievement as measured by annual, statewide assessments in math and ELA in grades 3-8 and high school;
2. Academic progress such as growth or achievement gap for elementary and middle schools (this is optional for high schools);
3. Graduation rate for high schools. This indicator category must include the 4-year cohort-adjusted graduation rate and may also include extended-year graduation rates;
4. Progress in achieving English language proficiency, the topic of the current paper; and,
5. Additional indicator(s) of school quality or student success.
Importantly, consequences for schools are attached to the summative annual determination based on all of the indicators listed above. Identification for targeted and comprehensive support must
be informed by all of the accountability indicators. This is distinct from the long-term goals in that federal accountability does not require school-level consequences or action related to performance on those goals.
The English language proficiency indicator must be reported for at least all English learners in grades 3-8 and those who are assessed in grades 9-12. States may choose to include the assessment results of English learners in earlier grades and may have good reason to do so, given that younger students tend to show the most growth in English language proficiency. This decision is discussed in more detail in the section entitled "Incorporating English Language Proficiency into Systems of Accountability." The final regulations further outline three requirements related to the indicator of progress toward English language proficiency: 1) it must use objective, valid measures of student progress on the proficiency assessment, comparing results across years; 2) the indicator of progress must be aligned with the applicable timelines for a student to attain English proficiency within the State-determined maximum number of years; and 3) the indicator may also comprise a measure of proficiency, for example, the percentage increase of English learners attaining proficiency on the English language proficiency assessment as compared with prior years. Lastly, all indicators in Title I accountability must be reported individually using at least three levels of performance. This means that the ELP indicator must differentiate among schools by reporting at least three categories of performance. The following section of the paper provides a deep dive into the different options for defining the measure of progress for this accountability indicator.
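As an illustration of the three-level reporting requirement, the following sketch buckets a school's ELP indicator score into three performance categories. The cutpoints (70 and 50) and the labels are hypothetical; a state would set its own through its standard-setting process.

```python
def performance_level(pct_making_progress: float) -> str:
    """Bucket a school's ELP indicator score (percent of ELs making
    progress) into one of three reporting levels. Cutpoints are arbitrary
    placeholders, not recommended values."""
    if pct_making_progress >= 70.0:
        return "Exceeds Target"
    if pct_making_progress >= 50.0:
        return "Meets Target"
    return "Below Target"

print(performance_level(82.0))  # -> Exceeds Target
```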
Defining and Evaluating an English Language Proficiency Indicator

We start with a heuristic to help show all of the major pieces that influence a school indicator of progress in achieving English language proficiency. Figure 1 illustrates that the state context, the specific model used to define the English language proficiency indicator, and the business rules around the implementation of the model all play a role in determining a school's performance classification on the English language proficiency indicator. State context deals with the on-the-ground reality of EL students within the state (e.g., Are ELs concentrated in a small number of schools or spread out across many schools? Are ELs concentrated in particular grades?). The statistical model refers to the methodology used to produce scores based on the English language proficiency assessment, which can then be aggregated to the school level. This area encompasses both the class of model used, as well as the way the model is specified and estimated. For example, does the model control for student characteristics and, if so, which ones? Finally, the business rules specify how the results of the statistical model are aggregated (e.g., How many students are needed before a school receives a score? Will the results be pooled over years? Will reclassified ELs be included in the aggregation?). In addition, the information in any one box can inform decisions in another box. For example, if there are few EL students per school, the state might want to choose a smaller n-size in order to provide ratings to as many schools serving ELs as possible, despite issues with precision caused by small sample sizes. These types of
tradeoffs are common and a state will need to weigh the positives and negatives of any particular approach. We present these categories here as a structure that can be useful for guiding state discussions about this indicator.
Figure 1. Heuristic of Areas of Concern for English language proficiency Indicator.
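The business-rules questions above (minimum n-size, pooling across years) lend themselves to a small sketch. The `MIN_N` threshold, the pooling window, and the student data are all hypothetical; the point is only how such rules interact with small schools.

```python
from typing import Dict, List, Optional

MIN_N = 10  # hypothetical minimum number of ELs for a reportable score

def school_score(flags_by_year: Dict[int, List[bool]],
                 pool_years: int = 1) -> Optional[float]:
    """Percent of ELs making progress, pooled over the most recent
    `pool_years` years; None if the pooled count falls below MIN_N."""
    years = sorted(flags_by_year)[-pool_years:]
    pooled = [flag for year in years for flag in flags_by_year[year]]
    if len(pooled) < MIN_N:
        return None  # too few students for a reliable, reportable score
    return 100.0 * sum(pooled) / len(pooled)

# A school with 6 ELs per year is unreportable on one year of data,
# but becomes reportable when two years are pooled.
one_year = {2016: [True] * 4 + [False] * 2}
two_years = {2015: [True] * 3 + [False] * 3, 2016: [True] * 4 + [False] * 2}
```

This is exactly the tradeoff described above: pooling (or a smaller `MIN_N`) rates more schools serving ELs, at the cost of timeliness or precision.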
Finally, it is worth re-emphasizing that the English language proficiency indicator is one of at least five indicators that will ultimately decide the classification of a school under the full accountability system. Thus, the ultimate impact of the English language proficiency indicator needs to be considered in relation to the other indicators. For example, what role will the English language proficiency indicator play? What weight will it have? Again, such questions need to be considered in light of a state's vision and theory of action. These questions are considered more deeply in the section entitled "Incorporating English Language Proficiency into Systems of Accountability."
Defining Progress in Achieving English Language Proficiency

The law requires that an indicator of progress in achieving English language proficiency be used. This requirement has generally been understood as requiring the quantification of across-year changes in individual student performance on the English language proficiency assessment. However, the regulations do allow for the English language proficiency indicator to be a combination of growth and status. Given this understanding, prior work examining growth models for general student populations is applicable (e.g., Castellano & Ho, 2013; Goldschmidt, Choi, & Beaudoin, 2013) but should be re-evaluated in light of the unique characteristics of ELs.
In their recent paper, "Incorporating English Learners' Progress into State Accountability Systems," Goldschmidt and Hakuta (2017) evaluate options for growth indicators from a predominantly technical perspective. In this paper, we build on their work by integrating their
perspective with additional considerations related to the implementation and evaluation of the English language proficiency indicator within an accountability system.
Some common approaches for characterizing change across years follow (Goldschmidt & Hakuta, 2017)5:
• Transition (or Value) tables: Transition tables describe growth as a student's change in performance level from one year to the next, dependent on the student's prior status. Transition tables often use performance levels that are divided into sub-performance levels to illustrate growth within a performance level (e.g., Level 1A, Level 1B, Level 2A, Level 2B, Level 3A, etc.).
• Proficiency rates: Goldschmidt and Hakuta (2017) note that the percentage of students reaching English language proficiency is a relevant indicator for monitoring ELs' progress. They argue this method is transparent, but note some challenges: it is sensitive to policies regarding reclassification and does not award credit for progress toward proficiency, counting only those students who reach proficiency.
• Gain scores: Gain scores describe a student's growth based on the difference between test scores, calculated by subtracting an earlier score from a later score. Gain scores require the use of a vertical scale (i.e., scale scores that range across grade levels). Gain scores can be in the raw metric of the scale scores, or they can be normalized in order to provide a norm-referenced interpretation of relative growth.
• Growth rates: Growth rates characterize the rate at which student scores change over time. This is determined by calculating a best-fit line, or trend line, across a series of data points to estimate a student's growth rate. This estimate can be linear or non-linear.
• Student Growth Percentiles (SGP): SGPs describe a student's growth relative to academic peers, that is, students with similar prior test scores. SGPs are reported on a 1-99 scale, with lower numbers indicating lower relative growth and higher numbers indicating higher relative growth. For example, if a student has an SGP of 65, the student has demonstrated more growth than 65% of his or her academic peers.
• Value-Added Models: Value-added models describe growth as the impact educators or institutions have on student achievement. While not all VAMs have the same model structure, many are residual models, calculated by comparing how much the performance in a given unit (e.g., class, school, or district) deviates from the average expected change in performance for that unit.
• Growth-to-Target: Each of the above models characterizes growth in terms of magnitude, but does not explicitly account for whether a student has achieved English language proficiency. Each of the approaches can be modified to account for the growth required to meet a particular target or standard (e.g., English language proficiency). For example, adequate growth for SGPs is often defined in terms of the growth necessary for a non-proficient student to achieve proficiency within a given number of years ("catch-up" growth; Betebenner, 2011).
5 These models are not all mutually exclusive. For example, SGPs can be combined with growth-to-target (i.e., adequate growth percentiles).
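To illustrate the norm-referenced logic behind SGPs, the following rough sketch ranks a student's current score against peers with similar prior scores. Operational SGPs are estimated with quantile regression (Betebenner, 2011); this banded-percentile version, with fabricated data and a hypothetical `band` parameter, only conveys the intuition.

```python
from typing import List, Tuple

def simple_sgp(prior: float, current: float,
               peers: List[Tuple[float, float]], band: float = 0.5) -> int:
    """Percent of academic peers (prior score within `band` of this
    student's prior score) whose current score this student exceeds,
    clipped to the conventional 1-99 SGP range."""
    cohort = [cur for (pri, cur) in peers if abs(pri - prior) <= band]
    below = sum(1 for cur in cohort if cur < current)
    pct = round(100 * below / len(cohort))
    return min(max(pct, 1), 99)

# Fabricated (prior, current) score pairs for the statewide EL cohort
peers = [(2.0, 2.1), (2.0, 2.5), (2.1, 3.0), (2.0, 2.8), (5.0, 5.5)]
print(simple_sgp(2.0, 2.9, peers))  # -> 75 (outscores 3 of 4 academic peers)
```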
This section of the paper provides a more in-depth discussion of three of the possible growth measures to highlight examples of how states could consider and weigh the various merits of an indicator relative to their state context and policy goals. The three measures considered are: value tables, value-added models, and growth-to-target methods. These measures were chosen because they may be particularly promising for the English language proficiency indicator and each supports a different inference related to student growth.
Value Tables. Value or transition tables allow policy makers to explicitly value growth across the performance categories in a way that aligns with the state's policy goals (Hill et al., 2005). Value tables are simple and transparent, in that they assign numerical values to changes in achievement. Movements across the achievement levels that are considered more desirable (e.g., from non-proficient to proficient) are given higher values, and thus schools are awarded more credit. The school score resulting from a value table would be the average points for all of the English learners within the school; therefore, the student growth inference resulting from the value table is: How valued is the observed student growth as measured by progress on the performance levels? The values would be deliberated and decided upon at the state level, with the involvement of key stakeholder groups, to ensure that the numerical values in each cell accurately reflect the state's theory of action. An example value table is provided in Figure 2 using the performance levels from the WIDA ACCESS 2.0 exam, an English language proficiency consortium assessment currently used in 38 U.S. states and territories. In the example provided, more points are awarded for moving into the higher levels of attainment than the lower levels, since growth at the high end of the scale is generally more difficult to achieve. Additionally, schools are awarded no points for students who lose English language skills across years. States may want to consider awarding some points, or even negative points, to these cells, depending on the state's theory of action.
Year 1 \ Year 2    1: Entering   2: Beginning   3: Developing   4: Expanding   5: Bridging   6: Reaching
1: Entering             25             50              75             100            150           200
2: Beginning             0             25              50              75            125           200
3: Developing            0              0              25              50            100           200
4: Expanding             0              0               0              25             75           200
5: Bridging              0              0               0               0             25           200

Figure 2. Example Value Table with WIDA Performance Levels
One of the primary benefits of value or transition tables is their transparency for schools and other stakeholders. Once schools know how their students have scored on the English language proficiency assessment, they should be able to calculate their score on the English language proficiency indicator easily. Additionally, the values are set in a way that reflects the state's theory of action; for example, schools can be incentivized and rewarded to improve English language proficiency for those students who typically have the most difficulty showing growth. One of the drawbacks of using a value table to measure growth for the English language proficiency indicator is that this methodology will be only loosely related to the state's long-term goals and measures of interim progress, in that the typical use case for value tables does not include the creation of individual student targets aligned to the state's defined timeline for reaching proficiency. This would mean that the state would have to create a separate methodology for calculating student targets in order to track progress on the long-term goals. This could be done relatively easily, for example, by expecting that students progress by one achievement level per year. However, the simplicity of this model for setting student targets may not be reasonable and, as with any target-setting scheme, should be modeled against historical data to better understand whether this kind of progress is reasonable to expect. Alternatively, more complex versions of value tables that take into account student characteristics—including time in EL programming—could be created. However, this would require the design and use of multiple value tables and may remove some of the transparency associated with this method.
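The value-table computation described above is simple enough to sketch directly. The point values below mirror the example in Figure 2; the student transition data are fabricated.

```python
# Point values mirror the Figure 2 example: VALUE_TABLE[year1][year2] -> points
VALUE_TABLE = {
    1: {1: 25, 2: 50, 3: 75, 4: 100, 5: 150, 6: 200},
    2: {1: 0,  2: 25, 3: 50, 4: 75,  5: 125, 6: 200},
    3: {1: 0,  2: 0,  3: 25, 4: 50,  5: 100, 6: 200},
    4: {1: 0,  2: 0,  3: 0,  4: 25,  5: 75,  6: 200},
    5: {1: 0,  2: 0,  3: 0,  4: 0,   5: 25,  6: 200},
}

def school_value_table_score(transitions):
    """Average points across all ELs' year-1 -> year-2 level transitions."""
    points = [VALUE_TABLE[y1][y2] for (y1, y2) in transitions]
    return sum(points) / len(points)

# Three fabricated ELs: one gains a level, one holds, one reaches Level 6
score = school_value_table_score([(1, 2), (3, 3), (5, 6)])  # mean of 50, 25, 200
```

This directness is the transparency advantage: a school can reproduce its own score from nothing more than the published table and its students' level transitions.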
Value-Added Models. Value-added models are a diverse collection of statistical techniques that are better defined by their use than by their structure. Often, value-added models are regression-based and are used to compare students' predicted growth to actual growth. The difference—the residual—is often attributed to programmatic effectiveness, or the value added by the program to the student growth. Most value-added models are covariate-adjusted, which means they can control for student and school contextual effects that may contribute to explaining student growth trajectories. In this way, value-added models are said to "isolate" the effects of the program on student achievement, regardless of school and student characteristics (a very strong assumption that has rarely been validated). The student growth inference for value-added models is: How effective is the EL program at eliciting student growth compared to other programs in the state? It is important to note here that this inference is inherently norm-referenced, in that EL progress is not measured relative to a criterion, but in comparison to progress made by other ELs in the state. Value-added models will identify those programs making better-than-average progress with their EL students, average progress, and below-average progress.
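A minimal residual-based value-added calculation can be sketched as follows, using a single prior-score predictor and closed-form ordinary least squares. Operational VAMs include many more covariates and modeling refinements; the data here are fabricated.

```python
from typing import Dict, List, Tuple

def ols_fit(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Closed-form simple linear regression: returns (slope, intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
             / sum((xi - mx) ** 2 for xi in x))
    return slope, my - slope * mx

def school_value_added(records: List[Tuple[str, float, float]]) -> Dict[str, float]:
    """records: (school, prior_score, current_score) triples. Returns each
    school's mean residual; values above zero mean more growth than the
    statewide regression predicts, i.e., a norm-referenced comparison."""
    priors = [p for (_, p, _) in records]
    currents = [c for (_, _, c) in records]
    slope, intercept = ols_fit(priors, currents)
    residuals: Dict[str, List[float]] = {}
    for school, prior, current in records:
        residuals.setdefault(school, []).append(current - (slope * prior + intercept))
    return {s: sum(r) / len(r) for s, r in residuals.items()}

# Fabricated data: school A's ELs grow more than predicted, school B's less
records = [("A", 1.0, 2.0), ("A", 2.0, 3.0), ("B", 1.0, 1.5), ("B", 2.0, 2.5)]
```

Note the norm-referencing: because residuals are deviations from the statewide fit, the schools' scores sum to zero here; some programs must land below average regardless of absolute EL progress.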
One of the benefits of the value-added modeling framework for the English language proficiency indicator is that states can easily take into account the student characteristics that research has shown to be relevant for explaining EL language acquisition (e.g., level of