Factor Variables and Marginal Effects in Stata 11

Christopher F Baum
Boston College and DIW Berlin
January 2010

Christopher F Baum (Boston College/DIW) Factor Variables and Marginal Effects

Jan 2010 1 / 18

Using factor variables
One of the biggest innovations in Stata version 11 is the introduction of factor variables. Just as Stata's time series operators allow you to refer to lagged variables (L.) or differenced variables (D.), the i. operator allows you to specify factor variables for any non-negative integer-valued variable in your dataset. In the auto.dta dataset, where rep78 takes on values 1, ..., 5, you could list rep78 i.rep78, or summarize i.rep78, or regress mpg i.rep78. Each of those commands produces the appropriate indicator variables 'on the fly': not as permanent variables in your dataset, but available for the duration of the command.
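The three commands mentioned above can be tried directly. This is a minimal sketch, assuming the auto dataset is loaded with sysuse (the slides refer only to auto.dta); the final describe is added here simply to confirm that no indicator variables were left behind.

```stata
* Load the example dataset shipped with Stata (assumed source of auto.dta)
sysuse auto, clear

* Each command builds the rep78 indicators on the fly
list rep78 i.rep78 in 1/5
summarize i.rep78
regress mpg i.rep78

* The variable count is unchanged: no permanent indicators were created
describe, short
```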

For the list command, the variables will be named 1b.rep78, 2.rep78, ..., 5.rep78. The b. is the base level indicator, by default assigned to the smallest value. You can specify other base levels, such as the largest value, the most frequent value, or a particular value. For the summarize command, only levels 2, ..., 5 will be shown; the base level is excluded from the list. Likewise, in a regression on i.rep78, the base level is the indicator excluded from the regressor list to prevent perfect collinearity. The conditional mean of the dependent variable for the base level then appears as the constant term.
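The alternative base levels described above are selected with variants of the ib. operator. The following is a sketch rather than an exhaustive list, again assuming the auto dataset loaded with sysuse; the choice of level 3 in the last command is arbitrary.

```stata
sysuse auto, clear

* Default: the smallest value (1) is the base level
regress mpg i.rep78

* Largest value as the base level
regress mpg ib(last).rep78

* Most frequent value as the base level
regress mpg ib(freq).rep78

* A particular value (here 3, chosen only for illustration) as the base level
regress mpg ib3.rep78
```

In each case the fitted values are the same; only the excluded category, and hence the interpretation of the constant and the remaining coefficients, changes.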

Interaction effects


If this were the only feature of factor variables (being instantiated when called for), they would not be very useful. The real advantage of these variables is the ability to define interaction effects for both integer-valued and continuous variables. For instance, consider the indicator foreign in the auto dataset. We may use a new operator, #, to define an interaction:
regress mpg i.rep78 i.foreign i.rep78#i.foreign
All combinations of the two categorical variables will be defined, and included in the regression as appropriate (omitting base levels and cells with no observations).
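To see which combinations actually contain observations before fitting the model, a cross-tabulation is a convenient check. This sketch assumes the auto dataset loaded with sysuse; the tabulate step is an addition for illustration and is not part of the original slides.

```stata
sysuse auto, clear

* Cell counts for every rep78 x foreign combination;
* cells with no observations cannot be estimated
tabulate rep78 foreign

* Main effects plus the interaction; Stata omits base levels and empty cells
regress mpg i.rep78 i.foreign i.rep78#i.foreign
```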

In fact, we can specify this model more simply: rather than
regress mpg i.rep78 i.foreign i.rep78#i.foreign
we can use the factorial interaction operator, ##:
regress mpg i.rep78##i.foreign
which will provide exactly the same regression: the ## operator expands to the main effects of each variable plus their interaction. Interactions are not limited to pairs of variables; up to eight factor variables may be included.
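One way to confirm that the two specifications coincide is to store both sets of estimates and display them side by side. This is a sketch assuming the auto dataset loaded with sysuse; the names long_form and short_form are arbitrary labels introduced here.

```stata
sysuse auto, clear

* Interaction written out term by term
regress mpg i.rep78 i.foreign i.rep78#i.foreign
estimates store long_form

* The same model using the factorial operator
regress mpg i.rep78##i.foreign
estimates store short_form

* Coefficients, sample size and R-squared side by side
estimates table long_form short_form, b(%9.4f) stats(N r2)
```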

Furthermore, factor variables may be interacted with continuous variables to produce analysis of covariance models. The continuous variables are signalled by the new c. operator:
regress mpg i.foreign i.foreign#c.displacement
which essentially estimates two regression lines: one for domestic cars, one for foreign cars. Again, the factorial operator could be used to estimate the same model:
regress mpg i.foreign##c.displacement
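The two commands fit the same model but report the slopes differently: the first form lists a displacement slope for each group, while the factorial form lists the slope for the base group (domestic) and the differential for foreign cars. The sketch below, assuming the auto dataset loaded with sysuse, uses lincom to recover the foreign slope from the factorial form; the lincom step is an addition for illustration.

```stata
sysuse auto, clear

* Separate displacement slopes reported for domestic and foreign cars
regress mpg i.foreign i.foreign#c.displacement

* Same fit: domestic slope plus the foreign differential
regress mpg i.foreign##c.displacement

* Slope for foreign cars = domestic slope + differential
lincom displacement + 1.foreign#c.displacement
```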

As we will see in discussing marginal effects, it is very advantageous to use this syntax to describe interactions, both among categorical variables and between categorical variables and continuous variables. Indeed, it is likewise useful to use the same syntax to describe squared (and cubed, ...) terms:
regress mpg i.foreign c.displacement c.displacement#c.displacement
In this model, we allow for an intercept shift for foreign, but constrain the slopes to be equal across foreign and domestic cars. However, by using this syntax, we may ask Stata to calculate the marginal effect ∂mpg/∂displacement, taking account of the squared term as well, as Stata understands the mathematics of the specification in this explicit form.
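Writing the coefficients on displacement and its square as b1 and b2 (notation assumed here), the marginal effect is b1 + 2 × b2 × displacement, so it varies with the level of displacement. The sketch below, assuming the auto dataset loaded with sysuse, asks margins for the average marginal effect and then for the effect at a few displacement values chosen only for illustration.

```stata
sysuse auto, clear

regress mpg i.foreign c.displacement c.displacement#c.displacement

* Average marginal effect of displacement, accounting for the squared term
margins, dydx(displacement)

* Marginal effect evaluated at selected (illustrative) displacement values
margins, dydx(displacement) at(displacement=(100 200 300))
```

Had the squared term been generated as a separate variable, margins could not know that it depends on displacement, and the reported effect would omit the second term.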

Computing marginal effects
With the introduction of factor variables in Stata 11, a powerful new command has been added: margins, which supersedes earlier versions' mfx and adjust commands. Those commands remain available, but the new command has many advantages. Like those commands, margins is used after an estimation command. In the simplest case, consider a one-way ANOVA estimated with regress mpg i.rep78. Following it with margins i.rep78 merely displays the conditional mean of mpg for each category of rep78.


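. regress mpg i.rep78

      Source |       SS       df       MS              Number of obs =      69
-------------+------------------------------           F(  4,    64) =    4.91
       Model |  549.415777     4  137.353944           Prob > F      =  0.0016
    Residual |  1790.78712    64  27.9810488           R-squared     =  0.2348
-------------+------------------------------           Adj R-squared =  0.1869
       Total |   2340.2029    68  34.4147485           Root MSE      =  5.2897

------------------------------------------------------------------------------
         mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       rep78 |
          2  |     -1.875   4.181884    -0.45   0.655    -10.22927    6.479274
          3  |  -1.566667   3.863059    -0.41   0.686    -9.284014    6.150681
          4  |   .6666667   3.942718     0.17   0.866    -7.209818    8.543152
          5  |   6.363636   4.066234     1.56   0.123    -1.759599    14.48687
             |
       _cons |         21   3.740391     5.61   0.000     13.52771    28.47229
------------------------------------------------------------------------------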

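. margins i.rep78

Adjusted predictions                              Number of obs   =         69
Model VCE    : OLS

Expression   : Linear prediction, predict()

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       rep78 |
          1  |         21   3.740391     5.61   0.000     13.66897    28.33103
          2  |     19.125   1.870195    10.23   0.000     15.45948    22.79052
          3  |   19.43333   .9657648    20.12   0.000     17.54047     21.3262
          4  |   21.66667   1.246797    17.38   0.000     19.22299    24.11034
          5  |   27.36364   1.594908    17.16   0.000     24.23767     30.4896
------------------------------------------------------------------------------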
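Each margin is the constant plus the coefficient for that level: for rep78 = 2, 21 + (-1.875) = 19.125, matching the regression output. The sketch below, assuming the auto dataset loaded with sysuse, cross-checks this in two ways; the tabulate and lincom steps are additions for illustration.

```stata
sysuse auto, clear

* Raw group means of mpg by repair record, for comparison with the margins
tabulate rep78, summarize(mpg)

* Refit the one-way model without displaying the table again
quietly regress mpg i.rep78

* Conditional mean for rep78 = 2 as constant plus level coefficient
lincom _cons + 2.rep78

* The full set of conditional means
margins i.rep78
```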
