# Econ 2148, fall 2017 Statistical decision theory

## Transcript Of Econ 2148, fall 2017 Statistical decision theory

Maximilian Kasy

Department of Economics, Harvard University

Takeaways for this part of class

1. A general framework to think about what makes a “good” estimator, test, etc.

2. How the foundations of statistics relate to those of microeconomic theory.

3. In what sense the set of Bayesian estimators contains most “reasonable” estimators.

Examples of decision problems

- Decide whether or not the hypothesis of no racial discrimination in job interviews is true
- Provide a forecast of the unemployment rate next month
- Provide an estimate of the returns to schooling
- Pick a portfolio of assets to invest in
- Decide whether to reduce class sizes for poor students
- Recommend a level for the top income tax rate

Agenda

- Basic definitions
- Optimality criteria
- Relationships between optimality criteria
- Analogies to microeconomics
- Two justifications of the Bayesian approach

Basic definitions

Components of a general statistical decision problem

- Observed data X
- A statistical decision a
- A state of the world θ
- A loss function L(a, θ) (the negative of utility)
- A statistical model f(X|θ)
- A decision function a = δ(X)
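The components above can be made concrete with a minimal sketch. The normal model, the squared-distance loss, the sample-mean decision function, and all names (`draw_data`, `loss`, `delta`) are assumptions chosen for illustration, not part of the slides.

```python
import random

# Hypothetical toy problem (an assumption for illustration):
# theta is the mean of a normal distribution, and the decision a
# is an estimate of theta.

def draw_data(theta, n, rng):
    """Statistical model f(X | theta): n i.i.d. draws from N(theta, 1)."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]

def loss(a, theta):
    """Loss function L(a, theta): squared distance (an assumed choice)."""
    return (a - theta) ** 2

def delta(x):
    """Decision function a = delta(X): here, the sample mean."""
    return sum(x) / len(x)

rng = random.Random(0)
theta = 2.0                      # state of the world, unknown to the decision maker
x = draw_data(theta, 50, rng)    # observed data X
a = delta(x)                     # statistical decision a
realized_loss = loss(a, theta)   # computable here only because we fixed theta ourselves
```

In the sketch the decision maker only ever sees `x`; `theta` enters through the model and the loss, which is why the loss of a decision cannot be evaluated directly in practice.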

How they relate

- underlying state of the world θ ⇒ distribution of the observation X
- decision maker: observes X ⇒ picks a decision a
- her goal: pick a decision that minimizes loss L(a, θ) (θ: unknown state of the world)
- X is useful ⇔ X reveals some information about θ ⇔ f(X|θ) does depend on θ

Problem of statistical decision theory: find decision functions δ which “make loss small.”
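The claim that X is useful only if f(X|θ) depends on θ can be checked in a small simulation. This is a sketch under assumed choices: a normal model, squared distance as the loss, the sample mean as the decision function, and averaging the loss over repeated draws as a device for comparison. The function name `average_loss` is hypothetical.

```python
import random

def average_loss(theta, informative, reps=2000, n=20, seed=0):
    """Average squared loss of the sample mean over repeated data draws.

    If `informative` is True, X ~ N(theta, 1), so f(X | theta) depends on theta.
    If False, X ~ N(0, 1) whatever theta is, so X reveals nothing about theta.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        center = theta if informative else 0.0
        x = [rng.gauss(center, 1.0) for _ in range(n)]
        a = sum(x) / n               # decision function: sample mean
        total += (a - theta) ** 2    # loss L(a, theta)
    return total / reps

theta = 3.0
loss_informative = average_loss(theta, informative=True)
loss_uninformative = average_loss(theta, informative=False)
```

With data that do not depend on θ, the same decision function cannot track the state of the world, so its average loss is far larger than with informative data.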

Graphical illustration

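Figure: A general decision problem. The state of the world θ generates the observed data X through the statistical model X ~ f(x, θ); the decision function a = δ(X) maps the data to a decision a; the decision a and the state θ together determine the loss L(a, θ).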

Examples

Investing in a portfolio of assets:

- X: past asset prices
- a: amount of each asset to hold
- θ: joint distribution of past and future asset prices
- L: minus expected utility of future income

Decide whether or not to reduce class size:

- X: data from the Project STAR experiment
- a: class size
- θ: distribution of student outcomes for different class sizes
- L: minus the average of suitably scaled student outcomes, net of cost

Practice problem

For each of the examples listed earlier under “Examples of decision problems,” what are

- the data X,
- the possible actions a,
- the relevant states of the world θ, and
- reasonable choices of loss function L?

Loss functions in estimation

Goal: find an a which is close to some function µ of θ, for instance µ(θ) = E[X].

Loss is larger if the difference between our estimate and the true value is larger. Some possible loss functions:

1. squared error loss,

   $$L(a, \theta) = (a - \mu(\theta))^2$$

2. absolute error loss,

   $$L(a, \theta) = |a - \mu(\theta)|$$
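The two loss functions can be compared on a few estimates. In this minimal sketch the value `mu = 1.0` and the candidate estimates are hypothetical numbers chosen for illustration.

```python
def squared_error_loss(a, mu):
    """L(a, theta) = (a - mu(theta))^2, with mu standing for mu(theta)."""
    return (a - mu) ** 2

def absolute_error_loss(a, mu):
    """L(a, theta) = |a - mu(theta)|, with mu standing for mu(theta)."""
    return abs(a - mu)

mu = 1.0  # hypothetical true value mu(theta)
for a in (1.5, 2.0, 3.0):
    # Print each estimate with its squared and absolute error loss.
    print(a, squared_error_loss(a, mu), absolute_error_loss(a, mu))
```

For errors below 1 the squared error loss is the smaller of the two, the two coincide at an error of exactly 1, and for larger errors the squared error loss grows faster. Squared error loss therefore penalizes large mistakes more heavily than absolute error loss does.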
