# Random variables, expectation, and variance

DSE 210

## Random variables

Roll a die. Define

X = 1 if the die shows 3 or more, 0 otherwise.

Here the sample space is Ω = {1, 2, 3, 4, 5, 6}, and

ω = 1, 2 ⇒ X = 0;  ω = 3, 4, 5, 6 ⇒ X = 1.

Roll n dice. Define

X = # of 6's,  Y = # of 1's before the first 6.

Both X and Y are defined on the same sample space, Ω = {1, 2, 3, 4, 5, 6}^n. For instance,

ω = (1, 1, 1, . . . , 1, 6) ⇒ X = 1, Y = n − 1.

In general, a random variable (r.v.) is defined on a probability space. It is a mapping from Ω to R. We'll use capital letters for r.v.'s.

## The distribution of a random variable

Roll a die. Define X = 1 if the die is ≥ 3, otherwise X = 0.

X takes values in {0, 1} and has distribution:

Pr(X = 0) = 1/3 and Pr(X = 1) = 2/3.

Roll n dice. Deﬁne X = number of 6’s.

X takes values in {0, 1, 2, . . . , n}. The distribution of X is:

Pr(X = k) = #(sequences with k 6's) · Pr(one such sequence)
          = C(n, k) · (1/6)^k · (5/6)^(n−k)
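This binomial formula can be checked by simulation; a minimal sketch (the choice n = 5 and the trial count are arbitrary, not from the slides):

```python
import random
from math import comb

# Roll n dice many times and compare the empirical frequency of k 6's
# with the formula Pr(X = k) = C(n, k) * (1/6)^k * (5/6)^(n-k).
random.seed(0)
n, trials = 5, 200_000
counts = [0] * (n + 1)
for _ in range(trials):
    x = sum(1 for _ in range(n) if random.randint(1, 6) == 6)
    counts[x] += 1

for k in range(n + 1):
    exact = comb(n, k) * (1/6)**k * (5/6)**(n - k)
    print(f"k={k}: empirical {counts[k]/trials:.4f}, exact {exact:.4f}")
```

The empirical and exact columns should agree to a couple of decimal places at this trial count.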

Throw a dart at a dartboard of radius 1. Let X be the distance to the center of the board.

X takes values in [0, 1]. The distribution of X is:

Pr(X ≤ x) = x².
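The quadratic CDF can also be verified empirically. The sketch below samples points uniformly in the unit disk by rejection from the enclosing square (the sampling method is an assumption of this sketch, not part of the slides):

```python
import random

# Sample uniform points in the unit disk and check Pr(X <= x) = x^2,
# where X is the distance from the center.
random.seed(0)
dists = []
while len(dists) < 100_000:
    a, b = random.uniform(-1, 1), random.uniform(-1, 1)
    if a * a + b * b <= 1:               # keep only points inside the disk
        dists.append((a * a + b * b) ** 0.5)

for x in (0.25, 0.5, 0.75):
    frac = sum(d <= x for d in dists) / len(dists)
    print(f"Pr(X <= {x}) ~ {frac:.3f} (exact {x * x:.3f})")
```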

## Expected value, or mean

The expected value of a random variable X is

E(X) = Σ_x x · Pr(X = x).

Roll a die. Let X be the number observed.
E(X) = 1 · (1/6) + 2 · (1/6) + · · · + 6 · (1/6) = (1 + 2 + 3 + 4 + 5 + 6)/6 = 3.5 (the average).

Biased coin. A coin has heads probability p. Let X be 1 if heads, 0 if tails. E(X) = 1 · p + 0 · (1 − p) = p.

Toss a coin with bias p repeatedly, until it comes up heads. Let X be the number of tosses.
E(X) = 1/p.
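A quick simulation of this fact (the bias p = 0.3 and the trial count are arbitrary choices):

```python
import random

# Toss a coin of bias p until the first head; the average number of tosses
# should approach 1/p.
random.seed(0)
p, trials = 0.3, 100_000
total = 0
for _ in range(trials):
    tosses = 1
    while random.random() >= p:   # tails: keep tossing
        tosses += 1
    total += tosses
print(total / trials)  # close to 1/p ≈ 3.33
```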

## Pascal's wager

Pascal: I think there is some chance (p > 0) that God exists. Therefore I should act as if he exists.

Let X = my level of suffering.

- Suppose I behave as if God exists (that is, I behave myself). Then X is some significant but finite amount, like 100 or 1000.
- Suppose I behave as if God doesn't exist (I do whatever I want to). If indeed God doesn't exist: X = 0. But if God exists: X = ∞ (hell). Therefore, E(X) = 0 · (1 − p) + ∞ · p = ∞.

The first option is much better!
## Linearity of expectation

- If you double a set of numbers, how is the average affected? It is also doubled.
- If you increase a set of numbers by 1, how much does the average change? It also increases by 1.
- Rule: E(aX + b) = aE(X) + b for any random variable X and any constants a, b.
- But here's a more surprising (and very powerful) property: E(X + Y) = E(X) + E(Y) for any two random variables X, Y.
- Likewise: E(X + Y + Z) = E(X) + E(Y) + E(Z), etc.

## Linearity: examples

Roll 2 dice and let Z denote the sum. What is E(Z)?

Method 1. Distribution of Z:

| z | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|----|----|----|
| Pr(Z = z) | 1/36 | 2/36 | 3/36 | 4/36 | 5/36 | 6/36 | 5/36 | 4/36 | 3/36 | 2/36 | 1/36 |

Now use the formula for expected value:

E(Z) = 2 · (1/36) + 3 · (2/36) + 4 · (3/36) + · · · = 7.
Method 2. Let X1 be the first die and X2 the second die. Each of them is a single die and thus (as we saw earlier) has expected value 3.5. Since Z = X1 + X2,

E(Z ) = E(X1) + E(X2) = 3.5 + 3.5 = 7.
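Both methods can be checked exactly with rational arithmetic; a small sketch:

```python
from fractions import Fraction

# Method 1: build the distribution of Z = sum of two dice and compute
# E(Z) = sum over z of z * Pr(Z = z), using exact fractions.
dist = {}
for d1 in range(1, 7):
    for d2 in range(1, 7):
        z = d1 + d2
        dist[z] = dist.get(z, Fraction(0)) + Fraction(1, 36)

method1 = sum(z * pr for z, pr in dist.items())

# Method 2: linearity of expectation, E(Z) = E(X1) + E(X2) = 3.5 + 3.5.
method2 = Fraction(7, 2) + Fraction(7, 2)

print(method1, method2)  # both print 7
```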

Toss n coins of bias p, and let X be the number of heads. What is E(X )? Let the individual coins be X1, . . . , Xn. Each has value 0 or 1 and has expected value p. Since X = X1 + X2 + · · · + Xn,
E(X ) = E(X1) + · · · + E(Xn) = np.
Roll a die n times, and let X be the number of 6's. What is E(X)? Let X1 be 1 if the first roll is a 6, and 0 otherwise. Then E(X1) = 1/6. Likewise, define X2, X3, . . . , Xn. Since X = X1 + · · · + Xn, we have
E(X) = E(X1) + · · · + E(Xn) = n/6.

## Coupon collector, again

Each cereal box has one of k action figures. What is the expected number of boxes you need to buy in order to collect all the figures?

Suppose you've already collected i − 1 of the figures. Let Xi be the time to collect the next one. Each box you buy will contain a new figure with probability (k − (i − 1))/k. Therefore,

E(Xi) = k / (k − i + 1).

The total number of boxes bought is X = X1 + X2 + · · · + Xk, so

E(X) = E(X1) + E(X2) + · · · + E(Xk)
     = k/k + k/(k − 1) + k/(k − 2) + · · · + k/1
     = k (1 + 1/2 + · · · + 1/k) ≈ k ln k.
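A simulation comparing the average number of boxes with k(1 + 1/2 + · · · + 1/k); the choice k = 10 and the trial count are arbitrary:

```python
import random

# Buy boxes until all k figures are collected, and average over many trials.
random.seed(0)
k, trials = 10, 20_000
total = 0
for _ in range(trials):
    seen, boxes = set(), 0
    while len(seen) < k:
        seen.add(random.randrange(k))  # each box holds a uniformly random figure
        boxes += 1
    total += boxes

harmonic = sum(1 / i for i in range(1, k + 1))
print(total / trials, k * harmonic)  # both ≈ 29.3 for k = 10
```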

## Independent random variables

Random variables X, Y are independent if Pr(X = x, Y = y) = Pr(X = x) Pr(Y = y) for all x, y.

Independent or not?

- Pick a card out of a standard deck. X = suit and Y = number. Independent.
- Flip a fair coin n times. X = # heads and Y = last toss. Not independent.
- X, Y take values in {−1, 0, 1}, with the following joint probabilities:

|        | Y = −1 | Y = 0 | Y = 1 |
|--------|--------|-------|-------|
| X = −1 | 0.40   | 0.16  | 0.24  |
| X = 0  | 0.05   | 0.02  | 0.03  |
| X = 1  | 0.05   | 0.02  | 0.03  |

  The marginals are:

|           | −1  | 0   | 1   |
|-----------|-----|-----|-----|
| Pr(X = ·) | 0.8 | 0.1 | 0.1 |
| Pr(Y = ·) | 0.5 | 0.2 | 0.3 |

  Every cell of the joint table equals the product of its marginals. Independent.
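The independence claim can be verified mechanically: every cell of the joint table should factor as Pr(X = x) Pr(Y = y). A sketch:

```python
# The joint distribution from the slides, keyed by (x, y).
joint = {
    (-1, -1): 0.40, (-1, 0): 0.16, (-1, 1): 0.24,
    ( 0, -1): 0.05, ( 0, 0): 0.02, ( 0, 1): 0.03,
    ( 1, -1): 0.05, ( 1, 0): 0.02, ( 1, 1): 0.03,
}

# Marginals: sum the joint over the other variable.
px = {x: sum(p for (a, _), p in joint.items() if a == x) for x in (-1, 0, 1)}
py = {y: sum(p for (_, b), p in joint.items() if b == y) for y in (-1, 0, 1)}

# Independence check: each cell equals the product of its marginals.
independent = all(abs(joint[x, y] - px[x] * py[y]) < 1e-9 for x in px for y in py)
print(independent)  # True; marginals ≈ (0.8, 0.1, 0.1) for X and (0.5, 0.2, 0.3) for Y
```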

## Variance
If you had to summarize the entire distribution of a r.v. X by a single number, you would use the mean (or median). Call it µ.
But these don’t capture the spread of X :

[Figure: two distributions Pr(x) with the same mean µ but different spreads.]

What would be a good measure of spread? How about the average distance away from the mean: E(|X − µ|)?
For convenience, take the square instead of the absolute value.

Variance:

var(X) = E(X − µ)² = E(X²) − µ²,

where µ = E(X). The variance is always ≥ 0.

## Variance: example

Recall: var(X) = E(X − µ)² = E(X²) − µ², where µ = E(X). Toss a coin of bias p. Let X ∈ {0, 1} be the outcome.

E(X) = p
E(X²) = p
E(X − µ)² = p² · (1 − p) + (1 − p)² · p = p(1 − p)
E(X²) − µ² = p − p² = p(1 − p)

This variance is highest when p = 1/2 (fair coin).

The standard deviation of X is √var(X). It is the average amount by which X differs from its mean.
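The two routes to var(X) = p(1 − p) can be checked numerically, along with the claim that the variance peaks at p = 1/2; a sketch:

```python
def coin_variance(p):
    """var(X) = E(X^2) - mu^2 for a coin of bias p, X in {0, 1}."""
    mu = 1 * p + 0 * (1 - p)    # E(X)
    ex2 = 1 * p + 0 * (1 - p)   # E(X^2); X^2 = X since X is 0/1
    return ex2 - mu ** 2

for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(p, coin_variance(p))  # matches p * (1 - p); largest at p = 0.5
```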

## Variance of a sum

var(X1 + · · · + Xk) = var(X1) + · · · + var(Xk) if the Xi are independent.

Symmetric random walk. A drunken man sets out from a bar. At each time step, he either moves one step to the right or one step to the left, with equal probabilities. Roughly where is he after n steps?

Let Xi ∈ {−1, 1} be his ith step. Then E(Xi) = 0 and var(Xi) = 1. His position after n steps is X = X1 + · · · + Xn, so

E(X) = 0, var(X) = n, stddev(X) = √n.

He is likely to be pretty close to where he started!
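A simulation of the walk (the step count n = 400 and the number of trials are arbitrary choices):

```python
import random
import statistics

# Run many independent n-step symmetric walks and look at the spread of the
# final position: mean should be near 0, standard deviation near sqrt(n).
random.seed(0)
n, trials = 400, 5_000
positions = [sum(random.choice((-1, 1)) for _ in range(n)) for _ in range(trials)]

print(statistics.mean(positions))             # near 0
print(statistics.pstdev(positions), n ** 0.5)  # both near 20
```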

## Sampling

Useful variance rules:

- var(X1 + · · · + Xk) = var(X1) + · · · + var(Xk) if the Xi are independent.
- var(aX + b) = a² var(X).

What fraction of San Diegans like sushi? Call it p.

Pick n people at random and ask them. Each answers 1 (likes) or 0 (doesn't like). Call these values X1, . . . , Xn. Your estimate is then:

Y = (X1 + · · · + Xn)/n.

How accurate is this estimate? Each Xi has mean p and variance p(1 − p), so

E(Y) = (E(X1) + · · · + E(Xn))/n = p

var(Y) = (var(X1) + · · · + var(Xn))/n² = p(1 − p)/n

stddev(Y) = √(p(1 − p)/n) ≤ 1/(2√n)
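A simulated poll illustrating these formulas; the true p, the sample size n, and the number of repeated surveys are all arbitrary choices for the sketch:

```python
import random
import statistics

# Repeat the poll many times: each survey asks n random people and records
# the fraction who answer 1. The spread of the estimates should be about
# sqrt(p(1-p)/n), which is at most 1/(2 sqrt(n)).
random.seed(0)
p, n, surveys = 0.6, 900, 5_000
estimates = [sum(random.random() < p for _ in range(n)) / n for _ in range(surveys)]

print(statistics.mean(estimates))    # near p = 0.6
print(statistics.pstdev(estimates))  # near sqrt(0.24/900) ≈ 0.0163
print(0.5 / n ** 0.5)                # the upper bound 1/(2 sqrt(n)) ≈ 0.0167
```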

DSE 210: Probability and statistics, Winter 2018

## Worksheet 4 — Random variables, expectation, and variance

1. A die is thrown twice. Let X1 and X2 denote the outcomes, and deﬁne random variable X to be the minimum of X1 and X2. Determine the distribution of X.
2. A fair die is rolled repeatedly until a six is seen. What is the expected number of rolls?
3. On any given day, the probability it will be sunny is 0.8, the probability you will have a nice dinner is 0.25, and the probability that you will get to bed early is 0.5. Assume these three events are independent. What is the expected number of days before all three of them happen together?
4. An elevator operates in a building with 10 ﬂoors. One day, n people get into the elevator, and each of them chooses to go to a ﬂoor selected uniformly at random from 1 to 10.
(a) What is the probability that exactly one person gets out at the ith ﬂoor? Give your answer in terms of n.
(b) What is the expected number of ﬂoors in which exactly one person gets out? Hint: let Xi be 1 if exactly one person gets out on ﬂoor i, and 0 otherwise. Then use linearity of expectation.
5. You throw m balls into n bins, each independently at random. Let X be the number of balls that end up in bin 1.
(a) Let Xi be the event that the ith ball falls in bin 1. Write X as a function of the Xi.
(b) What is the expected value of X?
6. There is a dormitory with n beds for n students. One night the power goes out, and because it is dark, each student gets into a bed chosen uniformly at random. What is the expected number of students who end up in their own bed?
7. In each of the following cases, say whether X and Y are independent.
(a) You randomly permute (1, 2, . . . , n). X is the number in the ﬁrst position and Y is the number in the second position.
(b) You randomly pick a sentence out of Hamlet. X is the ﬁrst word in the sentence and Y is the second word.
(c) You randomly pick a card from a pack of 52 cards. X is 1 if the card is a nine, and is 0 otherwise. Y is 1 if the card is a heart, and is 0 otherwise.
(d) You randomly deal a ten-card hand from a pack of 52 cards. X is 1 if the hand contains a nine, and is 0 otherwise. Y is 1 if all cards in the hand are hearts, and is 0 otherwise.
8. A die has six sides that come up with different probabilities:
Pr(1) = Pr(2) = Pr(3) = Pr(4) = 1/8, Pr(5) = Pr(6) = 1/4.
(a) You roll the die; let Z be the outcome. What is E(Z) and var(Z)?


(b) You roll the die 10 times, independently; let X be the sum of all the rolls. What is E(X) and var(X )?
(c) You roll the die n times and take the average of all the rolls; call this A. What is E(A)? What is var(A)?

9. Let X1, X2, . . . , X100 be the outcomes of 100 independent rolls of a fair die.

(a) What are E(X1) and var(X1)?

(b) Define the random variable X to be X1 − X2. What are E(X) and var(X)?

(c) Define the random variable Y to be X1 − 2X2 + X3. What are E(Y) and var(Y)?

(d) Define the random variable Z = X1 − X2 + X3 − X4 + · · · + X99 − X100. What are E(Z) and var(Z)?

10. Suppose you throw m balls into n bins, where m ≤ n. For the following questions, give answers in terms of m and n.

(a) Let Xi be the number of balls that fall into bin i. What is Pr(Xi = 0)?
(b) What is Pr(Xi = 1)?
(c) What is E(Xi)?
(d) What is var(Xi)?

11. Give an example of random variables X and Y such that var(X + Y) ≠ var(X) + var(Y).

12. Suppose a fair coin is tossed repeatedly until the same outcome occurs twice in a row (that is, two heads in a row or two tails in a row). What is the expected number of tosses?

13. In a sequence of coin tosses, a run is a series of consecutive heads or consecutive tails. For instance, the longest run in HTHHHTTHHTHH consists of three heads. We are interested in the following question: when a fair coin is tossed n times, how long a run is the resulting sequence likely to contain?
To study this, pick any k between 1 and n, and let Rk denote the number of runs of length exactly k (for instance, a run of length k + 1 doesn't count). In order to figure out E(Rk), we define the following random variables: Xi = 1 if a run of length exactly k begins at position i, where i ≤ n − k + 1.

(a) What are E(X1) and E(Xn−k+1)?
(b) What is E(Xi) for 1 < i < n − k + 1?
(c) What is E(Rk)?
(d) What is, roughly, the largest k for which E(Rk) ≥ 1?


# Modeling data with probability distributions
DSE 210
## Distributional modeling
A useful way to summarize a data set:

- Fit a probability distribution to it.
- This is simple and compact, and captures the big picture while smoothing out the wrinkles in the data.
- In subsequent applications, use the distribution as a proxy for the data.
Which distributions to use?

> There exist a few distributions of great universality which occur in a surprisingly large number of problems. The three principal distributions, with ramifications throughout probability theory, are the binomial distribution, the normal distribution, and the Poisson distribution. – William Feller
Well, this is true in one dimension. For higher-dimensional data, we’ll use combinations of 1-d models: products and mixtures.