# 20 CS 472 Analysis of Algorithms II Random Variables and

## Transcript Of 20 CS 472 Analysis of Algorithms II Random Variables and

11 October 2010

University of Cincinnati School of Computing Sciences and Informatics

20 CS 472 – Analysis of Algorithms II Random Variables and Functions Winter 2010

½º Ç

ØÚ

The objective of this document is to give the student a better understanding of random variables, distributions, and their use in engineering problems.

¾º

ÖÓÙÒ

A random variable x is a variable that can take any one of a set of values according to some probability distribution. Suppose random variables only take integer values so that in what follows, mention of x without modiﬁers implicitly means x is considered to be a random integer variable. We use the notation P r(x = v) to denote the probability that x has value v. If V is the complete set of values that x can take, we must have

P r(x = v) = 1

(1)

v∈V

which means it is certain that x takes some value from its allowed set V . If x takes all values in V with equal probability, then we say that x is uniformly distributed over V . For example, if x is uniformly distributed over V = {1, 2, ..., 1000}, then P r(x = v) = 1/1000 for any 1 ≤ v ≤ 1000 and P r(x = v) = 0 for any v < 1 and any v > 1000.

Two important parameters of distributions are the mean and the standard deviation. The mean tells us something about the value we might expect for x the next time it is assigned a value. The standard deviation tells us something about how close to the mean the next value is going to be. The mean of x is deﬁned as

µx = v · P r(x = v)

v∈V

where we use µx as a shorthand representation of this sum. The standard deviation is deﬁned as

σx =

(x − µx)2P r(x = v).

(2)

v∈V

Think of σx as the mean value of the random variable (x − µx)2. It is easy to see that if x has a high probability of taking a value very close to µx, then σx is not going to be high.

It is straightforward to compute µx and σx if x is uniformly distributed between numbers a and

b (that is, V = {a, (a + 1), ..., (b − 1), b}). From equation (1) and the fact that there are b − a + 1

numbers in V , P r(x = v) = 1/(b − a + 1). Therefore, µx =

The sum

b i=a

i

written

out

is

v∈V v · P r(x = v) =

b i=a

i/(b

−

a

+

1).

a + (a + 1) + (a + 2) + ... + (b − 2) + (b − 1) + b

(3)

To see what this is, just pair the left a with the right b to get (a + b), then pair the 2nd left (a + 1) with the 2nd right (b − 1) to get (a + b) again. Pairing 3rd left and 3rd right terms gives (a + b)

again. If V has an even number of numbers, We can do this pairing (b − a + 1)/2 times, each time adding (a + b). So, in total, we add (a + b) ∗ (b − a + 1)/2 if |V | is even. If |V | is odd, we get (a + b) ∗ (b − a)/2 for the pairing total but there is a middle, unpaired term of (a + b)/2. Hence the total for |V | odd is (a − b) ∗ (b − a + 1)/2 which is the same as for the case |V | even. Therefore, the sum (3) is (a + b)(b − a + 1)/2 and

µx = i = (a + b) · (b − a + 1) = a + b . (4) v∈V b − a + 1 2(b − a + 1) 2

In other words, µx is right in the middle of the range of values x can take.

What about the standard deviation? If x is uniformly distributed between a and b the standard deviation, from equations (2) and (4), is

b

σx =

i=1

i− b+a 2· 1

2

b−a+1

A closed form √equivalent expression can be worked out for σx as was done for µx. The result is σx = |(b − a)|/ 12.

Now consider another distribution, called the triangle distribution. If x has the triangle distribution from a to b, then

P r(x = v) = 2(x − a + 1) . (b − a + 2)(b − a + 1)

In other words, the probability that x has value v rises linearly from a until b and is 0 outside that

range. It is not hard to see that

b i=a

P

r(x

=

i)

=

1

as

required.

It is also straightforward to

compute µx = a + 2(b − a)/3. A value for σx can be computed but the result is a rather complex

expression and, since we will not need it, we will not show it.

Up to now we have considered integer random variables to avoid heavy duty math you may not have seen yet. But all the results above are similar to results obtained if x were allowed to take non-integer values as well. So, from now on we lift the restriction of integer values.

For many applications the Gaussian distribution is the most important distribution available to us. Avoiding the complicated deﬁnition of the Guassian distribution, it suﬃces to state the following amazing properties it has:

1. It is completely characterized by its mean and its standard deviation. In other words, all I have to say is Gaussian with parameters µx and σx and you know exactly what distribution I am talking about.

2. In most cases, the sum of a collection of random variables taken from any distribution, integer or real-valued, tends to a Gaussian distribution.

It is the latter property, better known as the Central Limit Theorem, that makes the use and study of Gaussian distributions so important. We will investigate this more closely in the remainder of this document.

Figure 1: Output for program of Section 3.1 Figure 2: Output for program of Section 3.2

¿º

×ØÖ ÙØ ÓÒ×

We write programs to experiment with several probabilistic distributions.

¿º½ ÍÒ ÓÖÑ ×ØÖ ÙØ ÓÒ

We present a MATLAB program that asks for a mean (µx) and then generates 1000 · µx random real values from the uniform distribution with a = 0 and b = 2 ∗ µx. The MATLAB built-in function rand is used to generate the values. An array bin of buckets is created so that, at the end of execution, bin(i) contains the number of values generated between the integer i, exclusive, and the integer i + 1, inclusive. Then the result (the y coordinates are the numbers in bin(i) and the x coordinates are bucket indices i) is plotted. The y axis takes a range of values from 0 to 1.2 times the maximum of bin(i) over all i. See Figure 1 for an example output where the mean was set to 100.

Here is the code:

function uniform(mean) bin = zeros(1,2*mean); for i=1:100000*mean

number = ceil(rand(1)*2*mean); bin(number) = bin(number) + 1; end plot(bin); axis([0 2*mean 0 1.2*max(bin)]); ylabel(’Points in bucket’); xlabel(’Bucket number’); title(’Random values: Uniform Distribution’);

¿º¾ ÌÖ Ò Ð

×ØÖ ÙØ ÓÒ

We repeat Section 3.1 for the triangle distribution. The distribution rises linearly to (3/2) · µx. Example output is shown in Figure 2 for a mean of 100. We use the uniform generator to get triangle values under the following transformation:

√ v = (3/2) · µv · x

where x is uniformly distributed between 0 and 1, and v is distributed according to the triangle distribution with mean µv. Here is the code:

Figure 3: Output for program of Section 3.3

Figure 4: Program of Section 3.4: 100 values, µvi = 1.

Figure 5: Program of Section 3.4: 10 values, µvi = 10.

Figure 6: Program of Section 3.5: 25 values, µvi = 400.

function triangle(mean) bin = zeros(1,(3/2)*mean); for i=1:1000*mean

number = ceil((3/2)*mean*sqrt(rand(1))); bin(number) = bin(number) + 1; end plot(bin); axis([0 (3/2)*mean 0 1.2*max(bin)]); ylabel(’Points in buckets’); xlabel(’Bucket number’); title(’Random values: Triangle Distribution’);

Note: Where rand(1) ∗ 2 ∗ µx was used in the ﬁrst program, (3/2) ∗ sqrt(rand(1)) ∗ µv is now used.

¿º¿

Ù×× Ò ×ØÖ ÙØ ÓÒ

We repeat Section 3.1 for the Gaussian distribution except the input includes standard deviation. Example output is shown in Figure 3 for a mean of 100 and standard deviation 50. The built-in MATLAB function randn generates numbers taken from a normal distribution with mean 0 and standard deviation 1. To translate to a Gaussian distribution of mean µv and standard deviation

σv use:

v = µv + σv · x

where x is taken from a normal distribution, and v is distributed according to the Gaussian distribution with mean µv and standard deviation σv The code is as follows:

function gauss(mean,stddev) bin = zeros(1,2*mean); errlow = 0; errhigh = 0; for i=1:10000*mean

number = ceil(stddev*randn(1))+mean; if number > 0 & number < 2*mean + 1

bin(number) = bin(number) + 1; elseif number < 1

errlow = errlow +1; else

errhigh = errhigh + 1; end end figure(2); plot(bin); axis([0 2*mean 0 max(bin)*1.2]); ylabel(’Points in bucket’); xlabel(’Bucket number’); title(’Random values: Gaussian Distribution’); fprintf(’Display of frequency of values from normal fprintf(’Error percentages: low=%f high=%f\n’, \

errlow/(10000*mean),errhigh/(10000*mean));

distribution\n’);

Notes: Where rand(1) ∗ 2 ∗ µx was used, now σv ∗ randn(1) + µv is used. Function randn(1) returns a value from minus inﬁnity to plus inﬁnity. Therefore the vector bin will never be able to hold all the values generated. Hence we that check random values before attempting to store them and throw away those that are outside the index range of bin.

¿º ËÙÑ Ó ÙÒ ÓÖÑÐÝ ×ØÖ ÙØ Ú ÐÙ ×

We repeat Section 3.1 except this time each recorded value is the sum of values of uniformly distributed random values. Input from the user is the number of values to sum and the mean of each of those values. Figure 4 shows results where the number of values summed is 100 and the mean of each value is 1, and Figure 5 shows where the number of values summed is 10 and the mean of each is 10. Observe that for larger means the distribution of the sum approaches that of a Gaussian distribution. Experiment to ﬁnd the relationship between the mean and standard deviation of the resulting Gaussian distribution and the number of values summed and their means. Here is the code:

function central_uni(num2sum,mean) bin = zeros(1,num2sum*2*mean); for i=1:1000*mean

s = 0; for j=1:num2sum s=s+rand(1)*2*mean; end number = ceil(s); bin(number) = bin(number) + 1; end plot(bin); axis([num2sum*mean*(1-.25) num2sum*mean*(1+.25) 0 max(bin)*1.2]); ylabel(’Points in bucket’); xlabel(’Bucket number’); title(’Random values: Sum of Uniformly Distributed Values’); fprintf(’Display of frequency of sum of uniform random values\n’);

¿º ËÙÑ Ó ØÖ Ò Ð

×ØÖ ÙØ Ú ÐÙ ×

Finally, we repeat Section 3.4 except this time the values to sum are distributed according to the triangle distribution. Input from the user is the number of values to sum and the mean of each of those values. Figure 6 shows output when summing 100 values, each of mean 100. Here is the code:

function central_tri(num2sum,mean) bin = zeros(1,num2sum*2*mean); for i=1:1000*mean

s = 0; for j=1:num2sum s=s+(3/2)*mean*sqrt(rand(1)); end number = ceil(s); bin(number) = bin(number) + 1; end plot(bin); axis([num2sum*mean*(1-.25) num2sum*mean*(1+.25) 0 max(bin)*1.2]); ylabel(’Points in bucket’); xlabel(’Bucket Number’); title(’Random values: Sum of Triangle Distributed Values’); fprintf(’Display of frequency of sum of triangle random values\n’);

University of Cincinnati School of Computing Sciences and Informatics

20 CS 472 – Analysis of Algorithms II Random Variables and Functions Winter 2010

½º Ç

ØÚ

The objective of this document is to give the student a better understanding of random variables, distributions, and their use in engineering problems.

¾º

ÖÓÙÒ

A random variable x is a variable that can take any one of a set of values according to some probability distribution. Suppose random variables only take integer values so that in what follows, mention of x without modiﬁers implicitly means x is considered to be a random integer variable. We use the notation P r(x = v) to denote the probability that x has value v. If V is the complete set of values that x can take, we must have

P r(x = v) = 1

(1)

v∈V

which means it is certain that x takes some value from its allowed set V . If x takes all values in V with equal probability, then we say that x is uniformly distributed over V . For example, if x is uniformly distributed over V = {1, 2, ..., 1000}, then P r(x = v) = 1/1000 for any 1 ≤ v ≤ 1000 and P r(x = v) = 0 for any v < 1 and any v > 1000.

Two important parameters of distributions are the mean and the standard deviation. The mean tells us something about the value we might expect for x the next time it is assigned a value. The standard deviation tells us something about how close to the mean the next value is going to be. The mean of x is deﬁned as

µx = v · P r(x = v)

v∈V

where we use µx as a shorthand representation of this sum. The standard deviation is deﬁned as

σx =

(x − µx)2P r(x = v).

(2)

v∈V

Think of σx as the mean value of the random variable (x − µx)2. It is easy to see that if x has a high probability of taking a value very close to µx, then σx is not going to be high.

It is straightforward to compute µx and σx if x is uniformly distributed between numbers a and

b (that is, V = {a, (a + 1), ..., (b − 1), b}). From equation (1) and the fact that there are b − a + 1

numbers in V , P r(x = v) = 1/(b − a + 1). Therefore, µx =

The sum

b i=a

i

written

out

is

v∈V v · P r(x = v) =

b i=a

i/(b

−

a

+

1).

a + (a + 1) + (a + 2) + ... + (b − 2) + (b − 1) + b

(3)

To see what this is, just pair the left a with the right b to get (a + b), then pair the 2nd left (a + 1) with the 2nd right (b − 1) to get (a + b) again. Pairing 3rd left and 3rd right terms gives (a + b)

again. If V has an even number of numbers, We can do this pairing (b − a + 1)/2 times, each time adding (a + b). So, in total, we add (a + b) ∗ (b − a + 1)/2 if |V | is even. If |V | is odd, we get (a + b) ∗ (b − a)/2 for the pairing total but there is a middle, unpaired term of (a + b)/2. Hence the total for |V | odd is (a − b) ∗ (b − a + 1)/2 which is the same as for the case |V | even. Therefore, the sum (3) is (a + b)(b − a + 1)/2 and

µx = i = (a + b) · (b − a + 1) = a + b . (4) v∈V b − a + 1 2(b − a + 1) 2

In other words, µx is right in the middle of the range of values x can take.

What about the standard deviation? If x is uniformly distributed between a and b the standard deviation, from equations (2) and (4), is

b

σx =

i=1

i− b+a 2· 1

2

b−a+1

A closed form √equivalent expression can be worked out for σx as was done for µx. The result is σx = |(b − a)|/ 12.

Now consider another distribution, called the triangle distribution. If x has the triangle distribution from a to b, then

P r(x = v) = 2(x − a + 1) . (b − a + 2)(b − a + 1)

In other words, the probability that x has value v rises linearly from a until b and is 0 outside that

range. It is not hard to see that

b i=a

P

r(x

=

i)

=

1

as

required.

It is also straightforward to

compute µx = a + 2(b − a)/3. A value for σx can be computed but the result is a rather complex

expression and, since we will not need it, we will not show it.

Up to now we have considered integer random variables to avoid heavy duty math you may not have seen yet. But all the results above are similar to results obtained if x were allowed to take non-integer values as well. So, from now on we lift the restriction of integer values.

For many applications the Gaussian distribution is the most important distribution available to us. Avoiding the complicated deﬁnition of the Guassian distribution, it suﬃces to state the following amazing properties it has:

1. It is completely characterized by its mean and its standard deviation. In other words, all I have to say is Gaussian with parameters µx and σx and you know exactly what distribution I am talking about.

2. In most cases, the sum of a collection of random variables taken from any distribution, integer or real-valued, tends to a Gaussian distribution.

It is the latter property, better known as the Central Limit Theorem, that makes the use and study of Gaussian distributions so important. We will investigate this more closely in the remainder of this document.

Figure 1: Output for program of Section 3.1 Figure 2: Output for program of Section 3.2

¿º

×ØÖ ÙØ ÓÒ×

We write programs to experiment with several probabilistic distributions.

¿º½ ÍÒ ÓÖÑ ×ØÖ ÙØ ÓÒ

We present a MATLAB program that asks for a mean (µx) and then generates 1000 · µx random real values from the uniform distribution with a = 0 and b = 2 ∗ µx. The MATLAB built-in function rand is used to generate the values. An array bin of buckets is created so that, at the end of execution, bin(i) contains the number of values generated between the integer i, exclusive, and the integer i + 1, inclusive. Then the result (the y coordinates are the numbers in bin(i) and the x coordinates are bucket indices i) is plotted. The y axis takes a range of values from 0 to 1.2 times the maximum of bin(i) over all i. See Figure 1 for an example output where the mean was set to 100.

Here is the code:

function uniform(mean) bin = zeros(1,2*mean); for i=1:100000*mean

number = ceil(rand(1)*2*mean); bin(number) = bin(number) + 1; end plot(bin); axis([0 2*mean 0 1.2*max(bin)]); ylabel(’Points in bucket’); xlabel(’Bucket number’); title(’Random values: Uniform Distribution’);

¿º¾ ÌÖ Ò Ð

×ØÖ ÙØ ÓÒ

We repeat Section 3.1 for the triangle distribution. The distribution rises linearly to (3/2) · µx. Example output is shown in Figure 2 for a mean of 100. We use the uniform generator to get triangle values under the following transformation:

√ v = (3/2) · µv · x

where x is uniformly distributed between 0 and 1, and v is distributed according to the triangle distribution with mean µv. Here is the code:

Figure 3: Output for program of Section 3.3

Figure 4: Program of Section 3.4: 100 values, µvi = 1.

Figure 5: Program of Section 3.4: 10 values, µvi = 10.

Figure 6: Program of Section 3.5: 25 values, µvi = 400.

function triangle(mean) bin = zeros(1,(3/2)*mean); for i=1:1000*mean

number = ceil((3/2)*mean*sqrt(rand(1))); bin(number) = bin(number) + 1; end plot(bin); axis([0 (3/2)*mean 0 1.2*max(bin)]); ylabel(’Points in buckets’); xlabel(’Bucket number’); title(’Random values: Triangle Distribution’);

Note: Where rand(1) ∗ 2 ∗ µx was used in the ﬁrst program, (3/2) ∗ sqrt(rand(1)) ∗ µv is now used.

¿º¿

Ù×× Ò ×ØÖ ÙØ ÓÒ

We repeat Section 3.1 for the Gaussian distribution except the input includes standard deviation. Example output is shown in Figure 3 for a mean of 100 and standard deviation 50. The built-in MATLAB function randn generates numbers taken from a normal distribution with mean 0 and standard deviation 1. To translate to a Gaussian distribution of mean µv and standard deviation

σv use:

v = µv + σv · x

where x is taken from a normal distribution, and v is distributed according to the Gaussian distribution with mean µv and standard deviation σv The code is as follows:

function gauss(mean,stddev) bin = zeros(1,2*mean); errlow = 0; errhigh = 0; for i=1:10000*mean

number = ceil(stddev*randn(1))+mean; if number > 0 & number < 2*mean + 1

bin(number) = bin(number) + 1; elseif number < 1

errlow = errlow +1; else

errhigh = errhigh + 1; end end figure(2); plot(bin); axis([0 2*mean 0 max(bin)*1.2]); ylabel(’Points in bucket’); xlabel(’Bucket number’); title(’Random values: Gaussian Distribution’); fprintf(’Display of frequency of values from normal fprintf(’Error percentages: low=%f high=%f\n’, \

errlow/(10000*mean),errhigh/(10000*mean));

distribution\n’);

Notes: Where rand(1) ∗ 2 ∗ µx was used, now σv ∗ randn(1) + µv is used. Function randn(1) returns a value from minus inﬁnity to plus inﬁnity. Therefore the vector bin will never be able to hold all the values generated. Hence we that check random values before attempting to store them and throw away those that are outside the index range of bin.

¿º ËÙÑ Ó ÙÒ ÓÖÑÐÝ ×ØÖ ÙØ Ú ÐÙ ×

We repeat Section 3.1 except this time each recorded value is the sum of values of uniformly distributed random values. Input from the user is the number of values to sum and the mean of each of those values. Figure 4 shows results where the number of values summed is 100 and the mean of each value is 1, and Figure 5 shows where the number of values summed is 10 and the mean of each is 10. Observe that for larger means the distribution of the sum approaches that of a Gaussian distribution. Experiment to ﬁnd the relationship between the mean and standard deviation of the resulting Gaussian distribution and the number of values summed and their means. Here is the code:

function central_uni(num2sum,mean) bin = zeros(1,num2sum*2*mean); for i=1:1000*mean

s = 0; for j=1:num2sum s=s+rand(1)*2*mean; end number = ceil(s); bin(number) = bin(number) + 1; end plot(bin); axis([num2sum*mean*(1-.25) num2sum*mean*(1+.25) 0 max(bin)*1.2]); ylabel(’Points in bucket’); xlabel(’Bucket number’); title(’Random values: Sum of Uniformly Distributed Values’); fprintf(’Display of frequency of sum of uniform random values\n’);

¿º ËÙÑ Ó ØÖ Ò Ð

×ØÖ ÙØ Ú ÐÙ ×

Finally, we repeat Section 3.4 except this time the values to sum are distributed according to the triangle distribution. Input from the user is the number of values to sum and the mean of each of those values. Figure 6 shows output when summing 100 values, each of mean 100. Here is the code:

function central_tri(num2sum,mean) bin = zeros(1,num2sum*2*mean); for i=1:1000*mean

s = 0; for j=1:num2sum s=s+(3/2)*mean*sqrt(rand(1)); end number = ceil(s); bin(number) = bin(number) + 1; end plot(bin); axis([num2sum*mean*(1-.25) num2sum*mean*(1+.25) 0 max(bin)*1.2]); ylabel(’Points in bucket’); xlabel(’Bucket Number’); title(’Random values: Sum of Triangle Distributed Values’); fprintf(’Display of frequency of sum of triangle random values\n’);