Variance and Covariance of life Table Functions Estimated


NATIONAL CENTER FOR HEALTH STATISTICS

Series 2, Number 20

VITAL and HEALTH STATISTICS
DATA EVALUATION AND METHODS RESEARCH
Variance and Covariance of Life Table Functions Estimated From a Sample of Deaths

Formulas for the variance and covariance of functions of abridged and complete life tables based on a sample of deaths.

Washington, D.C.

March 1967

U.S. DEPARTMENT OF HEALTH, EDUCATION, AND WELFARE
John W. Gardner, Secretary

Public Health Service
William H. Stewart, Surgeon General

Public Health Service Publication No. 1000-Series 2-No. 20

For sale by the Superintendent of Documents, U.S. Government Printing Office, Washington, D.C., 20402 - Price 15 cents

NATIONAL CENTER FOR HEALTH STATISTICS
FORREST E. LINDER, PH.D., Director
THEODORE D. WOOLSEY, Deputy Director
OSWALD K. SAGEN, PH.D., Assistant Director
WALT R. SIMMONS, M.A., Statistical Advisor
ALICE M. WATERHOUSE, M.D., Medical Consultant
JAMES E. KELLY, D.D.S., Dental Advisor
LOUIS R. STOLCIS, M.A., Executive Officer
DONALD GREEN, Information Officer

OFFICE OF HEALTH STATISTICS ANALYSIS
IWAO M. MORIYAMA, PH.D., Chief

DIVISION OF VITAL STATISTICS
ROBERT D. GROVE, PH.D., Chief

DIVISION OF HEALTH INTERVIEW STATISTICS
PHILIP S. LAWRENCE, Sc.D., Chief

DIVISION OF HEALTH RECORDS STATISTICS
MONROE G. SIRKEN, PH.D., Chief

DIVISION OF HEALTH EXAMINATION STATISTICS
ARTHUR J. MCDOWELL, Chief

DIVISION OF DATA PROCESSING
LEONARD D. MCGANN, Chief
Library of Congress Catalog Card Number 66-62207

FOREWORD
Annually the National Center for Health Statistics prepares two sets of abridged life tables for the United States. In the regular abridged life tables the age-specific mortality rates are based on tabulations of all deaths occurring during the calendar year distributed by age, color, and sex. In the sample life tables the age-specific mortality rates are based on a 10-percent sample of deaths; these rates are available several months prior to the rates based on a complete count of deaths.
Formulas for the variance of the functions of life tables based on a complete count of deaths were derived several years ago. These formulas have appeared in several publications prepared by Dr. C. L. Chiang. However, formulas for the variance of the life table functions were not available for assessing the reliability of life tables based on a sample of deaths. Accordingly the National Center for Health Statistics invited Dr. Chiang to generalize his earlier work and make the results applicable to life tables based on a sample of deaths. This report presents his results and includes the formulas for variance and covariance of the life table functions based on a sample of deaths.
As the Center’s project director for this study, I identified the problem and worked with Dr. Chiang in the early stages of developing the stochastic model and in reviewing this manuscript for publication.
Monroe G. Sirken, Ph.D., Chief
Division of Health Records Statistics

CONTENTS
Foreword
I. Introduction
II. A Life Table Based on the Complete Count of Deaths
III. A Life Table Based on a Sample of Deaths
References

IN THIS REPORT formulas for the variance and covariance of functions of abridged and complete life tables based on a sample of deaths are derived. It is well known that life table functions, as any statistical quantities, are random variables. When a life table is constructed on the basis of a sample of deaths rather than on the total count, there is a component of sampling variation associated with the observed values. This component of variation must be assessed in making statistical inference regarding survival experience of a population as determined in such a table. In the formulas for the variance and covariance of estimates of the probability of dying ($q_i$), the survival rate ($p_{0i}$), and the expectation of life ($e_i$) presented in this paper, both the random variation and the sampling variation are taken into account.

VARIANCE AND COVARIANCE OF LIFE TABLE FUNCTIONS ESTIMATED
FROM A SAMPLE OF DEATHS
C. L. Chiang, Ph.D., School of Public Health, University of California, Berkeley

I. INTRODUCTION
The Federal Government has published annual abridged life tables for the United States since 1945 (reference 1). These tables are based on age-specific mortality rates computed from the total number of deaths occurring during the calendar year and from the estimate of the midyear population. In 1958 (reference 2) the Federal vital statistics agency established another series of annual abridged life tables based on a 10-percent sample of deaths with the objective of publishing annual life tables on a more current schedule.
Using a 10-percent sample rather than the total number of deaths makes very little difference in the numerical values of a life table, but it does increase the amount of variation associated with these values. This is because the life table functions as determined by a sample are subject to sampling variation and to the random variation present when the total count is used. The main functions of general interest are the probability of dying ($q_i$), the survival rate ($p_{0i}$), and the expectation of life ($e_i$). The purpose of this paper is to derive formulas for the variances of the estimates of these functions, taking into account both random variation and sampling variation. In section II we shall reproduce the corresponding formulas for these functions for the case where only random variation is considered.

II. A LIFE TABLE BASED ON THE COMPLETE COUNT OF DEATHS

In earlier studies of life table and mortality rates (references 3-5), probability distributions of life table functions and formulas for the corresponding variances were derived. Some of these formulas are reproduced below for easy reference; the original publications can be consulted for details.
In an abridged life table the life span is divided into $w+1$ intervals, each of which is defined by two exact ages $(x_i, x_{i+1})$, for $i = 0, 1, \ldots, w$, except for the last interval, which is usually a half-open interval such as "95 and over." The age $x_0$ may be taken as 0, the age at birth, and $x_w$ as the age at the beginning of the last interval. For the interval $(x_i, x_{i+1})$ let $n_i = x_{i+1} - x_i$ be the length of the interval; when $n_i = 1$ for all $i$, we have the complete life table. Let $D_i$ be the number dying in the age interval $(x_i, x_{i+1})$ during the calendar year and $P_i$ be the corresponding midyear population, so that the sum

$$D_0 + D_1 + \cdots + D_w = D \tag{1}$$

is the total number of deaths during the year and

$$P_0 + P_1 + \cdots + P_w = P \tag{2}$$

is the total midyear population. The ratio

$$M_i = \frac{D_i}{P_i} \tag{3}$$

is the age-specific mortality rate. Let $N_i$ be the number of individuals alive at exact age $x_i$ among whom the $D_i$ deaths occur so that

$$\hat{q}_i = \frac{D_i}{N_i} \tag{4}$$
is an estimate of the probability that an individual alive at exact age $x_i$ will die in the interval $(x_i, x_{i+1})$. The number $N_i$ is not directly observed but is estimated from $D_i$ and $P_i$. A meaningful concept of the relation between $N_i$ and $P_i$ from the standpoint of the life table is that $P_i$ is an estimate of the total number of years lived in the interval $(x_i, x_{i+1})$ by $N_i$ individuals. Let $a_i$ be the average fraction of the interval $(x_i, x_{i+1})$ lived by each of the $D_i$ individuals (reference 6). It can be seen that

$$P_i = n_i (N_i - D_i) + a_i n_i D_i. \tag{5}$$

The first term on the right side of (5) is the number of years lived by the $N_i - D_i$ survivors, and the second term is equal to the sum of the fractions of the interval lived by the $D_i$ individuals who die during the interval. Equation (5) can be rewritten as

$$N_i = \frac{1}{n_i}\left[P_i + (1 - a_i) n_i D_i\right]. \tag{6}$$

By substituting (6) in (4) we establish a basic relationship between the age-specific death rate ($M_i$) and the corresponding estimated probability of dying ($\hat{q}_i$):

$$\hat{q}_i = \frac{n_i M_i}{1 + (1 - a_i) n_i M_i}. \tag{7}$$

This formula has been used for the construction of life tables (reference 7) and has appeared in references 4 and 5.
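As a rough illustration of formulas (3), (6), and (7), the following Python sketch converts an age-specific death rate into an estimated probability of dying and checks that it equals the ratio of deaths to the number alive at the start of the interval. The function names and all numerical inputs are hypothetical and are not taken from the report.

```python
def prob_dying(n, M, a):
    """Formula (7): q = n*M / (1 + (1 - a)*n*M)."""
    if n <= 0 or M < 0 or not 0.0 <= a <= 1.0:
        raise ValueError("need n > 0, M >= 0, and 0 <= a <= 1")
    return n * M / (1.0 + (1.0 - a) * n * M)


def number_alive(n, P, D, a):
    """Formula (6): N = (P + (1 - a)*n*D) / n."""
    return (P + (1.0 - a) * n * D) / n


# Hypothetical 5-year interval: 2,000 deaths, a midyear population of
# 1,000,000, and deaths assumed to fall on average halfway through the interval.
n, D, P, a = 5, 2000.0, 1_000_000.0, 0.5
M = D / P                      # formula (3)
q = prob_dying(n, M, a)        # formula (7)
N = number_alive(n, P, D, a)   # formula (6)
print(q, D / N)                # the two values agree, as (4) requires
```

The agreement of `q` with `D / N` is only a consistency check on the algebra; it says nothing about how well the assumed fraction `a` fits real data.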
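To derive the formula for the variance of $\hat{q}_i$, we only need to observe that $D_i$ is a binomial random variable in $N_i$ "trials" with the binomial probability $q_i$. Accordingly, $D_i$ has the expectation

$$E(D_i) = N_i q_i \tag{8}$$

and the variance

$$\sigma^2_{D_i} = N_i q_i (1 - q_i). \tag{9}$$

Therefore, the sample variance of $\hat{q}_i$ is

$$S^2_{\hat{q}_i} = \frac{1}{N_i}\,\hat{q}_i (1 - \hat{q}_i) = \frac{1}{D_i}\,\hat{q}_i^{\,2} (1 - \hat{q}_i). \tag{10}$$

Substituting (7) in (10) yields the required formula

$$S^2_{\hat{q}_i} = \frac{n_i^2 M_i (1 - a_i n_i M_i)}{P_i \left[1 + (1 - a_i) n_i M_i\right]^3}. \tag{11}$$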
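The step from the binomial variance (9) to formula (11) can be checked numerically. The sketch below assumes the sample variance in (10) takes the form $\hat{q}_i^{\,2}(1-\hat{q}_i)/D_i$, which follows from dividing (9) by $N_i^2$ and using $N_i = D_i/\hat{q}_i$, and assumes (11) is that same quantity written in terms of $M_i$ and $P_i$ with $n_i^2$ in the numerator. Function names and inputs are hypothetical.

```python
def var_q_from_counts(q, D):
    """Formula (10), assumed form: S^2 = q^2 * (1 - q) / D."""
    return q * q * (1.0 - q) / D


def var_q_from_rate(n, M, P, a):
    """Formula (11), assumed form:
    S^2 = n^2 * M * (1 - a*n*M) / (P * [1 + (1 - a)*n*M]^3)."""
    return n * n * M * (1.0 - a * n * M) / (P * (1.0 + (1.0 - a) * n * M) ** 3)


# Hypothetical inputs for one 5-year age interval.
n, D, P, a = 5, 2000.0, 1_000_000.0, 0.5
M = D / P                               # formula (3)
q = n * M / (1.0 + (1.0 - a) * n * M)   # formula (7)

v10 = var_q_from_counts(q, D)
v11 = var_q_from_rate(n, M, P, a)
print(v10, v11)  # the two expressions agree up to rounding error
```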

The covariance between $\hat{q}_i$ and $\hat{q}_j$ for two age intervals is equal to zero as shown in reference 3. Formula (11) is very important in the statistical analysis of a life table since the formulas for the variances of other life table functions can all be expressed in terms of the variance of $\hat{q}_i$. Two important examples follow.
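1. In a current life table, the proportion of survivors from age 0 to age $x_\alpha$,

$$\hat{p}_{0\alpha} = \frac{l_\alpha}{l_0}, \tag{12}$$

is actually computed from

$$\hat{p}_{0\alpha} = \hat{p}_0 \hat{p}_1 \cdots \hat{p}_{\alpha-1} = (1 - \hat{q}_0)(1 - \hat{q}_1) \cdots (1 - \hat{q}_{\alpha-1}), \tag{13}$$

where $\hat{p}_i = 1 - \hat{q}_i$ is an estimate of the probability of surviving the interval $(x_i, x_{i+1})$. The formula for the sample variance of $\hat{p}_{0\alpha}$ is

$$S^2_{\hat{p}_{0\alpha}} = \hat{p}_{0\alpha}^{\,2} \sum_{i=0}^{\alpha-1} \hat{p}_i^{\,-2} S^2_{\hat{q}_i}. \tag{14}$$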

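2. The observed expectation of life at age $x_\alpha$ may be expressed as

$$\hat{e}_\alpha = a_\alpha n_\alpha + c_{\alpha+1}\,\hat{p}_{\alpha,\alpha+1} + c_{\alpha+2}\,\hat{p}_{\alpha,\alpha+2} + \cdots + c_w\,\hat{p}_{\alpha w}, \tag{15}$$

where $c_i = (1 - a_{i-1})\, n_{i-1} + a_i n_i$.

The sample variance of $\hat{e}_\alpha$ is given by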

(16)

The derivation of formulas (14) and (17) is given in references 3 and 4 and will not be repeated here.
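The first example above can be sketched in a few lines of Python. The code assumes that the survival proportion is the product of the interval survival probabilities $1 - \hat{q}_i$, as in (13), and that its sample variance is the squared survival proportion times the sum of $S^2_{\hat{q}_i}/(1-\hat{q}_i)^2$ over the intervals crossed, as in (14). The function name and the input values are hypothetical.

```python
def survival_and_variance(q, var_q, alpha):
    """Return (p_0alpha, S^2 of p_0alpha) following formulas (13) and (14).

    q      -- estimated probabilities of dying, one per interval
    var_q  -- sample variances of those estimates
    alpha  -- index of the age x_alpha up to which survival is measured
    """
    p0a = 1.0
    for i in range(alpha):
        p0a *= 1.0 - q[i]                      # formula (13)
    weighted = sum(var_q[i] / (1.0 - q[i]) ** 2 for i in range(alpha))
    return p0a, p0a ** 2 * weighted            # formula (14)


# Hypothetical values for the first three age intervals.
q = [0.02, 0.005, 0.01]
var_q = [4e-8, 1e-8, 2e-8]

p03, var_p03 = survival_and_variance(q, var_q, 3)
p01, var_p01 = survival_and_variance(q, var_q, 1)
print(p03, var_p03)
```

With a single interval the formula collapses to the variance of $\hat{q}_0$ itself, which gives a quick sanity check on an implementation.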

III. A LIFE TABLE BASED ON A SAMPLE OF DEATHS

In the construction of the new series of life tables, the actual number ($D_i$) of deaths is not observed directly but is estimated by a sampling procedure. The estimated value of $D_i$ is then used as the basis for the computation of the mortality rate and the life table functions. We have then a sample of size $d$ taken without replacement from the total of $D$ death certificates such that

$$\frac{d}{D} = f \tag{18}$$

is a preassigned sampling fraction. In the new series described here, $f = .10$. Depending upon the distribution by age at death, a number $d_i$ of deaths in the sample will fall into the age interval $(x_i, x_{i+1})$ with

$$d_0 + d_1 + \cdots + d_w = d. \tag{19}$$

The number $D_i$ is estimated from

$$\hat{D}_i = \frac{d_i}{f} \tag{20}$$

and the age-specific mortality rate from

$$\hat{M}_i = \frac{\hat{D}_i}{P_i} = \frac{d_i}{f P_i}. \tag{21}$$

It is clear that $d_i$ (or $\hat{D}_i$) and hence $\hat{M}_i$ are subject to both random variation and sampling variation. As a result, the formula for the variance of $\hat{q}_i$ as presented in section II must be revised and the covariance between $\hat{q}_i$ and $\hat{q}_j$ will need to be evaluated. Our first step is to derive the formulas for the variance of $d_i$ and the covariance between $d_i$ and $d_j$.
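The estimation step described above amounts to inflating the sampled death counts by the reciprocal of the sampling fraction. A minimal Python sketch follows, assuming the estimators are $\hat{D}_i = d_i/f$ and $\hat{M}_i = d_i/(fP_i)$; the function name, the sampled counts, and the populations are hypothetical.

```python
def estimate_from_sample(d, P, f):
    """Estimate deaths and mortality rates from a sample of deaths.

    d -- sampled deaths by age interval
    P -- midyear populations by age interval
    f -- preassigned sampling fraction, 0 < f <= 1
    """
    if not 0.0 < f <= 1.0:
        raise ValueError("sampling fraction must lie in (0, 1]")
    if len(d) != len(P):
        raise ValueError("d and P must have the same length")
    D_hat = [di / f for di in d]                      # formula (20)
    M_hat = [di / (f * Pi) for di, Pi in zip(d, P)]   # formula (21)
    return D_hat, M_hat


# Hypothetical 10-percent sample over three age intervals.
d = [210, 45, 180]
P = [400_000.0, 1_500_000.0, 900_000.0]
D_hat, M_hat = estimate_from_sample(d, P, 0.10)
print(D_hat, M_hat)
```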

The variance of $d_i$. The probability distribution of $d_i$ depends upon the total number of deaths $D_i$ in the age interval $(x_i, x_{i+1})$ and the total number of deaths $D$ at all ages. Relative to $D_i$ and $D$, the variance of $d_i$ may be written as

$$\sigma^2_{d_i} = \sigma^2_{E(d_i \mid D_i, D)} + E\,\sigma^2_{d_i \mid D_i, D}. \tag{22}$$

The first term on the right side of (22) is the variance of the conditional expectation of $d_i$ given $D_i$ and $D$, and the second term is the expectation of the variance of the conditional distribution of $d_i$ given $D_i$ and $D$. We shall discuss them separately.
Since the sample is taken without replacement, the conditional distribution of $d_i$ given $D_i$ and $D$ is hypergeometric with the expectation

$$E(d_i \mid D_i, D) = d\,\frac{D_i}{D} = f D_i. \tag{23}$$

Using formulas (23) and (9), we compute the variance of $E(d_i \mid D_i, D)$,

$$\sigma^2_{E(d_i \mid D_i, D)} = f^2 \sigma^2_{D_i} = f^2 N_i q_i (1 - q_i). \tag{24}$$

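For the last term in equation (22), we make use of the well-known theorem in the hypergeometric distribution to write

$$\sigma^2_{d_i \mid D_i, D} = d\,\frac{D_i}{D}\left(1 - \frac{D_i}{D}\right)\frac{D - d}{D - 1} \approx d\,\frac{D_i}{D}\left(1 - \frac{D_i}{D}\right)\frac{D - d}{D} \tag{25}$$

since the number $D$ is usually large. We now rewrite (25)

$$\sigma^2_{d_i \mid D_i, D} = \frac{d}{D}\left(1 - \frac{d}{D}\right)\left(D_i - \frac{D_i^2}{D}\right) = f(1 - f)\left(D_i - \frac{D_i^2}{D}\right) \tag{26}$$

and its expectation

$$E\,\sigma^2_{d_i \mid D_i, D} = f(1 - f)\left[E(D_i) - E\!\left(\frac{D_i^2}{D}\right)\right]. \tag{27}$$

The first expectation inside the brackets was given in (8), and the second expectation may be rewritten as

$$E\!\left(\frac{D_i^2}{D}\right) = E\!\left[\frac{1}{D}\,E(D_i^2 \mid D)\right]. \tag{28}$$

Our problem is to find the conditional expectation $E(D_i^2 \mid D)$.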
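To see how little is lost by the large-$D$ approximation used above, the following Python sketch compares the exact hypergeometric variance, which carries the finite-population factor $(D-d)/(D-1)$, with the approximate form $f(1-f)(D_i - D_i^2/D)$. The function names and the counts are hypothetical and are chosen only to be of a plausible order of magnitude.

```python
def hypergeom_var_exact(d, D_i, D):
    """Exact conditional variance of d_i: d * s * (1 - s) * (D - d)/(D - 1),
    where s = D_i / D is the share of all deaths falling in interval i."""
    share = D_i / D
    return d * share * (1.0 - share) * (D - d) / (D - 1)


def hypergeom_var_approx(f, D_i, D):
    """Large-D approximation: f * (1 - f) * (D_i - D_i^2 / D)."""
    return f * (1.0 - f) * (D_i - D_i ** 2 / D)


# Hypothetical totals: 1,800,000 deaths, 40,000 of them in interval i,
# and a sample of 180,000 certificates (a 10-percent sample).
D, D_i, d = 1_800_000, 40_000, 180_000
f = d / D

exact = hypergeom_var_exact(d, D_i, D)
approx = hypergeom_var_approx(f, D_i, D)
print(exact, approx, exact / approx)
```

The ratio of the two values is $D/(D-1)$, so for totals of this size the relative error of the approximation is well under one part in a million.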

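To avoid confusion in notation, let us consider the particular case in which $i = 0$ and write the conditional expectation

$$E(D_0^2 \mid D) = \sum_{\delta_0} \delta_0^2 \,\Pr\{D_0 = \delta_0 \mid D\}, \tag{29}$$

where $\delta_0$ is the value that the random variable $D_0$ takes on and $\Pr\{D_0 = \delta_0 \mid D\}$ is the corresponding conditional probability. Using Bayes' law,

$$\Pr\{B \mid A\} = \frac{\Pr\{B\}\,\Pr\{A \mid B\}}{\Pr\{A\}}, \tag{30}$$

we may rewrite the conditional probability as

$$\Pr\{D_0 = \delta_0 \mid D\} = \frac{\Pr\{D_0 = \delta_0\}\,\Pr\{D_1 + \cdots + D_w = D - \delta_0\}}{\Pr\{D_0 + D_1 + \cdots + D_w = D\}} = \frac{\Pr\{D_0 = \delta_0\} \sum \prod_{i=1}^{w} \Pr\{D_i = \delta_i\}}{\sum \prod_{i=0}^{w} \Pr\{D_i = \delta_i\}}, \tag{31}$$

where the sum in the numerator is taken over all possible values of $\delta_i$ so that

$$\delta_1 + \cdots + \delta_w = D - \delta_0 \tag{32}$$

and in the denominator so that

$$\delta_0 + \delta_1 + \cdots + \delta_w = D. \tag{33}$$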

The probability distribution of each $D_i$ is binomial with the probability distribution

$$\Pr\{D_i = \delta_i\} = \frac{N_i!}{\delta_i!\,(N_i - \delta_i)!}\; q_i^{\delta_i} (1 - q_i)^{N_i - \delta_i}. \tag{34}$$

When (34) is substituted, formula (31) becomes very unwieldy. Ordinarily, however, the probability $q_i$ is small and $N_i$ is large, so that the Poisson distribution should be a good approximation; this will give us

$$\Pr\{D_i = \delta_i\} = \frac{e^{-\lambda_i} \lambda_i^{\delta_i}}{\delta_i!}, \tag{35}$$

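where, for simplicity,

$$\lambda_i = N_i q_i = E(D_i) \tag{36}$$

is the expected number of deaths in the age interval $(x_i, x_{i+1})$ as given in (8). Now formula (35) is substituted in the last expression of (31) to give the sum in the numerator

$$\frac{e^{-(\lambda_1 + \cdots + \lambda_w)}}{(D - \delta_0)!}\,(\lambda_1 + \cdots + \lambda_w)^{D - \delta_0}, \tag{37}$$

the denominator

$$\frac{e^{-(\lambda_0 + \lambda_1 + \cdots + \lambda_w)}}{D!}\,(\lambda_0 + \lambda_1 + \cdots + \lambda_w)^{D}, \tag{38}$$

and, finally, after simplification, the conditional probability

$$\Pr\{D_0 = \delta_0 \mid D\} = \frac{D!}{\delta_0!\,(D - \delta_0)!} \left(\frac{\lambda_0}{\lambda_0 + \cdots + \lambda_w}\right)^{\delta_0} \left(1 - \frac{\lambda_0}{\lambda_0 + \cdots + \lambda_w}\right)^{D - \delta_0}. \tag{39}$$

Formula (39) shows that the conditional distribution of $D_0$ given $D$ is binomial with the proportion of the expected number of deaths in the age interval $(x_0, x_1)$,

$$\frac{\lambda_0}{\lambda_0 + \lambda_1 + \cdots + \lambda_w} = \frac{N_0 q_0}{\sum_{k=0}^{w} N_k q_k}, \tag{40}$$

as the binomial probability. It follows that

$$E(D_0^2 \mid D) = D\,\frac{N_0 q_0}{\sum_{k} N_k q_k}\left(1 - \frac{N_0 q_0}{\sum_{k} N_k q_k}\right) + D^2 \left(\frac{N_0 q_0}{\sum_{k} N_k q_k}\right)^{2}. \tag{41}$$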
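The claim that independent Poisson counts, conditioned on their total, follow a binomial distribution can be verified exactly for small numbers. The Python sketch below applies Bayes' law with Poisson terms, using the fact that a sum of independent Poisson variables is again Poisson, and compares the result with a binomial distribution whose probability is the share $\lambda_0/(\lambda_0 + \cdots + \lambda_w)$. It then checks the second moment against the binomial expression $D\pi(1-\pi) + D^2\pi^2$. The function names and the small expected counts are hypothetical and chosen only to keep the arithmetic exact enough to compare.

```python
from math import comb, exp, factorial


def poisson_pmf(k, lam):
    """Poisson probability of k events when the expected count is lam."""
    return exp(-lam) * lam ** k / factorial(k)


def conditional_pmf(delta0, D, lam0, lam_rest):
    """Pr{D_0 = delta0 | D_0 + rest = D} by Bayes' law with Poisson terms.
    lam_rest is the summed expected count of all other age intervals."""
    numerator = poisson_pmf(delta0, lam0) * poisson_pmf(D - delta0, lam_rest)
    denominator = poisson_pmf(D, lam0 + lam_rest)
    return numerator / denominator


def binomial_pmf(delta0, D, pi0):
    """Binomial probability of delta0 successes in D trials."""
    return comb(D, delta0) * pi0 ** delta0 * (1.0 - pi0) ** (D - delta0)


# Hypothetical small case: 3 expected deaths in interval 0, 9 elsewhere,
# and an observed total of 12 deaths.
lam0, lam_rest, D = 3.0, 9.0, 12
pi0 = lam0 / (lam0 + lam_rest)

cond = [conditional_pmf(k, D, lam0, lam_rest) for k in range(D + 1)]
binom = [binomial_pmf(k, D, pi0) for k in range(D + 1)]
second_moment = sum(k * k * p for k, p in enumerate(cond))
print(max(abs(c - b) for c, b in zip(cond, binom)), second_moment)
```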