Ann Inst Stat Math (2008) 60:591–603 DOI 10.1007/s10463-007-0119-3
Some results on lower variance bounds useful in reliability modeling and estimation
N. Unnikrishnan Nair · K. K. Sudheesh
Received: 16 November 2005 / Revised: 29 May 2006 / Published online: 30 May 2007 © The Institute of Statistical Mathematics, Tokyo 2007
Abstract In the present paper a general theorem is proposed that links characterizations of discrete life distributions based on relationships between the failure rate and conditional expectations with those in terms of Chernoff-type inequalities. Exact expressions for lower bounds to the variance are calculated for distributions belonging to the modified power series family, the Ord family and mixture geometric models. It is shown that the bounds obtained here contain the Cramer–Rao and Chapman–Robbins inequalities as special cases. An application of the results to real data is also provided.
Keywords Characterizations · Chernoff-type inequalities · Failure rate · Unbiased estimation
1 Introduction
Several papers in the literature address the problem of characterizing discrete probability distributions through relations between conditional expectations and failure rates or reversed failure rates. Another independent stream of thought is to characterize the same class of distributions through lower bounds on the variance of random variables given by Chernoff-type inequalities satisfying specific conditions. Alharbi and Shanbhag (1996) point out the application of the latter results in characterizing life distributions through a result similar to the Cox representation of the survival function in terms of the failure rate, and suggest some continuous distributions as illustrations. In the present paper we establish a general characterization theorem that combines the results available in the
N. Unnikrishnan Nair · K. K. Sudheesh (B)
Department of Statistics, Cochin University of Science and Technology, Cochin 22, India e-mail: [email protected]


two approaches described above. By doing so, the theorem provides criteria for modeling lifetime data from among a large class of distributions and also for inferring the parameters contained in the model. Apart from the reliability context, the results provide an alternative methodology for unbiased estimation that yields conclusions identical to those of the Cramer–Rao inequality in regular cases and ensures attainment of the minimum variance given by the Chapman–Robbins inequality in non-regular cases.
The concepts and definitions required for the work in the subsequent sections consist of a class A of discrete probability distributions supported on the set N of non-negative integers and the set B of real-valued functions c(.) of a random variable X defined on N having finite variance, along with

m(x) = E(h(X) | X > x),   r(x) = E(h(X) | X < x)

for a function h(x) ∈ B such that E(h²(X)) < ∞, E(h(X)) = µ and V(h(X)) = σ². Throughout, Δ denotes the forward difference operator, Δc(x) = c(x + 1) − c(x). In the above formulation f(x), F(x) and R(x) = P(X ≥ x) denote respectively the probability mass function, distribution function and survival function of X, so that

k(x) = f(x)/R(x)    (1)

and

λ(x) = f(x)/F(x)    (2)
are the failure rate and reversed failure rate of X, respectively. Of these, the reversed failure rate, which has been receiving considerable attention recently (see Block et al. 1998; Gupta and Nanda 2001; Nair and Asha 2004; Nanda and Sengupta 2005), provides some interesting extensions hitherto not discussed in earlier papers.
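As a quick numerical companion to these definitions, the following Python sketch (ours, not part of the paper) computes F, R, k(x), λ(x), m(x) and r(x) from a pmf supplied as a list; the function name reliability_quantities and the truncated geometric example are illustrative choices only.

def reliability_quantities(f, h=lambda x: x):
    N = len(f) - 1
    F = [sum(f[:x + 1]) for x in range(N + 1)]                    # F(x) = P(X <= x)
    R = [sum(f[x:]) for x in range(N + 1)]                        # R(x) = P(X >= x)
    k = [f[x] / R[x] for x in range(N + 1)]                       # failure rate, Eq. (1)
    lam = [f[x] / F[x] for x in range(N + 1)]                     # reversed failure rate, Eq. (2)
    m = [sum(h(y) * f[y] for y in range(x + 1, N + 1)) / R[x + 1] if x < N else float('nan')
         for x in range(N + 1)]                                   # m(x) = E(h(X) | X > x)
    r = [sum(h(y) * f[y] for y in range(x)) / F[x - 1] if x > 0 else float('nan')
         for x in range(N + 1)]                                   # r(x) = E(h(X) | X < x)
    return F, R, k, lam, m, r

# illustrative example: a geometric law, truncated far out in the tail
p, N = 0.3, 60
f = [p * (1 - p) ** x for x in range(N + 1)]
f[-1] += 1 - sum(f)                           # absorb the remaining tail mass
F, R, k, lam, m, r = reliability_quantities(f)
print(round(k[0], 6), round(m[0], 6))         # k(0) = p and m(0) = 1/p for a geometric law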
The research relating to characterization of distributions by bounds on the variance of a random variable had its origin in Chernoff’s (1981) inequality for the normal distribution. Characterizations based on various extensions of this inequality in the continuous and discrete cases have been obtained by Borovkov and Utev (1983), Cacoullos and Papathanasiou (1985, 1989, 1992, 1995, 1997), Srivastava and Sreehari (1987, 1990), Prakasa Rao and Sreehari (1986, 1987, 1997), Sumitra and Bhandari (1990), Korwar (1991), Papathanasiou (1993), Alharbi and Shanbhag (1996) and Borzadaran and Shanbhag (1998). Among these, Cacoullos and Papathanasiou (1997) established that for a non-negative integer valued random variable X
V(c(X)) ≥ E²(z(X) Δc(X)) / E(z(X) Δh(X))    (3)


for every c(.) in B if and only if

z(x) = (1/f(x)) Σ_{y=0}^{x} (µ − h(y)) f(y),    (4)

where all the functions and notations in the above expressions are as defined earlier.
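The inequality (3)–(4) is easy to check numerically. The sketch below (ours, not from the paper) builds z(x) from (4) for a Poisson pmf with h(x) = x and compares V(c(X)) with the right side of (3) for the arbitrary test function c(x) = x²; the truncation point N is purely an implementation convenience.

from math import exp, factorial

lam, N = 2.5, 40
f = [exp(-lam) * lam ** x / factorial(x) for x in range(N + 1)]
mu = sum(x * f[x] for x in range(N + 1))                                        # E h(X), h(x) = x
z = [sum((mu - y) * f[y] for y in range(x + 1)) / f[x] for x in range(N + 1)]   # Eq. (4)

def E(func):
    return sum(func(x) * f[x] for x in range(N + 1))

c = lambda x: x ** 2
dc = lambda x: c(x + 1) - c(x)          # forward difference of c
dh = lambda x: 1                        # forward difference of h(x) = x

var_c = E(lambda x: c(x) ** 2) - E(c) ** 2
bound = E(lambda x: z[x] * dc(x)) ** 2 / E(lambda x: z[x] * dh(x))              # right side of (3)
print(round(var_c, 4), round(bound, 4))   # V(c(X)) exceeds the bound; for the Poisson law z(x) = lam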
Modeling lifetimes through relations between conditional expectations and failure rates in the discrete time domain was initiated by Osaki and Li (1988), who proved such a result for the negative binomial distribution. This was followed by a similar result by Ahmed (1991) concerning the binomial and Poisson distributions. In a more general framework, Nair and Sankaran (1991) showed that X follows the Ord family of distributions satisfying

(f(x+1) − f(x)) / f(x) = −(x + d) / (a0 + a1x + a2x²)    (5)

if and only if

E(X | X > x) = µ + (c0 + c1x + c2x²) k(x),

where ci = (1 − 2a2)^{−1} ai, a2 ≠ 1/2, i = 0, 1, 2, and deduced the formulas of the earlier researchers as particular cases. While all these results take h(x) = x, Glanzel (1991) involved higher order conditional moments in the form

E(X² | X > x) = P(x) E(X | X ≥ x) + Q(x),

where P(x) and Q(x) are polynomials of degree at most one with real coefficients, to characterize (5). In a further generalization aimed at including more distributions than in (5), Sindhu (2002) (see also Sankaran and Nair 2002) replaced the linear function in the numerator on the right of (5) with a quadratic function b0 + b1x + b2x² to claim the characteristic property

b2 E(X² | X > x) + (b1 + 2a2) E(X | X ≥ x) + a1 − a2 + (a0 + a1x + a2x²) k(x+1) = 0.
Further results regarding the exponential family are available in Consul (1995), with other extensions in Ruiz and Navarro (1994) and Navarro et al. (1998).
In Sect. 2, we present a general theorem that establishes the link between the two streams of characterizations reviewed above, by showing that the identities connecting (reversed) failure rates and left (right) truncated means are necessary and sufficient conditions for the existence of a lower bound to the


variance. The expression for the lower bound is calculated for discrete distributions belonging to the modified power series family, the Ord family and mixtures of geometric distributions, which cover most of the discrete lifetime models used in practice. Section 3 explains the application of the results in unbiased estimation of parametric functions. It is shown that the bound obtained in Sect. 2 contains as particular cases the Cramer–Rao and Chapman–Robbins inequalities. In Sect. 4 we discuss how the results can be used in a practical situation through an illustration with Poisson data.

2 Main result

As seen from the deliberations in Sect. 1, the reversed hazard rate and right truncated means, which are more appropriate when the observations are truncated from above, do not appear in the characterizations mentioned there. We present a general result that meets this objective and also subsumes most of the existing results.
Theorem 1 Let X be a discrete random variable supported on N or a subset thereof and g(.), c(.), h(.) be functions in B such that E(c²(X)) < ∞, E(Δc(X) g(X)) < ∞ and E(h²(X)) < ∞. Then for every c(x) ∈ B and some g(x) and h(x), the following statements are equivalent.
(i)

f(x+1) / f(x) = σ g(x) / [σ g(x+1) − µ + h(x+1)],   x = 0, 1, 2, . . .    (6)

with g(0) = (µ − h(0))/σ and f(0) evaluated from Σ_{x=0}^{∞} f(x) = 1;

(ii)

r(x+1) = µ − σ λ(x) g(x)    (7)

(iii)

m(x) = µ + σ k(x) g(x) / (1 − k(x))    (8)

(iv)

V(c(X)) ≥ E²(g(X) Δc(X))    (9)

provided E(g(X) Δh(X)) = σ. Here µ and σ² denote respectively the mean and variance of h(X), and Δc(x) = c(x + 1) − c(x).


Proof Assuming (6)

h(x) f(x) = µ f(x) + σ f(x−1) g(x−1) − σ f(x) g(x).

Summation from 1 to x and the use of the value of g(0) from (i) leads to

σ f(x) g(x) = Σ_{y=0}^{x} (µ − h(y)) f(y)

or

σ f(x) g(x) = µ F(x) − F(x) r(x+1).

Dividing by F(x), we reach (7). Retracing the steps we get (6). Thus (i) ⇔ (ii). Now, from

r(x + 1)F(x) + m(x)(1 − F(x)) = µ

one can solve for F(x) and R(x), and then use (1) and (2) to reach the identity

(µ − r(x+1)) / λ(x) = (m(x) − µ)(1 − k(x)) / k(x)

which proves the equivalence of (ii) and (iii). From the results of Cacoullos and Papathanasiou (1997) stated at (3) and (4), we take z(x) = σ g(x) and obtain

V(c(X)) ≥ σ E²(g(X) Δc(X)) / E(g(X) Δh(X))

if and only if (ii) or equivalently (iii) is satisfied. Further

E(g(X) Δh(X)) = Σ_{x=0}^{∞} (h(x+1) − h(x)) g(x) f(x)

= σ^{−1} Σ_{x=0}^{∞} (h(x+1) − h(x)) Σ_{y=0}^{x} (µ − h(y)) f(y)

= σ^{−1} Σ_{x=0}^{∞} h(x) (h(x) − µ) f(x)

= σ^{−1} V(h(X)) = σ.

This proves that (ii) ⇔(iii) ⇔(iv). Since (iv)⇒(ii) ⇒(i), the chain of implications in the theorem is complete.
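As an illustration of the chain (i)–(iv), the following sketch (ours, not the authors') takes the Poisson case noted in Remark 4 below, with h(x) = x, µ = λ, σ = λ^{1/2} and g(x) ≡ λ^{1/2}: statement (i) is used to rebuild the pmf from the recursion (6), and statement (ii) is then verified directly.

from math import sqrt

lam, N = 3.0, 60
mu, sigma = lam, sqrt(lam)
g = lambda x: sqrt(lam)          # constant g(x), the Poisson case of Remark 4
h = lambda x: x

# (i): rebuild the pmf from f(x+1)/f(x) = sigma*g(x) / (sigma*g(x+1) - mu + h(x+1)),
# fixing f(0) afterwards by normalisation
f = [1.0]
for x in range(N):
    f.append(f[x] * sigma * g(x) / (sigma * g(x + 1) - mu + h(x + 1)))
total = sum(f)
f = [v / total for v in f]       # a (truncated) Poisson(lam) pmf

# (ii): r(x+1) = E(h(X) | X <= x) should equal mu - sigma*lambda(x)*g(x), with lambda(x) = f(x)/F(x)
F = [sum(f[:x + 1]) for x in range(N + 1)]
for x in range(8):
    lhs = sum(y * f[y] for y in range(x + 1)) / F[x]
    rhs = mu - sigma * (f[x] / F[x]) * g(x)
    print(round(lhs, 8), round(rhs, 8))      # the two columns agree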


Remarks

1. The equality in (9) holds if and only if c(x) is linear in h(x).
2. E(g(X)) = σ^{−1} Cov(X, h(X)).
This follows from

E(g(X)) = σ^{−1} Σ_{x=0}^{∞} Σ_{y=0}^{x} (µ − h(y)) f(y)

= σ^{−1} Σ_{x=0}^{∞} Σ_{y=x+1}^{∞} (h(y) − µ) f(y)

= σ^{−1} Σ_{x=0}^{∞} x (h(x) − µ) f(x).

3. The value of g(x) is unique for a particular choice of h(x). But we can have different forms for g(x) for the same distribution, when h(x) is different.
4. For a given h(x) the value of g(x) characterizes the distribution of X. Thus for h(x) = x, the random variable X has the Poisson distribution in the class A if and only if g(x) ≡ λ^{1/2} for all x.
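Remark 2 is straightforward to verify numerically; a small sketch (ours, not from the paper) with a binomial pmf and the non-linear choice h(x) = x² is given below, where g(x) is obtained from (4) as z(x)/σ.

from math import comb, sqrt

n, p = 12, 0.35
f = [comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)]
h = lambda x: x * x

mu = sum(h(x) * f[x] for x in range(n + 1))
sigma = sqrt(sum((h(x) - mu) ** 2 * f[x] for x in range(n + 1)))
g = [sum((mu - h(y)) * f[y] for y in range(x + 1)) / (sigma * f[x]) for x in range(n + 1)]

Eg = sum(g[x] * f[x] for x in range(n + 1))
EX = sum(x * f[x] for x in range(n + 1))
cov = sum(x * h(x) * f[x] for x in range(n + 1)) - EX * mu
print(round(Eg, 10), round(cov / sigma, 10))    # the two numbers coincide, as claimed in Remark 2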
We now consider some illustrations of the above results. Since for modeling and inference, families of distributions are more desirable and results for individual distributions can be easily deduced, we look at the modified power series family (MPSD) and the Ord family, which together cover most of the discrete distributions in common use. The MPSD is defined by probability mass functions of the form
f(x) = a(x)[u(θ)]^x / A(θ),
where X ∈ N or a subset of N, a(x) ≥ 0, u(θ ) and A(θ ) are positive, finite and differentiable. From
A(θ) F(x) = Σ_{y=0}^{x} a(y) (u(θ))^y

successive differentiation w.r.t. θ (denoted by primes) yields

r(x+1) = µ [1 + (log F(x))' / (log A(θ))']    (10)

(10)

when h(x) = x and

r(x+1) = (log F)'' + [(log F)']² + 2 (log F)' (log A(θ))'    (11)


when h(x) = [u'(θ)/u(θ)]² x(x−1) + [u''(θ)/u(θ)] x − A''(θ)/A(θ).

Using the expressions for E(X) and the recurrence relations for moments in Johnson et al. (1992),

g(x) = −(µ/σ)(A(θ)/A'(θ))(1/f(x)) ∂F/∂θ = −(u(θ)/u'(θ))(1/(σ f(x))) ∂F/∂θ    (12)

corresponding to (10) and

g(x) = −(1/(σ f(x))) [∂²F/∂θ² + 2 (A'(θ)/A(θ)) ∂F/∂θ]    (13)

corresponding to (11), since in that case E(h(X)) = µ = 0. The results of Osaki and Li (1988), Ahmed (1991), Consul (1995), etc. can be derived as particular cases of (10) for specific distributions, through the use of (8) and the g(x) values given above. Further, these results also suggest that characterizations in terms of the relationship between failure rate and mean residual life are not independent of those between the corresponding reversed concepts. But the two sets of results are useful in their own right depending on whether the data is left or right truncated.
By virtue of Theorem 1 and the above discussion, we conclude that for the MPSD

inf_{c(x) ∈ B} V(c(X)) / E²(g(X) Δc(X)) = 1

with g(x) as stated in (12) and (13) for the prescribed values of h(x).

The implication of these results in unbiased estimation of θ will be discussed in the next section. It may be noted that, though (12) appears to be complicated, it ends up with simple forms for various members. For example, in the case of the binomial (n, p), Poisson, Borel–Tanner and Geeta distributions, the values of g(x) are (n − x) p^{1/2} (nq)^{−1/2}, λ^{1/2}, −[θµ/(nσ f(x))] ∂F/∂θ and −[θµ/(σ f(x))] ∂F/∂θ, respectively. When u(θ) = θ in the above equations, the results for the sub-class of generalized power series distributions can be obtained.
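The closed forms quoted above for the binomial and Poisson members can be checked directly against the defining sum (4); the following sketch (ours, not from the paper) does so with g(x) = z(x)/σ and h(x) = x.

from math import comb, exp, factorial, sqrt

def g_from_pmf(f):
    mu = sum(x * f[x] for x in range(len(f)))
    sigma = sqrt(sum((x - mu) ** 2 * f[x] for x in range(len(f))))
    return [sum((mu - y) * f[y] for y in range(x + 1)) / (sigma * f[x]) for x in range(len(f))]

n, p = 10, 0.4
q = 1 - p
f_bin = [comb(n, x) * p ** x * q ** (n - x) for x in range(n + 1)]
print([round(v, 6) for v in g_from_pmf(f_bin)[:4]])
print([round((n - x) * sqrt(p) / sqrt(n * q), 6) for x in range(4)])   # (n - x) p^(1/2) (nq)^(-1/2)

lam, N = 2.0, 40
f_poi = [exp(-lam) * lam ** x / factorial(x) for x in range(N + 1)]
print([round(v, 6) for v in g_from_pmf(f_poi)[:4]], round(sqrt(lam), 6))   # lam^(1/2)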

For the Ord family of distributions specified by (5), from Nair and Sankaran (1991) (when h(x) = x),

r(x+1) = µ [1 − (b0 + b1x + b2x²) λ(x)]

with bi = µ^{−1}(1 − 2a2)^{−1} ai, a2 ≠ 1/2, i = 0, 1, 2. Thus for the family

g(x) = µ σ^{−1} (b0 + b1x + b2x²)


and

inf_{c(x) ∈ B} σ² V(c(X)) / E²[(c0 + c1X + c2X²) Δc(X)] = 1.

where ci = (1 − 2a2)^{−1} ai, a2 ≠ 1/2, i = 0, 1, 2. Generally for the Ord family, g(x) turns out to be a polynomial, e.g. linear for the binomial, quadratic for the hypergeometric and discrete t models (see also Korwar 1991).
We now show that Theorem 1 can be applied to some finite mixture of discrete distributions as well. The mixture of geometric laws
f(x) = α p1 q1^x + (1 − α) p2 q2^x,   0 < pi < 1, qi = 1 − pi, i = 1, 2;  0 ≤ α ≤ 1;  x ∈ N

is characterized by Nair et al. (1999) through

E(X − x | X > x) = (p1 + p2)/(p1 p2) − k(x+1)/(p1 p2)

so that by taking h(X) = X − x and comparing with (8),

g(x) = [σ p1 p2 f(x)]^{−1} [(p1 + p2 − µ p1 p2) R(x+1) − f(x+1)].
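The Nair et al. (1999) identity used here is easy to confirm numerically; the sketch below (ours, not from the paper) does so for a two-component geometric mixture truncated far out in the tail.

alpha, p1, p2, N = 0.4, 0.3, 0.6, 400
q1, q2 = 1 - p1, 1 - p2
f = [alpha * p1 * q1 ** x + (1 - alpha) * p2 * q2 ** x for x in range(N + 1)]
R = [alpha * q1 ** x + (1 - alpha) * q2 ** x for x in range(N + 1)]      # R(x) = P(X >= x)

for x in range(5):
    mrl = sum((y - x) * f[y] for y in range(x + 1, N + 1)) / R[x + 1]    # E(X - x | X > x)
    k = f[x + 1] / R[x + 1]                                              # failure rate at x + 1
    print(round(mrl, 8), round((p1 + p2) / (p1 * p2) - k / (p1 * p2), 8))   # the two columns agree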

3 Unbiased estimation

The present section is a discussion of the implications of Theorem 1 for unbiased estimation and a comparison of inequality (9) with the Cramer–Rao and Chapman–Robbins lower bounds to the variance of an unbiased estimator. First we take h(x) = x and note that h(x) ∈ B. The lower bound in (9) is attained when c(x) = h(x), in which case

V(c(X)) = σ².    (14)

A necessary and sufficient condition for this is

r(x + 1) = µ − σ λ(x)g(x)

which is equivalent to

Σ_{t=0}^{x} h(t) f(t) = µ Σ_{t=0}^{x} f(t) + µ (A(θ)/A'(θ)) (∂/∂θ) Σ_{t=0}^{x} f(t)    (15)


on using (12). From (15),

h(x) f(x) = µ f(x) + µ (A(θ)/A'(θ)) ∂f/∂θ
or

∂ log f / ∂θ = (u'(θ)/u(θ)) (h(x) − µ).

Now the Cramer–Rao lower bound for unbiasedly estimating µ using h(x) is

V(c(X)) ≥ [µ'(θ)]² / E(∂ log f/∂θ)² = (u(θ)/u'(θ)) ∂µ/∂θ = σ².    (16)

Hence the two bounds in (14) and (16) are equal. Notice that MPSDs are linear exponential families and hence include cases in which the Cramer–Rao lower bound is attained under regularity conditions.
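For a concrete check (ours, not the authors'), take the Poisson member, which is an MPSD with u(θ) = θ and A(θ) = e^θ: the score is (x − λ)/λ as in the display above, and the Cramer–Rao bound equals σ² = λ, in agreement with (14) and (16).

from math import exp, factorial

lam, N = 3.0, 60
f = [exp(-lam) * lam ** x / factorial(x) for x in range(N + 1)]
score = [(x - lam) / lam for x in range(N + 1)]        # d/d(lam) of log f(x; lam)
info = sum(s * s * fx for s, fx in zip(score, f))      # Fisher information, equals 1/lam
print(round(1.0 / info, 6), lam)                       # mu'(lam) = 1, so the bound is lam = sigma^2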
A second popular lower bound to the variance of an unbiased estimator is provided by the Chapman–Robbins inequality. When E(h(X)) = µ(θ), where θ ∈ Θ ⊂ R and φ ∈ Θ is such that fθ(x) and fφ(x) are different, satisfying {fθ(x) > 0} ⊃ {fφ(x) > 0}, we can set c(x) = fφ(x)/fθ(x) − 1 in (9) to claim





E(g(X) Δc(X)) = σ^{−1} Σ_{x=0}^{∞} [ Σ_{y=0}^{x} (µ − h(y)) f(y) ] [ fφ(x+1)/fθ(x+1) − fφ(x)/fθ(x) ].    (17)

Equation (17) simplifies to

E(g(X) Δc(X)) = σ^{−1} { Σ_{x=1}^{∞} [ Σ_{y=0}^{x−1} (µ − h(y)) f(y) ] fφ(x)/fθ(x) − Σ_{x=0}^{∞} [ Σ_{y=0}^{x} (µ − h(y)) f(y) ] fφ(x)/fθ(x) }

= −σ^{−1} Σ_{x=0}^{∞} (µ − h(x)) fφ(x)

= σ^{−1} [µ(φ) − µ(θ)].


Inequality (9) now reduces to
V(c(X)) ≥ [µ(φ) − µ(θ)]² / V(h(X))
or
V(h(X)) ≥ [µ(φ) − µ(θ)]² / V(fφ(X)/fθ(X))
which is the Chapman–Robbins inequality. It is well known that this bound does not require the regularity conditions of the Cramer–Rao inequality, is valid when Θ is discrete, and provides bounds sharper than the latter. The last statement is also true for the Chernoff-type inequality (9) derived in Theorem 1, which is more general. Moreover, (9) provides a more general alternative methodology to extract UMVUEs when h(x) is taken as a statistic that is unbiased for µ.
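A numerical illustration of the last comparison (ours, not from the paper): for a Poisson(θ) observation the Chapman–Robbins bound [µ(φ) − µ(θ)]²/V(fφ(X)/fθ(X)) stays below V(h(X)) = θ for every admissible φ and approaches it as φ tends to θ.

from math import exp, factorial

theta, N = 2.0, 80
f_t = [exp(-theta) * theta ** x / factorial(x) for x in range(N + 1)]

def chapman_robbins(phi):
    f_p = [exp(-phi) * phi ** x / factorial(x) for x in range(N + 1)]
    ratio = [fp / ft for fp, ft in zip(f_p, f_t)]
    m = sum(r * ft for r, ft in zip(ratio, f_t))                 # E_theta(f_phi/f_theta) = 1
    v = sum((r - m) ** 2 * ft for r, ft in zip(ratio, f_t))      # V(f_phi(X)/f_theta(X))
    return (phi - theta) ** 2 / v                                # mu(phi) - mu(theta) = phi - theta

for phi in (3.0, 2.5, 2.1, 2.01):
    print(phi, round(chapman_robbins(phi), 6))                   # each value is below V(X) = theta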

4 Illustration

Although the above deliberations were essentially directed towards modeling and inference of lifetime data, the methodology is applicable for identification of the distribution and estimation of parameters in other contexts as well. We illustrate the procedure for the data on the count of alpha particles, giving rise to a Poisson distribution, reported in Mould (2005):

X           0    1    2    3    4    5    6    7    8    9   10   11   12
Frequency  57  203  383  525  532  408  273  139   45   27   10    4    2

The failure rate, mean residual life (which has no physical interpretation in the present data) and g(x) are plotted in Figs. 1, 2 and 3.
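In the same spirit, a short computational sketch (ours, the authors' code is not given) reproduces the ingredients of the figures: the empirical failure rate, the sample mean and variance, and the empirical g(x) = z(x)/σ, which under a Poisson model should fluctuate around the square root of the estimated mean (Remark 4).

from math import sqrt

freq = [57, 203, 383, 525, 532, 408, 273, 139, 45, 27, 10, 4, 2]
n = sum(freq)
f = [c / n for c in freq]                                  # empirical pmf on x = 0, ..., 12

mean = sum(x * fx for x, fx in enumerate(f))               # lam_hat for a Poisson model
var = sum((x - mean) ** 2 * fx for x, fx in enumerate(f))
R = [sum(f[x:]) for x in range(len(f))]
k = [f[x] / R[x] for x in range(len(f))]                   # empirical failure rate (Fig. 1)
g = [sum((mean - y) * f[y] for y in range(x + 1)) / (sqrt(var) * f[x]) for x in range(len(f))]

print(round(mean, 3), round(var, 3))                       # mean and variance of comparable size
print([round(v, 2) for v in g[:10]])                       # values fluctuate around sqrt(mean)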

Fig. 1 Failure rate of X [plot omitted: empirical failure rate plotted against x = 0, ..., 12]