Thank you for another most interesting problem in
probability/statistics. It sounds like the professor is choosing
unusually interesting problems.
Anyway, let's begin with some terminology (and of course, please
remember to ask for clarification if anything is unclear).
I will interpret the word "amax" to mean "a * max" here, for some
positive constant real number a.
We are given a random variable X uniformly distributed from 0 to
theta.
We are given a positive integer constant n.
We are given two other random variables:
Y= 2*(x_1+x_2+...+x_n)/n
and
Z=a*max(x_1,...,x_n)
where the x_i are independent draws from the same distribution as X.
In words, for example, to compute Z we think of randomly drawing n
numbers from 0 to theta, taking their max, then
multiplying the result by a. We want to find the mean and standard
deviation of the resulting numbers.
We also define the random variable:
W=max(x_1,...,x_n)
---------------------------------
Part 1)
We want to find an a such that Z is an unbiased estimate of theta.
An unbiased estimate means that E(Z)=theta .
In other words, if we do the experiment I mentioned above a bunch of
times, the average of the results should approach theta.
To do this, we will begin by computing the probability density
function of W .
We will do that by first computing the probability DISTRIBUTION of W,
then differentiating the
distribution to get the density. (Recall the basic fact of probability
that the derivative of the distribution function is the density.)
Now the probability distribution of W is
F(z) = P(W<=z)
for some real number z.
It is easy to compute this manually:
F(z)
=P(W<=z)
=P(max(x_1,...,x_n)<=z)
=P(x_1<=z)*P(x_2<=z)*...*P(x_n<=z)
(since the max is <= z exactly when every x_i is <= z, and the x_i are
independent)
=
 0             if z<=0
 (z/theta)^n   if 0<=z<=theta
 1             if z>=theta
Hence, the probability density for W, f, is the derivative:
f(z)=
F'(z)=
n/theta^n * z^{n-1} if 0<=z<=theta
0 otherwise
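By the way, if you want to double-check the (z/theta)^n formula
numerically, here is a small Java sketch (separate from the required
answer). The class name MaxCdfCheck and the test values n=3, theta=8,
z=5 are just my own arbitrary choices for illustration:

public class MaxCdfCheck{
    public static void main(String[] args){
        int n=3;                 // arbitrary test values, just for the check
        double theta=8.0, z=5.0;
        int trials=1000000;
        int count=0;             // how many times the max of n draws was <= z
        for (int i=0;i<trials;++i){
            double max=0;
            for (int j=0;j<n;++j)
                max=Math.max(max,Math.random()*theta); // n uniform draws on [0,theta]
            if (max<=z) ++count;
        }
        System.out.println("Simulated P(W<=z):   "+(double)count/trials);
        System.out.println("Formula (z/theta)^n: "+Math.pow(z/theta,n));
    }
}

The two printed numbers should agree to a couple of decimal places.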
Now that we have the probability density, we can compute the
expectation:
E(W)
= integral from z=0 to theta of z f(z) dz
= integral from z=0 to theta of n/theta^n * z^n dz
= n/theta^n * theta^(n+1)/(n+1)
= n/(n+1) * theta
Hence,
E(Z)=E(aW)=a E(W)=
a*n/(n+1) * theta
The estimate Z for theta is unbiased if this value equals theta:
E(Z)=theta
if and only if a=(n+1)/n .
Hence, the value for a that produces an unbiased estimate for theta is
a= (n+1)/n
By the way, I wrote a simple Java program to test it.
If you know Java or have it installed, here it is:
public class MaxMean{
    public static void main(String[] args){
        int n=Integer.parseInt(args[0]);
        double a=(double)(n+1)/(double)n;  // a=(n+1)/n, the unbiased choice from part 1
        double theta=Double.parseDouble(args[1]);
        int iterations=1000000;
        if (args.length>2) iterations=Integer.parseInt(args[2]);
        System.out.println("Starting trials: n: "+n+"\ntheta: "+theta
            +"\niterations: "+iterations);
        double sum=0; // sum of all the values of Z so far
        for (int i=0;i<iterations;++i){
            double max=-1;
            for (int j=0;j<n;++j)
                max=Math.max(max,Math.random()*theta); // n uniform draws on [0,theta]
            sum+=a*max;                                // Z = a*max
        }
        System.out.println("Mean of the distribution Z was: "+sum/iterations);
    }
}
A sample run was:
java MaxMean 3 8 100000
Starting trials: n: 3
theta: 8.0
iterations: 100000
Mean of the distribution Z was: 8.004283746090845
-----------------------------------------------------------------------
Part 2)
I am not sure whether you wanted the standard deviation for both
methods of estimating theta or just for the second, so I will compute
it for both methods, that is, the standard deviation of Y and of Z.
Part 2a: the standard deviation of Y
The standard deviation is the square root of the variance.
So let us start by computing the variance of the first method of
estimating theta.
Fact 1:
if A and B are independent random variables then:
var(A+B)=var(A)+var(B)
Fact 2:
for any constant c, var(c*A)=c^2 var(A)
Hence,
var(Y) = var( (2/n)*(x_1+x_2+...+x_n) )
= (4/n^2) * n * var(X)   (by Facts 1 and 2, since the x_i are independent copies of X)
= 4 var(X) / n .
Now the variance of X is defined as E(X^2)-E(X)^2 .
E(X^2) = integral from 0 to theta of x^2/theta dx = theta^3/(3*theta) =
theta^2/3 .
Clearly E(X) = theta/2.
Hence, var(X)=theta^2/3-theta^2/4 =theta^2/12
Thus, var(Y)=(4/n)*theta^2/12 = (theta^2)/(3n)
The standard deviation of the first method of estimating theta is
therefore the square root of this, or
sd(Y)
=sqrt(theta^2/(3n))
=theta/sqrt(3n)
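If you like, the same kind of Monte Carlo check works here too. This
sketch (the class name MeanSdCheck and the values n=3, theta=8 are
again just my own choices) simulates Y many times and compares the
sample standard deviation to theta/sqrt(3n):

public class MeanSdCheck{
    public static void main(String[] args){
        int n=3;                  // arbitrary test values
        double theta=8.0;
        int trials=1000000;
        double sum=0, sumSq=0;    // running sums of Y and Y^2
        for (int i=0;i<trials;++i){
            double s=0;
            for (int j=0;j<n;++j)
                s+=Math.random()*theta;  // n uniform draws on [0,theta]
            double y=2*s/n;              // Y = 2*(x_1+...+x_n)/n
            sum+=y;
            sumSq+=y*y;
        }
        double mean=sum/trials;
        double sd=Math.sqrt(sumSq/trials-mean*mean); // sd via E(Y^2)-E(Y)^2
        System.out.println("Simulated sd(Y):        "+sd);
        System.out.println("Formula theta/sqrt(3n): "+theta/Math.sqrt(3*n));
    }
}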
Part 2b: the standard deviation of Z
Now we compute the standard deviation of the second method of
estimating theta.
Recall that Z=a*max(x_1,...,x_n)
We assume here that a is chosen as above so the estimate is unbiased:
a=(n+1)/n
Then the mean of Z, E(Z)=theta, from part 1.
Let W=max(x_1,...,x_n).
What is E(W^2) ?
We know from part 1 that the probability density f(z) for W is
n/theta^n * z^{n-1}
Hence,
E(W^2)=integral from 0 to theta of (z^2 f(z) dz)
= integral from 0 to theta of z^2 n/theta^n * z^{n-1}
= integral from 0 to theta of n/theta^n z^(n+1)
= n/((n+2)*theta^n) theta^(n+2)
= n/(n+2) * theta^2
Now we know that E(aW)=theta from part 1.
Hence, var(Z) = var(aW) = E((aW)^2) - E(aW)^2
= a^2 E(W^2) - theta^2

    a^2 theta^2 n
  = ------------- - theta^2
         n+2
Let's use the unbiased choice of a, a=(n+1)/n, from part 1.
Then the last formula is

    (n+1)^2 theta^2 n
    ----------------- - theta^2
         (n+2)n^2

                (n+1)^2
  = theta^2 * ( ------- - 1 )
                (n+2)*n

Since (n+1)^2 - n(n+2) = 1, this simplifies to

    theta^2
    -------
    n(n+2)
The standard deviation is the square root of this, or

               theta
  s.d.(Z) = ------------
            sqrt(n(n+2))
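And here is the corresponding check for Z (again a rough sketch with my
own arbitrary class name MaxSdCheck and test values n=3, theta=8),
using the unbiased a=(n+1)/n and comparing the sample standard
deviation to theta/sqrt(n(n+2)):

public class MaxSdCheck{
    public static void main(String[] args){
        int n=3;                        // arbitrary test values
        double theta=8.0;
        double a=(double)(n+1)/n;       // unbiased choice of a from part 1
        int trials=1000000;
        double sum=0, sumSq=0;          // running sums of Z and Z^2
        for (int i=0;i<trials;++i){
            double max=0;
            for (int j=0;j<n;++j)
                max=Math.max(max,Math.random()*theta);
            double z=a*max;             // Z = a*max(x_1,...,x_n)
            sum+=z;
            sumSq+=z*z;
        }
        double mean=sum/trials;
        double sd=Math.sqrt(sumSq/trials-mean*mean); // sd via E(Z^2)-E(Z)^2
        System.out.println("Simulated sd(Z):            "+sd);
        System.out.println("Formula theta/sqrt(n(n+2)): "+theta/Math.sqrt(n*(n+2.0)));
    }
}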
-----------------------------------------------------------------------
Part 3)
Now we have to find out which method is better. Well, I think one would
want to choose the method with the smaller standard deviation. So we
compare them.
The first method is better if and only if the standard deviation of the
first method is smaller than the standard deviation of the second
method, or, if and only if,
theta/sqrt(3n) < theta/sqrt(n(n+2))
Dividing out theta and squaring both sides (everything here is
positive), this holds if and only if
n(n+2) < 3n
n^2 + 2n < 3n
n^2 < n
This is never true for a positive integer n. For n=1 both methods have
the same standard deviation, and for every n>1 the second method is
preferable.
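To make the comparison concrete, here is one last little sketch
(theta=8 is again an arbitrary choice; both standard deviations scale
linearly with theta) that just prints the two theoretical values side
by side for several n:

public class CompareSd{
    public static void main(String[] args){
        double theta=8.0;  // arbitrary; both sds are proportional to theta
        for (int n=1;n<=10;++n){
            double sdY=theta/Math.sqrt(3.0*n);      // mean-based estimate
            double sdZ=theta/Math.sqrt(n*(n+2.0));  // max-based estimate
            System.out.println("n="+n+"  sd(Y)="+sdY+"  sd(Z)="+sdZ);
        }
    }
}

For n=1 the two numbers are equal, and for every larger n the second
column is smaller, confirming the conclusion above.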