 Question
Subject: complex question in statistics (sufficient condition for some inequality)
Category: Science > Math
Asked by: mm1234-ga
List Price: $15.00
Posted: 01 Nov 2002 01:53 PST
Expires: 01 Dec 2002 01:53 PST
Question ID: 95175
```Let x, y be i.i.d. random variables with zero mean. Let the distribution of each of them be D. Let A be the area in the 2-dimensional plane that consists of the fourth and the eighth octants; i.e., (x, y) is in A iff either y > x & x > 0 & y > 0, or y < x & x < 0 & y < 0.

I want a sufficient condition for E[ x - y | (x, y) in A ] > 0; i.e., for the conditional mean of x - y given that the point (x, y) is in A to be positive.

I think I have to clarify what sort of condition I am looking for. (Clearly, it will impose some restriction on D.) Of course, I don't want the condition that trivially rephrases my required result in the form of integrals of the p.d.f. of D. Neither do I want such a restrictive condition as a specific distribution family that works - one would be quite easy to construct by straightforward numerical computation (say, if D is a truncated normal, it is sufficient that the truncation is done to the right of a certain percentile point). What I do want is a non-trivial condition that has some generality (say, the set of distributions allowed by this condition should be larger than a continuum). Perhaps something referring to the skewness of D, or its other moments, or its hazard rate, etc.```
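The quantity being asked about is easy to estimate numerically. Below is a minimal Monte Carlo sketch of E[ x - y | (x, y) in A ] (my own addition, not part of the original thread; NumPy is assumed, and all names are mine). As a sanity check it uses a symmetric D, for which the conditional mean must be 0: the map (x, y) -> (-x, -y) swaps the two halves of A and flips the sign of x - y.

```python
import numpy as np

rng = np.random.default_rng(0)

def cond_mean_x_minus_y(sampler, n=1_000_000):
    """Monte Carlo estimate of E[x - y | (x, y) in A] for i.i.d. x, y,
    where A = {y > x > 0} union {0 > x > y}."""
    x, y = sampler(n), sampler(n)
    in_A = ((y > x) & (x > 0)) | ((x > y) & (x < 0))
    return float((x - y)[in_A].mean())

# Symmetric case (standard normal): the estimate should be near 0,
# which is why the question needs a condition breaking the symmetry
# between the positive and negative parts of D.
est_normal = cond_mean_x_minus_y(lambda n: rng.standard_normal(n))
```

This is only a checking tool, not a condition; the answer below supplies the actual sufficient condition.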
```Dear mm1234,

I have a condition that is fairly general. While it is not very far from just a restatement of your original inequality, I do feel that it elucidates the nature of the inequality, and it enables one to write down, quite easily, large families of distributions that satisfy it. Still, if you are not satisfied, please ask for a clarification and I will see if I can simplify it further.

Broadly speaking, your inequality compares the negative part of the distribution D with the positive part of it. It will be satisfied if the negative part of D is "less concentrated around its mean value" (see below) than the positive part. More precisely, define two distributions D+ and D- as:

D+ is the distribution of x conditioned on x > 0
D- is the distribution of -x conditioned on x < 0

(Let's assume for convenience that D does not have an atom at 0. Also let's assume that P(x > 0) = 1/2 - this simplifies the inequalities below and still leaves us with a fairly broad class of distributions; I will indicate below how to treat the more general case.)

Introduce two i.i.d. variables x' and y' having distribution D+, and two i.i.d. variables x'' and y'' having distribution D-. Also, write the set A you defined as the (disjoint) union of two sets A1 and A2, where

A1 = { 0 > x > y }, A2 = { y > x > 0 }

Now your original inequality E[ x - y | A ] > 0 can be restated as

E[ (x-y)1_A1 + (x-y)1_A2 ] > 0

(1_A1 is the indicator function of A1, etc.), or

E[ (x-y)1_A1 ] > E[ (y-x)1_A2 ]

This can be written in terms of the conditioned r.v.'s x', y', x'', y'':

E[ (x''-y'') | x''>y'' ] > E[ (y'-x') | y'>x' ]

(here we use the assumption that P(x>0) = 1/2; otherwise there will be a factor alpha on one side of the inequality and a factor 1-alpha on the other side, where alpha = P(x>0)). Since x'', y'' are i.i.d. and x', y' are i.i.d.
this can be rewritten as

E|x''-y''| > E|x'-y'|

which is (for now) my final form of the inequality - in other words, the sufficient condition you were asking for.

For two i.i.d. random variables u, v, the quantity E|u-v| is known as the DIVERGENCE of the distribution of u. It is a bit like the variance in that it measures how much u is concentrated around its mean value, except it is much less convenient to work with than variance because it is not additive with respect to independent sums.

Still, you can now construct large families of distributions D satisfying your original inequality by fixing their negative and positive parts separately and then giving each of them probability 1/2 of happening to make up D. (Note that under this assumption, since you specified that D has mean 0, the mean of D+ must equal the mean of D-.) If you fix them in such a way that the divergence of D- is bigger than the divergence of D+, then your inequality will be satisfied.

I hope this is the kind of answer you were looking for; please do not hesitate to ask for clarification. If this is not exactly what you were looking for, it would help me to know in more detail the application for which you need this, so I can better understand the kind of condition you are looking for.

Regards,
dannidin```

Clarification of Answer by dannidin-ga on 01 Nov 2002 03:24 PST

```I forgot to mention, as a simple (trivial) example, that a very large family of distributions D satisfying your condition is the distributions which are supported on the interval (-infinity, 0) - in other words, where x is a negative r.v.

-dannidin```

Request for Answer Clarification by mm1234-ga on 04 Nov 2002 02:35 PST

```Thanks, this is helpful. And this is exactly the type of answer I was looking for. However, I have a couple of concerns about the specific condition you suggest.

First, I am not too happy with the assumption P[x>0] = 1/2. Not because it's restrictive (I don't mind that), but because it's counterintuitive.
Really, suppose that D- has bigger "divergence" than D+. What can we say about the skewness of D? Of course, there are numerous conflicting definitions of skewness, so I am not being rigorous here; but this seems a clear case of a left-skewed distribution if there ever was one. Unfortunately, one of the main measures of skewness is Pearson's 2nd coefficient (the difference between the mean and the median, scaled by the st. dev.). A "typical" left-skewed distribution should have a negative Pearson's 2nd coefficient, implying P[x>0] > 1/2 (since the median is to the right of the mean, which is zero).

Sadly, if we try to consider the case P[x>0] > 1/2, things get even worse. This implies that the weight multiplying the left side (1 - alpha in your notation) is less than 1/2, and so the overall conclusion about D becomes ambiguous: the higher divergence of the left side is offset by the fact that the left side is multiplied by a smaller number (1 - alpha < alpha).

I am also concerned, though to a lesser extent, with the use of E|x'-y'| to measure the divergence (or spread) of a distribution. I would have no problem with using E|x - Ex|, or E|x - Median[x]|, or Median|x - Median[x]| (respectively: mean absolute deviation from the mean, mean abs. dev. from the median, and median abs. dev. from the median). But E|x'-y'| does not directly relate to any of those measures. Of course, E[(x'-y')^2] does equal 2 E[(x-Ex)^2] = 2 var[x]. This provides some justification for using E|x'-y'| as a measure of spread.

And a very minor point - the trivial example you mention wouldn't work, since I require E x = 0.

Let me know what you think.
Thanks again!```

Clarification of Answer by dannidin-ga on 04 Nov 2002 04:04 PST

```Hi mm1234,

Regarding your concern about skewness and how it conflicts with my condition that P(x>0) = 1/2: I want to emphasize that my form of the inequality is not directly related to skewness, which measures how much the distribution is biased towards the left side relative to its mean, as opposed to the right side. What we are doing here is rather comparing the amount of "spread" of the left side of the distribution with the amount of spread of the right side. So this is more a "second-moment" version of what you call skewness (which is a "first-moment" kind of concept). Indeed, for a distribution with mean zero that satisfies P(x>0) = 1/2, skewness as you define it is equal to zero. However, this does not rule out the left side of the distribution being more "spread out" (in the sense of divergence) than the right side. And again, if you allow arbitrary values of P(x>0), you just need to give the correct weights (i.e., P(x>0) and 1-P(x>0)) to the divergences of the left and right sides when you compare them. This does not conflict with the concept of skewness.

About your unease regarding my use of divergence as opposed to the more traditional ways of measuring "spread": I was uneasy about this myself, so over the weekend I checked an old paper I remembered reading once that discussed comparisons of the different measures of spread. As you yourself noticed, taking two independent copies of the original random variable can be related back to a single copy if one works with variance rather than divergence. In fact, the Cauchy-Schwarz inequality (or alternatively the Holder inequality) gives an upper bound on the divergence:

divergence = E|x-y| <= sqrt( E(x-y)^2 ) = sqrt( 2 var(x) ) = sqrt(2) * s.d.(x)

(s.d.(x) = standard deviation). In other words, the divergence of x is bounded from above by sqrt(2) times the standard deviation of x, which is a far more convenient measure of spread. As for a lower bound (which you need, since you are comparing two divergences), this is more difficult. Some computations using Fourier transforms show the following inequality:

divergence = E|x-y| >= E|x-Ex|,

which relates the divergence to the more familiar "mean absolute deviation from the mean". The paper I found this in is:

von Bahr, B., Esseen, C.G. Inequalities for the r-th absolute moment of a sum of random variables, 1 <= r <= 2. Ann. Math. Statist. 36 (1965) 299-303.

Using these two inequalities you can rewrite my condition in terms of these simpler measures of spread.

I hope this makes things clearer, and again, if you have any more questions please ask.

Regards,
dannidin```
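The two bounds quoted above are easy to spot-check numerically. The sketch below (my own addition, not part of the thread; NumPy assumed, names mine) uses x ~ Exp(1), for which the closed forms are known: E|x-Ex| = 2/e ≈ 0.736, divergence E|x-y| = 1 (for i.i.d. exponentials, |x-y| is again Exp(1)), and sqrt(2)*s.d.(x) = sqrt(2) ≈ 1.414.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500_000

# Two i.i.d. Exp(1) samples.
x = rng.exponential(1.0, n)
y = rng.exponential(1.0, n)

mad_mean   = float(np.mean(np.abs(x - x.mean())))  # mean abs. dev. from mean, ~ 2/e
divergence = float(np.mean(np.abs(x - y)))         # E|x - y|, ~ 1
upper      = float(np.sqrt(2.0) * x.std())         # sqrt(2) * s.d.(x), ~ 1.414

# The sandwich from the clarification:  E|x-Ex| <= E|x-y| <= sqrt(2)*s.d.(x)
```

The gaps here are wide (0.736 vs. 1 vs. 1.414), so the Monte Carlo noise at this sample size cannot flip either inequality.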
 mm1234-ga rated this answer: ```Thanks, that looks really good. I really liked your justification for E|x-y| as a measure of divergence. And I guess I am comfortable that you can increase the spread of D- even while keeping its mean and its p.d.f. at 0 constant (so that the transition from D- to D+ results in a smooth p.d.f. of D at zero).```
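To close the loop, here is a sketch of the answer's recipe in action (my own illustrative construction, not from the thread; NumPy assumed, names mine). Fix D+ = Uniform(0.9, 1.1) and D- = Exp(1) - both have mean 1, so giving each probability 1/2 yields a mean-zero D with P(x > 0) = 1/2 - and note that the divergence of D- (exactly 1) exceeds that of D+ (width/3 = 1/15), so the condition predicts E[x - y | (x, y) in A] > 0. (The exact value works out to 7/15 ≈ 0.467.)

```python
import numpy as np

rng = np.random.default_rng(2)

def sample_D(n):
    """Mean-zero D built per the recipe: with prob. 1/2 draw from
    D+ = Uniform(0.9, 1.1), with prob. 1/2 draw from -(D-), D- = Exp(1).
    Both halves have mean 1 in absolute value, so E[x] = 0 and P(x>0) = 1/2;
    D- (divergence 1) is far more spread out than D+ (divergence 1/15)."""
    sign = rng.integers(0, 2, n)
    pos = rng.uniform(0.9, 1.1, n)
    neg = -rng.exponential(1.0, n)
    return np.where(sign == 1, pos, neg)

n = 1_000_000
x, y = sample_D(n), sample_D(n)
in_A = ((y > x) & (x > 0)) | ((x > y) & (x < 0))
cond_mean = float((x - y)[in_A].mean())  # should come out clearly positive
mean_D = float(np.mean(x))               # should be near 0
```

Swapping the roles of the two halves (heavy positive part, tight negative part) flips the sign of the conditional mean, which is a quick way to see that the condition is doing real work.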