Google Answers: seeking clarification on matrix solution for multiple regressors standard errors

Question

Subject: seeking clarification on matrix solution for multiple regressors standard errors
Category: Science > Math
Asked by: 1e2-ga
List Price: $20.00

Posted: 09 Sep 2003 07:17 PDT
Expires: 09 Oct 2003 07:17 PDT
Question ID: 253825

http://www.ilr.cornell.edu/~hadi/RABE/Data/P054.txt

from this data set can someone show me the intermediate numerical
calculations to get the standard error for each coefficient including
the constant. please use actual numbers and not greek notation.
i know how to get the t value but would like to know how to get the p
value preferably without using a table.

            coef      se         t       p
constant    10.787    11.5890    0.93    0.3616
x1           0.613     0.161     3.81    0.0009
x2          -0.073     0.1357   -0.54    0.5956
x3           0.32      0.1685    1.9     0.0699
x4           0.081     0.2215    0.37    0.7155
x5           0.038     0.147     0.26    0.7963
x6          -0.217     0.1782   -1.22    0.2356

Answer

Subject: Re: seeking clarification on matrix solution for multiple regressors standard errors
Answered By: elmarto-ga on 09 Sep 2003 11:58 PDT
Rated: 5 out of 5 stars

Hi 1e2!

In order to find the standard errors for each coefficient, one must first calculate the variance-covariance matrix of the coefficients. From there, the calculation of the standard deviation of the coefficients is immediate. The computation of this matrix requires a large amount of calculations, so calculating it "by hand" can take an exceedingly long time (with no small probability of a mistake), so a computer is used to calculate it. In particular, the most difficult step in computing this matrix is the inversion of a (potentially very large) matrix.

Anyway, here is the formula to calculate it. I will first use "Greek" notation (I'm sorry, it can't be helped :-) ) and then I will show you what numbers you have to plug in each of the symbols, so you shouldn't have any trouble understanding it.

We must first define the X matrix. Each row of the X matrix corresponds to an observation, while each column represents the value of the explanatory variables in that observation. For example, the first few lines in the data you provided are:

Y   X1  X2  X3  X4  X5  X6
43  51  30  39  61  92  45
63  64  51  54  63  73  47
71  70  68  69  76  86  48
61  63  45  47  54  84  35

The X matrix doesn't include the explained variable Y; only the explanatory ones and the constant. If we didn't include the constant, the (first few rows of the) X matrix would be:

51  30  39  61  92  45
64  51  54  63  73  47
70  68  69  76  86  48
63  45  47  54  84  35

and so on. It shows the values of the explanatory variables. Since you're also interested in the constant, we have to include it. The constant is like another explanatory variable, whose value is always 1. Therefore, the X matrix becomes:

1  51  30  39  61  92  45
1  64  51  54  63  73  47
1  70  68  69  76  86  48
1  63  45  47  54  84  35

Obviously, it has 1 more column than if it didn't include the constant.

Having defined X, here is the formula for the variance-covariance matrix:

Cov. Matrix = (s^2)(X'X)^(-1)

Let's see how to compute each of the components.
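The construction of X described above can be sketched in a few lines of Python. This is only an illustration: the names `rows`, `design_matrix` and `response_vector` are made up for this example, and only the four observations quoted above are used, not the full 30-row data set.

```python
# First four observations quoted above; columns are Y, X1, ..., X6.
rows = [
    [43, 51, 30, 39, 61, 92, 45],
    [63, 64, 51, 54, 63, 73, 47],
    [71, 70, 68, 69, 76, 86, 48],
    [61, 63, 45, 47, 54, 84, 35],
]


def design_matrix(data):
    """Drop Y (the first column) and prepend a column of 1s for the constant."""
    return [[1] + row[1:] for row in data]


def response_vector(data):
    """Keep only Y, the explained variable."""
    return [row[0] for row in data]


X = design_matrix(rows)
y = response_vector(rows)

for row in X:
    print(row)
```

Each printed row starts with the 1 that stands for the constant, followed by X1 through X6.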
We'll see first how to compute (X'X)^(-1).

X' means "X transposed". Thus, X' will be like X but with the columns and rows switched. Again, since the first few rows of X are:

1  51  30  39  61  92  45
1  64  51  54  63  73  47
1  70  68  69  76  86  48
1  63  45  47  54  84  35

Then X' is:

1   1   1   1   ...
51  64  70  63  ...
30  51  68  45  ...
39  54  69  47  ...
61  ...
92  ...
45  ...

Now that you have X', the next step is to multiply X' by X (which gives X'X). This is one step that takes an enormous time to complete without the aid of a computer. You have to perform a matrix multiplication here, which is explained in the following page:

http://www.aps.uoguelph.ca/~gjansen/MBG4030/notes/chap03.pdf

also:

http://www.fw.umn.edu/fw5601/Lecture/matrices/Matrices.html (find Multiplication)

As you can see, as matrices get larger, matrix multiplication becomes increasingly tedious. In your case, you have to multiply X', which is a matrix with 7 rows (the constant plus the 6 explanatory variables) and 30 columns (because of the 30 observations), by X, which is a matrix with 30 rows and 7 columns. That is, you have to multiply a 7x30 matrix by a 30x7 matrix. A computer or calculator with matrix operations capabilities is highly recommended if you want to compute this matrix. The result of this multiplication will be a 7x7 matrix. If you happen to have Microsoft Excel, matrix multiplication can be done with the MMULT function.

The other long step is the following. Recall from the formula that we actually need (X'X)^(-1). That is, we have to find the inverse matrix of X'X. Finding the inverse of a matrix requires several operations, which are detailed at:

http://www.fw.umn.edu/fw5601/Lecture/matrices/Matrices.html (find Inversion)

Again, without the aid of a computer, calculation of the inverse matrix can take a long time and be very boring. In Microsoft Excel, matrix inversion is quickly done with the MINVERSE function.

Now you should have a matrix (X'X)^(-1).
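For readers without Excel, the same three operations (transpose, multiply, invert) can be sketched in plain Python. The helper names `transpose`, `matmul` and `inverse` are made up for this example, and the inversion uses Gauss-Jordan elimination, which is one common method and not necessarily what Excel does internally. To keep the numbers small, the example uses only the constant and X1 for the four observations quoted above; with four rows, the full 7-column X'X would be singular and could not be inverted.

```python
def transpose(m):
    """Switch rows and columns."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Row-by-column matrix product of a and b."""
    b_cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols] for row in a]


def inverse(m):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    # Augment m with the identity matrix: [m | I].
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Clear this column in every other row.
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    # The right half of the augmented matrix is now the inverse.
    return [row[n:] for row in aug]


# Illustration only: constant and X1 for the first four observations.
X = [[1, 51], [1, 64], [1, 70], [1, 63]]
Xt = transpose(X)          # 2x4
XtX = matmul(Xt, X)        # 2x2
XtX_inv = inverse(XtX)     # 2x2

print(XtX)
print(XtX_inv)
```

With the full data set, X would have 30 rows and 7 columns and the same functions would return the 7x7 matrices discussed above.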
We must still compute the (s^2) component of the covariance matrix formula. Calculating s^2 is easier than what we've done so far. This component is simply the sum of the squared residuals of the regression, divided by (n-k), where n is the number of observations (n=30 in this case) and k is the number of explanatory variables (k=7 in your case - because of the constant plus 6 explanatory variables).

Once you have s^2 (which is a number - not a matrix) you just have to multiply it by each element of the (X'X)^(-1) matrix. This gives the covariance matrix. The elements in the diagonal of this matrix are the variances of the coefficients, in the same order as in the X matrix. For example, if in the X matrix the constant was the first column, X1 was the second one, etc; then the 1st element of the diagonal is the variance of the constant coefficient, the 2nd one is the variance of the coefficient of X1, etc.

Finally, in order to find the standard error, just take the square root of each of the variances.

I have done all these calculations in Microsoft Excel. If you have this software, please let me know so I can put the file in my web page for you to download, so you can see the formulas involved (which are exactly the ones I've explained here).

More information on the covariance matrix at:

http://www.rci.rutgers.edu/~dhjones/APPLIED_LINEAR_STATISTICAL_MODELS(PHD)/LECTURES/LECTURE06/2-Simple%20linear%20regression%20model%20in%20matrix%20terms.pdf

http://www.roguewave.com/support/docs/sourcepro/analyticsug/3-2.html

Regarding your second question, it's not possible to find the p-values without a table, unless you use the formula for the t distribution, which is quite complicated. You can find it at:

t-distribution
http://mathworld.wolfram.com/Studentst-Distribution.html

(it's the formula F(t), the cumulative distribution function). If you want to use this function, the p-value is: if t is positive, 2(1-F(t)); if t is negative, 2F(t).
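Going back to the standard errors for a moment: the recipe above (s^2 as the sum of squared residuals over n-k, multiplied into (X'X)^(-1), then square roots of the diagonal) can be sketched as follows. This is an illustration under stated assumptions, not the Excel sheet mentioned above: it regresses Y on the constant and X1 only, using the four observations quoted earlier, so the numbers it produces will not match the table in the question. The coefficients are obtained with the standard least-squares formula b = (X'X)^(-1) X'y, which is needed to get the residuals.

```python
import math

# Illustration only: Y, and X = [constant, X1], for the first four observations.
y = [43, 63, 71, 61]
X = [[1, 51], [1, 64], [1, 70], [1, 63]]
n, k = len(X), len(X[0])   # n observations, k columns (constant included)

# X'X (k x k) and X'y (k values).
XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]

# Inverse of a 2x2 matrix written out explicitly (only valid for k = 2).
(a, b), (c, d) = XtX
det = a * d - b * c
XtX_inv = [[d / det, -b / det], [-c / det, a / det]]

# Coefficients: b = (X'X)^(-1) X'y.
beta = [sum(XtX_inv[i][j] * Xty[j] for j in range(k)) for i in range(k)]

# Residuals and s^2 = (sum of squared residuals) / (n - k).
residuals = [yi - sum(bj * xj for bj, xj in zip(beta, r)) for r, yi in zip(X, y)]
s2 = sum(e * e for e in residuals) / (n - k)

# Covariance matrix = s^2 * (X'X)^(-1); standard errors from its diagonal.
cov = [[s2 * v for v in row] for row in XtX_inv]
se = [math.sqrt(cov[i][i]) for i in range(k)]
t_values = [bj / sj for bj, sj in zip(beta, se)]

for name, bj, sj, tj in zip(["constant", "x1"], beta, se, t_values):
    print(name, round(bj, 4), round(sj, 4), round(tj, 2))
```

For the real problem, X has 7 columns, so the explicit 2x2 inverse would have to be replaced by a general inversion routine or by MINVERSE in Excel.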
The degrees of freedom of the t distribution in this case is (n-k), that is, 30-7=23 in your case. I would guess that this is the calculation the computer does when it displays the p-value.

Even with a table, it's usually not possible to compute the p-value. t-distribution tables usually only list t-values for the following p-values: 0.1, 0.05, 0.02, 0.01, 0.005, 0.001. If the p-value you wanted to find is different from all of those (as is usually the case), you won't be able to find it, since your t-value will not be listed.

Fortunately, I have located a java table that will help you find the p-values:

Student's t-distribution
http://stat-www.berkeley.edu/users/stark/Java/tHiLite.htm

In order to use it for your data, enter the degrees of freedom (23 in your case). Use the t-value with plus and minus signs for the upper and lower endpoints on this page. For example, the t-value of the constant is 0.93. So in the lower endpoint you should enter -0.93, and in the upper endpoint, 0.93. You will get a percentage for the highlighted area (it's 63.8% in the 0.93 case). The p-value is simply 1 minus that percentage. It would be 36.2% in this case. The same procedure goes for the other t-values.

I hope this helps! Recall that this question is not finished until you're satisfied with it. If you need any further assistance, please do let me know through a clarification request. Otherwise, I await your rating and comments.

Best regards,
elmarto
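For readers who would rather compute F(t) than use the applet, here is a minimal sketch of the rule given in the answer (2(1-F(t)) for positive t, 2F(t) for negative t). The function names are made up for this example, and F(t) is approximated by integrating the Student's t density with Simpson's rule, which is a simple numerical shortcut and not necessarily what statistical software does.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(t, df, steps=2000):
    """F(t) = 0.5 + integral of the density from 0 to t (composite Simpson's rule).

    steps must be even; a negative t gives a negative integral, so F(t) < 0.5.
    """
    if t == 0:
        return 0.5
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def two_sided_p(t, df):
    """p-value as described in the answer: 2(1-F(t)) if t > 0, else 2F(t)."""
    f = t_cdf(t, df)
    return 2 * (1 - f) if t > 0 else 2 * f


# t-value of the constant from the question, with n - k = 23 degrees of freedom.
print(round(two_sided_p(0.93, 23), 3))
```

The result for t = 0.93 should agree with the 36.2% obtained from the applet, up to rounding of the t-value.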
Clarification of Answer by elmarto-ga on 09 Sep 2003 16:27 PDT

In case you do have Microsoft Excel, or are able to open Excel files, in the following link you will find an Excel sheet that shows the steps that are needed to obtain the std. errors of the coefficients.

http://www.angelfire.com/alt/elmarto

Best wishes!
elmarto

Request for Answer Clarification by 1e2-ga on 10 Sep 2003 09:36 PDT

could you describe how you derive the p values from a table of t's

Clarification of Answer by elmarto-ga on 10 Sep 2003 13:57 PDT

Regarding your request for clarification; as I stated in the answer, it's usually not possible to find p-values from the t's using the tables that usually appear in Statistics books. This is so because such tables only show the t's (and degrees of freedom) for which the p-values are either 0.1, 0.05, 0.02, etc. If you have a t that would evaluate to another p-value, you won't be able to extract the p-value from the table.

In general, the p-value itself is not needed: one only needs to know if the t you got is greater or smaller than the t that corresponds to a p-value of either 0.1, 0.05, etc. (depending on the level of confidence you want).

Sometimes you can use the tables to infer in what range the p-value for your t lies. If your t falls between the t that has a p-value of 0.1 and the one that has a p-value of 0.05, you can conclude that the p-value is between 0.05 and 0.1; and so on. But with these tables, there is not much more you can say about the p-value.

Best luck and thanks for the rating and comments!
elmarto
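The table lookup described in this clarification can be sketched as a small function. The critical values below are the usual two-sided table entries for 23 degrees of freedom; they are supplied here as an assumption for illustration and do not come from the original answer, and the names `CRITICAL_23` and `bracket_p` are made up for this example.

```python
# Two-sided p-value and the matching critical t for 23 degrees of freedom,
# as found in a typical printed t table (assumed values, rounded to 3 decimals).
# Entries must be ordered from the largest p-value to the smallest.
CRITICAL_23 = [(0.10, 1.714), (0.05, 2.069), (0.02, 2.500), (0.01, 2.807)]


def bracket_p(t, table):
    """Return (lower, upper) bounds on the two-sided p-value of t.

    A lower bound of 0.0 or an upper bound of 1.0 means the table cannot
    narrow the p-value any further on that side.
    """
    t = abs(t)
    lower, upper = 0.0, 1.0
    for p, critical in table:
        if t >= critical:
            upper = p      # t is at least this extreme, so p is at most this value
        else:
            lower = p      # t falls short of this entry, so p is larger than this value
            break
    return lower, upper


# t-value of x3 from the question: falls between the 0.10 and 0.05 entries.
print(bracket_p(1.9, CRITICAL_23))
```

For the x3 coefficient the bracket is consistent with the p-value of 0.0699 reported in the question, but as the clarification says, the table alone cannot pin the value down any further.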

1e2-ga rated this answer: 5 out of 5 stars

excellent. thanks for the educational links
