The principles of probabilistic regression analysis are described here for the case of several regressors, so-called multiple regression, because the formal relationships can be represented more elegantly in this setting.
There is a variable y that depends linearly on several fixed, given variables x1, ..., xp in the form
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p + \varepsilon,
where ε again represents the disturbance term. ε is a random variable, and therefore y, as a linear transformation of ε, is likewise a random variable. For y and for each xj (j = 1, ..., p) there are n observations, so that for the observations i (i = 1, ..., n) the system of equations
y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + \varepsilon_i
results. In the sampling-theoretic approach, each sample element εi is interpreted as a random variable of its own, and likewise each yi.
Since this is a linear system of equations, its elements can be collected in matrix notation. One obtains the (n×1) column vectors of the dependent variable y and of the disturbance ε as random vectors, and the ((p+1)×1) column vector of the regression coefficients,
\underline y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \quad \underline \varepsilon = \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix}, \quad \underline \beta = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_p \end{pmatrix},
together with the (n×(p+1)) data matrix
\underline X = \begin{pmatrix} 1 & x_{11} & x_{12} & \cdots & x_{1j} & \cdots & x_{1p} \\ 1 & x_{21} & x_{22} & \cdots & x_{2j} & \cdots & x_{2p} \\ \vdots & & & & & & \vdots \\ 1 & x_{i1} & x_{i2} & \cdots & x_{ij} & \cdots & x_{ip} \\ \vdots & & & & & & \vdots \\ 1 & x_{n1} & x_{n2} & \cdots & x_{nj} & \cdots & x_{np} \end{pmatrix}.
The ones in the first column serve as placeholders for the constant term. Such a "variable" is called a dummy variable.
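As a minimal sketch of how such a data matrix might be assembled, assuming NumPy and made-up observation values (the numbers and the names `x` and `X` are illustrative, not taken from the text):

```python
import numpy as np

# Hypothetical observations: n = 4 cases, p = 2 regressors (made-up values).
x = np.array([
    [2.0, 1.0],
    [3.0, 5.0],
    [5.0, 2.0],
    [7.0, 4.0],
])
n, p = x.shape

# Prepend the column of ones that stands in for the constant term,
# giving the n x (p+1) data matrix.
X = np.column_stack([np.ones(n), x])
print(X.shape)
```

Keeping the ones as an ordinary column means the constant term needs no special handling in the matrix formulas.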
The random vector ε is distributed with expectation vector Eε and covariance matrix Σε. y = Xβ + ε is then distributed with expectation vector Xβ + Eε and covariance matrix Σε.
The system of equations can now be written much more simply as
\underline y = \underline X \underline \beta + \underline \varepsilon.
For the regression estimates to be analyzed inferentially, certain assumptions must be fulfilled for the classical linear regression model.
In the multiple linear regression model, too, the sum of squared residuals is minimized by the method of least squares. One obtains the vector of estimated regression coefficients as the solution (Gauss-Markov theorem)
\underline b = (\underline X^T \underline X)^{-1} \underline X^T \underline y.
This estimator is BLUE (Best Linear Unbiased Estimator), that is, the best linear unbiased estimator, where "best" means having the smallest variance among all linear unbiased estimators. The properties of the estimator b therefore do not require any information about the distribution of the disturbance.
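The least-squares solution can be sketched numerically as follows. This is an illustration under stated assumptions, not a definitive implementation: it assumes NumPy, a data matrix of full column rank, and made-up data; the cross-check against `np.linalg.lstsq` is only there to show that both routes agree.

```python
import numpy as np

# Hypothetical data: n = 5 observations, p = 2 regressors plus the column of ones.
X = np.array([
    [1.0, 2.0, 1.0],
    [1.0, 3.0, 5.0],
    [1.0, 5.0, 2.0],
    [1.0, 7.0, 4.0],
    [1.0, 8.0, 8.0],
])
y = np.array([3.1, 2.0, 9.2, 10.9, 9.1])

# Solve the normal equations (X^T X) b = X^T y for the coefficient vector b.
b = np.linalg.solve(X.T @ X, X.T @ y)

# Cross-check with NumPy's least-squares routine.
b_check, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b, b_check)
```

Solving the normal equations with `np.linalg.solve` avoids forming the inverse explicitly; for poorly conditioned data a QR- or SVD-based routine such as `np.linalg.lstsq` is usually the safer choice.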
Using the least-squares estimator b, one obtains the estimated system of equations
\underline y = \underline X \underline b + \underline e,
where e is the vector of residuals.
The analysis is particularly interested in the estimate ŷ0, that is, the forecast of the dependent variable y for a given tuple x0. It is computed as
\widehat y_0 = b_0 + b_1 x_{01} + b_2 x_{02} + \cdots + b_p x_{0p}.
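The estimated values ŷi are computed as
\widehat{\underline y} = \underline X \underline b = \underline X (\underline X^T \underline X)^{-1} \underline X^T \underline y,
which can be written more briefly as
\widehat{\underline y} = \underline M \underline y \quad \text{with} \quad \underline M = \underline X (\underline X^T \underline X)^{-1} \underline X^T.
The (n×n) matrix M is, incidentally, idempotent and has rank at most p+1. It is also called the hat matrix, because it puts the "hat" on y.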
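The stated properties of M can be checked numerically. The sketch below assumes NumPy, a full-column-rank data matrix, and made-up data; the explicit tolerance passed to `matrix_rank` is an assumption chosen to absorb rounding error.

```python
import numpy as np

# Hypothetical data: n = 5 observations, p = 2 regressors plus the column of ones.
X = np.array([
    [1.0, 2.0, 1.0],
    [1.0, 3.0, 5.0],
    [1.0, 5.0, 2.0],
    [1.0, 7.0, 4.0],
    [1.0, 8.0, 8.0],
])
y = np.array([3.1, 2.0, 9.2, 10.9, 9.1])

# Hat matrix M = X (X^T X)^{-1} X^T and the fitted values M y.
M = X @ np.linalg.inv(X.T @ X) @ X.T
y_hat = M @ y

is_idempotent = np.allclose(M @ M, M)        # M M = M
rank_M = np.linalg.matrix_rank(M, tol=1e-8)  # p + 1 when X has full column rank
trace_M = np.trace(M)                        # equals the rank for an idempotent matrix
print(is_idempotent, rank_M, trace_M)
```

Forming M explicitly is fine for small n; for large samples one would normally compute the fitted values as X b instead of building the n×n matrix.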
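The residuals are obtained as
\underline e = \underline y - \widehat{\underline y} = \underline y - \underline M \underline y = (\underline I - \underline M) \underline y,
where I - M has properties comparable to those of M.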
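A short numerical sketch of the residual computation, again assuming NumPy and made-up data. The name `R` for the matrix I - M is introduced here for illustration only.

```python
import numpy as np

# Hypothetical data: n = 5 observations, p = 2 regressors plus the column of ones.
X = np.array([
    [1.0, 2.0, 1.0],
    [1.0, 3.0, 5.0],
    [1.0, 5.0, 2.0],
    [1.0, 7.0, 4.0],
    [1.0, 8.0, 8.0],
])
y = np.array([3.1, 2.0, 9.2, 10.9, 9.1])
n = X.shape[0]

M = X @ np.linalg.inv(X.T @ X) @ X.T  # hat matrix
R = np.eye(n) - M                     # I - M (illustrative name)
e = R @ y                             # residual vector

r_is_idempotent = np.allclose(R @ R, R)                   # like M, I - M is idempotent
orthogonal_to_X = np.allclose(X.T @ e, 0.0, atol=1e-8)    # residuals are orthogonal to the columns of X
print(r_is_idempotent, orthogonal_to_X)
```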
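The forecast ŷ0 is obtained as
\widehat y_0 = \underline x_0^T \underline b = \underline x_0^T (\underline X^T \underline X)^{-1} \underline X^T \underline y,
where x0 = (1, x01, ..., x0p)^T includes the leading one for the constant term.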
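A forecast for a new tuple might be computed along these lines. The data and the tuple `x0` are made up, and NumPy is assumed; note that `x0` carries a leading 1 so that the constant term is included.

```python
import numpy as np

# Hypothetical data: n = 5 observations, p = 2 regressors plus the column of ones.
X = np.array([
    [1.0, 2.0, 1.0],
    [1.0, 3.0, 5.0],
    [1.0, 5.0, 2.0],
    [1.0, 7.0, 4.0],
    [1.0, 8.0, 8.0],
])
y = np.array([3.1, 2.0, 9.2, 10.9, 9.1])

# Least-squares coefficients from the normal equations.
b = np.linalg.solve(X.T @ X, X.T @ y)

# Hypothetical new tuple with regressor values 4 and 3, preceded by the constant 1.
x0 = np.array([1.0, 4.0, 3.0])
y0_hat = x0 @ b  # forecast as the inner product of x0 and b
print(y0_hat)
```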
Since X is fixed and given, all of these variables can be represented as linear transformations of y and thus of ε, so their expectation vectors and covariance matrices can be determined without difficulty.
The variance of the disturbance is estimated from the residuals, as the mean sum of squared residuals
s^2 = \frac{1}{n-p-1} \sum_{i=1}^{n} e_i^2.
The residual sum of squares RSS (from English "residual sum of squares") is, in matrix notation,
RSS = \underline e^T \underline e = \underline y^T (\underline I - \underline M)^T (\underline I - \underline M) \underline y = \underline y^T (\underline I - \underline M) \underline y.
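The two expressions for RSS can be compared numerically. The sketch assumes NumPy and made-up data; dividing by n - p - 1 to estimate the disturbance variance is the usual degrees-of-freedom correction and is stated here as an assumption.

```python
import numpy as np

# Hypothetical data: n = 5 observations, p = 2 regressors plus the column of ones.
X = np.array([
    [1.0, 2.0, 1.0],
    [1.0, 3.0, 5.0],
    [1.0, 5.0, 2.0],
    [1.0, 7.0, 4.0],
    [1.0, 8.0, 8.0],
])
y = np.array([3.1, 2.0, 9.2, 10.9, 9.1])
n, k = X.shape  # k = p + 1 columns, including the column of ones

M = X @ np.linalg.inv(X.T @ X) @ X.T  # hat matrix
e = y - M @ y                         # residuals

rss_direct = e @ e                         # e^T e
rss_quadratic = y @ (np.eye(n) - M) @ y    # y^T (I - M) y

# Variance estimate, assuming the divisor n - p - 1 = n - k.
s2 = rss_direct / (n - k)
print(rss_direct, rss_quadratic, s2)
```

Both routes give the same value up to rounding because I - M is symmetric and idempotent.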