Economy-point.org





Page modified: Friday, June 23, 2006 20:29:56

The classical linear regression model (KLR)

Multiple regression

The principles of probabilistic regression analysis are described here for the case of several explanatory variables, so-called multiple regression, because the formal relationships can be represented more elegantly this way.

There is a variable y that depends linearly on several fixed, given variables x in the form

y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p + \varepsilon \; ,

where ε again represents the disturbance term. ε is a random variable, and therefore y, as a linear transformation of ε, is likewise a random variable. For y and for each x_j (j = 1, …, p) there are n observations, so that for the observations i (i = 1, …, n) the system of equations

y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + \varepsilon_i

results. In the sampling-theoretic approach, each sample element ε_i is interpreted as a random variable of its own, and likewise each y_i.

Since this is a linear system of equations, its elements can be collected in matrix notation. One obtains the (n×1) column vectors of the dependent variable y and of the disturbance ε as random vectors, the ((p+1)×1) column vector of the regression coefficients,

\underline y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_i \\ \vdots \\ y_n \end{pmatrix} \; , \quad \underline \varepsilon = \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_i \\ \vdots \\ \varepsilon_n \end{pmatrix} \quad \text{and} \quad \underline \beta = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \vdots \\ \beta_j \\ \vdots \\ \beta_p \end{pmatrix} \; ,

and the (n×(p+1)) data matrix

\underline X = \begin{pmatrix} 1 & x_{11} & x_{12} & \cdots & x_{1j} & \cdots & x_{1p} \\ 1 & x_{21} & x_{22} & \cdots & x_{2j} & \cdots & x_{2p} \\ \vdots & & & & & & \vdots \\ 1 & x_{i1} & x_{i2} & \cdots & x_{ij} & \cdots & x_{ip} \\ \vdots & & & & & & \vdots \\ 1 & x_{n1} & x_{n2} & \cdots & x_{nj} & \cdots & x_{np} \end{pmatrix} .

The ones in the first column serve as placeholders for the constant term (the intercept). Such a "variable" is called a dummy variable.

The random vector ε is distributed with expectation vector E(ε) and covariance matrix Σ_ε. y is then distributed with expectation vector Xβ + E(ε) and covariance matrix Σ_ε.

The system of equations can now be written much more compactly as:

\underline y = \underline X \cdot \underline \beta + \underline \varepsilon
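The matrix form can be illustrated with a small simulation. This is only a sketch: the sample size, the coefficient values, the normal distribution of the disturbance and the random seed are assumptions made for the example, not part of the model description above.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility

n, p = 50, 2                         # hypothetical sample size and number of regressors
beta = np.array([1.0, 2.0, -0.5])    # hypothetical (beta_0, beta_1, beta_2)

x = rng.uniform(0.0, 10.0, size=(n, p))  # regressor values, treated as fixed afterwards
X = np.column_stack([np.ones(n), x])     # data matrix: leading column of ones, then x_1..x_p
eps = rng.normal(0.0, 1.0, size=n)       # disturbance vector (assumed normal for this sketch)

y = X @ beta + eps                       # y = X beta + eps

print(X.shape, y.shape)
```

The column of ones is what lets the constant term β_0 be handled like any other coefficient in the product Xβ.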

Assumptions of the classical linear regression model

For the regression estimates to be analyzed inferentially, certain assumptions must hold in the classical linear regression model:

  1. Concerning the disturbance ε_i
    1. The random vector ε is distributed with expectation vector 0 and covariance matrix Σ_ε = σ_ε² I.
    2. The random variables ε_i are stochastically independent.
  2. The data matrix X is fixed (non-random).
  3. The data matrix X has rank (p+1).
  • Under the first assumption, all ε_i have the same variance (homoscedasticity) and are pairwise uncorrelated. This can be read as saying that the disturbance carries no information and scatters purely at random. Consequently, y can be explained only by the information in X.
  • The second assumption holds X constant.
  • The third assumption is necessary for a unique solution of the regression problem.
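The rank condition in the third assumption can be checked numerically. The following is a minimal sketch; the helper name `has_full_column_rank` and the two small example matrices are made up for illustration.

```python
import numpy as np

def has_full_column_rank(X):
    """Return True if the n x (p+1) data matrix X has rank p+1 (hypothetical helper)."""
    return bool(np.linalg.matrix_rank(X) == X.shape[1])

ones = np.ones(4)
x1 = np.array([1.0, 2.0, 3.0, 4.0])
x2 = np.array([2.0, 1.0, 0.0, 5.0])

X_ok = np.column_stack([ones, x1, x2])
X_bad = np.column_stack([ones, x1, 2.0 * x1])  # third column is a multiple of the second

print(has_full_column_rank(X_ok))   # columns are linearly independent
print(has_full_column_rank(X_bad))  # exact collinearity: no unique solution
```

When the check fails, the matrix X^T X is singular, so the coefficients are not uniquely determined.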

Estimation of the regression coefficients

In the multiple linear regression model, too, the sum of squared residuals is minimized by the method of least squares. The vector of estimated regression coefficients is obtained as the solution (Gauss-Markov theorem)

\underline b = \begin{pmatrix} b_0 \\ b_1 \\ b_2 \\ \vdots \\ b_j \\ \vdots \\ b_p \end{pmatrix} = (\underline X^T \underline X)^{-1} \underline X^T \underline y .

This estimator is BLUE (Best Linear Unbiased Estimator), that is, the linear unbiased estimator with the smallest variance. These properties of the estimator b therefore do not require any information about the distribution of the disturbance.
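As a rough numerical illustration, the least-squares estimator can be computed with NumPy. The simulated data and coefficient values below are assumptions for the example. The sketch solves the normal equations with a linear solve rather than forming the inverse explicitly, which is a common numerical choice and gives the same b.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed for reproducibility
n = 40
X = np.column_stack([np.ones(n), rng.uniform(0, 5, size=(n, 2))])
beta = np.array([0.5, 1.5, -2.0])            # hypothetical true coefficients
y = X @ beta + rng.normal(0, 0.3, size=n)    # simulated observations

# Normal equations: (X^T X) b = X^T y
b = np.linalg.solve(X.T @ X, X.T @ y)

# Cross-check against NumPy's least-squares routine
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(b)
print(np.allclose(b, b_lstsq))
```

At the solution the residuals are orthogonal to every column of X, which is exactly what the normal equations state.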

With the least-squares estimator b, the estimated system of equations is

\underline y = \underline X \cdot \underline b + \underline e \; ,

where e is the vector of residuals.

The analysis is particularly concerned with the estimate \widehat y_0, or prediction, of the dependent variable y for a given tuple x_0. It is computed as

\widehat y_0 = b_0 + b_1 x_{01} + b_2 x_{02} + \cdots + b_p x_{0p} .

Selected estimators of the KLR

The estimated values \widehat y_i are computed as

\widehat{\underline y} = \underline{Xb} = \underline X (\underline X^T \underline X)^{-1} \underline X^T \underline y

which can be written more briefly as

\widehat{\underline y} = \underline M \underline y .

The (n×n) matrix M is idempotent and has rank at most p+1. It is also called the hat matrix, because it puts the "hat" on y.

The residuals are obtained as

\underline e = \underline y - \underline{Xb} = \underline y - \underline M \underline y = (\underline I - \underline M) \underline y ,

where I − M has properties comparable to those of M.
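The stated properties of M and I − M can be checked numerically. This is a sketch on randomly generated data (an assumption of the example); it verifies idempotence and symmetry, and uses the fact that the trace of an idempotent matrix equals its rank.

```python
import numpy as np

rng = np.random.default_rng(2)  # fixed seed for reproducibility
n, p = 12, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
y = rng.normal(size=n)

# Hat matrix M = X (X^T X)^{-1} X^T, formed via a linear solve
M = X @ np.linalg.solve(X.T @ X, X.T)
I = np.eye(n)

y_hat = M @ y        # fitted values
e = (I - M) @ y      # residuals

print(np.allclose(M @ M, M))                   # M is idempotent
print(np.allclose((I - M) @ (I - M), I - M))   # I - M is idempotent as well
print(np.trace(M))                             # trace equals the rank, here p + 1
```

Forming the full n×n matrix is fine for small examples. For large n one would normally compute the fitted values as Xb instead.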

The prediction \widehat y_0 is obtained as

\widehat y_0 = (1 ; x_{01} ; x_{02} ; \cdots ; x_{0p}) (\underline X^T \underline X)^{-1} \underline X^T \underline y .
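A small worked example may help here. The data below are invented and deliberately noise-free, so that the estimated coefficients are known in advance, and the point x_0 is an arbitrary choice. The sketch shows that the matrix expression and the scalar sum b_0 + b_1 x_01 + b_2 x_02 give the same prediction.

```python
import numpy as np

# Five hypothetical observations of two regressors
x = np.array([[0.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0],
              [1.0, 2.0],
              [2.0, 1.0]])
X = np.column_stack([np.ones(len(x)), x])
y = 1.0 + 2.0 * x[:, 0] + 3.0 * x[:, 1]   # noise-free, so b recovers (1, 2, 3)

b = np.linalg.solve(X.T @ X, X.T @ y)

x0 = np.array([1.0, 3.0, 4.0])            # (1, x_01, x_02); leading 1 for the constant term

y0_matrix = x0 @ np.linalg.solve(X.T @ X, X.T @ y)   # x_0 (X^T X)^{-1} X^T y
y0_scalar = b[0] + b[1] * x0[1] + b[2] * x0[2]       # b_0 + b_1 x_01 + b_2 x_02

print(round(float(y0_matrix), 6), round(float(y0_scalar), 6))  # both 19.0
```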

Since X is fixed, all of these quantities can be expressed as linear transformations of y and thus of ε. Their expectation vectors and covariance matrices can therefore be determined without difficulty.
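Taking the estimator b as an example, the argument can be sketched numerically. Writing b = Ay with A = (X^T X)^{-1} X^T, the model assumptions E(ε) = 0 and Σ_ε = σ_ε² I give E(b) = AXβ = β and Cov(b) = σ_ε² (X^T X)^{-1}. The data matrix and the value of σ_ε² below are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(3)  # fixed seed for reproducibility
n = 15
X = np.column_stack([np.ones(n), rng.uniform(0, 1, size=(n, 2))])
sigma2 = 0.25                    # hypothetical disturbance variance

XtX_inv = np.linalg.inv(X.T @ X)
A = XtX_inv @ X.T                # b = A y is a linear transformation of y

# Expectation: E(b) = A X beta = beta, because A X is the identity matrix
print(np.allclose(A @ X, np.eye(X.shape[1])))

# Covariance: Cov(b) = A (sigma^2 I) A^T = sigma^2 (X^T X)^{-1}
cov_b = A @ (sigma2 * np.eye(n)) @ A.T
print(np.allclose(cov_b, sigma2 * XtX_inv))
```

The same pattern applies to the fitted values My and the residuals (I − M)y, with M or I − M in place of A.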

The variance of the disturbance is estimated from the residuals, as the mean sum of squared residuals

s^2_\varepsilon = \widehat \sigma^2_\varepsilon = \frac{\sum_{i=1}^n e_i^2}{n - (p+1)} \; .

The sum of squared residuals, RSS (residual sum of squares), is given in matrix notation by

RSS = \underline e^T \underline e = \underline y^T (\underline I - \underline M)^T (\underline I - \underline M) \underline y = \underline y^T (\underline I - \underline M) \underline y .
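The two ways of writing RSS, and the variance estimate built on it, can be compared in a short sketch. The simulated data and coefficient values are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(4)  # fixed seed for reproducibility
n, p = 30, 2
X = np.column_stack([np.ones(n), rng.uniform(0, 10, size=(n, p))])
y = X @ np.array([2.0, 0.5, -1.0]) + rng.normal(0, 1.5, size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b                                # residual vector

rss = e @ e                                  # e^T e
M = X @ np.linalg.solve(X.T @ X, X.T)        # hat matrix
rss_quadratic_form = y @ (np.eye(n) - M) @ y # y^T (I - M) y

s2 = rss / (n - (p + 1))                     # estimated disturbance variance

print(np.isclose(rss, rss_quadratic_form))
print(s2)
```

The divisor n − (p+1) is the number of observations minus the number of estimated coefficients.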

