g_wlsq(G;S;Y;W;XX)

Returns a model corresponding to the weighted multiple least squares regression of one or more independent variables against a given dependent variable.

Function type

Vector only

Syntax

g_wlsq(G;S;Y;W;XX)

Input

Argument Type Description
G any A space- or comma-separated list of column names

Rows are in the same group if their values for all of the columns listed in G are the same.

If G is omitted, all rows are considered to be in the same group.

If any of the columns listed in G contain N/A, the N/A value is considered a valid grouping value.

S integer The name of a column in which every row evaluates to 1 or 0, indicating whether that row is included in the calculation

If S is omitted, all rows will be considered by the function (subject to any prior row selections).

If any of the values in S are neither 1 nor 0, an error is returned.

Y integer or decimal A column name denoting the dependent variable
W integer or decimal A column name denoting the weights of the dependent variable
XX integer or decimal A space- or comma-separated list of column names denoting the independent variable(s)

XX may also include the special value 1 for the constant (intercept) term in the linear model.
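The grouping and selection rules described for G and S above can be sketched outside the platform. The following is a minimal Python illustration, not the function's actual implementation: the helper names `selected` and `group_rows`, the sample rows, and the use of `None` to stand in for N/A are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical sample rows; None stands in for N/A.
rows = [
    {"store": "A", "region": None, "sel": 1},
    {"store": "A", "region": None, "sel": 1},
    {"store": "A", "region": "east", "sel": 0},
    {"store": "B", "region": "east", "sel": 1},
]

def selected(rows, select_col=None):
    """Keep rows whose select_col is 1; with no select_col, keep every row."""
    if select_col is None:
        return list(rows)
    for row in rows:
        # Any value other than 1 or 0 is treated as an error, as described for S.
        if row[select_col] not in (0, 1):
            raise ValueError("selection column must contain only 1 or 0")
    return [row for row in rows if row[select_col] == 1]

def group_rows(rows, group_cols):
    """Bucket rows by the tuple of their values in group_cols.

    An empty group_cols list puts every row in a single group, and None (N/A)
    is kept as an ordinary key value rather than being dropped.
    """
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[c] for c in group_cols)
        groups[key].append(row)
    return dict(groups)

groups = group_rows(selected(rows, "sel"), ["store", "region"])
one_group = group_rows(selected(rows), [])
```

Here the two rows with store "A" and an N/A region form their own group, and omitting both the selection column and the grouping columns leaves all four rows in one group.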

Return Value

For every row in each group defined by G (and for those rows where S=1, if specified), g_wlsq computes a weighted multiple least squares regression for the independent variable(s) XX against the dependent variable Y and returns a special type representing a model for each group in the data. The values in Y are weighted by the values in the column listed in W.

Note: W may also be the constant value 1, in which case the result of g_wlsq() is identical to that of g_lsq().
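
To make the role of the weights concrete, here is a minimal pure-Python sketch of a weighted least squares fit for a single group. It solves the textbook normal equations and is not a description of how g_wlsq is implemented internally; the names `solve` and `wlsq` and the sample data are assumptions made for the example. Each row of `xs` starts with 1, which plays the role of the constant term in XX.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations apply to both sides at once.
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def wlsq(xs, ys, ws):
    """Coefficients b minimizing sum_i w_i * (y_i - x_i . b)^2."""
    p = len(xs[0])
    xtwx = [[sum(w * x[j] * x[k] for x, w in zip(xs, ws)) for k in range(p)]
            for j in range(p)]
    xtwy = [sum(w * x[j] * y for x, y, w in zip(xs, ys, ws)) for j in range(p)]
    return solve(xtwx, xtwy)

# Hypothetical data for one group: columns are (constant, x).
xs = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
ys = [1.0, 3.0, 5.0, 10.0]

# All weights equal to 1: the fit reduces to ordinary least squares.
unweighted = wlsq(xs, ys, [1.0, 1.0, 1.0, 1.0])
# Weight 0 on the last point: the fit follows the first three points only.
downweighted = wlsq(xs, ys, [1.0, 1.0, 1.0, 0.0])
```

With unit weights the sketch gives the same coefficients an unweighted fit would, which mirrors the note above about W being the constant 1. Giving the last observation zero weight moves the fitted line onto the first three points, which lie exactly on y = 1 + 2x.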

The model that g_wlsq returns can be used as an argument to:
  • param(M;P;I) to extract the regression model parameters, or
  • score(XX;M;Z) to score data points according to the regression model

Note: g_wlsq may be much slower if there is significant multicollinearity in the data (i.e., if two or more of the independent variables XX are nearly perfectly correlated with each other).

Assuming M is the column containing the result of g_wlsq, use the following function calls to obtain the desired information:
param(M;'b';N)
Nth coefficient of the model (corresponding to the Nth data column in XX)
param(M;'p';N)
p-value associated with the Nth coefficient of the model
param(M;'g';N)
Nth diagonal value of (X^T X)^{-1}, where X is the matrix of input values
param(M;'valcnt';)
Count of valid observations (those where XX and Y are all non-N/A) in the data
param(M;'ybar';)
Mean of the valid dependent variable observations Y in the data
param(M;'chi2';)
Residual sum of squares
param(M;'df';)
Degrees of freedom of the model
param(M;'r2';)
Coefficient of determination (R^2) for the model
param(M;'adjr2';)
Adjusted R^2 for the model
score(XX;M;)
Predicted Y for data points XX according to the model
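
As a rough illustration of what several of these quantities mean, the sketch below computes predictions, the residual sum of squares, R^2, and adjusted R^2 from a set of coefficients. It uses the common unweighted textbook formulas for a model with an intercept; whether g_wlsq applies the weights W to these statistics, and exactly how it defines degrees of freedom, is not stated here, so treat the formulas as assumptions. The function names and data are hypothetical.

```python
def predict(coeffs, x_row):
    """Predicted y for one data point: the dot product of coefficients and inputs."""
    return sum(b * x for b, x in zip(coeffs, x_row))

def fit_statistics(coeffs, xs, ys):
    """Textbook (unweighted) residual sum of squares, R^2, and adjusted R^2."""
    n = len(ys)
    p = len(coeffs)  # number of fitted coefficients, including the constant term
    ybar = sum(ys) / n
    rss = sum((y - predict(coeffs, x)) ** 2 for x, y in zip(xs, ys))
    tss = sum((y - ybar) ** 2 for y in ys)
    r2 = 1.0 - rss / tss
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p)
    return {"ybar": ybar, "rss": rss, "r2": r2, "adjr2": adj_r2}

# Hypothetical data and coefficients (constant term first, then slope).
xs = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
ys = [1.0, 3.0, 5.0, 10.0]
coeffs = [0.4, 2.9]

stats = fit_statistics(coeffs, xs, ys)
new_point = predict(coeffs, [1.0, 4.0])
```

The `predict` helper corresponds loosely to what score returns for each row, and the dictionary keys are named after the matching param keywords only for readability.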