g_glm(G;S;Y;XX;Z)
Returns a generalized linear model corresponding to the regression of a dependent variable with one or more independent variables. (Available as of prod-9)
Function type
Vector only
Syntax
g_glm(G;S;Y;XX;Z)
Input
Argument | Type | Description |
---|---|---|
G | any | A space- or comma-separated list of column names. Rows are in the same group if their values for all of the columns listed in G are the same. |
S | integer | The name of a column in which every row evaluates to a 1 or 0, which determines whether or not that row is selected to be included in the calculation. |
Y | integer or decimal | A column name denoting the dependent variable. |
XX | integer or decimal | A space- or comma-separated list of column names denoting the independent variable(s). If you would like to include an intercept term, the first element of XX should be the constant 1. |
Z | text | A list of key-value pairs that provide control over the model fitting. For example: 'fit_type' 'logistic' 'eps' '1e-6' |
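As a rough illustration of how the five arguments line up, the sketch below reuses the Iris table from the example later on this page. The selector column use_row and its threshold are hypothetical and serve only to show an S argument in place. The 'eps' value is copied from the sample option string in the table above; its exact effect is not described on this page.

```xml
<base table="pub.demo.mleg.uci.iris"/>
<note>Y: 1 for Iris-virginica rows, 0 otherwise</note>
<willbe name="response" value="class='Iris-virginica'"/>
<note>S (hypothetical): 1 where the row should take part in the fit, 0 otherwise</note>
<willbe name="use_row" value="sepal_length>5"/>
<note>G is left empty, so all rows form one group; XX starts with 1 to request an intercept</note>
<willbe name="model" value="g_glm(;use_row;response;1,petal_length,petal_width;'fit_type' 'logistic' 'eps' '1e-6')"/>
```

Leaving both G and S empty, as in g_glm(;;response;...), fits one model over every row of the table.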
Return Value
For every row in each group defined by G (and for those rows where S=1, if specified), g_glm computes a regression of a dependent variable Y with one or more independent variables XX and returns a generalized linear model for each group in the data.
g_glm may be much slower if there is significant multicollinearity in the data (i.e., if two or more of the independent variables XX are nearly perfectly correlated with each other).
The model that g_glm returns can be used as an argument to the following functions:
- param(M;P;I) to extract the numerical value of a regression model parameter
- cparam(M;P;I) to extract the text value of a regression model parameter
If M is the column containing the result of g_glm, use the following function calls to obtain the desired information:
param(M;'b';N)
- Nth coefficient of the model (corresponding to the Nth data column in XX)
param(M;'se';N)
- Standard error of the Nth coefficient (computed with the Fisher information)
param(M;'tv';N)
- Set of standardized beta values with mean of 0
param(M;'pv';N)
- Set of p-values
If the fit_type is logistic, the normal CDF is used to compute p-values; otherwise, Student’s t CDF is used.
param(M;'dev';)
- Deviance at last iteration
param(M;'delta';)
- Percent change of the deviance from one iteration to the next
((dev - dev0) / (0.1 + dev0))
param(M;'fisher_iterations';)
- Number of iterations taken by
g_glm
cparam(M;'fit_type';)
- Type of model generated by
g_glm
cparam(M;'convergence';)
- Boolean value indicating if there was convergence in the model
Returns true when the change in deviance is less than or equal to the convergence epsilon:
(delta <= convEPS)
Otherwise, returns false.
param(M;'df';)
- Degrees of freedom
param(M;'convEPS';)
- Convergence epsilon
param(M;'valcnt';)
- Number of observations in the model
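The calls listed above can be combined in a short sketch, shown here under a few assumptions. The column names (m, b_petal_length, deg_freedom, and so on) are hypothetical. The final column recomputes the documented convergence rule by hand from 'delta' and 'convEPS', and is not an additional built-in.

```xml
<base table="pub.demo.mleg.uci.iris"/>
<willbe name="response" value="class='Iris-virginica'"/>
<willbe name="m" value="g_glm(;;response;1,petal_length,petal_width;'fit_type' 'logistic')"/>
<note>Per-coefficient values: index 2 is the second entry of XX (petal_length)</note>
<willbe name="b_petal_length" value="param(m;'b';2)"/>
<willbe name="p_petal_length" value="param(m;'pv';2)"/>
<note>Model-level values: the third argument is left empty</note>
<willbe name="deg_freedom" value="param(m;'df';)"/>
<willbe name="conv_eps" value="param(m;'convEPS';)"/>
<willbe name="n_obs" value="param(m;'valcnt';)"/>
<note>Hand-written version of the documented test: delta less than or equal to convEPS</note>
<willbe name="delta_ok" value="param(m;'delta';)&lt;=param(m;'convEPS';)"/>
```

Per-coefficient parameters ('b', 'se', 'tv', 'pv') take an index N as the third argument, while model-level parameters leave it empty. The delta_ok column can be compared with cparam(m;'convergence';) as a sanity check.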
Example
The following example uses g_glm(G;S;Y;XX;Z) to fit a logistic regression model on Fisher's Iris data set (pub.demo.mleg.uci.iris) for the Iris-virginica class and outputs the related model parameters.
<base table="pub.demo.mleg.uci.iris"/>
<willbe name="response" value="class='Iris-virginica'"/>
<willbe name="glm_results" value="g_glm(;;response;1,sepal_length,sepal_width,petal_length,petal_width;'fit_type' 'logistic')"/>
<willbe name="beta_bias" value="param(glm_results;'b';1)"/>
<willbe name="beta_1" value="param(glm_results;'b';2)"/>
<willbe name="beta_2" value="param(glm_results;'b';3)"/>
<willbe name="beta_3" value="param(glm_results;'b';4)"/>
<willbe name="beta_4" value="param(glm_results;'b';5)"/>
<note>Standard error</note>
<willbe name="se_bias" value="param(glm_results;'se';1)"/>
<willbe name="se_1" value="param(glm_results;'se';2)"/>
<willbe name="se_2" value="param(glm_results;'se';3)"/>
<willbe name="se_3" value="param(glm_results;'se';4)"/>
<willbe name="se_4" value="param(glm_results;'se';5)"/>
<note>t</note>
<willbe name="t_bias" value="param(glm_results;'tv';1)"/>
<willbe name="t_1" value="param(glm_results;'tv';2)"/>
<willbe name="t_2" value="param(glm_results;'tv';3)"/>
<willbe name="t_3" value="param(glm_results;'tv';4)"/>
<willbe name="t_4" value="param(glm_results;'tv';5)"/>
<note>p-values</note>
<willbe name="p_bias" value="param(glm_results;'pv';1)"/>
<willbe name="p_1" value="param(glm_results;'pv';2)"/>
<willbe name="p_2" value="param(glm_results;'pv';3)"/>
<willbe name="p_3" value="param(glm_results;'pv';4)"/>
<willbe name="p_4" value="param(glm_results;'pv';5)"/>
<note>Residual Deviance</note>
<willbe name="dev" value="param(glm_results;'dev';)"/>
<note>Delta</note>
<willbe name="delta" value="param(glm_results;'delta';)"/>
<note>Fisher iterations</note>
<willbe name="fisher_iterations" value="param(glm_results;'fisher_iterations';)"/>
<note>Convergence</note>
<willbe name="convergence" value="cparam(glm_results;'convergence';)"/>
<willbe name="fit_type" value="cparam(glm_results;'fit_type';)"/>