g_liblinear(G;S;Y;XX;Z)
Returns a model of a given data set using one of the 10 supported underlying algorithms, which include logistic regression as well as both support vector classification and regression. (Available as of prod-9)
Function type
Vector only
Syntax
g_liblinear(G;S;Y;XX;Z)
Input
Argument | Type | Description
---|---|---
G | any | A space- or comma-separated list of column names. Rows are in the same group if their values for all of the columns listed in G are the same. If ... If any of the columns listed in G ...
S | integer | The name of a column in which every row evaluates to a 1 or 0, which determines whether or not that row is selected to be included in the calculation. If ... If any of the values in S ...
Y | integer or decimal | A column name denoting the dependent variable. For a classification or logistic regression analysis, this is a column of labels. For a support vector regression analysis, this is a column of continuous data.
XX | integer or decimal | A space- or comma-separated list of column names denoting the independent variable(s). If you wish to include a bias, the first element of XX ...
Z | text and decimal | A list of key-value pairs that modify the underlying method. For example: 'solver_type' 'L1R_L2LOSS_SVC'. The options you may specify for the ...
Return Value
For every row in each group defined by G (and for those rows where S=1, if specified), g_liblinear computes a fast, large-scale classification or regression, according to the method specified by the Z parameter.
g_liblinear supports 10 different types of underlying algorithms, including three types of logistic regression, four types of support vector classification, and three types of support vector regression.
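
The split into three, four, and three algorithms can be checked against the solver type names that this page lists as valid for score(XX;M;Z) and classify(XX;M;Z). The Python sketch below is only an illustration outside of 1010data: the dictionary `SOLVER_FAMILIES` and the helper `downstream_function` are hypothetical names, and the grouping into families is inferred from the LR, SVC, and SVR suffixes of the solver names.

```python
# Solver type names as listed on this page for score() and classify().
# The family grouping is inferred from the LR / SVC / SVR suffix of each
# name; this dictionary is illustrative and not part of 1010data.
SOLVER_FAMILIES = {
    "logistic regression": ["L2R_LR", "L1R_LR", "L2R_LR_DUAL"],
    "support vector classification": [
        "L2R_L2LOSS_SVC_DUAL",
        "L2R_L2LOSS_SVC",
        "L2R_L1LOSS_SVC_DUAL",
        "L1R_L2LOSS_SVC",
    ],
    "support vector regression": [
        "L2R_L2LOSS_SVR",
        "L2R_L2LOSS_SVR_DUAL",
        "L2R_L1LOSS_SVR_DUAL",
    ],
}


def downstream_function(solver_type: str) -> str:
    """Return which function this page pairs with the solver type.

    Support vector classification solvers go with classify(); logistic
    regression and support vector regression solvers go with score().
    """
    for family, names in SOLVER_FAMILIES.items():
        if solver_type in names:
            if family == "support vector classification":
                return "classify"
            return "score"
    raise ValueError(f"unknown solver type: {solver_type!r}")
```

For instance, `downstream_function("L1R_L2LOSS_SVC")` returns `"classify"`, which matches the solver used in the example at the end of this page.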
g_liblinear may be much slower if there is significant multicollinearity in the data (i.e., if two or more of the independent variables XX are nearly perfectly correlated with each other).
The model that g_liblinear returns can be used as an argument to the following functions:
score(XX;M;Z)
- Score data points when g_liblinear is used to train a continuous regression model or a logistic regression.
  Valid solver types include:
  - L2R_LR
  - L1R_LR
  - L2R_LR_DUAL
  - L2R_L2LOSS_SVR
  - L2R_L2LOSS_SVR_DUAL
  - L2R_L1LOSS_SVR_DUAL
  Note: Do not include the bias (intercept) in the XX parameter to score(XX;M;Z), even if it was specified in the XX parameter to g_liblinear(G;S;Y;XX;Z).
classify(XX;M;Z)
- Classify data points when g_liblinear is used to train a discrete model.
  Valid solver types include:
  - L2R_L2LOSS_SVC_DUAL
  - L2R_L2LOSS_SVC
  - L2R_L1LOSS_SVC_DUAL
  - L1R_L2LOSS_SVC
param(M;P;I)
- Extract the model parameters.
  If M is the column containing the result of g_liblinear, use the following function calls to obtain the desired information:
  param(M;'solver_type';)
  - Algorithm used to train the model:
    Value | Algorithm
    ---|---
    0 | L2-regularized logistic regression (primal)
    1 | L2-regularized L2-loss support vector classification (dual)
    2 | L2-regularized L2-loss support vector classification (primal)
    3 | L2-regularized L1-loss support vector classification (dual)
    5 | L1-regularized L2-loss support vector classification
    6 | L1-regularized logistic regression
    7 | L2-regularized logistic regression (dual)
    11 | L2-regularized L2-loss support vector regression (primal)
    12 | L2-regularized L2-loss support vector regression (dual)
    13 | L2-regularized L1-loss support vector regression (dual)
  param(M;'violation_cost';)
  - Penalty factor
  param(M;'sensitivity';)
  - Sensitivity
  param(M;'stopping_crit';)
  - Stopping criterion
  param(M;'nr_weight';)
  - Number of weight multipliers assigned
  param(M;'weight_label';N)
  - Label of the weight multiplier
  param(M;'weight';N)
  - Weight multiplier
  param(M;'nr_class';)
  - Number of classes
  param(M;'nr_feature';)
  - Number of features
  param(M;'model_weights';N)
  - Array of all the model weights
    The array will have length nr_class * nr_feature (including the bias feature, if the bias is positive). If there are two classes, however, the array will have length nr_feature (including the bias feature).
  param(M;'labels';N)
  - Expressed list of all the different class types
  param(M;'bias';)
  - Intercept value
    If no bias was specified, the result is -1.
  param(M;'max_iter';)
  - Maximum number of iterations reached
    The result is 1 if the maximum number of iterations was reached, 0 otherwise.
  param(M;'support_vectors';N)
  - Number of support vectors
    If solver_type is 1 or 3, the result is the number of support vectors at each subproblem. If solver_type is 7, the result is a list of zeros. If solver_type is 0 or 2, the result is a list of N/A values. If solver_type is 11, 12, or 13, the results are meaningless.
  param(M;'num_iterations';N)
  - Number of iterations
    If solver_type is 1, 3, 5, 6, or 7, the result is the number of iterations at each subproblem. If solver_type is 0 or 2, the result is N/A.
  param(M;'valcnt';)
  - Count of valid observations in the data
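
To make the numeric results of param(M;'solver_type';) and the length rule for param(M;'model_weights';N) easier to work with, the following Python sketch restates them as a lookup table and a small function. It is an illustration outside of 1010data, not part of the product. The names `SOLVER_CODES`, `solver_name`, and `expected_weights_length` are hypothetical, and the pairing of each numeric value with a solver type name is inferred by matching the algorithm descriptions above to the solver type names listed for score and classify.

```python
# Numeric solver_type values paired with solver type names.
# Assumption: the pairing is inferred from the algorithm descriptions on
# this page (for example, "L1-regularized L2-loss support vector
# classification" is taken to be L1R_L2LOSS_SVC).
SOLVER_CODES = {
    0: "L2R_LR",
    1: "L2R_L2LOSS_SVC_DUAL",
    2: "L2R_L2LOSS_SVC",
    3: "L2R_L1LOSS_SVC_DUAL",
    5: "L1R_L2LOSS_SVC",
    6: "L1R_LR",
    7: "L2R_LR_DUAL",
    11: "L2R_L2LOSS_SVR",
    12: "L2R_L2LOSS_SVR_DUAL",
    13: "L2R_L1LOSS_SVR_DUAL",
}


def solver_name(code: int) -> str:
    """Translate a numeric solver_type value into its solver type name."""
    if code not in SOLVER_CODES:
        raise ValueError(f"solver_type {code} is not listed on this page")
    return SOLVER_CODES[code]


def expected_weights_length(nr_class: int, nr_feature: int) -> int:
    """Length of the model_weights array as described above.

    nr_feature is counted the way the text counts it, that is, including
    the bias feature when a bias is present.
    """
    if nr_class == 2:
        # Two-class models carry a single weight per feature.
        return nr_feature
    return nr_class * nr_feature
```

Under these assumptions, a two-class model with 21 features (bias included) would have 21 model weights, while a three-class model with the same features would have 63.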
Example
The following example uses g_liblinear(G;S;Y;XX;Z) to train a model using support vector classification to determine whether or not a client at a particular banking institution subscribed to a term deposit. This example uses the information in the Bank Marketing data set (pub.demo.mleg.uci.bankmarketing).
    <base table="pub.demo.mleg.uci.bankmarketing"/>
    <library>
      <block name="enum" fields="job,marital,education,default,housing,loan,contact,month,day_of_week,poutcome">
        <foreach enum="{@fields}">
          <willbe name="{@enum}_enum" value="g_enum(;;;{@enum})"/>
        </foreach>
      </block>
    </library>
    <insert block="enum"/>
    <willbe name="label" value="g_enum(;;;y)"/>
    <willbe name="model" value="g_liblinear(;;label;1 job_enum marital_enum education_enum default_enum housing_enum loan_enum contact_enum month_enum day_of_week_enum poutcome_enum age duration campaign pdays previous empvarrate conspriceidx consconfidx euribor3m nremployed; 'solver_type' 'L1R_L2LOSS_SVC')"/>
    <willbe name="predictions" value="classify(job_enum marital_enum education_enum default_enum housing_enum loan_enum contact_enum month_enum day_of_week_enum poutcome_enum age duration campaign pdays previous empvarrate conspriceidx consconfidx euribor3m nremployed;model;)"/>
    <willbe name="score" value="if(predictions=label;1;0)"/>
    <willbe name="sum" value="g_sum(;;score)"/>
    <willbe name="total" value="g_cnt(;)"/>
    <willbe name="accuracy" value="(sum/total)*100"/>
    <sel value="g_first1(;;)"/>
    <colord cols="model,sum,total,accuracy"/>
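
The last few columns of the query compute a simple accuracy figure: each row scores 1 when the prediction equals the label, the scores are summed, and the sum is divided by the row count and multiplied by 100. The Python sketch below mirrors only that arithmetic on a small made-up list, as an illustration outside of 1010data. The function name `accuracy_percent` and the sample values are hypothetical and are not results from the Bank Marketing data set.

```python
def accuracy_percent(predictions, labels):
    """Percentage of rows where the prediction equals the label.

    Mirrors the query's score, sum, total, and accuracy columns.
    """
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("at least one row is required")
    # score: 1 where the prediction matches the label, 0 otherwise
    scores = [1 if p == y else 0 for p, y in zip(predictions, labels)]
    # accuracy: (sum / total) * 100
    return sum(scores) / len(labels) * 100


# Made-up sample: three of four predictions match their labels.
print(accuracy_percent([1, 2, 2, 1], [1, 2, 1, 1]))  # prints 75.0
```

Note that this measures accuracy on the same rows that were used for training, as the query above does.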
Additional Information
Details about the liblinear library provided by the Machine Learning Group at National Taiwan University can be found at: LIBLINEAR -- A Library for Large Linear Classification.