g_auroc(G;S;X;Y)

Returns a model object with four different representations of the Area Under the Receiver Operator Characteristic (AUROC) as well as the perfect model value. (Available as of version 10.42)

Function type

Vector only

Description

g_auroc(G;S;X;Y) results may be used to measure model accuracy for models with a binary dependent variable (target). Values range from 0 to 1 (from -1 to 1 if an adjusted value is considered), with larger values signifying better results. A value of 0.5 (0 for an adjusted value) means that the predictions are no better than random.
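The meaning of these values can be illustrated outside 1010data with a minimal Python sketch. It is not the g_auroc implementation. The helper name auroc is hypothetical, and the sketch relies only on the standard interpretation of the c-statistic: the fraction of (Y=1, Y=0) row pairs in which the Y=1 row has the higher score, with tied pairs counted as half.

```python
from itertools import product

def auroc(scores, labels):
    """Fraction of (positive, negative) pairs in which the positive row
    has the higher score; tied pairs count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one row with Y=1 and one with Y=0")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))

labels = [0, 0, 1, 1]
perfect_scores = [0.1, 0.2, 0.8, 0.9]   # every 1 outscores every 0
constant_scores = [0.5, 0.5, 0.5, 0.5]  # score carries no information
reversed_scores = [0.9, 0.8, 0.2, 0.1]  # every 0 outscores every 1

print(auroc(perfect_scores, labels))   # 1.0
print(auroc(constant_scores, labels))  # 0.5
print(auroc(reversed_scores, labels))  # 0.0
```

The pairwise loop is quadratic in the number of rows, so it is suitable only for small illustrations.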

Syntax

g_auroc(G;S;X;Y)

Input

Argument Type Description
G any A space- or comma-separated list of column names

Rows are in the same group if their values for all of the columns listed in G are the same.

If G is omitted, all rows are considered to be in the same group.

If any of the columns listed in G contain N/A, the N/A value is considered a valid grouping value.

S integer The name of a column in which every row evaluates to a 1 or 0, which determines whether or not that row is selected to be included in the calculation

If S is omitted, all rows will be considered by the function (subject to any prior row selections).

If any of the values in S are neither 1 nor 0, an error is returned.

X any numeric type A column name

This column contains the results of scoring a particular model.

Y any numeric type A column name

This column must only contain values of 0 or 1.
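
The grouping and selection rules above can be sketched in Python as follows. This is only an illustration of the described semantics, not 1010data code. The names grouped_auroc and pair_auroc, the row dictionaries, and the column names region and keep are all hypothetical. N/A is modeled as None, and returning None for a group that lacks either class is a placeholder choice, since that case is not described here.

```python
from collections import defaultdict

def pair_auroc(scores, labels):
    """Share of (Y=1, Y=0) pairs ranked correctly; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        return None  # placeholder: behavior for one-class groups is assumed
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def grouped_auroc(rows, group_cols, select_col, score_col, label_col):
    groups = defaultdict(list)
    for row in rows:
        if select_col is not None:
            if row[select_col] not in (0, 1):
                raise ValueError("S must contain only 0 or 1")
            if row[select_col] == 0:
                continue  # row is not selected
        if row[label_col] not in (0, 1):
            raise ValueError("Y must contain only 0 or 1")
        # Empty tuple when G is omitted; None (N/A) is a valid key value.
        key = tuple(row[c] for c in group_cols)
        groups[key].append(row)
    return {key: pair_auroc([r[score_col] for r in g],
                            [r[label_col] for r in g])
            for key, g in groups.items()}

rows = [
    {"region": "east", "keep": 1, "score": 0.90, "y": 1},
    {"region": "east", "keep": 1, "score": 0.20, "y": 0},
    {"region": "east", "keep": 0, "score": 0.95, "y": 0},  # excluded by S
    {"region": None,   "keep": 1, "score": 0.30, "y": 1},
    {"region": None,   "keep": 1, "score": 0.60, "y": 0},
]
by_region = grouped_auroc(rows, ["region"], "keep", "score", "y")
overall = grouped_auroc(rows, [], None, "score", "y")
```

Here by_region has one entry for the east group and one for the N/A group, while overall has a single entry keyed by the empty tuple because no grouping columns are given.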

Return Value

For every row in each group defined by G (and for those rows where S=1, if specified), g_auroc(G;S;X;Y) returns a model object with five AUROC values. Values may be extracted from the model object with pkg_get(P;K) or param(M;P;I).

Assuming M is the column containing the result of g_auroc(G;S;X;Y), use the following function calls to obtain the desired information:
pkg_get(M;'auroc')
auroc is the traditional Area Under the Receiver Operator Characteristic curve, which takes values between 0 and 1. It is sometimes called the "c-statistic".
pkg_get(M;'aurocp')
aurocp is the area under the Cumulative Gains chart.
pkg_get(M;'perfect')
perfect is the theoretical maximum value of aurocp given the data in column Y. This theoretical maximum would be achieved by a model for which every row with Y=1 has a higher score than any row with Y=0 (i.e., the model score perfectly separates 1s from 0s).
pkg_get(M;'aurocp_adj')
aurocp_adj is defined so that a perfect model has an aurocp_adj of 1 and a random model with no predictive power has an aurocp_adj of 0. It is computed by subtracting 0.5 from aurocp and dividing by the aurocp of a perfect model.
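
The relationship between aurocp, perfect, and aurocp_adj can be illustrated outside 1010data with a minimal Python sketch. It is not the g_auroc implementation. The function names gains_area, perfect_area, and adjusted are hypothetical, and ties in scores are not handled specially. The denominator used for the adjustment (the perfect area minus 0.5) is an assumption, chosen so that a perfect model maps to exactly 1 and a random model to 0.

```python
def gains_area(scores, labels):
    """Area under the cumulative gains curve. Rows are taken in descending
    score order; x is the fraction of rows taken, y is the fraction of all
    Y=1 rows captured so far. Assumes distinct scores."""
    n, n_pos = len(labels), sum(labels)
    if n_pos == 0 or n_pos == n:
        raise ValueError("labels must contain both 0s and 1s")
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    area, captured = 0.0, 0
    for i in order:
        prev = captured
        captured += labels[i]
        # Trapezoid of width 1/n between consecutive points on the curve.
        area += (prev + captured) / (2 * n_pos * n)
    return area

def perfect_area(labels):
    """Gains area when every Y=1 row is ranked ahead of every Y=0 row.
    With p the fraction of rows where Y=1, this equals 1 - p/2."""
    return 1 - sum(labels) / (2 * len(labels))

def adjusted(scores, labels):
    """Assumed normalization: 0 for a random model, 1 for a perfect one."""
    return (gains_area(scores, labels) - 0.5) / (perfect_area(labels) - 0.5)

labels = [0, 0, 1, 1]
print(gains_area([0.1, 0.2, 0.8, 0.9], labels))  # 0.75
print(perfect_area(labels))                      # 0.75
print(adjusted([0.1, 0.2, 0.8, 0.9], labels))    # 1.0
print(adjusted([0.9, 0.8, 0.2, 0.1], labels))    # -1.0
```

In this sketch, a model that ranks every Y=0 row ahead of every Y=1 row produces -1, which matches the -1 to 1 range given for adjusted values.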

Example

The following example uses g_info_iv(G;S;X;Y) and g_info_woe(G;S;X;Y) to calculate the information value (IV) and the weight of evidence (WoE) for the columns job, marital, education, default, housing, and loan in the table pub.demo.mleg.uci.bankmarketing. The columns that have an IV greater than 0.02 are then passed to g_logreg(G;S;Y;XX;Z) and score(XX;M;Z) by way of their corresponding WoE columns.

g_auroc(G;S;X;Y) uses the results in the score column to measure the accuracy of the model created by g_logreg(G;S;Y;XX;Z) and returns its results in a model object. pkg_get(P;K) is then used to extract from that model object the four different representations of the Area Under the Receiver Operator Characteristic (AUROC) as well as the perfect model value.

<base table="pub.demo.mleg.uci.bankmarketing"/>
<willbe name="y01" value="y='yes'"/>
<foreach var="job,marital,education,default,housing,loan">
  <willbe name="iv_{@var}" value="g_info_iv(;;{@var};y01)"/>
  <willbe name="iw_{@var}" value="g_info_woe(;;{@var};y01)"/>
</foreach>
<colord cols="y01,iv_*"/>
<note>For this example, only those columns with an IV greater
than 0.02 are specified to g_logreg and score.</note>
<willbe name="model" 
 value="g_logreg(;;y01;1,iw_job,iw_marital,iw_education,iw_default;)"/>
<willbe name="score" 
 value="score(1,iw_job,iw_marital,iw_education,iw_default;model;)"/>
<willbe name="m_auroc" value="g_auroc(;;score;y01)"/>
<willbe name="auroc" 
 value="pkg_get(m_auroc;'auroc')" format="dec:5"/>
<willbe name="aurocp" 
 value="pkg_get(m_auroc;'aurocp')" format="dec:5"/>
<willbe name="aurocp_adj" 
 value="pkg_get(m_auroc;'aurocp_adj')" format="dec:5"/>
<willbe name="perfect" 
 value="pkg_get(m_auroc;'perfect')" format="dec:5"/>
<colord cols="auroc,aurocp,aurocp_adj,perfect"/>