<mets:mets OBJID="eprint_4935" LABEL="Eprints Item" xsi:schemaLocation="http://www.loc.gov/METS/ http://www.loc.gov/standards/mets/mets.xsd http://www.loc.gov/mods/v3 http://www.loc.gov/standards/mods/v3/mods-3-3.xsd" xmlns:mets="http://www.loc.gov/METS/" xmlns:mods="http://www.loc.gov/mods/v3" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><mets:metsHdr CREATEDATE="2024-01-01T22:34:48Z"><mets:agent ROLE="CUSTODIAN" TYPE="ORGANIZATION"><mets:name>IIASA Repository</mets:name></mets:agent></mets:metsHdr><mets:dmdSec ID="DMD_eprint_4935_mods"><mets:mdWrap MDTYPE="MODS"><mets:xmlData><mods:titleInfo><mods:title>Mathematical Programming Formulations for Two-group Classification with Binary Variables</mods:title></mods:titleInfo><mods:name type="personal"><mods:namePart type="given">O.K.</mods:namePart><mods:namePart type="family">Asparoukhov</mods:namePart><mods:role><mods:roleTerm type="text">author</mods:roleTerm></mods:role></mods:name><mods:name type="personal"><mods:namePart type="given">A.</mods:namePart><mods:namePart type="family">Stam</mods:namePart><mods:role><mods:roleTerm type="text">author</mods:roleTerm></mods:role></mods:name><mods:abstract>In this paper, we introduce a nonparametric mathematical programming (MP) approach for solving the binary variable classification problem. In practice, there exists a substantial interest in the binary variable classification problem. For instance, medical diagnoses are often based on the presence or absence of relevant symptoms, and binary variable classification has long been used as a means to predict (diagnose) the nature of the medical condition of patients. Our research is motivated by the fact that none of the existing statistical methods for binary variable classification -- parametric and nonparametric alike -- are fully satisfactory. &#13;
&#13;
The general class of MP classification methods facilitates a geometric interpretation, and MP-based classification rules have intuitive appeal because of their potentially robust properties. These intuitive arguments appear to have merit, and a number of research studies have confirmed that MP methods can indeed yield effective classification rules under certain non-normal data conditions, for instance if the data set is outlier-contaminated or highly skewed. However, the MP-based approach in general lacks a probabilistic foundation, so its classification performance can only be assessed ad hoc. &#13;
&#13;
Our proposed nonparametric mixed integer programming (MIP) formulation for the binary variable classification problem not only has a geometric interpretation, but also is consistent with the Bayes decision theoretic approach. Therefore, our proposed formulation possesses a strong probabilistic foundation. We also introduce a linear programming (LP) formulation which parallels the concepts underlying the MIP formulation, but does not possess the decision theoretic justification. &#13;
&#13;
An additional advantage of both our LP and MIP formulations is that, because the attribute variables are binary, the training sample observations can be partitioned into multinomial cells. This allows a substantial reduction in the number of binary and deviational variables, so that both formulations can be used to analyze training samples of almost any size. &#13;
&#13;
We illustrate our formulations using an example problem, and use three real data sets to compare their classification performance with a variety of parametric and nonparametric statistical methods. For each of these data sets, our proposed formulation yields the minimum possible number of misclassifications, using both the resubstitution and the leave-one-out methods.</mods:abstract><mods:originInfo><mods:dateIssued encoding="iso8601">1996-08</mods:dateIssued></mods:originInfo><mods:originInfo><mods:publisher>WP-96-092</mods:publisher></mods:originInfo><mods:genre>Monograph</mods:genre></mets:xmlData></mets:mdWrap></mets:dmdSec><mets:amdSec ID="TMD_eprint_4935"><mets:rightsMD ID="rights_eprint_4935_mods"><mets:mdWrap MDTYPE="MODS"><mets:xmlData><mods:useAndReproduction>
<p xmlns="http://www.w3.org/1999/xhtml"><strong>For work being deposited by its own author:</strong>
In self-archiving this collection of files and associated bibliographic
metadata, I grant IIASA Repository the right to store
them and to make them permanently available publicly for free on-line.
I declare that this material is my own intellectual property and I
understand that IIASA Repository does not assume any
responsibility if there is any breach of copyright in distributing these
files or metadata. (All authors are urged to prominently assert their
copyright on the title page of their work.)</p>

<p xmlns="http://www.w3.org/1999/xhtml"><strong>For work being deposited by someone other than its
author:</strong> I hereby declare that the collection of files and
associated bibliographic metadata that I am archiving at
IIASA Repository is in the public domain. If this is
not the case, I accept full responsibility for any breach of copyright
that distributing these files or metadata may entail.</p>

<p xmlns="http://www.w3.org/1999/xhtml">Clicking on the deposit button indicates your agreement to these
terms.</p>
    </mods:useAndReproduction></mets:xmlData></mets:mdWrap></mets:rightsMD></mets:amdSec><mets:fileSec><mets:fileGrp USE="reference"><mets:file ID="eprint_4935_4233_1" SIZE="871332" OWNERID="https://pure.iiasa.ac.at/id/eprint/4935/1/WP-96-092.pdf" MIMETYPE="application/pdf"><mets:FLocat LOCTYPE="URL" xlink:type="simple" xlink:href="https://pure.iiasa.ac.at/id/eprint/4935/1/WP-96-092.pdf"></mets:FLocat></mets:file></mets:fileGrp></mets:fileSec><mets:structMap><mets:div DMDID="DMD_eprint_4935_mods" ADMID="TMD_eprint_4935"><mets:fptr FILEID="eprint_4935_4233_1"></mets:fptr></mets:div></mets:structMap></mets:mets>