ISEF | Projects Database | Finalist Abstract

Balancing Misclassification Costs (BMC) in Imbalanced Classification

Booth Id:
MATH031

Category:
Mathematics

Year:
2024

Finalist Names:
Fu, Sophia (School: Carmel High School)

Abstract:
Classification tasks in machine learning, essential for applications ranging from fraud detection to medical diagnosis, frequently encounter imbalanced datasets. Such imbalance can skew predictions toward the majority class, so that vital minority instances are overlooked, with significant real-world consequences. Established methods, such as Logistic Regression, Support Vector Machines, and ensemble techniques, often struggle with imbalanced datasets. Conventional remedies like resampling and cost-sensitive learning provide value but come with issues such as overfitting, data loss, and increased computational demands. A notable disconnect also exists between estimation procedures and evaluation metrics, which further complicates accurate assessment of model performance. In this work, I present the Balancing Misclassification Costs (BMC) algorithm, an approach designed to address the challenges posed by imbalanced datasets. My method integrates misclassification costs within a unified optimization framework. Building on a rigorous theoretical proof, I have also devised an efficient estimation procedure. Through detailed simulations and an application to a cancer diagnostic dataset, I demonstrate BMC's superiority over conventional methodologies.
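
The abstract does not spell out the BMC objective or its estimation procedure, so the sketch below is not BMC. It only illustrates the general idea of unequal misclassification costs, using the standard cost-sensitive decision rule: predict the positive class when the estimated probability is at least cost_fp / (cost_fp + cost_fn). The labels, scores, and cost values are hypothetical and chosen purely for illustration.

```python
def cost_threshold(cost_fp, cost_fn):
    """Probability threshold that minimizes expected cost for a calibrated score."""
    return cost_fp / (cost_fp + cost_fn)


def predict(probs, threshold):
    """Label an instance positive (1) when its score reaches the threshold."""
    return [1 if p >= threshold else 0 for p in probs]


def total_cost(y_true, y_pred, cost_fp, cost_fn):
    """Sum the misclassification costs over false positives and false negatives."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return cost_fp * fp + cost_fn * fn


# Hypothetical imbalanced sample: 8 majority (0) and 2 minority (1) instances.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
probs = [0.05, 0.05, 0.10, 0.10, 0.10, 0.15, 0.15, 0.30, 0.35, 0.60]

# Assumed costs: missing a minority instance is five times worse than a false alarm.
cost_fp, cost_fn = 1.0, 5.0

default_pred = predict(probs, 0.5)
threshold = cost_threshold(cost_fp, cost_fn)
cost_aware_pred = predict(probs, threshold)

default_cost = total_cost(y_true, default_pred, cost_fp, cost_fn)
cost_aware_cost = total_cost(y_true, cost_aware_pred, cost_fp, cost_fn)
print(threshold, default_cost, cost_aware_cost)
```

On this toy sample the default 0.5 threshold misses one minority instance, while the cost-aware threshold catches both at the price of one false alarm, which gives a lower total cost under the assumed costs. The rule presumes reasonably calibrated probability scores.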