ISEF | Projects Database | Finalist Abstract

A Deep Learning Based Hierarchical Labeling and Generative Sampling Framework for Classifying Particle Jets for Generalizable Tagging

Booth Id:
PHYS046T

Category:
Physics and Astronomy

Year:
2023

Finalist Names:
Relan, Mihir (School: Michael E. DeBakey High School for Health Professions)
Semlani, Yash (School: Michael E. DeBakey High School for Health Professions)

Abstract:
Jet tagging is a classification problem in high-energy physics experiments that aims to identify jets, the collimated sprays of subatomic particles produced in particle collisions, and ‘tag’ each one with the particle that emitted it. Advances in jet tagging open opportunities to search for new physics beyond the Standard Model. Current approaches use machine-learning techniques to uncover hidden patterns in complex collision data. However, jet tagging research has primarily focused on classification techniques for individual decay channels, which narrows the range of jet labels a single model can assign. To enable more robust searches for new phenomena, a generalized jet tagging model is needed that can accurately identify a wide range of jet types under various experimental conditions. We propose a hierarchical labeling framework that leverages state-of-the-art classifiers for each jet type while achieving generalized classification. The framework uses two labeling steps that separate jets first by their emitter particle and then by their end-state decay particles. This approach is motivated by the fact that jet formation involves both the production and the decay of particles, and that the properties of a jet can depend on both. To tag the jets, we incorporate deep neural network and vision transformer architectures, together with the preprocessing methods used by state-of-the-art models, to achieve generalized jet tagging. We also introduce a novel data generation method based on Stable Diffusion and transformer architectures to expand the size of training datasets. This solution achieves performance metrics on par with state-of-the-art models built for more specific classification tasks while greatly expanding the scope of classification.
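
The two labeling steps described in the abstract can be pictured as a simple routing scheme: a first classifier assigns the emitter particle, and its output selects a second classifier that assigns the end-state decay. The following is only a minimal sketch of that idea and not the finalists' implementation. The class name `HierarchicalTagger`, the label strings, the two-number jet representation, and the threshold-based toy classifiers are all hypothetical stand-ins for trained models.

```python
from typing import Callable, Dict, Sequence, Tuple

# A classifier here is any callable that maps a jet's feature vector to a label.
# In practice this would wrap a trained network; that is assumed, not shown.
Classifier = Callable[[Sequence[float]], str]


class HierarchicalTagger:
    """Hypothetical two-step tagger: emitter particle first, then decay."""

    def __init__(self, emitter_clf: Classifier, decay_clfs: Dict[str, Classifier]):
        self.emitter_clf = emitter_clf  # step 1: which particle emitted the jet
        self.decay_clfs = decay_clfs    # step 2: one decay classifier per emitter

    def tag(self, jet: Sequence[float]) -> Tuple[str, str]:
        emitter = self.emitter_clf(jet)
        decay_clf = self.decay_clfs.get(emitter)
        # Fall back to a placeholder label when no decay classifier is registered.
        decay = decay_clf(jet) if decay_clf is not None else "unspecified"
        return emitter, decay


# Toy stand-ins for trained models. The features and thresholds are arbitrary
# illustrations, not physics: jet[0] plays the role of a mass-like quantity and
# jet[1] the role of some substructure score.
def toy_emitter_clf(jet: Sequence[float]) -> str:
    return "top" if jet[0] > 150.0 else "higgs"


def toy_top_decay_clf(jet: Sequence[float]) -> str:
    return "bqq" if jet[1] > 0.5 else "blv"


def toy_higgs_decay_clf(jet: Sequence[float]) -> str:
    return "bb" if jet[1] > 0.5 else "cc"


tagger = HierarchicalTagger(
    toy_emitter_clf,
    {"top": toy_top_decay_clf, "higgs": toy_higgs_decay_clf},
)

print(tagger.tag([172.0, 0.9]))  # routed through the "top" decay classifier
print(tagger.tag([125.0, 0.2]))  # routed through the "higgs" decay classifier
```

Under this arrangement each second-step classifier only has to distinguish decays of one emitter, which is one way a framework could reuse specialized models while still covering many jet types.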

Awards Won:
University of Texas at Dallas: Scholarship of $5,000 per year, renewable for up to four years
University of Texas at Dallas: Scholarship of $5,000 per year, renewable for up to four years