M.S. Candidate: Barış Özcan
Program: Data Informatics
Date: 27.12.2024 / 14:00
Place: A-212
Abstract: Machine learning models depend on the quality and quantity of the data used in training. This dependency calls for continuous development of data collection methods and better use of existing data resources. A notable obstacle to using collected data effectively is concept drift: the underlying patterns in the data evolve over time, reducing model performance and relevance. Although concept drift is widely acknowledged, a standardized approach to quantifying and remedying it remains elusive. Our research proposes a holistic approach to handling concept drift by developing a system that not only detects new data concepts but also adapts to them dynamically. The system examines the concepts learned by existing models and compares them with the concepts present in new datasets. Based on this comparison, it decides whether to update existing models or to develop new models tailored to the newly identified concepts. This approach enables continuous model improvement and addresses concept drift that occurs within specific classes of a dataset. To establish a comprehensive system for resolving the concept drift problem, our research studies various detection methods, including centroid differences and performance-based evaluations. We adapt these detection methods to the training process by developing a separate machine learning model for each concept. Our study also explores various prediction strategies for effectively using ensembles of models that each specialize in a specific concept. Additionally, our research addresses class imbalances that may arise during training by incorporating synthetic data generation techniques. In summary, this thesis aims to refine methods for detecting concept drift and to develop adaptive solutions to the concept drift problem in dynamic settings.
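Illustration (not taken from the thesis): the abstract names centroid differences as one detection method and describes a decision between updating existing models and training new ones. The sketch below shows one way such a per-class check could look. The function names, the Euclidean distance, the threshold value, and the decision rule are all assumptions made for this example; the thesis may define each of them differently.

```python
import numpy as np


def class_centroids(X, y):
    """Return {label: mean feature vector} for every class present in y."""
    return {label: X[y == label].mean(axis=0) for label in np.unique(y).tolist()}


def detect_class_drift(X_old, y_old, X_new, y_new, threshold=1.0):
    """Flag each class in the new data whose centroid moved more than `threshold`.

    The Euclidean distance and the fixed threshold are assumptions of this
    sketch. A class that is absent from the old data is treated as drifted.
    """
    old = class_centroids(X_old, y_old)
    new = class_centroids(X_new, y_new)
    drifted = {}
    for label, centroid in new.items():
        if label not in old:
            drifted[label] = True
        else:
            shift = np.linalg.norm(centroid - old[label])
            drifted[label] = bool(shift > threshold)
    return drifted


def decide_action(drifted):
    """Hypothetical decision rule: train a new model if any class drifted."""
    return "train_new_model" if any(drifted.values()) else "update_existing_model"


# Toy data: class 0 is unchanged, class 1 is shifted by +5 in both features.
X_old = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [10.0, 12.0]])
y_old = np.array([0, 0, 1, 1])
X_new = np.array([[0.0, 0.0], [0.0, 2.0], [15.0, 15.0], [15.0, 17.0]])
y_new = np.array([0, 0, 1, 1])

drift_flags = detect_class_drift(X_old, y_old, X_new, y_new, threshold=1.0)
action = decide_action(drift_flags)
```

With this toy data only class 1 is flagged, which mirrors the abstract's point that drift can be confined to specific classes of a dataset.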