
Polycystic Ovarian Syndrome (PCOS) Detection using Gradient Boosted Decision Tree
  • Caroline Chan


Polycystic Ovarian Syndrome (PCOS) is one of the most common diseases in women of reproductive age, affecting 5 to 10 percent of them. It affects how a woman’s ovaries work and is characterized by irregular periods, excess androgen (male hormone), and polycystic ovaries. Unfortunately, 50 to 70 percent of women with PCOS remain undiagnosed because diagnosis is still difficult. Recently, machine learning has become a popular topic and has proven effective in diagnosing diseases. Machine learning can support diagnosis by analyzing collected clinical data and might become a powerful tool to help women struggling with undiagnosed PCOS. The output of this research will be a classification model that detects PCOS, built with a Gradient Boosted Decision Tree and trained on data collected from ten different hospitals in Kerala, India. The collected data contain 45 clinical and physical features. For model construction, this study will apply a feature selection algorithm to rank the existing features, along with hyperparameter optimization and data resampling. The models will be built with different numbers of features, from the full set down to only ten features. Finally, the classification models will be evaluated with four metrics: accuracy, precision, recall, and F1-score.
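
The four evaluation metrics named in the abstract can be computed from the counts of true and false positives and negatives. The following is a minimal sketch of those standard definitions, not the study's own code: the function name `classification_metrics`, the label encoding (1 for PCOS, 0 for no PCOS), and the example labels are all assumptions made for illustration, and the labels are made up rather than taken from the Kerala data.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for a binary classifier.

    `positive` is the label treated as the positive class (assumed here
    to be 1, standing for a PCOS diagnosis).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels for illustration only: 3 positive and 7 negative cases.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this made-up example, accuracy is 0.8 while precision, recall, and F1 are each 2/3, which illustrates why accuracy alone can look optimistic when positive cases are the minority. This is presumably one reason the study pairs accuracy with the other three metrics and applies data resampling.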