Demographic and Clinical Factors Role Identification in Stroke Risk and Subtype Prediction
- Deepak Kumar, Chaman Verma, Purushottam Sharma, Deeksha Kumari, and Zoltán Illés
This study analyzed the factors associated with stroke risk in a population of 4798 patients. Using k-means clustering, we identified a significant relationship between patient subpopulations and the degree of paralysis in stroke patients. We also developed a machine learning model that uses demographic and clinical factors to predict stroke subtype, achieving an overall accuracy of 86%. The most important determinants for classifying stroke subtype were the patient's neurological condition, consciousness and memory, body mass index (BMI), glucose level, and risk score. To examine the interrelationships among variables, we applied principal component analysis (PCA) to the target attribute of stroke (TOS). The PCA revealed five key principal components. Risk score, modified Rankin Scale (MRS), systolic blood pressure, the nhiss attribute (abbreviation not specified), and diastolic blood pressure strongly influenced PC1, whereas age, cholesterol, glucose, diastolic blood pressure, and MRS strongly influenced PC2. In summary, this study links patient subpopulations to paralysis severity, shows that stroke subtype can be predicted from key demographic and clinical factors with promising accuracy, and uses PCA to identify the variables that most strongly influence PC1 and PC2.
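To make concrete what it means for a variable to "strongly influence" a principal component, the following is a minimal sketch of how PCA loadings can be computed and ranked. It is not the authors' pipeline: the data are synthetic, the column names (`systolic_bp`, `diastolic_bp`, `age`, `glucose`) and the helper functions `pca_loadings` and `top_features` are assumptions introduced for illustration only, and the study's actual preprocessing is not described in the abstract.

```python
import numpy as np


def pca_loadings(X, feature_names, n_components=2):
    """Return explained-variance ratios and loadings for the leading components."""
    X = np.asarray(X, dtype=float)
    # Standardize each column so variables on different scales are comparable.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # SVD of the standardized data: the rows of Vt are the principal axes.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    loadings = {
        f"PC{k + 1}": dict(zip(feature_names, Vt[k]))
        for k in range(n_components)
    }
    return explained[:n_components], loadings


def top_features(component, k=2):
    """Rank variables by the absolute size of their loading on one component."""
    return sorted(component, key=lambda name: abs(component[name]), reverse=True)[:k]


# Synthetic, hypothetical data (NOT the study's dataset): two blood-pressure
# columns share a latent factor, while age and glucose are independent noise.
rng = np.random.default_rng(0)
n = 500
latent = rng.normal(size=n)
features = ["systolic_bp", "diastolic_bp", "age", "glucose"]
X = np.column_stack([
    130 + 15 * latent + rng.normal(scale=3, size=n),
    80 + 10 * latent + rng.normal(scale=3, size=n),
    rng.normal(60, 10, size=n),
    rng.normal(110, 20, size=n),
])

ratios, loadings = pca_loadings(X, features, n_components=2)
for name, component in loadings.items():
    print(name, top_features(component))
```

In this toy setup the two correlated blood-pressure columns carry the largest absolute loadings on PC1, which is the sense in which a variable "influences" a component. The sign of a loading is arbitrary, so the ranking uses absolute values.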