System Transparency

How CardioCare-AI Works

A clear, honest look at the dataset, algorithm, methodology, and measured performance powering every prediction.

Algorithm

Gradient Boosting Classifier

We evaluated Logistic Regression, Random Forest, SVM, and Neural Networks alongside Gradient Boosting. The Gradient Boosting Classifier achieved the best balance of accuracy (73%) and interpretability: every prediction comes with feature importance scores.

49,000+

Training Samples

20,267

Test Samples

11

Input Features

v1.0

Model Version

Why This Algorithm

Built for Explainability

🌲

Ensemble of Decision Trees

Hundreds of small trees are trained sequentially, each correcting the errors of the ones before it. The combined ensemble is typically more accurate than any single tree.
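The sequential, error-correcting idea can be illustrated with a minimal sketch. This is not the CardioCare-AI training code: it boosts one-split regression stumps on toy 1-D data with squared-error loss (a classifier would fit its trees to the gradient of a classification loss instead), and the names `fit_stump`, `boost`, and the learning rate are illustrative assumptions.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split on xs that best fits the residuals (least squares)."""
    best = None
    for t in sorted(set(xs))[:-1]:  # every split that leaves both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lmean) ** 2 for r in left) + sum((r - rmean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lmean, rmean)
    _, t, lmean, rmean = best
    return lambda x: lmean if x <= t else rmean


def boost(xs, ys, n_trees=50, learning_rate=0.1):
    """Fit stumps one after another, each on the residuals left by the ensemble so far."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)


def mse(model, xs, ys):
    """Mean squared error of a fitted model on the given points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)


# Toy data (not real patients): the target rises with x in a non-linear way.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0.1, 0.1, 0.2, 0.4, 0.5, 0.8, 0.9, 0.9]

one_tree = boost(xs, ys, n_trees=1)
many_trees = boost(xs, ys, n_trees=50)
```

Comparing `mse(one_tree, xs, ys)` with `mse(many_trees, xs, ys)` shows the training error shrinking as more trees are added, which is the effect the card describes.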

⚖️

Feature Importance Scores

Each prediction exposes ranked feature importances, showing which factors (such as BP, age, and cholesterol) drove the risk score.

🧮

Handles Mixed Data

Tree splits work directly on numeric vitals and on ordinal or binary lifestyle flags, so no one-hot encoding is needed.

Inputs

11 Clinical Features

Demographic · Physical · Vitals · Lifestyle

🎂 Age (years, 18–100)

⚧️ Gender (Male / Female)

📏 Height (centimetres, 120–250)

⚖️ Weight (kilograms, 30–300)

💓 Systolic BP (ap_hi: 60 to 250 mmHg)

💗 Diastolic BP (ap_lo: 30 to 200 mmHg)

🩸 Cholesterol (Normal / Above / Well Above)

🍬 Glucose (Normal / Above / Well Above)

🚭 Smoking (Yes / No)

🍷 Alcohol (Yes / No)

🏃 Activity (physically active: Yes / No)

Process

How a Prediction Is Made

📝

Input

You fill in 11 clinical and lifestyle fields in the prediction form.

✅

Validate

Each field is range-checked; systolic must exceed diastolic.
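A minimal sketch of what this validation step might look like. The ranges come from the input list on this page; the key names other than `ap_hi` and `ap_lo`, the inclusive bounds, and the `validate` function itself are assumptions for illustration, not the actual CardioCare-AI code.

```python
# Ranges are taken from the input list above; key names other than ap_hi/ap_lo are assumed.
RANGES = {
    "age": (18, 100),       # years
    "height": (120, 250),   # cm
    "weight": (30, 300),    # kg
    "ap_hi": (60, 250),     # systolic, mmHg
    "ap_lo": (30, 200),     # diastolic, mmHg
}


def validate(record):
    """Return a list of human-readable errors; an empty list means the record passed."""
    errors = []
    for field, (low, high) in RANGES.items():
        value = record.get(field)
        if value is None:
            errors.append(f"{field} is missing")
        elif not low <= value <= high:  # bounds treated as inclusive (assumption)
            errors.append(f"{field}={value} is outside {low}-{high}")
    hi, lo = record.get("ap_hi"), record.get("ap_lo")
    if hi is not None and lo is not None and hi <= lo:
        errors.append("systolic (ap_hi) must exceed diastolic (ap_lo)")
    return errors


ok = validate({"age": 52, "height": 170, "weight": 80, "ap_hi": 130, "ap_lo": 85})
bad = validate({"age": 52, "height": 170, "weight": 80, "ap_hi": 80, "ap_lo": 90})
```

Here `ok` is an empty list, while `bad` contains the cross-field error because both pressures are individually in range but systolic does not exceed diastolic.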

⚙️

Scale

Values are transformed using the StandardScaler fitted on the training set.
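For readers unfamiliar with standard scaling, the sketch below shows the arithmetic it performs: subtract the training mean, then divide by the training standard deviation (population form). The helper names and the toy readings are assumptions; the production system uses a fitted StandardScaler rather than this code.

```python
from statistics import fmean, pstdev


def fit_scaler(column):
    """Learn the mean and population standard deviation from training values only."""
    return fmean(column), pstdev(column)


def transform(value, mean, std):
    """Standardise one value: its distance from the training mean in standard deviations."""
    return (value - mean) / std


train_ap_hi = [110, 120, 130, 140]  # toy systolic readings, not real data
mean, std = fit_scaler(train_ap_hi)
z = transform(150, mean, std)       # a new reading, scaled with the training statistics
```

The key design point is that the mean and standard deviation are learned once from training data and then reused unchanged for every new input.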

🤖

Predict

The GBM outputs a probability score and a 3-tier risk classification.
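One way a probability could be mapped to three tiers is sketched below. The cut-offs (0.33 and 0.66) and the "Moderate Risk" and "High Risk" labels are hypothetical: this page does not state where the tier boundaries lie.

```python
# Hypothetical cut-offs: the page does not state where the tier boundaries lie.
LOW_MAX = 0.33
MODERATE_MAX = 0.66


def risk_tier(probability):
    """Map a predicted probability in [0, 1] to one of three tiers."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability < LOW_MAX:
        return "Low Risk"
    if probability < MODERATE_MAX:
        return "Moderate Risk"
    return "High Risk"
```

For example, `risk_tier(0.5)` falls in the middle band under these assumed thresholds.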

Analytics

Model Performance

Evaluated on 20,267 unseen records. All charts use static values computed from the test set.

73%

Accuracy

On 20,267 held-out test records

0.80

AUC-ROC

Area under the ROC curve

77%

Precision (Risk)

77% of predicted risk cases were confirmed

65%

Recall (Risk)

65% of real risk cases detected

📋 Classification Report

Per-class precision, recall, and F1-score from the test set evaluation.

Metric	Precision	Recall	F1-Score	Support
Class 0 (No Risk)	0.71	0.81	0.76	10,353
Class 1 (Risk)	0.77	0.65	0.70	9,914
Accuracy			0.73	20,267
Macro Avg	0.74	0.73	0.73	20,267
Weighted Avg	0.74	0.73	0.73	20,267
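The summary rows of the report follow from the per-class rows. The sketch below recomputes F1 and the macro and weighted averages from the rounded precision, recall, and support values in the table; the dictionary layout and function names are illustrative assumptions.

```python
# Per-class rows copied from the classification report above.
report = {
    "No Risk": {"precision": 0.71, "recall": 0.81, "support": 10353},
    "Risk": {"precision": 0.77, "recall": 0.65, "support": 9914},
}


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def macro_avg(metric):
    """Unweighted mean across classes."""
    return sum(row[metric] for row in report.values()) / len(report)


def weighted_avg(metric):
    """Mean across classes, weighted by each class's support."""
    total = sum(row["support"] for row in report.values())
    return sum(row[metric] * row["support"] for row in report.values()) / total


f1_scores = {name: round(f1(r["precision"], r["recall"]), 2) for name, r in report.items()}
```

Because the two classes are close to balanced, the macro and weighted averages round to the same values.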

🔲 Confusion Matrix

Shows how predictions align with actual outcomes. Green = correct, Red = errors.

Actual \ Predicted	No Risk	Risk
No Risk	8,180	2,173
Risk	3,185	6,729

Total: 20,267 | Correct: 73.6%
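The total and the share of correct predictions can be recomputed from the four cells. The sketch assumes rows are actual classes and columns are predicted classes, which matches the Support column of the classification report. Per-class values computed from these cells need not match the rounded figures in the report exactly.

```python
# Cells from the matrix above; rows = actual class, columns = predicted class
# (an assumption that matches the Support column of the classification report).
tn, fp = 8180, 2173   # actual No Risk: correctly cleared, falsely flagged
fn, tp = 3185, 6729   # actual Risk: missed, correctly flagged

total = tn + fp + fn + tp
accuracy = (tn + tp) / total
precision_risk = tp / (tp + fp)   # share of predicted Risk that was real
recall_risk = tp / (tp + fn)      # share of real Risk that was detected

print(f"total={total} accuracy={accuracy:.1%}")
```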

📉 ROC Curve

Receiver Operating Characteristic: measures the model's ability to distinguish between the two classes. The closer the Area Under the Curve (AUC) is to 1.0, the better.
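AUC has a useful plain-language reading: it is the probability that a randomly chosen risk case receives a higher score than a randomly chosen non-risk case. The sketch below computes it that way on toy data; the function name and the sample values are assumptions, not output from the model.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. Runs in O(n_pos * n_neg), fine for a small example.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


labels = [0, 0, 1, 0, 1, 1]               # toy ground truth, not real patients
scores = [0.1, 0.4, 0.35, 0.2, 0.8, 0.7]  # toy predicted probabilities
auc = roc_auc(labels, scores)
```

In this toy set, eight of the nine positive-negative pairs are ordered correctly, so `auc` is 8/9.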

📊 Feature Importance

Which input features most influenced the model's predictions — ranked by Gini importance from the Gradient Boosting ensemble.

🎯 Precision-Recall Curve

Critical for medical models — the tradeoff between detecting all real cases (recall) and avoiding false alarms (precision). Average Precision (AP) = 0.8.
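The tradeoff can be seen by moving the decision threshold on a handful of scores. This is a toy sketch with made-up labels and scores; the `precision_recall` helper is an illustrative assumption.

```python
def precision_recall(labels, scores, threshold):
    """Precision and recall when every score >= threshold is flagged as Risk."""
    flagged = [y for y, s in zip(labels, scores) if s >= threshold]
    tp = sum(flagged)                                   # flagged cases that are real
    precision = tp / len(flagged) if flagged else 1.0   # nothing flagged: no false alarms
    recall = tp / sum(labels)
    return precision, recall


labels = [0, 0, 1, 0, 1, 0, 1, 1]                       # toy ground truth
scores = [0.1, 0.3, 0.35, 0.45, 0.55, 0.6, 0.8, 0.7]    # toy predicted probabilities

strict = precision_recall(labels, scores, 0.65)    # few alarms, some cases missed
lenient = precision_recall(labels, scores, 0.30)   # every case caught, more false alarms
```

With the strict threshold every alarm is a real case but half the cases are missed. With the lenient one all cases are caught at the cost of false alarms.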

📦 Risk Score Distribution

How the model spreads its confidence scores across the 20,267-record test set. A bimodal shape (peaks at both ends) indicates that the model makes decisive predictions rather than hedging near 50%.

👥 CVD Risk by Age Group

Prevalence of cardiovascular disease by age group in the training dataset. This is consistent with age ranking as the #2 most important feature: risk more than doubles every decade after 40.

🔥 Feature Correlation

Shows the Pearson correlation between each pair of input features. Red = negative, Blue = positive. Hovering reveals exact values.
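Each cell of the heatmap is one Pearson coefficient. A minimal sketch of the calculation for a single pair of features follows; the sample values are invented for illustration and are not drawn from the dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Toy values, not drawn from the dataset.
weight = [60, 70, 80, 90, 100]
ap_hi = [110, 118, 125, 135, 142]
r = pearson(weight, ap_hi)
```

A value near +1 means the two features rise together, near -1 means one falls as the other rises, and near 0 means no linear relationship.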

⚠️

Limitations & Disclaimer

Not a medical diagnosis. CardioCare-AI is an educational tool. Results are probabilistic estimates and must not replace advice from a qualified healthcare professional.

Dataset scope. Trained on a single dataset of 70,000 records. May not generalise equally across all ethnicities, geographies, or clinical settings.

Known accuracy ceiling. At 73% accuracy, roughly 27% of predictions may be incorrect, and with recall at 65%, about one in three real risk cases goes undetected. A "Low Risk" result does not mean you are free of cardiovascular disease.
