Introduction
First, what is a stroke?
- A stroke is a medical emergency. It occurs when blood flow to part of the brain is interrupted or reduced, preventing brain tissue from getting oxygen and nutrients. Brain cells begin to die within minutes. In this notebook we explore a stroke dataset to learn more about strokes, then build models that try to predict them.
Risk factors for having a stroke include:
- Age: People aged 55 years and over
- Hypertension: if the systolic pressure is 140 mm Hg or more, or the diastolic pressure is 90 mm Hg or more
- Hypercholesterolemia: if the cholesterol level in the blood is 200 milligrams per deciliter or more
- Smoking
- Diabetes
- Obesity: if the body mass index (BMI) is 30 or more
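The thresholds in the list above can be turned into a small checker. This is only an illustrative sketch: the `Patient` class and its field names are hypothetical and do not correspond to columns in the dataset, and the cholesterol cut-off is read as "200 mg/dL or more".

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # Hypothetical record; field names are illustrative, not dataset columns.
    age: int
    systolic: int      # mm Hg
    diastolic: int     # mm Hg
    cholesterol: int   # mg/dL
    bmi: float
    smoker: bool
    diabetic: bool


def risk_flags(p: Patient) -> list:
    """Return the names of the listed risk factors that apply to a patient."""
    flags = []
    if p.age >= 55:
        flags.append("age")
    if p.systolic >= 140 or p.diastolic >= 90:
        flags.append("hypertension")
    if p.cholesterol >= 200:
        flags.append("hypercholesterolemia")
    if p.smoker:
        flags.append("smoking")
    if p.diabetic:
        flags.append("diabetes")
    if p.bmi >= 30:
        flags.append("obesity")
    return flags


example = Patient(age=61, systolic=128, diastolic=92, cholesterol=180,
                  bmi=31.5, smoker=False, diabetic=False)
print(risk_flags(example))
```

Note that hypertension is flagged here because the diastolic value alone crosses its threshold, even though the systolic value does not.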
Import
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
plt.style.use('ggplot')
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report

df = pd.read_csv("D:/Dataset/healthcare-dataset-stroke-data.csv")
df.head()
![Stroke Prediction-EDA-Classification-Models Python (1) Stroke Prediction-EDA-Classification-Models Python (1)](https://i0.wp.com/geekycodesin.wordpress.com/wp-content/uploads/2024/06/a01c8-image-8.png?w=1024&h=169)
Read & Explore
df.info()
![Stroke Prediction-EDA-Classification-Models Python (2) Stroke Prediction-EDA-Classification-Models Python (2)](https://i0.wp.com/geekycodesin.wordpress.com/wp-content/uploads/2024/06/c7c55-image-9.png)
df.describe()
![Stroke Prediction-EDA-Classification-Models Python (3) Stroke Prediction-EDA-Classification-Models Python (3)](https://i0.wp.com/geekycodesin.wordpress.com/wp-content/uploads/2024/06/63d5d-image-10.png?w=1024&h=282)
Feature Distributions
fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(12, 10))
df.plot(kind="hist", y="age", bins=70, color="b", ax=axes[0][0])
df.plot(kind="hist", y="bmi", bins=100, color="r", ax=axes[0][1])
df.plot(kind="hist", y="heart_disease", bins=6, color="g", ax=axes[1][0])
df.plot(kind="hist", y="avg_glucose_level", bins=100, color="orange", ax=axes[1][1])
plt.show()
![Stroke Prediction-EDA-Classification-Models Python (4) Stroke Prediction-EDA-Classification-Models Python (4)](https://i0.wp.com/geekycodesin.wordpress.com/wp-content/uploads/2024/06/da3ec-image-11.png)
- Age is well distributed
- BMI appears to contain outliers
- The average glucose distribution looks reasonable, since normal blood sugar is below 140. That may be a drawback: with most values in the normal range, this feature may not help us see whether diabetes correlates with strokes
Data Summary ( Check for missing values )
print("Rows : ", df.shape[0])
print("Columns : ", df.shape[1])
print("\nFeatures : \n", df.columns.tolist())
print("\nMissing values : ", df.isnull().sum().values.sum())
print("\nUnique values : \n", df.nunique())
![Stroke Prediction-EDA-Classification-Models Python (5) Stroke Prediction-EDA-Classification-Models Python (5)](https://i0.wp.com/geekycodesin.wordpress.com/wp-content/uploads/2024/06/3af13-image-12.png)
Data Visualization
Stroke Pie Chart
labels = df['stroke'].value_counts(sort=True).index
sizes = df['stroke'].value_counts(sort=True)
colors = ["lightblue", "red"]
explode = (0.05, 0)

plt.figure(figsize=(7, 7))
plt.pie(sizes, explode=explode, labels=labels, colors=colors,
        autopct='%1.1f%%', shadow=True, startangle=90)
plt.title('Stroke Breakdown')
plt.show()
![Stroke Prediction-EDA-Classification-Models Python (6) Stroke Prediction-EDA-Classification-Models Python (6)](https://i0.wp.com/geekycodesin.wordpress.com/wp-content/uploads/2024/06/bb89b-image-13.png)
Only about 5% of the people in the data had a stroke!
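With so few positive cases, accuracy alone can look good even for a useless model. The sketch below uses made-up labels with roughly the same imbalance (not the dataset itself) to show what a predictor that always answers "no stroke" would score.

```python
# Hypothetical labels with roughly the same imbalance as the pie chart:
# 95 people without a stroke (0) and 5 with a stroke (1).
y_true = [0] * 95 + [1] * 5

# A "model" that ignores its input and always predicts the majority class.
y_pred = [0] * len(y_true)

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Recall on the stroke class: share of actual strokes that were predicted.
true_pos = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
recall = true_pos / sum(y_true)

print(f"accuracy = {accuracy:.2f}, stroke recall = {recall:.2f}")
```

This is why the model scores later in the notebook are worth reading alongside the confusion matrices rather than on their own.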
Gender
plt.figure(figsize=(10, 5))
sns.countplot(data=df, x='gender');
![Stroke Prediction-EDA-Classification-Models Python (7) Stroke Prediction-EDA-Classification-Models Python (7)](https://i0.wp.com/geekycodesin.wordpress.com/wp-content/uploads/2024/06/8b405-image-14.png)
There is a difference of about 1,000 between the Female and Male counts in the data.
Correlation with average glucose level
Visualize some features that may be correlated with average glucose level
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(15, 5))
df.plot(kind='scatter', x='age', y='avg_glucose_level', alpha=0.5, color='green',
        ax=axes[0], title="Age vs. avg_glucose_level")
df.plot(kind='scatter', x='bmi', y='avg_glucose_level', alpha=0.5, color='red',
        ax=axes[1], title="bmi vs. avg_glucose_level")
plt.show()
![Stroke Prediction-EDA-Classification-Models Python (8) Stroke Prediction-EDA-Classification-Models Python (8)](https://i0.wp.com/geekycodesin.wordpress.com/wp-content/uploads/2024/06/8c0b9-image-15.png)
- Average glucose level tends to be higher in older people
- People with a BMI above 40 mostly have low average glucose
Heatmap Correlation
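plt.figure(figsize=(15, 7))
# numeric_only=True skips the text columns, which recent pandas versions no longer drop silently
sns.heatmap(df.corr(numeric_only=True), annot=True);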
![Stroke Prediction-EDA-Classification-Models Python (9) Stroke Prediction-EDA-Classification-Models Python (9)](https://i0.wp.com/geekycodesin.wordpress.com/wp-content/uploads/2024/06/6e1a0-image-16.png)
There is almost no linear correlation between stroke and BMI.
BMI Boxplot
plt.figure(figsize=(10, 7))
sns.boxplot(data=df, x="bmi", color='green');
![Stroke Prediction-EDA-Classification-Models Python (10) Stroke Prediction-EDA-Classification-Models Python (10)](https://i0.wp.com/geekycodesin.wordpress.com/wp-content/uploads/2024/06/a11e7-image-17.png)
We have many outliers, but before we fix them we should look at BMI more closely.
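A boxplot usually marks points beyond 1.5 times the interquartile range (IQR) from the quartiles as outliers. The sketch below applies that rule by hand; the BMI values are made up for illustration, and the helper name `iqr_bounds` is hypothetical.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Made-up BMI values for illustration only (not taken from the dataset).
bmi_values = [18.5, 22.0, 24.3, 26.1, 27.8, 28.4,
              29.9, 31.2, 33.0, 35.6, 58.0, 71.5]

low, high = iqr_bounds(bmi_values)
outliers = [v for v in bmi_values if v < low or v > high]
print(f"fences = ({low:.2f}, {high:.2f}), outliers = {outliers}")
```

A fixed cut-off such as 50 is a simpler alternative to these data-driven fences, at the cost of being chosen by eye.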
BMI
Body mass index (BMI) is a value derived from the mass and height of a person.
![Stroke Prediction-EDA-Classification-Models Python (11) Stroke Prediction-EDA-Classification-Models Python (11)](https://i0.wp.com/www.cdc.gov/healthyweight/images/assessing/bmi-adult-fb-600x315.jpg)
bmi_outliers = df.loc[df['bmi'] > 50]
bmi_outliers['bmi'].shape
![Stroke Prediction-EDA-Classification-Models Python (12) Stroke Prediction-EDA-Classification-Models Python (12)](https://i0.wp.com/geekycodesin.wordpress.com/wp-content/uploads/2024/06/37d3b-image-18.png)
# Stroke counts among the BMI outliers
print(bmi_outliers['stroke'].value_counts())
![Stroke Prediction-EDA-Classification-Models Python (13) Stroke Prediction-EDA-Classification-Models Python (13)](https://i0.wp.com/geekycodesin.wordpress.com/wp-content/uploads/2024/06/d56ed-image-19.png)
print ("\nMissing values : ", df.isnull().sum().values.sum())
![Stroke Prediction-EDA-Classification-Models Python (14) Stroke Prediction-EDA-Classification-Models Python (14)](https://i0.wp.com/geekycodesin.wordpress.com/wp-content/uploads/2024/06/3331e-image-20.png)
Double Check for missing values
# Cap BMI at 50 to limit the effect of outliers, then fill the missing values
df["bmi"] = df["bmi"].apply(lambda x: 50 if x > 50 else x)
df["bmi"] = df["bmi"].fillna(28.4)
print("\nMissing values : ", df.isnull().sum().values.sum())
![Stroke Prediction-EDA-Classification-Models Python (15) Stroke Prediction-EDA-Classification-Models Python (15)](https://i0.wp.com/geekycodesin.wordpress.com/wp-content/uploads/2024/06/cccf5-image-21.png)
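The fill value 28.4 is hard-coded in the step above. One alternative, sketched here on a made-up list rather than the real column, is to derive the fill value from the observed data (the median is used as an assumption) and apply the same cap of 50.

```python
import math
import statistics

# Made-up BMI column with a missing value (NaN) and one extreme value.
bmi = [22.4, float("nan"), 31.0, 27.5, 64.2, 25.1]

CAP = 50.0  # same upper limit as the capping step in the notebook

# Compute the fill value from the observed (non-missing) entries only.
observed = [v for v in bmi if not math.isnan(v)]
fill_value = statistics.median(observed)

# Fill missing entries first, then cap everything at CAP.
cleaned = [min(fill_value if math.isnan(v) else v, CAP) for v in bmi]
print(fill_value, cleaned)
```

The median is less sensitive to extreme values than the mean, which matters when the column is known to contain outliers.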
Stroke or not in Categorical Features
cat_df = df[['gender', 'Residence_type', 'smoking_status', 'stroke']]
summary = pd.concat(
    [pd.crosstab(cat_df[x], cat_df.stroke) for x in cat_df.columns[:-1]],
    keys=cat_df.columns[:-1],
)
summary
![Stroke Prediction-EDA-Classification-Models Python (16) Stroke Prediction-EDA-Classification-Models Python (16)](https://i0.wp.com/geekycodesin.wordpress.com/wp-content/uploads/2024/06/71e7e-image-22.png?w=1024&h=277)
Stroke/Ever Married
plt.figure(figsize=(10, 5))
# Keep only the rows where a stroke occurred
strok = df.loc[df['stroke'] == 1]
sns.countplot(data=strok, x='ever_married', palette='inferno');
![Stroke Prediction-EDA-Classification-Models Python (17) Stroke Prediction-EDA-Classification-Models Python (17)](https://i0.wp.com/geekycodesin.wordpress.com/wp-content/uploads/2024/06/7c4c3-image-23.png)
Stroke/Work Type
plt.figure(figsize=(10, 5))
sns.countplot(data=strok, x='work_type', palette='cool');
![Stroke Prediction-EDA-Classification-Models Python (18) Stroke Prediction-EDA-Classification-Models Python (18)](https://i0.wp.com/geekycodesin.wordpress.com/wp-content/uploads/2024/06/d076f-image-24.png)
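Most stroke cases are among people in Private work. Note that this chart shows raw counts, not rates, so it may simply reflect how many people in the data work in the private sector.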
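To separate "most cases" from "highest risk", it helps to compare stroke rates per category rather than counts. The records below are invented purely to illustrate the difference; they are not the dataset's numbers.

```python
from collections import Counter

# Made-up (work_type, stroke) records for illustration only.
records = (
    [("Private", 0)] * 570 + [("Private", 1)] * 30
    + [("Self-employed", 0)] * 138 + [("Self-employed", 1)] * 12
    + [("Govt_job", 0)] * 95 + [("Govt_job", 1)] * 5
)

totals = Counter(work for work, _ in records)
strokes = Counter(work for work, stroke in records if stroke == 1)
rates = {work: strokes[work] / totals[work] for work in totals}

for work in totals:
    print(f"{work:14s} strokes={strokes[work]:3d}  rate={rates[work]:.1%}")
```

In this toy data the Private group has the most stroke cases but not the highest rate, which is the pattern a count plot on its own cannot reveal.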
Stroke/Smoking Status
plt.figure(figsize=(10, 5))
sns.countplot(data=strok, x='smoking_status', palette='autumn');
![Stroke Prediction-EDA-Classification-Models Python (19) Stroke Prediction-EDA-Classification-Models Python (19)](https://i0.wp.com/geekycodesin.wordpress.com/wp-content/uploads/2024/06/bcb5d-image-25.png)
Being a current or former smoker increases your risk of having a stroke.
Residence Type
plt.figure(figsize=(10, 5))
sns.countplot(data=strok, x='Residence_type', palette='Greens');
![Stroke Prediction-EDA-Classification-Models Python (20) Stroke Prediction-EDA-Classification-Models Python (20)](https://i0.wp.com/geekycodesin.wordpress.com/wp-content/uploads/2024/06/02d2e-image-26.png)
Residence type shows no clear relationship with stroke, so we cannot rely on it as an indicator.
Stroke/Heart Disease
plt.figure(figsize=(10, 5))
sns.countplot(data=strok, x='heart_disease', palette='Reds');
Most people who have had a stroke do not have any heart disease, but that does not prevent it being an influential factor
Stroke/Hypertension
plt.figure(figsize=(10, 5))
sns.countplot(data=strok, x='hypertension', palette='Pastel2');
![Stroke Prediction-EDA-Classification-Models Python (22) Stroke Prediction-EDA-Classification-Models Python (22)](https://i0.wp.com/geekycodesin.wordpress.com/wp-content/uploads/2024/06/82c8a-image-28.png)
More than 25% of stroke cases had hypertension.
Notes
- Average glucose level tends to be higher in older people
- People with a BMI above 40 mostly have low average glucose
- Unmarried people account for fewer stroke cases than married people
- Being a current or former smoker increases your risk of having a stroke
- More than 25% of stroke cases had hypertension
Data preprocessing
Encoding Categorical Features
# Binary features: map each to 0/1
df["Residence_type"] = df["Residence_type"].apply(lambda x: 1 if x == "Urban" else 0)
df["ever_married"] = df["ever_married"].apply(lambda x: 1 if x == "Yes" else 0)
df["gender"] = df["gender"].apply(lambda x: 1 if x == "Male" else 0)

# Multi-category features: one indicator column per category
df = pd.get_dummies(data=df, columns=['smoking_status'])
df = pd.get_dummies(data=df, columns=['work_type'])
df
![Stroke Prediction-EDA-Classification-Models Python (23) Stroke Prediction-EDA-Classification-Models Python (23)](https://i0.wp.com/geekycodesin.wordpress.com/wp-content/uploads/2024/06/4d98c-image-30.png?w=1024&h=385)
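To make the one-hot step more concrete, here is a plain-Python sketch of what this kind of encoding produces: one 0/1 column per category, named with the original column as a prefix. The `one_hot` helper is hypothetical and only mimics the idea; it is not how pandas implements `get_dummies`.

```python
def one_hot(values, prefix):
    """Return one dict per row with a 0/1 indicator for each category."""
    categories = sorted(set(values))
    return [
        {f"{prefix}_{cat}": int(v == cat) for cat in categories}
        for v in values
    ]


# A few example smoking_status values.
smoking = ["smokes", "never smoked", "Unknown", "smokes"]

rows = one_hot(smoking, prefix="smoking_status")
for row in rows:
    print(row)
```

Each row has exactly one indicator set to 1, which is why the encoded table grows by one column per category.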
Scaling the Numeric Features
std = StandardScaler()
columns = ['avg_glucose_level', 'bmi', 'age']
scaled = std.fit_transform(df[columns])
scaled = pd.DataFrame(scaled, columns=columns)
# Replace the original columns with their scaled versions (appended at the end)
df = df.drop(columns=columns)
df = df.merge(scaled, left_index=True, right_index=True, how="left")
df.head()
![Stroke Prediction-EDA-Classification-Models Python (24) Stroke Prediction-EDA-Classification-Models Python (24)](https://i0.wp.com/geekycodesin.wordpress.com/wp-content/uploads/2024/06/c8e78-image-31.png?w=1024&h=235)
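Standard scaling subtracts the column mean and divides by the column's standard deviation, so each scaled feature ends up with mean 0 and standard deviation 1. A minimal sketch with made-up ages, using the population standard deviation (which is what `StandardScaler` uses):

```python
import statistics


def standardize(values):
    """Scale values to zero mean and unit variance (population std, ddof=0)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


ages = [20, 40, 60, 80]  # made-up ages for illustration
scaled_ages = standardize(ages)
print([round(z, 3) for z in scaled_ages])
```

Putting age, BMI, and glucose on the same scale mainly helps models that are sensitive to feature magnitude, such as SVMs, logistic regression, and neural networks.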
Drop ID feature and check for nulls
df = df.drop(columns='id')
df.head()
![Stroke Prediction-EDA-Classification-Models Python (25) Stroke Prediction-EDA-Classification-Models Python (25)](https://i0.wp.com/geekycodesin.wordpress.com/wp-content/uploads/2024/06/2e4a0-image-32.png?w=1024&h=212)
df[df.isnull().any(axis=1)]
![Stroke Prediction-EDA-Classification-Models Python (26) Stroke Prediction-EDA-Classification-Models Python (26)](https://i0.wp.com/geekycodesin.wordpress.com/wp-content/uploads/2024/06/31421-image-33.png?w=1024&h=60)
Classification Models
Target & Features
X = df.drop(['stroke'], axis=1).values
y = df['stroke'].values
Splitting
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
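Because stroke cases are rare, a purely random split can leave the test set with noticeably more or fewer positives than the training set. scikit-learn's `train_test_split` accepts `stratify=y` (and `random_state` for reproducibility) to avoid this. The sketch below shows the idea in plain Python on made-up labels; `stratified_split` is a hypothetical helper, not the library's implementation.

```python
import random


def stratified_split(labels, test_size=0.3, seed=0):
    """Return (train_idx, test_idx) keeping each class's share in both parts."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_size)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Made-up labels: 950 without a stroke, 50 with a stroke.
labels = [0] * 950 + [1] * 50
train_idx, test_idx = stratified_split(labels, test_size=0.3, seed=42)

test_pos = sum(labels[i] for i in test_idx)
print(len(train_idx), len(test_idx), test_pos)
```

Each class is split separately, so the share of stroke cases is the same in the training and test parts.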
AdaBoost Classification
# Create the AdaBoost classifier. The base estimator is passed positionally because
# its keyword changed from base_estimator to estimator in newer scikit-learn releases.
ab_clf = AdaBoostClassifier(DecisionTreeClassifier(), n_estimators=100,
                            learning_rate=0.5, random_state=100)

# Train the AdaBoost model
ab_clf.fit(X_train, y_train)
print("training....\n")

# Make predictions on the test set
ab_pred_stroke = ab_clf.predict(X_test)
print('prediction: \n', ab_pred_stroke)
print('\nparms: \n', ab_clf.get_params())

# Score
ab_clf_score = ab_clf.score(X_test, y_test)
print("\nmean accuracy: %.2f" % ab_clf_score)
![Stroke Prediction-EDA-Classification-Models Python (27) Stroke Prediction-EDA-Classification-Models Python (27)](https://i0.wp.com/geekycodesin.wordpress.com/wp-content/uploads/2024/06/e5d3f-image-34.png)
Gradient Boosting
# Note: this uses scikit-learn's GradientBoostingClassifier, not the XGBoost library
xgboost = GradientBoostingClassifier(random_state=0)
xgboost.fit(X_train, y_train)

# Score
xgboost_score = xgboost.score(X_train, y_train)
xgboost_test = xgboost.score(X_test, y_test)

# Testing model
y_pred = xgboost.predict(X_test)

# Evaluation
cm = confusion_matrix(y_test, y_pred)
print('Training Score', xgboost_score)
print('Testing Score \n', xgboost_test)

# Confusion matrix
plt.figure(figsize=(14, 5))
conf_matrix = pd.DataFrame(data=cm, columns=['Predicted:0', 'Predicted:1'],
                           index=['Actual:0', 'Actual:1'])
sns.heatmap(conf_matrix, annot=True, fmt='d', cmap="Greens");
print(accuracy_score(y_test, y_pred))
![Stroke Prediction-EDA-Classification-Models Python (28) Stroke Prediction-EDA-Classification-Models Python (28)](https://i0.wp.com/geekycodesin.wordpress.com/wp-content/uploads/2024/06/0d506-image-35.png)
SVM
svc = SVC(random_state=0)
svc.fit(X_train, y_train)

# Score
svc_score = svc.score(X_train, y_train)
svc_test = svc.score(X_test, y_test)

# Testing model
y_pred = svc.predict(X_test)

# Evaluation
cm = confusion_matrix(y_test, y_pred)
print('Training Score', svc_score)
print('Testing Score \n', svc_test)
print(cm)
![Stroke Prediction-EDA-Classification-Models Python (29) Stroke Prediction-EDA-Classification-Models Python (29)](https://i0.wp.com/geekycodesin.wordpress.com/wp-content/uploads/2024/06/32b3c-image-36.png)
Random Forest Classifier
forest = RandomForestClassifier(n_estimators=100)
forest.fit(X_train, y_train)

# Score
forest_score = forest.score(X_train, y_train)
forest_test = forest.score(X_test, y_test)

# Testing model
y_pred = forest.predict(X_test)

# Evaluation
cm = confusion_matrix(y_test, y_pred)
print('Training Score', forest_score)
print('Testing Score \n', forest_test)
print(cm)
![Stroke Prediction-EDA-Classification-Models Python (30) Stroke Prediction-EDA-Classification-Models Python (30)](https://i0.wp.com/geekycodesin.wordpress.com/wp-content/uploads/2024/06/94e1e-image-37.png)
Logistic Regression
model = LogisticRegression()
model.fit(X_train, y_train)
score = model.score(X_test, y_test)
print('Testing Score \n', score)
logistic_score = model.score(X_train, y_train)
logistic_test = model.score(X_test, y_test)

y_pred = model.predict(X_test)
print(classification_report(y_test, y_pred))

cm = confusion_matrix(y_test, y_pred)
print(cm)
![Stroke Prediction-EDA-Classification-Models Python (31) Stroke Prediction-EDA-Classification-Models Python (31)](https://i0.wp.com/geekycodesin.wordpress.com/wp-content/uploads/2024/06/132ba-image-38.png)
Feature Importance using Logistic Regression
# Use the absolute value of each coefficient as a rough importance score
coef = model.coef_[0]
coef = [abs(number) for number in coef]
print(coef)
![Stroke Prediction-EDA-Classification-Models Python (32) Stroke Prediction-EDA-Classification-Models Python (32)](https://i0.wp.com/geekycodesin.wordpress.com/wp-content/uploads/2024/06/2aed5-image-39.png)
cols = list(df.columns)
# Delete the target label so the names line up with the coefficients
cols.remove('stroke')
cols
![Stroke Prediction-EDA-Classification-Models Python (33) Stroke Prediction-EDA-Classification-Models Python (33)](https://i0.wp.com/geekycodesin.wordpress.com/wp-content/uploads/2024/06/66a39-image-40.png)
# Print the feature names from the largest to the smallest absolute coefficient
sorted_index = sorted(range(len(coef)), key=lambda k: coef[k], reverse=True)
for idx in sorted_index:
    print(cols[idx])
![Stroke Prediction-EDA-Classification-Models Python (34) Stroke Prediction-EDA-Classification-Models Python (34)](https://i0.wp.com/geekycodesin.wordpress.com/wp-content/uploads/2024/06/95fbe-image-41.png)
Although BMI is considered an indicator of stroke risk, most BMI values in this data fall in the normal range, and high values do not show a high stroke rate.
MLP NN Classifier
X = df.drop(['stroke', 'gender', 'bmi', 'Residence_type',
             'work_type_Never_worked', 'smoking_status_Unknown'], axis=1).values
# X = df.drop(['stroke', 'bmi'], axis=1).values
y = df['stroke'].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

# mlp = MLPClassifier(hidden_layer_sizes=(1000, 300, 300, 300), solver='adam', shuffle=False, tol=0.0001)
mlp = MLPClassifier(hidden_layer_sizes=(300, 300, 300), max_iter=1000, alpha=0.00001,
                    solver='adam', verbose=10, random_state=21)
mlp.fit(X_train, y_train)

# Score
mlp_score = mlp.score(X_train, y_train)
mlp_test = mlp.score(X_test, y_test)

# Testing model
y_pred = mlp.predict(X_test)

# Evaluation
cm = confusion_matrix(y_test, y_pred)
print('Training Score', mlp_score)
print('Testing Score \n', mlp_test)
print(cm)
Iteration 1, loss = 0.25073982
Iteration 2, loss = 0.15601721
Iteration 3, loss = 0.15148236
...
Iteration 135, loss = 0.06479217
Iteration 136, loss = 0.06493533
Iteration 137, loss = 0.06678607
Training loss did not improve more than tol=0.000100 for 10 consecutive epochs. Stopping.
Training Score 0.9751188146491473
Testing Score
 0.9347684279191129
[[1420   29]
 [  71   13]]
plt.figure(figsize=(14, 5))
cm = confusion_matrix(y_test, y_pred)
conf_matrix = pd.DataFrame(data=cm, columns=['Predicted:0', 'Predicted:1'],
                           index=['Actual:0', 'Actual:1'])
sns.heatmap(conf_matrix, annot=True, fmt='d', cmap="Reds");
![Stroke Prediction-EDA-Classification-Models Python (35) Stroke Prediction-EDA-Classification-Models Python (35)](https://i0.wp.com/geekycodesin.wordpress.com/wp-content/uploads/2024/06/4ddeb-image-42.png)
Sensitivity & Specificity
![Stroke Prediction-EDA-Classification-Models Python (36) Stroke Prediction-EDA-Classification-Models Python (36)](https://i0.wp.com/geekycodesin.wordpress.com/wp-content/uploads/2024/06/d7456-sensitivity.jpg)
TN = cm[0, 0]
TP = cm[1, 1]
FN = cm[1, 0]
FP = cm[0, 1]
sensitivity = TP / float(TP + FN)
specificity = TN / float(TN + FP)
accuracy = (TP + TN) / float(TP + TN + FP + FN)
print('The accuracy of the model = (TP+TN)/(TP+TN+FP+FN) = ', accuracy, '\n',
      'The misclassification rate = 1-Accuracy = ', 1 - accuracy, '\n',
      'Sensitivity or True Positive Rate = TP/(TP+FN) = ', sensitivity, '\n',
      'Specificity or True Negative Rate = TN/(TN+FP) = ', specificity, '\n')
![Stroke Prediction-EDA-Classification-Models Python (37) Stroke Prediction-EDA-Classification-Models Python (37)](https://i0.wp.com/geekycodesin.wordpress.com/wp-content/uploads/2024/06/27f3d-image-43.png)
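As a worked check, the same formulas can be applied by hand to the confusion matrix printed by the MLP run above. Precision and F1 are added here as an extra, since they are not computed in the notebook itself.

```python
# Confusion matrix printed by the MLP run: rows = actual, columns = predicted.
cm_mlp = [[1420, 29],
          [71, 13]]

tn, fp = cm_mlp[0]
fn, tp = cm_mlp[1]

accuracy = (tp + tn) / (tp + tn + fp + fn)
sensitivity = tp / (tp + fn)   # recall on the stroke class
specificity = tn / (tn + fp)
precision = tp / (tp + fp)
f1 = 2 * precision * sensitivity / (precision + sensitivity)

print(f"accuracy    = {accuracy:.3f}")
print(f"sensitivity = {sensitivity:.3f}")
print(f"specificity = {specificity:.3f}")
print(f"precision   = {precision:.3f}")
print(f"f1          = {f1:.3f}")
```

The accuracy matches the reported testing score of about 0.935, but the sensitivity shows that only 13 of the 84 actual stroke cases in the test set were detected.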
This notebook was written on Kaggle by Ahmed Ashour.