Classification (iris data with SGD and hyper-parameter tuning)
import warnings
# Avoid unnecessary warning output.
warnings.filterwarnings('ignore')
import pandas as pd
import seaborn as sns
from sklearn.datasets import load_iris
iris = load_iris()
data = iris['data']
feature_names = iris['feature_names']
target = iris['target']
iris['target_names']
array(['setosa', 'versicolor', 'virginica'], dtype='<U10')
Make DataFrame
df_iris = pd.DataFrame(data, columns=feature_names)
df_iris.head()
| | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm) |
|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 |
df_iris['target'] = target
df_iris.head()
| | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm) | target |
|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | 0 |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | 0 |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | 0 |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | 0 |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | 0 |
from sklearn.model_selection import train_test_split
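# stratify keeps the class ratio identical in the training and validation splits.
x_train, x_valid, y_train, y_valid = train_test_split(df_iris.drop(columns='target'), df_iris['target'], stratify=df_iris['target'])
# Pass the labels by keyword; recent seaborn versions expect x= here.
sns.countplot(x=y_train)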
<AxesSubplot:xlabel='target', ylabel='count'>
x_train.shape, y_train.shape
((112, 4), (112,))
x_valid.shape, y_valid.shape
((38, 4), (38,))
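As a quick check of what stratify does, the sketch below prints the per-class counts in each split. It rebuilds the data so it can run on its own; random_state=0 is an assumption added here for repeatability and is not part of the original split.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
df_iris = pd.DataFrame(iris['data'], columns=iris['feature_names'])
df_iris['target'] = iris['target']

# random_state=0 is an assumption for repeatability (the original split is unseeded).
x_train, x_valid, y_train, y_valid = train_test_split(
    df_iris.drop(columns='target'),
    df_iris['target'],
    stratify=df_iris['target'],
    random_state=0,
)

# Count samples per class in each split; stratification keeps them nearly equal.
train_counts = y_train.value_counts().sort_index()
valid_counts = y_valid.value_counts().sort_index()
print(train_counts.to_dict())
print(valid_counts.to_dict())
```

Because iris has 50 samples per class, the counts within each split should differ by at most one.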
SGDClassifier
from IPython.display import Image
# source: https://machinelearningnotepad.wordpress.com/
Image('https://machinelearningnotepad.files.wordpress.com/2018/04/yk1mk.png', width=500)
from sklearn.linear_model import SGDClassifier
sgd = SGDClassifier()
sgd.fit(x_train, y_train)
SGDClassifier()
prediction = sgd.predict(x_valid)
(prediction == y_valid).mean()
0.7894736842105263
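The expression (prediction == y_valid).mean() is the plain accuracy. A minimal sketch showing that it matches scikit-learn's accuracy_score and the estimator's own score method is below. The seeds (random_state=0 on both the split and the model) are assumptions added so the sketch is repeatable, so the number it prints will not necessarily equal the one above.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
# Seeds are assumptions for repeatability; the original cells are unseeded.
x_train, x_valid, y_train, y_valid = train_test_split(X, y, stratify=y, random_state=0)

sgd = SGDClassifier(random_state=0)
sgd.fit(x_train, y_train)
prediction = sgd.predict(x_valid)

# Three equivalent ways to compute the fraction of correct predictions.
manual_acc = (prediction == y_valid).mean()
metric_acc = accuracy_score(y_valid, prediction)
score_acc = sgd.score(x_valid, y_valid)
print(manual_acc, metric_acc, score_acc)
```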
Hyper-parameter tuning
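- random_state: fix the seed while tuning, so that score changes come from the hyperparameters and not from random shuffling
- n_jobs=-1: use all CPU cores for the one-vs-all fits on multi-class problems (faster training)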
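To illustrate why the seed is fixed, here is a small sketch that fits the same configuration twice and checks that the learned weights and predictions agree. The seeded split is an assumption added for repeatability.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
# Seeded split is an assumption; the original split is unseeded.
x_train, x_valid, y_train, y_valid = train_test_split(X, y, stratify=y, random_state=0)

# Two independent fits with the same seed and the same hyperparameters.
sgd_a = SGDClassifier(penalty='elasticnet', random_state=0).fit(x_train, y_train)
sgd_b = SGDClassifier(penalty='elasticnet', random_state=0).fit(x_train, y_train)

same_weights = np.allclose(sgd_a.coef_, sgd_b.coef_)
same_predictions = np.array_equal(sgd_a.predict(x_valid), sgd_b.predict(x_valid))
print(same_weights, same_predictions)
```

Without random_state, the shuffling inside SGD changes from run to run, so two otherwise identical fits can score differently.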
sgd = SGDClassifier(penalty = 'elasticnet', random_state = 0, n_jobs = -1)
sgd.fit(x_train, y_train)
SGDClassifier(n_jobs=-1, penalty='elasticnet', random_state=0)
prediction = sgd.predict(x_valid)
(prediction == y_valid).mean()
0.6842105263157895
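Trying one setting by hand is only a first step. One common way to search several settings systematically is GridSearchCV, sketched below. The grid values, cv=5, and the seeds are hypothetical choices for illustration, not settings from this post.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = load_iris(return_X_y=True)
# Seeded split is an assumption for repeatability.
x_train, x_valid, y_train, y_valid = train_test_split(X, y, stratify=y, random_state=0)

# Hypothetical grid: the candidate values are illustrative only.
param_grid = {
    'penalty': ['l2', 'l1', 'elasticnet'],
    'alpha': [1e-4, 1e-3, 1e-2],
}

# 5-fold cross-validation on the training split for every combination in the grid.
grid = GridSearchCV(SGDClassifier(random_state=0), param_grid, cv=5, scoring='accuracy')
grid.fit(x_train, y_train)

print(grid.best_params_)
print(grid.best_score_)

# The best estimator is refit on the whole training split, then scored on held-out data.
valid_acc = grid.score(x_valid, y_valid)
print(valid_acc)
```

Selecting by cross-validated score on the training split keeps the validation set untouched until the final check.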