
Classification (iris data with SGD and hyper-parameter tuning)

import warnings

# Avoid unnecessary warning output.
warnings.filterwarnings('ignore')
import pandas as pd
import seaborn as sns
from sklearn.datasets import load_iris
iris = load_iris()
data = iris['data']
feature_names = iris['feature_names']
target = iris['target']
iris['target_names']
array(['setosa', 'versicolor', 'virginica'], dtype='<U10')

Make DataFrame

df_iris = pd.DataFrame(data, columns=feature_names)
df_iris.head()
sepal length (cm) sepal width (cm) petal length (cm) petal width (cm)
0 5.1 3.5 1.4 0.2
1 4.9 3.0 1.4 0.2
2 4.7 3.2 1.3 0.2
3 4.6 3.1 1.5 0.2
4 5.0 3.6 1.4 0.2
df_iris['target'] = target
df_iris.head()
sepal length (cm) sepal width (cm) petal length (cm) petal width (cm) target
0 5.1 3.5 1.4 0.2 0
1 4.9 3.0 1.4 0.2 0
2 4.7 3.2 1.3 0.2 0
3 4.6 3.1 1.5 0.2 0
4 5.0 3.6 1.4 0.2 0
from sklearn.model_selection import train_test_split
# stratify keeps the class ratio the same in the train and validation splits
x_train, x_valid, y_train, y_valid = train_test_split(df_iris.drop('target', axis=1), df_iris['target'], stratify=df_iris['target'])
sns.countplot(x=y_train)
<AxesSubplot:xlabel='target', ylabel='count'>

x_train.shape, y_train.shape
((112, 4), (112,))
x_valid.shape, y_valid.shape
((38, 4), (38,))

SGDClassifier

from IPython.display import Image
# source: https://machinelearningnotepad.wordpress.com/
Image('https://machinelearningnotepad.files.wordpress.com/2018/04/yk1mk.png', width=500)

scikit-learn documentation: SGDClassifier
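
The figure above illustrates why SGD is noisy: each update is computed from a single sample (or a small batch) rather than the full dataset, so the weights zig-zag toward the minimum. As a minimal sketch of that per-sample update, assuming a hinge loss, a fixed learning rate, and a small L2 penalty (an illustration only, not scikit-learn's internal implementation):

import numpy as np

def sgd_epoch(X, y, w, b, lr=0.01, alpha=1e-4):
    # One pass over the data; y is assumed to be encoded as -1 / +1.
    for i in np.random.permutation(len(X)):
        margin = y[i] * (X[i] @ w + b)
        if margin < 1:                      # hinge loss is active for this sample
            w += lr * y[i] * X[i]           # gradient step using only sample i
            b += lr * y[i]
        w -= lr * alpha * w                 # L2 shrinkage (alpha is an assumed value)
    return w, b

SGDClassifier performs this kind of per-sample update internally and exposes the loss, penalty, learning-rate schedule, and regularization strength as hyper-parameters.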

from sklearn.linear_model import SGDClassifier
sgd = SGDClassifier()
sgd.fit(x_train, y_train)
SGDClassifier()
prediction = sgd.predict(x_valid)
(prediction == y_valid).mean()
0.7894736842105263
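
(prediction == y_valid).mean() is simply the fraction of correct predictions; it gives the same value as scikit-learn's accuracy_score:

from sklearn.metrics import accuracy_score
accuracy_score(y_valid, prediction)  # same result as (prediction == y_valid).mean()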

Hyper-parameter tuning

scikit-learn documentation

  • random_state: fix the random seed while tuning hyper-parameters so that runs are comparable

  • n_jobs=-1: use all CPU cores (faster training)

sgd = SGDClassifier(penalty='elasticnet', random_state=0, n_jobs=-1)
sgd.fit(x_train, y_train)
SGDClassifier(n_jobs=-1, penalty='elasticnet', random_state=0)
prediction = sgd.predict(x_valid)
(prediction == y_valid).mean()
0.6842105263157895
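
Switching to the elasticnet penalty by hand actually lowered the validation accuracy, which shows that picking a single hyper-parameter value in isolation can easily hurt. A more systematic option is a small grid search; the sketch below uses GridSearchCV with an assumed parameter grid (the specific values are examples, not tuned recommendations):

from sklearn.model_selection import GridSearchCV

param_grid = {
    'penalty': ['l2', 'l1', 'elasticnet'],
    'alpha': [1e-4, 1e-3, 1e-2],     # regularization strength
    'max_iter': [1000, 3000],
}
grid = GridSearchCV(SGDClassifier(random_state=0), param_grid, cv=5, n_jobs=-1)
grid.fit(x_train, y_train)
grid.best_params_
(grid.predict(x_valid) == y_valid).mean()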
