
[KT AIVLE School]

Visual Intelligence Deep Learning #2 - CNN Practice 2022/09/19~2022/09/23

jjinyeok 2022. 9. 20. 17:08

  In the previous post, I covered the overall outline of CNNs and the Convolution and Pooling layers. This post works through CNNs in a more code-centric way. Keep in mind that a CNN, like an ANN, is just one technique; the difference is that it uses Feature Maps to reflect the spatial structure of the data. There is therefore no single fixed CNN model: the developer freely stacks Convolution and Pooling layers to build Feature Maps, and the resulting Feature Maps are passed through a Neural Network to produce the output. In the process, the Filters are learned just like any other Weights. Let's start with this structure in mind.
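  To illustrate that the Filters really are learned like any other Weights, here is a minimal sketch (my own example, not part of the course code) that builds a single Conv2D layer and inspects its trainable kernel:

import tensorflow as tf
from tensorflow.keras.layers import Conv2D

# A single convolution layer: 32 filters of size 3x3 over a 1-channel input
layer = Conv2D(filters=32, kernel_size=(3, 3), padding='same')
layer.build(input_shape=(None, 28, 28, 1))

# The kernel is an ordinary trainable variable, updated by backpropagation
print(layer.kernel.shape)             # (3, 3, 1, 32)
print(layer.bias.shape)               # (32,)
print(len(layer.trainable_weights))   # 2 (kernel + bias)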

 

1. Solving a Classification Problem on Fashion MNIST with a CNN

  Before implementing the Fashion MNIST model with a CNN, here is its structure. Every Activation Function is swish, except for the softmax on the Output Layer. (A quick shape check follows the list below.)

  1. Input Layer
  2. Convolution: 32 filters, size (3, 3), same padding
  3. BatchNormalization
  4. Convolution: 32 filters, size (3, 3), same padding
  5. BatchNormalization
  6. MaxPooling: size (2, 2), stride (2, 2)
  7. Dropout: 25% of units disabled
  8. Convolution: 64 filters, size (3, 3), same padding
  9. BatchNormalization
  10. Convolution: 64 filters, size (3, 3), same padding
  11. BatchNormalization
  12. MaxPooling: size (2, 2), stride (2, 2)
  13. Dropout: 25% of units disabled
  14. Flatten()
  15. Fully Connected Layer: 512 nodes
  16. BatchNormalization
  17. Output Layer
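  Before the code, it is worth tracing how the feature-map shape evolves: same padding keeps the 28x28 spatial size through each Convolution, and each (2, 2) MaxPooling with stride (2, 2) halves it, so 28 → 14 → 7, and Flatten produces 7 * 7 * 64 = 3136 features. A small sketch of this arithmetic (my own check):

# Trace the feature-map shape through the architecture above
h = w = 28                    # 28x28 grayscale input
c = 1
for filters in [32, 64]:      # one block = two 'same'-padded Conv2D + MaxPooling
    c = filters               # 'same' padding keeps h, w; channels become `filters`
    h, w = h // 2, w // 2     # (2, 2) MaxPooling with stride (2, 2) halves h, w
print(h, w, c)                # 7 7 64
print(h * w * c)              # 3136 -> the Flatten size in model.summary() below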
import tensorflow as tf
from tensorflow import keras

'Load the data'
(x_train, y_train), (x_test, y_test) = keras.datasets.fashion_mnist.load_data()
print(x_train.shape, y_train.shape, x_test.shape, y_test.shape)
# (60000, 28, 28) (60000,) (10000, 28, 28) (10000,)

'Preprocess the data'
# Add a channel dimension to x (image data)
x_train = x_train.reshape(-1, x_train.shape[1], x_train.shape[2], 1)
x_test = x_test.reshape(-1, x_test.shape[1], x_test.shape[2], 1)

# MinMax Scaling for x
x_train = x_train / 255.
x_test = x_test / 255.

# One-Hot Encoding for y
from tensorflow.keras.utils import to_categorical
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)
print(x_train.shape, y_train.shape, x_test.shape, y_test.shape)
# (60000, 28, 28, 1) (60000, 10) (10000, 28, 28, 1) (10000, 10)
'Build the model (Functional Style)'
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Dense, Conv2D, MaxPooling2D, BatchNormalization, Dropout, Flatten
from tensorflow.keras.activations import swish, softmax
from tensorflow.keras.losses import categorical_crossentropy
from tensorflow.keras.optimizers import Adam

# Session Clear
keras.backend.clear_session()

# Layer
input_layer = Input(shape=(28, 28, 1))
conv_layer = Conv2D(filters=32,
                    kernel_size=(3, 3),
                    strides=(1, 1),
                    padding='same',
                    activation=swish,
                    )(input_layer)
batch_norm_layer = BatchNormalization()(conv_layer)
conv_layer = Conv2D(filters=32,
                    kernel_size=(3, 3),
                    strides=(1, 1),
                    padding='same',
                    activation=swish,
                    )(batch_norm_layer)
batch_norm_layer = BatchNormalization()(conv_layer)
pool_layer = MaxPooling2D(pool_size=(2, 2),
                          strides=(2, 2),
                          )(batch_norm_layer)
dropout_layer = Dropout(rate=0.25)(pool_layer)
conv_layer = Conv2D(filters=64,
                    kernel_size=(3, 3),
                    strides=(1, 1),
                    padding='same',
                    activation=swish,
                    )(dropout_layer)
batch_norm_layer = BatchNormalization()(conv_layer)
conv_layer = Conv2D(filters=64,
                    kernel_size=(3, 3),
                    strides=(1, 1),
                    padding='same',
                    activation=swish,
                    )(batch_norm_layer)
batch_norm_layer = BatchNormalization()(conv_layer)
pool_layer = MaxPooling2D(pool_size=(2, 2),
                          strides=(2, 2),
                          )(batch_norm_layer)
dropout_layer = Dropout(rate=0.25)(pool_layer)
flatten_layer = Flatten()(dropout_layer)
hidden_layer = Dense(512, activation=swish)(flatten_layer)
batch_norm_layer = BatchNormalization()(hidden_layer)
output_layer = Dense(10, activation=softmax)(batch_norm_layer)

# Model
model = Model(inputs=input_layer, outputs=output_layer)

# Model Compile
model.compile(loss=categorical_crossentropy, optimizer=Adam(), metrics=['accuracy'])

# Model Summary
print(model.summary())
# Model: "model"
# _________________________________________________________________
#  Layer (type)                Output Shape              Param #   
# =================================================================
#  input_1 (InputLayer)        [(None, 28, 28, 1)]       0         
                                                                 
#  conv2d (Conv2D)             (None, 28, 28, 32)        320       
                                                                 
#  batch_normalization (BatchN  (None, 28, 28, 32)       128       
#  ormalization)                                                   
                                                                 
#  conv2d_1 (Conv2D)           (None, 28, 28, 32)        9248      
                                                                 
#  batch_normalization_1 (Batc  (None, 28, 28, 32)       128       
#  hNormalization)                                                 
                                                                 
#  max_pooling2d (MaxPooling2D  (None, 14, 14, 32)       0         
#  )                                                               
                                                                 
#  dropout (Dropout)           (None, 14, 14, 32)        0         
                                                                 
#  conv2d_2 (Conv2D)           (None, 14, 14, 64)        18496     
                                                                 
#  batch_normalization_2 (Batc  (None, 14, 14, 64)       256       
#  hNormalization)                                                 
                                                                 
#  conv2d_3 (Conv2D)           (None, 14, 14, 64)        36928     
                                                                 
#  batch_normalization_3 (Batc  (None, 14, 14, 64)       256       
#  hNormalization)                                                 
                                                                 
#  max_pooling2d_1 (MaxPooling  (None, 7, 7, 64)         0         
#  2D)                                                             
                                                                 
#  dropout_1 (Dropout)         (None, 7, 7, 64)          0         
                                                                 
#  flatten (Flatten)           (None, 3136)              0         
                                                                 
#  dense (Dense)               (None, 512)               1606144   
                                                                 
#  batch_normalization_4 (Batc  (None, 512)              2048      
#  hNormalization)                                                 
                                                                 
#  dense_1 (Dense)             (None, 10)                5130      
                                                                 
# =================================================================
# Total params: 1,679,082
# Trainable params: 1,677,674
# Non-trainable params: 1,408
# _________________________________________________________________
# None
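# Sanity check on the parameter counts above (my own verification, not course code):
# Conv2D params = kernel_h * kernel_w * in_channels * filters + filters (bias)
print(3 * 3 * 1 * 32 + 32)    # 320     -> conv2d
print(3 * 3 * 32 * 32 + 32)   # 9248    -> conv2d_1
print(3 * 3 * 32 * 64 + 64)   # 18496   -> conv2d_2
print(3 * 3 * 64 * 64 + 64)   # 36928   -> conv2d_3
# Dense params = in_features * units + units (bias)
print(3136 * 512 + 512)       # 1606144 -> dense
# Each BatchNormalization has 4 params per channel (gamma, beta, moving mean,
# moving variance); the moving statistics are not trained, which is where the
# 1,408 non-trainable params come from: 2 * (32 + 32 + 64 + 64 + 512) = 1408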
'Train the model with Early Stopping'
from tensorflow.keras.callbacks import EarlyStopping
early_stopping = EarlyStopping(monitor='val_loss',
                               min_delta=0,
                               patience=5,
                               verbose=1,
                               restore_best_weights=True,
                               )

model.fit(x_train, y_train, epochs=100, verbose=1,
          validation_split=0.2,
          callbacks=[early_stopping],
          )
'Evaluate the model'
from sklearn.metrics import accuracy_score, classification_report
pred = model.predict(x_test)
print(accuracy_score(y_test.argmax(axis=1), pred.argmax(axis=1)))
# 0.9252
print(classification_report(y_test.argmax(axis=1), pred.argmax(axis=1)))
#               precision    recall  f1-score   support

#            0       0.88      0.90      0.89       982
#            1       0.99      0.99      0.99       992
#            2       0.91      0.90      0.91      1008
#            3       0.93      0.94      0.93       984
#            4       0.93      0.87      0.90      1067
#            5       0.99      0.99      0.99       995
#            6       0.77      0.80      0.79       961
#            7       0.95      0.98      0.96       968
#            8       0.99      0.99      0.99      1005
#            9       0.98      0.95      0.97      1038

#     accuracy                           0.93     10000
#    macro avg       0.93      0.93      0.93     10000
# weighted avg       0.93      0.93      0.93     10000
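  In the report above, class 6 has the lowest f1-score. Mapping the indices to the standard Fashion MNIST class names makes this easier to interpret; a small sketch reusing pred and y_test from above (the label list follows the dataset's documented order):

class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
print(classification_report(y_test.argmax(axis=1), pred.argmax(axis=1),
                            target_names=class_names))
# Class 6 (Shirt) is the weakest, plausibly because it is visually
# close to T-shirt/top, Pullover, and Coat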

(Figure: layers of the model above)

 

2. Solving a Classification Problem on CIFAR-10 with a CNN

  When I solved the Multi-Class Classification problem on the CIFAR-10 dataset with an ANN, the Accuracy stayed quite low at around 0.4 no matter how many Hidden Layers I stacked. How much can the Accuracy improve when a CNN is used to reflect the spatial structure?

import tensorflow as tf
from tensorflow import keras

'Load the data'
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
print(x_train.shape, y_train.shape, x_test.shape, y_test.shape)
# (50000, 32, 32, 3) (50000, 1) (10000, 32, 32, 3) (10000, 1)

'Preprocess the data'
# Standardization Scaling for x
mean_x = x_train.mean()
std_x = x_train.std()
x_train = (x_train - mean_x) / std_x
x_test = (x_test - mean_x) / std_x
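# The mean and std are computed from the training set only and reused for the
# test set, so no information from the test set leaks into the scaling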

# One-Hot Encoding for y
from tensorflow.keras.utils import to_categorical
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)
print(x_train.shape, y_train.shape, x_test.shape, y_test.shape)
# (50000, 32, 32, 3) (50000, 10) (10000, 32, 32, 3) (10000, 10)
'Build the model (Functional Style)'
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Dense, Conv2D, MaxPooling2D, BatchNormalization, Dropout, Flatten
from tensorflow.keras.activations import swish, softmax
from tensorflow.keras.losses import categorical_crossentropy
from tensorflow.keras.optimizers import Adam

keras.backend.clear_session()

input_layer = Input(shape=(32, 32, 3))
conv_layer = Conv2D(filters=32,
                    kernel_size=(3, 3),
                    strides=(1, 1),
                    padding='same',
                    activation=swish,
                    )(input_layer)
conv_layer = Conv2D(filters=32,
                    kernel_size=(3, 3),
                    strides=(1, 1),
                    padding='same',
                    activation=swish,
                    )(conv_layer)
batch_norm_layer = BatchNormalization()(conv_layer)
pool_layer = MaxPooling2D(pool_size=(2, 2),
                          strides=(2, 2),
                          )(batch_norm_layer)
dropout_layer = Dropout(rate=0.25)(pool_layer)
conv_layer = Conv2D(filters=64,
                    kernel_size=(3, 3),
                    strides=(1, 1),
                    padding='same',
                    activation=swish,
                    )(dropout_layer)
conv_layer = Conv2D(filters=64,
                    kernel_size=(3, 3),
                    strides=(1, 1),
                    padding='same',
                    activation=swish,
                    )(conv_layer)
batch_norm_layer = BatchNormalization()(conv_layer)
pool_layer = MaxPooling2D(pool_size=(2, 2),
                          strides=(2, 2),
                          )(batch_norm_layer)
dropout_layer = Dropout(rate=0.25)(pool_layer)
flatten_layer = Flatten()(dropout_layer)
hidden_layer = Dense(1024, activation=swish)(flatten_layer)
batch_norm_layer = BatchNormalization()(hidden_layer)
dropout_layer = Dropout(rate=0.35)(batch_norm_layer)
output_layer = Dense(10, activation=softmax)(dropout_layer)

model = Model(inputs=input_layer, outputs=output_layer)

model.compile(loss=categorical_crossentropy, optimizer=Adam(), metrics=['accuracy'])

print(model.summary())
# Model: "model"
# _________________________________________________________________
#  Layer (type)                Output Shape              Param #   
# =================================================================
#  input_1 (InputLayer)        [(None, 32, 32, 3)]       0         
                                                                 
#  conv2d (Conv2D)             (None, 32, 32, 32)        896       
                                                                 
#  conv2d_1 (Conv2D)           (None, 32, 32, 32)        9248      
                                                                 
#  batch_normalization (BatchN  (None, 32, 32, 32)       128       
#  ormalization)                                                   
                                                                 
#  max_pooling2d (MaxPooling2D  (None, 16, 16, 32)       0         
#  )                                                               
                                                                 
#  dropout (Dropout)           (None, 16, 16, 32)        0         
                                                                 
#  conv2d_2 (Conv2D)           (None, 16, 16, 64)        18496     
                                                                 
#  conv2d_3 (Conv2D)           (None, 16, 16, 64)        36928     
                                                                 
#  batch_normalization_1 (Batc  (None, 16, 16, 64)       256       
#  hNormalization)                                                 
                                                                 
#  max_pooling2d_1 (MaxPooling  (None, 8, 8, 64)         0         
#  2D)                                                             
                                                                 
#  dropout_1 (Dropout)         (None, 8, 8, 64)          0         
                                                                 
#  flatten (Flatten)           (None, 4096)              0         
                                                                 
#  dense (Dense)               (None, 1024)              4195328   
                                                                 
#  batch_normalization_2 (Batc  (None, 1024)             4096      
#  hNormalization)                                                 
                                                                 
#  dropout_2 (Dropout)         (None, 1024)              0         
                                                                 
#  dense_1 (Dense)             (None, 10)                10250     
                                                                 
# =================================================================
# Total params: 4,275,626
# Trainable params: 4,273,386
# Non-trainable params: 2,240
# _________________________________________________________________
# None
'Train the model with Early Stopping'
from tensorflow.keras.callbacks import EarlyStopping

early_stopping = EarlyStopping(monitor='val_loss',
                               min_delta=0,
                               patience=5,
                               verbose=1,
                               restore_best_weights=True,
                               )

model.fit(x_train, y_train, epochs=100, verbose=1,
          validation_split=0.2,
          callbacks=[early_stopping],
          )
'Evaluate the model'
from sklearn.metrics import accuracy_score, classification_report
pred = model.predict(x_test)
print(accuracy_score(y_test.argmax(axis=1), pred.argmax(axis=1)))
# 0.7975
print(classification_report(y_test.argmax(axis=1), pred.argmax(axis=1)))
#               precision    recall  f1-score   support

#            0       0.81      0.84      0.83      1000
#            1       0.90      0.90      0.90      1000
#            2       0.73      0.70      0.72      1000
#            3       0.62      0.60      0.61      1000
#            4       0.79      0.73      0.76      1000
#            5       0.67      0.76      0.71      1000
#            6       0.80      0.88      0.84      1000
#            7       0.90      0.79      0.84      1000
#            8       0.91      0.87      0.89      1000
#            9       0.86      0.90      0.88      1000

#     accuracy                           0.80     10000
#    macro avg       0.80      0.80      0.80     10000
# weighted avg       0.80      0.80      0.80     10000
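  Naming the classes helps here as well. Using the standard CIFAR-10 label order, the weakest classes are 3 (cat, f1 0.61) and 5 (dog, f1 0.71), a classically confusable pair; a small sketch reusing pred and y_test from above:

class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']
print(classification_report(y_test.argmax(axis=1), pred.argmax(axis=1),
                            target_names=class_names))
# cat and dog have the lowest f1-scores in the report above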

  The Accuracy improved dramatically, from around 0.4 to around 0.8. The quantity and quality of the data matter, but this shows that the choice of algorithm, such as a CNN, matters just as much. In addition, the CIFAR-10 images are so low-resolution that there are many I personally cannot tell apart either. Considering that CIFAR-10 is a fairly hard dataset, an Accuracy of about 0.8 is quite respectable.
