binary classification

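(The instructor is different from the earlier lectures, so variable names and the like are inconsistent. The library versions do not match either. Some parts may be wrong.)

Binary classification is a method for predicting the label of a new input when each data point's label is either 0 or 1.

Consider a data set of (x, y) pairs in which y takes the value 0 or 1 depending on x.

As the left figure shows, fitting this with linear regression is unnatural and difficult, so an approach like the one in the right figure is needed.

In the general case, i.e. for data of the form ( x1, x2, x3, ..., xn | y ), the hypothesis function from linear regression is reused as the linear part $f(X)$ (the input to the sigmoid, often called the logit), and the hypothesis function is defined as $sigmoid(f(X))$.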

$$f(X) = XW + b $$

$$h(X) = \frac{1}{1+e^{-f(X)}}\;\;\;\;\;\;(0 < h(X) < 1)$$
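To make the two formulas concrete, here is a minimal plain-Python sketch that evaluates $h$ for single samples. The weights `w` and `b` are made-up values chosen only for illustration, and the helper names are not from the lecture.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + math.exp(-z))

def hypothesis(x, w, b):
    """h(x) = sigmoid(x . w + b) for one sample given as a list of features."""
    f = sum(xi * wi for xi, wi in zip(x, w)) + b
    return sigmoid(f)

# Hypothetical weights, chosen only for illustration
w, b = [1.0, -1.0], 0.0
for x in ([1.0, 1.0], [3.0, 0.0], [0.0, 3.0]):
    # f = 0 gives exactly 0.5; f = 3 and f = -3 give about 0.9526 and 0.0474
    print(x, round(hypothesis(x, w, b), 4))
```

Large positive values of $f$ push the output toward 1 and large negative values push it toward 0, which is why the output can be read as the probability of label 1.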

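If the cost function were the mean of squared errors, as in linear regression, it would no longer be convex once the sigmoid is applied, so gradient descent could stall away from the global minimum. A different cost function is therefore used.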
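The convexity claim can be probed numerically. The sketch below assumes a toy setup that is not in the lecture (a single sample with x = 1, label y = 0, and no bias) and compares the squared-error cost with the log-based cost that the post uses next. A convex function never lies above the average of two endpoints at their midpoint, so a positive gap is a counterexample.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Assumed toy setup: one sample with x = 1, label y = 0, no bias term
def squared_error(w):
    return (sigmoid(w) - 0.0) ** 2

def log_cost(w):
    # -[y*log(h) + (1-y)*log(1-h)] with y = 0
    return -math.log(1.0 - sigmoid(w))

def midpoint_gap(cost, a, b):
    """cost(midpoint) minus the average of the endpoints; > 0 violates convexity."""
    return cost((a + b) / 2.0) - (cost(a) + cost(b)) / 2.0

print(midpoint_gap(squared_error, 0.0, 6.0))  # positive: not convex
print(midpoint_gap(log_cost, 0.0, 6.0))       # negative: consistent with convexity
```

A single midpoint check cannot prove that the log-based cost is convex. It only shows that the squared-error version fails the test while the log-based one passes it on this interval.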

In code, the cost function looks like this.

cost = -tf.reduce_mean(labels*tf.math.log(hypothesis) + (1-labels) * tf.math.log(1-hypothesis))
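Written out, this line computes the mean binary cross-entropy over the $m$ samples:

$$cost(W, b) = -\frac{1}{m}\sum_{i=1}^{m}\left[\,y_i \log h(x_i) + (1-y_i)\log(1-h(x_i))\,\right]$$

The following plain-Python sketch mirrors the same formula without TensorFlow. The prediction lists are made-up numbers, used only to show how the cost reacts to good and bad predictions.

```python
import math

def binary_cross_entropy(labels, hypothesis):
    """Mean of -[y*log(h) + (1-y)*log(1-h)] over all samples."""
    total = 0.0
    for y, h in zip(labels, hypothesis):
        total += y * math.log(h) + (1 - y) * math.log(1 - h)
    return -total / len(labels)

labels = [0, 0, 1, 1]
confident_right = [0.1, 0.2, 0.8, 0.9]  # made-up predictions close to the labels
confident_wrong = [0.9, 0.8, 0.2, 0.1]  # the same numbers on the wrong side

print(binary_cross_entropy(labels, confident_right))  # small cost
print(binary_cross_entropy(labels, confident_wrong))  # much larger cost
print(binary_cross_entropy(labels, [0.5] * 4))        # log(2), the cost when W = 0 and b = 0
```

Predictions near the true label contribute almost nothing, while confident mistakes are penalized heavily. With all weights at zero every prediction is 0.5, so training starts from a cost of log 2, about 0.6931.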

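Once the hypothesis and cost functions are in place, run gradient descent just as in linear regression. Let's work through the following example.

The algorithm adjusts the weight vector using data like the figure above, in which each point's color is its label (red 0, blue 1).

(Initialization)

W and b are normally assigned random values (here they are simply set to 0).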

W = tf.Variable(tf.zeros([2, 1]), name='weight')
b = tf.Variable(tf.zeros([1]), name='bias')

(Iteration)

Define the cost function properly and run gradient descent the same way as before.

with tf.GradientTape() as tape:
    cost = cost_fn(x_train, y_train)
    
W_grad, b_grad = tape.gradient(cost, [W,b])

W.assign_sub(learning_rate * W_grad)
b.assign_sub(learning_rate * b_grad)
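`tf.GradientTape` computes these derivatives automatically. For this particular cost they also have a simple closed form: the gradient with respect to each weight is the mean of $(h(x_i) - y_i)\,x_{ij}$, and the gradient with respect to the bias is the mean of $(h(x_i) - y_i)$. The sketch below performs one update by hand on the training data from this post. The helper names are illustrative, not part of the lecture code.

```python
import math

# Training data from the post
x_train = [[1, 2], [2, 3], [3, 1], [4, 3], [5, 3], [6, 2]]
y_train = [0, 0, 0, 1, 1, 1]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, W, b):
    return sigmoid(x[0] * W[0] + x[1] * W[1] + b)

def gradients(W, b):
    """Closed-form gradient of the mean cross-entropy with respect to W and b."""
    m = len(x_train)
    errors = [predict(x, W, b) - y for x, y in zip(x_train, y_train)]
    W_grad = [sum(e * x[j] for e, x in zip(errors, x_train)) / m for j in range(2)]
    b_grad = sum(errors) / m
    return W_grad, b_grad

# One gradient-descent step from W = 0, b = 0, mirroring assign_sub
learning_rate = 0.01
W, b = [0.0, 0.0], 0.0
W_grad, b_grad = gradients(W, b)
W = [w - learning_rate * g for w, g in zip(W, W_grad)]
b = b - learning_rate * b_grad
print(W_grad, b_grad)  # approximately [-0.75, -0.1667] and 0.0
print(W, b)            # approximately [0.0075, 0.001667] and 0.0
```

Both weight gradients are negative at the start, so the first step raises W1 and W2. W1 moves faster because x1 is the feature that separates the two classes in this data.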

Full code

import tensorflow as tf
import numpy as np

x_train = np.array([
    [1, 2],
    [2, 3],
    [3, 1],
    [4, 3],
    [5, 3],
    [6, 2]], dtype=np.float32)
y_train = np.array([
    [0],
    [0],
    [0],
    [1],
    [1],
    [1]], dtype=np.float32)

W = tf.Variable(tf.zeros([2, 1]), name='weight')
b = tf.Variable(tf.zeros([1]), name='bias')

def logistic_regression(features):
    hypothesis = tf.sigmoid(tf.matmul(features, W) + b)
    return hypothesis

def cost_fn(features, labels):
    hypothesis = logistic_regression(features)
    cost = -tf.reduce_mean(labels * tf.math.log(hypothesis) + (1 - labels) * tf.math.log(1 - hypothesis))
    return cost

n_epochs = 3000
learning_rate = 0.01
print("  iter|        cost|          W1|          W2")
for i in range(n_epochs+1):
    with tf.GradientTape() as tape:
        cost = cost_fn(x_train, y_train)
        
    W_grad, b_grad = tape.gradient(cost, [W,b])
    
    W.assign_sub(learning_rate * W_grad)
    b.assign_sub(learning_rate * b_grad)
    
    if i%100 == 0:
        print("{:5} | {:10.4f} | {:10.4f} | {:10.4f}".format(i,cost.numpy(), W.numpy()[0][0], W.numpy()[1][0]))

Now let's use the hypothesis function obtained by gradient descent to predict new data.

# Append this below the code above
x_test = np.array([
    [2, 2],
    [2, 0],
    [4, 0],
    [4, 2],
    [5, 3]], dtype=np.float32)

for x in x_test:
    x = tf.expand_dims(x, 0)  # add a batch axis, (2,) -> (1, 2): tf.matmul needs inputs of rank 2 or higher
    output = tf.sigmoid(tf.matmul(x, W)+b)
    print("{}".format(output.numpy()))

Points plotted by hand at www.desmos.com

We can see that it works well.
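The hypothesis outputs a value between 0 and 1, so a final label still requires a cutoff. A common choice, assumed here because the post does not state one, is to predict 1 when the output is at least 0.5. The parameters below are hypothetical and are NOT the trained values. They simply place the decision boundary at x1 = 3.5, which separates the training points.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_label(x, W, b, threshold=0.5):
    """Return (probability, label); label is 1 when h(x) >= threshold."""
    h = sigmoid(x[0] * W[0] + x[1] * W[1] + b)
    return h, int(h >= threshold)

# Hypothetical parameters (NOT the trained values): a boundary at x1 = 3.5
W, b = [1.0, 0.0], -3.5
x_test = [[2, 2], [2, 0], [4, 0], [4, 2], [5, 3]]  # test points from the post
labels = [predict_label(x, W, b)[1] for x in x_test]
print(labels)  # [0, 0, 1, 1, 1]
```

Because the sigmoid equals 0.5 exactly when $f(X) = 0$, thresholding at 0.5 is the same as checking the sign of $XW + b$.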

Someone in the comments used Keras, so here is a review of that version.

import tensorflow as tf
import numpy as np

x_train = np.array([
    [1, 2],
    [2, 3],
    [3, 1],
    [4, 3],
    [5, 3],
    [6, 2]], dtype=np.float32)
y_train = np.array([
    [0],
    [0],
    [0],
    [1],
    [1],
    [1]], dtype=np.float32)


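# Feed the values through a tf.data.Dataset pipeline. This is reportedly done for compatibility with GPUs and the like
# The from_tensor_slices static method can build a dataset from lists, NumPy arrays, or TensorFlow tensors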
dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train)).batch(len(x_train))
W = tf.Variable(tf.zeros([2, 1]), name='weight')
b = tf.Variable(tf.zeros([1]), name='bias')

def logistic_regression(features):
    hypothesis = tf.sigmoid(tf.matmul(features, W) + b)
    return hypothesis

def cost_fn(features, labels):
    hypothesis = logistic_regression(features)
    cost = -tf.reduce_mean(labels * tf.math.log(hypothesis) + (1 - labels) * tf.math.log(1 - hypothesis))
    return cost

def grad(hypothesis, features, labels):
    with tf.GradientTape() as tape:
        cost = cost_fn(features, labels)
    return tape.gradient(cost, [W,b])

optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)

EPOCHS = 3000

for step in range(EPOCHS + 1):
    for features, labels in dataset:  # the single batch holds the whole data set, so this loop body runs once per step
        hypothesis = logistic_regression(features)
        grads = grad(hypothesis, features, labels)
        optimizer.apply_gradients(grads_and_vars=zip(grads, [W, b]))  # plays the role of W.assign_sub(learning_rate * W_grad)
        if step % 100 == 0:
            print("Iter: {}, cost: {:.4f}, W1 : {:.4f}, W2 : {:.4f}".format(
                step, cost_fn(features, labels).numpy(), W.numpy()[0][0], W.numpy()[1][0]))

The results match those of the code above, though it runs a bit slower.
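The inner loop in the Keras-style code runs only once per step because the dataset is batched with a batch size equal to the number of samples. The plain-Python sketch below uses a hypothetical `batches` helper as a stand-in for `Dataset.batch` to show how the batch size determines the number of updates per pass over the data.

```python
def batches(samples, batch_size):
    """Yield consecutive chunks of at most batch_size items (a stand-in for Dataset.batch)."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]

samples = list(range(6))  # six samples, as in x_train
print(len(list(batches(samples, 6))))  # 1: full batch, one update per pass
print(len(list(batches(samples, 2))))  # 3: mini-batches, three updates per pass
```

With a full batch, every step uses the same gradient as the first version of the code, which is consistent with the two versions ending up at the same weights.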

First code (left), Keras version (right)
