What is AI?
AI is the intelligence of machines and the branch of computer science that aims to create it.
AI is the science and engineering of making intelligent machines, especially intelligent computer programs.
AI is often used to describe machines that mimic “cognitive” functions that humans associate with the human mind, such as “learning” and “problem solving”.
Unsupervised Learning: This type of learning focuses on finding patterns, regularities, or structure in unlabeled data.
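As a minimal sketch of finding structure in unlabeled data, here is a tiny k-means clustering loop on toy 2-D points. The data, blob centers, and seeding choice are all illustrative assumptions, not from the notes.

```python
import numpy as np

# Toy unlabeled data: two loose blobs in 2-D (values are illustrative).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, size=(20, 2)),
               rng.normal(3.0, 0.3, size=(20, 2))])

# Minimal k-means: alternate point-assignment and centroid-update steps.
centroids = X[[0, 20]].copy()   # seed with one point from each blob
for _ in range(10):
    # Assign every point to its nearest centroid.
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    # Move each centroid to the mean of the points assigned to it.
    centroids = np.array([X[labels == k].mean(axis=0) for k in range(2)])

# The centroids recover the two blob centers without using any labels.
print(centroids.round(1))
```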
Overview of ANNs
Motivation problem
: The idea of image classification is introduced through a car vs. truck classifier problem.
: Input (an image) -> output (assignment to one of a fixed set of categories)
How a computer sees an image
Why visual recognition is hard
Steps of machine-learning-based image classification
Step 1: Prepare the dataset
Extract the pixel values of each image: an RGB image becomes the [R, G, B] values of each pixel, or is converted to grayscale and then flattened into a 1-D vector.
Step 2: Split the dataset
Train/Validation/Test
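The two steps above can be sketched as follows. The dataset is a random stand-in (shapes, split ratios, and class names are illustrative assumptions):

```python
import numpy as np

# Toy stand-in for an image dataset: 100 RGB images of size 8x8
# (a real dataset would be loaded from disk).
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(100, 8, 8, 3), dtype=np.uint8)
labels = rng.integers(0, 2, size=100)          # 0 = car, 1 = truck (assumed)

# Step 1: flatten each image's [R, G, B] pixel values into a 1-D vector.
X = images.reshape(len(images), -1).astype(np.float32) / 255.0
print(X.shape)  # (100, 192): 8 * 8 * 3 values per image

# Step 2: shuffle, then split into train / validation / test (70 / 15 / 15).
idx = rng.permutation(len(X))
train, val, test = idx[:70], idx[70:85], idx[85:]
X_train, y_train = X[train], labels[train]
X_val, y_val = X[val], labels[val]
X_test, y_test = X[test], labels[test]
```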
Basic machine learning models
Selecting a baseline model
Feature Engineering
Solution: use an ANN -> let the machine learning model learn meaningful features automatically.
ANN concept
: A model that mimics the brain, learning through a structure based on neurons.
Multi-Layer Neural Networks
Basic ANN structure: Input Layer -> Hidden Layer -> Output Layer
Why is a single layer not enough?
: A single-layer neural network (1-layer ANN) = logistic regression
It cannot properly learn non-linear data.
It can only apply a simple linear transformation.
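XOR is the classic example of data a single linear layer cannot handle: no single line separates {(0,0), (1,1)} from {(0,1), (1,0)}. A hand-crafted two-layer ReLU network (the weights below are a well-known construction, not from the notes) solves it exactly:

```python
# A 2-layer network with a ReLU hidden layer can represent XOR,
# which logistic regression (one linear layer) provably cannot.

def relu(z):
    return max(0.0, z)

def two_layer_xor(x1, x2):
    h1 = relu(x1 + x2)            # hidden unit 1
    h2 = relu(x1 + x2 - 1.0)      # hidden unit 2
    return h1 - 2.0 * h2          # linear output layer

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print((x1, x2), "->", two_layer_xor(x1, x2))
# (0,0)->0.0, (0,1)->1.0, (1,0)->1.0, (1,1)->0.0 : exactly XOR
```

The non-linearity of ReLU between the two linear layers is what makes the curved (here, piecewise-linear) decision boundary possible.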
Activation Functions
1) Step Function: hard to use because it is not differentiable
2) Sigmoid
: output range: (0, 1)
Problem: for large-magnitude inputs the gradient disappears (the vanishing gradient problem)
3) Other Activation Functions: ReLU, Tanh, etc. are also commonly used.
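The four activation functions above can be written out directly (a plain-Python sketch; scalar versions for readability):

```python
import math

def step(z):                      # not differentiable at 0; rarely used
    return 1.0 if z >= 0 else 0.0

def sigmoid(z):                   # output in (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def tanh(z):                      # output in (-1, 1)
    return math.tanh(z)

def relu(z):                      # zero for negatives, identity otherwise
    return max(0.0, z)

# Sigmoid saturates for large |z|: its derivative s(z) * (1 - s(z))
# approaches 0, which is the source of vanishing gradients.
s = sigmoid(10.0)
print(s * (1 - s))  # a tiny gradient
```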
Training Neural Networks
Loss Function
Backpropagation
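As a minimal sketch of loss plus backpropagation, here is a 2-layer sigmoid network with hand-derived gradients, verified against a finite-difference estimate. All shapes, values, and the squared-error loss are illustrative assumptions:

```python
import numpy as np

# Tiny 2-layer network (sigmoid hidden and output) with manual backprop.
rng = np.random.default_rng(0)
x = rng.normal(size=(3,))            # one input example, 3 features
y = 1.0                              # target
W1 = rng.normal(size=(4, 3)) * 0.5   # hidden-layer weights
W2 = rng.normal(size=(4,)) * 0.5     # output-layer weights

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(W1, W2):
    h = sigmoid(W1 @ x)              # hidden activations
    p = sigmoid(W2 @ h)              # predicted probability
    loss = 0.5 * (p - y) ** 2        # squared-error loss (illustrative)
    return h, p, loss

h, p, loss = forward(W1, W2)

# Backward pass: apply the chain rule layer by layer (backpropagation).
dp = (p - y) * p * (1 - p)           # dLoss / d(output pre-activation)
dW2 = dp * h                         # gradient for output weights
dh = dp * W2                         # gradient flowing into hidden layer
dW1 = (dh * h * (1 - h))[:, None] * x[None, :]

# Numerical check of one entry of dW1 via finite differences.
eps = 1e-6
W1p = W1.copy(); W1p[0, 0] += eps
num = (forward(W1p, W2)[2] - loss) / eps
print(abs(num - dW1[0, 0]) < 1e-4)   # True: backprop matches
```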
Strategies for improving model performance
1) Hyperparameter tuning
- number of hidden layers, number of neurons, choice of activation function
- tuning the learning rate and batch size
2) Regularization
- apply dropout and L1/L2 regularization to prevent overfitting
3) Data Augmentation
- transform the images to increase data diversity and improve generalization
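The data-augmentation idea can be sketched with a few simple array transforms (the image is a random stand-in; the specific transforms are illustrative choices):

```python
import numpy as np

# Simple label-preserving augmentations on a toy 8x8 RGB image.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8)

flipped = img[:, ::-1]               # horizontal flip
crop = img[1:7, 1:7]                 # stand-in for a random crop: 6x6 patch
noisy = np.clip(img.astype(np.int16) + rng.integers(-10, 11, img.shape),
                0, 255).astype(np.uint8)  # small pixel noise

# Each transform yields an extra training example, increasing data
# diversity and helping the model generalize.
print(flipped.shape, crop.shape, noisy.shape)
```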
The main challenges in image classification include viewpoint variation, changes in illumination, occlusion, background clutter, deformation, and intra-class variation.
Logistic regression performs poorly on raw pixel data because it can only draw a single linear decision boundary over the pixel values, while the mapping from raw pixels to object categories is highly non-linear: the same object can produce very different pixel values under different lighting, positions, and poses.
Feature engineering is the process of manually selecting or transforming raw data into meaningful representations for machine learning models.
Limitations: it requires domain expertise, is slow and labor-intensive, and hand-crafted features can miss patterns that actually matter, which is why letting a neural network learn features automatically is attractive.
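A deliberately simple example of a hand-crafted feature: summarizing an RGB image by its per-channel mean intensity (the image here is a random stand-in):

```python
import numpy as np

# Feature engineering by hand: reduce an image to 3 numbers, the mean
# of each color channel.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8)

# 3 values summarize 192 raw pixel values; useful if color alone
# separates the classes, and useless otherwise -- the designer must
# know in advance which features matter.
feature = img.reshape(-1, 3).mean(axis=0)
print(feature.shape)  # (3,)
```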
A single-layer neural network cannot model complex decision boundaries and therefore fails to classify non-linearly separable data.
Without non-linear activation functions, deep neural networks behave as a single linear transformation, limiting their ability to learn complex patterns. Non-linearity allows the network to model intricate relationships in data.
Feed-forward in artificial neural networks refers to the process where the input data passes through the network layer by layer without loops or feedback connections. The information moves in one direction—from the input layer to the output layer.
A fully-connected layer in a neural network means that every neuron in a layer is connected to every neuron in the next layer. This allows the model to learn complex patterns but increases the number of parameters significantly.
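A plain-Python sketch of a fully-connected layer and its parameter count; the layer sizes (192 -> 16 -> 2) are illustrative assumptions:

```python
# Every neuron connects to every neuron in the next layer, so a layer
# mapping n inputs to m outputs has n*m weights plus m biases.
sizes = [192, 16, 2]
params = sum(n * m + m for n, m in zip(sizes, sizes[1:]))
print(params)  # 192*16 + 16 + 16*2 + 2 = 3122

def dense(x, W, b):
    # One fully-connected layer; in feed-forward use, information only
    # flows from inputs to outputs, with no loops or feedback.
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi
            for row, bi in zip(W, b)]
```

Even this tiny network has thousands of parameters, which is why fully-connected layers become expensive for large inputs.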
The training set is used to teach the model, while the test set evaluates its performance on unseen data. Without a separate test set, we cannot measure how well the model generalizes to new examples.
Sigmoid: Maps values to the range (0,1), often used for binary classification but suffers from vanishing gradients.
ReLU (Rectified Linear Unit): Replaces negative values with zero, reducing the vanishing gradient problem but can suffer from dying neurons.
Softmax: Converts logits into probabilities, ensuring that the sum of all class probabilities equals 1, mainly used for multi-class classification.
The softmax activation function transforms a vector of raw scores (logits) into probabilities that sum to 1. Like the sigmoid function, it maps values to a range between 0 and 1. However, while sigmoid applies to binary classification, softmax is used for multi-class classification by normalizing all outputs across multiple classes.
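The softmax transformation described above, written out in plain Python (the input logits are arbitrary example values):

```python
import math

def softmax(logits):
    # Subtracting the max is a standard numerical-stability trick;
    # it does not change the result.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(probs, sum(probs))  # probabilities over 3 classes, summing to 1
```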
The vanishing gradient problem occurs when gradients become too small during backpropagation, causing earlier layers in deep networks to learn slowly or not at all. This leads to inefficient training and poor convergence.
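A back-of-the-envelope illustration: the derivative of sigmoid, s(z)(1 - s(z)), is at most 0.25 (at z = 0), and backpropagation multiplies one such factor per layer, so even in the best case the gradient shrinks geometrically with depth:

```python
# Best-case sigmoid derivative per layer is 0.25; the gradient reaching
# layer 1 of a deep sigmoid network carries one such factor per layer.
best_case = 0.25
for depth in (1, 5, 10, 20):
    print(depth, best_case ** depth)
# at depth 20 the factor is below 1e-12: early layers barely learn
```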
Similarities: Both sigmoid and softmax squash raw scores into values between 0 and 1 that can be interpreted as probabilities, and both are commonly used at the output layer of classifiers.
Differences: Sigmoid maps each score independently and is used for binary classification, whereas softmax normalizes a whole vector of logits jointly so that the class probabilities sum to 1, which suits multi-class classification.