What is AI?
AI is the intelligence of machines and the branch of computer science that aims to create it.
AI is the science and engineering of making intelligent machines, especially intelligent computer programs.
AI is often used to describe machines that mimic “cognitive” functions that humans associate with the human mind, such as “learning” and “problem solving”.
Unsupervised Learning: This type of learning focuses on finding patterns, regularities, or structure in unlabeled data.
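As a minimal sketch of finding structure in unlabeled data, here is a tiny k-means clustering loop on toy 2-D points. The data, blob centers, and seeding choice are all illustrative assumptions, not from the notes.

```python
import numpy as np

# Toy unlabeled data: two loose blobs in 2-D (values are illustrative).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, size=(20, 2)),
               rng.normal(3.0, 0.3, size=(20, 2))])

# Minimal k-means: alternate point-assignment and centroid-update steps.
centroids = X[[0, 20]].copy()   # seed with one point from each blob
for _ in range(10):
    # Assign every point to its nearest centroid.
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    # Move each centroid to the mean of the points assigned to it.
    centroids = np.array([X[labels == k].mean(axis=0) for k in range(2)])

# The centroids recover the two blob centers without using any labels.
print(centroids.round(1))
```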
Overview of ANNs
Motivation problem
: The idea of image classification is introduced through a car vs. truck classifier problem.
: Input (an image) -> output (assignment to one of a fixed set of categories)
How a computer sees an image
Why visual recognition is hard
Steps of machine-learning-based image classification
Step 1: Prepare the dataset
Extract the pixel values of each image: an RGB image becomes the [R, G, B] values of each pixel, or is converted to grayscale and then flattened into a 1-D vector.
Step 2: Split the dataset
Train/Validation/Test
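The two steps above can be sketched as follows. The dataset is a random stand-in (shapes, split ratios, and class names are illustrative assumptions):

```python
import numpy as np

# Toy stand-in for an image dataset: 100 RGB images of size 8x8
# (a real dataset would be loaded from disk).
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(100, 8, 8, 3), dtype=np.uint8)
labels = rng.integers(0, 2, size=100)          # 0 = car, 1 = truck (assumed)

# Step 1: flatten each image's [R, G, B] pixel values into a 1-D vector.
X = images.reshape(len(images), -1).astype(np.float32) / 255.0
print(X.shape)  # (100, 192): 8 * 8 * 3 values per image

# Step 2: shuffle, then split into train / validation / test (70 / 15 / 15).
idx = rng.permutation(len(X))
train, val, test = idx[:70], idx[70:85], idx[85:]
X_train, y_train = X[train], labels[train]
X_val, y_val = X[val], labels[val]
X_test, y_test = X[test], labels[test]
```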
Basic machine learning models
Selecting a baseline model
Feature Engineering
Solution: use an ANN -> let the machine learning model learn meaningful features automatically.
ANN concept
: A model that mimics the brain, learning through a structure based on neurons.
Multi-Layer Neural Networks
Basic ANN structure: Input Layer -> Hidden Layer -> Output Layer
Why is a single layer not enough?
: A single-layer neural network (1-layer ANN) = logistic regression
It cannot properly learn non-linear data.
It can only apply a simple linear transformation.
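XOR is the classic example of data a single linear layer cannot handle: no single line separates {(0,0), (1,1)} from {(0,1), (1,0)}. A hand-crafted two-layer ReLU network (the weights below are a well-known construction, not from the notes) solves it exactly:

```python
# A 2-layer network with a ReLU hidden layer can represent XOR,
# which logistic regression (one linear layer) provably cannot.

def relu(z):
    return max(0.0, z)

def two_layer_xor(x1, x2):
    h1 = relu(x1 + x2)            # hidden unit 1
    h2 = relu(x1 + x2 - 1.0)      # hidden unit 2
    return h1 - 2.0 * h2          # linear output layer

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print((x1, x2), "->", two_layer_xor(x1, x2))
# (0,0)->0.0, (0,1)->1.0, (1,0)->1.0, (1,1)->0.0 : exactly XOR
```

The non-linearity of ReLU between the two linear layers is what makes the curved (here, piecewise-linear) decision boundary possible.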
Activation Functions
1) Step Function: hard to use because it is not differentiable
2) Sigmoid
: output range: (0, 1)
Problem: for large-magnitude inputs the gradient disappears (the vanishing gradient problem)
3) Other Activation Functions: ReLU, Tanh, etc. are also commonly used.
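The four activation functions above can be written out directly (a plain-Python sketch; scalar versions for readability):

```python
import math

def step(z):                      # not differentiable at 0; rarely used
    return 1.0 if z >= 0 else 0.0

def sigmoid(z):                   # output in (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def tanh(z):                      # output in (-1, 1)
    return math.tanh(z)

def relu(z):                      # zero for negatives, identity otherwise
    return max(0.0, z)

# Sigmoid saturates for large |z|: its derivative s(z) * (1 - s(z))
# approaches 0, which is the source of vanishing gradients.
s = sigmoid(10.0)
print(s * (1 - s))  # a tiny gradient
```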
Training Neural Networks
Loss Function
Backpropagation
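As a minimal sketch of loss plus backpropagation, here is a 2-layer sigmoid network with hand-derived gradients, verified against a finite-difference estimate. All shapes, values, and the squared-error loss are illustrative assumptions:

```python
import numpy as np

# Tiny 2-layer network (sigmoid hidden and output) with manual backprop.
rng = np.random.default_rng(0)
x = rng.normal(size=(3,))            # one input example, 3 features
y = 1.0                              # target
W1 = rng.normal(size=(4, 3)) * 0.5   # hidden-layer weights
W2 = rng.normal(size=(4,)) * 0.5     # output-layer weights

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(W1, W2):
    h = sigmoid(W1 @ x)              # hidden activations
    p = sigmoid(W2 @ h)              # predicted probability
    loss = 0.5 * (p - y) ** 2        # squared-error loss (illustrative)
    return h, p, loss

h, p, loss = forward(W1, W2)

# Backward pass: apply the chain rule layer by layer (backpropagation).
dp = (p - y) * p * (1 - p)           # dLoss / d(output pre-activation)
dW2 = dp * h                         # gradient for output weights
dh = dp * W2                         # gradient flowing into hidden layer
dW1 = (dh * h * (1 - h))[:, None] * x[None, :]

# Numerical check of one entry of dW1 via finite differences.
eps = 1e-6
W1p = W1.copy(); W1p[0, 0] += eps
num = (forward(W1p, W2)[2] - loss) / eps
print(abs(num - dW1[0, 0]) < 1e-4)   # True: backprop matches
```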
Strategies for improving model performance
1) Hyperparameter tuning
- number of hidden layers, number of neurons, choice of activation function
- tuning the learning rate and batch size
2) Regularization
- apply dropout and L1/L2 regularization to prevent overfitting
3) Data Augmentation
- transform the images to increase data diversity and improve generalization
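The data-augmentation idea can be sketched with a few simple array transforms (the image is a random stand-in; the specific transforms are illustrative choices):

```python
import numpy as np

# Simple label-preserving augmentations on a toy 8x8 RGB image.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8)

flipped = img[:, ::-1]               # horizontal flip
crop = img[1:7, 1:7]                 # stand-in for a random crop: 6x6 patch
noisy = np.clip(img.astype(np.int16) + rng.integers(-10, 11, img.shape),
                0, 255).astype(np.uint8)  # small pixel noise

# Each transform yields an extra training example, increasing data
# diversity and helping the model generalize.
print(flipped.shape, crop.shape, noisy.shape)
```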
The main challenges in image classification include viewpoint variation, changes in illumination, occlusion, background clutter, deformation, and intra-class variation.
Logistic regression performs poorly on raw pixel data because it can only draw a single linear decision boundary over the pixel values, while the mapping from raw pixels to object categories is highly non-linear: the same object can produce very different pixel values under different lighting, positions, and poses.
Feature engineering is the process of manually selecting or transforming raw data into meaningful representations for machine learning models.
Limitations: it requires domain expertise, is slow and labor-intensive, and hand-crafted features can miss patterns that actually matter, which is why letting a neural network learn features automatically is attractive.
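A deliberately simple example of a hand-crafted feature: summarizing an RGB image by its per-channel mean intensity (the image here is a random stand-in):

```python
import numpy as np

# Feature engineering by hand: reduce an image to 3 numbers, the mean
# of each color channel.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8)

# 3 values summarize 192 raw pixel values; useful if color alone
# separates the classes, and useless otherwise -- the designer must
# know in advance which features matter.
feature = img.reshape(-1, 3).mean(axis=0)
print(feature.shape)  # (3,)
```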
A single-layer neural network cannot model complex decision boundaries and therefore fails to classify non-linearly separable data.
Without non-linear activation functions, deep neural networks behave as a single linear transformation, limiting their ability to learn complex patterns. Non-linearity allows the network to model intricate relationships in data.
Feed-forward in artificial neural networks refers to the process where the input data passes through the network layer by layer without loops or feedback connections. The information moves in one direction—from the input layer to the output layer.
A fully-connected layer in a neural network means that every neuron in a layer is connected to every neuron in the next layer. This allows the model to learn complex patterns but increases the number of parameters significantly.
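A plain-Python sketch of a fully-connected layer and its parameter count; the layer sizes (192 -> 16 -> 2) are illustrative assumptions:

```python
# Every neuron connects to every neuron in the next layer, so a layer
# mapping n inputs to m outputs has n*m weights plus m biases.
sizes = [192, 16, 2]
params = sum(n * m + m for n, m in zip(sizes, sizes[1:]))
print(params)  # 192*16 + 16 + 16*2 + 2 = 3122

def dense(x, W, b):
    # One fully-connected layer; in feed-forward use, information only
    # flows from inputs to outputs, with no loops or feedback.
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi
            for row, bi in zip(W, b)]
```

Even this tiny network has thousands of parameters, which is why fully-connected layers become expensive for large inputs.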
The training set is used to teach the model, while the test set evaluates its performance on unseen data. Without a separate test set, we cannot measure how well the model generalizes to new examples.
Sigmoid: Maps values to the range (0,1), often used for binary classification but suffers from vanishing gradients.
ReLU (Rectified Linear Unit): Replaces negative values with zero, reducing the vanishing gradient problem but can suffer from dying neurons.
Softmax: Converts logits into probabilities, ensuring that the sum of all class probabilities equals 1, mainly used for multi-class classification.
The softmax activation function transforms a vector of raw scores (logits) into probabilities that sum to 1. Like the sigmoid function, it maps values to a range between 0 and 1. However, while sigmoid applies to binary classification, softmax is used for multi-class classification by normalizing all outputs across multiple classes.
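The softmax transformation described above, written out in plain Python (the input logits are arbitrary example values):

```python
import math

def softmax(logits):
    # Subtracting the max is a standard numerical-stability trick;
    # it does not change the result.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(probs, sum(probs))  # probabilities over 3 classes, summing to 1
```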
The vanishing gradient problem occurs when gradients become too small during backpropagation, causing earlier layers in deep networks to learn slowly or not at all. This leads to inefficient training and poor convergence.
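A back-of-the-envelope illustration: the derivative of sigmoid, s(z)(1 - s(z)), is at most 0.25 (at z = 0), and backpropagation multiplies one such factor per layer, so even in the best case the gradient shrinks geometrically with depth:

```python
# Best-case sigmoid derivative per layer is 0.25; the gradient reaching
# layer 1 of a deep sigmoid network carries one such factor per layer.
best_case = 0.25
for depth in (1, 5, 10, 20):
    print(depth, best_case ** depth)
# at depth 20 the factor is below 1e-12: early layers barely learn
```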
Similarities: Both sigmoid and softmax squash raw scores into values between 0 and 1 that can be interpreted as probabilities, and both are commonly used at the output layer of classifiers.
Differences: Sigmoid maps each score independently and is used for binary classification, whereas softmax normalizes a whole vector of logits jointly so that the class probabilities sum to 1, which suits multi-class classification.