Interesting paper and task: https://openreview.net/pdf?id=XgH1wfHSX8
Proposes a new task: learning a finite mixture of Markov chains.
There are 10 states (0~9) and each task T is defined by a 10x10 transition matrix. The training set is T_train = {T_1, T_2, ..., T_N}, where N controls task complexity. Sequence length l is fixed at 512 during training.
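A minimal sketch of the data setup described above, assuming each task is a row-stochastic 10x10 matrix and sequences are sampled by iterating the chain from a uniform initial state (matrix construction details are my assumption, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
S = 10   # number of states (0-9)
N = 64   # number of chains in the mixture; N controls task complexity
L = 512  # fixed training sequence length

def random_transition_matrix(rng, n_states=S):
    # Hypothetical task sampler: random row-stochastic matrix.
    M = rng.random((n_states, n_states))
    return M / M.sum(axis=1, keepdims=True)  # normalize rows to probabilities

# T_train = {T_1, ..., T_N}
T_train = [random_transition_matrix(rng) for _ in range(N)]

def sample_sequence(T, length=L, rng=rng):
    # Roll out one length-L sequence from transition matrix T.
    seq = [int(rng.integers(S))]  # uniform initial state
    for _ in range(length - 1):
        seq.append(int(rng.choice(S, p=T[seq[-1]])))
    return np.array(seq)

seq = sample_sequence(T_train[0])
```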
4 algorithmic phases, from crossing Unigram/Bigram with Retrieval/Inference:
(Uni-Ret, Uni-Inf, Bi-Inf, Bi-Ret)
Unigram: answers based on simple statistics, a histogram of states (fast but imprecise)
Bigram: answers based on transition statistics (more precise)
Retrieval: depends on the training dataset; good in-distribution (ID) but poor OOD
Inference: independent of the training dataset; generalizes well OOD
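The unigram/bigram distinction above can be sketched as two in-context predictors over a single prompt sequence; the exact estimators are my assumption (add-one smoothing included for illustration), not taken from the paper:

```python
import numpy as np

def unigram_predict(seq, n_states=10):
    # Unigram: histogram of states seen so far, ignoring transition order.
    counts = np.bincount(seq, minlength=n_states).astype(float)
    return counts / counts.sum()

def bigram_predict(seq, n_states=10):
    # Bigram: empirical transition counts, conditioned on the last state.
    counts = np.ones((n_states, n_states))  # add-one smoothing (assumption)
    for a, b in zip(seq[:-1], seq[1:]):
        counts[a, b] += 1
    row = counts[seq[-1]]
    return row / row.sum()
```

An "inference" solution estimates statistics like these from the prompt itself, so it transfers OOD; a "retrieval" solution instead matches the prompt to a memorized training chain, so it fails when the test chain is not in T_train.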
a) data diversity threshold
if N is small (easy task), the Uni-Ret phase dominates.
if N is high enough (complex), training progresses Uni-Inf --> Bi-Inf (good OOD) --> Bi-Ret (ID overfitting).
b) emergence of induction heads
at high N and intermediate training steps, induction heads form.
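As a rough sketch of what an induction head computes (my simplification, not the paper's circuit analysis): look back for the most recent earlier occurrence of the current token and copy the token that followed it. On Markov-chain sequences this behaves like in-context bigram estimation.

```python
def induction_head_predict(seq):
    # Find the most recent earlier occurrence of the last token
    # and predict the token that followed it (copy mechanism).
    last = seq[-1]
    for i in range(len(seq) - 2, -1, -1):
        if seq[i] == last:
            return seq[i + 1]
    return None  # no earlier occurrence to copy from
```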
c) transient nature
the good-OOD algorithmic phase (Bi-Inf) is only active transiently (in the middle of training), and is later displaced by Bi-Ret.