Leap algorithm

최승혁·2022년 7월 28일
0

Leap: prefetching algorithm in RDMA

Leap has two components: detecting trends and determining what to prefetch

Trend Detection

sudo

procedure FINDTREND(Nsplit)
	Hsize <- SIZE(AccessHistory)
	w <- Hsize/Nsplit // start with small detection window
	∆maj <- 0
	while true do
		∆maj <- Booyer-Moore on {Hhead, ... , Hhead-w-1}
		w <- w * 2
		if ∆maj != major trend then
			∆maj <- 0
        if ∆maj != 0 or w > Hsize then
        	return ∆maj
    return ∆maj
  • AccessHistory: identify the majority values in a fixed-size window of remote page accesses and ignore the rest

  • w: window size

  • major: it said to be a major if it appears at least [w/2] + 1 times

  • to find majority, use the Boyer-Moore majority vote algorithm

example

  • configure

ACCESSHISTORY with Hsize = 8 and Nsplit = 2.

pages with the following addresses: 0x48, 0x45, 0x42, 0x3F, 0x3C, 0x02, 0x04, 0x06, 0x08, 0x0A, 0x0C, 0x10, 0x39, 0x12, 0x14, 0x16, were requested in that order.

t0 being the earliest and t15 being the latest request.

At ti , Hhead stays at the ti-th slot.

  • logic
  1. Algorithm will initially try to detect a trend using a window size of 4.
    • At time t3, FINDTREND successfully finds a trend of -3 within the t0t3 window.
  2. Upon failure, it will look for a trend first within a window size of 8.
    • At time t7, the trend starts to shift from -3 to +2. At that time, t4t7 window does not have a majority ∆, which doubles the window to consider t0t7.
    • This window does not have any majority ∆ either.

Implementation

int find_trend_in_region(int size, long *major_delta, int *major_count) {
    int maj_index = get_prev_index(atomic_read(&trend_history.head)), count, i, j;
    long candidate;
    
    for (i = get_prev_index(maj_index), j = 1, count = 1; j < size; i = get_prev_index(i), j++) {
        if (trend_history.history[maj_index].delta == trend_history.history[i].delta)
            count++;
        else
            count--;
        if (count == 0) {
            maj_index = i;
            count = 1;
        }
    }
    
    candidate = trend_history.history[maj_index].delta;
    for (i = get_prev_index(atomic_read(&trend_history.head)), j = 0, count = 0; j < size; i = get_prev_index(i), j++) {
        if(trend_history.history[i].delta == candidate)
            count++;
    }
    
    //printk("majority index: %d, candidate: %ld, count:%d\n", maj_index, candidate, count);
    *major_delta = candidate;
    *major_count = count;
    return count > (size/2);
}

Prefetch Candidate Generation

sudo

procedure GETPREFETCHWINDOWSIZE(page Pt)
 PWsizet		// Current prefetch window size
 PWsizet−1		// Last prefetch window size
 Chit 			// Prefetched cache hits after last prefetch
 if Chit = 0 then
 	if Pt follows the current trend then
 		PWsizet ← 1	// Prefetch a page along trend
 	else
 		PWsizet ← 0	// Suspend prefetching
 else 				// Earlier prefetches had hits
 	PWsizet ← Round up Chit +1 to closest power of 2
 PWsizet ← min(PWsizet, PWsizemax )
 if PWsizet <PWsizet−1/2 then	// Low cache hit
 	PWsizet ← PWsizet−1			// Shrink window smoothly
 Chits ← 0
 PWsizet−1 ← PWsizet
 return PWsizet
 
 procedure DOPREFETCH(page Pt)
	PWsizet ← GETPREFETCHWINDOWSIZE(Pt)
	if PWsizet 0 then
		∆ma j ← FINDTREND(N_split)
		if ∆ma j 6= 0/ then
			Read PWsizet pages with ∆ma j stride from Pt
		else
			Read PWsizet pages around Pt with latest ∆ma j
	else
		Read only page Pt
  • logic
  1. determine the prefetch window size (PWsizet) based on the accuracy of prefetches between two consecutive prefetch requests (GETPREFETCHWINDOWSIZE).

  2. Any cache hit of the prefetched data between two consecutive prefetch requests indicates the overall effectiveness of the prefetch.

    • In case of high effectiveness (i.e., a high cache hit), PWsizet is expanded until it reaches maximum size (PWsizemax).
    • low cache hit indicates low effectiveness, in that case, the prefetch window size gets reduced.
    • The prefetch window is shrunk smoothly to make the algorithm flexible to short-term irregularities.
    • When prefetching is suspended, no extra pages are prefetched until a new trend is detected. (to avoid cache pollution)
  3. Given a non-zero PWsize, the prefetcher brings in PWsize pages following the current trend, if any (DOPREFETCH).

    • If no majority trend exists, instead of giving up right away, it speculatively brings PWsize pages around Pt’s offset following the previous trend.
profile
그냥 기록하는 블로그

0개의 댓글