[WEEK 13] PintOS - Project 4: File System (Indexed and Extensible Files)

신호정 (velog) · October 30, 2021

Today I Learned

Introduction

Indexed and Extensible Files

The basic file system allocates each file as a single extent, which makes it vulnerable to external fragmentation: an n-block file may fail to be allocated even though n blocks are free. Eliminate this problem by modifying the on-disk inode structure.

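The existing file system allocates each file as a single extent, so it suffers from external fragmentation: an n-block file may fail to be allocated even when n blocks are free. Modifying the on-disk inode structure solves this problem.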
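To make the failure mode concrete, here is a minimal sketch of single-extent allocation over a free map. The `used` array and the function names `count_free` and `find_extent` are hypothetical and are not part of the PintOS skeleton; the point is only that the total free count and the longest contiguous run are different quantities.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical free map: used[i] is true when block i is allocated. */

/* Count free blocks anywhere in the map. */
size_t count_free(const bool *used, size_t nblocks) {
    size_t free_cnt = 0;
    for (size_t i = 0; i < nblocks; i++)
        if (!used[i])
            free_cnt++;
    return free_cnt;
}

/* Single-extent allocation: return the start index of the first run of
   n contiguous free blocks, or -1 if no such run exists. */
long find_extent(const bool *used, size_t nblocks, size_t n) {
    assert(n > 0);
    size_t run = 0;   /* Length of the free run ending at block i. */
    for (size_t i = 0; i < nblocks; i++) {
        run = used[i] ? 0 : run + 1;
        if (run == n)
            return (long)(i + 1 - n);
    }
    return -1;
}
```

With a map where every other block is used, `count_free` reports that half of the blocks are free, yet `find_extent` fails for any request larger than one block.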

In practice, this probably means using an index structure with direct, indirect, and doubly indirect blocks. (In previous semesters, most students adopted something like what you have learned as Berkeley UNIX FFS with multi-level indexing.) However, to make your life easier, we make you implement it in an easier way: FAT. You must implement FAT with the given skeleton code. Your code MUST NOT contain any code for multi-level indexing (FFS in the lecture). If it does, you will receive 0 points for the file growth parts.

An index structure with direct, indirect, and doubly indirect blocks is one way to solve the fragmentation problem. In this project, however, the index is implemented with FAT instead.

NOTE: You can assume that the file system partition will not be larger than 8 MB. You must support files as large as the partition (minus metadata). Each inode is stored in one disk sector, limiting the number of block pointers that it can contain.

The file system partition does not exceed 8 MB. Each inode is stored in a single disk sector, which limits the number of block pointers it can hold.
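The numbers in the note can be worked out with a little arithmetic. The sketch below uses the 512-byte Pintos sector size and assumes 4-byte sector numbers and 4-byte FAT entries; the entry size and all function names are assumptions made for illustration, not values taken from the skeleton.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE 512u                    /* Pintos disk sector size in bytes. */
#define PARTITION_BYTES (8u * 1024 * 1024)  /* Upper bound stated in the note. */

/* Number of sectors in a partition of the given size. */
uint32_t partition_sectors(uint32_t bytes) {
    return bytes / SECTOR_SIZE;
}

/* Sectors needed to store a table of `entries` entries, rounded up.
   entry_size is an assumption (4 bytes in the usage note below). */
uint32_t fat_sectors(uint32_t entries, uint32_t entry_size) {
    assert(entry_size > 0);
    uint32_t bytes = entries * entry_size;
    return (bytes + SECTOR_SIZE - 1) / SECTOR_SIZE;
}

/* Upper bound on the bytes reachable through direct pointers alone, if an
   entire inode sector held nothing but pointers of ptr_size bytes. */
uint32_t max_direct_bytes(uint32_t ptr_size) {
    assert(ptr_size > 0);
    return (SECTOR_SIZE / ptr_size) * SECTOR_SIZE;
}
```

Under these assumptions an 8 MB partition has 16384 sectors, a table with one 4-byte entry per sector occupies at most 128 sectors (64 KB), and a one-sector inode filled entirely with 4-byte direct pointers could address at most 64 KB. That last figure is far below the partition size, which is why some form of indexing beyond direct pointers is required.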

Indexing large files with FAT (File Allocation Table)

In the basic file system that you have used for the previous projects, a file was stored as a single contiguous chunk spanning multiple disk sectors. Let's call such a contiguous chunk a cluster, since a cluster (chunk) can contain one or more contiguous disk sectors. From this point of view, the size of a cluster in the basic file system was equal to the size of the file stored in it.

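In the existing file system, a file is stored in one contiguous chunk spanning multiple disk sectors. Such a contiguous chunk is called a cluster, and a cluster contains one or more contiguous disk sectors. In the existing file system, the size of a cluster equals the size of the file stored in it.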

To mitigate external fragmentation, we can shrink the cluster size (recall the page size in virtual memory). For simplicity, the skeleton code fixes the number of sectors per cluster at 1. With clusters this small, a single cluster might not be enough to store an entire file. In that case a file needs multiple clusters, so the inode needs a data structure to index the file's clusters. One of the easiest ways is to use a linked list (a.k.a. a chain): the inode contains the sector number of the first block of the file, the first block contains the sector number of the second block, and so on. This naive approach, however, is too slow because we have to read every block of the file even when all we need is the last block. To overcome this, FAT (File Allocation Table) stores the connectivity of blocks in a fixed-size table rather than in the blocks themselves. Since the FAT contains only connectivity values rather than actual data, it is small enough to be cached in DRAM. As a result, we only need to read the corresponding entries in the table.

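To address external fragmentation, the cluster size is reduced and fixed at one sector per cluster. A cluster this small may not be enough to hold a whole file. In that case the file is stored across multiple clusters, and the inode needs a data structure to index the file's clusters.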

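The simplest approach is a linked list. The inode stores the sector number of the file's first block, and the first block stores the sector number of the second block. This simple scheme is slow (inefficient) because reaching a given block requires reading every block before it. FAT instead records the links between blocks in a fixed-size file allocation table. Because the FAT holds connectivity values rather than actual data, it is small enough to be cached in DRAM, so following a file's chain only requires reading entries from the table.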
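The chain lookup described above can be sketched with the FAT held as a plain in-memory array, where `fat[c]` gives the cluster that follows cluster `c`. The type `cluster_t`, the sentinel `END_OF_CHAIN`, and the three helper functions are illustrative names chosen for this sketch and should not be read as the skeleton's actual interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t cluster_t;           /* Assumed cluster index type. */
#define END_OF_CHAIN 0x0FFFFFFFu      /* Assumed end-of-chain sentinel. */

/* Follow the chain from `start` for `n` hops and return the cluster
   reached, or END_OF_CHAIN if the chain is shorter than that.
   Only the in-memory table is read; no data block is touched. */
cluster_t fat_nth(const cluster_t *fat, cluster_t start, size_t n) {
    cluster_t c = start;
    while (n-- > 0 && c != END_OF_CHAIN)
        c = fat[c];
    return c;
}

/* Return the last cluster of the chain beginning at `start`. */
cluster_t fat_last(const cluster_t *fat, cluster_t start) {
    assert(start != END_OF_CHAIN);
    cluster_t c = start;
    while (fat[c] != END_OF_CHAIN)
        c = fat[c];
    return c;
}

/* Grow a file by one cluster: link `fresh` after the current last
   cluster and mark it as the new end of the chain. */
void fat_append(cluster_t *fat, cluster_t start, cluster_t fresh) {
    fat[fat_last(fat, start)] = fresh;
    fat[fresh] = END_OF_CHAIN;
}
```

Finding the last cluster still takes one step per cluster, but each step is an array read in memory rather than a disk read of a data block, which is the advantage over storing the links inside the blocks themselves. The clusters in a chain also do not need to be adjacent on disk, so a file can grow as long as any free cluster remains.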