Optimizing Seeks

Seeking involves processing linked file system data structures. Since clusters are stored on singly linked lists, seeks typically begin from a file’s beginning, end, or current position; whichever is both closest to the desired position and either at or before it. Sometimes this does not provide optimum behavior, as in the following example:

A manufacturer of MP3 players had a problem with the different header sizes used by different song formats. Before starting a new song, the player got its name, compression type, etc. by reading an amount equal to the largest supported header. If the actual header was smaller, a seek backward to the beginning of the song was required before playing it. This meant seeking from the beginning of the file. For large music files, this took too long.

To address this, Blunk added a past position to the file control block. By default, this is overwritten by the current position before each seek. For more flexibility, fseekmark()/lseekmark() save the current position and optionally prevent updates to it.

Now seeks start from the file’s beginning, end, current, or saved position; whichever is closest to the requested position and at or before it. A selectable fourth starting position has been added. This gave a tremendous gain in seek performance and met the customer’s requirements.

Another enhancement adds a set of past positions. FILE_OFFSETS, defined in “config.h”, is the number of additional offsets saved per file control block. These offsets, and the associated cluster number, are saved in an ordered list. New seeks start from the optimal offset in the set. To disable this feature, set FILE_OFFSETS to zero.

Applications can use fseekmark()/lseekmark() or the FILE_OFFSETS feature depending on whether their seeks benefit most by starting from a fixed past offset or a set of constantly updated (after each seek) past offsets. Both features can be used at the same time.