The Object Cache allows users to retrieve FTP, Gopher, and HTTP data quickly and efficiently, often avoiding the need to cross the Internet. The Harvest cache is more than an order of magnitude faster than the CERN cache and other popular Internet caches because it never forks, is implemented with non-blocking I/O, keeps metadata and especially hot objects cached in RAM, caches Domain Name System (DNS) lookups, supports non-blocking DNS lookups, and implements negative caching of both objects and DNS lookups. A technical paper discussing the Harvest cache's design, implementation, and performance is available [9].
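As a rough illustration of the single-process, non-blocking design described above, the sketch below shows a select()-based accept loop of the kind such a server might use. It is not the actual cached code; the listening port and the loop's structure are assumptions made only for the example.

    #include <sys/select.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    #include <fcntl.h>
    #include <string.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int listener = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(3128);             /* assumed port for the example */
        if (bind(listener, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
            perror("bind");
            return 1;
        }
        listen(listener, 16);
        fcntl(listener, F_SETFL, O_NONBLOCK);     /* never block in accept() */

        for (;;) {
            fd_set readable;
            FD_ZERO(&readable);
            FD_SET(listener, &readable);
            /* A real cache would also add every active client and server
             * socket here, so that one process multiplexes all transfers. */
            if (select(listener + 1, &readable, NULL, NULL, NULL) < 0)
                continue;
            if (FD_ISSET(listener, &readable)) {
                int client = accept(listener, NULL, NULL);
                if (client >= 0) {
                    fcntl(client, F_SETFL, O_NONBLOCK);
                    /* hand the descriptor to the request state machine ... */
                    close(client);                /* placeholder */
                }
            }
        }
    }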
The Cache can be run in two different modes: as a proxy object cache or as an httpd accelerator. In this section we discuss its use as a proxy cache; we discuss the httpd accelerator in Section 6.4.
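Concretely, in proxy mode a WWW client sends its requests to the cache rather than to the origin server, with the full URL on the request line so that the cache knows where to fetch the object on a miss. The fragment below sketches such a request; the cache host name and port number are assumptions for the example, not values stated in the text above.

    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <netdb.h>
    #include <string.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        const char *cache_host = "cache.example.com";    /* assumed cache host */
        struct hostent *he = gethostbyname(cache_host);
        if (!he) { fprintf(stderr, "lookup failed\n"); return 1; }

        int s = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_port = htons(3128);                     /* assumed cache port */
        memcpy(&addr.sin_addr, he->h_addr_list[0], he->h_length);
        if (connect(s, (struct sockaddr *)&addr, sizeof(addr)) < 0) return 1;

        /* Proxy-style request: the complete URL appears on the request line. */
        const char *req = "GET http://www.example.com/ HTTP/1.0\r\n\r\n";
        write(s, req, strlen(req));

        char buf[4096];
        ssize_t n;
        while ((n = read(s, buf, sizeof(buf))) > 0)
            fwrite(buf, 1, (size_t)n, stdout);
        close(s);
        return 0;
    }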
The Cache consists of a main server program cached, a DNS lookup caching server program dnsserver, a Perl program for retrieving FTP data, and some optional Tcl-based management and client tools. The separate FTP program arose because of the complexity of the FTP protocol: while Gopher and HTTP data are retrieved from across the Internet using C code built into cached, remote FTP data are retrieved using an external program (ftpget.pl), which uses three Perl library files (discussed below). Once the FTP data have been loaded into the local cache, subsequent accesses are served without running these external programs.
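The sketch below shows one generic way a cache process could hand an FTP URL to an external helper and read back the retrieved data. It does not reproduce ftpget.pl's actual command-line interface or the three Perl library files; the URL and the helper's arguments are purely illustrative.

    #include <stdio.h>

    int main(void)
    {
        const char *url = "ftp://ftp.example.com/pub/README";   /* assumed URL */
        char cmd[512];
        snprintf(cmd, sizeof(cmd), "ftpget.pl '%s'", url);       /* hypothetical arguments */

        FILE *pipe = popen(cmd, "r");     /* run the helper, read its stdout */
        if (!pipe) { perror("popen"); return 1; }

        char buf[4096];
        size_t n;
        while ((n = fread(buf, 1, sizeof(buf), pipe)) > 0)
            fwrite(buf, 1, n, stdout);    /* real code would store this in the cache */
        return pclose(pipe);
    }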
When the cache starts, it spawns three dnsserver processes, each of which performs a single blocking DNS lookup at a time, in parallel with the rest of the cache. This reduces the time the cache spends waiting for DNS lookups.
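A minimal sketch of that idea follows: helper processes make blocking gethostbyname() calls and answer over pipes, so the main process never blocks on DNS itself. This illustrates only the pattern; the real dnsserver protocol and answer format are not shown, the hostname looked up is an arbitrary example, and error handling and process cleanup are omitted.

    #include <sys/types.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    #include <netdb.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    #define NUM_HELPERS 3

    static void helper_loop(FILE *in, FILE *out)
    {
        char name[256];
        while (fgets(name, sizeof(name), in)) {
            name[strcspn(name, "\n")] = '\0';
            struct hostent *he = gethostbyname(name);    /* blocking lookup */
            if (he)
                fprintf(out, "%s %s\n", name,
                        inet_ntoa(*(struct in_addr *)he->h_addr_list[0]));
            else
                fprintf(out, "%s FAIL\n", name);         /* negative result */
            fflush(out);
        }
    }

    int main(void)
    {
        int to_child[NUM_HELPERS][2], from_child[NUM_HELPERS][2];

        for (int i = 0; i < NUM_HELPERS; i++) {
            pipe(to_child[i]);
            pipe(from_child[i]);
            if (fork() == 0) {                           /* helper process */
                close(to_child[i][1]);
                close(from_child[i][0]);
                helper_loop(fdopen(to_child[i][0], "r"),
                            fdopen(from_child[i][1], "w"));
                _exit(0);
            }
        }

        /* The main process would normally select() on from_child[*][0] while
         * serving other requests; here we simply issue one lookup to helper 0. */
        dprintf(to_child[0][1], "www.example.com\n");    /* assumed hostname */
        char reply[512];
        ssize_t n = read(from_child[0][0], reply, sizeof(reply) - 1);
        if (n > 0) { reply[n] = '\0'; fputs(reply, stdout); }
        return 0;
    }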