patch-2.2.1 linux/Documentation/proc.txt
- Lines: 1354
- Date: Thu Jan 28 12:16:40 1999
- Orig file: v2.2.0/linux/Documentation/proc.txt
- Orig date: Wed Dec 31 16:00:00 1969
diff -u --recursive --new-file v2.2.0/linux/Documentation/proc.txt linux/Documentation/proc.txt
@@ -0,0 +1,1353 @@
+
+-----------------------------------------------------------------------
+
+ T H E /proc F I L E S Y S T E M
+
+-----------------------------------------------------------------------
+/proc/sys Terrehon Bowden <terrehon@wpi.com> January 27 1999
+ Bodo Bauer <bb@ricochet.net>
+-----------------------------------------------------------------------
+Version 1.1 Kernel version 2.2
+-----------------------------------------------------------------------
+Contents
+
+1 Introduction/Credits
+
+1.1 Legal Issues
+
+2 The /proc file system
+
+2.1 Process specific subdirectories
+2.2 Kernel data
+2.3 IDE devices in /proc/ide
+2.4 Networking info in /proc/net
+2.5 SCSI info
+2.6 Parallel port info in /proc/parport
+2.7 TTY info in /proc/tty
+
+3 Reading and modifying kernel parameters
+
+3.1 /proc/sys/debug and /proc/sys/proc
+3.2 /proc/sys/fs - File system data
+3.3 /proc/sys/fs/binfmt_misc - Miscellaneous binary formats
+3.4 /proc/sys/kernel - General kernel parameters
+3.5 /proc/sys/vm - The virtual memory subsystem
+3.6 /proc/sys/dev - Device specific parameters
+3.7 /proc/sys/sunrpc - Remote procedure calls
+3.8 /proc/sys/net - Networking stuff
+3.9 /proc/sys/net/ipv4 - IPv4 settings
+3.10 Appletalk
+3.11 IPX
+
+-----------------------------------------------------------------------
+
+1 Introduction/Credits
+
+This documentation is part of a soon to be released book published by
+IDG Books on the SuSE Linux distribution. As there is no complete
+documentation for the /proc file system and we've used many freely
+available sources to write this chapter, it seems only fair to give
+the work back to the Linux community. This work is based on the
+2.1.132 and 2.2.0-pre-kernel versions. We're afraid it's still far from
+complete, but we hope it will be useful. As far as we know, it is the
+first 'all-in-one' document about the /proc file system. It is
+focused on the Intel x86 hardware, so if you are looking for PPC, ARM,
+SPARC, AXP, etc., features, you probably won't find what you are
+looking for. It also covers only IPv4 networking, not IPv6 or other
+protocols - sorry.
+
+We'd like to thank Alan Cox, Rik van Riel, and Alexey Kuznetsov. We'd
+also like to extend a special thank you to Andi Kleen for
+documentation, which we relied on heavily to create this document, as
+well as the additional information he provided. Thanks to everybody
+else who contributed source or docs to the Linux kernel and helped
+create a great piece of software... :)
+
+If you have any comments, corrections or additions, please don't
+hesitate to contact Bodo Bauer at bb@ricochet.net. We'll be happy to
+add them to this document.
+
+The latest version of this document is available online at
+http://www.suse.com/~bb/Docs/proc.html in HTML, ASCII, and
+PostScript formats.
+
+1.1 Legal Issues
+
+We don't guarantee the correctness of this document, and if you come
+to us complaining about how you screwed up your system because of
+incorrect documentation, we won't feel responsible...
+
+-----------------------------------------------------------------------
+
+2 The /proc file system
+
+The proc file system acts as an interface to internal data structures
+in the kernel. It can be used to obtain information about the system
+and to change certain kernel parameters at runtime. It contains
+(among other things) one subdirectory for each process running on the
+system which is named after the process id (PID) of the process. The
+link self points to the process reading the file system.
+
+2.1 Process specific subdirectories
+
+Each process subdirectory contains the entries listed in table 1.1.
+
+ ________________________________________________________
+ cmdline  Command line arguments
+ environ  Values of environment variables
+ fd       Directory, which contains all file descriptors
+ mem      Memory held by this process
+ stat     Process status
+ status   Process status in human readable form
+ cwd      Link to the current working directory
+ exe      Link to the executable of this process
+ maps     Memory maps
+ root     Link to the root directory of this process
+ statm    Process memory status information
+ ________________________________________________________
+ Table 1.1: Process specific entries in /proc
+
+For example, to get the status information of a process, all you have
+to do is read the file /proc/PID/status:
+
+> cat /proc/self/status
+Name: cat
+State: R (running)
+Pid: 5633
+PPid: 5609
+Uid: 501 501 501 501
+Gid: 100 100 100 100
+Groups: 100 16
+VmSize: 804 kB
+VmLck: 0 kB
+VmRSS: 344 kB
+VmData: 68 kB
+VmStk: 20 kB
+VmExe: 12 kB
+VmLib: 660 kB
+SigPnd: 0000000000000000
+SigBlk: 0000000000000000
+SigIgn: 0000000000000000
+SigCgt: 0000000000000000
+CapInh: 00000000fffffeff
+CapPrm: 0000000000000000
+CapEff: 0000000000000000
+
+This shows you almost the same information as you would get if you
+viewed it with the ps command. In fact, ps uses the proc file system
+to obtain its information.
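+
+Because the status file consists of plain "Key: value" lines, a
+program can read it the same way. The following is only a minimal
+sketch: the function name parse_status is our own, and the sample
+text is an excerpt of the output shown above.
+
```python
def parse_status(text):
    """Parse the 'Key: value' lines of /proc/PID/status into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # ignore any line that has no colon
            fields[key.strip()] = value.strip()
    return fields


# Excerpt of the sample output shown above.
SAMPLE = """\
Name: cat
State: R (running)
Pid: 5633
PPid: 5609
VmSize: 804 kB
VmRSS: 344 kB
"""

status = parse_status(SAMPLE)
print(status["Name"], status["Pid"], status["VmRSS"])
```
+
+On a running Linux system the same function could be fed with the
+contents of /proc/self/status instead of the sample text.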
+
+The statm file contains more detailed information about the process
+memory usage. It contains seven values with the following meanings:
+
+size total program size
+resident size of in memory portions
+shared number of the pages that are shared
+trs number of pages that are 'code'
+drs number of pages of data/stack
+lrs number of pages of library
+dt number of dirty pages
+
+The split between text, data and library is only approximated by
+heuristics.
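+
+As an illustration, the seven values can be paired with the names
+listed above. This is a sketch only: the helper name parse_statm is
+our own and the numbers in the sample line are made up.
+
```python
# Field names in the order given in the list above.
STATM_FIELDS = ("size", "resident", "shared", "trs", "drs", "lrs", "dt")


def parse_statm(text):
    """Map the seven whitespace-separated statm values to their names."""
    values = [int(v) for v in text.split()]
    if len(values) != len(STATM_FIELDS):
        raise ValueError("expected %d values, got %d"
                         % (len(STATM_FIELDS), len(values)))
    return dict(zip(STATM_FIELDS, values))


# Hypothetical file contents, invented for this example.
statm = parse_statm("201 86 70 3 21 165 16")
print(statm["size"], statm["resident"], statm["dt"])
```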
+
+2.2 Kernel data
+
+Similar to the process entries, these are files which give information
+about the running kernel. The files used to obtain this information
+are contained in /proc and are listed in table 1.2. Not all of these
+will be present in your system. Which files exist depends on the
+kernel configuration and the loaded modules.
+
+ _____________________________________________________
+ apm          Advanced power management info
+ cmdline      Kernel command line
+ cpuinfo      Info about the CPU
+ devices      Available devices (block and character)
+ dma          Used DMA channels
+ filesystems  Supported filesystems
+ interrupts   Interrupt usage
+ ioports      I/O port usage
+ kcore        Kernel core image
+ kmsg         Kernel messages
+ ksyms        Kernel symbol table
+ loadavg      Load average
+ locks        Kernel locks
+ meminfo      Memory info
+ misc         Miscellaneous
+ modules      List of loaded modules
+ mounts       Mounted filesystems
+ partitions   Table of partitions known to the system
+ rtc          Real time clock
+ slabinfo     Slab pool info
+ stat         Overall statistics
+ swaps        Swap space utilization
+ uptime       System uptime
+ version      Kernel version
+ _____________________________________________________
+ Table 1.2: Kernel info in /proc
+
+You can, for example, check which interrupts are currently in use and
+what they are used for by looking in the file /proc/interrupts:
+
+> cat /proc/interrupts
+ CPU0
+ 0: 8728810 XT-PIC timer
+ 1: 895 XT-PIC keyboard
+ 2: 0 XT-PIC cascade
+ 3: 531695 XT-PIC aha152x
+ 4: 2014133 XT-PIC serial
+ 5: 44401 XT-PIC pcnet_cs
+ 8: 2 XT-PIC rtc
+ 11: 8 XT-PIC i82365
+ 12: 182918 XT-PIC PS/2 Mouse
+ 13: 1 XT-PIC fpu
+ 14: 1232265 XT-PIC ide0
+ 15: 7 XT-PIC ide1
+NMI: 0
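+
+A listing like this is easy to process by program. The sketch below
+assumes the single-CPU layout shown above (one counter column); the
+function name parse_interrupts is our own, and the sample is an
+excerpt of the output above.
+
```python
def parse_interrupts(text):
    """Return {irq: (count, controller, device)} for a one-CPU listing."""
    rows = {}
    for line in text.splitlines()[1:]:  # the first line is the CPU header
        label, sep, rest = line.partition(":")
        parts = rest.split(None, 2)
        if not sep or not parts:
            continue
        count = int(parts[0])
        controller = parts[1] if len(parts) > 1 else None
        # Device names may contain spaces, e.g. "PS/2 Mouse".
        device = parts[2].strip() if len(parts) > 2 else None
        rows[label.strip()] = (count, controller, device)
    return rows


SAMPLE = """\
           CPU0
  0:    8728810          XT-PIC  timer
  1:        895          XT-PIC  keyboard
 12:     182918          XT-PIC  PS/2 Mouse
NMI:          0
"""

irqs = parse_interrupts(SAMPLE)
print(irqs["12"])
```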
+
+There are three more important subdirectories in /proc: net, scsi and
+sys. The general rule is that the contents, or even the existence of
+these directories, depends on your kernel configuration. If SCSI is
+not enabled, the directory scsi may not exist. The same is true of
+the net directory, which is only there when networking support is
+present in the running kernel.
+
+The slabinfo file gives information about memory usage on the slab
+level. Linux uses slab pools for memory management above page level
+in version 2.2. Commonly used objects have their own slab pool (like
+network buffers, directory cache, etc.).
+
+2.3 IDE devices in /proc/ide
+
+This subdirectory contains information about all IDE devices that the
+kernel is aware of. There is one subdirectory for each device
+(e.g. a hard disk) containing the following files:
+
+ cache             The cache
+ capacity          Capacity of the medium
+ driver            Driver and version
+ geometry          Physical and logical geometry
+ identify          Device identify block
+ media             Media type
+ model             Device identifier
+ settings          Device setup
+ smart_thresholds  IDE disk management thresholds
+ smart_values      IDE disk management values
+
+2.4 Networking info in /proc/net
+
+This directory follows the usual pattern. Table 1.3 lists the files
+and their meaning.
+
+ _____________________________________________________________________
+ arp            Kernel ARP table
+ dev            Network devices with statistics
+ dev_mcast      Lists the Layer2 multicast groups a
+                device is listening to (interface index,
+                label, number of references, number of
+                bound addresses)
+ dev_stat       Network device status
+ ip_fwchains    Firewall chain linkage
+ ip_fwnames     Firewall chains
+ ip_masq        Directory containing the masquerading
+                tables
+ ip_masquerade  Major masquerading table
+ netstat        Network statistics
+ raw            Raw device statistics
+ route          Kernel routing table
+ rpc            Directory containing rpc info
+ rt_cache       Routing cache
+ snmp           SNMP data
+ sockstat       Socket statistics
+ tcp            TCP sockets
+ tr_rif         Token ring RIF routing table
+ udp            UDP sockets
+ unix           UNIX domain sockets
+ wireless       Wireless interface data (Wavelan etc)
+ igmp           IP multicast addresses which this host joined
+ psched         Global packet scheduler parameters
+ netlink        List of PF_NETLINK sockets
+ ip_mr_vifs     List of multicast virtual interfaces
+ ip_mr_cache    List of multicast routing cache
+ udp6           UDP sockets (IPv6)
+ tcp6           TCP sockets (IPv6)
+ raw6           Raw device statistics (IPv6)
+ igmp6          IP multicast addresses which this host joined (IPv6)
+ if_inet6       List of IPv6 interface addresses
+ ipv6_route     Kernel routing table for IPv6
+ rt6_stats      Global IPv6 routing tables statistics
+ sockstat6      Socket statistics (IPv6)
+ snmp6          SNMP data (IPv6)
+ _____________________________________________________________________
+ Table 1.3: Network info in /proc/net
+
+You can use this information to see which network devices are
+available in your system and how much traffic was routed over those
+devices:
+
+> cat /proc/net/dev
+Inter-|Receive |[...
+ face |bytes packets errs drop fifo frame compressed multicast|[...
+ lo: 908188 5596 0 0 0 0 0 0 [...
+ ppp0:15475140 20721 410 0 0 410 0 0 [...
+ eth0: 614530 7085 0 0 0 0 0 1 [...
+
+...] Transmit
+...] bytes packets errs drop fifo colls carrier compressed
+...] 908188 5596 0 0 0 0 0 0
+...] 1375103 17405 0 0 0 0 0 0
+...] 1703981 5535 0 0 0 3 0 0
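+
+In the file itself each interface occupies a single line; the output
+above is only split in two to fit the page. The sketch below parses
+that one-line-per-interface layout, using the column names from the
+header above. The function name parse_net_dev is our own, and the
+sample joins the two halves of the listing back together.
+
```python
RX = ("bytes", "packets", "errs", "drop", "fifo", "frame",
      "compressed", "multicast")
TX = ("bytes", "packets", "errs", "drop", "fifo", "colls",
      "carrier", "compressed")


def parse_net_dev(text):
    """Return {interface: {"rx": {...}, "tx": {...}}} from /proc/net/dev."""
    stats = {}
    for line in text.splitlines():
        # Split on the colon: the counter may follow it without a space.
        name, sep, rest = line.partition(":")
        numbers = rest.split()
        if not sep or len(numbers) != len(RX) + len(TX):
            continue  # header lines carry no "name:" prefix
        values = [int(n) for n in numbers]
        stats[name.strip()] = {
            "rx": dict(zip(RX, values[:len(RX)])),
            "tx": dict(zip(TX, values[len(RX):])),
        }
    return stats


SAMPLE = """\
Inter-|   Receive                                       |  Transmit
 face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
    lo:  908188  5596   0 0 0   0 0 0   908188  5596 0 0 0 0 0 0
  ppp0:15475140 20721 410 0 0 410 0 0  1375103 17405 0 0 0 0 0 0
  eth0:  614530  7085   0 0 0   0 0 1  1703981  5535 0 0 0 3 0 0
"""

net = parse_net_dev(SAMPLE)
print(net["ppp0"]["rx"]["bytes"], net["eth0"]["tx"]["colls"])
```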
+
+2.5 SCSI info
+
+If you have a SCSI host adapter in your system, you'll find a
+subdirectory named after the driver for this adapter in /proc/scsi.
+You'll also see a list of all recognized SCSI devices in /proc/scsi/scsi:
+
+> cat /proc/scsi/scsi
+Attached devices:
+Host: scsi0 Channel: 00 Id: 00 Lun: 00
+ Vendor: QUANTUM Model: XP34550W Rev: LXY4
+ Type: Direct-Access ANSI SCSI revision: 02
+Host: scsi0 Channel: 00 Id: 01 Lun: 00
+ Vendor: SEAGATE Model: ST34501W Rev: 0018
+ Type: Direct-Access ANSI SCSI revision: 02
+Host: scsi0 Channel: 00 Id: 02 Lun: 00
+ Vendor: SEAGATE Model: ST34501W Rev: 0017
+ Type: Direct-Access ANSI SCSI revision: 02
+Host: scsi0 Channel: 00 Id: 04 Lun: 00
+ Vendor: ARCHIVE Model: Python 04106-XXX Rev: 703b
+ Type: Sequential-Access ANSI SCSI revision: 02
+
+The directory named after the driver has one file for each adapter
+found in the system. These files contain information about
+the controller, including the IRQ in use and the I/O address range:
+
+> cat /proc/scsi/ncr53c8xx/0
+General information:
+ Chip NCR53C875, device id 0xf, revision id 0x4
+ IO port address 0xec00, IRQ number 11
+ Synchronous period factor 12, max commands per lun 4
+
+2.6 Parallel port info in /proc/parport
+
+The directory /proc/parport contains information about the parallel
+ports of your system. It has one subdirectory for each port, named
+after the port number (0,1,2,...).
+
+This directory contains four files:
+
+ autoprobe  Autoprobe results of this port
+ devices    Connected device modules
+ hardware   Hardware info (port type, io-port, DMA, IRQ, etc.)
+ irq        Used interrupt, if any
+
+2.7 TTY info in /proc/tty
+
+Information about the available ttys and those actually in use can be
+found in /proc/tty. You'll find entries for drivers and line
+disciplines in this directory, as shown in the table below:
+
+ drivers        List of drivers and their usage
+ ldiscs         Registered line disciplines
+ driver/serial  Usage statistics and status of single tty lines
+
+To see which ttys are currently in use, you can simply look into the
+file /proc/tty/drivers:
+
+> cat /proc/tty/drivers
+pty_slave /dev/pts 136 0-255 pty:slave
+pty_master /dev/ptm 128 0-255 pty:master
+pty_slave /dev/ttyp 3 0-255 pty:slave
+pty_master /dev/pty 2 0-255 pty:master
+serial /dev/cua 5 64-67 serial:callout
+serial /dev/ttyS 4 64-67 serial
+/dev/tty0 /dev/tty0 4 0 system:vtmaster
+/dev/ptmx /dev/ptmx 5 2 system
+/dev/console /dev/console 5 1 system:console
+/dev/tty /dev/tty 5 0 system:/dev/tty
+unknown /dev/tty 4 1-63 console
+
+-----------------------------------------------------------------------
+
+3 Reading and modifying kernel parameters
+
+A very interesting part of /proc is the directory /proc/sys. This not
+only provides information, it also allows you to change parameters
+within the kernel. Be very careful when trying this. You can optimize
+your system, but you also can crash it. Never play around with kernel
+parameters on a production system. Set up a development machine and
+test to make sure that everything works the way you want it to. You
+may have no alternative but to reboot the machine once an error has
+been made.
+
+To change a value, simply echo the new value into the file. An example
+is given below in the section on the file system data. You need to be
+root to do this. You can create your own boot script to get this done
+every time your system boots.
+
+The files in /proc/sys can be used to tune and monitor miscellaneous
+and general things in the operation of the Linux kernel. Since some
+of the files can inadvertently disrupt your system, it is advisable to
+read both documentation and source before actually making
+adjustments. In any case, be very careful when writing to any of these
+files. The entries in /proc may change slightly between the 2.1.* and
+the 2.2 kernel, so review the kernel documentation if there is any
+doubt. You'll find the documentation in the directory
+/usr/src/linux/Documentation/sysctl. This chapter is heavily based on
+the documentation included in the pre 2.2 kernels. Thanks to Rik van
+Riel for providing this information.
+
+3.1 /proc/sys/debug and /proc/sys/proc
+
+These two subdirectories are empty.
+
+3.2 /proc/sys/fs - File system data
+
+This subdirectory contains specific file system, file handle, inode,
+dentry and quota information.
+
+Currently, these files are in /proc/sys/fs:
+
+dentry-state
+ Status of the directory cache. Since directory entries are
+ dynamically allocated and deallocated, this file gives information
+ about the current status. It holds six values, of which the last
+ two are not used and are always zero. The other four mean:
+
+ nr_dentry   Seems to be zero all the time
+ nr_unused   Number of unused cache entries
+ age_limit   Age in seconds after which the entry may be
+             reclaimed when memory is short
+ want_pages  Internal
+
+dquot-nr and dquot-max
+ The file dquot-max shows the maximum number of cached disk quota
+ entries.
+
+ The file dquot-nr shows the number of allocated disk quota
+ entries and the number of free disk quota entries.
+
+ If the number of free cached disk quotas is very low and you have
+ a large number of simultaneous system users, you might want
+ to raise the limit.
+
+file-nr and file-max
+ The kernel allocates file handles dynamically, but as yet
+ doesn't free them again.
+
+ The value in file-max denotes the maximum number of file handles
+ that the Linux kernel will allocate. When you get a lot of error
+ messages about running out of file handles, you might want to raise
+ this limit. The default value is 4096. To change it, just write the
+ new number into the file:
+
+ # cat /proc/sys/fs/file-max
+ 4096
+ # echo 8192 > /proc/sys/fs/file-max
+ # cat /proc/sys/fs/file-max
+ 8192
+
+ This way of changing a value works for all customizable parameters
+ of the kernel - simply echo the new value to the corresponding
+ file.
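+
+ The same read and write cycle can be done from a program. The
+ following sketch uses two helper names of our own (read_param and
+ write_param) and, so that it runs without root privileges, a
+ temporary file that merely stands in for /proc/sys/fs/file-max.
+
```python
import os
import tempfile


def read_param(path):
    """Return the whitespace-separated values stored in a parameter file."""
    with open(path) as f:
        return f.read().split()


def write_param(path, *values):
    """Write the values on one line, as 'echo' would."""
    with open(path, "w") as f:
        f.write(" ".join(str(v) for v in values) + "\n")


# A temporary file stands in for /proc/sys/fs/file-max here.
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "file-max")
    write_param(demo, 4096)
    before = read_param(demo)
    write_param(demo, 8192)
    after = read_param(demo)

print(before, after)
```
+
+ Pointing the helpers at a real file below /proc/sys requires root,
+ and the warnings given at the start of this chapter apply.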
+
+ The three values in file-nr denote the number of allocated file
+ handles, the number of used file handles, and the maximum number of
+ file handles. When the allocated file handles come close to the
+ maximum, but the number of actually used ones is far behind, you've
+ encountered a peak in your usage of file handles and you don't need
+ to increase the maximum.
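+
+ The reasoning above can be written down as a small check. This is
+ a sketch under our own assumptions: the function name and the two
+ thresholds ("close to the maximum" taken as 90%, "far behind" taken
+ as less than half) are arbitrary choices for illustration, and the
+ sample numbers are made up.
+
```python
def file_handle_advice(file_nr_text, close=0.9, far_behind=0.5):
    """Interpret the three values of file-nr (allocated, used, maximum)."""
    allocated, used, maximum = (int(v) for v in file_nr_text.split())
    if allocated < close * maximum:
        return "ok"
    if used < far_behind * allocated:
        # Many handles allocated, few in use: a past usage peak.
        return "peak"
    return "consider raising file-max"


print(file_handle_advice("4000 1000 4096"))
print(file_handle_advice("4000 3900 4096"))
```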
+
+ However, there is still a per process limit of open files, which
+ unfortunately can't be changed that easily. It is set to 1024 by
+ default. To change this you have to edit the files limits.h and
+ fs.h in the directory /usr/src/linux/include/linux. Change the
+ definition of NR_OPEN and recompile the kernel.
+
+inode-state, inode-nr and inode-max
+ As with file handles, the kernel allocates the inode structures
+ dynamically, but can't free them yet.
+
+ The value in inode-max denotes the maximum number of inode
+ handlers. This value should be 3 to 4 times larger than the value
+ in file-max, since stdin, stdout, and network sockets also need an
+ inode struct to handle them. If you regularly run out of inodes,
+ you should increase this value.
+
+ The file inode-nr contains the first two items from inode-state, so
+ we'll skip to that file...
+
+ inode-state contains three actual numbers and four dummy values. The
+ actual numbers are (in order of appearance) nr_inodes, nr_free_inodes,
+ and preshrink.
+
+ nr_inodes
+ Denotes the number of inodes the system has allocated. This can
+ be slightly more than inode-max because Linux allocates them one
+ pageful at a time.
+
+ nr_free_inodes
+ Represents the number of free inodes.
+
+ preshrink
+ Is nonzero when nr_inodes > inode-max and the system needs to
+ prune the inode list instead of allocating more.
+
+super-nr and super-max
+ Again, super block structures are allocated by the kernel,
+ but not freed. The file super-max contains the maximum number of
+ super block handlers, while super-nr shows the number of
+ currently allocated ones.
+
+ Every mounted file system needs a super block, so if you plan to
+ mount lots of file systems, you may want to increase these
+ numbers.
+
+3.3 /proc/sys/fs/binfmt_misc - Miscellaneous binary formats
+
+Besides these files, there is the subdirectory
+/proc/sys/fs/binfmt_misc. This handles the kernel support for
+miscellaneous binary formats.
+
+Binfmt_misc makes it possible to register additional binary formats
+with the kernel without compiling an additional module or kernel. To
+do so, binfmt_misc needs to know either the magic number at the
+beginning of the binary or its filename extension.
+
+It works by maintaining a linked list of structs, each of which
+describes a binary format: a magic with size (or the filename
+extension), offset and mask, and the interpreter name. On request it
+invokes the given interpreter with the original program as argument,
+as binfmt_java, binfmt_em86 and binfmt_mz do. Since binfmt_misc does
+not define any default binary formats, you have to register each
+additional binary format yourself.
+
+There are two general files in binfmt_misc and one file per registered
+format. The two general files are register and status.
+
+Registering a new binary format
+
+echo :name:type:offset:magic:mask:interpreter: > /proc/sys/fs/binfmt_misc/register
+
+Supply an appropriate name (the name of the /proc directory entry),
+offset (defaults to 0 if omitted), magic and mask (which can be
+omitted and defaults to all 0xff) and, last but not least, the
+interpreter that is to be invoked (for example, '/bin/echo' for
+testing). Type can be M for usual magic matching or E for filename
+extension matching (give the extension in place of the magic).
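+
+To make the field order concrete, the registration string can be
+assembled by a small helper. This is a sketch only: the function
+name binfmt_entry and its argument names are our own, and it does
+nothing but build the colon-separated string described above.
+
```python
def binfmt_entry(name, type_, interpreter, magic, offset="", mask=""):
    """Build a ':name:type:offset:magic:mask:interpreter:' string."""
    if type_ not in ("M", "E"):
        raise ValueError("type must be 'M' (magic) or 'E' (extension)")
    fields = (name, type_, str(offset), magic, mask, interpreter)
    return ":" + ":".join(fields) + ":"


# Extension matching: the extension takes the place of the magic.
html = binfmt_entry("HTML", "E", "/usr/local/java/bin/appletviewer", "html")

# Magic matching: the escapes are passed on literally (raw string).
java = binfmt_entry("Java", "M", "/usr/local/java/bin/javawrapper",
                    r"\xca\xfe\xba\xbe")

print(html)
print(java)
```
+
+The resulting string is what would be echoed into the register file.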
+
+To check or reset the status of the binary format handler:
+
+If you do a cat on the file /proc/sys/fs/binfmt_misc/status, you will
+get the current status (enabled/disabled) of binfmt_misc. Change the
+status by echoing 0 (disables), 1 (enables) or -1 (caution: this
+clears all previously registered binary formats) to status. For
+example, echo 0 > status disables binfmt_misc (temporarily).
+
+Status of a single handler
+
+Each registered handler has an entry in /proc/sys/fs/binfmt_misc.
+These files perform the same function as status, but their scope is
+limited to the individual binary format. By reading such a file with
+cat, you also receive all related information about the
+interpreter/magic of the binfmt.
+
+Example usage of binfmt_misc (emulate binfmt_java)
+
+cd /proc/sys/fs/binfmt_misc
+echo ':Java:M::\xca\xfe\xba\xbe::/usr/local/java/bin/javawrapper:' > register
+echo ':HTML:E::html::/usr/local/java/bin/appletviewer:' > register
+echo ':Applet:M::<!--applet::/usr/local/java/bin/appletviewer:' > register
+echo ':DEXE:M::\x0eDEX::/usr/bin/dosexec:' > register
+
+The first three lines add support for Java executables and Java
+applets (like binfmt_java, additionally recognizing the .html
+extension so that there is no need to put <!--applet> into every
+applet file). The fourth line registers the DEXE format with
+/usr/bin/dosexec as its interpreter. For Java you also have to
+install the JDK and the shell script
+/usr/local/java/bin/javawrapper, which works around the brokenness of
+the Java filename handling. To add a Java binary, just create a link
+to the class file somewhere in the path.
+
+3.4 /proc/sys/kernel - General kernel parameters
+
+This directory reflects general kernel behaviors. As we've said before,
+the contents depend on your configuration. We'll list the most
+important files, along with descriptions of what they mean and how to
+use them.
+
+acct
+ The file contains three values: highwater, lowwater, and
+ frequency.
+
+ It exists only when BSD-style process accounting is enabled. These
+ values control its behavior. If the free space on the file system
+ where the log lives goes below lowwater%, accounting suspends. If
+ it goes above highwater%, accounting resumes. Frequency determines
+ how often the amount of free space is checked (value is in
+ seconds). Default settings are: 4, 2, and 30. That is, suspend
+ accounting if <= 2% of the space is left free; resume it when
+ >= 4% is free again; consider information about the amount of free
+ space valid for 30 seconds.
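+
+ The two thresholds form a hysteresis: between lowwater and
+ highwater the current state is simply kept. The sketch below
+ illustrates this with the default values 4 and 2; the function
+ name is our own and the sequence of free-space percentages is
+ made up.
+
```python
def accounting_active(active, free_percent, highwater=4, lowwater=2):
    """Return the new accounting state for the given free space (in %)."""
    if active and free_percent <= lowwater:
        return False  # suspend: too little free space
    if not active and free_percent >= highwater:
        return True   # resume: enough space again
    return active     # in between, keep the current state


state = True
states = []
for free in (5, 3, 2, 3, 4):  # hypothetical free-space readings
    state = accounting_active(state, free)
    states.append(state)

print(states)
```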
+
+ctrl-alt-del
+ When the value in this file is 0, ctrl-alt-del is trapped and sent
+ to the init(1) program to handle a graceful restart. However, when
+ the value is > 0, Linux's reaction to this key combination will be
+ an immediate reboot, without syncing its dirty buffers.
+
+ Note: when a program (like dosemu) has the keyboard in raw mode,
+ the ctrl-alt-del is intercepted by the program before it ever
+ reaches the kernel tty layer, and it is up to the program to decide
+ what to do with it.
+
+domainname and hostname
+ These files can be used to set the NIS domainname and
+ hostname of your box. For the classic darkstar.frop.org a simple:
+
+ # echo "darkstar" > /proc/sys/kernel/hostname
+ # echo "frop.org" > /proc/sys/kernel/domainname
+
+ would suffice to set your hostname and NIS domainname.
+
+osrelease, ostype and version
+
+ The names make it pretty obvious what these fields contain:
+
+ > cat /proc/sys/kernel/osrelease
+ 2.1.131
+ > cat /proc/sys/kernel/ostype
+ Linux
+ > cat /proc/sys/kernel/version
+ #8 Mon Jan 25 19:45:02 PST 1999
+
+ The files osrelease and ostype should be clear enough. Version
+ needs a little more clarification however. The #8 means that this
+ is the 8th kernel built from this source base and the date behind
+ it indicates the time the kernel was built. The only way to tune
+ these values is to rebuild the kernel.
+
+panic
+ The value in this file represents the number of seconds the kernel
+ waits before rebooting on a panic. When you use the software
+ watchdog, the recommended setting is 60. If set to 0, the auto
+ reboot after a kernel panic is disabled; this is the default
+ setting.
+
+printk
+ The four values in printk denote console_loglevel,
+ default_message_loglevel, minimum_console_loglevel, and
+ default_console_loglevel respectively.
+
+ These values influence printk() behavior when printing or logging
+ error messages, which come from inside the kernel. See syslog(2)
+ for more information on the different log levels.
+
+ console_loglevel
+ Messages with a higher priority than this will be printed to
+ the console.
+
+ default_message_loglevel
+ Messages without an explicit priority will be printed with
+ this priority.
+
+ minimum_console_loglevel
+ Minimum (highest) value to which the console_loglevel can be set.
+
+ default_console_loglevel
+ Default value for console_loglevel.
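+
+ Keep in mind that in the syslog numbering a smaller number means a
+ higher priority. The sketch below shows the resulting decision;
+ the function name is our own, a level of None stands for a message
+ without an explicit priority, and the values 6 and 4 are only
+ example settings.
+
```python
def reaches_console(message_level, console_loglevel,
                    default_message_loglevel):
    """Decide whether a kernel message is printed to the console."""
    if message_level is None:  # no explicit priority given
        message_level = default_message_loglevel
    # Higher priority means a numerically smaller level.
    return message_level < console_loglevel


print(reaches_console(3, 6, 4))     # explicit level 3
print(reaches_console(None, 6, 4))  # falls back to level 4
print(reaches_console(7, 6, 4))     # level 7 is below the threshold
```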
+
+sg-big-buff
+ This file shows the size of the generic SCSI (sg) buffer. At this
+ point, you can't tune it yet, but you can change it at compile time
+ by editing include/scsi/sg.h and changing the value of
+ SG_BIG_BUFF.
+
+ If you use a scanner with SANE (Scanner Access Now Easy) you
+ might want to set this to a higher value. Look into the SANE
+ documentation on this issue.
+
+modprobe
+ The path to the modprobe binary. The kernel uses this program to
+ load modules on demand.
+
+3.5 /proc/sys/vm - The virtual memory subsystem
+
+The files in this directory can be used to tune the operation of the
+virtual memory (VM) subsystem of the Linux kernel. In addition, one of
+the files (bdflush) has a little influence on disk usage.
+
+bdflush
+ This file controls the operation of the bdflush kernel daemon. It
+ currently contains 9 integer values, 6 of which are actually used
+ by the kernel:
+
+ nfract      Percentage of buffer cache dirty to
+             activate bdflush
+ ndirty      Maximum number of dirty blocks to
+             write out per wake-cycle
+ nrefill     Number of clean buffers to try to obtain
+             each time we call refill
+ nref_dirt   Dirty buffer threshold for activating bdflush
+             when trying to refill buffers
+ dummy       Unused
+ age_buffer  Time for normal buffer to age before you flush it
+ age_super   Time for superblock to age before you flush it
+ dummy       Unused
+ dummy       Unused
+
+ nfract
+ This parameter governs the maximum number of dirty buffers
+ in the buffer cache. Dirty means that the contents of the
+ buffer still have to be written to disk (as opposed to a
+ clean buffer, which can just be forgotten about). Setting
+ this to a high value means that Linux can delay disk writes
+ for a long time, but it also means that it will have to do a
+ lot of I/O at once when memory becomes short. A low value
+ will spread out disk I/O more evenly.
+
+ ndirty
+ Ndirty gives the maximum number of dirty buffers that
+ bdflush can write to the disk at one time. A high value will
+ mean delayed, bursty I/O, while a small value can lead to
+ memory shortage when bdflush isn't woken up often enough.
+
+ nrefill
+ This is the number of buffers that bdflush will add to the list
+ of free buffers when refill_freelist() is called. It is
+ necessary to allocate free buffers beforehand, since the
+ buffers are often different sizes than the memory pages
+ and some bookkeeping needs to be done beforehand. The
+ higher the number, the more memory will be wasted and the
+ less often refill_freelist() will need to run.
+
+ nref_dirt
+ When refill_freelist() comes across more than nref_dirt
+ dirty buffers, it will wake up bdflush.
+
+ age_buffer and age_super
+ Finally, the age_buffer and age_super parameters govern the
+ maximum time Linux waits before writing out a dirty buffer
+ to disk. The value is expressed in jiffies (clock ticks); the
+ number of jiffies per second is 100. Age_buffer is the
+ maximum age for data blocks, while age_super is for
+ filesystem metadata.
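+
+ As an illustration, the nine values can be named according to the
+ table above and the two age values converted from jiffies to
+ seconds. This is a sketch: the helper names are our own and the
+ sample line contains example values only.
+
```python
BDFLUSH_FIELDS = ("nfract", "ndirty", "nrefill", "nref_dirt", "dummy",
                  "age_buffer", "age_super", "dummy", "dummy")
HZ = 100  # jiffies per second, as stated above


def parse_bdflush(text):
    """Return the six used bdflush parameters as a dict."""
    values = [int(v) for v in text.split()]
    if len(values) != len(BDFLUSH_FIELDS):
        raise ValueError("expected %d values" % len(BDFLUSH_FIELDS))
    return {name: value
            for name, value in zip(BDFLUSH_FIELDS, values)
            if name != "dummy"}


def jiffies_to_seconds(jiffies, hz=HZ):
    return jiffies / hz


# Example values only, not taken from a particular system.
params = parse_bdflush("40 500 64 256 500 3000 500 1884 2")
print(params["nfract"], jiffies_to_seconds(params["age_buffer"]))
```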
+
+buffermem
+ The three values in this file control how much memory should be
+ used for buffer memory. The percentage is calculated as a
+ percentage of total system memory.
+
+ The values are:
+
+ min_percent
+ This is the minimum percentage of memory that should be
+ spent on buffer memory.
+
+ borrow_percent
+ When Linux is short on memory, and the buffer cache uses more
+ than it has been allotted, the memory management (MM) subsystem
+ will prune the buffer cache more heavily than other memory to
+ compensate.
+
+ max_percent
+ This is the maximum amount of memory that can be used for
+ buffer memory.
+
+freepages
+ This file contains three values: min, low and high:
+
+ min
+ When the number of free pages in the system reaches this number,
+ only the kernel can allocate more memory.
+
+ low
+ If the number of free pages gets below this point, the kernel
+ starts swapping aggressively.
+
+ high
+ The kernel tries to keep up to this amount of memory free; if
+ memory comes below this point, the kernel gently starts swapping
+ in the hopes that it never has to do really aggressive swapping.
+
+kswapd
+ Kswapd is the kernel swap out daemon. That is, kswapd is that piece
+ of the kernel that frees memory when it gets fragmented or
+ full. Since every system is different, you'll probably want some
+ control over this piece of the system.
+
+ The file contains three numbers:
+
+ tries_base
+ The maximum number of pages kswapd tries to free in one round is
+ calculated from this number. Usually this number will be divided
+ by 4 or 8 (see mm/vmscan.c), so it isn't as big as it looks.
+
+ When you need to increase the bandwidth to/from swap, you'll want
+ to increase this number.
+
+ tries_min
+ This is the minimum number of times kswapd tries to free a page
+ each time it is called. Basically it's just there to make sure
+ that kswapd frees some pages even when it's being called with
+ minimum priority.
+
+
+ swap_cluster
+ This is probably the greatest influence on system
+ performance. swap_cluster is the number of pages kswapd writes in
+ one turn. You'll want this value to be large so that kswapd does
+ its I/O in large chunks and the disk doesn't have to seek as
+ often, but you don't want it to be too large since that would
+ flood the request queue.
+
+overcommit_memory
+ This file contains one value. The following algorithm is used to
+ decide if there's enough memory: if the value of overcommit_memory
+ is positive, then there's always enough memory. This is a useful
+ feature, since programs often malloc() huge amounts of memory 'just
+ in case', while they only use a small part of it. Leaving this
+ value at 0 will lead to the failure of such a huge malloc(), when
+ in fact the system has enough memory for the program to run.
+
+ On the other hand, enabling this feature can cause you to run out
+ of memory and thrash the system to death, so large and/or important
+ servers will want to set this value to 0.
+
+pagecache
+ This file does exactly the same as buffermem, only this file
+ controls the amount of memory allowed for memory mapping and
+ generic caching of files.
+
+ You don't want the minimum level to be too low, otherwise your
+ system might thrash when memory is tight or fragmentation is
+ high.
+
+pagetable_cache
+ The kernel keeps a number of page tables in a per-processor cache
+ (this helps a lot on SMP systems). The cache size for each
+ processor will be between the low and the high value.
+
+ On a low-memory, single CPU system, you can safely set these values
+ to 0 so you don't waste memory. It is used on SMP systems so that
+ the system can perform fast pagetable allocations without having to
+ acquire the kernel memory lock.
+
+ For large systems, the settings are probably fine. For normal
+ systems they won't hurt a bit. For small systems (<16MB ram) it
+ might be advantageous to set both values to 0.
+
+swapctl
+ This file contains no fewer than 8 variables. All of these values
+ are used by kswapd.
+
+ The first four variables sc_max_page_age, sc_page_advance,
+ sc_page_decline and sc_page_initial_age are used to keep track of
+ Linux's page aging. Page aging is a bookkeeping method to track
+ which pages of memory are often used, and which pages can be
+ swapped out without consequences.
+
+ When a page is swapped in, it starts at sc_page_initial_age
+ (default 3) and when the page is scanned by kswapd, its age is
+ adjusted according to the following scheme:
+
+ o If the page was used since the last time we scanned, its age
+ is increased by sc_page_advance (default 3) up to a
+ maximum of sc_max_page_age (default 20).
+
+ o Else (meaning it wasn't used) its age is decreased by
+ sc_page_decline (default 1).
+
+ When a page reaches age 0, it's ready to be swapped out.
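+
+ The aging rules above can be sketched in a few lines of Python. This
+ is only an illustration, not the kernel code: the function names are
+ made up here, the defaults are the ones quoted above, and clamping
+ the age at 0 is an assumption.
+
```python
def age_page(age, used, advance=3, decline=1, max_age=20):
    """Return the new age of a page after one kswapd scan."""
    if used:
        # Referenced since the last scan: age grows, capped at max_age.
        return min(age + advance, max_age)
    # Not referenced: age decays (assumed never to drop below 0).
    return max(age - decline, 0)


def scans_until_swappable(initial_age=3, decline=1):
    """Count the scans an untouched page survives before reaching age 0."""
    age, scans = initial_age, 0
    while age > 0:
        age = age_page(age, used=False, decline=decline)
        scans += 1
    return scans
```
+
+ With the defaults, a freshly swapped-in page that is never touched
+ again reaches age 0 after three scans.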
+
+ The next four variables sc_age_cluster_fract, sc_age_cluster_min,
+ sc_pageout_weight and sc_bufferout_weight, can be used to control
+ kswapd's aggressiveness in swapping out pages.
+
+ Sc_age_cluster_fract is used to calculate how many pages from a
+ process are to be scanned by kswapd. The formula used is
+
+                sc_age_cluster_fract
+                -------------------- * resident set size
+                        1024
+
+ So if you want kswapd to scan the whole process,
+ sc_age_cluster_fract needs to have a value of 1024. The minimum
+ number of pages kswapd will scan is given by sc_age_cluster_min,
+ which ensures that kswapd also scans small processes.
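+
+ As a worked example, the formula and the minimum can be combined as
+ follows. This is a sketch only: the function name is hypothetical,
+ and the use of integer division is an assumption.
+
```python
def pages_to_scan(rss_pages, age_cluster_fract, age_cluster_min):
    """Estimate how many pages of one process kswapd would scan."""
    # Fraction of the resident set size, expressed in 1/1024 units.
    scaled = age_cluster_fract * rss_pages // 1024
    # Never scan fewer pages than the configured minimum.
    return max(scaled, age_cluster_min)
```
+
+ For instance, pages_to_scan(2048, 512, 8) gives 1024 pages (half of
+ the resident set), while a 10-page process is still scanned for 8
+ pages.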
+
+ The values of sc_pageout_weight and sc_bufferout_weight are used
+ to control how many tries kswapd will make in order to swap out
+ one page/buffer. These values can be used to fine-tune the ratio
+ between user pages and buffer/cache memory. When you find that
+ your Linux system is swapping out too many process pages in order
+ to satisfy buffer memory demands, you might want to either
+ increase sc_bufferout_weight, or decrease the value of
+ sc_pageout_weight.
+
+3.6 /proc/sys/dev - Device specific parameters
+
+Currently there is only support for CDROM drives, and for those, there
+is only one read only file containing information about the CD-ROM
+drives attached to the system:
+
+>cat /proc/sys/dev/cdrom/info
+CD-ROM information
+
+drive name: sr0 hdc
+drive speed: 0 6
+drive # of slots: 1 0
+Can close tray: 1 1
+Can open tray: 1 1
+Can lock tray: 1 1
+Can change speed: 1 1
+Can select disk: 0 1
+Can read multisession: 1 1
+Can read MCN: 1 1
+Reports media changed: 1 1
+Can play audio: 1 1
+
+You see two drives, sr0 and hdc, and their lists of features.
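+
+The listing has one column per drive, so it is easy to turn into one
+record per drive. The following Python sketch parses the sample output
+shown above; the function name is hypothetical and the parser assumes
+the "label: value value ..." layout of that sample.
+
```python
SAMPLE = """\
CD-ROM information

drive name: sr0 hdc
drive speed: 0 6
drive # of slots: 1 0
Can close tray: 1 1
Can open tray: 1 1
Can lock tray: 1 1
Can change speed: 1 1
Can select disk: 0 1
Can read multisession: 1 1
Can read MCN: 1 1
Reports media changed: 1 1
Can play audio: 1 1
"""


def parse_cdrom_info(text):
    """Return {drive name: {feature label: value string}}."""
    names, drives = [], {}
    for line in text.splitlines():
        label, sep, values = line.partition(":")
        if not sep:
            continue  # title line or blank line
        label, fields = label.strip(), values.split()
        if label == "drive name":
            names = fields
            drives = {name: {} for name in names}
        else:
            # The n-th value on each line belongs to the n-th drive.
            for name, value in zip(names, fields):
                drives[name][label] = value
    return drives
```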
+
+3.7 /proc/sys/sunrpc - Remote procedure calls
+
+This directory contains four files, which enable or disable debugging
+for the RPC functions NFS, NFS-daemon, RPC and NLM. The default value
+of each is 0. Set a file to 1 to turn the corresponding debugging on.
+
+3.8 /proc/sys/net - Networking stuff
+
+The interface to the networking parts of the kernel is located in
+/proc/sys/net. The table below shows all possible subdirectories. You
+may see only some of them, depending on the configuration of your
+kernel:
+
++--------------------------------------------------------------+
+| core     General parameter   | appletalk  Appletalk protocol |
+| unix     Unix domain sockets | netrom     NET/ROM            |
+| 802      E802 protocol       | ax25       AX25               |
+| ethernet Ethernet protocol   | rose       X.25 PLP layer     |
+| ipv4     IP version 4        | x25        X.25 protocol      |
+| ipx      IPX                 | token-ring IBM token ring     |
+| bridge   Bridging            | decnet     DEC net            |
+| ipv6     IP version 6        |                               |
++--------------------------------------------------------------+
+
+We will concentrate on IP networking here. As AX25, X.25, and DEC Net
+are only minor players in the Linux world, we'll skip them in this
+chapter. You'll find some short info on Appletalk and IPX further down
+in sections 3.10 and 3.11. Please look in the online documentation and
+the kernel source to get a detailed view of the parameters for those
+protocols. In this section we'll discuss the subdirectories printed in
+bold letters in the table above. As the default values are suitable
+for most needs, there is usually no need to change them.
+
+/proc/sys/net/core - Network core options
+
+rmem_default
+ The default setting of the socket receive buffer in bytes.
+
+rmem_max
+ The maximum receive socket buffer size in bytes.
+
+wmem_default
+ The default setting (in bytes) of the socket send buffer.
+
+wmem_max
+ The maximum send socket buffer size in bytes.
+
+message_burst and message_cost
+ These parameters are used to limit the warning messages written to
+ the kernel log from the networking code. They enforce a rate limit
+ to make a denial-of-service attack impossible. The higher the
+ message_cost factor is, the fewer messages will be
+ written. Message_burst controls when messages will be dropped. The
+ default settings limit warning messages to one every five seconds.
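+
+ The text only states that message_cost and message_burst bound the
+ message rate. One common way to get such behaviour is a token
+ bucket, and the Python sketch below models the two parameters under
+ that assumption. It is not taken from the kernel source; the class
+ and method names are hypothetical and time is counted in arbitrary
+ ticks.
+
```python
class RateLimiter:
    """Token-bucket model: each message costs `cost` tokens,
    one token is earned per tick, at most `burst` are saved up."""

    def __init__(self, cost, burst):
        self.cost = cost
        self.burst = burst
        self.tokens = burst  # start with a full bucket
        self.last = 0

    def allow(self, now):
        """Return True if a message may be logged at tick `now`."""
        # Refill for the elapsed time, never beyond the burst limit.
        self.tokens = min(self.burst, self.tokens + (now - self.last))
        self.last = now
        if self.tokens >= self.cost:
            self.tokens -= self.cost
            return True
        return False  # message is dropped
```
+
+ In this model a higher cost means fewer messages in the long run,
+ and burst decides how many messages may pass back to back before
+ dropping starts.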
+
+netdev_max_backlog
+ Maximum number of packets queued on the input side when the
+ interface receives packets faster than the kernel can process them.
+
+optmem_max
+ Maximum ancillary buffer size allowed per socket. Ancillary data is
+ a sequence of struct cmsghdr structures with appended data.
+
+/proc/sys/net/unix - Parameters for UNIX domain sockets
+
+There are only two files in this subdirectory. They control the delays
+for deleting and destroying socket descriptors.
+
+3.9 /proc/sys/net/ipv4 - IPV4 settings
+
+IP version 4 is still the most used protocol in Unix networking. It
+will be replaced by IP version 6 in the next couple of years, but for
+the moment it's the de facto standard for the internet and is used in
+most networking environments around the world. Because of the
+importance of this protocol, we'll have a deeper look into the subtree
+controlling the behavior of the IPv4 subsystem of the Linux kernel.
+
+Let's start with the entries in /proc/sys/net/ipv4 itself.
+
+ICMP settings
+
+icmp_echo_ignore_all and icmp_echo_ignore_broadcasts
+ Set to 1 to make the kernel ignore all ICMP ECHO requests, or just
+ those sent to broadcast and multicast addresses. Set to 0 to have
+ the kernel answer them.
+
+ Please note that if you accept ICMP echo requests with a
+ broadcast/multicast destination address your network may be used
+ as an exploder for denial of service packet flooding attacks to
+ other hosts.
+
+icmp_destunreach_rate, icmp_echoreply_rate,
+icmp_paramprob_rate and icmp_timeexceed_rate
+ Sets limits for sending ICMP packets to specific targets. A value of
+ zero disables all limiting. Any positive value sets the maximum
+ packet rate in hundredths of a second (on Intel systems).
+
+IP settings
+
+ip_autoconfig
+ This file contains the value 1 if the host got its IP configuration
+ by RARP, BOOTP, DHCP or a similar mechanism. Otherwise it is 0.
+
+ip_default_ttl
+ TTL (Time To Live) for IPv4 interfaces. This is simply the
+ maximum number of hops a packet may travel.
+
+ip_dynaddr
+ Enable dynamic socket address rewriting on interface address change.
+ This is useful for dialup interfaces with changing IP addresses.
+
+ip_forward
+ Enable or disable forwarding of IP packets between interfaces. A
+ change of this value resets all other parameters to their default
+ values. The defaults differ depending on whether the kernel is
+ configured as a host or as a router.
+
+ip_local_port_range
+ Range of ports used by TCP and UDP to choose the local
+ port. Contains two numbers, the first number is the lowest port,
+ the second number the highest local port. Default is 1024-4999.
+ Should be changed to 32768-61000 for high-usage systems.
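+
+ To see what the two ranges mean in practice, the file's contents can
+ be parsed and the number of available local ports counted. The
+ Python sketch below works on the text of the file rather than on
+ the file itself; the function names are hypothetical.
+
```python
def parse_port_range(text):
    """Parse 'low high' as found in ip_local_port_range."""
    low, high = (int(field) for field in text.split())
    if not 0 < low <= high <= 65535:
        raise ValueError("invalid port range: %r" % text)
    return low, high


def port_count(text):
    """Number of local ports available in the given range."""
    low, high = parse_port_range(text)
    return high - low + 1
```
+
+ The default range of 1024-4999 provides 3976 local ports, while
+ 32768-61000 provides 28233.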
+
+ip_no_pmtu_disc
+ Global switch to turn path MTU discovery off. It can also be set
+ on a per socket basis by the applications or on a per route
+ basis.
+
+ip_masq_debug
+ Enable/disable debugging of IP masquerading.
+
+
+IP fragmentation settings
+
+ipfrag_high_thresh and ipfrag_low_thresh
+ Maximum memory used to reassemble IP fragments. When
+ ipfrag_high_thresh bytes of memory are allocated for this purpose,
+ the fragment handler will toss packets until ipfrag_low_thresh is
+ reached.
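+
+ The two thresholds form a simple hysteresis, which the following
+ Python sketch illustrates. It is a model only: the function name is
+ hypothetical, and tossing the oldest entries first is an assumption,
+ since the text does not say which packets are dropped.
+
```python
def toss_fragments(queue_sizes, high_thresh, low_thresh):
    """Return the fragment sizes kept after one pass of the handler.

    queue_sizes lists the memory used per queued packet, oldest first.
    """
    kept = list(queue_sizes)
    if sum(kept) < high_thresh:
        return kept  # below the high mark: nothing is tossed
    while kept and sum(kept) > low_thresh:
        kept.pop(0)  # assumption: drop the oldest entry first
    return kept
```
+
+ Nothing is dropped until the high mark is hit, and once dropping
+ starts it continues down to the low mark, so the handler does not
+ toss a single packet on every new arrival.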
+
+
+ipfrag_time
+ Time in seconds to keep an IP fragment in memory.
+
+TCP settings
+
+tcp_retrans_collapse
+ Bug-to-bug compatibility with some broken printers. On retransmit
+ try to send bigger packets to work around bugs in certain TCP
+ stacks. Can be turned off by setting it to zero.
+
+tcp_keepalive_probes
+ Number of keep alive probes TCP sends out, until it decides that the
+ connection is broken.
+
+tcp_keepalive_time
+ How often TCP sends out keep alive messages, when keep alive is
+ enabled. The default is 2 hours.
+
+tcp_syn_retries
+ Number of times initial SYNs for a TCP connection attempt will be
+ retransmitted. Should not be higher than 255. This applies only to
+ outgoing connections; for incoming connections the number of
+ retransmits is defined by tcp_retries1.
+
+tcp_sack
+ Enable selective acknowledgments as defined in RFC2018.
+
+tcp_timestamps
+ Enable timestamps as defined in RFC1323.
+
+tcp_stdurg
+ Enable the strict RFC793 interpretation of the TCP urgent pointer
+ field. The default is to use the BSD compatible interpretation
+ of the urgent pointer pointing to the first byte after the urgent
+ data. The RFC793 interpretation is to have it point to the last
+ byte of urgent data. Enabling this option may lead to
+ interoperability problems. Disabled by default.
+
+tcp_syncookies
+ Only valid when the kernel was compiled with
+ CONFIG_SYNCOOKIES. Send out syncookies when the syn backlog queue
+ of a socket overflows. This is to protect against the common 'syn
+ flood attack'. Disabled by default.
+
+ Note that the concept of a socket backlog is abandoned. This
+ means the peer may not receive reliable error messages from an
+ overloaded server with syncookies enabled.
+
+tcp_window_scaling
+ Enable window scaling as defined in RFC1323.
+
+tcp_fin_timeout
+ How many seconds to wait for a final FIN before the socket is
+ always closed. This is strictly a violation of the TCP
+ specification, but required to prevent denial-of-service attacks.
+
+tcp_max_ka_probes
+ How many keepalive probes are sent per slow timer run. Shouldn't be
+ set too high to prevent bursts.
+
+tcp_max_syn_backlog
+ Length of the per socket backlog queue. Since Linux 2.2 the backlog
+ specified in listen(2) only specifies the length of the backlog
+ queue of already established sockets. When more connection requests
+ arrive Linux starts to drop packets. When syncookies are enabled
+ the packets are still answered and the maximum queue is effectively
+ ignored.
+
+tcp_retries1
+ Defines how often an answer to a TCP connection request is
+ retransmitted before giving up.
+
+tcp_retries2
+ Defines how often a TCP packet is retransmitted before giving up.
+
+Interface specific settings
+
+In the directory /proc/sys/net/ipv4/conf you'll find one subdirectory
+for each interface the system knows about and one directory called
+all. Changes in the all subdirectory affect all interfaces, whereas
+changes in the other subdirectories affect only one interface.
+
+All directories have the same entries:
+
+accept_redirects
+ This switch decides if the kernel accepts ICMP redirect messages
+ or not. The default is 'yes', if the kernel is configured for a
+ regular host; and 'no' for a router configuration.
+
+accept_source_route
+ Decides whether source routed packets are accepted or declined. The
+ default is dependent on the kernel configuration. It's 'yes' for
+ routers and 'no' for hosts.
+
+bootp_relay
+ Accept packets with source address 0.b.c.d that are not destined
+ for this host as local ones. A BOOTP relay daemon is expected to
+ catch and forward such packets.
+
+ The default is 'no', as this feature is not implemented yet
+ (kernel version 2.2.0-pre?).
+
+forwarding
+ Enable or disable IP forwarding on this interface.
+
+log_martians
+ Log packets whose source addresses have no known route to the
+ kernel log.
+
+mc_forwarding
+ Do multicast routing. The kernel needs to be compiled with
+ CONFIG_MROUTE and a multicast routing daemon is required.
+
+proxy_arp
+ Do (1) or don't (0) do proxy ARP.
+
+rp_filter
+ Integer value deciding if source validation should be made.
+ 1 means yes, 0 means no. Disabled by default, but protection
+ against local/broadcast address spoofing is always on.
+
+ If you set this to 1 on a router that is the only connection
+ for a network to the net, it prevents spoofing attacks
+ against your internal networks (external addresses can still be
+ spoofed), without the need for additional firewall rules.
+
+secure_redirects
+ Accept ICMP redirect messages only for gateways listed in the
+ default gateway list. Enabled by default.
+
+shared_media
+ If it is not set the kernel does not assume that different subnets
+ on this device can communicate directly. Default setting is 'yes'.
+
+send_redirects
+ Determines whether to send ICMP redirects to other hosts.
+
+
+Routing settings
+
+The directory /proc/sys/net/ipv4/route contains several files to
+control routing issues.
+
+error_burst and error_cost
+ These parameters are used to limit the warning messages written to
+ the kernel log from the routing code. The higher the error_cost
+ factor is, the fewer messages will be written. Error_burst controls
+ when messages will be dropped. The default settings limit warning
+ messages to one every five seconds.
+
+flush
+ Writing to this file results in a flush of the routing cache.
+
+gc_elastic, gc_interval, gc_min_interval, gc_thresh, gc_timeout
+ Values to control the frequency and behavior of the garbage
+ collection algorithm for the routing cache.
+
+max_size
+ Maximum size of the routing cache. Old entries will be purged
+ once the cache has this size.
+
+max_delay, min_delay
+ Delays for flushing the routing cache.
+
+redirect_load, redirect_number
+ Factors which determine if more ICMP redirects should be sent to
+ a specific host. No redirects will be sent once the load limit or
+ the maximum number of redirects has been reached.
+
+redirect_silence
+ Timeout for redirects. After this period redirects will be sent
+ again, even if this has been stopped, because the load or number
+ limit has been reached.
+
+Network Neighbor handling
+
+Settings about how to handle connections with direct neighbors (nodes
+attached to the same link) can be found in the directory
+/proc/sys/net/ipv4/neigh.
+
+As we saw in the conf directory, there is a default subdirectory
+which holds the default values, and one directory for each
+interface. The contents of the directories are identical, with the
+single exception that the default settings contain additional options
+to set garbage collection parameters.
+
+In the interface directories you'll find the following entries:
+
+base_reachable_time
+ A base value used for computing the random reachable time value
+ as specified in RFC2461.
+
+retrans_time
+ The time, expressed in jiffies (1/100 sec), between retransmitted
+ Neighbor Solicitation messages. Used for address resolution and to
+ determine if a neighbor is unreachable.
+
+unres_qlen
+ Maximum queue length for a pending ARP request - how many packets
+ are accepted from other layers while the ARP address is still
+ being resolved.
+
+anycast_delay
+ Maximum for random delay of answers to neighbor solicitation
+ messages in jiffies (1/100 sec). Not yet implemented (Linux does
+ not have anycast support yet).
+
+ucast_solicit
+ Maximum number of retries for unicast solicitation.
+
+mcast_solicit
+ Maximum number of retries for multicast solicitation.
+
+delay_first_probe_time
+ Delay for the first time probe if the neighbor is reachable. (see
+ gc_stale_time).
+
+locktime
+ An ARP/neighbor entry is only replaced with a new one if the old
+ one is at least locktime old. This prevents ARP cache thrashing.
+
+proxy_delay
+ Maximum time (real time is random [0..proxytime]) before
+ answering an ARP request for which we have a proxy ARP entry.
+ In some cases, this is used to prevent network flooding.
+
+proxy_qlen
+ Maximum queue length of the delayed proxy arp timer (see
+ proxy_delay).
+
+app_solicit
+ Determines the number of requests to send to the user level arp
+ daemon. 0 to turn off.
+
+gc_stale_time
+ Determines how often to check for stale ARP entries. After an ARP
+ entry is stale it will be resolved again (useful when an IP address
+ migrates to another machine). When ucast_solicit is > 0 it first
+ tries to send an ARP packet directly to the known host. When that
+ fails and mcast_solicit is > 0, an ARP request is broadcast.
+
+3.10 Appletalk
+
+The /proc/sys/net/appletalk directory holds the Appletalk
+configuration data when Appletalk is loaded. The configurable
+parameters are:
+
+aarp-expiry-time
+ The amount of time we keep an AARP entry before expiring
+ it. Used to age out old hosts.
+
+aarp-resolve-time
+ The amount of time we will spend trying to resolve an Appletalk
+ address.
+
+aarp-retransmit-limit
+ The number of times we will retransmit a query before giving up.
+
+aarp-tick-time
+ Controls the rate at which expiries are checked.
+
+
+The directory /proc/net/appletalk holds the list of active appletalk
+sockets on a machine.
+
+The fields indicate the DDP type, the local address (in network:node
+format), the remote address, the size of the transmit pending queue,
+the size of the received queue (bytes waiting for applications to
+read), the state and the uid owning the socket.
+
+/proc/net/atalk_iface lists all the interfaces configured for
+appletalk. It shows the name of the interface, its appletalk address,
+the network range on that address (or network number for phase 1
+networks), and the status of the interface.
+
+/proc/net/atalk_route lists each known network route. It lists the
+target (network) that the route leads to, the router (may be directly
+connected), the route flags, and the device the route is via.
+
+3.11 IPX
+
+The IPX protocol has no tunable values in /proc/sys/net.
+
+The IPX protocol does, however, provide /proc/net/ipx. This lists each
+IPX socket giving the local and remote addresses in Novell format
+(that is network:node:port). In accordance with the strange Novell
+tradition, everything but the port is in hex. Not_Connected is
+displayed for sockets that are not tied to a specific remote
+address. The Tx and Rx queue sizes indicate the number of bytes
+pending for transmit and receive. The state indicates the state the
+socket is in and the uid is the owning uid of the socket.
+
+The /proc/net/ipx_interface file lists all IPX interfaces. For each
+interface it gives the network number, the node number, and indicates
+if the network is the primary network. It also indicates which device
+it is bound to (or Internal for internal networks) and the Frame Type
+if appropriate. Linux supports 802.3, 802.2, 802.2 SNAP and DIX (Blue
+Book) ethernet framing for IPX.
+
+The /proc/net/ipx_route table holds a list of IPX routes. For each
+route it gives the destination network, the router node (or Directly)
+and the network address of the router (or Connected) for internal
+networks.