Guidelines for Usage of Disk Space
(This is CSN 92/320, revised 22-Oct-1998) This is a routine update of the disk space usage guidelines for LNS computers.
Types of disk space
Four types of public disk space are made available to general LNS users: USER, TEM, MTM, and STM. In addition, many users and collaborators have private or semiprivate disk space attached to their local machines. USER disk space is the most limited in size, but its integrity is guaranteed by the computer group. It is backed up daily on all platforms (Alpha, DECstation, and VAX). (Send requests for restores from backup to the computer group 'service' address.) USER disk space is intended for smallish but irreplaceable files like source code, command files, e-mail correspondence, and TeX source (but not PostScript versions) for documents in progress. Some people also like to use USER space for archiving small but historically interesting files, even though those files do not need to be backed up daily.
TEM, MTM, and STM are pooled temporary disk space. A file on the TEM (temporary) disk is automatically deleted two weeks after its last use. For MTM (medium-term) space (available only on the DECstations), the time limit is one week. For STM (short-term) space the limit is three days, but the clock starts at file creation time and is not reset to zero each time the file is accessed, so STM space is mostly useful for spooling or accumulating Monte Carlo files. Files may disappear at any time after their time limit has expired. Deletions may be made at any time of day at the convenience of the computer group.
Temporary disk space is intended for files whose usefulness is transitory, like executables (which tend to be updated regularly), log files of batch jobs, histograms and n-tuples of work in progress, data samples, or Monte Carlo files being accumulated prior to being written to tape. The user is free to choose which of the temporary disks to use, based on where space is available and an estimate of how long the file is likely to be useful.
MTM and STM (and TEM on the VAX) are not automatically backed up by the computer group; the user is responsible for writing to tape any files whose replacement would be inconvenient, expensive, or impossible. The AXP and DECstation TEM disks are backed up by the computer group at 2-week intervals, so that in principle all files which are automatically deleted will be on a backup tape somewhere. See Suzanne to get files off backup.
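As a rough way to see which temporary files are approaching their time limit, the sketch below lists files that have not been used for more than a given number of days. It is only an illustration: the function name list_stale and the paths in the usage note are hypothetical, and it assumes a find command that supports -atime and -mtime. For STM space, where the clock starts at creation, modification time is used as a stand-in, since many UNIX file systems do not record a creation time.

```shell
# list_stale DIR DAYS [atime|mtime]
# Print regular files under DIR whose last access (the default) or last
# modification lies more than DAYS full 24-hour periods in the past.
list_stale() {
    dir=$1
    days=$2
    kind=${3:-atime}
    find "$dir" -type f "-$kind" "+$days" -print
}
```

For example, list_stale /tem/$USER 10 would show TEM files unused for more than 10 days, and list_stale /stm/$USER 2 mtime would show STM files last written more than 2 days ago (both paths are assumptions about the local layout).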
Private and semiprivate disk space is provided on hard disks attached to users' personal computers and workstations. An attempt is made to back up all user files on personal computers every day (incremental backup) and weekly (full backup). Some days the workload makes this impossible, however. The UNIX workstations are backed up only by special arrangement. Contact Suzanne for details.
Allotment of USER disk space
Each user is allotted 40 MB of USER space on the AXP /home disk. The UNIX system automatically administers disk space allotments. Each user is responsible for making sure that he or she doesn't lose a job because of exceeding the allocation. The administering program recognizes two limits, the allotment and a drop-dead limit. If a user's job exceeds the 40 MB allotment, a one-week timer is started. After a week of being over quota, the user is not allowed to open any new files. If a user exceeds the drop-dead limit (50 MB), his/her job is immediately aborted if it tries to write a file. There will be no warning when the 50 MB limit is exceeded.
When the 40 MB timer is started, two warning mechanisms are activated. When the user logs in, there will be a system warning message reminding him/her of the number of days remaining before file-writing will be inhibited. In parallel, a nightly batch job will examine all users' disk usage and send warning e-mails to anyone exceeding 40 MB. The computer group will watch USER disk space and try to ensure that all users can comfortably use their allotments. Unfortunately, there may be bugs in the operating system which allow the disks to be completely filled, so we cannot confidently predict that crashes due to filled /home disks are a thing of the past.
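The two-limit scheme can be mimicked with a small self-check, sketched below. This is not the administering program itself, only an illustration: the function names are hypothetical, the limits are written as 40000 KB and 50000 KB on the assumption that they are counted in the same units as Table II, and du may not count space exactly the way the quota system does.

```shell
# Limits in KB for the AXP /home disk, taken from the text above:
# a 40 MB allotment and a 50 MB drop-dead limit (KB values assumed).
ALLOTMENT_KB=40000
HARD_LIMIT_KB=50000

# quota_status USAGE_KB
# Classify a usage figure (in KB) against the two limits.
quota_status() {
    usage_kb=$1
    if [ "$usage_kb" -gt "$HARD_LIMIT_KB" ]; then
        echo "over hard limit: writes will abort the job"
    elif [ "$usage_kb" -gt "$ALLOTMENT_KB" ]; then
        echo "over allotment: one-week timer running"
    else
        echo "ok"
    fi
}

# home_usage_kb
# Total size of the home directory in KB, as reported by du.
home_usage_kb() {
    du -sk "$HOME" | awk '{print $1}'
}
```

Running quota_status "$(home_usage_kb)" before submitting a large job gives a quick indication of how much headroom is left.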
On the DECstation, the system is the same as on the AXP, except that the hard limit is 40 MB, the guideline for usage is 10 MB, and the automatic warning and job-killing system has not proven to be necessary with the more laid-back DECstation users.
On the VAX, USER disk space is allocated under a quota system, with each user receiving an individually negotiated limit, typically about 12,000 blocks (6 MB). Since most people use the VAX only for e-mail, the advice to prune your mail files is particularly relevant here.
Sharing TEM, MTM, and STM disk space
For TEM, MTM, and STM disk space, it is impractical and inflexible to use this kind of allotment system. We will continue to use the old pooled resources model for disk space usage. The working hypothesis is that many individuals will each need a lot of disk space for only a short time, but that different users' needs will not be simultaneous, so that we can get by with a relatively small amount of disk space on the average. By pooling the space we get much more efficient use of the scarce resources. This only works if almost all users are good about not tying up space unnecessarily. The computer group is aware of the practice of some individuals of deliberately "touching" their temporary disk files to prevent deletion, even though the files are not really in use. This practice, of course, reduces the amount of disk available to the pool, and is therefore strongly discouraged. In an emergency, we reserve the right to delete such files without warning, but normally we will attempt to convince the user to voluntarily relinquish space. Habitual abusers (we know who you are) will be handled more roughly.
Occasionally we run out of disk space. Batch jobs start to bomb and users complain that they can't get their work done. On these occasions special and sometimes drastic action is taken by the computer group, the software czar, or the designated vigilante (me).
Tips for Reducing Disk Usage
Avoid using precious USER disk space for files of transient interest like executables, load maps, and histogram files. As long as you keep using such files, they will stay on the temporary disks. If you keep them on the TEM disk (on the AXP), they will even be backed up before being deleted. The available amount of USER disk space is limited not only by the number of disks we can afford to devote to it, but by the logistics of backing it up. Delete your files off temporary disk as soon as they are no longer of use, even if they would be automatically deleted in a week or two anyhow. This can be a factor of two improvement in available space, since typically the useful lifetime of a file is less than half of the time before we autodelete it.
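To decide what to delete first, it helps to know where the space is going. The sketch below ranks the entries of a directory by size; the function name largest_items is hypothetical, and it assumes the standard du, sort, and head utilities. Note that the shell pattern used here does not match hidden files.

```shell
# largest_items DIR [N]
# Print the N largest entries directly under DIR (default 10),
# biggest first, with sizes in KB.
largest_items() {
    dir=$1
    n=${2:-10}
    # du -sk prints one "size path" line per entry; sort numerically, descending.
    du -sk "$dir"/* 2>/dev/null | sort -rn | head -n "$n"
}
```

For example, largest_items "$HOME" 5 lists the five largest files or directories in the home directory.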
Although tape drives are somewhat unreliable, in some circumstances they may be a viable alternative to disk.
Compressing: Sometimes UNIX users want to archive old source or histogram files in their USER disk space so they can conveniently refer to them. The UNIX commands gzip/gunzip typically reduce disk usage by a factor of 2 to 5. In brief, the command
gzip MYFILE
compresses the file to MYFILE.gz, and the command
gunzip MYFILE.gz
expands it again. Do
man gzip
for details. This technique can also be used to reduce the size of executables, load maps, log files, and libraries which you are afraid to delete, but to which you are unlikely to refer. For .fzx files gzip helps, but it only reduces files by one third. Nevertheless, CLEVER has built in the ability to read gz files directly, since a 1/3 reduction in a several hundred MB file is significant. Gzip typically does little good on .rp files, which are already pretty compact.
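Since the gain from gzip varies so much between file types, it can be worth measuring it before compressing anything. The sketch below reports how much a file would shrink without changing the file itself; the function name gzip_saving is hypothetical, and the sketch assumes a gzip that supports -c (write to standard output).

```shell
# gzip_saving FILE
# Print the original size, the gzip-compressed size, and the percentage
# saved. FILE is left untouched because gzip -c writes to standard output.
gzip_saving() {
    file=$1
    orig=$(wc -c < "$file" | tr -d ' ')
    comp=$(gzip -c "$file" | wc -c | tr -d ' ')
    # Guard against division by zero for empty files.
    if [ "$orig" -eq 0 ]; then
        echo "$file: empty"
        return 0
    fi
    saved=$(( (orig - comp) * 100 / orig ))
    echo "$file: $orig -> $comp bytes ($saved% saved)"
}
```

A file that reports only a few percent saved is probably not worth compressing.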
You can redirect the log output of a batch job to the TEM or STM disk instead of the default home disk with the -o switch. For example,
submit with the command:
qsub -o /tem/dlk/logfiles/ myjob
or include in the script:
#$ -o /tem/dlk/logfiles
In the example, the -o switch redirects the output to dlk's logfiles directory on the /tem disk. You can also specify a maximum file size in the script, beyond which the batch job will be terminated, for example if the job generates too many error messages. Use
ulimit -f (# of 512-byte blocks) # for ksh
ulimit -f (# of kilobytes) # for sh, bash
limit filesize (#k) # for [t]csh; k=KB, b=block, m=MB
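One way to keep such a limit from affecting the rest of a script is to set it inside a subshell, as in the sketch below. The function name run_capped is hypothetical. The units of the first argument are whatever the shell's ulimit -f uses, which differs between shells as listed above.

```shell
# run_capped BLOCKS COMMAND [ARGS...]
# Run COMMAND with a cap on the size of any file it writes.
# The limit is set inside a subshell (the parentheses), so it does not
# apply to anything that runs after run_capped returns.
run_capped() {
    blocks=$1
    shift
    (
        ulimit -f "$blocks" || exit 1
        "$@"
    )
}
```

A command that tries to write past the cap is stopped by the system, and run_capped then returns a nonzero exit status that the script can test.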
Netscape users: Netscape cache files take 5 MB of space on the AXP home disk by default. See "Controlling Netscape Caching" for strategies to reduce this waste of space.
Old mail files can take up a substantial amount of disk space. Prune occasionally and/or copy them to an archive which you gzip.
Some people have submitted huge numbers of batch jobs and then left town. This can result in the filling of all available disk space with data files or log files if something goes wrong or if queues are worked off faster than the submitter anticipated. Please do not submit large numbers of jobs unless you will be able to check up on their progress. If the computer group sees signs of such abuse, they will delete jobs and files at their discretion.
How much disk space is it reasonable to occupy?
Total available space and disk usage guidelines are summarized in Tables I and II. Responsible users will exceed the guidelines only occasionally, and then for short periods of time. If you need to exceed a guideline by a large amount or for a longer period than a day, please discuss possible alternatives with the software coordinator. The guidelines are, of course, subject to change as the supply of disk space and patterns of usage change. To check your usage of home disk space, type
in your root area on the AXPs, and
on the DECstations. The system will print your total disk space allotment in kilobytes. To see how this breaks down, type
TABLE I TOTAL AVAILABLE DISK SPACE, IN GB

     |  USER  |  TEM   |  MTM   |  STM   |
AXP  |  8.8   |  8.8   |   -    |  17.6  |
DS   |  3.7   |  0.9   |  1.4   |  0.9   |
VAX  |  2.2   |  4.2   |   -    |  2.8   |

TABLE II DISK USAGE GUIDELINES IN KB, [BLOCKS]

     |      USER      |       TEM        |  MTM   |       STM       |
AXP  | 40000          | 250000 (=0.25GB) | -      | 500000 (=0.5GB) |
DS   | 10000          | 50000            | 100000 | 150000          |
VAX  | ~6000 [~12000] | 100000           | -      | 250000          |