Matt Bishop
Technical Report: PCS-TR91-156
1991
This work was supported by grants NAG2-328 and NAG2-628 from the National Aeronautics and Space Administration to Dartmouth College.
The threat of attack by computer viruses is in reality a very small part of a much more general threat, specifically attacks aimed at subverting computer security. This paper examines computer viruses as malicious logic in a research and development environment, relates them to various models of security and integrity, and examines current research techniques aimed at controlling the threats viruses in particular, and malicious logic in general, pose to computer systems. Finally, a brief examination of the vulnerabilities of research and development systems that malicious logic and computer viruses may exploit is undertaken.
A computer virus is a sequence of instructions that copies itself into other programs in such a way that executing the program also executes that sequence of instructions. Rarely has something seemingly so esoteric captured the imagination of so many people; magazines from Business Week to the New England Journal of Medicine [39][48][60][72][135], books [20][22][31][40][50][67][83][90][108][124], and newspaper articles [85][91][92][94][114][128] have discussed viruses, applying the name to various types of malicious programs.
As a result, the term “computer virus” is often misunderstood. Worse, many who do understand it do not understand protection in computer systems, for example believing that conventional security mechanisms can prevent virus infections, or are flawed because they cannot. But computer viruses use a number of well-known techniques in an unusual order; they do not employ previously-unknown methods. So, although existing computer security mechanisms were not designed specifically to counter computer viruses, many of those mechanisms were designed to deal with techniques used by computer viruses. While security mechanisms cannot prevent computer virus infections any more than they can prevent all attacks, they can impede a virus’ spread as well as make the introduction of a computer virus difficult, just as they can limit the damage done in an attack, or make a successful attack very difficult. This paper tries to show the precise impact of many conventional security mechanisms on computer viruses by analyzing viruses in a general framework.
Because the probability of encountering a computer virus and the controls available to deal with it vary widely among different environments, this paper confines itself to that environment consisting of computers running operating systems designed for research and development, such as the UNIX operating system, the VAX/VMS operating system, and so forth. There is already a wealth of literature on computer viruses within the personal computing world (for example, see [34][62][65][124]), and a simple risk analysis (upon which we shall later elaborate) suggests that systems designed for accounting, inventory control, and other primarily business-oriented operations are less likely to be attacked with computer viruses than by other methods. So, while some of the following discussion may be fruitfully applied to computer systems in those environments (for example, see [1]), many of the underlying assumptions of system management and administration simply do not apply to those environments.
First, we shall review what a computer virus is, and analyze the properties that make it a threat to computer security. Next, we present a very brief history of computer viruses and consider whether their threat is relevant to research and development systems, and if so, how. After exploring some of the research in secure systems that show promise for coping with viruses, we examine several specific areas of vulnerability in research-oriented systems. We conclude with a quick summary.
Computer viruses do not appear spontaneously [25]; an attacker must introduce one to the targeted computer system, usually by persuading, or tricking, someone with legitimate access into placing the virus on the system. This can readily be done using a Trojan horse, a program which performs a stated function while performing another, unstated and usually undesirable one (see sidebar 1). For example, suppose a file used to boot a microcomputer contains a Trojan horse designed to erase a disk. When the microcomputer boots, it will execute the Trojan horse, which will erase the disk. Here, the overt function is to provide a basic operating system; the covert function is to erase the disk.
Many studies have shown the effectiveness of the Trojan horse attack (see [99][101], for example), and one such study [74] described a Trojan horse that reproduces itself (a replicating Trojan horse). If such a program infects another by inserting a copy of itself into the other file or process, it is a computer virus. (See sidebar 2; Leonard Adleman first called programs with the infection property “viruses” in a computer security seminar in 1983 [25].)
A computer virus infects other entities during its infection phase, and then performs some additional (possibly null) actions during its execution phase. Many view the infection phase as part of the “covert” action of a Trojan horse, and consequently consider the virus to be a form of the Trojan horse [44][69]. Others treat the infection phase as “overt” and distinguish between the virus and the Trojan horse, since a virus may infect and perform no covert action [25][97]. But all agree that a virus may perform covert actions during the execution phase.
Like Trojan horses [39], computer viruses are instances of malicious logic or malicious programs. Other programs which may be malicious but are not computer viruses are worms, which copy themselves from computer to computer; bacteria, which replicate until all available resources of the host computer are absorbed; and logic bombs, which are run when specific conditions, such as the date being Friday the 13th, hold.
Malicious logic uses the user’s rights to perform its functions; a computer virus will spread only as the user’s rights will allow it, and can only take those actions that the user may take, since operating systems cannot distinguish between intentional and unintended actions. As the programs containing viruses are shared among users, the viruses spread among those users [25][97] until all programs writable by any infected program are themselves infected [56].
A site’s security policy describes how users may access the computer system or information on it, and the policy’s nature depends largely on how the system is to be used. Military system security policies deal primarily with disclosure of information, whereas commercial security policies deal primarily with the integrity of data on a system.
Security mechanisms that enforce policies partition the system into protection domains which define the set of objects that processes may access. Mandatory access controls prevent processes from crossing protection domain boundaries. Discretionary access controls condition permission to cross domain boundaries upon both the process identity and information associated with the object to be accessed.
Policies using mandatory access controls to prevent disclosure define a linear ordering of security levels, and a set of classes into which information is placed. Each entity’s security classification is defined by the pair (security level, set of classes); the security classification of entity A dominates that of entity B if A’s security level is at least that of B and A’s set of classes contains all elements of B’s set of classes. Then the controls usually enforce some variant of the Bell-LaPadula model [9]: a subject may read an object only if the subject’s security classification dominates that of the object (the simple security property) and a subject may modify an object only if the object’s security classification dominates that of the subject (the *-property or the confinement property). Hence subjects may obtain information only from entities with “lower” security classifications, and may disclose information only to entities with a “higher” security classification. These controls limit malicious logic designed to disclose information to the relevant protection domain; they do not limit malicious logic designed to corrupt information in “higher” security classifications.
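The dominance relation and the two Bell-LaPadula checks described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the mechanism of any particular system: the names `Classification`, `dominates`, `may_read`, and `may_modify`, and the example levels and classes, are introduced here for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Classification:
    level: int            # position in the linear ordering of security levels
    classes: frozenset    # set of classes into which the information is placed


def dominates(a: Classification, b: Classification) -> bool:
    """A dominates B if A's level is at least B's and A's classes contain all of B's."""
    return a.level >= b.level and a.classes >= b.classes


def may_read(subject: Classification, obj: Classification) -> bool:
    """Simple security property: the subject must dominate the object."""
    return dominates(subject, obj)


def may_modify(subject: Classification, obj: Classification) -> bool:
    """*-property: the object must dominate the subject."""
    return dominates(obj, subject)


# Hypothetical classifications, chosen only for this example.
low = Classification(1, frozenset({"payroll"}))
high = Classification(2, frozenset({"payroll", "research"}))
```

With these definitions, `may_read(low, high)` is false, so malicious logic running at `low` cannot disclose the contents of the `high` entity, while `may_modify(low, high)` is true, so the same malicious logic may still corrupt it.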
Policies using discretionary access controls to limit disclosure assume that all processes of a given identity act with the authorization of that identity. When a program containing malicious logic is executed, the malicious logic executes with the same identity as that user’s legitimate processes. The protection mechanism has no way to distinguish between acts done for the user and acts done for the attacker by the malicious logic.
Policies using mandatory access controls to limit modification of entities often implement the mathematical dual of the multilevel security model described above. Multilevel integrity models define integrity levels and classes analogous to those of the multilevel security models; then controls may enforce the Biba integrity model [11], which allows a subject to read an entity only if the entity’s integrity classification dominates that of the subject (the simple integrity property), and a subject to modify an entity only if the subject’s integrity classification dominates that of the entity (the integrity confinement property). This prevents a subject from modifying data or other programs at a higher integrity level, and a subject from relying on data or other programs at a lower integrity level. Hence, malicious logic can only damage those entities with lower or equal integrity classifications.
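The two Biba rules, and the resulting bound on what malicious logic can damage, can be sketched as follows. This is an illustrative sketch only; `IntegrityLabel`, `damageable`, and the example entities and levels are hypothetical names and values that do not come from the model's original description.

```python
from typing import Dict, FrozenSet, List, NamedTuple


class IntegrityLabel(NamedTuple):
    level: int                 # integrity level (higher means more trusted)
    classes: FrozenSet[str]    # integrity classes


def dominates(a: IntegrityLabel, b: IntegrityLabel) -> bool:
    return a.level >= b.level and a.classes >= b.classes


def may_read(subject: IntegrityLabel, entity: IntegrityLabel) -> bool:
    """Simple integrity property: the entity must dominate the subject."""
    return dominates(entity, subject)


def may_modify(subject: IntegrityLabel, entity: IntegrityLabel) -> bool:
    """Integrity confinement property: the subject must dominate the entity."""
    return dominates(subject, entity)


def damageable(subject: IntegrityLabel, entities: Dict[str, IntegrityLabel]) -> List[str]:
    """Names of the entities that malicious logic running as `subject` could alter."""
    return sorted(name for name, label in entities.items() if may_modify(subject, label))


# Hypothetical system with three entities at different integrity levels.
no_classes: FrozenSet[str] = frozenset()
entities = {
    "kernel": IntegrityLabel(3, no_classes),
    "editor": IntegrityLabel(2, no_classes),
    "game": IntegrityLabel(1, no_classes),
}
user_process = IntegrityLabel(2, no_classes)
```

Here `damageable(user_process, entities)` lists only the editor and the game: an infected process at level 2 cannot reach the kernel at level 3.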
Lipner has proposed using the multilevel disclosure model to enforce multilevel integrity by assigning classifications and levels to appropriate user communities [87]; however, he notes that malicious logic could “write up” and thereby infect programs or alter production data and code. Clark and Wilson have proposed an alternate model [24] in which data and programs are manipulated by well-defined “transformation procedures,” these procedures having been certified by the system security officer as complying with the site integrity policy. Hence computer viruses could only propagate among production programs if a transformation procedure which contains one is itself certified to conform to the integrity policy.
Policies using discretionary access controls to limit modification of entities make the same assumptions as security policies using discretionary access controls, with similar results.
Systems implementing multilevel security and integrity policies usually allow some small set of trusted entities to violate the stated policy when necessary for the smooth operation of the computer system. The usefulness of whatever security model the system implements depends to a very great extent on these exceptions; for should a trusted entity attempt to abuse its power to deviate from the strict policy, little can be done. The statements describing the effects of the controls on malicious logic above apply only to the model, and must be suitably modified for those situations in which a security policy allows (trusted) entities to violate the policy.
The two phases of a computer virus’ execution illustrate this. Infecting (altering) a program may be possible due to an allowed exception to the site’s integrity model. Executing a computer virus to disclose some information across protection domain boundaries may also be possible because of an allowed exception to the site’s disclosure model. So the virus may spread more widely because of the allowed exceptions.
An alternate view of malicious logic is that it causes the altered program to deviate from its specification. If this is considered an “error” as well as a breach of security, fault-tolerant computer systems, which are designed to continue reliable operation when errors occur, could constrain malicious logic. Designers of reliable systems place emphasis on both recovery and preventing failures [106]; however, if malicious logic discloses information or gives away rights, or controls other critical systems (such as life support systems), recovery may not be possible. So the areas of reliability and fault-tolerance are relevant to the study of malicious logic, but those areas of fault recovery are less so.
In the most general case, whether a given program will infect another is undecidable [2][25], so programs that look for virus infections must check characteristics of known viruses rather than rely on a general infection detection scheme. Further, viruses can be programmed to mutate, and hence be able to evade those agents, which in turn can be programmed to detect the mutations; and in the general case, whether or not one virus mutated to produce another virus is also undecidable [30].
One of the earliest documented replicating Trojan horses was a version of the game program animal which, when played, created another copy of itself. A later version deleted one copy of the first version, and then created two additional copies of itself. Because it spread even more rapidly than the first version, this later program supplanted the first entirely. After a preset date, whenever anyone played the second version, it deleted itself after the game ended [41].
Ken Thompson created a far more subtle replicating Trojan horse when he rigged a compiler to break login security [107][127]. When the compiler compiled the login program, it would secretly insert instructions to cause the resulting executable program to accept a fixed, secret password as well as a user’s real password. Also, when compiling the compiler, the Trojan horse would insert commands to modify the login command into the resulting executable compiler. Thompson then compiled the compiler, deleted the new source, and reinstalled the old source. Since it showed no traces of being doctored, anyone examining the source would conclude the compiler was safe. Fortunately, Thompson took some pains to ensure that it did not spread further, and it was finally deleted when someone copied another version of the executable compiler over the sabotaged one. Thompson’s point was that “no amount of source-level verification or scrutiny will protect you from using untrusted code” ([127], p. 763), which bears remembering, especially given the reliance of many security techniques on humans certifying programs to be free of malicious logic.
In 1983, Fred Cohen designed a computer virus to acquire privileges on a VAX-11/750 running UNIX; he obtained all system rights within half an hour on average, the longest time being an hour, and the least being under 5 minutes. Because the virus did not degrade response time noticeably, most users never knew the system was under attack. In 1984 an experiment involving a UNIVAC 1108 showed that viruses could spread throughout that system too. Viruses were also written for other systems (TOPS-20, VAX/VMS, and a VM/370 system) but testing their effectiveness was forbidden. Cohen’s experiments indicated that the security mechanisms of those systems did little if anything to inhibit computer virus propagation [25][26].
In 1987, Tom Duff experimented on UNIX systems with a small virus that copied itself into executable files. The virus was not particularly virulent, but when Duff placed 48 infected programs on the most heavily used machine in the computing center, the virus spread to 46 different systems and infected 466 files, including at least one system program on each computer system, within eight days. Duff did not violate the security mechanisms in any way when he seeded the original 48 programs [45]. By writing another virus in a language used by a command interpreter common to most UNIX systems, he disproved a common fallacy [50] that computer viruses are intrinsically machine dependent, and cannot spread to systems of varying architectures.
On November 2, 1988, a program combining elements of a computer worm and a computer virus targeting Berkeley and Sun UNIX-based computers entered the Internet; within hours, it had rendered several thousand computers unusable [46][47][109][117][118][122][123][125]. Among other techniques, this program used a virus-like attack to spread: it inserted some instructions into a running process on the target machine and arranged for those instructions to be executed. To recover, these machines had to be disconnected from the network, rebooted, and several critical programs changed and recompiled to prevent re-infection. Worse, the only way to determine if the program had other malicious side effects (such as deleting files) was to disassemble it. Fortunately, its only purpose turned out to be to propagate. Infected sites were extremely lucky that the worm did not infect a system program with a virus designed to delete files, or did not attempt to damage attacked systems. Since then, there have been several incidents involving worms [59][66][125].
In general, though, computer viruses and replicating Trojan horses have been laboratory experiments rather than attacks from malicious or careless users. This raises a question of risk analysis: do the benefits gained in defending against computer viruses offset the costs of recovery and the likelihood of being attacked?
As worded, the above question implies that the mechanisms defending against computer viruses are useful only against computer viruses. However, computer viruses use techniques that are also used in other methods of attack, such as scavenging, as well as by other forms of malicious logic. Defenses which strengthen access controls to prevent illicit access, or which prevent or detect the alteration of other files, also limit, prevent, or detect these other attacks. So, a more appropriate question is whether the benefits gained in defending against all such attacks offset the costs of recovery and the likelihood of being attacked.
Because this paper focuses primarily on computer viruses, we shall not delve into the history of computer security or malicious logic in general. Suffice it to say that the vulnerability of computer systems to such attacks is well known, and attacks on computer systems are common enough (see both [99] and [101] for descriptions of such incidents) that the use of mechanisms to inhibit them is generally agreed to be worthwhile.
The effectiveness of any security mechanism depends upon the security of the underlying base on which the mechanism is implemented, and the correctness of the necessary checking done at each step. If the trust in the base or in the checking is misplaced, the mechanism will not be secure. Thus “secure” is a relative notion, as is “trust,” and mechanisms to enhance computer security attempt to balance the cost of the mechanism with the level of security desired and the degree of trust in the base that the site accepts as reasonable. Research dealing with malicious logic assumes the interface, software, and/or hardware used to implement the proposed scheme performs exactly as desired, meaning the trust is in the underlying computing base, the implementation, and (if done) the verification.
Current research uses specific properties of computer viruses to detect and limit their effects. Because of the fundamental nature of these properties, these defenses work equally well against most other forms of malicious logic.
Techniques exploiting the property that viruses treat programs as both data and instructions classify all programs as type “data” until some certifying authority changes the type to “executable” (instructions). Both new systems designed to meet strong security policies and enhancements to existing systems use this method.
Boebert and Kain [18] have proposed labelling subjects and objects in the Logical Coprocessor Kernel or LOCK (formerly the Secure Ada Target or SAT) [17][61][112][113], a system designed to meet the highest level of security under the Department of Defense criteria [43]. Once compiled, programs have the label “data,” and cannot be executed until a sequence of specific, auditable events changes the label to “executable.” After that, the program cannot be modified. This scheme recognizes that viruses treat programs as data (when they infect them by changing the file’s contents) and as instructions (when the program executes and spreads the virus), and rigidly separates the two. The Argus Security Model [3] uses the same principle.
Duff [45] has suggested a variant for UNIX-based systems. Noting that users with execute permission for a file usually also have read permission, he proposes that files with execute permission be of type “executable,” and those without it be of type “data.” Unlike the LOCK, “executable” files could be modified but doing so would change the type to “data.” If the certifying authority were the omnipotent user, the virus could spread only if run as that user. To prevent infection from non-executable files, libraries and other system components of programs must also be certified before use.
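The life cycle of a file under the UNIX variant just described can be sketched as a small state machine. This is a toy model under stated assumptions, not Duff's implementation: the class `LabelledFile`, its methods, and the certifier name `"root"` are hypothetical, and real permission bits are not consulted.

```python
CERTIFIER = "root"  # assumption: the omnipotent user is the only certifying authority


class LabelledFile:
    """Toy model of a file whose type is either "data" or "executable"."""

    def __init__(self, contents: bytes):
        self.contents = contents
        self.type = "data"  # every file starts out as data

    def write(self, contents: bytes) -> None:
        """Modification is allowed, but it always revokes the certification."""
        self.contents = contents
        self.type = "data"

    def certify(self, user: str) -> None:
        """Only the certifying authority may change the type to executable."""
        if user != CERTIFIER:
            raise PermissionError("only the certifying authority may certify")
        self.type = "executable"

    def execute(self) -> str:
        """Files of type data cannot be run."""
        if self.type != "executable":
            raise PermissionError("file is of type data")
        return "ran"
```

In this model a virus that rewrites a certified program turns it back into data, so the altered program cannot run again until the certifying authority certifies it anew.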
Both the LOCK scheme and Duff’s proposal trust that the administrators will never certify a program containing malicious logic (either by accident or deliberately), and that the tools used in the certification process are not themselves corrupt.
Among the many enhancements to discretionary access controls are suggestions to allow the user to reduce the associated protection domain [29][72][121][134]; to base access to files on some characteristic of the command or program [27][81], possibly including subject authorizations as well [25]; and to use a knowledge-based subsystem to determine if a program makes reasonable file accesses [73]. Allowing users to specify semantics for file accesses [10][36] may prove useful in some contexts, for example protecting a limited set of files.
All such mechanisms trust the users to take explicit action to limit their protection domains sufficiently; or trust tables to describe the programs’ expected actions sufficiently for the mechanism to apply those descriptions, and the mechanism to handle commands with no corresponding table entries effectively; or they trust specific programs and the kernel, when those would be the first programs a virus would attack.
Inhibiting users in different protection domains from sharing programs or data will inhibit viruses from spreading among those domains. For example, when users share procedures, the LOCK keeps only one copy of the procedure in memory. A master directory, accessible only to a trusted hardware controller, associates with each procedure a unique owner, and with each user a list of others whom that user trusts. Before executing any procedure, the dynamic linker checks that the user executing the procedure trusts the procedure’s owner [16]. This scheme assumes that users’ trust in one another is always well-placed.
A more general proposal [137] suggests placing programs to be protected at the lowest possible level of an implementation of a multilevel security policy. Since the mandatory access controls will prevent processes at higher levels from writing to objects at lower levels, any process can read the programs but no process can write to them. Such a scheme would have to be combined with an integrity model so that the protection against viruses covers both disclosure and file corruption. Carrying this idea to its extreme would result in isolation of each domain; since sharing is not possible, no viruses can propagate. Unfortunately, the usefulness of such systems would be minimal.
Mechanisms using manipulation detection codes (or MDCs) apply some function to a file to obtain a set of bits called the signature block and then encrypt that block. If, after recomputing the signature block and reencrypting it, the result differs from the stored signature block, the file has changed [86][95], possibly due to infection or some other cause not related to viruses.
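A manipulation detection check of this kind can be sketched with the Python standard library. Note the substitution: rather than computing a signature block and then encrypting it, the sketch uses a keyed hash (HMAC-SHA256), which serves the same purpose of binding the block to a secret key. The function names `signature_block` and `unchanged` are hypothetical.

```python
import hashlib
import hmac


def signature_block(data: bytes, key: bytes) -> bytes:
    """Keyed digest of the file contents; stands in for the encrypted signature block."""
    return hmac.new(key, data, hashlib.sha256).digest()


def unchanged(data: bytes, key: bytes, stored: bytes) -> bool:
    """Recompute the block and compare it with the stored one in constant time."""
    return hmac.compare_digest(signature_block(data, key), stored)
```

A mismatch shows only that the file differs from the version that was signed; it does not say whether a virus or some unrelated change is the cause.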
An assumption is that the signed file does not contain a virus before it is signed. Page [100] has suggested expanding the model in [17] to include the software development process (in effect limiting execution domains for each development tool and user) to ensure software is not contaminated during development. Pozzo and Gray [104][105] have implemented Biba’s integrity model on the distributed operating system LOCUS [103] to make the level of trust in the above assumption explicit. They have different classes of signed executable programs. Credibility ratings (Biba’s “integrity levels”) assign a measure of trustworthiness on a scale of 0 (unsigned) to N (signed and formally verified), based on the origin of the software. Trusted file systems contain only signed executable files with the same credibility level. Associated with each user (subject) is a risk level that starts out as the highest credibility level. Users may execute programs with credibility levels no less than their risk level; when the credibility level is lower than the risk level, a special “run-untrusted” command must be used.
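The execution rule relating credibility levels to risk levels can be written as a single predicate. This is a sketch under assumptions: the scale bound `N = 3`, the function name `may_execute`, and the modelling of the “run-untrusted” command as a boolean flag are all hypothetical, and whatever else that command does is not modelled.

```python
N = 3  # assumption: highest credibility level (signed and formally verified)


def may_execute(credibility: int, risk_level: int, run_untrusted: bool = False) -> bool:
    """Decide whether a user at `risk_level` may run a program of the given credibility."""
    if not 0 <= credibility <= N:
        raise ValueError("credibility must lie between 0 (unsigned) and N")
    if credibility >= risk_level:
        return True          # the program is at least as credible as the user requires
    return run_untrusted     # otherwise only the explicit run-untrusted path is allowed
```

A user whose risk level is still at its initial value of `N` can therefore run only fully verified programs unless the untrusted path is requested explicitly.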
All integrity-based schemes rely on software which if infected may fail to report tampering. Performance will be affected as encrypting the file or computing the signature block may take a significant amount of time. The encrypting key must also be secret, for if not then malicious logic can easily alter a signed file without the change being detected.
Network implementations of MDC-based mechanisms require that public keys be certified by a trusted authority and distributed in a trusted fashion (see for example [15][75]). If the key distribution mechanism used the same paths as the data transmission and the public keys were not verifiable using an out-of-band method, a malicious site (or set of cooperating malicious sites) could alter the data or program being sent, recompute the signature block and sign it with its own (bogus) private key, and then transmit the data; when the public key was requested, it would simply send the one corresponding to the (bogus) private key. The more general (non-network) software distribution problem has similar requirements [35].
Anti-virus agents check files for specific viruses and if present either warn the user or attempt to “cure” the infection by removing the virus. Many such agents exist for personal computers, but since each must look for a particular virus or set of viruses, they are very specific tools and, because of the undecidability results stated earlier, cannot deal with viruses not yet analyzed.
Fault-tolerant techniques keep systems functioning correctly when the software or hardware fails to perform to specification. Joseph and Avižienis have suggested treating a virus’ infection and execution phases as errors. The first such proposal [70][71] breaks programs into sequences of non-branching instructions, and checksums each sequence, storing the results in encrypted form. When the program is run, the processor recomputes checksums, and at each branch, a co-processor compares the computed checksum to the encrypted checksum; if they differ, an error (which may be an infection) has occurred. Later proposals advocate checking each instruction [35]. These schemes raise issues of key management and protection, as well as how much the software managing keys, transmitting the control flow graph to the co-processor, and implementing the recovery mechanism, may be trusted.
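The idea of checksumming each non-branching sequence can be illustrated in software. The sketch below is only a static approximation of the proposal: it checks a whole instruction listing at once rather than at each branch during execution, it uses a keyed hash (HMAC-SHA256) in place of an encrypted checksum, and the branch mnemonics in `BRANCHES` and all function names are hypothetical.

```python
import hashlib
import hmac

BRANCHES = ("jmp", "beq", "bne", "call", "ret")  # hypothetical branch mnemonics


def split_blocks(instructions):
    """Split a listing into sequences that each end at a branch instruction."""
    blocks, current = [], []
    for ins in instructions:
        current.append(ins)
        if ins.split()[0] in BRANCHES:
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks


def protect(instructions, key):
    """Keyed checksum of every non-branching sequence, in program order."""
    return [
        hmac.new(key, "\n".join(block).encode(), hashlib.sha256).digest()
        for block in split_blocks(instructions)
    ]


def first_bad_block(instructions, key, stored):
    """Index of the first sequence whose checksum differs, or None if all match."""
    computed = protect(instructions, key)
    for i, (c, s) in enumerate(zip(computed, stored)):
        if not hmac.compare_digest(c, s):
            return i
    if len(computed) != len(stored):
        return min(len(computed), len(stored))
    return None
```

Altering any instruction changes the checksum of the sequence containing it, and inserting a new branch shifts the block boundaries, so both kinds of change are reported.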
A proposal based on N-Version Programming [5] requires implementing several different versions of an algorithm, running them concurrently and periodically checking intermediate results against each other. If they disagree, the value assumed correct is the intermediate value that a majority of the programs have obtained, and the programs with a different value are malfunctioning (possibly due to malicious logic). This requires a majority of the programs not to be infected, and the underlying operating system to be secure. Also, the efficacy of N-version programming is itself highly questionable [77]. Despite claims that the method is feasible [6][23], detecting the spread of a virus would require voting upon each file system access; to achieve this level of comparison, the programs would all have to implement the same algorithm, which defeats the purpose of using N-version programming [78].
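The voting step can be sketched as follows. The three versions here are hypothetical stand-ins (two independent implementations of a sum of squares and one deliberately tampered copy), and `vote` is a name introduced for this illustration; a real system would vote on intermediate values while the versions run concurrently.

```python
from collections import Counter


def vote(results):
    """Return the majority value and the indices of the versions that disagree with it."""
    value, count = Counter(results).most_common(1)[0]
    if count * 2 <= len(results):
        raise RuntimeError("no value was obtained by a majority of versions")
    dissenters = [i for i, r in enumerate(results) if r != value]
    return value, dissenters


def version_a(xs):
    return sum(x * x for x in xs)


def version_b(xs):
    total = 0
    for x in xs:
        total += x ** 2
    return total


def version_c_tampered(xs):
    # Stands in for a version whose result has been altered by malicious logic.
    return sum(x * x for x in xs) + 1


versions = [version_a, version_b, version_c_tampered]
results = [v([1, 2, 3]) for v in versions]
```

Calling `vote(results)` accepts the value computed by the first two versions and flags the third. If two of the three versions were tampered with in the same way, the vote would accept the wrong value, which is why a majority must remain uninfected.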
Proposals to examine the appearance of programs for identical sequences of instructions or byte patterns [69][137] require a high number of comparisons and would need to take into account the reuse of common library routines or of code [76]. Malicious logic might be present if a program appears to have more programmers than were known to have worked on it, or if one particular programmer appears to have worked on many different and unrelated programs [137]; but several assumptions must first be validated, namely that programmers have their own individual styles of writing programs, that the executable programs generated by the compilers will reflect these styles, and that a coding style analyzer can distinguish these styles from one another. If an object file contains conditionals not corresponding to any in the source, the object may be infected [54]. A fourth proposal suggests designing a filter to detect, analyze, and classify all modifications that a program will make as ordinary or suspicious [32].
Finally, Dorothy Denning has suggested using an intrusion-detection expert system to detect viruses by looking for increases in the size of files, increases in the frequency of writing to executable files, or alterations in the frequency of executing a specific program in ways not matching the profile of users spreading the infection [38]. Several such systems have been implemented [8][88][126] and have detected many anomalies without noticeably degrading the monitored computer. These experiments did not attempt to validate claims about detecting viruses.
Those research proposals that are being implemented are either targeted for specific architectures or are in the very early stages of development. This state of affairs is unsettling for the managers and administrators of existing systems, who need to take some action to protect their users and systems.
The vulnerabilities exploited by a computer virus can also be exploited by other forms of malicious logic, and unless the purpose of the attack is to cause mischief, the other forms of malicious logic are much easier to create. Rather than describe appropriate countermeasures, we simply note that these will differ from environment to environment, and no such list (or even set of lists) can accurately reflect the idiosyncrasies of all the different research and development systems and environments; in short, providing such a generic list could give a very false sense of security.
This section discusses the areas of vulnerability. While we emphasize computer viruses throughout, these same vulnerabilities can be exploited by Trojan horses, computer worms, other forms of malicious logic, and, more generally, other types of attacks. We leave it to the reader to formulate appropriate techniques to detect or hinder attacks exploiting each area. (Sidebar 3 offers a starting point for UNIX-based systems.)
Users assume that the computer system provides a set of trustworthy tools for compiling, linking and loading, and running programs. In most systems, the “trust” is the user’s estimate of the quality of the tools available [28] and the working environment. If the estimates are incorrect, the system may be subverted.
Even systems with security enhancements are vulnerable. One version of the UNIX operating system with security enhancements was breached when a user created a version of the directory lister, with a Trojan horse, in his home directory. He then requested assistance from the system operator, who changed to the user’s home directory, and listed the names of the files in it. As the command interpreter checked for commands in the current working directory and then in the system directories, the user’s doctored lister, not the system lister, was executed [120].
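The effect of the search order can be reproduced with a toy command resolver. This sketch does not model any real command interpreter: `resolve` only looks for a file of the right name in each directory in turn, and the directory names `home_user` and `bin` are hypothetical stand-ins for the user's home directory and a system directory.

```python
import os
import tempfile


def resolve(command, search_path):
    """Return the first file named `command` found along `search_path`, or None."""
    for directory in search_path:
        candidate = os.path.join(directory, command)
        if os.path.isfile(candidate):
            return candidate
    return None


def demo():
    """Show which directory lister wins under two different search orders."""
    with tempfile.TemporaryDirectory() as root:
        home = os.path.join(root, "home_user")   # current working directory
        sysbin = os.path.join(root, "bin")       # system directory
        for directory in (home, sysbin):
            os.mkdir(directory)
            with open(os.path.join(directory, "ls"), "w") as f:
                f.write("# placeholder for a directory lister\n")
        current_first = resolve("ls", [home, sysbin])
        system_first = resolve("ls", [sysbin, home])
        return (
            os.path.basename(os.path.dirname(current_first)),
            os.path.basename(os.path.dirname(system_first)),
        )
```

When the current working directory is searched first, the copy in the user's directory is the one selected; reversing the order selects the system copy.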
In the above, the system administrator trusted the command interpreter to look for system programs before executing programs in users’ directories. Other examples include trusting that the login banner being presented is actually from the login program and not from a user’s program which will record passwords [58], or that page faults cannot be detected while checking passwords one character at a time [82].
Intimately bound with the notion of trust is the ability to share. When many computers share a copy of an infected program, every file accessible from every one of those machines can be infected. Methods of sharing include making and distributing copies of software, accessing bulletin board systems and public file servers, and obtaining source files from remote hosts using a network or electronic mail.
The probability of any new program containing malicious logic depends on the integrity of the author (or authors), on the security and integrity of the computers on which they worked and on which the distribution was prepared, and on the method of distribution. Programs sent through electronic mail or posted to bulletin boards may be altered in transit, either by someone modifying them while they sit on an intermediate node, or while they are crossing networks [133]. Further, electronic messages can easily be forged [116][132], so it is unwise to rely on such a program’s stated origin.
In the early 1980s a program posted to the USENET news network contained a command to delete all files on the system in which it was run. Some system administrators executed the program with unlimited privileges, thereby damaging their systems. In another case, although vendors usually take care that their software contains no malicious logic, a company selling software for the Macintosh (footnote 9) unwittingly delivered copies of programs infected by a computer virus which printed a message asking for universal peace [51].
The infection phase of a virus’ actions requires writing to files; for reasons discussed earlier, discretionary access controls provide little protection. Typically some form of auditing is used to detect changes [14][19]; however, auditing schemes cannot prevent damage, but only attempt to provide a record of it and (possibly) indicate the culprit. The best auditing methods use a mechanism that records changes to files or their characteristics. Such schemes require kernel modifications [102] and should be designed into new systems [57][79][96]; if a site has only object code, it cannot add these mechanisms and so must scan the file system [13]. Audit logs must also be protected from illicit modification; again, an element of trust in the underlying subsystem is needed.
A computer virus can defeat any auditing scheme by infecting a file and then altering the file’s contents or characteristics during the audit, for example by restoring the uncorrupted version temporarily. An example of such a stealth virus is the 4096 (personal computer) virus [89].
No program can determine if an arbitrary virus has infected a file because of the undecidability results cited earlier; however, virus detectors or anti-virus agents can check files for specific viruses. If a virus detector reports that no infection is present, the file may contain a virus unknown to the detector, or the detector may be corrupt. In February 1989, at Dartmouth College, a user ran an infected version of the virus detection program Interferon, infecting files on his disk. More widely known is the Trojan horse in a doctored copy of the anti-virus program FLUSHOT [64]; later versions are called FSP+ to avoid confusion with the tampered version [7].
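The limitation described above, that a detector can recognize only the viruses it already knows, can be illustrated with a minimal sketch of signature scanning in Python. The names `KNOWN_SIGNATURES`, `scan_bytes`, and `scan_file`, and the byte patterns themselves, are hypothetical and invented for illustration; they are not taken from any real detector or virus.

```python
# Hypothetical signature table: the names and byte patterns are made up
# for illustration and do not correspond to any real virus.
KNOWN_SIGNATURES = {
    "example-virus-a": b"\xde\xad\xbe\xef\x01\x02",
    "example-virus-b": b"INFECT-MARKER",
}


def scan_bytes(data, signatures=KNOWN_SIGNATURES):
    """Return the sorted names of all known signatures that occur in data."""
    return sorted(name for name, pattern in signatures.items() if pattern in data)


def scan_file(path, signatures=KNOWN_SIGNATURES):
    """Read a file in binary mode and scan its contents for known signatures."""
    with open(path, "rb") as handle:
        return scan_bytes(handle.read(), signatures)
```

An empty result says only that none of the listed patterns was found. It does not show that the file is uninfected, and it says nothing about whether the scanner itself has been tampered with.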
Using backups to replace infected files, or files which contain malicious logic, may remove such programs from the system. As most systems make backup copies of files which have changed since the time the previous backup was made, it is quite likely that several backups will need to be examined to find an uncontaminated version of the infected program. Further, unless all malicious programs are found and restored at the same time, the restoration of some uncorrupted programs may do little (for example, computer viruses still resident on the system could infect the newly-restored programs).
If the backup and restore programs themselves contain malicious logic that prevents uncorrupted software from being restored, then the backups are useless until a way is found to replace (or fix) the restore program. Worse, some research and development systems (such as variants of the UNIX operating system) do not allow users to “lock” devices, so one user can access media mounted by another user. Thus, between the mounting and the attempt to restore, another program containing malicious logic could easily infect or erase a mounted backup.
It has been said that computer viruses are a management issue, because they are introduced by people [37]; the same may be said for all malicious logic, and computer security in general. Ideally, security procedures should balance the security and safety of the system and data with the needs of the users and systems personnel to get work done. All too often, users (and systems personnel) see them as burdens to be evaded. Lack of awareness of the reasons for security procedures and mechanisms leads to carelessness or negligence, which can in turn lead to system compromise (see for example [101]).
Little if anything can be done to prevent compromise by trusted personnel. Malicious users and system administrators can often circumvent security policy restrictions without being stopped, or even detected, by using the exceptions to the mechanisms enforcing the policies. (See [99] for examples of these “inside jobs.”) The study of computing ethics, or of a code of ethical conduct, reduces this threat by making clear what actions are considered acceptable; should a breach occur, legal remedies may be available [55][111].
Multi-user computer systems often provide many different levels of privilege; for example, UNIX provides a separate set of privileges for each user, and one all-powerful superuser. Enforcing the principle of least privilege [110] can limit the files that malicious logic can read or write.
If someone using a privileged account accidentally executes a program containing a computer virus, the virus will spread throughout the system rapidly [45]. Hence, simply logging in as a privileged user and remaining so empowered increases the possibility of accidentally triggering some form of malicious logic. More subtle is the use of programs which can cross protection domain boundaries; when the boundary being crossed involves the addition of a privilege or capability that enables the user to affect objects in many other protection domains (such as changing from an unprivileged to a privileged mode), a malicious program could read or alter data or programs not normally accessible to the user. In general, computer systems do not force such programs to function with as few privileges as possible. For example, the setuid and setgid mechanisms of UNIX [12][21][84] violate this principle.
A related but widely-ignored problem is the use of “smart” terminals to access privileged accounts. These terminals will respond to control sequences from a host by transmitting portions of the text on their screen back to the host [52], and often perform simple editing functions for the host. Such a terminal can issue a computer virus’ commands in the name of the terminal’s user when appropriate text and control sequences are sent to it (for example, by using an inter-terminal communications program or displaying files with appropriate characters in them). These commands could instruct the computer to execute an infected program, which would run in the protection domain of the user of the terminal (and not that of the attacker). As many computers use such terminals as their consoles, and allow access to the most privileged accounts only when the user is at the console, the danger is obvious.
The principle of complete mediation [110] requires checking the validity of every access. Although multi-user systems have virtual memory protection to prevent processes from writing into each other’s memory, some represent devices and memory as addressable objects (such as files). If these objects are improperly or inadequately protected, a process could bypass the virtual memory controls and write to any location in memory by placing data and addresses on the bus, thereby altering the instructions and data in another’s memory space (the “core war” games [42] did this). If any process could write to disks without the kernel’s intervention, anyone could change executable programs regardless of their protection – and a virus could easily spread by taking advantage of the (lack of) protection.
This paper has described the threats that computer viruses pose to research and development multi-user computer systems; it has attempted to tie those programs to other, usually simpler, programs that can have equally devastating effects. Although reports of malicious programs in general abound, no non-experimental computer viruses have been reported on mainframe systems (footnote 10). Noting that the number of people with access to mainframes is relatively small compared to the number with access to personal computers [130], Highland suggests that as malicious people make up a very small fraction of all computer programmers, most likely fewer malicious people use research and development systems than personal computers [64]. A more persuasive argument, advanced by Fåk [49] and supported by Kurzban [80], is that, as only programmers can create computer viruses, and malicious mainframe programmers can accomplish their goals with less trouble than writing a computer virus, computer virus attacks will most likely be confined to personal computers. Exceptions would most likely be motivated by a perceived intellectual challenge of creating a virus, by a desire to demonstrate limits of existing security mechanisms, or by a desire for publicity, or would be attacks launched simply by carelessness or error [98] (footnote 11).
Should an attacker use a computer virus or other malicious program, security mechanisms currently in use will be as effective as they are against other types of attacks. As with attempts to breach security in general, though, people can prepare for such an attack and minimize the damage done. This paper has described several vulnerabilities in the research and development environment that malicious programs could exploit, and also discussed research underway to improve defenses against malicious logic. How effective these new mechanisms will be in reducing the vulnerabilities, only time will tell.
Thanks to Holly Bishop, Ken Bogart, André Bondi, Emily Bryant, Peter Denning, Donald Johnson, John Rushby, Eugene Spafford, Ken Van Wyk, and the anonymous referees, all of whose comments and advice improved the quality of the paper greatly. Josh Alden of the Dartmouth Virus Clinic described the Interferon infection incident, Robert Van Cleef and Gene Spafford helped reconstruct the USENET logic bomb incident, and Ken Thompson confirmed that he had indeed doctored an internal version of the C compiler as described in [127]. My thanks to them also.
There are many contradictory versions of this story; it appears only briefly in The Odyssey ([68], Book VIII), but later writers elaborated it considerably. Aeneas, a Trojan survivor of the sacking of the city, told the following version to Queen Dido of Carthage during his wanderings that ended with the founding of Rome ([131], Book II).
After many years of besieging Troy and failing to take the city, the Greeks, on the advice of Athene, their patron goddess, built a large wooden horse in which many Greek soldiers hid. The horse was inscribed with a prayer to Athene to grant the Greeks safe passage home, and then the Greek army left.
The next morning, the Trojans discovered the siege had been lifted and went to examine the wooden horse. One of the elders, Thymoetes, noticed the inscription, and urged the horse be brought into the city and placed in Athene’s temple. Others counseled that the horse must be destroyed; Laocoon, a priest of Apollo, threw a spear against the horse’s belly as he cried that he did not trust Greeks bearing gifts.
Meanwhile, shepherds allied with the Trojans brought over a Greek soldier named Sinon. Sinon explained that the Greeks had desecrated Apollo’s shrine and killed a virgin attendant in a raid, so to appease Apollo they had to sacrifice one of their men. Sinon was chosen. He promptly fled and was abandoned when the Greeks left for home. As for the horse, Sinon claimed that one night Odysseus and Diomede desecrated Athene’s shrine, turning their protecting goddess against them. Calchas, the Greeks’ priest, advised that the horse must be built to appease the goddess before they could leave; and the horse was made so big to keep the Trojans from moving it into their city, for if they did their triumph over the Greeks would be assured.
At that moment, two sea serpents slithered out of the waters and crushed Laocoon and his sons to death. Believing this to be retribution for his profaning an offering to Athene, the Trojans immediately breached the walls of the city and pulled the horse inside.
That night, as the Trojans celebrated, they did not notice Sinon slip out to the horse and open a trap door through which the Greek soldiers emerged, nor did they see the Greeks opening the gates to the city. The Greek forces had by this time returned, and they sacked the city. Aeneas and his companions alone escaped.
This pseudocode fragment shows how a very simple computer virus works:
beginvirus:
    if spread-condition then begin
        for some set of target files do begin
            if target is not infected then begin
                determine where to place virus instructions
                copy instructions from beginvirus to endvirus into target
                alter target to execute added instructions
            end;
        end;
    end;
    perform some action
    goto beginning of infected program
endvirus:
First, the virus determines if it is to spread; if so, it locates a set of target files it is to infect, and copies itself into a convenient location within the target file. It then alters portions of the target to ensure the inserted code will be executed at some time. For example, the virus may append itself just beyond the end of the instruction space and then adjust the entry points used by the loader so that the added instructions will execute when the target program is next run. This is the infection phase. It then performs some other action (the execution phase). Finally, it returns control to the program currently being run. Note that the execution phase can be null and the instructions still constitute a virus; but if the infection phase is missing, the instructions are not a virus.
The Lehigh virus [62] had as a spread-condition that “there is an uninfected boot file on the disk;” the set of target files was “the uninfected boot file,” and perform some action was to increment a counter and test to see if the counter had reached 4; if so, it would erase the disk.
This list of suggestions, intended as a starting point for a basic, “vanilla” UNIX-based computer system, may help prevent the introduction of malicious logic, like computer viruses, into the computer system, and also lessen the chances of accidentally invoking programs with that type of logic. Attackers can render these methods ineffective because the weaknesses they seek to patch are fundamental to the design and use of the computer system, and anything effective would require changing the system more than is practical. Still, following these suggestions may help.
More details on UNIX security in general may be found in [33], [50], [53], and [136].
The UNIX shell checks the value of the variable PATH for a list of directories to check for programs. The system administrator had put the current working directory before the system directories in the example in §6.1. Hence the user’s directory listing program, not the system one, was executed.
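As a minimal sketch of how a search path might be checked for this problem, the following Python function reports entries that resolve against the current working directory. The function name `risky_path_entries` is hypothetical. The sketch assumes the usual UNIX convention that entries are separated by colons and that an empty entry, like ".", denotes the current directory.

```python
import posixpath


def risky_path_entries(path_value):
    """Return (position, entry) pairs for entries resolved against the current directory.

    An empty entry and "." both denote the current working directory, and any
    other relative entry is also interpreted relative to it.
    """
    risky = []
    for position, entry in enumerate(path_value.split(":")):
        # posixpath.isabs is true only for entries that begin with "/".
        if not posixpath.isabs(entry):
            risky.append((position, entry))
    return risky
```

Applied to the value ".:/bin:/usr/bin", the function flags the entry at position 0, which is the ordering that allowed the doctored lister to run. It could be applied to the value of PATH in a privileged user's environment; any flagged entry that precedes the system directories recreates the situation described above.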
This rule presumes that the underlying computing base (compiler, loader, operating system, etc.) are all uncorrupted; if this assumption is false, malicious logic may be inserted during compilation, linking, or execution. An obvious corollary is to test all such software in an environment with very limited privileges before installing it, and never to test the program where it can access critical or irreplaceable files, or as a highly-privileged user.
This requires first, that some security policy designating who has access to what files and how be created; and second, that some enforcement mechanism be implemented. Note the caveat: if the audit log created by that mechanism, or the mechanism itself, can be tampered with, the introduction of malicious logic into the system can be done undetectably. However, depending on the security mechanisms implementing the auditing and the access to the log, this may require some sophistication. (Or, it may not.)
This is really a corollary to the previous rule. Note that the checksums computed at installation must be protected, since an attacker could change a file, then compute its new checksum and replace the stored checksum with it. Again, this requires that the underlying system be trusted to provide such protection to the checksum program, the stored checksums, and the audit program comparing the two.
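The checksum comparison can be sketched in Python as follows. This is only an illustration under stated assumptions: SHA-256 is chosen here for convenience and is not named in the text, and the function names `file_checksum`, `record_checksums`, and `changed_files` are hypothetical.

```python
import hashlib


def file_checksum(path, chunk_size=65536):
    """Return the SHA-256 digest of a file's contents as a hexadecimal string."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so that large files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_checksums(paths):
    """Build the baseline taken at installation: a mapping from path to checksum."""
    return {path: file_checksum(path) for path in paths}


def changed_files(baseline):
    """Return the sorted paths whose current checksum differs from the baseline.

    A file that can no longer be read is reported as changed.
    """
    changed = []
    for path, expected in baseline.items():
        try:
            current = file_checksum(path)
        except OSError:
            current = None
        if current != expected:
            changed.append(path)
    return sorted(changed)
```

The sketch does nothing to protect the baseline itself. As the rule notes, the stored checksums, the checksum program, and the comparison program must all be kept where an attacker cannot alter them, for example on media that cannot be written.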
Typically, sites make both daily and weekly incremental backups (which save all files that have changed since the last incremental backup of the same period); then once a month they simply make a copy of all file systems. Enough of each kind is saved to be able to restore the system to its current state. Notice that if restoring to eliminate a malicious program, the restored version of the program should also be thoroughly checked.
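The selection step of an incremental backup, choosing the files that have changed since the previous backup, might be sketched in Python as below. The function name `files_changed_since` is hypothetical, and using the modification time as the test for "changed" is an assumption of this sketch rather than a description of any particular backup program.

```python
import os


def files_changed_since(root, since):
    """Return the sorted paths under root modified later than the timestamp since.

    since is a time in seconds since the epoch, such as the time at which the
    previous incremental backup of the same period was made.
    """
    changed = []
    for directory, _subdirs, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(directory, name)
            try:
                if os.lstat(path).st_mtime > since:
                    changed.append(path)
            except OSError:
                continue  # the file disappeared during the walk
    return sorted(changed)
```

Because a program that can write a file can usually reset its modification time as well, a selection made this way can miss files altered by malicious logic; this is one reason the restored version of a program should still be checked.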
The system staff should cultivate good relations with the users and vendors, should be certain to explain the reasons for all security policies, and should assist users whenever possible in providing a pleasant and secure working environment, acting as an intermediary between them and the vendors if need be. Users and staff should know what constitutes a breach of security, and there should be a well-designed set of procedures for handling breaches. Thinking through the best procedures for a particular installation carefully, putting them into place tactfully, and explaining them fully, will do far more to prevent security problems than any quick action.
If malicious programs are determined to be rampant on the system, the administrators should reload the original compilation and installation software from the distribution medium and recompile and regenerate all system files after checking all sources thoroughly. This assumes that the (distributed) compilation and installation software is not infected and the program loading that software does not infect it. As always, the elements of trust are present here.
The reason is explained in the text. Note this means preventing modification access by the hardware, for example by removing the write ring from a tape. If the prevention mechanism is done in software, it can be infected and/or disabled by a malicious program. Here, the element of trust is in the hardware mechanism working correctly.
Should someone using a privileged account accidentally execute a program containing a computer virus, the virus will spread throughout the system rapidly. This is less likely to happen if those accounts are used only when necessary; even so, a window of vulnerability still exists. Computers designed with security in mind typically limit the power of privileged accounts, in some cases very drastically.
The more programs that can cross protection domain boundaries while executing, the more potential targets for the addition of malicious logic exist. This suggestion essentially recommends minimizing the number of programs that can be modified to provide an attacker with entry to the privileged state.
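On a UNIX system, the programs that change privilege when executed are those with the setuid or setgid bit set, so a first step toward minimizing them is to list them. The following Python sketch does this; the function names `is_setid` and `find_setid_programs` are hypothetical.

```python
import os
import stat


def is_setid(mode):
    """True if mode describes a regular file with the setuid or setgid bit set."""
    return stat.S_ISREG(mode) and bool(mode & (stat.S_ISUID | stat.S_ISGID))


def find_setid_programs(root):
    """Walk root and return the sorted paths of setuid or setgid regular files."""
    found = []
    for directory, _subdirs, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(directory, name)
            try:
                mode = os.lstat(path).st_mode  # lstat does not follow symbolic links
            except OSError:
                continue  # the file vanished or cannot be examined; skip it
            if is_setid(mode):
                found.append(path)
    return sorted(found)
```

Each program on the resulting list is a potential target, so the list is a reasonable starting point for deciding which of them actually need the added privilege.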
Note that the second version is much weaker, because a malicious program could tamper with an executable program and cause it to display the control sequences to produce the requisite commands from the terminal. The privileged user executing such a command springs the trap. Any file the malicious program could write to can be similarly booby-trapped.
If memory and devices are objects addressable by the user, the access control plan described earlier should include these objects and prevent direct access to them. Specifically, the device and memory files on UNIX systems should never have any world permissions set; this gives users direct access to memory and to the raw device, and allows them to bypass the UNIX access control mechanisms.
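A check of this kind might look like the following Python sketch. The paths in `DEVICE_FILES` are assumed examples only, since the names of the memory and raw device files differ among UNIX versions, and the function names `world_bits` and `world_accessible` are hypothetical.

```python
import os
import stat

# Assumed example paths; the actual memory and device files vary between systems.
DEVICE_FILES = ["/dev/mem", "/dev/kmem"]


def world_bits(mode):
    """Return the read, write, and execute permission bits granted to all users."""
    return mode & stat.S_IRWXO


def world_accessible(paths):
    """Return (path, octal world bits) for each existing path with any world permission."""
    flagged = []
    for path in paths:
        try:
            mode = os.stat(path).st_mode
        except OSError:
            continue  # the path is not present on this system
        bits = world_bits(mode)
        if bits:
            flagged.append((path, oct(bits)))
    return flagged
```

Calling `world_accessible(DEVICE_FILES)` returns an empty list when none of the listed files that exist grants any permission to users outside the owner and group.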
The VIRUS-L mailing list, moderated by Kenneth R. van Wyk, is a forum for discussing all aspects of computer viruses, especially existing computer viruses and countermeasures as well as theory. To subscribe, send an electronic mail message containing only the line
SUB VIRUS-L your name
to LISTSERV@LEHIIBM1.BITNET. Back issues of the digest are available by anonymous ftp from IBM1.CC.LEHIGH.EDU or cert.sei.cmu.edu; users not on the internet may send to the above address an electronic mail message containing only the line
GET VIRUS-L LOGyymmx
where yy is the last two digits of the year, mm the number of the month, and x a letter indicating the number of the week in the month. For example, LOG8901B refers to the digests issued in the second week of January, 1989.
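The naming scheme can be written out as a short Python function. The name `digest_log_name` is hypothetical, and the mapping of the first week to the letter A is an inference from the example above (second week, letter B) rather than something the list documentation is quoted as stating.

```python
import string


def digest_log_name(year, month, week):
    """Build a log name of the form LOGyymmx.

    week is the number of the week within the month; it is assumed that
    week 1 maps to "A", week 2 to "B", and so on.
    """
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    if not 1 <= week <= 26:
        raise ValueError("week must map to a single letter")
    letter = string.ascii_uppercase[week - 1]
    return f"LOG{year % 100:02d}{month:02d}{letter}"
```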
The mailing list VALERT-L is used only to announce viruses; any discussion is relegated to VIRUS-L. To subscribe, send an electronic mail message containing only the line
SUB VALERT-L your name
to the above address. Messages sent to VALERT-L appear in the next VIRUS-L digest as well.
Peter Neumann of SRI International moderates the Forum on Risks to the Public in Computers and Related Systems, or RISKS, list. This mailing list focuses on the risks involved in computer technology, and has discussed implications of viruses, although with a thrust different from that of the VIRUS-L mailing list. To subscribe, if on the Internet, send an electronic mail message to RISKS-request@CSL.SRI.COM; if on BITNET, send an electronic mail message containing only the line
SUBSCRIBE MD4H your name
to LISTSERV@CMUCCVMA.BITNET, or
SUBSCRIBE RISKS your name
to LISTSERV@UGA.BITNET, LISTSERV@UBVM.BITNET, or LISTSERV@FINHUTC.BITNET. Back issues of the digest are available by anonymous ftp from crvax.sri.com in the directory “RISKS:” and are named RISKS-v.nn where v is the volume and nn the number within the volume.
1. UNIX is a registered trademark of AT&T Bell Laboratories.
2. VAX and VMS are registered trademarks of Digital Equipment Corporation.
3. D. Edwards first referred to this type of program as a “Trojan horse” in [4].
4. Originally, a worm was simply a distributed computation [115]; it is now most often used in the above sense.
5. TOPS-20 is a registered trademark of Digital Equipment Corporation.
6. VM/370 is a registered trademark of IBM.
7. We use the conventional terminology of calling this program a “computer worm” because its dominant method of propagation was from computer system to computer system. Others, notably [46], have labelled it a “computer virus” using a taxonomy more firmly grounded in biology than the conventional one.
8. Reading private files to obtain information (such as user names and passwords) that can then be used to break into other systems, or other parts of the system on which the information is found.
9. Macintosh is a registered trademark of Apple Computer.
10. Cohen tantalizingly claims that one has been found, but reports no other details [27]. Suppression of details (or, more commonly, the existence) of attacks, virus or otherwise, is common; it is estimated that victims report only 10% to 35% of computer crimes in general [119][129], in part to prevent embarrassment or loss of public confidence in the company, or to avoid the expense of gathering sufficient evidence to prosecute the offender [101].
11. It is worth noting that the author of the Internet worm stated that the worm disabled machines due to a programming error [93].