Fridrik Skulason, Vesselin Bontchev
CARO meeting
1991
At a CARO meeting in 1991, a committee was formed with the objective of reducing the confusion in virus naming. This committee consisted of Fridrik Skulason (Virus Bulletin's technical editor), Alan Solomon (S&S International), and Vesselin Bontchev (University of Hamburg).
The following naming convention was chosen:
The full name of a virus consists of up to four parts, delimited by points ('.'). Any part may be missing, but at least one must be present. The general format is:
Family_Name.Group_Name.Major_Variant.Minor_Variant[:Modifier]
Each part is an identifier, constructed with the characters [A-Za-z0-9_$%&!'`#-]. The non-alphanumeric characters are permitted, but should be avoided. The identifier is case-insensitive, but mixed case should be used for readability. Use of the underscore ('_') instead of a space is permitted (and even encouraged) if it improves readability. Each part is up to 20 characters long (in order to allow such monstrosities as "Green_Caterpillar"), but shorter names should be used whenever possible. However, if the shorter name is just an abbreviation of the long name, it is better to use the long name.
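The rules above can be checked mechanically. The following is a minimal sketch in Python, not part of the convention itself. The function names split_name and same_name are hypothetical, and the sketch assumes that a "missing" part is simply left out of the name rather than written as an empty field between two points. It handles only the dot-separated parts, not the optional modifier.

```python
import re

# Character set and 20-character limit are taken from the convention text.
IDENTIFIER = re.compile(r"[A-Za-z0-9_$%&!'`#-]{1,20}")
MAX_PARTS = 4  # Family_Name.Group_Name.Major_Variant.Minor_Variant


def split_name(name):
    """Return the dot-separated parts of a virus name (no modifiers).

    Raises ValueError if the name has too many parts or a part is not
    a valid identifier. Assumption: missing parts are omitted, so an
    empty part between two points is rejected.
    """
    parts = name.split(".")
    if len(parts) > MAX_PARTS:
        raise ValueError(f"at most {MAX_PARTS} parts allowed, got {len(parts)}")
    for part in parts:
        if not IDENTIFIER.fullmatch(part):
            raise ValueError(f"invalid identifier: {part!r}")
    return parts


def same_name(a, b):
    """Compare two names part by part, ignoring case."""
    return [p.lower() for p in split_name(a)] == [p.lower() for p in split_name(b)]
```

For example, split_name("Dark_Avenger.2000.Traveller.Copy") yields the four parts in order, while a name containing a space or a part longer than 20 characters is rejected.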
The Family_Name represents the family to which the virus belongs. Every attempt is made to group the existing viruses into families, depending on the structural similarities of the viruses, but we understand that a formal definition of a family is impossible.
When selecting a Family_Name, the following guidelines must be applied:
The variants in each family are named after their infective length.
The variants in each family are named after the contents of the second and third bytes of the infected boot sector, in hexadecimal.
The Group_Name represents a major group of similar viruses in a virus family, something like a sub-family. Examples are AntiCAD (a distinguished clone of the Jerusalem family, containing numerous variants), or 1704 (a group of several virus variants in the Cascade family).
When selecting a Group_Name, the same guidelines as for a Family_Name should be applied, except that numeric names are more permissible - but only if the respective group of viruses is well known under this name.
The Major_Variant name is used to group viruses within a Group_Name that are very similar and usually have the same infective length. Again, the above guidelines apply, with one major exception: the Major_Variant is almost always a number representing the infective length, since this helps to distinguish that particular sub-group of viruses. The infective length should be used as the Major_Variant name whenever it is known. Exceptions to this rule are:
Minor variants are viruses with the same infective length, with similar structure and behaviour, but slightly different. Usually the minor variants are different patches of one and the same virus.
When selecting a Minor_Variant name, consecutive letters of the alphabet are usually used (A, B, C, etc.). However, this is not a strict rule, and longer names can be used as well, especially if the virus is already known under this (longer) name, or if the name is more descriptive than just a letter.
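As an illustration of the letter scheme, the sketch below picks the next free single-letter Minor_Variant name. The function name next_minor_variant is hypothetical, and the behaviour once all 26 letters are taken is an assumption, since the convention only says that longer names may be used.

```python
import string


def next_minor_variant(existing):
    """Return the first single letter not yet used as a Minor_Variant.

    `existing` holds the Minor_Variant names already assigned within one
    major variant. Names are case-insensitive, so "a" counts as "A".
    Longer descriptive names in `existing` are simply ignored.
    """
    used = {name.upper() for name in existing}
    for letter in string.ascii_uppercase:
        if letter not in used:
            return letter
    # Assumption: the convention does not say what follows "Z".
    raise ValueError("all single letters are taken; choose a longer name")
```

With "A" and "B" already assigned, the function returns "C".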
The producers of virus detection software are strongly urged to use the virus names proposed here. Anti-virus researchers are advised to follow the described guidelines when selecting names for new viruses, in order to avoid further confusion.
If a scanner is not able to distinguish between two minor variants of a virus, it should output the virus name up to the recognized major variant. For instance, if it cannot distinguish between Dark_Avenger.2000.Traveller.Copy and Dark_Avenger.2000.Traveller.Zopy, it should report both variants of the virus as Dark_Avenger.2000.Traveller.
If it is also not able to distinguish between the major variants, it should report the virus up to the recognized group name. That is, if the scanner cannot distinguish between Dark_Avenger.2000.Traveller.* and Dark_Avenger.2000.Die_Young, it should report all the variants as Dark_Avenger.2000.
Finally, if the scanner is also unable to distinguish between the different groups, it should output only the family name of the virus (Dark_Avenger in our example).
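This fallback amounts to reporting the longest leading run of parts shared by all the names the scanner cannot tell apart. A minimal sketch follows, assuming the scanner already has the list of candidate full names. The function name reported_name is hypothetical and is not taken from any real scanner.

```python
def reported_name(candidates):
    """Return the longest name prefix shared by all candidate names.

    Parts are compared case-insensitively, level by level, starting with
    the Family_Name. An empty string means the candidates do not even
    share a family, which the caller has to handle separately.
    """
    split = [name.split(".") for name in candidates]
    common = []
    for level in zip(*split):  # zip stops at the shortest name
        if len({part.lower() for part in level}) != 1:
            break
        common.append(level[0])
    return ".".join(common)


print(reported_name([
    "Dark_Avenger.2000.Traveller.Copy",
    "Dark_Avenger.2000.Die_Young",
]))  # prints "Dark_Avenger.2000"
```

Comparing whole parts rather than raw characters matters here: a character-wise prefix could cut a name in the middle of an identifier.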
It is possible that a virus belongs to a particular family by its structure, but the virus writer has concealed this fact in some way. Such concealment could be the conversion of the virus into a polymorphic one by linking one of the available polymorphic engines to it, or compression with an executable-file compressor (e.g., PKLite, LZEXE, etc.). The latter method is of concern only if the virus is able to spread in compressed form. Since one and the same virus could be concealed with different methods (or even with more than one method), this could cause classification confusion.
Such viruses should be classified as if the concealing mechanism had not been used, with a modifier appended to their name. This modifier indicates the particular concealing mechanism used. If the concealing tool conforms to a naming hierarchy, its full name (e.g., TPE.1_3) should be used as a modifier. When the modifier indicates a compression tool, only the first two characters of the name of the tool should be used.
For instance, the Pogue virus is a member of the Gotcha family, but uses the MtE.0_90 polymorphic engine. Therefore, its full name should be "Gotcha.Pogue:MtE.0_90".
It is permitted to use more than one modifier in the full name of the virus, if the virus uses more than one concealing mechanism, e.g. "Civil_War.1234.A:TPE.1_3:MtE.1_00:PK".
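The modifier rules can be sketched as a pair of small helpers. The names full_name and split_modifiers are hypothetical. Placing polymorphic engines before compression tools is an assumption based only on the example above, since the convention does not prescribe an order.

```python
def full_name(base, engines=(), compressors=()):
    """Append concealment modifiers to a base virus name.

    `engines` are given by their full hierarchical names (e.g. "TPE.1_3").
    `compressors` are tool names such as "PKLite"; only their first two
    characters are kept, as the convention requires.
    Assumption: engines are listed before compression tools.
    """
    modifiers = list(engines) + [tool[:2] for tool in compressors]
    return ":".join([base, *modifiers])


def split_modifiers(name):
    """Split a full name into (base name, list of modifiers).

    The split is on ':' only, because modifiers such as "MtE.0_90"
    contain points of their own.
    """
    base, *modifiers = name.split(":")
    return base, modifiers


print(full_name("Civil_War.1234.A", ["TPE.1_3", "MtE.1_00"], ["PKLite"]))
# prints "Civil_War.1234.A:TPE.1_3:MtE.1_00:PK"
```

Splitting on ':' before splitting on '.' keeps the points inside a modifier from being mistaken for separators between the four name parts.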