Linux.Lacrimae 0.30 (x) 2007-2008 by herm1t Earlier I wrote pretty large doc for this virus, descibing its work step by step. Now I think that there is no point to examine each function of the virus individually or the practices of working with ELF headers. The code came to be big and sloppy, so I will try just to tell you what virus is doing and why some features were implemented as they were. Probably I would write the tutorial on linux viruses. Or I will not. Anyway, if you cannot understand something, it will be better for you to search for the answers not in forums or zines, but in the sources and standards. There are everything you need: System V ABI/386, TIS ELF, libc and kernel sources, man after all. The basic principle was set before by Z0mbie in his "Methodology of undetectable virus". In a few words: you should take the file to pieces (headers, instructions, parts of data; when virus code should be mixed with victim and all should be assembled back. It became possible in Linux after introducing PIC-executables (or just PIE). This feature was added about 2003, and was meant to enforce the security of the system. The most "dangerous" programs build as PIE. But you know, if something increases in one place, it decreases elsewhere. Programs ran slowly, prelink trying to fix this, and in turn complicating integrity checking. Crucially, relocations for data appeared in the executables and the code became position-independent, and this allows the virus to move the victims code as it wishes. Disassembling So. The virus found the file (search), loaded it into the memory (load_elf), performed all neccessary checks before (elf_check, the ELF type ET_DYN, the presence of section headers). Matched. Must be infected. First of all we need to disassemble the code. Just feed the disassembler with loaded section (at the beginning I used the XDE, but then made my own and significantly different version of this engine called YAD), and put the result into the list of structures (code_t). For the fast access to the list make the array of pointers with the number of entries equal to the section size. For example, if one needs to test is where any instruction at the given offset, it's enough to write: if (fast[offset] != NULL) ... instead of for (c = code; c != NULL; c = c->next) if (c->src - map == offset) { ... break; } To make it possible afterwards to restore the alignment of particular instructions, the virus comparing the code against standard paddings generated by binutils. The victim disassembled from top downards with less regard to jumps. If there are any data within code - that's just tough, return with error. Though there is no convenient way to produce such PIE, and all programs which using inline assembly and data within code just cannot be build as PIE. In such a way distinction of code and data is the last thing one should bother about. Moreover, the used data structures doesn't perfectly fit with that task. The virus is working with sections and there are at least four sections with code (.init, .fini, .plt, .text) linked to each other by jumps. We will fix the numerous headers and initialized pointers in the data segment, we have all neccesary information in relocation table (.rel.dyn). One must find all places in code which calculate the addresses. So, some explanations about the ELFs internal structure are of necessity. Position Independent Executables PIE is an ELF file with type ET_DYN and zero base address. Naturally, it will be loaded at quite different address, and to make it possible for the the program to run, the loader and the program itself will take pains. The code has the read-only permissions and will not be fixed by loader. If the function is not a leaf one (one which makes calls) or accessing the global variables, the following scheme is used: 130b: e8 23 00 00 00 call 1333 # 1310+405c=536c (.got.plt) 1310: 81 c3 5c 40 00 00 add $0x405c,%ebx ....................................................... # eax = .got.plt - 16236 = 0x1400 (main) 1326: 8d 83 94 c0 ff ff lea 0xffffc094(%ebx),%eax 132c: 50 push %eax 132d: e8 b2 fd ff ff call 10e4 <__libc_start_main@plt> ....................................................... 1333: 8b 1c 24 mov (%esp),%ebx # (1) 1336: c3 ret ....................................................... 133c: e8 00 00 00 00 call 1341 # (2) 1341: 5b pop %ebx 1342: 81 c3 2b 40 00 00 add $0x402b,%ebx All addresses are relative to the start of .got.plt section. If the function does not call another functions, another register could be used instead of ebx. To set the value of PIC-register the two constructs (1) and (2) are used. The first in the CRT functions, added by compiler, the second in the program itself. Parsing The virus traversing the instruction lists (mark), this time in the order of execution (to prevent the situation when PIC-register got used before initialization; it is not likely that gcc will produce such code, but this feature will come into play later); fill the "link" fields for the branch instructions (the pointer to the destination of the branch), look for the external functions calls, initializations of the PIC-register and instructions which using that register. Mark the found instructions by following flags: * FL_GOTPTR1 - call 0f / 0: pop reg / add reg, imm32 (set on add) * FL_GOTPTR2 - call 0f / add reg, imm32 / ... / 0: mov reg, esp / ret * FL_GOTOFF - mod==2 && (rm==pic_reg|base == pic_reg|index == pic_reg) * FL_EXTERN - call of external function (the symbol field is the name of function) * FL_SEEN - the code without this flag cannot be reached If the instruction with FL_GOTOFF refer to the code, it's destination will be tested also. JMP-tables As such all instructions that need to be fixed found. And all could be just fine and simple except for jump tables. The array of addresses is in the .rodata, the address of array calculated in the same way as any other address, but one need to know not only the address of table, but its size too. The most simple case of jump table looks as follows: 145e: 83 f8 34 cmp $0x34,%eax 1461: 77 35 ja 1498 1463: 8b 84 83 90 ea ff ff mov 0xffffea90(%ebx,%eax,4),%eax 146a: 01 d8 add %ebx,%eax 146c: ff e0 jmp *%eax Virus should look for the cmp argument that is the size of table. If the "jmp reg" instruction occuried, virus starts to look through the code in the opposite direction for the table address (in this case it is the "mov" insn, when the address was found, the virus memorizes index register and look further for the "cmp index, imm". If this command was found, the virus has all the things needed for parsing and saving the contents of table for later fixing. The fatal error will occur and the file will be left uninfected, if something missed or the found addresses does not referrence the code. It's not the end of misfortunes with jump-tables, but the beginning. Table index may reside not in register but in the local variable: 3d60: 83 7d e8 04 cmpl $0x4,0xffffffe8(%ebp) 3d64: 0f 87 8b 03 00 00 ja 40f5 3d6a: 8b 55 e8 mov 0xffffffe8(%ebp),%edx 3d6d: 8b 84 93 30 de ff ff mov 0xffffde30(%ebx,%edx,4),%eax 3d74: 01 d8 add %ebx,%eax 3d76: ff e0 jmp *%eax To work with such tables the virus trying to keep track the movings between registers and local variables. Both for the code w/ or w/o stack frame (variables referrenced through esp, not ebp) - in the latter case virus has little to none chances. The jmp-table parsing code is quite fragile, but still works, and if it will fail, it is just as well possible to find another file, less complex. There are another approaches to the problem, but they require more detailed analysis of the code and used in decompilers. The same procedures applied to the virus body itself during disassembly. To be able to split the virus code and the victim, the virus code marked with FL_ME flag. The virus must not use uninitialized data in the .bss, jump-tables, relocations and pointers to the code. Only pointers to data is allowed. There is nothing fundamental in these restrictions, it is possible to add the missed capabilities if you wish, but this requires additional code, and the code is too big as it is. Imports Let's care now for the external functions the virus is needed to work on the new place. We need to find all the calls with FL_EXTERN flag and the corresponding functions within imports. If something missed - extend the corresponding sections (extend_section) and add (add_symbol): the name of function to .dynstr, symbol to .dynsym, helper-code to .plt, reloc to .rel.plt, the address of helper to .got.plt and finally the version to .gnu.version_r. Rebuild the .hash (build_hash). It's all taped. Integration, mutation, optimization The virus now has two lists ready for the intermixture (insert_code). The lists dissected on ret/jmp instructions to preserve the order of execution and to prevent the virus from fallthrough to the victim and vice versa and cut in to the single list (code_mixer). Piece of virus, piece of victim, piece of virus... And so forth. Since we have a single list, we could care about such chicken shit as alignment. The virus discards the fillings, calculates the boundary on which the insn is aligned, prevents jumps from ending on filling. The alignment requirements saved in the "al" field. Now you can do with the list whatever you want. Additionaly rearrange the code blocks (M_BRO in config.h). Randomly invert the conditional jumps (M_IJCC), mutate some instructions (MUTATE) - it just a sketch, the code was taken from RPME and BI-PERM and one piece was written after Lexotan. It's time to think of how to compose all this to a working program. Let's find out the length of jump commands (short/near) (OJMP in config.h). Extend the data section to add there our own data later. Add the call to virus to .init or virus entry to .ctors. Assembly and linking Let's build the file (build_elf). Copy the ELF header, table of segments (PHT) and start to copy the sections which belongs to the loadable segments, saving new addresses and offsets. Align the section if neccessary. If section contains code - assemble it by copying each command from the list (don't forget to align it) and save its new address. If the section contains data just copy it. Fix the rest entries (non PT_LOAD) in PHT. Copy not-loadable sections. Copy and fix the sections table (SHT). Copy all stuff that was appended to the file and does not belong to any section. And now we must link and fix the resulting file. Fix .rel.dyn and rebuild the .rel.plt and .got.plt sections. When virus added imports it was impossible to do, since the addresses of the sections were not known. It's time to attend to the code. The virus knew the new address of .got.plt (the base for address calculations in program). It is high time. For each instruction in list (depending on flags): find new addresses for the calls to PLT-helpers for the external functions calls; find new offsets relative to .got.plt for the commands referrencing data and reassemble 'em; fix the addresses of conditional and uncoditional branches; fix the offsets from current position to .got.plt (FL_GOTPTR?). Fix all addresses from jump-tables, that were saved. Fix the symbol tables (.dynsym and optionally .symtab). I had no guts to fix the debugging information. :-) Set the entry point. Copy the virus data to the previously extended data section. Initialize the variables in the data section. Encrypt the data. Set the infection marker. Fix the .dynamic entries. All is in readiness. Write off the file, free memory (cunningly, so the list with virus code will be left untouched to avoid repetitive disassembly (freec). And continue the victim search. A few details Let's discuss the same in greater detail. "Fix", "set", but how to "fix"? Suppose, that we have some absolute address, that needs to be fixed, the old and new section table and the code in the lists. First of all map the address to the pair (section_io_by_addr). If address belong to the section with data, then new address = new address of section + offset, if it's code section, then: fast[offset]->dsta (fix_addr). If we need to fix the address relative to .got.plt, then: new rel. address = fix_addr(old_gotoff + old rel. addr.) - new_gotoff, where old_gotoff and new_gotoff are old and new addresses of .got.plt respectively. Prelink Once in a while, when KDE-application or something as big is loading it will drag a lot of libraries with itself. The setup of program and libraries even if the address resolution performed lazily is a time-consuming task. Prelink is intended to solve it by designating the base addresses to all programs and libraries in the system, bind the external calls and saving the undo information. What should the virus do? Just save the base address (address of segment with offset 0, variable "prelink") and convert all absolute addresses to relative, i.e. RVA. All that one should do is just to substract or add the base address. There are not too many such places in the virus. It's also necessary to fix the prelink undo information (it just EHDR, PHT and SHT as they were before prelinking, with zero base address and without sections added by prelink) and fix the checksum. There is just one (or more?) snag. If the infected file be checked with "prelink -y", then prelink will report that "file differs". What's wrong? When virus added imports, it wrote the addresses of PLT-helpers to .got.plt, while prelink saves there the real addresses of functions. You can either put up with that, or to enable PRELINK_PERFECT in config.h. It's dreary operation. To find the path to the library, the virus will run the interpreter. Load library into memory and look for all necessary addresses of the functions. The resolve procedure differs from the once in RTLD, so on some files prelink will swear anyway. It might be fixed, but implement the RTLD in virus is a obvious overkill. The infected program will work anyway. Restrictions on virus code Now a few words about the restrictions. To avoid adding its own relocs (for the virus entry, pointers to .dynsym, .dynstr, .rel.plt), the virus regardless of whether the file was prelinked will save RVAs, and when will find the base address (get_base) in the run-time and add it to the values of variables. The initial values must be set by patch utility, for the succeeding generation of the virus the values of variables will be set by previous one. The virus is working with only one section - .data. That is why you cannot use pointers to functions (referrenes to .text), unitialized variables (.bss) and jump-tables (.rodata). Otherwise, one could save the section index for each command accessing memory and which sections must be extended (or the section_io_by_addr will fail). Actually, there is one such item in the virus (if EXTEND_CTOR is enabled). The major changes since version 0.25.2 (described in "Crimea River") * New disassembler YAD (instead of XDE) * Partial support for jump-tables indexes in local variables * Prelink support * .gnu.version_r support * Inverting conditional jumps * Short/near jumps optimization Conclusion I suppose, that's all. The virus is not too fast, most probably it contains bugs and memory leaks. The used method of infection allows much more opportunities than implemented in the current version. But this virus is just the first demo of integration on Linux, and one may take care of the rest. :-) herm1t@vx.netlux.org, june, 2008