Piotr Bania
SecurityFocus
June 2005
This short article describes the so-called Entry-Point Obscuring (EPO) virus coding technique, primarily through a direct analysis of the Win32.CTX.Phage virus. The reader should know the basics of IA-32 assembly and the main elements of the Portable Executable (PE) file structure to fully understand this article. The author also advises the reader to review the Win32.CTX.Phage description written by Peter Szor and Wason Han, since this article does not cover all the features of the virus.
Entry-point obscuring viruses are very interesting because their detection, disinfection and removal are so difficult. Nowadays the EPO technique is used in many different ways; Win32.CTX.Phage has been chosen for this article because it was written by the author of other infamous viruses such as Win9x.Marburg (one of the first Windows 9x polymorphic viruses, which appeared on the WildList) and Win9x.HPS. The author of these viruses is known for his difficult-to-detect and difficult-to-disinfect creations. CTX.Phage in particular involves many techniques that make the disinfection process highly difficult, even after the virus is fully understood.
When a virus infects a file, it must find some way to gain control and be executed. Most PE file infectors do this in the simplest way: they change the entry-point of the infected application and make it point to the virus body. An example is shown below.
Original EXE | Infected EXE |
---|---|
Entry-point: 0x1000 (.code section) | Entry-point: 0x6000 (.reloc section) |
Such virus activity is very easy to detect, as it usually results in files whose entry-point resides outside the code section, and such files are therefore marked as suspicious by a virus scanner. Here is some example code, which detects this type of infection:
(checks if the 'entry-point section' is the last section):
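The check described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's scanner: the function names `read_sections` and `entry_point_in_last_section` are made up for this example, and "last section" is assumed to mean the last entry in the PE section table.

```python
import struct

def read_sections(data):
    """Return (entry_point_rva, [(name, virtual_address, virtual_size), ...])."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    (pe_off,) = struct.unpack_from("<I", data, 0x3C)           # e_lfanew
    if data[pe_off:pe_off + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    (num_sections,) = struct.unpack_from("<H", data, pe_off + 6)
    (opt_size,) = struct.unpack_from("<H", data, pe_off + 20)  # SizeOfOptionalHeader
    (entry_rva,) = struct.unpack_from("<I", data, pe_off + 24 + 16)
    table = pe_off + 24 + opt_size                             # first section header
    sections = []
    for i in range(num_sections):
        name, vsize, vaddr = struct.unpack_from("<8sII", data, table + 40 * i)
        sections.append((name.rstrip(b"\0").decode("ascii", "replace"), vaddr, vsize))
    return entry_rva, sections

def entry_point_in_last_section(data):
    """True if the entry-point RVA falls inside the last section (assumed heuristic)."""
    entry_rva, sections = read_sections(data)
    if not sections:
        return False
    _, vaddr, vsize = sections[-1]
    return vaddr <= entry_rva < vaddr + vsize
```

A file for which `entry_point_in_last_section` returns `True` would be flagged as suspicious; a file infected with an EPO virus such as CTX.Phage passes this check untouched, which is exactly the problem the rest of the article addresses.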
The very reason why the EPO technique was developed was to avoid virus scanner detection. An entry-point obscuring virus is a virus that doesn't get control from the host program directly. Typically, the virus patches the host program with a jump/call routine, and receives control that way. While there are many variations of the EPO technique, in this article we will look at one of them in detail.
The Phage virus doesn't modify the entry-point of an infected file. Instead, it scans the whole host code section and searches for API calls generated by the Borland or Microsoft linkers. When such code is found, the virus checks that the destination address points somewhere inside the IMPORT section. If the call is really an import call, Phage gets a random number which tells the virus either to patch the instruction currently being processed or to look for the next one. Figures 1, 2, 3, and 4 below show a few example schemas.
Figure 1. Original application (ENTRYPOINT: 0x1000 - LINKER: BORLAND).
Figure 2. Infected application (ENTRYPOINT: 0x1000 - LINKER: BORLAND).
Figure 3. Original application (ENTRYPOINT: 0x1039 - LINKER: MICROSOFT).
Figure 4. Infected application (ENTRYPOINT: 0x1039 - LINKER: MICROSOFT).
The above schemas show how the CTX.Phage EPO virus works. As mentioned before, the virus injects its call instruction by overwriting a randomly chosen import call in the host. As the application grows (and with it the possible distance of the injected call from the entry-point), it becomes increasingly difficult to find the injection. On the other hand, this EPO technique also reduces the chance that the virus runs: in some cases the "call-to-virus" will not be executed at all.
At this point, let's find a way to detect such injections without causing false alarms.
How difficult is it to find CTX.Phage injections? First of all, the virus inserts a call instruction as follows:
E8 ?? ?? ?? ?? | CALL XXXXXXXX |
Where:
Before we go any further, let's summarize all the information we know about the current EPO:
As the reader probably knows, we could simply search for 0xE8 bytes (call opcodes), but there is a large possibility that we would find a "suspicious" call that actually lands inside a non-call instruction, for example:
68 332211E8 | PUSH E8112233 |
As you can see, this is a push instruction, but the scanner finds the E8 byte and could consider it a call. Unless we want to build our own disassembler engine (which is long and hard work), we need to find another way. Yes, you guessed it: we need to add a condition to the E8 byte scanning routine, remembering that the injected call always executes code that resides in the last section! Now that everything is clear, here are the conditions we require:
Where:
A sample temp_loc calculation might look as follows:
Scanned instruction:
Calculation: temp_loc = 1025 (virtual address) + 00002758 (call destination) + 5 (size of call instruction)
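The same arithmetic can be written as a small Python helper. This is a sketch under two assumptions: the numbers in the calculation above are hexadecimal, and the "call destination" term is the 32-bit relative operand stored after the E8 opcode. The name `call_target` and the instruction bytes in the usage note are made up for illustration, since the scanned instruction itself is not reproduced here.

```python
import struct

def call_target(call_va, instruction):
    """Destination of a 5-byte E8 (call) or E9 (jmp) rel32 instruction at call_va."""
    if len(instruction) != 5 or instruction[0] not in (0xE8, 0xE9):
        raise ValueError("expected a 5-byte E8/E9 rel32 instruction")
    (rel32,) = struct.unpack_from("<i", instruction, 1)  # signed, little-endian
    # The operand is relative to the address of the *next* instruction,
    # hence the +5; the mask keeps the result inside 32 bits.
    return (call_va + rel32 + 5) & 0xFFFFFFFF
```

With the assumed values, `call_target(0x1025, bytes.fromhex("E858270000"))` evaluates to `0x3782`, which is 0x1025 + 0x2758 + 5.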
If the temp_loc address resides somewhere between the last section's virtual address (start) and the last section's virtual address + its virtual size, the call is marked as suspicious. Here is a short snippet from the author's scanner:
(searches for call and jump instructions and checks their destinations):
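As an illustration of the heuristic, and not a reproduction of the author's scanner, the following Python sketch walks a code section byte by byte, treats every E8/E9 byte as a possible call or jump, and keeps only those whose computed destination falls inside the last section. The function name and parameters are assumptions made for this example.

```python
import struct

def find_suspicious_branches(code, code_va, last_va, last_vsize):
    """Return [(va, target), ...] for E8/E9 rel32 candidates that land in the last section.

    code       -- raw bytes of the host code section
    code_va    -- virtual address at which `code` is mapped
    last_va    -- virtual address of the last section
    last_vsize -- virtual size of the last section
    """
    hits = []
    for off in range(len(code) - 4):
        if code[off] not in (0xE8, 0xE9):
            continue
        (rel32,) = struct.unpack_from("<i", code, off + 1)
        va = code_va + off
        target = (va + rel32 + 5) & 0xFFFFFFFF   # same temp_loc calculation
        if last_va <= target < last_va + last_vsize:
            hits.append((va, target))
    return hits
```

No disassembler is involved: an E8 byte that is really part of another instruction, such as the PUSH E8112233 example, decodes to an effectively random destination that is very unlikely to fall inside the last section, so the range check filters it out.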
While scanning files with this code, I haven't seen any false alarms, so it is probably one of the best solutions or techniques one can use to find such virus injections.
Since our scanner is able to find the injected call, we can move on. Now we need to restore the original call. In other words, we need to clear the injection. To do this we first need to know more about the virus.
The main problem is that the virus is encrypted, and the polymorphic decryptor decrypts the full virus body several times. We need to obtain the clear virus body in order to restore the original instruction, and we can't get to those bytes directly since the code is encrypted. There are a couple of ways to clear or bypass the polymorphic decryption layers, such as emulation. Writing a full emulator is surely not a quick and easy job; however, a different solution does exist. Most Windows viruses use the GetProcAddress API to obtain the API addresses they need for their later execution. Let's try to set a breakpoint at GetProcAddress. Of course, to avoid unrelated GetProcAddress requests, we first need to execute the virus injection, which is easy since we have already located it. This is shown below in Figure 5.
Figure 5. GetProcAddress.
The call came from 0x406AF3, which in fact points to the decrypted body. Indeed, the polymorphic layers were bypassed! Here is the proof, using the decrypted string shown in Figure 6.
Figure 6. Decrypted string.
To make the disinfector able to break on GetProcAddress, we need to build a small debugger (which is likely the fastest way to do it). This is easy since the Windows platform already comes with Debug APIs.
Basically, the following code debugs the virus process, overwrites the original entry of GetProcAddress with 0x90 (nop), 0x90 (nop), 0xCC (int 3 - breakpoint) and handles EXCEPTION_BREAKPOINT only if it comes from the "hooked" range:
(debugs process, executes virus call, hooks GetProcAddress and obtains caller (virus) address):
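The real implementation relies on the Windows Debug APIs and is not reproduced here. The Python sketch below is only a platform-independent model of the three decisions described above, with the debuggee's memory and stack simulated as byte buffers; all names (`install_hook`, `is_our_breakpoint`, `caller_address`) are made up for this illustration.

```python
import struct

HOOK = bytes([0x90, 0x90, 0xCC])  # nop, nop, int 3

def install_hook(memory, func_off):
    """Overwrite the first bytes of a function with HOOK and return the saved bytes."""
    saved = bytes(memory[func_off:func_off + len(HOOK)])
    memory[func_off:func_off + len(HOOK)] = HOOK
    return saved

def is_our_breakpoint(exception_addr, func_addr):
    """Accept a breakpoint exception only if it was raised inside the hooked range."""
    return func_addr <= exception_addr < func_addr + len(HOOK)

def caller_address(stack, esp_off):
    """At function entry on x86 the return address is the dword at [ESP].

    The nops and the int 3 do not touch the stack, so this still holds
    when the breakpoint fires.
    """
    (ret,) = struct.unpack_from("<I", stack, esp_off)
    return ret
```

In an actual debugger these steps would be driven by a debug-event loop, with the patch written through WriteProcessMemory and the stack read through ReadProcessMemory. The range check matters because other breakpoint exceptions can occur in the debuggee, and only the one raised by the hook should be treated as "GetProcAddress was called".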
Now that we have the clean virus body, we can try to locate the original instructions. Since CTX.Phage doesn't modify the flag bits of the host code section (it does not make the section writable), it has only one way to restore the original instruction: by using the WriteProcessMemory API (well, it could use the VirtualProtect API to get write access to the host code section and then write the original bytes, but it doesn't). So here is the break on WriteProcessMemory, shown in Figure 7.
Figure 7. Break on WriteProcessMemory.
As you can see, BytesToWrite is equal to 5 and Address is equal to the location found by the scanner. The only problem is that the call comes from allocated memory (the virus allocated it, copied itself there and continued execution from there). But let's check the caller address, shown below in Figure 8.
Figure 8. Checking the caller address.
The "const" bytes (for example those marked in the picture above) are:
Where:
This is the signature, useful for finding the original host bytes (they are the same in every generation); however, these bytes are located in the allocated memory. So the question is: do the same bytes exist somewhere inside the unencrypted body of the virus, in other words, somewhere inside the last section? Let's scan for them, as shown in Figure 9.
Figure 9. Scanning the virus.
Indeed, the same bytes were found in the "native" virus location. GetProcAddress was called by the virus from 0x406AF3, and as you can see, the original bytes lie well before it. Here is the code example from the scanner which searches for the original bytes by using the signature. The same could be done by subtracting a constant offset from 0x406AF3, but regardless, here it is:
(searches the virus body for the original bytes by using a signature; it also repairs the call by writing the original bytes directly to the mapped file):
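A minimal sketch of this search-and-repair step in Python follows. The signature bytes and the position of the saved host bytes relative to the signature are not reproduced in this text, so both are passed in as parameters; `pattern` and `saved_delta` are hypothetical placeholders and the function names are made up for this example.

```python
def find_signature(data, pattern):
    """Offset of the first match of pattern in data, or -1.

    pattern is a sequence of byte values; None entries act as wildcards.
    """
    n = len(pattern)
    for off in range(len(data) - n + 1):
        if all(p is None or data[off + i] == p for i, p in enumerate(pattern)):
            return off
    return -1

def restore_call(file_image, call_file_off, decrypted_body, pattern, saved_delta):
    """Copy the 5 saved host bytes over the injected call in the mapped file.

    file_image     -- bytearray holding the infected file
    call_file_off  -- file offset of the injected call found by the scanner
    decrypted_body -- decrypted virus body read from the debugged process
    saved_delta    -- assumed offset from the signature start to the saved bytes
    """
    hit = find_signature(decrypted_body, pattern)
    if hit < 0 or hit + saved_delta < 0:
        return False
    start = hit + saved_delta
    original = decrypted_body[start:start + 5]
    if len(original) != 5:
        return False
    file_image[call_file_off:call_file_off + 5] = original
    return True
```

Wildcards are supported because parts of a signature may differ between samples; whether the real signature needs them is not something this sketch asserts.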
The full EPO heuristics scanner, together with the Win32.CTX.Phage disinfector, is attached in the last section of this paper. Here is a screenshot from that application, as shown in Figure 10.
Figure 10. Screenshot of the EPO scanner.
I hope you have enjoyed this short article on EPO techniques. Of course, the disinfector discussed in this article only cancels virus injections: the virus still resides in the last section, but fortunately it will never be executed. However, this gives the reader an opportunity to add some kind of virus "overwriter"; it is really an easy job and a good task to undertake.
If you have any comments don't hesitate to contact the author. The author would also like to thank Satish Ks for moral support.
Piotr Bania is an independent IT Security/Anti-Virus Researcher from Poland with over five years of experience. He has discovered several highly critical security vulnerabilities in popular applications like RealPlayer. More information can be found on his website.
Here is the full source code of the scanner and disinfector. If you have problems with formatting, the full source and a precompiled binary are also available here or through the author's website.