-
Shellcode
Ok, I was just wondering. When someone writes a new exploit, the shellcode usually looks like
\xff\xca\x23\x5b\x54\xba etc., a bunch of abstract characters that (usually) end in \x2f\x62\x69\x6e\x2f\x73\x68 (ASCII /bin/sh). Now before you reply, yes, I know what hex is.
What I was wondering was what all the preceding hex stuff is. Is it used to control the EIP or what? And also, how do the shellcoders know what characters to use? It almost seems like you would have to know machine code to use the correct sequence of characters. I was just musing about this earlier...
-
Hi
The bunch of abstract characters that ends in "/bin/sh"
is the hexadecimal representation of machine instructions. People
writing shellcode first create a small assembler program. Once it is
assembled, the hexadecimal representation is right there (just open the
executable in a hex editor, or use a runtime debugger, for example OllyDbg).
After a while, and if the code is not too complex, one can actually
"read" the hex. For example, taking your example as "real":
Code:
\xff\xca could represent DEC EDX (although 0xff could be problematic)
\x23\x5b\x54 could represent AND EBX, DWORD PTR DS:[EBX+54]
\xba could represent MOV EDX, ...
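To show how such a reading comes about, here is a minimal Python sketch that splits the second byte of the first two examples (the ModRM byte) into its bit fields. The names REGS32 and split_modrm are made up for this illustration, and it only covers the simple 32-bit case, not the full x86 encoding rules.

```python
# Register numbering used by 32-bit x86 in the ModRM fields.
REGS32 = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]

def split_modrm(byte):
    """Split a ModRM byte into its (mod, reg, rm) bit fields."""
    mod = (byte >> 6) & 0b11    # addressing mode
    reg = (byte >> 3) & 0b111   # register, or opcode extension
    rm = byte & 0b111           # register or memory base
    return mod, reg, rm

# \xff\xca: opcode FF, ModRM CA.
# mod == 3 means a plain register operand; with opcode FF the
# reg field acts as an opcode extension, and 1 selects DEC.
mod, reg, rm = split_modrm(0xCA)
print(mod, reg, REGS32[rm])           # 3 1 EDX

# \x23\x5b\x54: opcode 23 (AND r32, r/m32), ModRM 5B, then one more byte.
# mod == 1 means an 8-bit displacement follows (here 0x54).
mod, reg, rm = split_modrm(0x5B)
print(mod, REGS32[reg], REGS32[rm])   # 1 EBX EBX
```

So CA decodes to "register operand EDX" and 5B to "EBX, memory at EBX plus a one-byte offset", which matches the two readings above.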
As for the "question" of why shellcode often ends with the hexadecimal
version of "/bin/sh": the declaration and definition of the string, like
Code:
shell db "/bin/sh",0
is at the end of the assembler source, so its bytes come last.
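A quick way to convince yourself that the tail really is just that string: a small Python sketch that decodes the bytes from the question and rebuilds what a db line like the one above would emit. The helper name db_string is hypothetical, and it assumes plain ASCII with a single zero terminator.

```python
# The trailing bytes quoted in the question.
tail = b"\x2f\x62\x69\x6e\x2f\x73\x68"
text = tail.decode("ascii")
print(text)  # /bin/sh

def db_string(s):
    """Bytes that a line like: db "<s>",0 would place in the output."""
    return s.encode("ascii") + b"\x00"

# Same bytes as the tail, plus the terminating zero byte.
print(db_string("/bin/sh").hex())  # 2f62696e2f736800
```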
Cheers.
-
Just a little filler...
Assembly is a language that an assembler (the equivalent of a compiler) takes and converts into the binary instructions, usually written out in hexadecimal, that the microprocessor actually runs. When the assembler 'compiles' the program, each instruction is encoded as an OPCODE followed by its arguments (as dictated by the opcode), and those bytes are the hexadecimal numbers you see. You can go backwards of course, but it can be time consuming to do so (at least manually), so tools like those sec_ware mentioned (IDA or OllyDbg) can be useful for this...
Would recommend that you get whatever assembly/operations book is available for the processor you are interested in (the assembly/opcodes will be different for each processor family, e.g., SPARC is different from i386, which is different from MIPS). This would help you in that respect, and would also show the syntax and break down how the hex number is generated from the instruction.
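As a rough illustration of "opcode plus arguments", here is a minimal Python sketch that encodes one i386 instruction form by hand, MOV r32, imm32 (opcode B8 plus the register number, followed by the 32-bit value in little-endian order). The names REG_NUMBERS and encode_mov_imm32 are invented for the example, and it handles only this single form.

```python
import struct

# Register numbers as used in 32-bit x86 instruction encoding.
REG_NUMBERS = {"EAX": 0, "ECX": 1, "EDX": 2, "EBX": 3,
               "ESP": 4, "EBP": 5, "ESI": 6, "EDI": 7}

def encode_mov_imm32(reg, value):
    """Encode MOV reg, imm32 as one opcode byte plus four value bytes."""
    opcode = 0xB8 + REG_NUMBERS[reg]              # B8 + register number
    return bytes([opcode]) + struct.pack("<I", value & 0xFFFFFFFF)

code = encode_mov_imm32("EDX", 0x11223344)
print(code.hex())  # ba44332211
```

Note the leading ba, the same byte sec_ware read as MOV EDX above, and the value stored low byte first.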
-
A book that gives a good explanation of shellcode creation is "Hacking: The Art of Exploitation". And for you paranoid types: no, the book does not teach you how to "hack"; it is intended as a tool for understanding how various exploits work. The book is actually quite useful if you are a programmer because it shows you many of the simple mistakes that can make programs exploitable, and how to avoid them.
I believe there are also books that deal solely with shellcode creation; you may wish to search for them.
ac
-
I was in the same place as you for a long time. I would look at the shellcode and be like, how the heck did they come up with that? In the past few weeks I've been messing with it and made some progress. If you're looking into being able to write exploits though, you really don't need to be able to make the hex strings yourself; there are lots of programs out there to do it for you, and you can also just use shellcode that someone else already came up with. But you're probably like me and don't care if it's available, you just want to know how to do it. If you need any help feel free to pm me. Also, the book gothic_type mentioned is a lot like aleph1's article about buffer overflows and the stack. Both give a way to make the shellcode using gdb on a program you write in C with inline assembly. Just search for "Smashing the Stack for Fun and Profit"; the part about using gdb to get the hex values is about halfway through. Good luck.
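Once you have read the byte values out of the debugger, turning them into the \x form is purely mechanical. A minimal Python sketch, assuming you already have the raw bytes in hand (the function name to_c_escape is made up, and the sample bytes are just the ones from the question):

```python
def to_c_escape(raw):
    """Format raw bytes as a \\xNN string, the way shellcode is usually posted."""
    return "".join("\\x{:02x}".format(b) for b in raw)

# Sample bytes taken from the question earlier in the thread.
raw = bytes([0xFF, 0xCA, 0x23, 0x5B, 0x54])
print(to_c_escape(raw))  # \xff\xca\x23\x5b\x54
```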