
Thread: PDF file check question

  1. #1
    Member webtrol's Avatar
    Join Date
    Jan 2010
    Posts
    113

    Default PDF file check question

    Hi,
    I was just wondering: if someone asked you to make sure that a large PDF file is clean of anything malicious, how would you do it?
    (Assume the file has passed a virus scanner and legitimately contains some JS content, so scanning the source for the presence of JS is not enough.)
    This is a curiosity/non-urgent question for those with time on their hands to share their most secret white hat ways.

    Sin-cerely,
    Trol (trolling the Forum since OMG ago)

  2. #2
    Senior Member streaker69's Avatar
    Join Date
    Jan 2010
    Location
    Virginville, BlueBall, Bird In Hand, Intercourse, Paradise, PA
    Posts
    3,535

    Default

    There's a GPO template for AD that disables JS on all clients of AD. So far, I haven't found a single PDF that's actually needed JS.
    A third party security audit is the IT equivalent of a colonoscopy. It's long, intrusive, very uncomfortable, and when it's done, you'll have seen things you really didn't want to see, and you'll never forget that you've had one.

  3. #3
    Member webtrol's Avatar
    Join Date
    Jan 2010
    Posts
    113

    Default

    Quote Originally Posted by streaker69 View Post
    There's a GPO template for AD that disables JS on all clients of AD. So far, I haven't found a single PDF that's actually needed JS.
    Nice. After reading this I looked it up and learned a cool new thing, thank you.

    But that protects you from a bad PDF; it doesn't tell you IF the PDF was BAD.
    Also, if you then forward such a PDF into the wild, it could contaminate others.

    Soooo, all the great white hats with too much time on your hands, how do you (if you do) ensure that a PDF does not contain a cool new exploit?

    Sin-cerely,
    Trol

  4. #4
    Senior Member streaker69's Avatar
    Join Date
    Jan 2010
    Location
    Virginville, BlueBall, Bird In Hand, Intercourse, Paradise, PA
    Posts
    3,535

    Default

    I know this may not be the answer you want to hear, but my personal opinion is that someone else's network, and whether it is vulnerable to an exploit, is not really my concern. If it's a 0-day, it's a 0-day; chances are that nothing I have available is going to detect it. Eventually it will be detected as updated definitions are deployed and the network is scanned during its normal cycle.

    The only real solution to this would be to quarantine all attachments until definitions are available to scan them. This is of course disruptive to business workflow, so it's not a really good solution. You could always manually look at every PDF that comes in, if you have nothing else to do with your time; I really don't have the time to do that myself.

  5. #5
    Good friend of the forums
    Join Date
    Feb 2009
    Posts
    356

    Default

    JS is not the only "bad" thing in a PDF. The most dangerous ones out there actually exploit the reader itself, and you can't make sure a file is clean unless you open it in a hex editor... and even then, if it's large, you're probably screwed. So no, you can't protect yourself - UPDATE your reader software and hope for the best!!!!

  6. #6
    Super Moderator lupin's Avatar
    Join Date
    Jan 2010
    Posts
    2,943

    Default

    Here's the method that I use in analysing malicious PDFs:

    I use the tools pdfid and pdf-parser from here. In the past I have also used pdftk, but I'm finding that less useful recently.

    The process:
    1. Use pdfid to analyse the pdf document. pdfid can tell you if a pdf has Javascript included as well as autorun functionality and how many pages it has. A one page document with Javascript and autorun functionality is suspicious.
    2. If Javascript is present, extract it from the document to determine its purpose. Sometimes the Javascript is included in plain text, in which case you can just use the strings utility to extract it. Otherwise, you can use pdf-parser to extract certain types of encoded Javascript.
    3. Malicious Javascript often contains obfuscation to disguise its true purpose. To remove this obfuscation I modify the script a little to allow easier debugging (e.g. assign the code from eval statements to a variable instead) and use the Rhino Javascript debugger to show me how the code is transformed as it runs.
    4. Many of the Javascript-based PDF exploits involve buffer overflows, and the shellcode is often in unicode format (%uHHHH escapes). I have a Perl script that I wrote to convert this type of shellcode to a C program (really just C-style shellcode with some wrapper code), which can then be compiled and analysed further using standard binary analysis techniques. I can post the script if anyone wants it.
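
    As a rough illustration of the triage idea in step 1, here is a minimal Python sketch that counts a few PDF names in the raw bytes of a file. It is not pdfid and does not reproduce its output; the name list, `count_pdf_names` and `looks_suspicious` are hypothetical helpers chosen for this example, and the sketch assumes the names appear unobfuscated and outside compressed streams.

```python
import re

# Names that commonly matter when triaging a PDF. This list is an
# illustrative assumption, not the exact set any particular tool reports.
SUSPICIOUS_NAMES = [b"/JS", b"/JavaScript", b"/OpenAction", b"/AA", b"/Launch"]


def count_pdf_names(data: bytes) -> dict:
    """Count raw occurrences of selected PDF names plus a rough page count."""
    counts = {}
    for name in SUSPICIOUS_NAMES:
        # The lookahead stops a name from matching a longer name
        # that merely starts with it (e.g. /AA inside /AAB).
        pattern = re.escape(name) + rb"(?![A-Za-z0-9])"
        counts[name.decode()] = len(re.findall(pattern, data))
    # Count /Page entries only; the lookahead excludes /Pages.
    counts["/Page"] = len(re.findall(rb"/Page(?![A-Za-z0-9])", data))
    return counts


def looks_suspicious(counts: dict) -> bool:
    """Flag the step 1 pattern: at most one page, Javascript, and an auto-run trigger."""
    has_js = counts["/JS"] > 0 or counts["/JavaScript"] > 0
    has_autorun = counts["/OpenAction"] > 0 or counts["/AA"] > 0
    return counts["/Page"] <= 1 and has_js and has_autorun
```

    Reading a file with `open(path, "rb").read()` and passing the bytes to `count_pdf_names` gives a quick first impression. A clean result proves nothing, since names can be hex-escaped or hidden inside compressed object streams, which is where the real tools earn their keep.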


    I will note that PDF exploits are possible without Javascript, but in practice most of the ones out in the wild seem to use it. Certainly the ones I have seen have it.
    Capitalisation is important. It's the difference between "Helping your brother Jack off a horse" and "Helping your brother jack off a horse".

    The Forum Rules, Forum FAQ and the BackTrack Wiki... learn them, love them, live them.

  7. #7
    Member webtrol's Avatar
    Join Date
    Jan 2010
    Posts
    113

    Default

    Thank you Lupin for the reply. Between your information and that from streaker (xorred, my thanks also), my escapade into PDF documents might end up being successful (since the goal/subject was selected for fun, I also get to define success, which is handy).

    Quote Originally Posted by lupin View Post
    I can post the script if anyone wants it.
    I would love to see that script


    Sin-cerely,
    Trol

  8. #8
    Super Moderator lupin's Avatar
    Join Date
    Jan 2010
    Posts
    2,943

    Default

    Quote Originally Posted by webtrol View Post
    I would love to see that script
    I knocked this together in the middle of an incident and haven't had a chance to tidy it up, so be warned: it's pretty rough. You basically just run it at the command line with the JS shellcode as a parameter and it spits out a C program that you can compile.

    Code:
    #!/usr/bin/perl
    # Takes shellcode in javascript unicode coded format as a parameter and prints C code to STDOUT that can be compiled into a windows executable (gcc code.c -o code.exe).  Borrows metasploit c code for compiling shellcode.
    use strict;
    use warnings;
    
    # shellcode here in format "%uHHHH%uHHHH" where HH is a hexadecimal byte value
    my $jsshellcode = defined $ARGV[0] ? $ARGV[0] : '';
    
    if ($jsshellcode eq "") { die("Enter the javascript shellcode as the first parameter to this script in format %uHHHH%uHHHH...\n\n"); }
    
    my $code = '';
    
    foreach my $part (split /%/, $jsshellcode) {
    	next if $part eq "";	# skip the empty field before the first '%'
    	# encoding is little endian so we swap the order of the encoded bytes
    	$code .= '\x' . substr($part, 3, 2);
    	$code .= '\x' . substr($part, 1, 2);
    }
    
    print 'char code[] = "' . $code . '";' . "\n\n";
    
    # non-interpolating heredoc: the C wrapper is printed verbatim
    print <<'CODE';
    int main(int argc, char **argv)
    {
    	int (*funct)();
    	funct = (int (*)()) code;
    	(int)(*funct)();
    }
    
    CODE
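
    For anyone who wants to check the byte swapping by hand, the same decoding step can be sketched in Python. This is not a port of the script above: `decode_js_unicode` and `to_c_string` are hypothetical names for this example, and unlike the Perl version the sketch silently skips anything that is not a well-formed %uHHHH escape.

```python
import re


def decode_js_unicode(escaped: str) -> bytes:
    """Turn '%uHHHH%uHHHH...' into the bytes it represents in memory."""
    out = bytearray()
    for hex_word in re.findall(r"%u([0-9A-Fa-f]{4})", escaped):
        value = int(hex_word, 16)
        # Each %uHHHH is a 16-bit value stored little endian:
        # low byte first, then high byte.
        out.append(value & 0xFF)
        out.append(value >> 8)
    return bytes(out)


def to_c_string(raw: bytes) -> str:
    """Format bytes as the body of a C string literal."""
    return "".join(f"\\x{b:02x}" for b in raw)
```

    For example, `decode_js_unicode("%u4241%u4443")` returns `b"ABCD"`: 0x4241 is stored as 0x41 then 0x42, which matches the order in which the Perl script emits its `\x` pairs.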

  9. #9
    Member webtrol's Avatar
    Join Date
    Jan 2010
    Posts
    113

    Default

    Thank you kindly good sir!

    Sin-cerely,
    Trol

  10. #10
    Super Moderator lupin's Avatar
    Join Date
    Jan 2010
    Posts
    2,943

    Default

    Something relevant that I just found on the Internet Storm Center blog: Lenny Zeltser's guide to analysing malicious documents, including PDFs!

    There are some usage command lines for some of the tools I mentioned (not the same command lines I have used, but still useful).

    There are also a number of tools listed there that I hadn't heard of before, as well as a guide to analysing Microsoft Office documents, which I haven't had to do so far.

    Analyzing Malicious Documents Cheat Sheet by Lenny Zeltser


    Edit: Documented my PDF analysis process in more detail on my blog here.
