Thread: Data Mining

  1. #11
    Just burned his ISO
    Join Date
    Jun 2010
    Posts
    5

    Default Re: Data Mining

    Quote Originally Posted by Thorn View Post
    OK, now that we've cleared that up...

    First of all, some purists might argue that what you're asking isn't really about pen testing per se, but more about the goal after you've penetrated the system/network.

    Personally, however, I happen to think that it's a very important piece of what we do. Finding a given vulnerability might impress someone in the IT department, and it might be rectified (someday) when time and money are available, but telling the CEO you found a vulnerability on Port 173 on the server will make him start yawning in the middle of your presentation of the findings.

    On the other hand, getting some information that is vital to the company (e.g. customers' credit card numbers) is the kind of thing that makes C-level people sit up and take notice, and you can see them get heartburn right in front of you as they think about having to explain the potential loss to the board of directors. That's the kind of finding that will actually get things fixed.

    However, my impression is that you have identified some potential vulnerabilities, but don't know exactly what you want to find.

    What you need to find can only be answered by determining the goal, and that is determined by asking, "What kind(s) of things can the client not afford to lose without disastrous consequences?" It may be one type of data, say, the big proprietary company secret (think of the formula for Coca-Cola), or multiple data types such as patient health data and/or patient credit cards, or it could be something other than data, such as taking over or disrupting the process control for a chemical plant.

    Of course, once you've determined what the goal is, you have to ask, "Where does it live?" After all, looking at a secretary's PC and reading her tweets about how drunk she got last weekend and what she did with the five sailors may be entertaining (look for pictures!), but it isn't going to help you track down spreadsheets with the CFO's projections for next year or the secret plans for a potential stock split.

    So ask yourself, are you looking at the CIO's workstation, or the workstation of an engineering team? Small servers running Windows or *nix? How about IBM iSeries, or even AS/400s and mainframes? (Yes, there are still AS/400s out there holding a lot of data...) Or SCADA PLCs and RTUs?

    Now that you've got those questions answered, you can determine what tools (if any) you can use. It may be a matter of using a commercial tool such as Tripwire; you may be able to just do a simple command line wildcard search for something as simple as a particular file type; or perhaps you'll need to craft some custom packets using Scapy to make an RTU turn off a pump.

    Once you answer those questions, "What is the goal?" and "Where does the data live that we need to find to achieve the goal?", you can start to determine which tools you'll need. But until you have some direction, searching for any useful data will be akin to searching for a black cat in a cellar at midnight without a flashlight.
    First off, my apologies for calling you Thor.

    Thanks for your well-thought-out response. I completely agree with you about the level/type of findings dictating the response, and that is the motivation behind my idea. I also liked your analogy about the cat.

    I actually have a pretty good idea of what I'm looking for. As for where I'm looking for it, that is simply everywhere I can look. The goal is to ensure that potentially dangerous data is not present on machines where it shouldn't be (and in my case it shouldn't be anywhere within the scope of testing). For example, I am browsing through the contents of a compromised box and discover xyz through manual searching. That's all well and good, but it doesn't scale well at all. I definitely want to check out that loot feature that Lupin was talking about, though. Can you think of any other strategies for approaching this type of task?

    Thanks

  2. #12
    Senior Member Thorn's Avatar
    Join Date
    Jan 2010
    Location
    The Green Dome
    Posts
    1,509

    Default Re: Data Mining

    A typo of my name isn't going to get me upset. I get called a lot worse most days.

    Without knowing the specifics of what you're looking for, it's difficult to point you at any one tool or even a set of tools. It really depends on what you're looking to find. Most tools concentrate on a given class of data, such as credit card numbers for PCI compliance.

    However, one thing to remember about generic searches is that they almost always boil down to looking for a signature of some type. So even if there isn't a tool that covers what you need, if you can figure out a pattern to what you're seeking, then you may be able to write a script that automates your search across multiple boxes using that pattern. For a simple example, consider that US Social Security numbers are 9 digits long, and are usually expressed in this format: NNN-NN-NNNN. If you were seeking SSNs, the script would have to search both for strings of 9 digits without the dashes and for the format with the dashes, to cover storage in either the common format or one that is reduced to just numerals.
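
A minimal sketch of that SSN signature in Python, purely to illustrate the logic. The names `SSN_PATTERN` and `find_ssn_candidates` are made up for this example, and the pattern only checks the shape of the number, so other nine-digit values will also match.

```python
import re

# Matches NNN-NN-NNNN or nine bare digits, but not digits that are
# part of a longer run (the lookarounds reject a digit on either side).
SSN_PATTERN = re.compile(r"(?<!\d)(?:\d{3}-\d{2}-\d{4}|\d{9})(?!\d)")


def find_ssn_candidates(text):
    """Return every substring of text shaped like an SSN in either format."""
    return SSN_PATTERN.findall(text)


sample = "id 123-45-6789, raw 987654321, phone 5551234567"
print(find_ssn_candidates(sample))  # prints ['123-45-6789', '987654321']
```

The ten-digit phone number is skipped because of the digit-boundary checks, which is a small instance of trading recall for fewer false positives.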

    Feed your script a list of boxes, and it would log onto a box, do a search, log the results, and move on to the next box. After that, all you have to do is look at the log file. Sure, it might miss data that was formatted oddly, but it would get most of the data that might have been compromised by a sloppy user, or one with evil intent.
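
A rough sketch of that search-and-log loop, with one big assumption: the "log onto a box" step is left out entirely, and each entry in `roots` is simply a directory that is already reachable (for example a mounted share or a copy of collected files). The function name `scan_roots`, the CSV layout, and the dashed-SSN signature are choices made for this example only.

```python
import csv
import os
import re

# Assumption: the dashed SSN format from the post is the signature of interest.
SIGNATURE = re.compile(r"(?<!\d)\d{3}-\d{2}-\d{4}(?!\d)")


def scan_roots(roots, log_path, pattern=SIGNATURE):
    """Walk each root, search every readable file, and write hits to a CSV log.

    Returns the number of hits written.
    """
    hits = 0
    with open(log_path, "w", newline="", encoding="utf-8") as log:
        writer = csv.writer(log)
        writer.writerow(["root", "file", "line", "match"])
        for root in roots:
            for folder, _dirs, files in os.walk(root):
                for name in files:
                    path = os.path.join(folder, name)
                    try:
                        with open(path, "r", encoding="utf-8", errors="ignore") as fh:
                            for lineno, line in enumerate(fh, start=1):
                                for match in pattern.findall(line):
                                    writer.writerow([root, path, lineno, match])
                                    hits += 1
                    except OSError:
                        continue  # unreadable file: skip it and move on
    return hits
```

Keep the log file outside the directories being scanned, otherwise the scan will pick up its own output.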

    The same principle could be applied to a search for credit card numbers, or a keyword search for specific proprietary data. Just remember that the more specific you are about the "signature", the fewer false positives you'll get, but you may miss data that is formatted oddly or misspelled.
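
To illustrate the specificity trade-off for card numbers, here is a hedged sketch that pairs a loose digit pattern with the Luhn checksum, a standard check that payment card numbers satisfy. The names `CARD_CANDIDATE`, `luhn_valid`, and `find_card_numbers` are invented for this example, and the 13 to 16 digit range is an assumption about which card lengths matter.

```python
import re

# Loose signature: 13 to 16 digits, optionally separated by single spaces or dashes.
CARD_CANDIDATE = re.compile(r"(?<!\d)\d(?:[ -]?\d){12,15}(?!\d)")


def luhn_valid(digits):
    """Return True if a string of digits passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit, counting from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_card_numbers(text):
    """Return digit strings that match the loose pattern and pass the Luhn check."""
    results = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            results.append(digits)
    return results
```

The checksum discards most random digit runs (order numbers, timestamps) that the pattern alone would report, at the cost of missing numbers that were mistyped or truncated when stored.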
    Thorn
    Stop the TSA now! Boycott the airlines.

  3. #13
    Very good friend of the forum hhmatt's Avatar
    Join Date
    Jan 2010
    Posts
    660

    Default Re: Data Mining

    Quote Originally Posted by Thorn View Post
    A typo of my name isn't going to get me upset. I get called a lot worse most days.
    Whatever they say, they are wrong.

    I would inquire as to whether a search like this would have to be platform specific. Also I can see how it may be quite unreliable due to compression, encryption, or other means of obfuscating files and/or file structures. I would call this a "data risk assessment" just to be technical about it.
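
On the compression point, some containers can at least be opened before searching. The sketch below uses Python's standard `zipfile` module to search each member of a ZIP archive; `search_zip` and `DASHED_SSN` are names made up for this example. Encrypted members are skipped rather than cracked, and text that is wrapped in markup inside a member may still be missed.

```python
import re
import zipfile

# Example signature only; pass any compiled pattern without capture groups.
DASHED_SSN = re.compile(r"(?<!\d)\d{3}-\d{2}-\d{4}(?!\d)")


def search_zip(source, pattern=DASHED_SSN):
    """Search every member of a ZIP archive and return (member, match) pairs.

    source may be a file path or a binary file-like object.
    """
    hits = []
    with zipfile.ZipFile(source) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            try:
                data = archive.read(info)
            except (RuntimeError, NotImplementedError):
                continue  # encrypted or unsupported member: cannot be searched
            text = data.decode("utf-8", errors="ignore")
            for match in pattern.findall(text):
                hits.append((info.filename, match))
    return hits
```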

    Good luck and happy hacking!

  4. #14
    Senior Member Thorn's Avatar
    Join Date
    Jan 2010
    Location
    The Green Dome
    Posts
    1,509

    Default Re: Data Mining

    Quote Originally Posted by hhmatt View Post
    Whatever they say, they are wrong.
    Thanks, although it's usually the people who know me pretty well.

    Quote Originally Posted by hhmatt View Post
    I would inquire as to whether a search like this would have to be platform specific.
    Exactly, which is why I previously mentioned things like mainframes, IBM iSeries, and AS/400s. Those things are still the workhorses in a lot of large organizations like banks. Searching on those guys is very platform specific.

  5. #15
    Super Moderator lupin's Avatar
    Join Date
    Jan 2010
    Posts
    2,943

    Default Re: Data Mining

    This is an interesting topic, and I think it's actually a bit more complicated than it may at first seem, especially if you are looking for a solution that is reusable and will work across a number of types of exploited hosts. Any tool you use obviously needs to be easy to transfer to exploited boxes, which means ideally it should be small. It also needs to be able to run on the boxes in question, so you need to think about which functions/libraries/interpreters the target operating system supports, as well as which architecture the system uses if you are considering compiled code.

    If we assume that you want to focus on Windows boxes, you might want to try to create something that makes use of the built-in Windows search functionality. This supports a number of plugins that allow it to search within particular file formats for particular words (for example, when you install Acrobat Reader it includes the PDF IFilter library, which gives the MS indexing client easy access to the contents of PDF files). Without functionality like this you would probably be limited to more basic signature/grep-style checks, which would probably be more portable but would limit the types of files in which you could search for your data. So deciding how to do this searching also means deciding which file types you want to search within, or which file types are most likely to contain the type of data you are interested in. If your target data is likely to exist within files AND be stored in those files in a plain text format, then a simple regular-expression-style search may be your best bet.
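
One way to act on that last point is to sort files into those a plain regex can handle and those that would need a format-aware extractor (such as an IFilter) first. The heuristic below is only a sketch: the function names, the 4096-byte sample, and the 90% threshold are all assumptions, and UTF-16 text will be classed as binary because it contains NUL bytes.

```python
# Printable ASCII plus tab, carriage return, and newline.
TEXT_BYTES = bytes(range(32, 127)) + b"\t\r\n"


def looks_like_plain_text(sample, threshold=0.9):
    """Guess whether a byte sample is plain text: no NULs and mostly printable."""
    if not sample:
        return True
    if b"\x00" in sample:
        return False
    printable = sum(byte in TEXT_BYTES for byte in sample)
    return printable / len(sample) >= threshold


def classify_file(path, sample_size=4096):
    """Label a file 'regex' if a plain pattern search is plausible, else 'needs-extractor'."""
    with open(path, "rb") as fh:
        sample = fh.read(sample_size)
    return "regex" if looks_like_plain_text(sample) else "needs-extractor"
```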
    Capitalisation is important. It's the difference between "Helping your brother Jack off a horse" and "Helping your brother jack off a horse".

    The Forum Rules, Forum FAQ and the BackTrack Wiki... learn them, love them, live them.

  6. #16
    Just burned his ISO
    Join Date
    Jun 2010
    Posts
    5

    Default Re: Data Mining

    My thought progression is very similar to yours. It became evident fairly quickly that there would be limitations in performing such a task. To start with, I am aiming for plaintext capabilities through regular expressions and support for "the meat and potatoes" like MS Word, PDF, etc. Initially, as you stated, my goal is to focus on a somewhat specific OS and platform. The application footprint also needs to be small and portable. The first thing that came to mind was a C program with whatever supporting libraries it needs linked in.

    I think IFilter and other capabilities like this, combined with native searching functionality and regexes, would be a vast improvement over the current process. What would you implement this in? Also, how would you handle incorporating these plugins into the main application?

    Thanks Lupin

  7. #17
    Super Moderator lupin's Avatar
    Join Date
    Jan 2010
    Posts
    2,943

    Default Re: Data Mining

    Hmm, well, compiled languages are not my specialty, but it's kind of hard to suggest anything else when it comes to Windows. You can't guarantee that interpreters for the main text-based interpreted languages (Perl, Python, Ruby) will be present on a Windows system. Java is likely to be present, but it's slow and clunky. JScript/VBScript/PowerShell/batch files will probably run but are unlikely to be capable of doing what you want. So that leaves a compiled language, probably C or C++.

    Adding your chosen functionality as an extension to Meterpreter sounds like a good approach to me, because Meterpreter itself can take care of most of the additional functionality you need such as a user interface, communications channel, etc, but I haven't looked into this enough to give good advice on how easy/difficult this would be to do.

  8. #18
    Just burned his ISO
    Join Date
    Jan 2010
    Location
    Secret Undisclosed Tiki Bar
    Posts
    1

    Default Re: Data Mining

    OP, though you seem to not want to use dir, find, grep, ls and other simple commands, I have found in my experience that those simple commands placed in a batch file or shell script can do the job quite nicely to get a basic list from which you can narrow down and fine tune your data mining.
    For example, one of the first commands I run once I have established a permanent presence on the box is a "Dirwalk" that gets a list of every file and directory on the specified drive:

    (These are all Windows-centric)

    dir c:\ /A /S (/A to include files with any attribute, such as hidden and system files, and /S to recurse into subdirectories)

    The drawback is that this is rather CPU/Memory intensive. A quicker scan would use the old tree command:

    tree c:\ (lists all directories and subdirectories under c:)

    From that, I look for interesting directories (i.e. c:\cc_dbase and so on) and then dirwalk through those specifically.
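
That triage step can be scripted too. The sketch below walks a directory tree and flags directories whose names contain a keyword; `interesting_dirs` and the keyword list are hypothetical and would need tuning to the goal of the engagement. Short keywords like "cc" will also hit names such as "accounts", so expect to review the list by hand.

```python
import os

# Hypothetical keywords; adjust them to whatever the client cannot afford to lose.
KEYWORDS = ("cc", "card", "payroll", "finance", "backup", "dbase")


def interesting_dirs(root, keywords=KEYWORDS):
    """Yield directories under root whose own name contains any keyword."""
    for folder, dirs, _files in os.walk(root):
        for name in dirs:
            lowered = name.lower()
            if any(word in lowered for word in keywords):
                yield os.path.join(folder, name)
```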

    If you want to narrow or refine just to a specific set of files or file types, I use the following:

    dir c:\*.rtf /S
    dir c:\*.pdf /S
    dir c:\*.xls /S
    dir c:\*.doc /S
    dir c:\*.ppt /S
    dir c:\*.vsd /S

    You can of course substitute other extensions.
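
For comparison, the same per-extension listing can be done in a single pass over the tree. This is a sketch in Python rather than batch, assuming an interpreter is available wherever it runs; `files_by_extension` and `EXTENSIONS` are names chosen for this example, with the extension set copied from the commands above.

```python
import os

# Extensions taken from the dir commands in the post.
EXTENSIONS = {".rtf", ".pdf", ".xls", ".doc", ".ppt", ".vsd"}


def files_by_extension(root, extensions=EXTENSIONS):
    """Walk root once and group matching file paths by lowercase extension."""
    found = {ext: [] for ext in sorted(extensions)}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            ext = os.path.splitext(name)[1].lower()
            if ext in found:
                found[ext].append(os.path.join(folder, name))
    return found
```

Walking the drive once instead of once per extension matters on large volumes, where each recursive listing can take a long time.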

    I put these into an interactive batch file and run the commands I need depending on the box in question. I also redirect the output to a temp file so I can review the output later on.

    Hope this helps...

    Cheers,
    cybrsnpr

  9. #19
    Senior Member Thorn's Avatar
    Join Date
    Jan 2010
    Location
    The Green Dome
    Posts
    1,509

    Default Re: Data Mining

    Quote Originally Posted by cybrsnpr View Post
    A quicker scan would use the old tree command:

    tree c:\ (lists all directories and subdirectories under c:)
    Tree! Now you're really getting old school! A man after my own heart.

    BTW, I also like looking for MS Access files: *.mdb and *.accdb. There's a lot of juicy info in databases.

  10. #20
    Super Moderator lupin's Avatar
    Join Date
    Jan 2010
    Posts
    2,943

    Default Re: Data Mining

    Some good suggestions for search targeting there, cybrsnpr and Thorn, and if we are "going native" so to speak, there are also the "findstr" and "find" commands on Windows to search for text strings in files (findstr does regex, albeit a limited subset!).
