Thread: Extract Excel metadata

  1. #1

    Good afternoon,

    I'm working on a tool that I later hope to have integrated into Backtrack. It's a very useful tool. Anyway, can someone assist me with extracting Excel file metadata? I am able to extract metadata from any other file type, but I'm having the most difficulty with Excel files. Assistance with PowerPoint metadata extraction on Linux would also be helpful.

    I've conducted various searches to find more information on this, but I keep running into a dead end. DOC files work fine; however, Excel and PowerPoint files do not.

    Thanks.

    EDIT: I can extract metadata from Excel files, but there is only a limited amount of information I can get back. I have opened the file manually and located additional metadata (which also was discovered by FOCA), but I cannot do this with Linux. >_<
    Last edited by altjx; 02-20-2013 at 04:23 AM.

  2. #2
    thorin

    So lacking in detail.

    1) Are we talking about XML-based files (i.e., .xlsx) or proprietary files (.xls)?
    2) What have you tried?
    3) Are you doing all of this via shell scripting or programmatically? If programmatically, what language and libraries are you leveraging?
    I'm a compulsive post editor, you might wanna wait until my post has been online for 5-10 mins before quoting it as it will likely change.

    I know I seem harsh in some of my replies. SORRY! But if you're doing something illegal or posting something that seems to be obvious BS I'm going to call you on it.

  3. #3

    I have simply used the extract command (extract -V <file>). I am able to grab metadata from every other file type, and even from .xls and .xlsx files, but only to a limited extent.

  4. #4
    thorin

    Interesting, I was unaware of that package.

    It looks like you might simply be bound by the limitations of the program. I don't have a BT box handy on which I can check the version details; however, looking at the Ubuntu package archive, both 10.04 and 12.04 seem to use libextractor 0.5.23 (extract -v). Looking at http://www.gnu.org/software/libextractor/, I see that there have been quite a few point releases since then, plus a full 1.0 and a 1.0.1 point release. If you're brave, you could grab an updated copy and see if it produces any better results for you.

    I did a quick test on an Ubuntu 10.04 box with 0.5.23 and got much more data out of a Word document (.docx) than an Excel file (.xlsx), so it could just be that Excel files don't provide the level of detail that other Office files do.
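
    If the files in question are .xlsx, one thing that may be worth trying is reading the properties directly. An .xlsx (like .docx and .pptx) is a ZIP archive, and the document properties normally sit in docProps/core.xml and docProps/app.xml. Below is a rough Python sketch, not a finished tool: it assumes a well-formed package, the function name read_ooxml_props is made up for illustration, and it only collects top-level elements that carry text. It will not work on legacy .xls or .ppt files, which use a different container format.

```python
import sys
import zipfile
import xml.etree.ElementTree as ET

# Property parts that an OOXML package (.xlsx, .docx, .pptx) usually carries.
PROPERTY_PARTS = ("docProps/core.xml", "docProps/app.xml")


def read_ooxml_props(path):
    """Return {local tag name: text} for top-level document properties.

    Hypothetical helper for illustration; nested values such as the
    sheet-name vectors in app.xml are skipped.
    """
    props = {}
    with zipfile.ZipFile(path) as package:
        names = set(package.namelist())
        for part in PROPERTY_PARTS:
            if part not in names:
                continue  # some producers omit a part; skip rather than fail
            root = ET.fromstring(package.read(part))
            for element in root:
                # Tags look like "{namespace}creator"; keep the local name only.
                tag = element.tag.rsplit("}", 1)[-1]
                if element.text and element.text.strip():
                    props[tag] = element.text.strip()
    return props


if __name__ == "__main__":
    # Usage (assumed script name): python ooxml_props.py report.xlsx slides.pptx
    for path in sys.argv[1:]:
        print(path)
        for key, value in sorted(read_ooxml_props(path).items()):
            print(f"  {key}: {value}")
```

    Typical keys, when the producing application wrote them, include creator, lastModifiedBy, created, modified, Application and Company.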

    As an alternative you could also use the file command or run strings on the file and grep for the important parts.
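
    If you would rather do the strings-and-grep step from inside a script, here is a small Python sketch of the same idea. It is only an approximation of strings piped into grep -i: it looks for runs of printable ASCII, so UTF-16 text will be missed, and the function names are made up for illustration. Note that this is mostly useful for legacy .xls files; the contents of an .xlsx are compressed, so scanning the raw file is unlikely to turn up much.

```python
import re


def printable_strings(data, min_len=4):
    """Yield runs of printable ASCII at least min_len bytes long.

    Roughly what the strings utility does with its default settings.
    """
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.group().decode("ascii")


def grep_strings(path, keyword, min_len=4):
    """Return the printable strings in a file that contain keyword (case-insensitive)."""
    needle = keyword.lower()
    with open(path, "rb") as handle:
        data = handle.read()
    return [s for s in printable_strings(data, min_len) if needle in s.lower()]
```

    For example, grep_strings("budget.xls", "microsoft") would list every printable run mentioning Microsoft; the file name here is just a placeholder.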

  5. #5

    Yeah, I agree. It looks like there are some limitations going on.

    I appreciate your response, man.
