Search in office files on the command line with the free Open Source tool Swiss File Knife

sfk office file support

sfk can read the newer XML based office file formats
(Office Open XML and Open Document Format), which have
been a standard since about 2007, with these
filename extensions:

.docx .dotx .dotm .docb .xlsx .xlsm .xltx .xltm 
.pptx .pptm .potx .potm .ppam .ppsx .ppsm .sldx 
.sldm .odt .ods .odp .odg .odc .odf .odi 
.odm .ott .ots .otp .otg 

sfk cannot read older office file formats
like .doc, .xls or .ppt.
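
as a rough illustration of what reading such a file involves,
here is a minimal Python sketch. it is not how sfk works
internally. it assumes the file is a zip container holding
XML parts, as .docx, .xlsx and .odt files are. the names
OFFICE_EXTENSIONS, is_office_name and office_text are made
up for this example.

```python
import re
import zipfile
from pathlib import Path

# extensions copied from the list above
OFFICE_EXTENSIONS = set("""
.docx .dotx .dotm .docb .xlsx .xlsm .xltx .xltm
.pptx .pptm .potx .potm .ppam .ppsx .ppsm .sldx
.sldm .odt .ods .odp .odg .odc .odf .odi
.odm .ott .ots .otp .otg
""".split())


def is_office_name(filename):
    """True if the filename ends with one of the listed extensions."""
    return Path(filename).suffix.lower() in OFFICE_EXTENSIONS


def office_text(path):
    """Return rough plain text from the XML parts of a zip based office file.

    Assumption: the file is a zip archive and its text lives in .xml members.
    """
    chunks = []
    with zipfile.ZipFile(path) as archive:
        for name in archive.namelist():
            if name.endswith(".xml"):
                xml = archive.read(name).decode("utf-8", errors="replace")
                # replace every tag by a blank so that words stay separated
                chunks.append(re.sub(r"<[^>]+>", " ", xml))
    return "\n".join(chunks)
```

with this, is_office_name("old.doc") is false because .doc
is not in the list, and "myword" in office_text("in.docx")
is a crude stand-in for a word search in one file.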

supported commands:

  sfk olist mydir
     list all office files in folder mydir.

  sfk ofind mydir "/myword/"
     search office and plain text files in mydir
     containing the word 'myword'.

  sfk ofind mydir "/foo*bar/"
     search for foo followed by bar in the same line.
     for more info type: sfk ofind

  sfk ofilter in.xlsx -+foo
     filter content of a spreadsheet table
     for lines containing 'foo'
     for more info type: sfk ofilter

  sfk oload in.xlsx
     load and display content of in.xlsx

  sfk oload in.xlsx +xex "/*\tapple\t/*"
     find fields containing just 'apple'
     and get the whole row around them.

  sfk oload in.xlsx +filt -spat -+\tapple\t
     same as above, using +filter.
     for more info type: sfk oload

  sfk snapto=alldoc.txt -office mydir
     collect plain text, and text from office
     files like .docx .xlsx .odt .ods,
     from folder mydir into one file alldoc.txt
  sfk find alldoc.txt foo bar
     search alldoc.txt for lines containing
     the words foo and bar
  dview alldoc.txt
     browse and search alldoc.txt interactively
     with the Depeche View text file browser.
     for details see: sfk view

  sfk list mydir +fview -office
     similar to above, view office file contents
     and plain text contents in mydir, all in one
     window using Depeche View (dview.exe)

  dview -office mydir
     same as before, using dview directly.
     for more info type: sfk view

  sfk unzip in.xlsx -todir tmp
     extract all contents of in.xlsx
     into a folder tmp. -todir is important,
     otherwise you end up with many files
     in the current folder.

  sfk zip -rel out.xlsx tmp
     recreate an office file out.xlsx
     from all contents in tmp.
     -rel is important to strip folder
     name 'tmp' from the content filenames.
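
to show why the target folder and the relative names matter
in the last two commands, the same round trip can be sketched
in Python with the standard zipfile module. this is only an
illustration under the assumption that the office file is a
plain zip archive. the function names unpack and repack are
made up for this example and are not part of sfk.

```python
import zipfile
from pathlib import Path


def unpack(office_file, todir):
    """Extract all parts of office_file into folder todir."""
    with zipfile.ZipFile(office_file) as archive:
        archive.extractall(todir)


def repack(out_file, fromdir):
    """Zip all files below fromdir, with names stored relative to fromdir.

    Storing relative names means the archive contains 'xl/workbook.xml'
    and not 'tmp/xl/workbook.xml'.
    """
    root = Path(fromdir)
    with zipfile.ZipFile(out_file, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                archive.write(path, path.relative_to(root).as_posix())
```

calling unpack("in.xlsx", "tmp") and then
repack("out.xlsx", "tmp") gives an archive with the same
member names as the input. the sketch does not preserve the
member order or compression settings of the original file.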