Load text from an office file such as .docx, .xlsx, .odt or .ods on the command line with the free open source tool Swiss File Knife.
sfk oload in.docx +...
load office file content as plain text,
for easy display or further processing.
about line wrapping
- when sending to a command that expects
  text lines, like
     sfk oload in.xlsx +filter ...
  then long lines or stream text will be
  hard wrapped at 4096 characters.
- when sending to stream capable commands
  like xed and xex, data is not wrapped.
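
The hard wrapping described above can be pictured with a minimal Python sketch. It is an illustration only, not sfk's implementation: the function name hard_wrap is hypothetical, and it assumes the limit is counted in characters and that a long line is simply cut at every 4096th position.

```python
def hard_wrap(text: str, width: int = 4096) -> list[str]:
    """Split text into lines and cut every line longer than width."""
    wrapped = []
    for line in text.splitlines():
        if not line:
            # keep empty lines as they are
            wrapped.append("")
            continue
        # cut the line into consecutive pieces of at most `width` characters
        for start in range(0, len(line), width):
            wrapped.append(line[start:start + width])
    return wrapped


if __name__ == "__main__":
    sample = "a" * 5000 + "\nshort"
    # a 5000 character line becomes two pieces, the short line stays whole
    print([len(part) for part in hard_wrap(sample)])
```

A line based command would then see the pieces as separate lines, which is why a search phrase that spans a cut position may no longer match.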
options
   -force      if file cannot be read, continue script
               with empty data
   -noerr      don't show error messages
   -utfout     keep raw UTF-8 encoding on output, to use it
               with further commands requiring UTF-8 data.
   -raw        get raw xml data, for content analysis.
               implies -utfout.
   -subnames   with .xlsx files only: add header lines
               with sheet subfile names.
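
To give an idea of the kind of data that -raw exposes: a .docx file is a ZIP archive, and its main text lives in the member word/document.xml. The following Python sketch reads that member with the standard zipfile module. It is not how sfk works internally, and which parts of the archive sfk actually returns with -raw is not stated here. The helper names read_docx_xml and make_demo_docx are hypothetical, and the demo archive is a stripped-down stand-in, not a valid Word file.

```python
import io
import zipfile


def read_docx_xml(source) -> str:
    """Return the main document XML of a .docx file as text.

    source may be a file name or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        return archive.read("word/document.xml").decode("utf-8")


def make_demo_docx() -> io.BytesIO:
    """Build a tiny in-memory archive that mimics the .docx layout."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "word/document.xml",
            "<w:document><w:t>hello</w:t></w:document>",
        )
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(read_docx_xml(make_demo_docx()))
```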
see also
   sfk help office   supported office file types
   sfk xex           extract phrases from text
   sfk ofilter       get lines from office file
examples
sfk oload in.docx
   display contents of a .docx word file
sfk oload in.xlsx
   display spreadsheet table data as plain text
sfk oload in.xlsx +filter -+foo
   get all lines with 'foo' from a table
sfk oload in.docx +xex "/foo**bar/"
   extract multi line blocks from a word file
   starting with foo and ending with bar
sfk oload in.xlsx +filt -no-empty-lines +tabtocsv
   get records from a table, drop empty lines,
   then convert from tabs to comma separated data.
sfk oload -raw in.docx +xmlform +view
   reformat and display xml content using dview.
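
The +filt -no-empty-lines +tabtocsv example above amounts to two steps: drop empty lines, then turn tab separated fields into comma separated ones. A rough Python equivalent for text that was already loaded might look like this. The function name tabs_to_csv is hypothetical, whitespace-only lines are treated as empty by assumption, and the quoting follows Python's csv module, which may differ from what sfk tabtocsv produces.

```python
import csv
import io


def tabs_to_csv(text: str) -> str:
    """Drop empty lines and convert tab separated fields to CSV."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in text.splitlines():
        # assumption: whitespace-only lines count as empty
        if not line.strip():
            continue
        # csv.writer quotes fields that contain commas or quotes
        writer.writerow(line.split("\t"))
    return out.getvalue()


if __name__ == "__main__":
    table = "name\tcity\n\nSmith, J.\tBerlin\n"
    print(tabs_to_csv(table), end="")
```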