I need a way to convert .doc or .docx files to .txt without installing anything. I also don't want to have to manually open Wor
For .doc, I've had some success with the Linux command-line tool antiword. It extracts the text from a .doc file very quickly and renders indentation well. You can then redirect its output to a text file in bash.
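As a rough sketch of that redirect, assuming antiword is already on your PATH: the function name doc_to_txt_all is made up for illustration, and it simply writes a .txt next to every .doc in a directory.

```shell
# Hypothetical helper (not part of antiword): convert every .doc in a
# directory (default: the current one) to a .txt file beside it.
doc_to_txt_all() {
  dir=${1:-.}
  for f in "$dir"/*.doc; do
    # If nothing matched, the glob stays literal; skip it.
    [ -e "$f" ] || continue
    # antiword prints the extracted text on stdout, so redirect it.
    antiword "$f" > "${f%.doc}.txt"
  done
}

# Usage: doc_to_txt_all /path/to/docs
```

For a single file, the plain form `antiword some.doc > some.txt` does the same thing.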
For .docx, I've used the OOXML SDK, as some other users mentioned. It is just a .NET library that makes it easier to work with the OOXML parts zipped up inside a .docx file. There is a lot of metadata that you will want to discard if you are only interested in the text. I see that other people have already written code for this: DocXToText.
I have also found that Aspose.Words has a very simple API with great support.
There is also this bash command from commandlinefu.com which works by unzipping the .docx:
unzip -p some.docx word/document.xml | sed -e 's/<[^>]\{1,\}>//g; s/[^[:print:]]\{1,\}//g'
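One thing to watch with the one-liner above is that stripping every tag runs all the paragraphs together. Here is a possible variation, offered as a sketch rather than a tested converter: it turns each closing `</w:p>` tag into a newline before removing the rest of the markup, then decodes the five predefined XML entities. The function names docx_xml_to_text and docx_to_text are my own, and it ignores tabs, line breaks inside a paragraph, footnotes, and headers.

```shell
# Hypothetical helpers; the names are illustrative only.

# Read WordprocessingML on stdin, write plain text on stdout.
docx_xml_to_text() {
  # 1. End of paragraph (</w:p>) becomes a newline.
  # 2. Every remaining tag is removed.
  # 3. Predefined XML entities are decoded, &amp; last so that
  #    "&amp;lt;" ends up as the literal text "&lt;".
  sed -e 's/<\/w:p>/\
/g' \
      -e 's/<[^>]*>//g' \
      -e 's/&lt;/</g; s/&gt;/>/g; s/&quot;/"/g' \
      -e "s/&apos;/'/g" \
      -e 's/&amp;/\&/g'
}

# Pull the main document part out of the .docx (a zip archive)
# and pass it through the filter above.
docx_to_text() {
  unzip -p "$1" word/document.xml | docx_xml_to_text
}

# Usage: docx_to_text some.docx > some.txt
```

Keeping the sed filter separate from the unzip step means it can be tried on a saved document.xml without needing the original archive.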