Anyone know of an easy way to convert Microsoft Word files into text files? I have about 600 files that need to be converted to text files in order to use them. Hate to have to try and do them manually. Thanks
In order to use the contents of a Word document (“.doc” or “.docx” extension) in a concordancer, it must be converted or saved as a plain text file (“.txt” extension). I will outline two different ways you can do this below. Method 1: open the document in Word, do a “Save as” (go to File > Save as), set “Save as type” to “plain text”, and click “Save”. When the dialogue box appears (for non-English OSs), check “allow character substitution” and then click “OK”. Source: http://corpora.wordpress.com/2008/01/11/how-to-convert-word-document-files-into-plain-text-files/
Well, saving 600 files manually is going to get really boring. Best to try programming something if you can code a little. There are some examples using PHP out there; Google for "php word to text". A simple script could iterate through every file in a folder, convert it, and save the result as a new text file.
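For what it's worth, here is a rough sketch of that kind of batch script, written in Python rather than PHP and using only the standard library. It assumes the files are ".docx" (which are zip archives with the text stored in word/document.xml); the older binary ".doc" format would need a different tool. The function names docx_to_text and convert_folder are made up for this example, and the text extraction is deliberately simple: it keeps paragraph text but ignores tabs, manual line breaks, headers, and footnotes.

```python
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path

# WordprocessingML namespace used inside word/document.xml
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(docx_path):
    """Return the plain text of a .docx file, one line per paragraph."""
    with zipfile.ZipFile(docx_path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # Join the text runs (w:t elements) that make up this paragraph.
        paragraphs.append("".join(t.text or "" for t in para.iter(W_NS + "t")))
    return "\n".join(paragraphs)


def convert_folder(src_dir, dst_dir):
    """Convert every .docx in src_dir to a .txt file in dst_dir.

    Hypothetical helper for this example; returns the list of written paths.
    """
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    converted = []
    for docx_path in sorted(Path(src_dir).glob("*.docx")):
        out_path = dst / (docx_path.stem + ".txt")
        out_path.write_text(docx_to_text(docx_path), encoding="utf-8")
        converted.append(out_path)
    return converted
```

You would call it with something like convert_folder("word_files", "text_files"), where both folder names are placeholders for your own paths. Try it on a handful of files first and compare the output against the originals before running all 600.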
I am not sure if this is accurate, but you can open DOC files in Google Docs, and as far as I know, you can save the opened files as TXT if you wish.