I am building a site, and more than 40 pages of its content are in MS Word. When I save a document as .html, the files get huge: one page comes out at 70 KB. The HTML code then looks like this: <p><span style="font-size:12.0pt;font-family:"Times New Roman"; mso-fareast-font-family:"Times New Roman";mso-ansi-language:NL;mso-fareast-language: NL;mso-bidi-language:AR-SA">Bla bla bla bla blblala</span></p> Is there a way to copy and paste the whole thing (including tables, graphs, etc.) and still keep the file small?
In Macromedia Dreamweaver there is an option to remove the extra HTML code that Word creates when it saves as HTML. You don't really need that code; it basically just shows that the page was created by MS Word. The best thing, however, is to use the XHTML validation available at http://validator.w3.org/
@infonote Thank you. I don't understand how http://validator.w3.org/ can clean up my file. Do you have a direct link to the software?
This might not help much, but I've experienced the same thing when using MS Word, and I usually avoid it. One small tip I use is this: copy all the text from MS Word into Notepad, then copy it from Notepad into your web design program (FrontPage/Dreamweaver). The unnecessary code disappears because Notepad keeps only plain text. But I have to say, if you have tables, it'll be tough.
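For anyone who would rather script the Notepad round trip than do it by hand 40 times, here is a rough sketch in Python using only the standard library. It is an illustration, not something from this thread: the function name `word_html_to_text` and the choice of which tags start a new line are my own assumptions. Like the Notepad trick, it throws away all formatting, so tables come out as loose lines of text.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect text and drop every tag, roughly like pasting through Notepad."""

    # Assumption: these tags should start a new line in the plain-text output.
    BLOCK_TAGS = {"p", "div", "br", "tr", "li", "h1", "h2", "h3", "h4", "h5", "h6"}
    # Content of these tags is not visible text, so it is skipped entirely.
    SKIP_TAGS = {"style", "script"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip += 1
        elif tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def word_html_to_text(html):
    """Return the visible text of a Word-exported HTML string, one block per line."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace inside each line and drop empty lines.
    lines = [" ".join(line.split()) for line in "".join(parser.parts).splitlines()]
    return "\n".join(line for line in lines if line)
```

You would read the saved .html file into a string, pass it to `word_html_to_text`, and paste the result into your editor to mark it up again from scratch.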
For dealing with tables, outlines, and highly formatted work, the Dreamweaver clean-up option was the best one I found. Shannon
Some WYSIWYG editors besides Dreamweaver have functions specifically for copying and pasting from Word. The editor at http://www.innovastudio.com has a button specifically for this purpose. If you know ASP, you can output the form input wherever and however you like.
When making a website, I prefer to use Notepad. Word has some problems that I just don't like. I'm not sure why people use Word to write HTML, but Notepad is better in my opinion. Peace
I remember that there used to be a little tool specifically for cleaning up Word-generated HTML - but I can't remember what it was called. Google can probably find it.
Tidy is one such tool. It does not work on tables and outlines, or at least it did not when I tried it. If you are using FrontPage, you can copy the Word content and use the Paste Special option. It has the same drawback as Tidy: it is basically only useful for documents made up of plain paragraphs. Shannon
It was before the time of Tidy. I had a Windows app specifically for cleaning up Word's generated HTML. It may even have been from Microsoft... I dumped Office along with Windows several years ago, so I've no idea if it's still around.
Dealing with Word-generated HTML has really curled my straight hair. I never found anything that was ideal. I could not retrain the person sending me the Word docs to use something else, could not find a reasonable solution, and spent a year and a half cleaning up his mess. I finally got to the point of using the find-and-replace feature to strip the markup. In the end I had had enough and stopped keeping up the site. The site is back to using HTML generated by Word. What a fiasco. </rant>
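The find-and-replace approach can also be scripted. Below is a minimal sketch in Python, not a finished cleaner: the function name `strip_word_markup` and the particular list of things to strip (comments, Office-namespace tags such as `<o:p>`, `style`/`class`/`lang` attributes, and `<span>`/`<font>` wrappers) are my own assumptions about what is worth removing. It leaves `<table>`, `<tr>` and `<td>` in place, so table structure survives. Regular expressions do not really parse HTML, so text in the page body that happens to look like an attribute could be affected; check the output before publishing.

```python
import re


def strip_word_markup(html):
    """Strip common Word clutter from an HTML string while keeping the structure."""
    # Remove comments, which includes Word's <!--[if gte mso 9]> ... <![endif]--> blocks.
    html = re.sub(r"<!--.*?-->", "", html, flags=re.DOTALL)
    # Remove Office-namespace tags such as <o:p> and </o:p> (the text between them stays).
    html = re.sub(r"</?\w+:\w+[^>]*>", "", html)
    # Remove style, class and lang attributes, whether quoted or unquoted.
    html = re.sub(
        r"""\s+(?:style|class|lang)\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)""",
        "",
        html,
        flags=re.IGNORECASE,
    )
    # Unwrap <span> and <font> tags, keeping the text they enclose.
    html = re.sub(r"</?(?:span|font)\b[^>]*>", "", html, flags=re.IGNORECASE)
    return html
```

Running this once over every saved file is the same idea as the manual find and replace, just repeatable each time a new Word document arrives.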
A great app for people like that is Macromedia Contribute. Let them initially create the site in Word, you then manually clean it up ONCE, then give them Contribute. It'll download the site over FTP, they can then edit it in the WYSIWYG and just apply the changes. The interface is quite similar to Word, but the code it generates is a lot better!