Cleaning up Word generated html
Smyrl
Apr 23rd 2004, 1:56 pm
I have been working with a group that has a site with code generated by many people using many editors. The person updating many of the pages that are updated the most frequently is using Word to generate the pages which he will want to continue to use. Has anyone had any positive experience cleaning up Word generated code?
Shannon
hans
Apr 23rd 2004, 2:38 pm
HotMetaL Pro 6 was once a good tool to create, validate, modify and clean up HTML sites under Windows. It also includes a full offline link check.
Another approach, which I prefer and used when I converted my old generator-created pages (WordPerfect, StarOffice, Netscape and others) to their current version:
copy and paste the text only (without the HTML)
and REDO all the HTML from scratch with a professional editor that produces CLEAN code.
I have used Quanta Plus under Linux for many years to my fullest satisfaction!
http://quanta.sourceforge.net/
While this method MAY appear to be a lot of work,
it is VERY little indeed ...
because you do the full HTML stripping and rebuild ONCE in a lifetime and have clean code for a FULL lifetime!
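For the "text only" step, copy and paste is not the only route. Here is a minimal sketch, assuming the Word export is available as a string and using only Python's standard `html.parser`. The class and function names (`TextOnly`, `html_to_text`) are made up for illustration, and the list of block-level tags is a guess at what matters for typical Word output, not a complete set.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect text nodes and drop all markup (hypothetical helper)."""

    # Elements whose content is never page text.
    SKIP = {"head", "title", "style", "script"}
    # Elements treated as line breaks; an assumed, incomplete list.
    BLOCK = {"p", "div", "br", "li", "tr", "table",
             "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def html_to_text(html):
    """Return the visible text, one non-empty line per block element."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    text = "".join(parser.parts)
    # Collapse runs of whitespace (including non-breaking spaces).
    lines = [" ".join(line.split()) for line in text.splitlines()]
    return "\n".join(line for line in lines if line)


if __name__ == "__main__":
    sample = ("<html><head><style>p.MsoNormal{margin:0}</style></head>"
              "<body><p class=MsoNormal><span style='font-size:12.0pt'>"
              "Hello <b>world</b></span></p></body></html>")
    print(html_to_text(sample))
```

The plain text that comes out can then be marked up again by hand in whatever editor you prefer.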
Smyrl
Apr 23rd 2004, 3:52 pm
Thanks for the ideas. I am tired of stripping code by hand. What garbage! Formatting alone tripled load time on one document I stripped.
Shannon
hans
Apr 23rd 2004, 4:23 pm
In this current millennium we do all formatting with external CSS,
leaving all pages clean and fast to load!
AND ...
to assure web site owner AND webmaster pass the parachute test successfully and alive.
Such Great Heights
Apr 23rd 2004, 4:55 pm
I've had good experiences with Dreamweaver's "Clean Up Word HTML" command. Just go to the Commands menu and select "Clean Up Word HTML".
Dreamweaver has options to clean up only the CSS, everything, or other selected parts.
So try Dreamweaver to clean it up.
nlopes
May 9th 2004, 3:27 am
The word I have for you is: Tidy!
Tidy is a tool originally from the W3C that cleans up HTML documents, INCLUDING MS Word HTML (it has a special configuration option for this!).
http://tidy.sf.net
-or-
PHP binding: http://php.net/tidy
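For anyone who wants to script this, here is a rough sketch that calls the `tidy` command-line program from Python. It assumes `tidy` is installed and on the PATH. The Word-related option meant above should be `word-2000`; the other options (`bare`, `clean`, `drop-proprietary-attributes`) are extras that seem useful for Word output, so check them against the documentation of your Tidy version. The helper names and file names are hypothetical.

```python
import shutil
import subprocess


def build_tidy_command(src, dest):
    """Build a tidy command line for cleaning a Word-generated page."""
    return [
        "tidy",
        "-quiet",                                # suppress non-essential output
        "--word-2000", "yes",                    # strip Word 2000 surplus markup
        "--bare", "yes",                         # strip Microsoft-specific HTML
        "--clean", "yes",                        # presentational tags -> style rules
        "--drop-proprietary-attributes", "yes",
        "-output", dest,                         # write result here
        src,
    ]


def run_tidy(src, dest):
    """Run tidy; exit status 0 = ok, 1 = warnings, 2 = errors."""
    if shutil.which("tidy") is None:
        raise RuntimeError("tidy is not installed or not on PATH")
    result = subprocess.run(build_tidy_command(src, dest), check=False)
    return result.returncode


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(build_tidy_command("from_word.html", "clean.html")))
```

Writing to a separate output file keeps the original Word export around in case the cleanup removes more than you wanted.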
mopacfan
May 13th 2004, 8:23 am
I've found the easiest way to clean up word html is to paste it into wordpad then cut/paste it into your html editor or notepad. It removes all that stupid idiotic bull pucky crap that word puts in the code.
Regards,
Michael
Smyrl
May 13th 2004, 2:48 pm
Thank you for your ideas. If I never see the word span again I will be happy. The amount of garbage Word generates for a blank line or blank cell in a table should be humorous, but at this stage all I see is red.
I do appreciate your input.
Sincerely,
Shannon Smyrl
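To give an idea of what stripping the span clutter by script could look like, here is a minimal regex-based sketch in Python. It is not a real HTML parser and only targets patterns that commonly show up in Word exports (conditional comments, `<o:p>` style namespace tags, `Mso...` classes, inline styles, empty wrapper spans). The function name `strip_word_markup` is invented for illustration, and dropping every inline `style` attribute assumes the formatting will come from an external stylesheet instead.

```python
import re


def strip_word_markup(html):
    """Remove common Word-specific clutter (rough sketch, not a parser)."""
    # Conditional comment blocks such as <!--[if gte mso 9]> ... <![endif]-->
    html = re.sub(r"<!--\[if.*?<!\[endif\]-->", "", html, flags=re.S | re.I)
    # Office namespace tags like <o:p> and </o:p>; their content is kept.
    html = re.sub(r"</?(?:o|v|w|x|st1):\w+[^>]*>", "", html, flags=re.I)
    # class=MsoNormal, class="MsoNormal" and similar.
    html = re.sub(r"""\s+class=(?:"Mso[^"]*"|'Mso[^']*'|Mso\w+)""", "",
                  html, flags=re.I)
    # Quoted inline style attributes (assumes external CSS takes over).
    html = re.sub(r"""\s+style=(?:"[^"]*"|'[^']*')""", "", html, flags=re.I)
    # Unwrap spans left without attributes, innermost first, until stable.
    previous = None
    while previous != html:
        previous = html
        html = re.sub(r"<span>((?:(?!</?span\b).)*?)</span>", r"\1",
                      html, flags=re.S | re.I)
    return html


if __name__ == "__main__":
    blank_line = ("<p class=MsoNormal style='margin:0in'>"
                  "<span style='font-size:12.0pt'><o:p>&nbsp;</o:p></span></p>")
    print(strip_word_markup(blank_line))
```

Spans that still carry other attributes (for example `lang`) are left alone, so the output stays balanced even when not everything is removed.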
Johnburk
Feb 11th 2006, 5:59 am
@smyrl
What tool do you use to do the job?