Converting MS Word files to HTML?

Discussion in 'HTML & Website Design' started by Johnburk, Feb 11, 2006.

  1. #1
    I am making a site, and more than 40 pages of its content are in MS Word. When I save a document as .html, the files get huge! A single page comes to 70 KB as .html.

    The HTML code then looks like this.

    Code (markup):
    <p><span style="font-size:12.0pt;font-family:&quot;Times New Roman&quot;;
    mso-fareast-font-family:&quot;Times New Roman&quot;;mso-ansi-language:NL;mso-fareast-language:
    NL;mso-bidi-language:AR-SA">Bla bla bla bla blblala</span></p>

    Is there a way to copy and paste the whole thing (including tables, graphs, etc.) and still keep the file small?
     
    Johnburk, Feb 11, 2006 IP
  2. infonote

    infonote Well-Known Member

    Messages:
    4,032
    Likes Received:
    68
    Best Answers:
    0
    Trophy Points:
    160
    #2
    Macromedia Dreamweaver has an option to remove the extra HTML code that Word adds when it saves a document as HTML. You don't really need that code; it basically just marks the file as created by MS Word.

    The best thing, however, is to use the XHTML validation available at http://validator.w3.org/
     
    infonote, Feb 11, 2006 IP
  3. Johnburk

    Johnburk Peon

    Messages:
    777
    Likes Received:
    12
    Best Answers:
    0
    Trophy Points:
    0
    #3
    @infonote

    Thank you
    I don't understand how http://validator.w3.org/ can clean up my file.

    Do you have a direct link to the software?
     
    Johnburk, Feb 11, 2006 IP
  4. my44

    my44 Peon

    Messages:
    722
    Likes Received:
    24
    Best Answers:
    0
    Trophy Points:
    0
    #4
    This might not help much, but I've experienced the same thing when using MS Word. I usually avoid it.

    A small tip I use is this: I copy all the text from MS Word into Notepad, then copy it from Notepad into my web design program (FrontPage/Dreamweaver). The unnecessary code disappears because Notepad keeps only plain text.

    But I have to say, if you have tables, it'll be tough.
     
    my44, Feb 11, 2006 IP
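
The Notepad trick in post #4 amounts to keeping the text and throwing away every tag and attribute. A minimal sketch of the same idea in Python, using only the standard library, might look like the following. The names `TextExtractor` and `word_html_to_text` are made up for illustration, and, as with Notepad, tables and other structure are flattened to plain lines.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes only; all tags and attributes are discarded."""

    # Tags that should start a new line in the plain-text output.
    BLOCK_TAGS = {"p", "div", "br", "tr", "li", "h1", "h2", "h3"}
    # Tags whose contents are not visible text (Word writes large <style> blocks).
    SKIP_TAGS = {"style", "script"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip += 1
        elif tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)

    def text(self):
        # Collapse runs of whitespace inside each line and drop empty lines.
        lines = (" ".join(line.split()) for line in "".join(self.parts).splitlines())
        return "\n".join(line for line in lines if line)


def word_html_to_text(html):
    """Return the visible text of an HTML string, one block per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

The resulting text can then be pasted into an editor and marked up by hand, which is the same trade-off the post describes: small files, but all formatting has to be redone.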
  5. infonote

    #5
    By validating the code to XHTML or HTML you are cleaning up the code.
     
    infonote, Feb 11, 2006 IP
  6. melaniejk

    melaniejk Peon

    Messages:
    397
    Likes Received:
    17
    Best Answers:
    0
    Trophy Points:
    0
    #6
    I agree with my44. When I had info in MS Word, I finally ended up putting it in Notepad.
     
    melaniejk, Feb 11, 2006 IP
  7. Smyrl

    Smyrl Tomato Republic Staff

    Messages:
    13,740
    Likes Received:
    1,702
    Best Answers:
    78
    Trophy Points:
    510
    #7
    For dealing with tables, outlines, and highly formatted work, the Dreamweaver clean-up option was the best one I found.

    Shannon
     
    Smyrl, Feb 11, 2006 IP
  8. AWD1

    AWD1 Peon

    Messages:
    191
    Likes Received:
    12
    Best Answers:
    0
    Trophy Points:
    0
    #8
    Some WYSIWYG editors, besides Dr*amw*av*r, have functions specifically for copying and pasting from Word.

    http://www.innovastudio.com has a button in its editor specifically for this purpose. If you know ASP, you can output the form input wherever and however you like.
     
    AWD1, Feb 11, 2006 IP
  9. D_C

    D_C Well-Known Member

    Messages:
    1,107
    Likes Received:
    21
    Best Answers:
    1
    Trophy Points:
    160
    #9
    I prefer to use Notepad when making a website. Word has some problems that I just don't like. I'm not sure why people use Word to write HTML, but Notepad is better in my opinion.

    Peace
     
    D_C, Feb 11, 2006 IP
  10. dj1471

    dj1471 Well-Known Member

    Messages:
    97
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    116
    #10
    I remember that there used to be a little tool specifically for cleaning up Word-generated HTML - but I can't remember what it was called. Google can probably find it.
     
    dj1471, Feb 12, 2006 IP
  11. Smyrl

    #11
    Tidy is one such tool. It does not work on tables and outlines, or at least it did not when I tried it.

    If you are using FrontPage, you can copy the Word content and use the Paste Special option. It has the same drawback as Tidy: it is basically only useful for documents made up of plain paragraphs.

    Shannon
     
    Smyrl, Feb 12, 2006 IP
  12. dj1471

    #12
    It was before the time of Tidy. I had a Windows app specifically for cleaning up Word's generated HTML. It may even have been from Microsoft...

    I dumped Office along with Windows several years ago, so I've no idea if it's still around.
     
    dj1471, Feb 12, 2006 IP
  13. Smyrl

    #13
    Dealing with Word-generated HTML has really curled my straight hair. I never found anything that was ideal. I could not retrain the person sending me the Word docs to use something else, could not find a reasonable solution, and spent a year and a half cleaning up his mess. I finally got to the point of using the find-and-replace feature to strip out the extra code. In the end I had enough and stopped maintaining the site. The site is back to using HTML generated by Word. What a fiasco. </rant>
     
    Smyrl, Feb 12, 2006 IP
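
The find-and-replace approach from post #13 can be automated. The sketch below is one possible way to do it in Python, not a tool anyone in the thread used: it removes only the `mso-` declarations from `style` attributes (like those in the sample in post #1) and drops a `style` attribute entirely if nothing else is left in it. The function name `strip_mso_styles` is hypothetical, and the regular expression assumes double-quoted attribute values with no semicolons inside quoted CSS values.

```python
import html
import re

# Matches a double-quoted style attribute, including the whitespace before it.
STYLE_ATTR = re.compile(r'\sstyle\s*=\s*"([^"]*)"', re.IGNORECASE)


def _clean_style(match):
    # Decode entities first so that "&quot;" does not look like a ";" separator.
    value = html.unescape(match.group(1))
    declarations = [d.strip() for d in value.split(";")]
    kept = [d for d in declarations if d and not d.lower().startswith("mso-")]
    if not kept:
        # Nothing but Word-specific declarations: drop the attribute.
        return ""
    return ' style="' + html.escape(";".join(kept), quote=True) + '"'


def strip_mso_styles(markup):
    """Remove Word-specific mso-* CSS declarations from style attributes."""
    return STYLE_ATTR.sub(_clean_style, markup)
```

This leaves ordinary declarations such as `font-size` in place, so tables and layout are not touched. It only shrinks the markup, and it is no substitute for checking the result in a browser.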
  14. dj1471

    #14
    A great app for people like that is Macromedia Contribute. Let them initially create the site in Word, manually clean it up ONCE yourself, and then give them Contribute. It'll download the site over FTP; they can then edit it in the WYSIWYG editor and just apply the changes. The interface is quite similar to Word, but the code it generates is a lot better!
     
    dj1471, Feb 12, 2006 IP