Saturday, May 12, 2007

Word 2003's HTML is a mess.

...and I doubt Word 2007 is any better.

I could make a test webpage with this code:

<html>
<head>
<title>test</title>
</head>
<body>
test
</body>
</html>

But Word makes this code:
<html xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:w="urn:schemas-microsoft-com:office:word"
xmlns="http://www.w3.org/TR/REC-html40">

<head>
<meta http-equiv=Content-Type content="text/html; charset=windows-1252">
<meta name=ProgId content=Word.Document>
<meta name=Generator content="Microsoft Word 11">
<meta name=Originator content="Microsoft Word 11">
<link rel=File-List href="test_files/filelist.xml">
<title>test</title>
<!--[if gte mso 9]><xml>
<o:DocumentProperties>
<o:Author>Alison</o:Author>
<o:LastAuthor>Alison</o:LastAuthor>
<o:Revision>1</o:Revision>
<o:TotalTime>0</o:TotalTime>
<o:Created>2007-05-12T19:29:00Z</o:Created>
<o:LastSaved>2007-05-12T19:29:00Z</o:LastSaved>
<o:Pages>1</o:Pages>
<o:Characters>4</o:Characters>
<o:Company>MSL Computers</o:Company>
<o:Lines>1</o:Lines>
<o:Paragraphs>1</o:Paragraphs>
<o:CharactersWithSpaces>4</o:CharactersWithSpaces>
<o:Version>11.5606</o:Version>
</o:DocumentProperties>
</xml><![endif]--><!--[if gte mso 9]><xml>
<w:WordDocument>
<w:GrammarState>Clean</w:GrammarState>
<w:PunctuationKerning/>
<w:ValidateAgainstSchemas/>
<w:SaveIfXMLInvalid>false</w:SaveIfXMLInvalid>
<w:IgnoreMixedContent>false</w:IgnoreMixedContent>
<w:AlwaysShowPlaceholderText>false</w:AlwaysShowPlaceholderText>
<w:Compatibility>
<w:BreakWrappedTables/>
<w:SnapToGridInCell/>
<w:WrapTextWithPunct/>
<w:UseAsianBreakRules/>
<w:DontGrowAutofit/>
</w:Compatibility>
<w:BrowserLevel>MicrosoftInternetExplorer4</w:BrowserLevel>
</w:WordDocument>
</xml><![endif]--><!--[if gte mso 9]><xml>
<w:LatentStyles DefLockedState="false" LatentStyleCount="156">
</w:LatentStyles>
</xml><![endif]-->
<style>
<!--
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{mso-style-parent:"";
margin:0cm;
margin-bottom:.0001pt;
mso-pagination:widow-orphan;
font-size:12.0pt;
font-family:"Times New Roman";
mso-fareast-font-family:"Times New Roman";}
@page Section1
{size:595.3pt 841.9pt;
margin:72.0pt 90.0pt 72.0pt 90.0pt;
mso-header-margin:35.4pt;
mso-footer-margin:35.4pt;
mso-paper-source:0;}
div.Section1
{page:Section1;}
-->
</style>
<!--[if gte mso 10]>
<style>
/* Style Definitions */
table.MsoNormalTable
{mso-style-name:"Table Normal";
mso-tstyle-rowband-size:0;
mso-tstyle-colband-size:0;
mso-style-noshow:yes;
mso-style-parent:"";
mso-padding-alt:0cm 5.4pt 0cm 5.4pt;
mso-para-margin:0cm;
mso-para-margin-bottom:.0001pt;
mso-pagination:widow-orphan;
font-size:10.0pt;
font-family:"Times New Roman";
mso-ansi-language:#0400;
mso-fareast-language:#0400;
mso-bidi-language:#0400;}
</style>
<![endif]-->
</head>

<body lang=EN-GB style='tab-interval:36.0pt'>

<div class=Section1>

<p class=MsoNormal>test</p>

</div>

</body>

</html>

WTF!?

No comments: