There’s a new article on MSDN about troubleshooting common XmlSerializer problems.

There’s one thing with the StreamWriter/XmlSerializer/XmlTextWriter that really aggravates me. I tore my hair out the first time trying to figure out the problem, only to be mystified at why the problem even exists.

At one point, every XML file I saved as UTF-8 had three extra bytes (0xEF 0xBB 0xBF) at the beginning, before <?xml version="1.0"?> started. This freaked ASP out, as well as certain versions of the Expat parser and many editors.
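A quick way to confirm that a file has those three bytes is to read its first three bytes and compare them. This is only a rough sketch: the module and function names (BomDetect, HasUtf8Bom) are made up for illustration, and "file.xml" is just a placeholder path.

```vbnet
Imports System
Imports System.IO

Module BomDetect
    ' Hypothetical helper: True when the file starts with EF BB BF.
    Function HasUtf8Bom(ByVal path As String) As Boolean
        Dim head(2) As Byte   ' three bytes, indexes 0 to 2
        Using fs As New FileStream(path, FileMode.Open, FileAccess.Read)
            ' Files shorter than three bytes cannot contain the BOM.
            If fs.Read(head, 0, 3) < 3 Then Return False
        End Using
        Return head(0) = &HEF AndAlso head(1) = &HBB AndAlso head(2) = &HBF
    End Function

    Sub Main()
        ' "file.xml" is a placeholder; point it at the file you suspect.
        Console.WriteLine(HasUtf8Bom("file.xml"))
    End Sub
End Module
```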

Once I remembered that it was the BOM (Byte Order Mark), I set about finding out how to turn it off. At first glance, there is no property anywhere to simply turn it off.

When you create the writer, you can pass in an encoding. There are two extremely similar options that yield two completely different results.

The first is System.Text.Encoding.UTF8, a shared (static) property on the Encoding class. With this option, the BOM is always written. There is no obvious way to turn it off.

wr = New XmlTextWriter("file.xml", System.Text.Encoding.UTF8)

The second option is a new instance of System.Text.UTF8Encoding, which doesn’t write the BOM when created with its default constructor.

wr = New XmlTextWriter("file.xml", New System.Text.UTF8Encoding())
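To see the difference without writing a file, you can ask each encoding for its preamble, which is the byte sequence a writer puts at the start of the stream. The sketch below is mine, not from the MSDN article, and the module name BomCheck is made up. It also uses the UTF8Encoding(Boolean) constructor, which makes the "no BOM" choice explicit instead of relying on the default.

```vbnet
Imports System
Imports System.Text

Module BomCheck
    Sub Main()
        ' Encoding.UTF8 is a shared property; its instance emits the BOM.
        Dim withBom As Encoding = Encoding.UTF8
        ' Passing False says explicitly: do not emit the BOM.
        Dim withoutBom As Encoding = New UTF8Encoding(False)

        ' GetPreamble returns the bytes written before any content.
        Console.WriteLine(BitConverter.ToString(withBom.GetPreamble()))   ' EF-BB-BF
        Console.WriteLine(withoutBom.GetPreamble().Length)                ' 0
    End Sub
End Module
```

Passing New UTF8Encoding(False) to the XmlTextWriter should behave the same as the parameterless version above, but it reads a lot less like an accident.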

Christ, those are really different, aren’t they? :-/
That namespace/class naming could have been just a little better.
