Windows Live Agents: Non-UTF-8 encodings


One problem we've recently found is that the current version of the SDK has limited support for XML encodings other than UTF-8. In fact, the encoding attribute in the XML declaration seems to be simply ignored.

We are reading Spanish XML files encoded in ISO-8859-1, and we had problems with accents and special characters. Since not displaying them is clearly not an option, I dug into the SDK libraries until I found a function that solved the problem: StringUTF8ToLatin1().

Let's see how it works with a simple example.

This is a simple XML file with Spanish characters:

<?xml version="1.0" encoding="ISO-8859-1"?>
<messages>
<message>
<text>TEST ENCODING ...</text>
</message>
</messages>
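
For comparison, this is roughly what happens when a parser does honor the encoding declaration. The sketch below uses Python's standard library rather than the Agents SDK, and the payload "señal" is a made-up stand-in for the sample text, so treat it as an illustration only:

```python
import xml.etree.ElementTree as ET

# Same structure as the sample file above. "se\xf1al" is a hypothetical
# payload ("señal" stored as ISO-8859-1 bytes), not the original text.
raw = (
    b'<?xml version="1.0" encoding="ISO-8859-1"?>'
    b"<messages><message><text>se\xf1al</text></message></messages>"
)

# The parser picks up the encoding from the XML declaration,
# so the 0xF1 byte is decoded as "ñ" without any extra conversion.
root = ET.fromstring(raw)
texts = [node.text for node in root.iter("text")]
```

Here the accented character survives with no manual step, which is the behaviour the SDK does not seem to provide.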

We build the datasource as usual:

datasource TestEncoding() => TEXT, USER
file
testencoding.xml
simple xml
messages
message {loop=content}
text

And a sample pattern that loads the XML and displays each <message> element properly:

+ test_spanishencoding
MESSAGES = ()
try
RESULTS <= TestEncoding()
- StringUTF8ToLatin1(RESULTS.TEXT)
catch ERROR
- ERROR

Simple, but it works perfectly :)
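
If you are curious about what a UTF-8 to Latin-1 conversion does to the bytes, here is a small Python sketch. I have not seen the SDK's implementation, so utf8_to_latin1() is only my guess at the general idea behind StringUTF8ToLatin1(), and "señal" is just a test word:

```python
def utf8_to_latin1(data: bytes) -> bytes:
    """Re-encode UTF-8 bytes as ISO-8859-1 (Latin-1) bytes.

    Hypothetical helper for illustration; not the SDK function.
    """
    return data.decode("utf-8").encode("latin-1")


text = "señal"

# In UTF-8, "ñ" takes two bytes (0xC3 0xB1).
utf8_bytes = text.encode("utf-8")

# Reading those bytes as Latin-1 shows each byte as its own character,
# which is the typical garbled-accent symptom.
garbled = utf8_bytes.decode("latin-1")

# After the conversion, "ñ" is the single byte 0xF1 and displays correctly
# wherever Latin-1 is expected.
latin1_bytes = utf8_to_latin1(utf8_bytes)
fixed = latin1_bytes.decode("latin-1")
```

With this input, garbled ends up as "seÃ±al" and fixed as "señal". Note that the conversion only works for characters that exist in Latin-1; anything outside that range would make the encode step fail in this sketch.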

Comments?

Posted by Kartones on 2008-04-16