[solved] loadString, system() & Unicode


#1

Hello hello,

I’m having a bit of trouble with UTF-8 encoded files. Here is a very simple example:

CI_LOG_I( loadString( loadAsset( "test.yaml" ) ) );

And test.yaml:

ÄÄää

If I save my file as a Latin-1 variant (i.e. ISO-8859-15) I see the correct output in my Visual Studio Output window. If I save it as UTF-8 or UTF-8 with BOM I get something like “ÄÄÔ. (Tested with vim and Sublime.)

Now, I’m not too sure how the Visual Studio Output window handles Unicode, so maybe that is the problem. If I create a TextBox and use my loaded string, it works like a charm either way (fair enough!).

However: I need to pipe my loaded content at some point to system() for some ugly behind-the-scenes font rendering mumbo jumbo. In general, system() should be fine with this and not care too much about the encoding. I end up getting the very same “ÄÄÔ rendered, though, which makes me think that the way I (or Cinder, or C++, or…) handle the encoding might be the problem.

I’d be grateful for any ideas or hints on this, thank you!

// edit:
So it basically seems that at some point my UTF-8 encoded string is treated as Latin-1. For example, I get the same kind of characters if I save an HTML file as UTF-8 and omit the charset=utf-8 part of the Content-Type header.
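
To illustrate that suspicion, here is a minimal sketch of what "UTF-8 treated as Latin-1" does to the bytes. The helper name latin1ToUtf8 is made up for this example and is not part of Cinder. It treats every input byte as one Latin-1 character and re-encodes it as UTF-8, so each non-ASCII character in the original UTF-8 text turns into two or more wrong characters.

```cpp
#include <string>

// Hypothetical helper (not a Cinder function): interpret every byte of
// `bytes` as a single Latin-1 code point and re-encode it as UTF-8.
// Passing text that is already UTF-8 reproduces the mojibake effect.
std::string latin1ToUtf8( const std::string &bytes )
{
    std::string out;
    for( unsigned char c : bytes ) {
        if( c < 0x80 ) {
            // ASCII is identical in Latin-1 and UTF-8.
            out += static_cast<char>( c );
        }
        else {
            // Code points 0x80..0xFF need a two-byte UTF-8 sequence.
            out += static_cast<char>( 0xC0 | ( c >> 6 ) );
            out += static_cast<char>( 0x80 | ( c & 0x3F ) );
        }
    }
    return out;
}
```

For instance, "Ä" is the two bytes C3 84 in UTF-8; run through this helper it becomes the four bytes C3 83 C2 84, i.e. two unrelated characters instead of one.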

// edit2:
I’m on Windows :)


#2

I believe this is actually a bug in Cinder’s console() function (which wraps OutputDebugString() on MSW). I’ve made a pull request here:

I’d be curious if that addresses your use case as well.


#3

Yes! It did indeed and also helped me with my other problem.

To pass Unicode arguments on to a system call, I now use this:

_wsystem( msw::toWideString(cmd).c_str() );
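
For anyone without Cinder's msw helpers at hand, here is a rough sketch of the same idea. It assumes the command string is UTF-8 and converts it to UTF-16 by hand; utf8ToUtf16 and runCommandUtf8 are hypothetical names for this example, and I have not checked how msw::toWideString is implemented internally. On Windows, the Win32 call MultiByteToWideChar with CP_UTF8 would be the more usual way to do the conversion. Malformed input is only handled loosely here (replaced with U+FFFD where detected).

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: decode UTF-8 into UTF-16 code units.
// Continuation bytes are not strictly validated.
std::u16string utf8ToUtf16( const std::string &utf8 )
{
    std::u16string out;
    std::size_t i = 0;
    while( i < utf8.size() ) {
        unsigned char lead = static_cast<unsigned char>( utf8[i] );
        std::uint32_t cp = 0xFFFD; // replacement character by default
        std::size_t len = 1;
        if( lead < 0x80 ) { cp = lead; }
        else if( ( lead & 0xE0 ) == 0xC0 ) { cp = lead & 0x1F; len = 2; }
        else if( ( lead & 0xF0 ) == 0xE0 ) { cp = lead & 0x0F; len = 3; }
        else if( ( lead & 0xF8 ) == 0xF0 ) { cp = lead & 0x07; len = 4; }
        // Truncated sequence at the end of the input.
        if( i + len > utf8.size() ) { cp = 0xFFFD; len = 1; }
        for( std::size_t k = 1; k < len; ++k )
            cp = ( cp << 6 ) | ( static_cast<unsigned char>( utf8[i + k] ) & 0x3F );
        i += len;
        if( cp >= 0x10000 ) {
            // Outside the BMP: emit a surrogate pair.
            cp -= 0x10000;
            out += static_cast<char16_t>( 0xD800 + ( cp >> 10 ) );
            out += static_cast<char16_t>( 0xDC00 + ( cp & 0x3FF ) );
        }
        else {
            out += static_cast<char16_t>( cp );
        }
    }
    return out;
}

#ifdef _WIN32
#include <cstdlib>
// Windows only: wchar_t is 16 bits there, so the UTF-16 units can be
// copied into a std::wstring and handed to _wsystem().
int runCommandUtf8( const std::string &cmd )
{
    std::u16string units = utf8ToUtf16( cmd );
    std::wstring wide( units.begin(), units.end() );
    return _wsystem( wide.c_str() );
}
#endif
```

The point is the same as in the one-liner above: the conversion to a wide string has to happen explicitly, because plain system() takes a narrow string that Windows interprets in the current ANSI code page rather than as UTF-8.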

Thank you so much, Andrew.