I’d like to use Cinder’s toUtf16() function on a string like “ñêQRV”, but this throws an invalid UTF-8 exception. I’m guessing I have to convert the characters to something representing their Unicode values, since the bytes themselves aren’t valid UTF-8? Any tips or hints on doing this?
I’m reading text from MP3 ID3 tags, and apparently they occasionally use a (now) deprecated encoding: specifically, “UCS-2 encoded Unicode with BOM”, according to the Wikipedia page. I spent at least an hour trying to find the ‘right way’ to decode it properly, until I eventually gave up and settled for a solution provided by a kind soul on the GitHub page for the ID3 decoding library. I’m left with a std::string that occasionally contains what are apparently invalid UTF-8 byte sequences, because when I call toUtf16() on it, I get an exception saying so.
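For reference, here is a minimal sketch of what decoding “UCS-2 with BOM” into UTF-8 could look like in plain standard C++. It is not taken from Cinder or from the ID3 library. The helper names ucs2WithBomToUtf8 and appendUtf8 are hypothetical, and the sketch assumes the raw tag bytes (BOM included) are available in a std::string, that a zero code unit terminates the text, and that surrogate code units (which strict UCS-2 does not use) can be replaced with U+FFFD.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Append one Basic Multilingual Plane code point (<= U+FFFF) as UTF-8.
static void appendUtf8(std::string& out, std::uint32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);                         // 1 byte
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));           // 2 bytes
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));          // 3 bytes
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: `raw` holds the tag's text bytes, starting with the BOM.
// Returns an empty string if no BOM is found (an assumption of this sketch).
std::string ucs2WithBomToUtf8(const std::string& raw) {
    if (raw.size() < 2) return std::string();
    const unsigned char b0 = static_cast<unsigned char>(raw[0]);
    const unsigned char b1 = static_cast<unsigned char>(raw[1]);

    bool littleEndian;
    if (b0 == 0xFF && b1 == 0xFE)      littleEndian = true;   // FF FE
    else if (b0 == 0xFE && b1 == 0xFF) littleEndian = false;  // FE FF
    else return std::string();

    std::string out;
    // Walk the payload two bytes at a time; a trailing odd byte is ignored.
    for (std::size_t i = 2; i + 1 < raw.size(); i += 2) {
        const std::uint32_t lo = static_cast<unsigned char>(raw[littleEndian ? i : i + 1]);
        const std::uint32_t hi = static_cast<unsigned char>(raw[littleEndian ? i + 1 : i]);
        std::uint32_t unit = (hi << 8) | lo;
        if (unit == 0) break;                                  // assumed terminator
        if (unit >= 0xD800 && unit <= 0xDFFF) unit = 0xFFFD;   // not valid in UCS-2
        appendUtf8(out, unit);
    }
    return out;
}
```

If the bytes in the std::string are not actually UCS-2 (for example, if the earlier decoding step has already altered them), this sketch will not help, so it is worth dumping the raw bytes first to confirm what is really there.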
I use toUtf16() primarily so I can remove a single character at a time from a string and be sure it is a ‘complete’ character rather than a portion of one.
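As an aside, removing one ‘complete’ character does not strictly require a UTF-16 round trip. Below is a minimal sketch, assuming the std::string already holds valid UTF-8, that drops the last code point by skipping backwards over continuation bytes (those of the form 10xxxxxx). The function name popLastCodePoint is hypothetical, not part of Cinder.

```cpp
#include <cstddef>
#include <string>

// Remove the last code point from `s`, assuming `s` is valid UTF-8.
void popLastCodePoint(std::string& s) {
    if (s.empty()) return;
    std::size_t i = s.size() - 1;
    // Step back over continuation bytes (10xxxxxx) until a lead byte
    // or a plain ASCII byte is reached.
    while (i > 0 && (static_cast<unsigned char>(s[i]) & 0xC0) == 0x80) {
        --i;
    }
    s.erase(i);  // erase from the start of that code point to the end
}
```

Note that this works per code point, not per visible glyph: a letter followed by a combining accent is two code points, so it would take two calls to remove both.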
So I’m wondering what to do. Perhaps convert the characters to proper UTF-8 using toUtf8()? But if that’s the approach, how do I do that when the characters are already in a std::string to begin with?
Thanks in advance,