At the moment I’m trying to make a small bridge between libcinder and dlib. Everything is working except image conversion from dlib to Cinder. Let me elaborate a bit:
In dlib, images are either 2D arrays or matrices. The pixel type is templated and can be anything, including unsigned char, unsigned short, float, etc. Following the same methods used in the OpenCV3 block (and having a look at the openFrameworks implementation), I’ve made an ImageSource class as:
When dealing with all sorts of dlib images with an unsigned char pixel type, everything works fine, but when using float or unsigned short I get weird results like this:
Don’t mind pictures 4 and 5. These are because of dlib’s different color models (HSI and LAB). But as you can see in pictures 7 and 8, the results are very weird. 7 is float and 8 is unsigned short.
Can anyone help me out here? Am I doing something wrong in the load() method?
Thanks a lot guys
I should also mention that the DataType, ColorModel and ChannelOrder are all correct when I debugged the code. unsigned short should be ci::ImageIo::DataType::UINT16, right?
Thanks Paul for the suggestion, it didn’t seem to cut it though.
Meanwhile I tried some variations on the previous code. I suspected the reinterpret_cast to be the culprit, and when I started digging in Cinder’s source I found ImageFileTinyExr.cpp, which uses another method to load into the ImageTargetRef. I hacked around a little and reached this:
The unsigned short version is looking correct and the float version kind of inverted.
I still can’t figure out where the problem is though. This was just trial and error on my side, so it could be pure luck and incorrect code yielding correct results. Can you maybe see where the problem is?
Yeah, that’s looking better. All I can think of now is that the channel order is different for the float data. Maybe it’s BGRA, or RGBA, or any other combination? I believe Cinder defaults to ARGB.
I don’t think that’s it. In dlib, rgb, rgba and bgr each have their own pixel type and all range from 0-255 (unsigned chars); they’re actually the first three in my test. The last three are all grayscale and have only one channel. They are of types array2d<unsigned char>, array2d<float> and array2d<unsigned short>.
I’m also starting to suspect the conversion of the values. I remember reading in the dlib docs that the min and max of the value range can change depending on the value type. And as far as I understand Cinder likes uint8_t, right? Also, the first 6 images are all based on unsigned chars, and dlib explicitly mentions a 0-255 range for them here, but the last two follow this rule:
min() == the minimum obtainable value of objects of type T
max() == the maximum obtainable value of objects of type T
2- Is the numeric range different for different DataTypes, or are they all between 0-255? I guess this question also applies to the surface/channel pixel iterators. For instance, do the Surface32f iterator’s values span float’s full numerical range, or are they floats between 0-255? I checked Channel.h, for instance, and found that getData() returns reinterpret_cast<T*>( reinterpret_cast<unsigned char*>( mData + offset.x * mIncrement ) + offset.y * mRowBytes ). Why cast to unsigned char* and then back to T*? I can’t wrap my head around this easily.
Regarding your second question: the reinterpret_cast results in an unsigned char*, making sure the subsequent addition of offset.y * mRowBytes is expressed in bytes, rather than anything else. The result of that is a pointer to a new address, containing the actual value, which is then cast to the correct type.
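A minimal, self-contained sketch of that byte-offset arithmetic, for anyone who wants to try it outside Cinder. The function `pixelAt` and its parameters are hypothetical stand-ins that only mirror the expression quoted from Channel.h; this is not Cinder’s actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of Cinder): returns a pointer to pixel (x, y).
//   data      - pointer to the first value of the image
//   increment - step between horizontally adjacent pixels, counted in T
//   rowBytes  - distance between the starts of two rows, counted in BYTES
template <typename T>
T* pixelAt( T* data, std::ptrdiff_t increment, std::ptrdiff_t rowBytes, int x, int y )
{
    assert( data != nullptr );

    // Horizontal step: arithmetic on a T*, so it moves in units of sizeof(T).
    T* column = data + x * increment;

    // Vertical step: rowBytes is a byte count, so do the addition on an
    // unsigned char*, where +1 means exactly one byte.
    unsigned char* bytes = reinterpret_cast<unsigned char*>( column ) + y * rowBytes;

    // Cast back so the caller reads a T at the computed address.
    return reinterpret_cast<T*>( bytes );
}
```

If the vertical step were added directly to the T*, the offset would be multiplied by sizeof(T) a second time, which would land on the wrong row for anything wider than one byte (float, uint16_t), while still working by accident for unsigned char.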
I believe float values are expected to be in the range [0…1], otherwise known as normalised values, but I could be wrong. HDR images (like the EXR format), for example, will have a far greater range of values.
As for your first question: no, I believe that mData can be any kind of data, as long as it is supported by the conversion routine in func. Best advice I can give you is to step through the code with the debugger to see what exactly is going on. Also use image source data that you know intimately, for example only floats with a value of 1.0f. Then you know whether or not the conversion is done correctly.
Here’s the explanation: At first I started playing with values to understand Cinder’s numeric ranges. For uint8_t we already knew the range was 0-255. For floats, as Paul mentioned in his post, it’s a normalized range of 0.0f - 1.0f. For uint16_t the range is 0-65535. To be precise, for floats it’s 0.0f - 1.0f and for uint types it is:
As for dlib’s range (see this issue for detailed answer from dlib’s Davis), I realized that for most use cases and also in my case (since I was loading a jpg in a float container) the ranges are 0-255.
Here mSourceValueMin and mSourceValueMax are assumed to be 0-255 unless changed by the user. Also, I’m not sure that using vectors like I did is the fastest route, but it works pretty well for my case.
The only piece of the conversion left to tackle is the LAB and HSI color models. I’ve seen the CHAN_LAB_L, CHAN_LAB_A and CHAN_LAB_B definitions in Cinder’s ChannelType but still have to dig in to see how they will work for my case.
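For readers following along, the rescaling described above can be sketched roughly like this. It is a standalone illustration, not the code from the block: `rescaleRow` and `targetMax` are hypothetical names, and `sourceMin` / `sourceMax` play the role of mSourceValueMin / mSourceValueMax, assumed to default to 0-255.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>
#include <vector>

// Upper bound of the target range for T: 1.0 for floating point types,
// the type's maximum (255, 65535, ...) for unsigned integer types.
template <typename T>
constexpr double targetMax()
{
    return std::is_floating_point<T>::value
        ? 1.0
        : static_cast<double>( std::numeric_limits<T>::max() );
}

// Hypothetical helper: maps one row of values from [sourceMin, sourceMax]
// onto [0, targetMax<T>()], clamping anything outside the source range.
template <typename T>
std::vector<T> rescaleRow( const std::vector<T>& row, double sourceMin = 0.0, double sourceMax = 255.0 )
{
    assert( sourceMax > sourceMin );

    std::vector<T> out;
    out.reserve( row.size() );
    for( T value : row ) {
        // Normalise to [0, 1] first, then stretch to the target range.
        double t = ( static_cast<double>( value ) - sourceMin ) / ( sourceMax - sourceMin );
        t = std::min( 1.0, std::max( 0.0, t ) );
        const double scaled = t * targetMax<T>();
        // Integer targets are rounded to nearest; floats are stored as-is.
        out.push_back( static_cast<T>( std::is_floating_point<T>::value ? scaled : scaled + 0.5 ) );
    }
    return out;
}
```

With the default source range, a float row of { 0.0f, 127.5f, 255.0f } comes out as { 0.0f, 0.5f, 1.0f }, and a uint16_t value of 255 becomes 65535. Copying each row into a vector like this is simple rather than fast; rescaling in place would avoid the extra allocation.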
Is there any chance you would share your completed file, Kino? I’m also trying to bridge cinder and dlib and would like to see how you are converting from cinder to dlib as well (I’ve been attempting a pretty naive approach and am curious to see how you have done the different colour spaces)
Sorry for seeing this late, have been terribly busy the past two weeks and didn’t realize you addressed me here.
Actually I managed to (almost) wrap up a Cinder-dlib block back then. I needed it for a project that got cancelled and I never got to upload it anywhere. I’d be happy to upload it to GitHub, but I might not be able to get around to it for 3-4 days since I’m travelling today and will be busy installing a project over the next few days. I hope that’s ok. Will let you know here.
That would be great, thanks!
I have got a lot closer, but I’d love to see how you’ve done it - hopefully it will clarify my questions regarding the imageSource and imageTarget classes (although I think I’m understanding better now).