External library seems to cause heap corruption when used in Cinder, but not outside of it...?

Hey Embers,

I’ve been on a fun journey down the rabbithole of using TagLib together with Cinder.

I have encountered what appears to be nasty heap corruption using the following code:

TagLib::FileRef tagLibFileRef( "D:/Music/PMPVsTokyo/New2/gibs - VIXEN - Thru Corinnes Eyes.mp3" );
TagLib::String artist = tagLibFileRef.tag()->artist();

std::string test3 = artist.to8Bit( true ); // Works Great - Celebration!

std::string test4;
test4 = artist.to8Bit( true ); // Debug Assertion Failed! Line 996 - Expression: __acrt_first_black == header

I’m no expert on memory / heap related stuff, but I think the code above should be fine. I inquired with the library makers who reasonably state that the code’s been in wide use since 2013 and so is unlikely to be buggy.

I then tried using TagLib in a completely empty C++ project without Cinder. No crash, the code above works fine.

Running the same code in a completely empty C++ project with Cinder (0.9.2) causes issues.

Things I’ve investigated:

  • The heap corruption is only ‘visible’ when compiled in Debug mode. Release mode yields the expected string in both cases. My assumption is that the heap corruption also occurs in Release mode, but given less rigorous checking simply isn’t caught.
  • Suspecting perhaps improper Debug/Release library linking I’ve tried linking both with TagLib installed via vcpkg (which I use for lots of other packages with no issues), as well as via a manually built local TagLib version. Same behavior.
  • I dynamically link with TagLib (due to LGPL) and the Windows Runtime library my Cinder project is build with is Multi-Threaded, as opposed to Multi-Threaded DLL. I noticed that TagLib was built using the Multi-Threaded DLL, so I tried switching to Multi-Threaded. Same behavior.

And now I’m here… Pretty stumped as to what’s next…?

Open to any suggestions. Also open to someone confirming that the run-time library switching actually doesn’t matter as dynamic linking means both the exe and DLL can have their own run-time library.

Cheers,
Gazoo

Using the STL across DLL boundaries is always problematic, you need to make sure you have a consistent ABI (i.e same compiler with same settings) to compile both otherwise you’ll run into a lot of this.

What you’re seeing now is memory that is allocated in a separate heap (the DLL) which is then being deallocated in your heap, perhaps using a completely different implementation of ~std::string

Nitty gritty here, though I suspect you won’t like his conclusion.

*edit Looking at taglib itself, there’s methods provided that only use primitive datatypes, can you try: std::string test3 = artist.toCString ( true ) instead? This way the std::string only ever exists on your side.

2 Likes

@lithium - Wish I had you sitting next to me 2 days ago :slight_smile: Would have saved me a lot of trouble. On the other hand isolating issues and learning to love CMake is kind of useful too.

I actually love the conclusion and explanation. While I definitely prefer more modern data structures, such as std::string, if I get a well argued case as to where I need to work around using it, then I’m all ears. I just hope this experience sits a bit in me so I have some memory of it next time.

I think .toCString() is just the ticket @lithium ! Thank you very much for the insight!

Hopefully you don’t me mind picking your brain again @lithium. As my good fortune would have it, I ran into the exact same issue again with a different library again (Dpp). The library uses call-back functionality to handle various behaviors which make it murky for me to see whether the allocation of STL objects actually crosses Application / DLL boundaries. See sample code below:

dpp::cluster bot( "discord_bot_token" );

    bot.on_ready([&bot](const dpp::ready_t& event) {
        bot.log(dpp::ll_info, "Logged in as " + bot.me.username);
    });

    bot.on_message_create([&bot](const dpp::message_create_t& event) {
        if (event.msg->content == "!ping") {
            bot.message_create(dpp::message(event.msg->channel_id, "Pong!"));
        }
    });

    bot.on_log([](const dpp::log_t& event) {
        std::cout << dpp::utility::loglevel( event.severity ) << ": " << event.message << "\n";
        if (event.severity > dpp::ll_trace) {
            std::cout << dpp::utility::loglevel(event.severity) << ": " << event.message << "\n";
        }
    });

    bot.start(false); 

Unfortunately, regardless of what the library does or doesn’t do, it still triggered the same heap corruption error (__acrt_first_block == header). Not seeing an easy way to work-around the problem in this library (no .toCString() here), I decided to google the specific error, leading to this page:

https://docs.microsoft.com/en-us/answers/questions/240332/assert-when-using-mtd-dynamic-library.html

The explanation on that page further supports your previous advice. But that doesn’t provide a straight-forward solution for using dpp. Therefore, based on the explanation I thought to re-compile my application using ‘Multi-Threaded DLL’ (MD) as opposed to just ‘Multi-Threaded’ (MT). This results in the error going away.

So why this giant epilogue? Well - I’d still like your advice on these details if you’re willing to offer it. I’ve further read up a bit on using MD v. MT:

Cinder’s CMake by default builds with MT, but the general sentiment on the stackoverflow appears to be to favor MD if you plan to link with external DLLs. I’m admittedly a bit worried that the issue has now just become invisible rather than disappeared, although if one is forcing the executable and DLLs to use the same heap manager, it would seem that the problem ‘should’ actually be solved…?

Thoughts?

Cheers,
Gazoo

I don’t have a firm enough grasp on the nuances of the different CRT modes to comfortably offer any real advice, I’m afraid. I encountered enough of this stuff early in my c++ career to engender a pretty serious NIH syndrome so I try to write and/or control as much of my own code as humanly possible to avoid wasting time in DLL hell. The down side means I haven’t spent a huge amount of time dicking around in visual studio’s various property panels trying to fluke the right incantations. Once a library starts being difficult my instinct is to spend the time rewriting its functionality rather than bending it to my will, for better or worse :wink:

Disclaimers out of the way, this is well worn territory and isn’t some secret undocumented compiler feature that you’re using, so I’d expect it to work at least as well as any other CRT mode you might use. The other good thing is that it’s very simple to change, you don’t have to invest a month just to find out /MD doesn’t work for you. I’d just go with /MD and keep an eye on it until it gave me an actual reason to doubt it. Not exactly scientific but at least then you can move on improving your product in the meantime rather than rat-holing on one particular platform’s nasty compiler peccadilloes.

The only caveat that does jump to mind is that I believe you’re shipping as a downloadable product? (as opposed to fixed installations which is what I mainly do) so you may need to provide a redistributable c++ runtime to install alongside your software since the CRT will no longer be statically compiled into your binary and you don’t have control over the end user machine.

That’s a lot of words without much usable information, sorry I can’t be more concretely helpful. Good luck.

A

*edit Just reproduced your problem with dpp and i narrowed down the issue to the string that was being returned by dpp::utility::loglevel ( event.severity ). Removing those fixed the issue for me. It’s not exactly a scalable solution but if by some chance your code sample was the extent of what you needed DPP to do then that change might help you limp over the line :wink:

1 Like

I really appreciate you engaging me in a dialogue, and going further to test stuff out on your end. You’re a lovely person for doing that @lithium.

NIH syndrome? Did a quick google and didn’t find any acronym explanations that appeared to fit - what’s NIH stand for, for you?

I’m hoping to use a fairly wide swath of dpp's functionality as I move forward, so while the discovery is really appreciated, I think it worth switching to MD if that truly sidesteps this issue (in this and other libraries). That is - if I get it to work properly. For some reason the hello world example I pasted doesn’t work properly in my blank cinder framework app. Still looking into that. It does however work in a completely clean C++ application.

On minor comment on your ‘for better or for worse’ stance. The majority of the developers I have encountered appear to sway that way. In my experience, I have taken this task on once or twice and eventually regretted it. Typically because it just added to the maintenance load and given the scope of what my Product is becoming, I’m working overtime to try and shed as much of the load as possible and even with every single step, more load seems to some how creep in. In select instances I can see the tactic work for the best.

Cheers,
Gazoo

You’re welcome mate. :slight_smile:

NIH stands for Not Invented Here

In computer programming, “NIH syndrome” refers to the belief that in-house developments are inherently better suited, more secure, more controlled, quicker to develop, and incur lower overall cost (including maintenance cost) than using existing implementations.[ citation needed ]

I’d personally add “are less of a pain in the arse to build” to the top of that list of reasons though :wink:

For some reason the hello world example I pasted doesn’t work properly in my blank cinder framework app. Still looking into that. It does however work in a completely clean C++ application.

I had to run your sample code in a separate thread in order to have the wrapping cinder app function as normal because bot.start(false); is a blocking call, just on the off chance that’s what you mean by not working properly.

A

Ah that makes more sense - thanks for the clarification :slight_smile:

As for the library in Cinder, yeah, I did notice and I thought that blocking the rest of cinder shouldn’t prevent it from doing its thing. But for reasons that are currently unclear to me it does seem like there’s some interaction between Cinder and Dpp that causes issues. I’ve tried portioning the test code for Dpp into its own thread, by even so, the bot.on_ready() never gets triggered.

The output I’m expecting looks like this (from the non-Cinder C++ App):

INFO: Auto Shard: Bot requires 1 shard
DEBUG: Auto Shard: 987 of 1000 session starts remaining
DEBUG: Starting with 1 shards...
DEBUG: Connecting new session...
INFO: Shard 0/1 ready!
INFO: Logged in as PlanMixPlay
DEBUG: Shards started.

But what I get is this (in the empty Cinder app):

INFO: Auto Shard: Bot requires 1 shard
DEBUG: Auto Shard: 986 of 1000 session starts remaining
DEBUG: Starting with 1 shards...
DEBUG: Shards started.

I’ve written the Dpp authors as I’m a bit at a loss without more debug output where to look. I am also totally unclear as to how Cinder would be causing the Dpp code to not do its thing… O_o

Given that their example code works completely fine in a pure C++ (i.e. non-Cinder) application, I’m expecting them to (understandably) say "Got nothin’ to do with us’ :E

Gazoo

FYI I know literally nothing about discord but I just followed the instructions here to get a bot token and pasted it into your hello world code and it worked out of the box. I got all the messages you expected plus lots of JSON dumping to the console, so there’s nothing inherently wrong with it and cinder running together.

1 Like

Then I’m even more flummoxed. O_O

Do you mind if I ask how you separated the hello world into another thread…?

In the grossest possible way :wink:

void GazooTestApp::setup ( )
{
	static std::thread kRunner { [&]
	{
		dpp::cluster bot ( kToken );

		bot.on_ready ( [&bot] ( const dpp::ready_t & event ) 
		{
			bot.log ( dpp::ll_info, "Logged in as " + bot.me.username );
		} );

		bot.on_message_create ( [&bot] ( const dpp::message_create_t & event ) 
		{
			if ( event.msg->content == "!ping" ) 
			{
				bot.message_create ( dpp::message ( event.msg->channel_id, "Pong!" ) );
			}
		} );

		bot.on_log ( [] ( const dpp::log_t & event ) 
		{
			std::cout << event.message << "\n";
		} );


		bot.start ( false );
	} };
}

Still not behaving on my end. Given that the token I have works in the other project, I’m assuming it is not the problem.

Don’t quite understand this at all. I’m using their 32-bit binaries that they provided a link for and they seem to work in a separate project that’s pure C++. So I’m properly flummoxed…

Ok i tracked down the 32 bit binaries from their various repositories and made a 32 bit version of the hello world code inside a cinder app and now it’s presenting the same issue that you’re seeing, so at the very least you’re not going crazy, it’s a reproducible problem.

Anecdotally, I feel like I’ve had an issue with openssl silently failing on me before, but I can’t remember the actual case.

2 Likes

Ok - that restores a large part of my sanity. You’re a treasure @lithium .

On my list is to move to the 64bit binaries for my product, so I may as well spend a few hours dealing with that. I’ve yet to find a modern good example of a custom find_package build that can deal with debug/release, and x86/x64 too. But I have found lots of individual pieces. Let’s hope I can put it all together.

@Gazoo – you mentioned you’re using vcpkg; are you using the manifest system? The manifest system allows you to create a list of libraries that you want vcpkg to link to your project and build for you; it will make sure to build them with the same CRT mode, build settings, etc. It’s saved me a TON of headaches of finding the right compiler settings for each external library I want to use, and might be worth a look if you’re just using vcpkg by invoking the command line interface.

1 Like

@morphogencc Much appreciate the heads up. If I’m not mistaken, I believe the manifest mode is the only way to also version packages.

I’ve not had too much trouble with the compiler settings for the libraries thus far. If anything, a bigger nightmare was accidentally installing boost via vcpkg and already having my framework support boost too and then trying to force some specific boost settings in CMake which the vcpkg CMake toolchain file just totally overwrites :confused: