Why is the factory pattern, producing shared_ptr only, enforced everywhere?


#1

I wonder why cinder forces the use of the ::create factory pattern everywhere.
This leads to non-optional heap allocation, (imho) ugly pointer syntax and potential memory fragmentation. shouldn’t a small object like Texture2d be allocated on the stack? further, shared_ptr’s ref-counting causes a small but sometimes noticeable overhead. Could someone please explain to me the design rationale behind this decision? or is there maybe an old forum thread that discusses this “issue” in detail?

here is a quote from someone else that expresses my dislike of pointers better than i could do:

Most uses of pointers in C++ are unnecessary.
( http://programmers.stackexchange.com/questions/56935/why-are-pointers-not-recommended-when-coding-with-c )

Unlike other languages, C++ has very strong support for value
semantics and simply doesn’t need the indirection of pointers. This
wasn’t immediately realised – historically, C++ was invented to
facilitate easy object orientation in C, and relied heavily on
constructing object graphs which were connected by pointers. But in
modern C++, this paradigm is rarely the best choice, and modern C++ idioms often don’t need pointers at all. They operate on values rather than pointers.

Unfortunately, this message has still not caught on in large parts of
the C++ user community. As a result, most of the C++ code that is
written is still littered with superfluous pointers which make the code
complex, slow and faulty / unreliable.

For somebody who knows modern C++, it’s clear that you very rarely need any
pointers (either smart or raw; except when using them as iterators).
The resulting code is shorter, less complex, more readable, often more
efficient and more reliable.


#2

here are the intentionally polemical slides
"Dont use f*ckng pointers !"
http://klmr.me/slides/modern-cpp/#1
by Konrad Rudolph


#3

For a number of OpenGL objects, shared pointer semantics make memory management easy and relatively error-free, and allow for easy use of those objects in multiple sections of a user codebase. They let us guarantee that the OpenGL resource referred to by the stored id is cleaned up at an appropriate time, in a way that would be difficult using only value semantics.

Move-only types are a viable (and very recent) alternative to using smart pointers for enabling this kind of correct, automatic cleanup with value types. Unfortunately, they only support behavior similar to that of unique_ptr: you can only meaningfully store the value in one place. And furthermore, they can’t become shared values the way a unique_ptr can become a shared_ptr. If you wanted to later share a move-only type with other parts of your program, you would need to pass a reference or value that is no longer strongly associated with the validity of the underlying value.
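
To make the unique_ptr comparison concrete, here is a minimal sketch of unique ownership being promoted to shared ownership. It is not Cinder code: `FakeTexture` and `promoteAndShare` are hypothetical names, and the integer id merely stands in for a GL resource.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for an object that owns a GPU resource id.
struct FakeTexture {
    int id;
    explicit FakeTexture(int i) : id(i) {}
};

// Starts with unique ownership, then promotes it to shared ownership.
// Returns the use count seen once a second owner has been added.
long promoteAndShare() {
    std::unique_ptr<FakeTexture> sole(new FakeTexture(7));

    // A unique_ptr can be moved into a shared_ptr; `sole` is left empty.
    std::shared_ptr<FakeTexture> shared = std::move(sole);
    assert(sole == nullptr);

    // A second owner of the same object. A plain move-only value type
    // has no equivalent step: it can change hands, but not gain owners.
    std::shared_ptr<FakeTexture> other = shared;
    assert(other->id == 7);
    return shared.use_count();
}
```

The promotion only works in one direction, which is the asymmetry described above.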

So: shared_ptrs are used for managing the lifetime of OpenGL data because they are good at automatically managing the lifetime of objects without lots of user hoop-jumping. Profile your application to check whether shared_ptrs are slowing you down in your specific case.

Elsewhere: value objects are used throughout Cinder to great effect. See all the gl::Scoped* types that use the RAII pattern to manage OpenGL state.
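
As a rough illustration of that RAII idea, here is a sketch of a scoped state guard. It is not taken from Cinder: `ScopedFlag` and the global `gFlagEnabled` are hypothetical, with the global standing in for a piece of OpenGL state.

```cpp
#include <cassert>

// Hypothetical global standing in for a piece of OpenGL state.
static bool gFlagEnabled = false;

// Sets the state on construction and restores the previous value on destruction.
class ScopedFlag {
public:
    explicit ScopedFlag(bool enable) : mPrevious(gFlagEnabled) { gFlagEnabled = enable; }
    ~ScopedFlag() { gFlagEnabled = mPrevious; }

    // A guard is tied to one scope, so copying it makes no sense.
    ScopedFlag(const ScopedFlag&) = delete;
    ScopedFlag& operator=(const ScopedFlag&) = delete;

private:
    bool mPrevious;
};

// Returns true if the state was enabled inside the scope and restored after it.
bool scopedStateRoundTrips() {
    const bool before = gFlagEnabled;
    bool inside = false;
    {
        ScopedFlag scope(true);  // lives on the stack, no heap allocation
        inside = gFlagEnabled;
    }                            // destructor restores the previous state here
    return inside && gFlagEnabled == before;
}
```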


#4

Just read through the pointers slides and they are unrelated to the original question. Same with the stackexchange article.

Both those links are arguing against using raw pointers. Cinder does not hand you raw pointers and does not expect them from you (except when declaring i/o parameters to a function). Smart pointers (unique_ptr, shared_ptr, weak_ptr) are value types that manage dynamic memory so you don’t need to. They aren’t the pointers Konrad is complaining about.

I think my previous response should address your question about the factory pattern. Shared pointers are a very good idea. They’re such a good idea that Apple adopted reference counting as ARC and built the Swift programming language around it.

If you want a refresher on the difference between pointer and value types and what shared_ptrs actually are, you might enjoy reading A Soso Tour of C++. It’s a brief reference guide I wrote for Sosolimited’s developers to quickly get up to speed with C++ concepts.


#5

Also note that the shared_ptrs themselves are actually created on the stack. The data they point to, of course, is created on the heap.


#6

@sansumbrella: thank you for your answer. in complex scenarios, the use of a ref counting mechanism to ensure proper and not premature freeing of a texture id makes sense. therefore, the shared_ptr is (ab)used (?) for that scenario (please excuse me being provocative).
but for simple scenarios this is overkill and enforces the usage of pointers (imagine a 2d opengl app having a background texture that needs to be drawn once and does not need to be shared or handed around different parts of the program).

my question why the factory pattern is enforced remains unanswered. why not offer both options: stack allocated objects and, for complex scenarios, shared_ptr wrapped objects via the ::create factory? but looking at the code, the constructors are intentionally made protected (and are not complete).

@paul: yes, i know that the shared_ptr object itself lives on the stack. and this makes it even more unpleasant, knowing that a Texture2d merely holds an opengl texture id. yet, in the create function, this tiny object is allocated on the heap!

imagine a particle system with procedurally generated textures. it may allocate tens of thousands of small textures with a short lifetime. all of that goes through the potentially very slow ‘new’ operator. further, shared_ptr’s reference counting is thread safe, using atomic increments/decrements, which causes a (tiny) performance penalty…

Texture2dRef Texture2d::create( int width, int height, const Format &format )
{
    if( format.mDeleter )
        return TextureRef( new Texture( width, height, format ), format.mDeleter );
    else
        return TextureRef( new Texture( width, height, format ) );
}
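
for anyone who wants to try the pattern outside of cinder, here is a self-contained sketch with the same shape as the function above: a protected constructor, a static create() and an optional custom deleter. `Handle`, `HandleRef` and `customDeleterRuns` are hypothetical names, and the class holds only an int where a real texture would hold a GL id.

```cpp
#include <cassert>
#include <functional>
#include <memory>

class Handle;
using HandleRef = std::shared_ptr<Handle>;

// Hypothetical class that mirrors the shape of the create() function above.
class Handle {
public:
    using Deleter = std::function<void(Handle*)>;

    // The only way to obtain a Handle: always heap allocated, always shared.
    static HandleRef create(int id, Deleter deleter = nullptr) {
        if (deleter)
            return HandleRef(new Handle(id), deleter);
        return HandleRef(new Handle(id));
    }

    int id() const { return mId; }

protected:
    // Protected, so `Handle h(1);` in user code does not compile.
    explicit Handle(int id) : mId(id) {}

private:
    int mId;
};

// Returns true if the custom deleter ran when the last reference went away.
bool customDeleterRuns() {
    bool ran = false;
    {
        HandleRef h = Handle::create(1, [&ran](Handle* p) { ran = true; delete p; });
        assert(h->id() == 1);
    }  // last reference dropped here
    return ran;
}
```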

maybe the better solution would be to allow the programmer to choose between Texture2dRef and stack allocated textures using c++11 move semantics ?

setup()
{
    ...
    texture_bg = Texture2d(loadImage(loadAsset("bg.png")));
    ...
}

or is it impossible to combine both concepts in a clean way ?


#7

The factory pattern is convenient and ensures correct behavior. Being concerned about the dynamic allocation CPU-side is a red herring: every texture or vbo handle on the CPU corresponds to a much larger dynamic allocation happening on the GPU to store the texture or vertex data. Furthermore, there is also the expensive process of transferring data from the CPU to the GPU to initialize that dynamic memory.

Basically: try not to worry about it. There are much bigger bottlenecks in texture creation than the tiny dynamic allocation made for the handle. And the shared_ptr makes it easier for you to reuse that texture and reap the performance benefits of not creating the world from scratch every frame.

If the dynamic CPU allocation gives you pause about creating tons of textures on the fly, then that is an incidental benefit; you should not be creating tons of textures on the fly if you want your application to run smoothly.


#8

It’s also worth mentioning that, where it makes sense, you are able to construct and use objects with value semantics in tandem with the create() pattern. The Surface, TriMesh, and audio::Buffer classes are all examples of this. These objects can be fairly large and you may want to share them, so there are shared_ptr<T> typedefs for them, but at the same time you might want to use the object as a container that can be copied just like a std::vector. Also worth noting that all of these objects allocate their memory on the heap (just like std::vector), so it isn’t like you’re going to have any worse memory fragmentation one way or the other. To avoid that you’d need a much more complicated engine creating your objects, and it would be very platform-specific (like console games do).
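
A small sketch of what "both options in tandem" can look like, assuming a made-up container type. `PixelBuffer` is hypothetical and is not how Surface is actually implemented; it only shows a public constructor living next to a static create().

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical container type with both a public constructor and create().
class PixelBuffer {
public:
    explicit PixelBuffer(std::size_t size) : mData(size, 0) {}  // usable as a plain value

    static std::shared_ptr<PixelBuffer> create(std::size_t size) {
        return std::make_shared<PixelBuffer>(size);             // or as a shared reference
    }

    std::vector<int>& data() { return mData; }
    const std::vector<int>& data() const { return mData; }

private:
    std::vector<int> mData;  // the pixel storage lives on the heap either way
};

// Copying a value makes an independent deep copy.
bool valueCopyIsIndependent() {
    PixelBuffer a(4);
    PixelBuffer b = a;
    b.data()[0] = 255;
    return a.data()[0] == 0;
}

// Copying a reference shares one buffer.
bool refCopyIsShared() {
    std::shared_ptr<PixelBuffer> a = PixelBuffer::create(4);
    std::shared_ptr<PixelBuffer> b = a;
    b->data()[0] = 255;
    return a->data()[0] == 255;
}
```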


#9

As the person largely responsible for that original design decision, I can weigh in a bit on the reasons and a bit about where we’ll likely head in the future.

There are two core issues here to my mind: allocation and copy semantics. I would agree with those who’ve said that the performance implications here are more theoretical than practical. This pattern is most heavily used with Cinder’s OpenGL objects, and the cost of allocating any of the underlying GL types easily overwhelms heap vs stack allocation. Relatedly, short-lived GL objects almost always point to a badly performing design, or at least a design where performance is not a primary concern.

The other issue here though is centered on copy semantics. In particular, what does a user mean with a statement like myTextureA = myTextureB;? A GPU-side clone is almost always undesirable - certainly for automatic behavior. An under-the-hood refcount (which an early design of Cinder used) is confusing in practice, as users have to question whether a given class is using that technique or not. However, disallowing copying entirely is problematic when a user wants something perfectly natural like a std::vector<gl::Texture>. A shared_ptr<>, however, provides well-defined copy semantics, and again, in practical terms the overhead is easily underwritten by the cost of the underlying GL object. It’s also worth pointing out that atomics overhead is basically only incurred in copying, constructing and destructing. In general we pass *Ref instances as const& to avoid this overhead, since it exceeds the double-dereference overhead.

That said, this design antedates widely available rvalue refs, which we likely would have used otherwise. As an aside, a fact I did not fully appreciate before starting Cinder is that design for a library like it has to account for user comfortability with concepts. For better or worse, Cinder is many users’ introduction to C++ itself, so in some instances we are less aggressive with technical complexity than we might be otherwise, though as a general rule we would favor power over simplicity when they’re at odds.

Move semantics allow us to address issues like the std::vector<gl::Texture> example, while still disambiguating copy semantics (by the blunt instrument of preventing copies outright, at least with GL objects). For other types like ci::Surface we support stack instantiation as well as the static create() pattern. ci::Surface also supports move semantics, and in that case we support copying as there’s well-defined, intuitive behavior.
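
To illustrate, here is a sketch of a move-only type stored in a std::vector. Everything in it is hypothetical: `MoveOnlyTexture` is not a Cinder class, and the `gReleased` counter stands in for the GL delete call a real destructor would make.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical counter standing in for GL delete calls.
static int gReleased = 0;

// Move-only owner of a fake resource id; id 0 means "owns nothing".
class MoveOnlyTexture {
public:
    explicit MoveOnlyTexture(int id) : mId(id) {}
    ~MoveOnlyTexture() { if (mId != 0) ++gReleased; }

    // Copies are ambiguous for GPU resources, so they are forbidden outright.
    MoveOnlyTexture(const MoveOnlyTexture&) = delete;
    MoveOnlyTexture& operator=(const MoveOnlyTexture&) = delete;

    // Moves transfer ownership and leave the source empty.
    MoveOnlyTexture(MoveOnlyTexture&& other) noexcept : mId(other.mId) { other.mId = 0; }
    MoveOnlyTexture& operator=(MoveOnlyTexture&& other) noexcept {
        if (this != &other) {
            if (mId != 0) ++gReleased;  // release what we currently own
            mId = other.mId;
            other.mId = 0;
        }
        return *this;
    }

    int id() const { return mId; }

private:
    int mId;
};

// Stores three textures in a std::vector and returns how many releases happened.
int releasesAfterVectorUse() {
    gReleased = 0;
    {
        std::vector<MoveOnlyTexture> textures;
        textures.push_back(MoveOnlyTexture(1));
        textures.emplace_back(2);
        MoveOnlyTexture third(3);
        textures.push_back(std::move(third));  // `third` now owns nothing
        assert(third.id() == 0);
        assert(textures.size() == 3);
    }
    return gReleased;  // each id is released exactly once
}
```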

Going forward, a reasonable first step will be to expose constructors for classes like gl::Texture, implement move and disallow copy. However there are subtleties that remain. A simple example would be the various methods like ci::gl::draw( const TextureRef &, ... ). Because we know that gl::Texture only exists as a shared_ptr, we can write only that variant, but if some users have Texture and some TextureRef, things are more complicated. We either need to duplicate all of these methods for a non-Ref variant, or we have to break existing Cinder apps and force passing Texture by value. There are other cases as well - none insoluble but there’s some nontrivial work to be done.
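
A rough sketch of the "duplicate the methods" option, to show where the extra surface area comes from. The `Texture` struct and both `draw` overloads here are hypothetical and return an int only so the example can be checked; they are not Cinder’s signatures.

```cpp
#include <cassert>
#include <memory>

// Hypothetical types, not Cinder's API.
struct Texture { int id; };
using TextureRef = std::shared_ptr<Texture>;

// If Texture could also live on the stack, a value-based overload would do the real work.
int draw(const Texture& texture) {
    return texture.id;  // pretend this issues the draw call for this id
}

// Every existing Ref-based function would then need a forwarding twin
// so that current user code keeps compiling.
int draw(const TextureRef& texture) {
    assert(texture != nullptr);
    return draw(*texture);
}

// Returns true if both call styles reach the same implementation.
bool bothOverloadsAgree() {
    Texture onStack{5};
    TextureRef shared = std::make_shared<Texture>(Texture{5});
    return draw(onStack) == 5 && draw(shared) == 5;
}
```

One forwarding function is trivial; the cost is in repeating it for every function that currently takes a *Ref.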

Hopefully this sheds a little light on how we got to this design - it’s certainly not for lack of consideration, though I understand it’s imperfect. In practice it works well enough, and we’re always looking to improve Cinder, so thanks for weighing in with your question.

-Andrew


#10

andrew, rich.e and sansumbrella, thank you all for your in-depth answers. i now understand the design rationale much better and i am looking forward to cinder 1.0, which as andrew mentioned might use move semantics:

Going forward, a reasonable first step will be to expose constructors for classes like gl::Texture, implement move and disallow copy.

the general reason why i am unhappy with heap allocation is its unpredictable, nondeterministic runtime behavior and execution time.
but maybe i overreacted a bit with opening this thread.


btw - thank you all for making the wonderful cinder library !


#11

To be honest, I’ve never encountered serious memory problems that required me to rethink my allocation strategy. And I’ve been using Cinder professionally for many years now. In all fairness, and I mean no offense, I think this is a classic case of premature optimization.


#12

I’m not against the widespread use of shared_ptr in cinder, but that statement is misleading: every object that is managed by a shared_ptr requires an additional control block that is allocated on the free store. So
std::shared_ptr<Foo> ref(new Foo{}); results in TWO dynamic memory allocations: one for the Foo object and one for the control block.
std::make_shared mitigates that by allocating a single memory block for the object and the control block, but this can’t always be used (e.g. when you want to use a custom deleter or an array, or when you create a shared_ptr from a unique_ptr)
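
One way to observe the control block is to hand shared_ptr an allocator that counts requests. The sketch below assumes a hypothetical `CountingAllocator` and uses std::allocate_shared, the allocator-aware sibling of std::make_shared. The single combined allocation is what mainstream standard libraries do; the standard recommends it but does not strictly require it.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

struct Foo { int value; };

// Hypothetical minimal allocator that counts how often memory is requested.
template <class T>
struct CountingAllocator {
    using value_type = T;
    int* count;

    explicit CountingAllocator(int* c) : count(c) {}
    template <class U>
    CountingAllocator(const CountingAllocator<U>& other) : count(other.count) {}

    T* allocate(std::size_t n) {
        ++*count;
        return std::allocator<T>().allocate(n);
    }
    void deallocate(T* p, std::size_t n) { std::allocator<T>().deallocate(p, n); }
};

template <class T, class U>
bool operator==(const CountingAllocator<T>& a, const CountingAllocator<U>& b) { return a.count == b.count; }
template <class T, class U>
bool operator!=(const CountingAllocator<T>& a, const CountingAllocator<U>& b) { return !(a == b); }

// Object created with `new` first: the control block is a second, separate allocation.
int separateControlBlockAllocations() {
    int count = 0;
    {
        std::shared_ptr<Foo> ref(new Foo{}, std::default_delete<Foo>(), CountingAllocator<Foo>(&count));
        assert(ref.use_count() == 1);
    }
    return count;  // allocations made in addition to the `new Foo{}` above
}

// allocate_shared: object and control block share one allocation.
int combinedAllocations() {
    int count = 0;
    {
        std::shared_ptr<Foo> ref = std::allocate_shared<Foo>(CountingAllocator<Foo>(&count));
        assert(ref.use_count() == 1);
    }
    return count;
}
```

In the first function the counted allocation comes on top of the explicit `new`, so two allocations in total; in the second the counted allocation is the only one.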