Using smart pointers for DocumentObject*.

tanderson69 · September 17, 2023, 6:56pm

Time and a place for everything. The core guidelines are great but are just guidelines.

I did some testing this morning and I don’t like what I am seeing from std::enable_shared_from_this. The one feature I would look for is a decoupling of allocation and smart pointer creation and that seems to be restricted. I understand why they want to do the coupling, as it makes things more memory safe. In a brand new project that restriction can be baked into the design. Out in the real world I feel it is too much restriction. If I can’t arbitrarily create an intrusive smart pointer from a raw pointer without worrying about the existence of other smart pointers, I don’t really consider it an intrusive pointer and would look elsewhere. Here are 2 tests using std::shared and the other using boost::intrusive. compiled and ran these with address sanitizer. the std::shared triggered address sanitizer while the boost::intrusive_ptr finished without error.

//std::enable_shared_from_this: fails address sanitizer:
#include <iostream>
#include <memory>

class Test : public std::enable_shared_from_this<Test>
{
        int id;
public:
        Test() = delete;
        Test(int idIn) : id(idIn){}
        void go(){std::cout << "object #: " << id << " Hello World" << std::endl;}
};

int main(int, char**)
{
        auto p0 = std::make_shared<Test>(0);
        Test *tp = p0.get();
        std::shared_ptr<Test> p1(tp);
        p0->go();
        p1->go();
}

//boost::intrusive_ptr: passes address sanitizer:
#include <iostream>
#include <memory>
#include <boost/smart_ptr/intrusive_ptr.hpp>
#include <boost/smart_ptr/intrusive_ref_counter.hpp>

class Test : public boost::intrusive_ref_counter<Test>
{
        int id;
public:
        Test() = delete;
        Test(int idIn) : id(idIn){}
        void go(){std::cout << "object #: " << id << " Hello World" << std::endl;}
};

int main(int, char**)
{
        boost::intrusive_ptr<Test> p0 = new Test(0);
        Test *tp = p0.get();
        boost::intrusive_ptr<Test> p1(tp);
        p0->go();
        p1->go();
}

andrecaldas · September 17, 2023, 9:37pm

I didn’t understand most of it. Sorry.
But I think you want to pass raw pointers to the constructor of std::shared_ptr. You should not do that. This will make two instances of shared_ptr that have different control blocks.

The boost intrusive pointer are too smart for me… I don’t think we need them.

I am thankful the sanitizer was triggered, because the code is not good. This is not how you use an std::shared_ptr. The idea (it we were ideal ) is to never pass a raw pointer to its constructor. The correct code would be:

int main(int, char**)
{
        auto p0 = std::make_shared<Test>(0);
        // Test *tp = p0.get();
        std::shared_ptr<Test> p1 = p0;
        p0->go();
        p1->go();
}

For the case where you really want to get the shared_ptr from the raw pointer, you would do:

int main(int, char**)
{
        auto p0 = std::make_shared<Test>(0);
        Test *tp = p0.get();
        std::shared_ptr<Test> p1 = tp->shared_from_this();
        p0->go();
        p1->go();
}

If the pointers are always managed by a shared_ptr, the above is enough.

Unfortunately, we do not have such warranties in FreeCAD, yet.

andrecaldas · September 20, 2023, 10:34pm

I think I got what you mean. You prefer a smart shared pointer that stores its “control block” inside the pointed resource. Then, you would be able to “decouple allocation and smart pointer creating”… or would you???

From what I understood from this “intrusive pointer”, you want to:

Use new to allocate stuff.
Put this stuff inside smart pointers that will have a unique “control block” associated to each pointed resource.
This way you can keep passing ownership through raw pointers.

My question is… is the following pseudo code correct? (SmartPtr is the “intrusive pointer” thing)

  auto p = new A();
  SmartPtr sp(p);
  delete p;
  sp->DoStuff();

I mean… is “sp” aware of the fact that “p” was deleted?
Or… can sp prevent the “real deletion” of the resource pointed by “p”?
What warranties SmartPtr can give me that it is not dangling?

When I try to look at the big picture, I feel it is an incomplete solution, because it tries to do much more then it is capable of doing. At the end of the day, allocation will not be that decoupled. People would have to be very aware of having or not having full ownership of the pointer you are deleting. And the best and predictable thing to do would be to either:

Always allocate with SharedPtr; or
Be aware that after calling a few specific functions you are passing ownership and are no longer to delete your raw pointer. (and by the way, your raw pointer can become dangling!)

If you opt for “1”, there is no need to this “intrusive pointer”. You can get what you need from std::shared_ptr.

In the std::shared_ptr case, ideally, we should:

Always allocate those objects using std::make_shared or, through a fabric.
1.1 Never use “new”, except for the fabric.
1.2 Never pass a raw pointer to the shared_ptr’s constructor, except for the fabric.
Use the raw pointers whenever we do not intend to share ownership.
Use the shared_ptr whenever we do want to share ownership.
Use the weak_ptr to hold a reference to something that might get destructed. (It is much simpler then a signaling mechanism, but it is delayed, because you only get the information that the pointer became invalid when you try to access it)

boost::intrusive_ptr

Now, if you are talking about boost::intrusive_ptr, it seems to me that it is for something else. It is for when you want to provide your own reference counting. For example, if you wanted to use OCCT’s reference counting for shapes.

It is much more complicated and it does not resolve the “raw pointer deletion problem”.

And from the documentation: (I’d like to make some emphasis to the expression isn’t obvious)

As a general rule, if it isn’t obvious whether intrusive_ptr better fits your needs than shared_ptr, try a shared_ptr-based design first.

wmayer · September 20, 2023, 11:06pm

My question is… is the following pseudo code correct? (SmartPtr is the “intrusive pointer” thing)

auto p = new A();
SmartPtr sp(p);
delete p;
sp->DoStuff();

>

Definitely not! Deleted is deleted and there is no mechanism that prohibits the resource from destruction.

> I mean... is "sp" aware of the fact that "p" was deleted?

No.

> Or... can sp prevent the "real deletion" of the resource pointed by "p"?

No.

> What warranties SmartPtr can give me that it is not dangling?

SmartPtr can't do anything on it. The only option that prevents the resource from being deleted is to make its destructor protected. This is e.g. how OpenInventor classes do it.

andrecaldas · September 21, 2023, 11:13am

Ping: abdullah, wandererfan. I hope you guys can read past the first part. I believe you might find the second part interesting. Having shared_ptr in FreeCAD might be a challenge… but IMO it is not as hard as it seems. And it can make many things easier and actually safer.

That being said, tanderson69, it seems to me that there is no decoupling. There is not much purpose in using “new” and not being able to use “delete”. Or worse, using “new” and then destructing with “delete” only when it is correct. The best thing to do, in my opinion, would be to tightly couple object allocation and passing it to a smart pointer. That would mean:

Never allocate with “new” without passing it to a smart pointer right after it.
Never construct a smart pointer using a raw pointer. Except in case “1”, just after you have new’d it.
Never use “delete”.

The std::smart_ptr is just like the Olympic flame. After it was lightened in the Temple of Hera, you only lighten another Olympic Torch by using another already lightened Olympic Torch.

The std::enable_shared_from_this

Now, when an object derives from std::enable_shared_from_this, it carries a weak_ptr to the “control block” shared by a shared_ptr that owns it. A weak_ptr has some properties properties:

It does not increase the reference counter, so it does not hold the resource from being destructed.
It has access to the control block and can generate a new shared_ptr for the resource, as long as the reference counter is not zero. A zero counter means that the destruction of the object has already happened or at least, started.
(PS: Probably the control block has two counters. One for the resource deletion and one for the control block deletion. The weak_ptr must increase the control block counter, but not the resource counter.)

In the case of an enable_shared_from_this object, if you abide to the three rules at the beginning of this post, the object will hold a weak_ptr to itself. In this case:

It is a weak_ptr, so it does not hold itself from being destructed. Otherwise it would be a memory leak.
The weak_ptr would never be invalid and you would always be able to derive a valid shared_ptr from it. Because there are two ways the weak_ptr can be invalid:
2.1 A shared_ptr was never attributed to it. (we prevent this through policy: always tightly couple allocation and shared_ptr creation… item 1, above)
2.2 The counter reached zero. (in this case, the object was deleted and we are not supposed to access the weak_ptr anyways)

For that reason, enable_shared_from_this::shared_from_this() throws an exception when the weak_ptr is not valid.

Also, we avoid having two “control blocks” for the same resource through policy… item 2. We never create a shared_ptr from a raw pointer.

And we avoid the resource being deleted through policy: never delete this kind of resource… item 3. It would be incorrect to delete it, if you follow item 1.

Role of the enable_shared_from_this

The shared_from_this allows that anyone that has access to the resource might prevent it from being destructed:

In the present: by holding a shared_ptr.
In the future: by holding a weak_ptr… in this case, the resource might already be destructed when you try to access it in the future. But as long as it has not been destructed, you can acquire a lock from the weak_ptr to prevent its destruction.

Of course, we can instead implement our own “RequestWeakPtr” kind of method, instead.

Use of raw pointers… normally!

If we opt for shared_ptr, we can still use raw pointers almost everywhere. As long as those raw pointers are held ONLY locally, by functions being called by someone that holds a shared_ptr. Or by some one that is called by someone that is called… (n times)… holds a shared_ptr. The rule is simple:

You never store a raw pointer. You have to store a weak_ptr.
You never pass a raw pointer to a different thread.
If you follow those rules, you can use raw pointers normally.

Other advantages

If you use a shared_ptr, you:

Do not need to wait for certain threads to finish before “releasing a resource”.

This happens, for example, in TechDraw. The destructor for DrawViewPart’ waits for remaining threads to finish. This can block the application!

Do not need to disconnect signals.

The boost library allows one to pass a weak_ptr to slot::track.

By the use of this mechanism, only possible through the use of some shared_ptr kind:

The weak_ptr is locked before the slot is executed.
The object will remain valid until the end of the slot call.
If the weak_ptr cannot lock the resource (it has been destructed), the connection will be undone.

They call it “Automatic Connection Management”.

tanderson69 · September 22, 2023, 12:09am

‘New’ is hardly ever the problem. It is almost always delete. That is why there are millions of ‘smart pointered’ lines of code with ‘new’ all over the place without a ‘delete’ in site. I think I have made my point clear. If I am looking at large, mature c++ code base where raw pointers are the norm, I want a smart pointer that is going to do something sensible when it is constructed with a raw pointer. std::shared_ptr is not that. I thought it might change its behavior when it was instantiated with a type that was derived from enable_shared_from_this, but I was wrong as proved by my test.

Here is a thought experiment.
count up the number places that a DocumentObject*(or one of its subclasses) is stored other than the document.
count up all the unique functions in all possible call paths between the document and each of those other pointers.
This is the number of function overloads you will have to add to enable shared_ptr.
is that number:
10 ?
100 ?
1000 ?
10000 ?
100000 ?
Or don’t worry about it and just use an intrusive pointer at the start and end of the call path.

I have no ‘skin in this game’. I have done what I came to do and that is say enough so that I can come back in a year or two and say “I told you so”

andrecaldas · September 22, 2023, 5:09am

This is not true. Raw pointers can be used whenever you have warranties that the object will be alive when you access it. I don’t mind if people keep raw pointers because they believe the object shall live. If their hypothesis is wrong, their code shall crash one day. If it is right, then its alright.

It is obvious that delete is the problem. Ownership is all about being able or not to delete the pointer. There is nothing new here. The problem we face is the lack of precise definition of who holds ownership of what.

The class you want can be easily accomplished in more than one way. It does not solve anything but your own desire to pass a raw pointer to your “intrusive ptr”, though. If people go manually deleting pointers which ownership was already handled away, your solution will not solve anything. On the other hand, if you have specific an precise methods where ownership is known to be transferred, your solution is not needed.

I hope that only a fraction of those functions will store the pointer somewhere. And an even smaller fraction will take ownership (this means feel free to delete the pointer). Those are the ones you need to worry about.

I have some doubts on how the ownership works in FreeCAD. Because when you take your object and add it to the document, you are passing the ownership to the document. The document might delete your pointer. This is one reason we have so many signals signaling around.

What I don’t understand is:

Is there a way to pull the ownership back from the Document? I do know that Transaction and Document pass ownership to each other. But once you put it in the document, can you pull it back? That would be a problem, because once you share a pointer, there is no way to pull it back.

I thank you for your advice. But I am learning a lot… so no work is really lost.

But I suppose I’d better simply ignore you, then.

andrecaldas · October 27, 2023, 1:55pm

I have succeeded in using “std::shared_ptr” instead of “DocumentObject*” in:

Notice how the Transaction::~Transaction destructor became really simple! All it does is send signals.
https://github.com/andre-caldas/FreeCAD/blob/e162e72d536d6d50ecb419491e30382e1f603715/src/App/Transactions.cpp#L94

I hope we can eliminate the need of most of those signals using std::weak_ptr. A weak_ptr is, in a sense, a delayed signaling mechanism. You get informed of “destruction” when you try to lock it.

What made the process a little difficult, is the fact that the order of object destruction might change, and FreeCAD is full of hypothesis on the validity of raw pointers. In special, the use of “DocumentObject::getNameInDocument()” was very problematic, because it might return nullptr, but this is never checked. A series of PRs resolves this (and the code stopped crashing ):
https://github.com/FreeCAD/FreeCAD/pull/11139
https://github.com/FreeCAD/FreeCAD/pull/11141
(future PR) https://github.com/andre-caldas/FreeCAD/tree/name_in_document_step_03

By the way, abdullah… with all I have learned, I can say that this discussion over if methods shall return weak_ptr or shared_ptr… how to restrict non-core code developer from getting a shared_ptr… is basically non sense.

Everytime a function is passed a pointer, it shall get:

A raw pointer: this pointer shall NOT be stored. It is guaranteed to be valid as long as the function returns. The pointer cannot be passed to a thread, also.
A shared_ptr: if it will be (possibly) used and then stored either as a shared_ptr or as a weak_ptr.
A shared_ptr: if it will be passed to a new thread.
A weak_ptr: if it will only be stored. It makes no sense to receive a weak_ptr for use locally, because you will have to lock it and test for its validity.

Everytime someone asks for a pointer, s/he shall get: (function return)

A raw pointer: if the developer is sure the resource will not get destroyed by some thread. This happens, for example, if you hold an object that is not shared, and this object returns you a pointer to some data it holds. Or, if the object you hold has a valid shared_ptr for the raw pointer it is providing.
A shared_ptr: it the caller will access this shared resource and this resource can be released by the other holders.
A shared_ptr: it the caller might want to hold this shared_ptr or a weak_ptr version of it.

Developers need to know that:

As long as they hold a shared_ptr, the resource is assured to exist.
S/he shall not save the shared_ptr unless there is a deliberate intention of not allowing destruction.
To hold a pointer to a resource that might get destructed, the developer shall store a weak_ptr, not a raw pointer. Instead of listening for signals that indicate (in a non thread-safe way) that the pointer might be invalid, we use a weak_ptr and “lock()” it when we need to assure its validity. The same way the pointer might become invalid, the weak_ptr might become invalid and the result of the “lock()” might be checked.