accessing uninitialized memory is UB/boost.geometry currently requires well-defined default initialization

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

accessing uninitialized memory is UB/boost.geometry currently requires well-defined default initialization

Volker Schöch

Hi,

 

As of boost 1.59 (and presumably boost 1.60 is the same), boost.geometry assumes that a default-constructed point_type is well-defined. For instance, struct segment_intersection_points defined in boost\geometry\strategies\intersection_result.hpp may be copied while count < 2, i.e., when some points have not yet been assigned a well-defined value. As we all know, with uninitialized memory, any access including copy or move is undefined behavior.

 

At think-cell, we believe in tight life cycles and undefined behavior. For all purposes in our own code, default-constructed variables of point_type are uninitialized rather than default-initalized, meaning not only that any value stored in there must not be used, but also that the contents of those variables must not be copied or moved. This works very well for our purpose and helps to discover bugs that otherwise are still there but go unnoticed. In theory, this paradigm also facilitates additional compile-time optimization which may or may not be leveraged in current compilers. In our opinion, default-initialization to, e.g., zero, is quite arbitrary and meaningless. If you want to construct an object of point_type with some specific value, it is entirely adequate to require that this specific value be explicitly stated upon construction.

 

In this regard, we follow the gist of the C++ standard: “8.5.12: If no initializer is specified for an object, the object is default-initialized. When storage for an object with automatic or dynamic storage duration is obtained, the object has an indeterminate value, and if no initialization is performed for the object, that object retains an indeterminate value until that value is replaced (5.18). […] If an indeterminate value is produced by an evaluation, the behavior is undefined except in the following cases: [… all cases only apply to unsigned char …]”

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/n4582.pdf

 

That said, I have some questions to the boost.geometry team:

-          Can you tell if you actually use the values that happen to be found in a default-constructed object? Does your code rely on specific values as the result of default construction, e.g., zero?

-          Do you think you could avoid access to default-constructed objects, e.g., by using boost::optional, or variant, or dynamically sized containers, or some other obvious tools?

 

Currently, we use some work-around and provide a special point_type class for boost.geometry for the sole purpose of default-initialization to keep your code from entering UB. Perspectively, we’d love to remove that workaround and simply use our own point_class which after default-construction contains uninitialized memory.

 

It would be very helpful to know where you are heading: Do you think that our approach is a good idea, or do you see some benefits to having default-constructed objects initialized with some arbitrary but specific value? If so, would you amend your documentation to explicitly state that default-constructed objects must be well-defined, and which specific value (zero?) you expect?

 

Any feedback is appreciated!

   Volker

--
Volker Schöch | [hidden email]
Senior Software Engineer


We are looking for C++ Developers: http://www.think-cell.com/career

think-cell Software GmbH http://www.think-cell.com
Chausseestr. 8/E phone / fax +49 30 666473-10 / -19
10115 Berlin, Germany US phone / fax +1 800 891 8091 / +1 212 504 3039
Amtsgericht Berlin-Charlottenburg, HRB 85229 | European Union VAT Id DE813474306
Directors: Dr. Markus Hannebauer, Dr. Arno Schödl


_______________________________________________
Geometry mailing list
[hidden email]
http://lists.boost.org/mailman/listinfo.cgi/geometry
Reply | Threaded
Open this post in threaded view
|

Re: accessing uninitialized memory is UB/boost.geometry currently requires well-defined default initialization

Barend
Hi Volker,

Op 5-4-2016 om 11:52 schreef Volker Schöch:

Hi,

 

As of boost 1.59 (and presumably boost 1.60 is the same), boost.geometry assumes that a default-constructed point_type is well-defined. For instance, struct segment_intersection_points defined in boost\geometry\strategies\intersection_result.hpp may be copied while count < 2, i.e., when some points have not yet been assigned a well-defined value. As we all know, with uninitialized memory, any access including copy or move is undefined behavior.

 

At think-cell, we believe in tight life cycles and undefined behavior. For all purposes in our own code, default-constructed variables of point_type are uninitialized rather than default-initalized, meaning not only that any value stored in there must not be used, but also that the contents of those variables must not be copied or moved.


Aren't these different things?

To answer your main question (I think), the point type, supplied as a default point type by Boost.Geometry (but it is possible to define your own types), is not initialized, indeed on purpose, so I think we agree here.

However, it does not keep track that it is initialized or not, so it can be copied (uninitialized) without any problem, so a copy will contain the same garbage as the input does.

The library user is responsible to initialize the point type, exactly the same way as he is responsible to initialize any variable like int, double, etc.

I don't think behaviour is changed at 1.59

This is all about the point type supplied by the library (geometry::model::point), libraries are free to use their own point types and the might or might not be initialized.

This works very well for our purpose and helps to discover bugs that otherwise are still there but go unnoticed. In theory, this paradigm also facilitates additional compile-time optimization which may or may not be leveraged in current compilers. In our opinion, default-initialization to, e.g., zero, is quite arbitrary and meaningless. If you want to construct an object of point_type with some specific value, it is entirely adequate to require that this specific value be explicitly stated upon construction.

 

In this regard, we follow the gist of the C++ standard: “8.5.12: If no initializer is specified for an object, the object is default-initialized. When storage for an object with automatic or dynamic storage duration is obtained, the object has an indeterminate value, and if no initialization is performed for the object, that object retains an indeterminate value until that value is replaced (5.18). […] If an indeterminate value is produced by an evaluation, the behavior is undefined except in the following cases: [… all cases only apply to unsigned char …]”

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/n4582.pdf

 

That said, I have some questions to the boost.geometry team:

-          Can you tell if you actually use the values that happen to be found in a default-constructed object? Does your code rely on specific values as the result of default construction, e.g., zero?


No

-          Do you think you could avoid access to default-constructed objects, e.g., by using boost::optional, or variant, or dynamically sized containers, or some other obvious tools?


The library is not designed like this. It is designed to be concept-based, any point might be used (even a plain c-array, though that can not be used in polygons for example). Because of that, we cannot make any assumption about the initialization of the contents. It can be initialized, or not, and in the last case also the output will be garbage.

 

Currently, we use some work-around and provide a special point_type class for boost.geometry for the sole purpose of default-initialization to keep your code from entering UB. Perspectively, we’d love to remove that workaround and simply use our own point_class which after default-construction contains uninitialized memory.


So you can remove that workaround, however, it is (of course...) better to not supply uninitialized points to the library, as its behaviour is then undefined...

 

It would be very helpful to know where you are heading: Do you think that our approach is a good idea, or do you see some benefits to having default-constructed objects initialized with some arbitrary but specific value? If so, would you amend your documentation to explicitly state that default-constructed objects must be well-defined, and which specific value (zero?) you expect?


The library already states that points are not initialized in the doc:
http://www.boost.org/doc/libs/1_60_0/libs/geometry/doc/html/geometry/reference/models/model_point.html

We can add something that it is the users responsibility to supply an initialized point here:
http://www.boost.org/doc/libs/1_60_0/libs/geometry/doc/html/geometry/reference/concepts/concept_point.html

Regards, Barend


_______________________________________________
Geometry mailing list
[hidden email]
http://lists.boost.org/mailman/listinfo.cgi/geometry