(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
* frame #0: 0x00000001014589ec libR.dylib`Rf_error(format="bad value") at errors.c:961:26 [opt]
frame #1: 0x00000001014bded4 libR.dylib`SETCAR(x=0x000000013000a0e0, y=0x00000001309d81c8) at memory.c:4285:2 [opt]
frame #2: 0x0000000123188814 arrow.so`std::__1::__shared_ptr_emplace<cpp11::environment, std::__1::allocator<cpp11::environment> >::__on_zero_shared() [inlined] cpp11::$_0::release(this=<unavailable>, cell=<unavailable>) at protect.hpp:388:5 [opt]
frame #3: 0x00000001231887d0 arrow.so`std::__1::__shared_ptr_emplace<cpp11::environment, std::__1::allocator<cpp11::environment> >::__on_zero_shared() [inlined] cpp11::sexp::~sexp(this=<unavailable>) at sexp.hpp:58:23 [opt]
frame #4: 0x00000001231887cc arrow.so`std::__1::__shared_ptr_emplace<cpp11::environment, std::__1::allocator<cpp11::environment> >::__on_zero_shared() [inlined] cpp11::sexp::~sexp(this=<unavailable>) at sexp.hpp:58:11 [opt]
frame #5: 0x00000001231887cc arrow.so`std::__1::__shared_ptr_emplace<cpp11::environment, std::__1::allocator<cpp11::environment> >::__on_zero_shared() [inlined] cpp11::environment::~environment(this=<unavailable>) at environment.hpp:21:7 [opt]
frame #6: 0x00000001231887cc arrow.so`std::__1::__shared_ptr_emplace<cpp11::environment, std::__1::allocator<cpp11::environment> >::__on_zero_shared() [inlined] cpp11::environment::~environment(this=<unavailable>) at environment.hpp:21:7 [opt]
frame #7: 0x00000001231887cc arrow.so`std::__1::__shared_ptr_emplace<cpp11::environment, std::__1::allocator<cpp11::environment> >::__on_zero_shared(this=<unavailable>) at memory:3318:23 [opt]
frame #8: 0x0000000123183c2c arrow.so`RExtensionType::~RExtensionType() [inlined] std::__1::__shared_count::__release_shared(this=0x00006000001faac0) at memory:3169:9 [opt]
frame #9: 0x0000000123183c1c arrow.so`RExtensionType::~RExtensionType() [inlined] std::__1::__shared_weak_count::__release_shared(this=0x00006000001faac0) at memory:3211:27 [opt]
frame #10: 0x0000000123183c1c arrow.so`RExtensionType::~RExtensionType() [inlined] std::__1::shared_ptr<cpp11::environment>::~shared_ptr(this=0x000060000388c830) at memory:3884:19 [opt]
frame #11: 0x0000000123183c1c arrow.so`RExtensionType::~RExtensionType() [inlined] std::__1::shared_ptr<cpp11::environment>::~shared_ptr(this=0x000060000388c830) at memory:3882:1 [opt]
frame #12: 0x0000000123183c1c arrow.so`RExtensionType::~RExtensionType(this=0x000060000388c790) at extension.h:33:7 [opt]
frame #13: 0x0000000123182e8c arrow.so`RExtensionType::~RExtensionType() [inlined] RExtensionType::~RExtensionType(this=<unavailable>) at extension.h:33:7 [opt]
frame #14: 0x0000000123182e88 arrow.so`RExtensionType::~RExtensionType(this=<unavailable>) at extension.h:33:7 [opt]
frame #15: 0x0000000123224ea8 arrow.so`std::__1::shared_ptr<parquet::internal::RecordReader>::~shared_ptr() + 56
frame #16: 0x0000000123507510 arrow.so`void std::__1::allocator_traits<std::__1::allocator<std::__1::__hash_node<std::__1::__hash_value_type<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::shared_ptr<arrow::ExtensionType> >, void*> > >::destroy<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const, std::__1::shared_ptr<arrow::ExtensionType> > >(std::__1::allocator<std::__1::__hash_node<std::__1::__hash_value_type<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::shared_ptr<arrow::ExtensionType> >, void*> >&, std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const, std::__1::shared_ptr<arrow::ExtensionType> >*) + 24
frame #17: 0x00000001235074dc arrow.so`std::__1::__hash_table<std::__1::__hash_value_type<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::shared_ptr<arrow::ExtensionType> >, std::__1::__unordered_map_hasher<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::__hash_value_type<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::shared_ptr<arrow::ExtensionType> >, std::__1::hash<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, true>, std::__1::__unordered_map_equal<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::__hash_value_type<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::shared_ptr<arrow::ExtensionType> >, std::__1::equal_to<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, true>, std::__1::allocator<std::__1::__hash_value_type<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::shared_ptr<arrow::ExtensionType> > > >::__deallocate_node(std::__1::__hash_node_base<std::__1::__hash_node<std::__1::__hash_value_type<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::shared_ptr<arrow::ExtensionType> >, void*>*>*) + 32
frame #18: 0x000000012350749c arrow.so`std::__1::__hash_table<std::__1::__hash_value_type<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::shared_ptr<arrow::ExtensionType> >, std::__1::__unordered_map_hasher<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::__hash_value_type<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::shared_ptr<arrow::ExtensionType> >, std::__1::hash<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, true>, std::__1::__unordered_map_equal<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::__hash_value_type<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::shared_ptr<arrow::ExtensionType> >, std::__1::equal_to<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, true>, std::__1::allocator<std::__1::__hash_value_type<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::shared_ptr<arrow::ExtensionType> > > >::~__hash_table() + 24
frame #19: 0x00000001235071d4 arrow.so`arrow::ExtensionTypeRegistryImpl::~ExtensionTypeRegistryImpl() + 44
frame #20: 0x0000000123224ea8 arrow.so`std::__1::shared_ptr<parquet::internal::RecordReader>::~shared_ptr() + 56
frame #21: 0x00000001a60fdec4 libsystem_c.dylib`__cxa_finalize_ranges + 476
frame #22: 0x00000001a60fdc4c libsystem_c.dylib`exit + 44
frame #23: 0x00000001015980b8 libR.dylib`Rstd_CleanUp(saveact=SA_NOSAVE, status=0, runLast=<unavailable>) at sys-std.c:1246:5 [opt]
frame #24: 0x000000010159ae60 libR.dylib`R_CleanUp(saveact=<unavailable>, status=0, runLast=<unavailable>) at system.c:87:5 [opt]
frame #25: 0x00000001014b539c libR.dylib`do_quit(call=<unavailable>, op=<unavailable>, args=0x0000000112b84bd8, rho=<unavailable>) at main.c:1487:5 [opt]
frame #26: 0x0000000101466a24 libR.dylib`bcEval(body=0x0000000112b84f58, rho=<unavailable>, useCache=<unavailable>) at eval.c:7446:14 [opt]
frame #27: 0x000000010145f048 libR.dylib`Rf_eval(e=0x0000000112b84f58, rho=0x0000000112b84cb8) at eval.c:1013:8 [opt]
frame #28: 0x000000010147bccc libR.dylib`R_execClosure(call=0x0000000112b81400, newrho=0x0000000112b84cb8, sysparent=<unavailable>, rho=<unavailable>, arglist=<unavailable>, op=0x0000000112b850a8) at eval.c:0 [opt]
frame #29: 0x000000010147a54c libR.dylib`Rf_applyClosure(call=0x0000000112b81400, op=0x0000000112b850a8, arglist=0x000000013000a0e0, rho=0x0000000130042f88, suppliedvars=<unavailable>) at eval.c:2113:16 [opt]
frame #30: 0x000000010145f31c libR.dylib`Rf_eval(e=0x0000000112b81400, rho=0x0000000130042f88) at eval.c:1140:12 [opt]
frame #31: 0x00000001014b33b4 libR.dylib`Rf_ReplIteration(rho=0x0000000130042f88, savestack=<unavailable>, browselevel=<unavailable>, state=0x000000016ee85b10) at main.c:262:2 [opt]
frame #32: 0x00000001014b4928 libR.dylib`R_ReplConsole(rho=0x0000000130042f88, savestack=0, browselevel=0) at main.c:314:11 [opt]
frame #33: 0x00000001014b4864 libR.dylib`run_Rmainloop at main.c:1200:5 [opt]
frame #34: 0x00000001014b49d0 libR.dylib`Rf_mainloop at main.c:1207:5 [opt]
frame #35: 0x0000000100f7bea0 R`main + 32
frame #36: 0x00000001a5ee7f28 dyld`start + 2236
The PR included an attempt to be backwards compatible with old versions of cpp11, but it seems like it wasn't quite right (and is quite hard to get right, as we see here).
We have had many issues with "shared cpp11 state" over the years, and the fact that a package built with, say, cpp11 0.4.2, has to share state with a package built with cpp11 0.4.4 makes it very hard to introduce changes to the shared state objects in a backwards compatible way.
Lionel, Kevin, and I have come up with a proposal to do the same thing here. Rather than having 1 "global" preserve list that is shared across multiple package and across compilation units within the same package, we propose that we go back to the original implementation of the preserve list (before it was changed here #92), which creates one preserve list per compilation unit. The crux of the implementation would look like
static SEXP get_preserve_list() {
static SEXP out = R_NilValue;
if (out == R_NilValue) {
out = new_preserve_list();
R_PreserveObject(out);
}
return out;
}
The following will error if arrow is compiled with cpp11 0.4.4
It should error on restart with:
This error is related to
SETCAR()calls, which we confirmed with an lldb backtrace:That error is due to the changes in this PR #285, which slightly tweaked the structure of the preserve list.
duckdb vendors cpp11 0.4.2, which uses the "old" structure of the preserve list. It gets loaded first (and presumably something internally calls cpp11) so the preserve list is initialized to the "old" structure. Then we load arrow and that internally uses cpp11 0.4.4 which included that PR. Since the structures differ, the insertion/release calls seem to not be aligned quite right, so as R exits and the destructors for the arrow objects are all run, we end up with an error in
SETCAR()which seems to come from therelease()call in~sexp().The PR included an attempt to be backwards compatible with old versions of cpp11, but it seems like it wasn't quite right (and is quite hard to get right, as we see here).
We have had many issues with "shared cpp11 state" over the years, and the fact that a package built with, say, cpp11 0.4.2, has to share state with a package built with cpp11 0.4.4 makes it very hard to introduce changes to the shared state objects in a backwards compatible way.
In #327 we faced a related issue, and our solution was actually to remove that shared state altogether.
Lionel, Kevin, and I have come up with a proposal to do the same thing here. Rather than having 1 "global" preserve list that is shared across multiple package and across compilation units within the same package, we propose that we go back to the original implementation of the preserve list (before it was changed here #92), which creates one preserve list per compilation unit. The crux of the implementation would look like
We think this should get rid of the shared state problem altogether:
staticvariable means that the pointer to it will persist across accesses, andR_PreserveObject()handles the protection for us. This means we no longer have to deal with storing external pointers as global options (which is the only thing we could use due to needing to avoid the option being copyable and needing to avoid a stack overflow when serializing the protection list). It also means a user can't bork it accidentally.sizeof(empty preserve list) * number of compilation units, which seems reasonable.Two very minor downsides:
release_all()is a helper that was intended for R < 3.5.0 that would release all cpp11 objects from the preserve list. This would no longer work as expected since it would only release objects from the list managed by that compilation unit. But we think that is okay because we don't think this function is particularly safe anyways - you could release an object that you don't own if the preserve list is a global one! A github search makes us think no one is using it, and the tidyverse / r-lib orgs have moved on to require R >= 3.6.0 at this point, so I think we can setR (>= 3.5.0)for cpp11 and remove this function safely. It seems likerelease_all()was one of the motivating reasons for a global preserve list according to Preserved list refactor #92, so if we remove that then that is even more reason to consider alternative approaches.load_all(), then that will reload the C++ internals resulting in new preserve lists being created. We think this is okay, as the old preserve lists will have been protected withR_PreserveObject()so the R vroom objects should continue to work, but their memory will not be released until R restarts. This seems okay since it would be a very rare developer only issue.Some other related issues: