Segmentation Fault when using Continuous Query Notification #1009

@wvanderdeijl

I am creating a CQN subscription for select * from demo with SUBSCR_QOS_ROWIDS as the only qos flag. This works fine unless many transactions commit in quick succession, for example a SQL script that runs 10 update statements on that table with a commit after each update. That sends 10 notifications to the node container, which frequently (but not always) crashes the node process with a 'Segmentation Fault' or 'Aborted' message on the console.

A reproducible test case with two docker containers in a docker-compose setup can be found at https://github.com/wvanderdeijl/oracle-leech/tree/segfault

The console output from the node container with the subscription:

node_1_5b71996e877c | >> callback of type 6
node_1_5b71996e877c | Aborted
node_1_5b71996e877c | npm ERR! code ELIFECYCLE
node_1_5b71996e877c | npm ERR! errno 134
node_1_5b71996e877c | npm ERR! [email protected] start: `ts-node ./index.ts`
node_1_5b71996e877c | npm ERR! Exit status 134
node_1_5b71996e877c | npm ERR! 
node_1_5b71996e877c | npm ERR! Failed at the [email protected] start script.
node_1_5b71996e877c | npm ERR! This is probably not a problem with npm. There is likely additional logging output above.
node_1_5b71996e877c | 
node_1_5b71996e877c | npm ERR! A complete log of this run can be found in:
node_1_5b71996e877c | npm ERR!     /root/.npm/_logs/2018-11-21T15_36_44_404Z-debug.log
oracle-leech_node_1_5b71996e877c exited with code 134

The first line (callback of type 6) is printed by my JavaScript callback when it is invoked with the first message. The second message never arrives because the node process has already crashed.

Usually the Node.js container simply crashes with the message 'Aborted', but sometimes it crashes with 'Segmentation Fault'. When that happens, the segfault-handler npm module we use gets a chance to write a crash.log with a stack trace. One example of such a log file:

PID 436 received SIGSEGV for address: 0x0
/usr/src/app/node_modules/segfault-handler/build/Release/segfault-handler.node(+0x2b90)[0x7f56a37dfb90]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xf890)[0x7f56c1554890]
/usr/src/app/node_modules/oracledb/build/Release/oracledb.node(_ZN15njsSubscription13CreateMessageEP16dpiSubscrMessage+0x49)[0x7f56ba5ab879]
/usr/src/app/node_modules/oracledb/build/Release/oracledb.node(_ZN15njsSubscription19ProcessNotificationEP10uv_async_s+0x65)[0x7f56ba5ac365]
node[0xa4732f]
node[0xa58018]
node(uv_run+0x14b)[0xa47c6b]
node(_ZN4node5StartEPN2v87IsolateEPNS_11IsolateDataERKSt6vectorISsSaISsEES9_+0x565)[0x8e5255]
node(_ZN4node5StartEiPPc+0x462)[0x8e34a2]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f56c11bbb45]
node[0x89dd45]

This stack makes me believe the error is in the CreateMessage method of njsSubscription, which was invoked by ProcessNotification. The issue only seems to reproduce when we get a number of messages in a very short timeframe. I am no C++ developer, but looking at Subscription.cpp it seems like the message lives as a property on the subscription itself (subscription->message). Could this be a concurrency issue, where some of the native code already starts processing the second message and puts it on the subscription while processing of the first message has not completed yet?
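To make the suspected failure mode concrete, here is a minimal TypeScript model of it. It is only a sketch of the hypothesis above, not a reproduction of the node-oracledb C++ code: the class names, the notify/process methods, and the Message shape are all hypothetical. It contrasts a single shared message slot with a queue when two notifications arrive before the first one has been processed.

```typescript
// Hypothetical model of the suspected race. None of these names come from
// node-oracledb; they only illustrate the hand-off pattern in question.
type Message = { id: number };

// Single-slot hand-off: the producer stores the latest message on the
// subscription, and the consumer reads whatever is there when it runs.
class SingleSlotSubscription {
  private message: Message | null = null;
  readonly delivered: (Message | null)[] = [];

  // Producer side: overwrites any message that has not been consumed yet.
  notify(msg: Message): void {
    this.message = msg;
  }

  // Consumer side: takes the current slot content, then clears the slot.
  process(): void {
    this.delivered.push(this.message);
    this.message = null;
  }
}

// Queued hand-off: every message is kept until the consumer drains it.
class QueuedSubscription {
  private readonly queue: Message[] = [];
  readonly delivered: Message[] = [];

  notify(msg: Message): void {
    this.queue.push(msg);
  }

  // Drains everything that has arrived so far; a wake-up with an empty
  // queue is harmless.
  process(): void {
    let msg: Message | undefined;
    while ((msg = this.queue.shift()) !== undefined) {
      this.delivered.push(msg);
    }
  }
}
```

If notify is called twice before process runs twice, the single-slot version delivers message 2 followed by null (message 1 is lost, and the second wake-up finds nothing), while the queued version delivers messages 1 and 2 in order. A consumer that dereferences an empty slot in native code would be consistent with a SIGSEGV at address 0x0, but whether that is what happens in Subscription.cpp is an assumption that the maintainers would need to confirm.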

Answer the following questions:

  • Running Node version 10.13.0 on x64 Linux (the official node:10 docker container)
  • Using node-oracledb 3.0.1
  • Using instantclient 18.3.0.0.0 on linux64 (docker container)
  • Database Enterprise Edition v11.2.0.1

Command to reproduce once the docker-compose setup (with an Oracle EE container and a node container) is up:

docker-compose exec node sh -c "/opt/oracle/instantclient_18_3/sqlplus system/oracle@oracle/EE.oracle.docker @update.sql"

This SQL script is just 10 update statements on the table with a commit between each statement: https://github.com/wvanderdeijl/oracle-leech/blob/segfault/update.sql

The Node.js TypeScript code for the failing listener can be found at https://github.com/wvanderdeijl/oracle-leech/blob/segfault/index.ts
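For readers who do not want to open the repository, a listener of this kind could look roughly like the sketch below. This is not the code from index.ts. The interfaces are structural stand-ins modeled on node-oracledb's connection.subscribe(name, options) call and its notification message, and the subscription name, the subscribeToDemo helper, and the onChange handler are hypothetical. The qos value is passed in by the caller, who would supply oracledb.SUBSCR_QOS_ROWIDS.

```typescript
// Structural stand-ins modeled on node-oracledb's CQN API. They are
// assumptions for illustration, not types imported from the driver.
interface CqnRow { rowid: string }
interface CqnTable { name: string; rows?: CqnRow[] }
interface CqnMessage { type: number; tables?: CqnTable[] }
interface SubscribeOptions {
  sql: string;
  qos: number;
  callback: (message: CqnMessage) => void;
}
interface ConnectionLike {
  subscribe(name: string, options: SubscribeOptions): Promise<void>;
}

// Collect every rowid carried by one notification message.
function rowidsOf(message: CqnMessage): string[] {
  const rowids: string[] = [];
  for (const table of message.tables || []) {
    for (const row of table.rows || []) {
      rowids.push(row.rowid);
    }
  }
  return rowids;
}

// Register a subscription on the demo table and forward each notification
// (its type and the affected rowids) to the caller's handler.
function subscribeToDemo(
  conn: ConnectionLike,
  qosRowids: number, // the caller passes oracledb.SUBSCR_QOS_ROWIDS here
  onChange: (type: number, rowids: string[]) => void
): Promise<void> {
  return conn.subscribe("demo-subscription", { // hypothetical name
    sql: "select * from demo",
    qos: qosRowids,
    callback: (message) => onChange(message.type, rowidsOf(message)),
  });
}
```

With a real connection obtained from oracledb.getConnection, the handler would be expected to fire once per committed transaction that touches the table, which is where the 10 notifications in the reproduction come from.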
