Skip to content

Library grows connections out of control. #1591

@ncollins-vs

Description

@ncollins-vs
  1. What versions are you using?

NodeJS 18 on Debian 12 x86 (node:lts docker image)

node-oracledb v6.0.3

Oracle Database 19c in Amazon RDS

  1. Is it an error or a hang or a crash?

Neither; connection count went way too high causing other processes that connect to the db to crash

  1. What error(s) or behavior you are seeing?

I have a server with an Oracle connection and two instances are running connected to the same database. Yesterday, we had a brief spike in load, and the Oracle library opened up lots of connections to the database. On one instance the connection count went up to 2570, and on the other instance it went up to 685. Both instances kept functioning normally, but any other new processes that tried to connect to the database died because the database killed new connections.

The server uses a connection pool with the following parameters:

await oracledb.createPool({
    connectString: `...`,
    user: "...",
    password: "...",
    poolMin: 100,
    poolMax: 100,
    poolIncrement: 100,
    poolTimeout: 120,
    queueTimeout: 10000,
    enableStatistics: true,
    poolAlias: "10"
  });

Each request to the server calls oracledb.getConnection("10"); to get a connection and closes it (in a finally block so guaranteed to happen) before returning the response.

We also log pool.getStatistics() every minute, and we got the following logs from that:

{"gatheredDate": 1689867256441, "upTime": 157508834, "upTimeSinceReset": 157501416, "connectionRequests": 147331, "requestsEnqueued": 0, "requestsDequeued": 0, "failedRequests": 0, "rejectedRequests": 0, "requestTimeouts": 0, "maximumQueueLength": 0, "currentQueueLength": 0, "timeInQueue": 0, "minimumTimeInQueue": 0, "maximumTimeInQueue": 0, "averageTimeInQueue": 0, "connectionsInUse": 52, "connectionsOpen": 68, "connectString": "...", "edition": "", "events": false, "externalAuth": false, "homogeneous": true, "poolAlias": "10", "poolIncrement": 100, "poolMax": 100, "poolMin": 100, "poolPingInterval": 60, "poolTimeout": 120, "queueMax": 500, "queueTimeout": 10000, "stmtCacheSize": 50, "user": "...", "threadPoolSize": "104", "thin": true}
{"gatheredDate": 1689867316441, "upTime": 157568834, "upTimeSinceReset": 157561416, "connectionRequests": 148070, "requestsEnqueued": 84, "requestsDequeued": 84, "failedRequests": 0, "rejectedRequests": 0, "requestTimeouts": 0, "maximumQueueLength": 42, "currentQueueLength": 0, "timeInQueue": 30284, "minimumTimeInQueue": 19, "maximumTimeInQueue": 1156, "averageTimeInQueue": 361, "connectionsInUse": 0, "connectionsOpen": 362, "connectString": "...", "edition": "", "events": false, "externalAuth": false, "homogeneous": true, "poolAlias": "10", "poolIncrement": 100, "poolMax": 100, "poolMin": 100, "poolPingInterval": 60, "poolTimeout": 120, "queueMax": 500, "queueTimeout": 10000, "stmtCacheSize": 50, "user": "...", "threadPoolSize": "104", "thin": true}
{"gatheredDate": 1689867376442, "upTime": 157628835, "upTimeSinceReset": 157621417, "connectionRequests": 148116, "requestsEnqueued": 84, "requestsDequeued": 84, "failedRequests": 26, "rejectedRequests": 0, "requestTimeouts": 0, "maximumQueueLength": 42, "currentQueueLength": 0, "timeInQueue": 30284, "minimumTimeInQueue": 19, "maximumTimeInQueue": 1156, "averageTimeInQueue": 361, "connectionsInUse": 0, "connectionsOpen": 685, "connectString": "...", "edition": "", "events": false, "externalAuth": false, "homogeneous": true, "poolAlias": "10", "poolIncrement": 100, "poolMax": 100, "poolMin": 100, "poolPingInterval": 60, "poolTimeout": 120, "queueMax": 500, "queueTimeout": 10000, "stmtCacheSize": 50, "user": "...", "threadPoolSize": "104", "thin": true}

Note that connectionsOpen grows significantly larger than poolMax. The other instance had similar logs, growing connectionsOpen to 2570.

I've dug through all my code and I can't find anything that seems like it could possibly be wrong. I've also dug through the oracle library code and I can't see any possible way this could happen. connectionsOpen is determined by the size of _freeConnectionsLIst, which only grows in _growPool, which is only called if there are more connections slots open less than poolMax.

I'm thoroughly confused about how this could happen, but it definitely seems like it must be a library bug if connectionsOpen grows larger than poolMax. The only other unusual thing about the logs in this situation is that this is the first time since upgrading to node-oracledb v6 that we've had connections queued (requestsEnqueued > 0). I don't think that's a bug on our part, but it might point toward the answer.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions