Reliable Delivery #167

objectundefined · 2014-04-18T20:02:09Z

this commit encapsulates a few changes:

First, when a message is sent against a socket and produces an error, we know implicitly that all messages sent before it on the same socket were submitted successfully. In this case, we emit an "acknowledged" event for those messages.

Second, we provide an optional "ackTimeout" option to the constructor. If this is provided, after the timeout period (in ms), we submit a known bad token if there are any buffered messages. If this token produces an error before the previously queued messages, those messages will fire "acknowleged". This provides the consumer the ability to force acknowledgement after a specific amount of time. This timeout only fires if no other messages have already failed and interrupted the socket.

… without errors

argon · 2014-04-18T20:20:32Z

I think you're missing some commits here. Both sock and self are undefined. I have come across this idea of sending "sentinel" messages before and I'm not too keen on it. I'm coming round to the idea and if we can fettle this design a bit further I will include it.

I think acknowledged notifications should only be sent when the user has specified an ackTimeout value. This way it is "guaranteed" that an acknowledged event will be fired regardless of scenario. Without the timeout set an acknowledged event will sometimes be sent, if an invalid notification is sent.

Secondly, I am not necessarily keen on the idea of using an arbitrary timeout, as notifications which have already been purged from the cache will never receive an event. Instead I'm inclined to suggest that if this method of "reliable" delivery is enabled then when the sockets' cache fills up it will then send the sentinel. This can then be tuned by adjusting the cache size where necessary and will provide a greater degree of determinism to what is a pretty horrible situation (thanks Apple!).

However, I am concerned about the performance implications of this - by forcing the socket to be torn down periodically it will greatly limit the throughput. I tend to see ~800ms startup time for the socket in which time several thousand notifications could have been sent. In cases where millions of notifications are being sent (as with several existing products using this library) that will introduce a significant time penalty. This is of course a trade off which would need to be made if reliable delivery is required, but this will naturally need associated documentation to explain.

objectundefined · 2014-04-18T20:25:22Z

Ah, you’re right when I merged your latest changes I missed those references. If anything, this is an attempt to start the conversation about mediating some sort of “reliability guarantee.” Now that it’s on your mind, I hope there’s a good resolution on the horizon.
--
Gabriel Lipson

argon · 2014-04-18T20:29:30Z

Reliability guarantees are something I have been thinking a lot about recently with the new ECONNRESET problems people are experiencing, this would be another potential solution to mitigate them. Writing that previous message has allowed me to crystallise my ideas on this topic and I will definitely pursue it further. I'll keep this issue up to date as I develop it. I am currently ramping up to a release with the current set of changes and will look at this to include in the next version.

objectundefined · 2014-04-18T20:48:50Z

I’m going to remove this PR to reset to the original commit against an old version of APN that seems to be working more deterministically. thanks for the discussion, and I look forward to you addressing the problem.
--
Gabriel Lipson

argon · 2014-04-18T20:50:57Z

Which version are you reverting to?

objectundefined · 2014-04-18T20:56:03Z

still 1.4.4, but what’s in #master doesn’t seem to be sending notifications in my testing.
--
Gabriel Lipson

On April 18, 2014 at 1:20:38 PM, Andrew Naylor ([email protected]) wrote:

I think you're missing some commits here. Both sock and self are undefined. I have come across this idea of sending "sentinel" messages before and I'm not too keen on it. I'm coming round to the idea and if we can fettle this design a bit further I will include it.

I think acknowledged notifications should only be sent when the user has specified an ackTimeout value. This way it is "guaranteed" that an acknowledged event will be fired regardless of scenario. Without the timeout set an acknowledged event will sometimes be sent, if an invalid notification is sent.

Secondly, I am not necessarily keen on the idea of using an arbitrary timeout, as notifications which have already been purged from the cache will never receive an event. Instead I'm inclined to suggest that if this method of "reliable" delivery is enabled then when the sockets' cache fills up it will then send the sentinel. This can then be tuned by adjusting the cache size where necessary and will provide a greater degree of determinism to what is a pretty horrible situation (thanks Apple!).

However, I am concerned about the performance implications of this - by forcing the socket to be torn down periodically it will greatly limit the throughput. I tend to see ~800ms startup time for the socket in which time several thousand notifications could have been sent. In cases where millions of notifications are being sent (as with several existing products using this library) that will introduce a significant time penalty. This is of course a trade off which would need to be made if reliable delivery is required, but this will naturally need associated documentation to explain.

—
Reply to this email directly or view it on GitHub.

argon · 2014-04-18T20:59:57Z

Be aware of the new production parameter which might be the problem - I have swapped the default behaviour for which environment sandbox/production it connects to.

#master will be a 1.5.0 release as it breaks backwards compatibility, but I think it is worth doing to simplify the API and provide automatic switching using NODE_ENV if the developer desires.

On 18 Apr 2014, at 21:56, Gabriel Lipson [email protected] wrote:

still 1.4.4, but what’s in #master doesn’t seem to be sending notifications in my testing.

Gabriel Lipson

On April 18, 2014 at 1:20:38 PM, Andrew Naylor ([email protected]) wrote:

I think you're missing some commits here. Both sock and self are undefined. I have come across this idea of sending "sentinel" messages before and I'm not too keen on it. I'm coming round to the idea and if we can fettle this design a bit further I will include it.

I think acknowledged notifications should only be sent when the user has specified an ackTimeout value. This way it is "guaranteed" that an acknowledged event will be fired regardless of scenario. Without the timeout set an acknowledged event will sometimes be sent, if an invalid notification is sent.

Secondly, I am not necessarily keen on the idea of using an arbitrary timeout, as notifications which have already been purged from the cache will never receive an event. Instead I'm inclined to suggest that if this method of "reliable" delivery is enabled then when the sockets' cache fills up it will then send the sentinel. This can then be tuned by adjusting the cache size where necessary and will provide a greater degree of determinism to what is a pretty horrible situation (thanks Apple!).

However, I am concerned about the performance implications of this - by forcing the socket to be torn down periodically it will greatly limit the throughput. I tend to see ~800ms startup time for the socket in which time several thousand notifications could have been sent. In cases where millions of notifications are being sent (as with several existing products using this library) that will introduce a significant time penalty. This is of course a trade off which would need to be made if reliable delivery is required, but this will naturally need associated documentation to explain.

—
Reply to this email directly or view it on GitHub.
—
Reply to this email directly or view it on GitHub.

-A

objectundefined added 2 commits April 18, 2014 12:32

emit acknowledged event in the case that we know the message was sent…

679e461

… without errors

adding ackTimeout behavior to new conn instantiator

0dd062f

sock and self refs

95179ce

argon changed the title ~~emit "acknowledged" evt when we know that a message was sent without errors~~ Reliable Delivery Apr 18, 2014

objectundefined closed this Apr 18, 2014

argon mentioned this pull request Apr 18, 2014

Reliable Delivery #168

Closed

argon mentioned this pull request May 29, 2014

Notifications being lost/dropped when sent in high volume. #154

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Reliable Delivery #167

Reliable Delivery #167

Uh oh!

objectundefined commented Apr 18, 2014

Uh oh!

argon commented Apr 18, 2014

Uh oh!

objectundefined commented Apr 18, 2014

Uh oh!

argon commented Apr 18, 2014

Uh oh!

objectundefined commented Apr 18, 2014

Uh oh!

argon commented Apr 18, 2014

Uh oh!

objectundefined commented Apr 18, 2014

Uh oh!

argon commented Apr 18, 2014

still 1.4.4, but what’s in #master doesn’t seem to be sending notifications in my testing.

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Reliable Delivery #167

Reliable Delivery #167

Uh oh!

Conversation

objectundefined commented Apr 18, 2014

Uh oh!

argon commented Apr 18, 2014

Uh oh!

objectundefined commented Apr 18, 2014

Uh oh!

argon commented Apr 18, 2014

Uh oh!

objectundefined commented Apr 18, 2014

Uh oh!

argon commented Apr 18, 2014

Uh oh!

objectundefined commented Apr 18, 2014

Uh oh!

argon commented Apr 18, 2014

still 1.4.4, but what’s in #master doesn’t seem to be sending notifications in my testing.

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants