The seed nodes in the Polykey network provide the first point of entry for bootstrapping new keynodes into the wider Polykey network.
We've now got support for deterministic root keypair generation according to a 24-word recovery key mnemonic, as per #202. We will be using this deterministic generation to generate our own set of seed nodes. This generation of seed nodes will be done in #285.
We'll require two sets of keynodes: one for testnet.polykey.io and another for mainnet.polykey.io. At least for the time being, we'll just be considering a single, shared seed node for both testnet and mainnet. An additional environment variable (PK_ENV) or config flag will be required to switch between which nodes to use (i.e. either mainnet in production, or testnet for debugging/testing).
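As a rough illustration of the PK_ENV switch described above, the selection could look something like the following. This is only a sketch: the names `Network`, `SeedNodes`, `seedNodesByNetwork`, `selectNetwork` and `getSeedNodes` are hypothetical, the node IDs are placeholders, and defaulting to mainnet when PK_ENV is unset is an assumption rather than a decision made in this issue.

```typescript
// Hypothetical names throughout; none of these are taken from the Polykey source.
type Network = 'mainnet' | 'testnet';

type SeedNodes = Record<string, { host: string; port: number }>;

// Placeholder node IDs. The real encoded IDs would come from the
// deterministic key generation described above.
const seedNodesByNetwork: Record<Network, SeedNodes> = {
  mainnet: {
    'ENCODED-NODEID-MAINNET': { host: 'mainnet.polykey.io', port: 1314 },
  },
  testnet: {
    'ENCODED-NODEID-TESTNET': { host: 'testnet.polykey.io', port: 1314 },
  },
};

// Assumption: an unset PK_ENV falls back to mainnet; unknown values are rejected.
function selectNetwork(pkEnv: string | undefined): Network {
  if (pkEnv === undefined || pkEnv === '') return 'mainnet';
  if (pkEnv === 'mainnet' || pkEnv === 'testnet') return pkEnv;
  throw new Error(`Unknown PK_ENV value: ${pkEnv}`);
}

// The environment is passed in explicitly so the function is easy to test.
function getSeedNodes(env: Record<string, string | undefined>): SeedNodes {
  return seedNodesByNetwork[selectNetwork(env.PK_ENV)];
}
```

In the agent this would presumably be called once with `process.env`, and a config flag could feed `selectNetwork` in the same way.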
Specifying the list of seed nodes
The list of seed nodes is required to be a static list of mappings from NodeId to NodeAddress (in this case, containing a hostname and port). The port for all of these seed nodes should be pre-defined (not defaulted) to be 1314.
Recall that the canonical version of a node ID is a 32-byte structure. Discussions of this can be found in #261. Therefore, this 32-byte structure would ideally be used for this static storage. Alternatively, we need to use some other string-encoded version of the node ID.
This list will be able to be supplied from 3 sources:
Environment variable: PK_SEED_NODES
can use a semi-colon separated structure: ENCODED-NODEID@testnet.polykey.io:1314;...
parse with new URL(): e.g. new URL('pk://dsfoisudf@testnet.polykey.io:80'), new URL('pk://sdof8er@[2001:0db8:85a3:0000:0000:8a2e:0370:7334]:80'). Note the need for the prepended pk:// when using new URL
seed nodes specified in this environment variable would be expected to be used for both mainnet and testnet
Configuration file:
dynamically read from file by the Polykey agent at the path specified by nodePath
similar to the password file
structure of this? Most likely sufficient
src/config.ts:
statically define the seed nodes in the source code. e.g.:
switch selection based on config flag/environment variable
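The PK_SEED_NODES format above could be parsed roughly as follows. This is a minimal sketch using `new URL()` with the prepended `pk://`, as suggested; `SeedNode` and `parseSeedNodes` are hypothetical names, and the node ID is left in its encoded string form.

```typescript
// Hypothetical names for illustration only.
interface SeedNode {
  nodeId: string; // still in its encoded string form
  host: string; // hostname or IP address
  port: number;
}

function parseSeedNodes(value: string): SeedNode[] {
  const seedNodes: SeedNode[] = [];
  for (const entry of value.split(';')) {
    const trimmed = entry.trim();
    // Tolerate a trailing semicolon or empty segments
    if (trimmed === '') continue;
    // new URL() needs a scheme, hence the prepended pk://
    const url = new URL(`pk://${trimmed}`);
    if (url.username === '') {
      throw new Error(`Missing node ID in "${trimmed}"`);
    }
    // The port is mandatory; there is no defaulting
    if (url.port === '') {
      throw new Error(`Missing port in "${trimmed}"`);
    }
    seedNodes.push({
      nodeId: url.username,
      // IPv6 hosts come back wrapped in square brackets, so strip them
      host: url.hostname.replace(/^\[|\]$/g, ''),
      port: Number(url.port),
    });
  }
  return seedNodes;
}
```

For example, `parseSeedNodes('ENCODED-NODEID@testnet.polykey.io:1314')` would yield a single entry with host `testnet.polykey.io` and port `1314`, while an entry without a port is rejected.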
DNS resolution
There's a couple of aspects to supporting DNS resolution.
Performing the actual resolution from hostname to IP address
if necessary, may need to use the DNS module to resolve to the appropriate IP - the call should be wrapped with promisify (if already at an IP, no need to call this)
all of this resolution should be done at the NodeGraph level (before creating a NodeConnection)
resolution should occur dynamically - we should never store a resolved IP address over a hostname. The underlying IP address can be dynamic. Therefore, always dynamically resolve from hostname to IP when creating a NodeConnection. This should also occur for any reconnections on dropped connections (see Propagate networking connection error handling into NodeConnection error handling #224 for this)
Supporting the same IP address mapping to different node IDs.
if a hostname resolves to multiple IP addresses, then we choose 1 randomly. Use the all parameter in the dns.lookup options. For this, no need to do crypto random. Just use Math.random and select an index (https://stackoverflow.com/a/38448710/582917)
as a consequence of this, this means that a hostname could resolve to a different node ID
verification of the keynode should then succeed if the node matches any of the potential node IDs of the seed nodes
this may require some customisation of the current verification process
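The resolution steps above might be sketched like this. It uses Node's own `util.promisify` on `dns.lookup` rather than a project-specific wrapper, which is an assumption, and `pickRandom` and `resolveHost` are hypothetical names. The result is deliberately not cached, so every new connection resolves again.

```typescript
import * as dns from 'dns';
import * as net from 'net';
import { promisify } from 'util';

// dns.lookup wrapped as a promise
const lookup = promisify(dns.lookup);

// Pick one element uniformly. Math.random is enough here, no crypto needed.
// The random source is injectable purely to make the function testable.
function pickRandom<T>(
  items: readonly T[],
  random: () => number = Math.random,
): T {
  if (items.length === 0) throw new Error('Nothing to pick from');
  return items[Math.floor(random() * items.length)];
}

// Resolve a hostname to one IP address at connection time.
async function resolveHost(host: string): Promise<string> {
  // Already an IP address (v4 or v6): no need to call the DNS module
  if (net.isIP(host) !== 0) return host;
  // `all: true` returns every resolved address instead of just the first
  const addresses = await lookup(host, { all: true });
  return pickRandom(addresses).address;
}
```

A caller would run `await resolveHost(address.host)` immediately before creating each NodeConnection, including on reconnection.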
There was also discussion of needing to resolve #224 in this issue too, such that we can implement some kind of reconnection logic when a previously initialised connection is dropped. This will need to be further discussed, and if required, may take another day or so.
I wonder if we actually care about reconnecting at all. Would it not be sufficient to simply propagate the error up, terminate the connection, and only re-establish connection if another agent to agent call is made? This would fit the current structure of NodeConnection (where we have a getConnectionToNode function that either uses an existing connection, or creates a new one).
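To make the "no explicit reconnection" option concrete, here is a small sketch of what that could look like. `Connection`, `ConnectionMap` and `handleConnectionError` are hypothetical stand-ins, not the existing NodeConnection or NodeManager API; only the `getConnectionToNode` name comes from the description above.

```typescript
// Hypothetical stand-in for a NodeConnection
interface Connection {
  destroy(): void;
}

class ConnectionMap<C extends Connection> {
  protected connections: Map<string, Promise<C>> = new Map();

  // `connect` is whatever actually establishes the connection
  constructor(protected connect: (nodeId: string) => Promise<C>) {}

  // Uses an existing connection, or creates a new one
  getConnectionToNode(nodeId: string): Promise<C> {
    const existing = this.connections.get(nodeId);
    if (existing !== undefined) return existing;
    const created = this.connect(nodeId);
    this.connections.set(nodeId, created);
    // A failed attempt should not stay cached
    created.catch(() => {
      if (this.connections.get(nodeId) === created) {
        this.connections.delete(nodeId);
      }
    });
    return created;
  }

  // Called when an error is propagated up from the networking layer:
  // terminate and forget the connection, with no eager reconnection.
  handleConnectionError(nodeId: string): void {
    const connection = this.connections.get(nodeId);
    this.connections.delete(nodeId);
    connection?.then((c) => c.destroy()).catch(() => {});
  }
}
```

With this shape, the next agent to agent call simply goes through `getConnectionToNode` again and a fresh connection (with a fresh DNS resolution) is established on demand.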
Investigate 32-byte canonical representation of node ID
Can we use this byte structure in our static storage of node IDs?
Alternatively, what base encoding to use instead? base58btc? base64 is dangerous because there are unsafe URL characters in the encoding. base64url is also a potential one here (this is used for the JWS claims in the sigchain too, as this removes the unsafe URL characters).
What does this mean for the Kademlia implementation in NodeGraph? From memory in Canonicalise node ID representation #261, we've refactored this such that we use the 32-byte representation. If we use a string encoding for the node ID, we need to first convert back to the 32-byte Uint8Array.
A note on this though. This could mean that eventually the seed nodes are removed from a bucket (if the bucket were to become full, and they were considered the "least active" node). Is this desired? We'd still have their static information in our config, but not in the internal NodeGraph system. Perhaps this is okay, given that we should reduce reliance on these nodes as the node becomes aware of other nodes in the Polykey network.
We'll be embedding them just like any other node into the Kademlia system. Any further complexity can be added later
Ensure that we support both IP address and hostname in the NodeGraph system (i.e. where NodeId can then map to a NodeAddress containing either [ IP address, port ] or [ hostname, port ])
prototype this with NodeConnection and the GRPCClient to see what happens
if necessary, use the
Ensure we support the chosen encoding/representation for the seed node IDs in NodeGraph (from above point 1.3)
Provide support for the sources of seed nodes:
Environment variable: PK_SEED_NODES
write parser to extract the components from the semicolon separated structure
Once we deploy the test keynodes #194, and make use of recovery codes #202 to generate deterministic root keys and root certificates, this will mean we will have a deterministic Node ID.
We should have a set of Node Ids to be trusted, but we can start with just 1.
The Node ID has to be embedded into our default trusted seed list in the source code.
This can be done in 3 tiers of configuration:
Environment variable PK_SEED_NODES
Configuration file, this would be a configuration used by the PK agent, and should be edited in the node path.
I forgot why I chose 1314, or if we had registered anything here before, but this is suitable as the well known port we use by default.
The ENCODED-NODEID should be a multi-base encoding of the Node ids. This means we should also resolve #261, as decoding the node ids will result in a Uint8Array to be used as the canonical node ID.
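As one possible shape for this, the sketch below encodes a 32-byte node ID with the multibase `u` prefix (base64url without padding) and decodes it back to a Uint8Array. The choice of base64url over base58btc is an assumption made only to keep the example dependency-free, `encodeNodeId` and `decodeNodeId` are hypothetical names, and the `'base64url'` Buffer encoding needs a reasonably recent Node.js (v14.18 or v15.7 and later).

```typescript
// Hypothetical helpers; the canonical node ID is assumed to be 32 bytes.
const NODE_ID_LENGTH = 32;

function encodeNodeId(nodeId: Uint8Array): string {
  if (nodeId.length !== NODE_ID_LENGTH) {
    throw new RangeError(`Node ID must be ${NODE_ID_LENGTH} bytes`);
  }
  // 'u' is the multibase prefix for base64url without padding
  return 'u' + Buffer.from(nodeId).toString('base64url');
}

function decodeNodeId(encoded: string): Uint8Array {
  if (!encoded.startsWith('u')) {
    throw new Error('Expected multibase base64url prefix "u"');
  }
  const bytes = Buffer.from(encoded.slice(1), 'base64url');
  if (bytes.length !== NODE_ID_LENGTH) {
    throw new RangeError(`Node ID must decode to ${NODE_ID_LENGTH} bytes`);
  }
  // Copy into a plain Uint8Array for use as the canonical node ID
  return new Uint8Array(bytes);
}
```

The encoded form contains only URL-safe characters, so it can sit in the user part of the `ENCODED-NODEID@host:port` entries without escaping.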
In the case of an environment variable, where one has to specify this configuration as a string, we should use a standardised env variable format involving : colon separation. Unfortunately, the standard way of separating domains from ports is to use the :, therefore we use ; as the separation.
And a parser should be written to do that. Note that we require the port setting at all times. There is no defaulting of the port.
The resulting value should be referred to from src/config.ts. All other code should be doing import config from '../config'.
Now the NodeGraph that reads this needs to support DNS hostnames as well as IP addresses. At this point we have not tested what happens if we just use hostnames. The hostnames will be established in the NodeConnection which uses GRPC client and also the networking domain which uses uTP. You can prototype this and see what happens first.
The DNS module will resolve to the IP addresses that we want, and we would then use these. Because the underlying IP addresses may be dynamic, do not cache the IP addresses. Instead you resolve every time you want to establish a running NodeConnection. This allows us to change the IP addresses due to migration/redeployment/crashes etc., and any PK agents connected to the seeds will just re-connect (there is no re-connection logic built into NodeManager yet). Consult #224 regarding propagating asynchronous errors from networking connections to NodeConnection. And when you re-connect, you resolve the DNS again to get the appropriate IP address.
At this point there's a problem with uTP not supporting IPv6, see #224 for that. So testnet.polykey.io will only return A records, which are IPv4 addresses.
Multiple A records may be resolved. When multiple are resolved, you must choose 1 randomly to be used. Use the all parameter in the dns.lookup options. For this, no need to do crypto random. Just use Math.random and select an index (https://stackoverflow.com/a/38448710/582917).
The encoded node ids will need to be generated via the #194 procedure; we will keep the recovery keys secret in our own private git repo. But also dog-food PK when appropriate here.
To accomplish this, these 2 additional things need to be done as well:
Re-connection logic to node connections should be in NodeManager, the networking domain doesn't know when to reconnect, so this responsibility is in NodeManager.
Reference documentation should be done in the configuration of the PK (should have a configuration page) as well as the testnode component architecture.
Specification
The seed nodes in the Polykey network provide the first point of entry for bootstrapping new keynodes into the wider Polykey network.
We've now got support for deterministic root keypair generation according to a 24-word recovery key mnemonic, as per #202. We will be using this deterministic generation to generate our own set of seed nodes. This generation of seed nodes will be done in #285.
We'll require two sets of keynodes: one for testnet.polykey.io and another for mainnet.polykey.io. At least for the time being, we'll just be considering a single, shared seed node for both testnet and mainnet. An additional environment variable (PK_ENV) or config flag will be required to switch between which nodes to use (i.e. either mainnet in production, or testnet for debugging/testing).
Specifying the list of seed nodes
The list of seed nodes is required to be a static list of mappings from NodeId to NodeAddress (in this case, containing a hostname and port). The port for all of these seed nodes should be pre-defined (not defaulted) to be 1314.
Recall that the canonical version of a node ID is a 32-byte structure. Discussions of this can be found in #261. Therefore, this 32-byte structure would ideally be used for this static storage. Alternatively, we need to use some other string-encoded version of the node ID.
This list will be able to be supplied from 3 sources:
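Environment variable: PK_SEED_NODES
can use a semi-colon separated structure: ENCODED-NODEID@testnet.polykey.io:1314;...
parse with new URL(): e.g. new URL('pk://dsfoisudf@testnet.polykey.io:80'), new URL('pk://sdof8er@[2001:0db8:85a3:0000:0000:8a2e:0370:7334]:80'). Note the need for the prepended pk:// when using new URL
Configuration file: dynamically read from file by the Polykey agent at the path specified by nodePath
src/config.ts:
DNS resolution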
There's a couple of aspects to supporting DNS resolution.
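if necessary, may need to use the DNS module to resolve to the appropriate IP - this should be wrapped as a promisify (if already at an IP, no need to call this)
all of this resolution should be done at the NodeGraph level (before creating a NodeConnection)
resolution should occur dynamically - we should never store a resolved IP address over a hostname. The underlying IP address can be dynamic. Therefore, always dynamically resolve from hostname to IP when creating a NodeConnection. This should also occur for any reconnections on dropped connections (see Propagate networking connection error handling into NodeConnection error handling #224 for this)
if a hostname resolves to multiple IP addresses, then we choose 1 randomly. Use the all parameter in the dns.lookup options. For this, no need to do crypto random. Just use Math.random and select an index (https://stackoverflow.com/a/38448710/582917)
src/config.ts for bootstrapping the P2P network #269 (comment)
Node reconnection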
There was also discussion of needing to resolve #224 in this issue too, such that we can implement some kind of reconnection logic when a previously initialised connection is dropped. This will need to be further discussed, and if required, may take another day or so.
I wonder if we actually care about reconnecting at all. Would it not be sufficient to simply propagate the error up, terminate the connection, and only re-establish connection if another agent to agent call is made? This would fit the current structure of NodeConnection (where we have a getConnectionToNode function that either uses an existing connection, or creates a new one).
Additional context
pk agent start and pk bootstrap with PK_RECOVERY_CODE and PK_PASSWORD #202
Tasks
Alternatively, what base encoding to use instead? base58btc? base64 is dangerous because there are unsafe URL characters in the encoding. base64url is also a potential one here (this is used for the JWS claims in the sigchain too, as this removes the unsafe URL characters).
What does this mean for the Kademlia implementation in NodeGraph? From memory in Canonicalise node ID representation #261, we've refactored this such that we use the 32-byte representation. If we use a string encoding for the node ID, we need to first convert back to the 32-byte Uint8Array.
nodes domain:
broker into seed nodes (as per Update testnet.polykey.io to point to the list of IPs running seed keynodes #177 (comment))
broker NodeConnections - the seed nodes should be treated as regular nodes, that get added to the NodeGraph (Kademlia system) (as per Update testnet.polykey.io to point to the list of IPs running seed keynodes #177 (comment))
A note on this though. This could mean that eventually the seed nodes are removed from a bucket (if the bucket were to become full, and they were considered the "least active" node). Is this desired? We'd still have their static information in our config, but not in the internal NodeGraph system. Perhaps this is okay, given that we should reduce reliance on these nodes as the node becomes aware of other nodes in the Polykey network.
Ensure that we support both IP address and hostname in the NodeGraph system (i.e. where NodeId can then map to a NodeAddress containing either [ IP address, port ] or [ hostname, port ])
prototype this with NodeConnection and the GRPCClient to see what happens
Environment variable: PK_SEED_NODES
src/config.ts
Old specification
Block 3 from #194 (comment)
Specification
Once we deploy the test keynodes #194, and make use of recovery codes #202 to generate deterministic root keys and root certificates, this will mean we will have a deterministic Node ID.
We should have a set of Node Ids to be trusted, but we can start with just 1.
The Node ID has to be embedded into our default trusted seed list in the source code.
This can be done in 3 tiers of configuration:
Environment variable PK_SEED_NODES
src/config.ts
This configuration needs to be something like:
I forgot why I chose 1314, or if we had registered anything here before, but this is suitable as the well known port we use by default.
The ENCODED-NODEID should be a multi-base encoding of the Node ids. This means we should also resolve #261, as decoding the node ids will result in a Uint8Array to be used as the canonical node ID.
In the case of an environment variable, where one has to specify this configuration as a string, we should use a standardised env variable format involving : colon separation. Unfortunately, the standard way of separating domains from ports is to use the :, therefore we use ; as the separation.
And a parser should be written to do that. Note that we require the port setting at all times. There is no defaulting of the port.
The resulting value should be referred to from src/config.ts. All other code should be doing import config from '../config'.
Now the NodeGraph that reads this needs to support DNS hostnames as well as IP addresses. At this point we have not tested what happens if we just use hostnames. The hostnames will be established in the NodeConnection which uses GRPC client and also the networking domain which uses uTP. You can prototype this and see what happens first.
To be safe, we can directly use the dns module: https://nodejs.org/api/dns.html to resolve hostnames if we have a hostname (https://nodejs.org/api/dns.html#dnslookuphostname-options-callback). If you have an IP address, don't call the DNS module, it's not necessary. The call will need to be wrapped as a promise, so use the appropriate promisify wrapper in the src/utils.ts.
The DNS module will resolve to the IP addresses that we want, and we would then use these. Because the underlying IP addresses may be dynamic, do not cache the IP addresses. Instead you resolve every time you want to establish a running NodeConnection. This allows us to change the IP addresses due to migration/redeployment/crashes etc., and any PK agents connected to the seeds will just re-connect (there is no re-connection logic built into NodeManager yet). Consult #224 regarding propagating asynchronous errors from networking connections to NodeConnection. And when you re-connect, you resolve the DNS again to get the appropriate IP address.
At this point there's a problem with uTP not supporting IPv6, see #224 for that. So testnet.polykey.io will only return A records, which are IPv4 addresses.
Multiple A records may be resolved. When multiple are resolved, you must choose 1 randomly to be used. Use the all parameter in the dns.lookup options. For this, no need to do crypto random. Just use Math.random and select an index (https://stackoverflow.com/a/38448710/582917).
The encoded node ids will need to be generated via the #194 procedure; we will keep the recovery keys secret in our own private git repo. But also dog-food PK when appropriate here.
To accomplish this, these 2 additional things need to be done as well:
Re-connection logic to node connections should be in NodeManager, the networking domain doesn't know when to reconnect, so this responsibility is in NodeManager.
networking is propagated to nodes
Reference documentation should be done in the configuration of the PK (should have a configuration page) as well as the testnode component architecture.
Additional context
pk agent start and pk bootstrap with PK_RECOVERY_CODE and PK_PASSWORD #202
Tasks