Embed Trusted Default Seed Nodes into `src/config.ts` for bootstrapping the P2P network

## Specification

The seed nodes in the Polykey network provide the first point of entry for bootstrapping new keynodes into the wider Polykey network.

We've now got support for deterministic root keypair generation according to a 24-word recovery key mnemonic, as per #202. We will be using this deterministic generation to generate our own set of seed nodes. This generation of seed nodes will be done in #285.

We'll require two sets of keynodes: one for `testnet.polykey.io` and another for `mainnet.polykey.io`. At least for the time being, we'll just be considering a single, shared seed node for both testnet and mainnet. An additional environment variable (`PK_ENV`) or config flag will be required to switch between which nodes to use (i.e. either mainnet in production, or testnet for debugging/testing).
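
The switch described above could be sketched as follows. This is only an illustration: the accepted values `'testnet'` and `'mainnet'`, the fallback to mainnet when `PK_ENV` is unset, the `selectSeedNodes` helper, and the placeholder node IDs are all assumptions, not decided behaviour.

```typescript
type NodeAddress = { host: string; port: number };
type SeedNodes = Record<string, NodeAddress>;
type Network = 'testnet' | 'mainnet';

// Placeholder IDs only; the real encoded node IDs come out of #285.
const seedNodesByNetwork: Record<Network, SeedNodes> = {
  testnet: { 'ENCODED-NODEID1': { host: 'testnet.polykey.io', port: 1314 } },
  mainnet: { 'ENCODED-NODEID1': { host: 'mainnet.polykey.io', port: 1314 } },
};

// Hypothetical helper: picks the seed node set for the given PK_ENV value.
// Assumption: an unset PK_ENV means mainnet; unknown values are rejected.
function selectSeedNodes(pkEnv: string | undefined): SeedNodes {
  switch (pkEnv ?? 'mainnet') {
    case 'testnet':
      return seedNodesByNetwork.testnet;
    case 'mainnet':
      return seedNodesByNetwork.mainnet;
    default:
      throw new Error(`Unknown PK_ENV value: ${pkEnv}`);
  }
}
```

In the agent this would presumably be called as `selectSeedNodes(process.env.PK_ENV)`.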

### Specifying the list of seed nodes

The list of seed nodes is required to be a static list of mappings from `NodeId` to `NodeAddress` (in this case, containing a hostname and port). The port for all of these seed nodes should be pre-defined (not defaulted) to be 1314.

Recall that the canonical version of a node ID is a 32-byte structure. Discussions of this can be found in #261. Therefore, this 32-byte structure would ideally be used for this static storage. Alternatively, we need to use some other string-encoded version of the node ID.

This list will be able to be supplied from 3 sources:
1. Environment variable: `PK_SEED_NODES`
   - can use a semicolon-separated structure: `ENCODED-NODEID@testnet.polykey.io:1314;...`
   - parse with `new URL()`: e.g. `new URL('pk://dsfoisudf@testnet.polykey.io:80')`, `new URL('pk://sdof8er@[2001:0db8:85a3:0000:0000:8a2e:0370:7334]:80')`. Note the need for the prepended `pk://` when using `new URL`
   - seed nodes specified in this environment variable would be expected to be used for both mainnet and testnet
2. Configuration file: 
   - dynamically read from file by the Polykey agent at the path specified by `nodePath`
   - similar to the password file
   - structure of this? Most likely sufficient 
3. `src/config.ts`:
   - statically define the seed nodes in the source code. e.g.:
   ```ts
   const config = {
     seedNodesTest: {
       "ENCODED-NODEID1": { host: "testnet.polykey.io", port: 1314 },
     },
     seedNodesMain: {
       "ENCODED-NODEID1": { host: "mainnet.polykey.io", port: 1314 },
     }
   };
   ```
   - can specify both mainnet and testnet separately
   - switch selection based on config flag/environment variable
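
A minimal sketch of the `PK_SEED_NODES` parser, using the `new URL()` approach with the prepended `pk://` from source 1 above. The `parseSeedNodes` name and the error messages are assumptions. It rejects entries without a port, since the port is never defaulted.

```typescript
type NodeAddress = { host: string; port: number };
type SeedNodes = Record<string, NodeAddress>;

// Hypothetical parser for `ENCODED-NODEID@host:port;ENCODED-NODEID@host:port`
function parseSeedNodes(value: string): SeedNodes {
  const seedNodes: SeedNodes = {};
  for (const entry of value.split(';')) {
    const trimmed = entry.trim();
    // Tolerate a trailing semicolon or empty segments
    if (trimmed === '') continue;
    // `new URL` needs a scheme, so prepend `pk://` before parsing
    const url = new URL(`pk://${trimmed}`);
    if (url.username === '') {
      throw new Error(`Missing node ID in seed node entry: ${trimmed}`);
    }
    if (url.port === '') {
      throw new Error(`Missing port in seed node entry: ${trimmed}`);
    }
    // IPv6 hostnames come back wrapped in square brackets; strip them
    const host = url.hostname.replace(/^\[(.*)\]$/, '$1');
    seedNodes[decodeURIComponent(url.username)] = {
      host,
      port: Number(url.port),
    };
  }
  return seedNodes;
}
```

For example, `parseSeedNodes('abc@testnet.polykey.io:1314')` yields a single mapping from `abc` to `{ host: 'testnet.polykey.io', port: 1314 }`.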

### DNS resolution

There are a couple of aspects to supporting DNS resolution.

1. Performing the actual resolution from hostname to IP address
   - if necessary, may need to use the [DNS module](https://nodejs.org/api/dns.html) to [resolve to the appropriate IP](https://nodejs.org/api/dns.html#dnslookuphostname-options-callback) - this should be wrapped as a `promisify` (if already at an IP, no need to call this)
   - all of this resolution should be done at the `NodeGraph` level (before creating a `NodeConnection`)
   - resolution should occur dynamically - we should never store a resolved IP address over a hostname. The underlying IP address can be dynamic. Therefore, always dynamically resolve from hostname to IP when creating a `NodeConnection`. This should also occur for any reconnections on dropped connections (see #224 for this)
2. Supporting the same hostname mapping to **different** node IDs.
   - if a hostname resolves to multiple IP addresses, then we choose 1 randomly. Use the `all` parameter in the `dns.lookup` options. For this, no need to do crypto random. Just use `Math.random` and select an index (https://stackoverflow.com/a/38448710/582917)
   - as a consequence of this, this means that a hostname could resolve to a different node ID
   - verification of the keynode should then succeed if the node matches **any** of the potential node IDs of the seed nodes
   - this may require some customisation of the current verification process
   - see https://github.com/MatrixAI/js-polykey/issues/269#issuecomment-955908329
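
The resolution step above could look roughly like this. It is a sketch only: `resolveHost` and `pickRandom` are hypothetical names, and it uses Node's own `util.promisify` rather than the wrapper in `src/utils.ts`. IP addresses are returned untouched, and hostnames are resolved on every call with no caching.

```typescript
import { lookup } from 'dns';
import { isIP } from 'net';
import { promisify } from 'util';

const lookupAsync = promisify(lookup);

// Non-cryptographic random choice; Math.random is sufficient here
function pickRandom<T>(items: readonly T[]): T {
  if (items.length === 0) {
    throw new Error('Cannot pick from an empty list');
  }
  return items[Math.floor(Math.random() * items.length)] as T;
}

// Hypothetical helper, intended to be called each time a connection is
// created or re-created, so that a changed IP address is picked up.
async function resolveHost(host: string): Promise<string> {
  // Already an IP address (v4 or v6): no DNS call needed
  if (isIP(host) !== 0) return host;
  // `all: true` returns every resolved address instead of only the first
  const records = await lookupAsync(host, { all: true });
  return pickRandom(records).address;
}
```

The resolved address would then be passed on when the `NodeConnection` is created, while the `NodeGraph` keeps storing the hostname.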

### Node reconnection

There was also discussion of needing to resolve #224 in this issue too, such that we can implement some kind of reconnection logic when a previously initialised connection is dropped. This will need to be further discussed, and if required, may take another day or so.

I wonder if we actually care about reconnecting at all. Would it not be sufficient to simply propagate the error up, terminate the connection, and only re-establish connection if another agent to agent call is made? This would fit the current structure of `NodeConnection` (where we have a `getConnectionToNode` function that either uses an existing connection, or creates a new one).
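
To make the "no explicit reconnection" option concrete, here is a rough sketch. Apart from `getConnectionToNode`, every name in it is hypothetical and the connection type is a stand-in for the real `NodeConnection`. A dropped connection is simply forgotten, and the next call creates a fresh one.

```typescript
// Hypothetical factory that establishes a connection to a node
type Connect<C> = (nodeId: string) => Promise<C>;

class ConnectionCache<C> {
  private readonly connections = new Map<string, C>();
  private readonly connect: Connect<C>;

  constructor(connect: Connect<C>) {
    this.connect = connect;
  }

  // Reuses an existing connection, or lazily creates a new one
  async getConnectionToNode(nodeId: string): Promise<C> {
    const existing = this.connections.get(nodeId);
    if (existing !== undefined) return existing;
    const connection = await this.connect(nodeId);
    this.connections.set(nodeId, connection);
    return connection;
  }

  // Called when an error propagates up from the networking layer:
  // forget the connection, with no eager reconnection attempt
  handleConnectionError(nodeId: string): void {
    this.connections.delete(nodeId);
  }
}
```

This sketch does not guard against two concurrent calls both creating a connection to the same node. That would need locking in a real implementation.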

## Additional context
* #194 
* #224 
* #202
* #261 

## Tasks

1. Investigate 32-byte canonical representation of node ID
   1. Can we use this byte structure in our static storage of node IDs?
   2. Alternatively, what base encoding to use instead? `base58btc`? `base64` is dangerous because the encoding contains URL-unsafe characters. `base64url` is also a potential option here (it is used for the JWS claims in the sigchain too, as it removes the URL-unsafe characters).
   3. What does this mean for the Kademlia implementation in `NodeGraph`? From memory in #261, we've refactored this such that we use the 32-byte representation. If we use a string encoding for the node ID, we first need to convert back to the 32-byte `Uint8Array`.
2. Refactor `nodes` domain:
   1. Refactor all notions of `broker` into `seed` nodes (as per https://github.com/MatrixAI/js-polykey/issues/177#issuecomment-884623410)
   2. Remove separation of `brokerNodeConnections` - the seed nodes should be treated as regular nodes, that get added to the `NodeGraph` (Kademlia system) (as per https://github.com/MatrixAI/js-polykey/issues/177#issuecomment-884623879)
      - ~~A note on this though. This could mean that eventually the seed nodes are removed from a bucket (if the bucket were to become full, and they were considered the "least active" node. Is this desired? We'd still have their static information in our config, but not in the internal `NodeGraph` system. Perhaps this is okay, given that we should reduce reliance on these nodes as the node becomes aware of other nodes in the Polykey network~~
      - We'll be embedding them just like any other node into the Kademlia system. Any further complexity can be added later
   3. Ensure that we support both IP address and hostname in the `NodeGraph` system (i.e. where `NodeId` can then map to a `NodeAddress` containing either `[ IP address, port ]` or `[ hostname, port ]`)
      - prototype this with `NodeConnection` and the `GRPCClient` to see what happens
      - if necessary, use the 
   4. Ensure we support the chosen encoding/representation for the seed node IDs in `NodeGraph` (from above point 1.3)
3. Provide support for the sources of seed nodes:
   1. Environment variable: `PK_SEED_NODES`
      - write parser to extract the components from the semicolon separated structure
   2. Configuration file
   3. Source code: `src/config.ts`
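
For task 1, the round trip between the 32-byte canonical form and a string encoding might look like the sketch below. It assumes `base64url` purely for illustration, since the encoding is still undecided and `base58btc` would need a separate library. The helper names are hypothetical, and the `'base64url'` Buffer encoding needs a reasonably recent Node.js.

```typescript
import { Buffer } from 'buffer';

const NODE_ID_BYTES = 32;

// Hypothetical helper: canonical 32-byte node ID -> URL-safe string
function encodeNodeId(nodeId: Uint8Array): string {
  if (nodeId.length !== NODE_ID_BYTES) {
    throw new Error(`Node ID must be ${NODE_ID_BYTES} bytes`);
  }
  return Buffer.from(nodeId).toString('base64url');
}

// Hypothetical helper: encoded string -> canonical 32-byte node ID,
// as would be needed before handing the ID to the Kademlia logic
function decodeNodeId(encoded: string): Uint8Array {
  const bytes = new Uint8Array(Buffer.from(encoded, 'base64url'));
  if (bytes.length !== NODE_ID_BYTES) {
    throw new Error(`Decoded node ID must be ${NODE_ID_BYTES} bytes`);
  }
  return bytes;
}
```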

<details>
<summary>Old specification</summary>

**Block 3** from https://github.com/MatrixAI/js-polykey/issues/194#issuecomment-955904607

### Specification

Once we deploy the test keynodes #194, and make use of recovery codes #202 to generate deterministic root keys and root certificates, this will mean we will have a deterministic Node ID.

We should have a set of Node Ids to be trusted, but we can start with just 1.

The Node ID has to be embedded into our default trusted seed list in the source code.

This can be done in 3 tiers of configuration:

1. Environment variable `PK_SEED_NODES`
2. Configuration file, this would be a configuration used by the PK agent, and should be edited in the node path.
3. The default `src/config.ts`

This configuration needs to be something like:

```
const config = {
  seedNodes: {
    "ENCODED-NODEID1": { host: "testnet.polykey.io", port: 1314 },
    "ENCODED-NODEID2": { host: "testnet.polykey.io", port: 1314 },
  }
};
```

I forgot why I chose 1314, or if we had registered anything here before, but this is suitable as the well known port we use by default.

The `ENCODED-NODEID` should be a multi-base encoding of the Node ids. This means we should also resolve #261, as decoding the node ids will result in a `Uint8Array` to be used as the canonical node ID.

In the case of an environment variable, where one has to specify this configuration as a string, we should use a standardised env variable format. That format normally involves `:` (colon) separation, but unfortunately `:` is also the standard way of separating domains from ports. Therefore we use `;` as the separator.

```
export PK_SEED_NODES='ENCODED-NODEID1=testnet.polykey.io:1314;ENCODED-NODEID2=testnet.polykey.io:1314'
```

And a parser should be written to do that. Note that we require the `port` setting at all times. There is no defaulting of the port.

The resulting value should be referred to from `src/config.ts`. All other code should be doing `import config from '../config'`.

Now the `NodeGraph` that reads this, needs to support dns hostnames as well as IP addresses. At this point we have not tested what happens if we just use hostnames. The hostnames will be established in the `NodeConnection` which uses GRPC client and also the networking domain which uses uTP. You can prototype this and see what happens first.

To be safe, we can directly use the dns module: https://nodejs.org/api/dns.html to resolve hostnames if we have a hostname (https://nodejs.org/api/dns.html#dnslookuphostname-options-callback). If you have an IP address, don't call the DNS module; it's not necessary. The call will need to be wrapped as a promise, so use the appropriate `promisify` wrapper in `src/utils.ts`.

The DNS module will resolve to the IP addresses that we want, and we would then use these. Because the underlying IP addresses may be dynamic, do not cache the IP addresses. Instead, resolve every time you want to establish a running `NodeConnection`. This allows us to change the IP addresses due to migration, redeployment, crashes, etc., and any PK agents connected to the seeds will just re-connect (there is no re-connection logic built into `NodeManager` yet). Consult #224 regarding propagating asynchronous errors from networking connections to `NodeConnection`. When you re-connect, resolve the DNS again to get the appropriate IP address.

At this point there's a problem with uTP not supporting IPv6, see #224 for that. So the `testnet.polykey.io` will only return A records which are IPv4 addresses.

Multiple A records may be resolved. When multiple are resolved, you must choose 1 randomly to be used. Use the `all` parameter in the `dns.lookup` options. For this, no need to do crypto random. Just use `Math.random` and select an index (https://stackoverflow.com/a/38448710/582917).

The encoded node ids will need to be generated via the #194 procedure, we will keep the recovery keys secret in our own private git repo. But also dog-food PK when appropriate here.

To accomplish this, these 2 additional things need to be done as well:

* Re-connection logic to node connections should be in NodeManager, the networking domain doesn't know when to reconnect, so this responsibility is in `NodeManager`.
* Need to resolve #224 to ensure that connection closes in `networking` is propagated to `nodes`

Reference documentation should be done in the configuration of the PK (should have a configuration page) as well as the testnode component architecture.

### Additional context

* #194 
* #224 
* #202 
* #261 

### Tasks

1. ...
2. ...
3. ...

</details>
