Testbed datasets

Datasets for Network Performance Evaluation

The field of AI requires datasets for education, research and engineering. Until now, the datasets provided by BNN were based on simulation, which is time-consuming: the duration of a simulation depends on the amount of traffic to simulate. With the present network testbed, this limitation no longer exists. Moreover, the datasets generated with the real testbed let us model the intricacies of real hardware, which are hard to simulate, and this adds considerable value to our research on practical solutions. Finally, because the testbed uses hardware routers rather than virtualized technologies, the scenario is more realistic and the generated datasets are more readily accepted by industry.

Testbed Architecture

Our network testbed consists of 10 Huawei NetEngine 8000 M1A routers, each using 6 of its 16 Gigabit Ethernet ports, connected through 2 Huawei CloudEngine S5732-H48UM2CC switches. This configuration allows us to build any physical topology with up to 10 nodes and a node degree from 1 to 5 (one of the six ports is reserved for traffic generation), and to configure arbitrary routing policies.
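As an illustration of these limits, the following minimal sketch checks whether a candidate topology fits the testbed. The function name `check_topology` and the edge-list representation are assumptions made for this example; they are not part of the testbed tooling.

```python
from collections import defaultdict

MAX_NODES = 10   # routers available in the testbed
MAX_DEGREE = 5   # 6 ports in use per router, 1 reserved for traffic generation


def check_topology(edges):
    """Return a list of constraint violations for an undirected edge list.

    `edges` is an iterable of (node_a, node_b) pairs. An empty result means
    the topology respects the node-count and node-degree limits.
    (Hypothetical helper, for illustration only.)
    """
    neighbours = defaultdict(set)
    problems = []
    for a, b in edges:
        if a == b:
            problems.append(f"self-loop on node {a}")
            continue
        neighbours[a].add(b)
        neighbours[b].add(a)

    if len(neighbours) > MAX_NODES:
        problems.append(f"{len(neighbours)} nodes exceed the limit of {MAX_NODES}")
    for node in sorted(neighbours):
        degree = len(neighbours[node])
        if degree > MAX_DEGREE:
            problems.append(f"node {node} has degree {degree} > {MAX_DEGREE}")
    return problems
```

For example, a 10-node ring passes the check, while a star whose hub has 7 neighbours is rejected because the hub would need more than 5 ports.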

Testbed architecture

In our testbed, traffic is generated by two high-end servers running the open-source software TRex in stateless mode and in Advanced Stateful (ASTF) mode, which implements the TCP layer. Using combinations of PCAPs captured from real flow connections, we generate TCP/UDP traffic profiles that are injected into the network testbed to obtain traffic that is as realistic as possible.

Traffic generated and received by the traffic generator servers is duplicated by means of optical splitters and processed in the traffic capture servers. These servers have special network interfaces with hardware timestamp support, which allows precise delay measurement. This is critical for obtaining an accurate dataset.
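To make the role of the timestamps concrete, here is a minimal sketch of how per-packet delay and loss could be derived from two captures. It assumes that each packet carries a unique identifier and that the transmit-side and receive-side timestamps come from a common clock; the function names and the dictionary layout are hypothetical and do not describe the actual capture pipeline.

```python
from statistics import mean


def per_packet_delays(tx_ns, rx_ns):
    """Match packets by id and return {packet_id: delay in nanoseconds}.

    `tx_ns` and `rx_ns` map a packet id to its hardware timestamp (ns) at the
    transmit-side and receive-side capture points. Packets absent from
    `rx_ns` are treated as lost and skipped.
    """
    return {pid: rx_ns[pid] - ts for pid, ts in tx_ns.items() if pid in rx_ns}


def summarize(tx_ns, rx_ns):
    """Return mean delay and loss ratio for one flow (illustrative only)."""
    delays = per_packet_delays(tx_ns, rx_ns)
    lost = len(tx_ns) - len(delays)
    return {
        "mean_delay_ns": mean(delays.values()) if delays else None,
        "loss_ratio": lost / len(tx_ns) if tx_ns else 0.0,
    }
```

Because the delay is a difference of two timestamps, any error in either timestamp goes directly into the measurement, which is why hardware timestamping matters here.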

Datasets

In this section, you will find the datasets generated with the testbed. To read them, we recommend using the datanetAPI file found in the root path of each dataset.

Dataset v1

This dataset is used to validate the correct behavior of RouteNet. As a first step, the traffic generated for this dataset is UDP only and is generated synthetically. Each path carries one stream, which is either Constant Bit Rate (CBR) or multi-burst, selected randomly. The dataset consists of eleven topologies of different sizes, with 10 routing configurations for each of them.

Dataset v1 can be downloaded from this [link]

Dataset v2

This dataset focuses on adding queue scheduling policies to the nodes. The policies used are Strict Priority (SP) and Weighted Fair Queueing (WFQ). There are also 5 DSCP types of service: BE, AF1, AF2, AF3 and AF4, which are assigned to scheduling queues following one of these two policies:

BE: Best Effort, AF1: WFQ, AF2: WFQ, AF3: WFQ, AF4: WFQ
BE: Best Effort, AF1: WFQ, AF2: WFQ, AF3: WFQ, AF4: SP

For the first scenario, the WFQ weights are selected from the following list: [17, 20, 20, 43], [34, 5, 44, 17], [33, 36, 13, 18], [8, 12, 44, 36], [9, 18, 28, 45]

For the second scenario, the weights are selected from: [15, 19, 66], [13, 50, 37], [49, 21, 30], [49, 18, 33], [22, 11, 67]
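The two weight lists can be written down as data, as in the sketch below. Every listed weight vector sums to 100. The mapping of weights to classes is an assumption made for this example: weights are applied in order to AF1, AF2, ..., and any AF class left without a weight is served with SP. The function name `queue_config` is hypothetical.

```python
# Weight vectors copied from the text above.
WFQ_ONLY_WEIGHTS = [
    [17, 20, 20, 43], [34, 5, 44, 17], [33, 36, 13, 18],
    [8, 12, 44, 36], [9, 18, 28, 45],
]
WFQ_PLUS_SP_WEIGHTS = [
    [15, 19, 66], [13, 50, 37], [49, 21, 30], [49, 18, 33], [22, 11, 67],
]

AF_CLASSES = ["AF1", "AF2", "AF3", "AF4"]


def queue_config(weights):
    """Build a {class: (policy, weight)} mapping for one weight vector.

    Assumption: weights are assigned to AF1.. in order; AF classes without
    a weight (AF4 in the second scenario) use Strict Priority.
    """
    config = {"BE": ("best-effort", None)}
    for name, weight in zip(AF_CLASSES, weights):
        config[name] = ("WFQ", weight)
    for name in AF_CLASSES[len(weights):]:
        config[name] = ("SP", None)
    return config
```

Under this assumption, `queue_config([15, 19, 66])` gives AF1 to AF3 the WFQ weights 15, 19 and 66 and marks AF4 as SP.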

For each of the 11 topologies used, we have 5 candidate scheduling configurations and 10 routings. Samples are generated for all combinations of routings and scheduling configurations.
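The size of this configuration space can be enumerated with a short sketch. The integer indices below are placeholders for the real topologies, scheduling configurations and routings, and the resulting count is the number of configurations, which is not necessarily the number of samples stored in the dataset.

```python
from itertools import product

N_TOPOLOGIES = 11   # topologies used in the dataset
N_SCHEDULING = 5    # candidate scheduling configurations per topology
N_ROUTINGS = 10     # routings per topology


def sample_configurations():
    """List every (topology, scheduling, routing) index triple."""
    return list(product(range(N_TOPOLOGIES),
                        range(N_SCHEDULING),
                        range(N_ROUTINGS)))


configs = sample_configurations()
print(len(configs))  # 11 * 5 * 10 = 550
```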

When generating traffic, for each src-dst pair we select a random number of streams from 3 to 14, and for each stream we randomly choose a DSCP type and a traffic profile (CBR or multi-burst).
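The selection procedure can be sketched as follows. The sketch assumes uniform random choices and treats the range 3 to 14 as inclusive at both ends; neither detail is stated in the text, and `draw_streams` is a hypothetical name.

```python
import random

DSCP_TYPES = ["BE", "AF1", "AF2", "AF3", "AF4"]
PROFILES = ["CBR", "multi-burst"]


def draw_streams(rng):
    """Draw the streams for one src-dst pair.

    Assumptions: uniform choices, and 3..14 streams inclusive.
    `rng` is a random.Random instance so that runs are reproducible.
    """
    n_streams = rng.randint(3, 14)  # randint includes both endpoints
    return [
        {"dscp": rng.choice(DSCP_TYPES), "profile": rng.choice(PROFILES)}
        for _ in range(n_streams)
    ]


streams = draw_streams(random.Random(0))
```

Passing a seeded `random.Random` keeps the generated traffic description reproducible across runs.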

Dataset v2 can be downloaded from this [link]

Dataset v3

The samples generated for this dataset correspond to the combination of 11 different topologies and their 10 routing configurations.

The traffic for this dataset is obtained by randomly selecting from 3 to 8 different UDP pcap flows for each src-dst path. We divide the bandwidth selected for the path among all the pcaps and, for each pcap, calculate the number of connections per second (cps) of the flow it defines that is needed to generate the required bandwidth.
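A possible reading of this calculation is sketched below. It assumes that the path bandwidth is split equally among the selected pcaps and that one replay of a pcap sends a known number of bytes, so that cps = share / (8 × bytes per replay). The equal split, the function name `plan_pcap_flows` and the pcap names in the usage example are assumptions for illustration only.

```python
import random


def plan_pcap_flows(path_bw_bps, pcap_sizes_bytes, rng):
    """Pick 3 to 8 pcaps and compute the cps each needs.

    `pcap_sizes_bytes` maps a pcap name to the bytes sent by one replay of
    its flow (the library must hold at least 3 pcaps). Returns
    {pcap_name: connections_per_second}.
    Assumption: the path bandwidth is shared equally among chosen pcaps.
    """
    n = rng.randint(3, min(8, len(pcap_sizes_bytes)))
    chosen = rng.sample(sorted(pcap_sizes_bytes), n)
    share_bps = path_bw_bps / n
    # One connection per second contributes 8 * size bits per second.
    return {name: share_bps / (pcap_sizes_bytes[name] * 8) for name in chosen}


# Hypothetical pcap library: names and sizes are made up for the example.
library = {f"flow_{i}.pcap": 10_000 * (i + 1) for i in range(8)}
plan = plan_pcap_flows(1_000_000, library, random.Random(0))
```

Summing `cps * size * 8` over the returned plan recovers the path bandwidth, which is a quick consistency check on the split.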

Dataset v3 can be downloaded from this [link]

Dataset v4

This dataset can be used to model TCP traffic in a network.

For each src-dst path, we select a random number of pcaps, between 3 and 8, from a pcap library, where each pcap consists of a TCP session of an application. Then a number of connections per session is selected to obtain an estimated bandwidth in a non-congested network. The traffic generator, which implements a TCP stack, uses the pcap files as templates to generate the traffic sessions and inject the traffic into the network.

The performance of the network is evaluated in 11 different topologies with 10 different routings for each topology.

Dataset v4 can be downloaded from this [link]