Sunday 25 June 2017

DPDK Designs

1) DPDK usually pins one pthread per core to avoid the overhead of task switching.
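As a rough illustration of what pinning means at the OS level, here is a minimal Linux-only sketch that uses the glibc sched_setaffinity() call directly. It is not DPDK's own code: the helper names first_allowed_cpu() and pin_self_to_cpu() are made up for this example, and a DPDK application normally gets the same effect from the EAL core options rather than calling this by hand.

```c
#define _GNU_SOURCE            /* exposes cpu_set_t and the CPU_* macros in glibc */
#include <sched.h>

/* Hypothetical helper: lowest CPU id the calling thread may run on, or -1. */
int first_allowed_cpu(void)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &set))
            return cpu;
    return -1;
}

/* Hypothetical helper: restrict the calling thread to exactly one CPU.
 * Returns 0 on success, -1 on failure (errno is set by the kernel). */
int pin_self_to_cpu(int cpu)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    /* A pid of 0 means "the calling thread". */
    return sched_setaffinity(0, sizeof(set), &set);
}
```

Once a thread is pinned this way, the scheduler no longer migrates it between cores, so its caches stay warm for the polling loop it runs.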

2) Many libc functions are available in DPDK via the Linux application environment. However, many of these functions are not designed for performance. Functions such as memcpy() or strcpy() should not be used in the data plane. The DPDK API provides an optimized rte_memcpy() function as a replacement for memcpy().


3) To provide a message-based communication between lcores, it is advised to use the DPDK ring API, which provides a lockless ring implementation.
The ring supports bulk and burst access, meaning that it is possible to read several elements from the ring with only one costly atomic operation.
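To make the bulk/burst distinction concrete, here is a minimal single-producer, single-consumer ring written with C11 atomics. It is a simplified sketch, not the rte_ring implementation: the struct and function names are invented for this example, the ring size is fixed at 8, and the real DPDK ring also handles multiple producers and consumers. The point it shows is that a whole batch is published with one update of the shared index, and that "bulk" means all-or-nothing while "burst" means as many as fit.

```c
#include <stdatomic.h>

#define RING_SIZE 8u                 /* must be a power of two */
#define RING_MASK (RING_SIZE - 1u)

struct spsc_ring {
    _Atomic unsigned head;           /* next slot to write, advanced only by the producer */
    _Atomic unsigned tail;           /* next slot to read, advanced only by the consumer */
    void *slots[RING_SIZE];
};

/* Enqueue up to n objects. With exact != 0 this is a bulk operation
 * (all n or nothing); with exact == 0 it is a burst operation. */
unsigned ring_enqueue(struct spsc_ring *r, void *const *objs, unsigned n, int exact)
{
    unsigned head = atomic_load_explicit(&r->head, memory_order_relaxed);
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    unsigned free_slots = RING_SIZE - (head - tail);

    if (n > free_slots) {
        if (exact)
            return 0;
        n = free_slots;
    }
    for (unsigned i = 0; i < n; i++)
        r->slots[(head + i) & RING_MASK] = objs[i];
    /* A single release store makes all n objects visible to the consumer. */
    atomic_store_explicit(&r->head, head + n, memory_order_release);
    return n;
}

/* Dequeue up to n objects, with the same bulk/burst convention. */
unsigned ring_dequeue(struct spsc_ring *r, void **objs, unsigned n, int exact)
{
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    unsigned head = atomic_load_explicit(&r->head, memory_order_acquire);
    unsigned avail = head - tail;

    if (n > avail) {
        if (exact)
            return 0;
        n = avail;
    }
    for (unsigned i = 0; i < n; i++)
        objs[i] = r->slots[(tail + i) & RING_MASK];
    /* A single release store hands all n slots back to the producer. */
    atomic_store_explicit(&r->tail, tail + n, memory_order_release);
    return n;
}
```

In DPDK itself the corresponding calls are the rte_ring bulk and burst enqueue/dequeue functions; the sketch only mirrors their all-or-nothing versus best-effort behaviour.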

4) On x86, atomic operations imply a lock prefix before the instruction, causing the processor’s LOCK# signal to be asserted during execution of the following instruction. This has a big impact on performance in a multicore environment.
Performance can be improved by avoiding lock mechanisms in the data plane. Locks can often be replaced by other solutions, such as per-lcore variables. For example, a PMD maintains a separate transmit queue per core, per port. In the same way, every receive queue of a port is assigned to and polled by a single logical core (lcore).
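The per-lcore idea can be sketched as one counter slot per core instead of one shared counter. This is a minimal illustration under stated assumptions, not DPDK code: MAX_LCORES, the 64-byte cache line and all the names are hypothetical. Because each slot has exactly one writer, the update is a relaxed load followed by a relaxed store, which on x86-64 typically compiles to ordinary moves rather than a lock-prefixed read-modify-write.

```c
#include <stdatomic.h>
#include <stdint.h>

#define MAX_LCORES 4       /* hypothetical core count for this sketch */
#define CACHE_LINE 64      /* assumed cache-line size in bytes */

/* One slot per lcore, each on its own cache line to avoid false sharing. */
struct lcore_stats {
    _Alignas(CACHE_LINE) _Atomic uint64_t rx_packets;
};

struct lcore_stats lcore_stats_tbl[MAX_LCORES];

/* Data-plane path: called only by the lcore that owns the slot, so a
 * separate load and store is safe and no locked instruction is needed. */
void count_rx(unsigned lcore_id, uint64_t n)
{
    _Atomic uint64_t *c = &lcore_stats_tbl[lcore_id].rx_packets;
    uint64_t cur = atomic_load_explicit(c, memory_order_relaxed);

    atomic_store_explicit(c, cur + n, memory_order_relaxed);
}

/* Control-plane path: sum all slots when a total is needed. */
uint64_t total_rx(void)
{
    uint64_t sum = 0;

    for (unsigned i = 0; i < MAX_LCORES; i++)
        sum += atomic_load_explicit(&lcore_stats_tbl[i].rx_packets,
                                    memory_order_relaxed);
    return sum;
}
```

The trade-off is that the total is computed on demand and may lag slightly behind in-flight updates, which is usually acceptable for statistics.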

5) To suit Non-Uniform Memory Access (NUMA) systems, memory management is designed to assign each logical core a private buffer pool in local memory, minimizing remote memory access.
6) DPDK Packet Framework port types
#  Port type         Description
1  SW ring           SW circular buffer used for message passing between the application threads.
                     Uses the DPDK rte_ring primitive. Expected to be the most commonly used type of port.
2  HW ring           Queue of buffer descriptors used to interact with NIC, switch or accelerator ports.
                     For NIC ports, it uses the DPDK rte_eth_rx_queue or rte_eth_tx_queue primitives.
3  IP reassembly     Input packets are either IP fragments or complete IP datagrams.
                     Output packets are complete IP datagrams.
4  IP fragmentation  Input packets are jumbo (IP datagrams with length bigger than MTU) or non-jumbo packets.
                     Output packets are non-jumbo packets.
5  Traffic manager   Traffic manager attached to a specific NIC output port, performing congestion management
                     and hierarchical scheduling according to pre-defined SLAs.
6  KNI               Send/receive packets to/from Linux kernel space.
7  Source            Input port used as packet generator. Similar to Linux kernel /dev/zero character device.
8  Sink              Output port used to drop all input packets. Similar to Linux kernel /dev/null character device.
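To show how the source and sink port types fit behind a common interface, here is a small sketch in plain C. Everything in it is hypothetical and heavily simplified: struct pkt, port_in, port_out and pipeline_run_once() are invented for this example and are not the DPDK rte_port API, and the source simply recycles packets from a fixed array rather than allocating from a buffer pool.

```c
#include <stdint.h>

struct pkt { uint32_t len; };        /* stand-in for a packet buffer */

/* Hypothetical port interfaces: a burst-oriented callback plus its context. */
struct port_in  { unsigned (*rx)(void *ctx, struct pkt **pkts, unsigned n); void *ctx; };
struct port_out { unsigned (*tx)(void *ctx, struct pkt **pkts, unsigned n); void *ctx; };

/* Source port: always has packets to hand out, like reading /dev/zero. */
struct source_ctx { struct pkt *pool; unsigned pool_size; unsigned next; };

unsigned source_rx(void *ctx, struct pkt **pkts, unsigned n)
{
    struct source_ctx *s = ctx;

    for (unsigned i = 0; i < n; i++) {
        pkts[i] = &s->pool[s->next];
        s->next = (s->next + 1) % s->pool_size;
    }
    return n;
}

/* Sink port: accepts and drops everything, like writing to /dev/null. */
struct sink_ctx { uint64_t dropped; };

unsigned sink_tx(void *ctx, struct pkt **pkts, unsigned n)
{
    struct sink_ctx *s = ctx;

    (void)pkts;                      /* packets are discarded, only counted */
    s->dropped += n;
    return n;
}

/* One pipeline iteration: read a burst from the input port and write it
 * to the output port. The burst is capped at 32 packets. */
unsigned pipeline_run_once(struct port_in *in, struct port_out *out, unsigned burst)
{
    struct pkt *pkts[32];
    unsigned n;

    if (burst > 32)
        burst = 32;
    n = in->rx(in->ctx, pkts, burst);
    return out->tx(out->ctx, pkts, n);
}
```

Swapping the source for a ring-backed input, or the sink for a NIC-backed output, would only change which callback and context are plugged into the port structs, which is the main appeal of treating every port type uniformly.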
