Vyatta Network OS Documentation

Learn how to install, configure and operate the Vyatta NOS, which helps drive our virtual networking & physical platforms portfolio.

Controlling IPsec crypto cores to obtain better performance

The following examples illustrate how crypto engines are mapped to dataplane cores, and how the user can control that mapping from the CLI to obtain the desired aggregate throughput.

Note: The CLI command for displaying dataplane core usage is "monitor dataplane".
Figure 1. Controlling IPsec crypto cores

Example 1

In this example, a vRouter with 10 cores has IPsec site-to-site tunnels to a single remote network. The default dataplane core assignment leaves a single free core that can be used as a crypto engine. Because all crypto SAs are assigned to that one core, IPsec forwarding performance is sub-optimal.

set interfaces dataplane dp0s5 address '1.1.1.1/24'
set interfaces dataplane dp0s6 address '192.85.1.1/24' 
set interfaces dataplane dp0s7 address '10.10.10.1/24'
set security vpn ipsec esp-group ESP lifetime '86400' 
set security vpn ipsec esp-group ESP pfs 'disable'
set security vpn ipsec esp-group ESP proposal 1 encryption 'aes128gcm128' 
set security vpn ipsec esp-group ESP proposal 1 hash 'null'
set security vpn ipsec ike-group IKE ike-version '2' 
set security vpn ipsec ike-group IKE lifetime '86400'
set security vpn ipsec ike-group IKE proposal 1 dh-group '2'
set security vpn ipsec ike-group IKE proposal 1 encryption 'aes256' 
set security vpn ipsec ike-group IKE proposal 1 hash 'sha2_512'

set security vpn ipsec site-to-site peer 1.1.1.5 authentication mode 'pre-shared-secret' 
set security vpn ipsec site-to-site peer 1.1.1.5 authentication pre-shared-secret 'test' 
set security vpn ipsec site-to-site peer 1.1.1.5 default-esp-group 'ESP'
set security vpn ipsec site-to-site peer 1.1.1.5 ike-group 'IKE'
set security vpn ipsec site-to-site peer 1.1.1.5 local-address '1.1.1.1'
set security vpn ipsec site-to-site peer 1.1.1.5 tunnel 1 local prefix '192.85.1.0/24' 
set security vpn ipsec site-to-site peer 1.1.1.5 tunnel 1 protocol 'all'
set security vpn ipsec site-to-site peer 1.1.1.5 tunnel 1 remote prefix '196.85.1.0/24'

The following monitor dataplane and show dataplane command output shows that a single core handles all crypto traffic, which limits the vRouter total crypto throughput to 1.3 Mpps.

Dataplane CPU activity

Core Interface    RX Rate   TX Rate  Idle
--------------------------------------------------------
1    dp0s7               0              250 µs
2    dp0s7               0              250 µs
3    dp0s6            1.5M              1 µs
4    dp0s6            1.5M              3 µs
5    dp0s5               0              250 µs
6    dp0s5               0              250 µs
7    dp0s5               0              250 µs
8    dp0s5            1.5M              10 µs
9    [crypt]                     1.3M   0 µs

                   RX                    TX                  Slow Path
Interface          Packets Rate          Packets Rate        In      Out
------------------------------------------------------------------------------          
[crypt]                                 333087232 1.3M
dp0s5            397954426 1.5M                                3       11
dp0s6            780658237 2.9M                               51        6
dp0s7                    0 0                                   0        7
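
The per-core view above can also be summarized with a short script. The following is a minimal Python sketch, assuming the monitor dataplane output has been captured as text in the layout shown on this page (one rate value per row, idle time in µs). The function name crypt_cores and the parsing rules are illustrative and are not part of the Vyatta tooling.

```python
import re

# A few rows copied from the output above.
SAMPLE = """\
1    dp0s7               0              250 µs
3    dp0s6            1.5M              1 µs
8    dp0s5            1.5M              10 µs
9    [crypt]                     1.3M   0 µs
"""

# Optional core number, interface name, one rate value, idle time in µs.
ROW = re.compile(r"^\s*(\d+)?\s*(\S+)\s+(\S+)\s+(\d+)\s*µs\s*$")

def crypt_cores(text):
    """Return (core, rate, idle_us) for every [crypt] row.

    A row without a core number (a second thread listed under the same
    core) inherits the core number of the row above it.
    """
    result = []
    core = None
    for line in text.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # header, separator, or blank line
        if m.group(1) is not None:
            core = int(m.group(1))
        if m.group(2) == "[crypt]":
            result.append((core, m.group(3), int(m.group(4))))
    return result

print(crypt_cores(SAMPLE))  # [(9, '1.3M', 0)]
```

A single entry with an idle time of 0 µs is the signature of the bottleneck described here: one fully busy core carrying all crypto traffic.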

Because dp0s7 carries only low volumes of traffic, it can be limited to a single core. dp0s5 can be reduced to 2 cores because the 40Gb link is underutilized, and dp0s6 keeps its 2 cores. The following configuration applies these assignments.

set interfaces dataplane dp0s7 cpu-affinity 1
set interfaces dataplane dp0s5 cpu-affinity 2-3
set interfaces dataplane dp0s6 cpu-affinity 4-5
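
The core budget behind this change can be sketched in Python. The sketch assumes, based only on the outputs on this page, that the dataplane of this 10-core vRouter uses cores 1 to 9 (core 0 never appears in the monitor dataplane output). The helper names expand and free_cores are hypothetical, and only the two affinity forms used above (a single core or a range) are handled.

```python
def expand(affinity):
    """Expand a cpu-affinity value such as '1' or '2-3' into a set of core numbers."""
    lo, _, hi = affinity.partition("-")
    return set(range(int(lo), int(hi or lo) + 1))

def free_cores(dataplane_cores, affinities):
    """Return the cores not pinned to any interface, which are left for crypto."""
    used = set()
    for affinity in affinities.values():
        used |= expand(affinity)
    return sorted(set(dataplane_cores) - used)

# Assumption: cores 1..9 are the dataplane cores, as in the outputs on this page.
affinities = {"dp0s7": "1", "dp0s5": "2-3", "dp0s6": "4-5"}
print(free_cores(range(1, 10), affinities))  # [6, 7, 8, 9]
```

Pinning the three interfaces to five cores leaves four cores free, compared with one free core under the default assignment.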

Following a reboot, the CPU assignment matches the configuration. There are now 2 crypt processes, and the vRouter total crypto throughput is increased to 2.8 Mpps.

Dataplane CPU activity

Core Interface    RX Rate   TX Rate  Idle
--------------------------------------------------------
1    dp0s7               0              250 µs
     dp0s7               0              250 µs
2    dp0s5               0              250 µs
     dp0s5               0              250 µs
3    dp0s5               0              250 µs
     dp0s5            1.5M              10 µs
4    dp0s6            1.5M              1 µs
5    dp0s6            1.5M              0 µs
8    [crypt]                     1.3M   0 µs
9    [crypt]                     1.5M   0 µs 
vyatta@dut-1:~$ show vpn ipsec sa
Peer ID / IP                           Local ID / IP
------------                           -------------
1.1.1.5                                1.1.1.1

Tunnel Id         State Bytes Out/In  Encrypt      Hash     DH A-Time L-Time
------ ---------- ----- ------------- ------------ -------- -- ------ ------
1      1          up    10.0G/9.7G    aes128gcm128 null     2 11     86400
vyatta@dut-1:~$ show dataplane

                   RX                    TX                  Slow Path
Interface          Packets Rate          Packets Rate        In      Out
------------------------------------------------------------------------------
[crypt]                                1796307664 2.8M
dp0s5           3304762213 1.5M                               36       40
dp0s6           3845688861 2.9M                               30        7
dp0s7                    0 0                                   0        6

Further performance improvements can be made by splitting the traffic across multiple tunnels, whose crypt processes can run on the other 2 free cores. In this example, the customer's traffic profile is split between TCP and other protocols, so better performance can be obtained by creating a second tunnel for TCP. The second tunnel creates a second pair of SAs, and therefore a second pair of crypt processes.

set security vpn ipsec site-to-site peer 1.1.1.5 tunnel 2 local prefix '192.85.1.0/24'
set security vpn ipsec site-to-site peer 1.1.1.5 tunnel 2 remote prefix '196.85.1.0/24'
set security vpn ipsec site-to-site peer 1.1.1.5 tunnel 2 protocol tcp
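
The relationship between tunnels and crypt processes can be written as a small Python sketch. It is a simplified model inferred from the text and outputs on this page (one pair of SAs, and so one pair of crypt processes, per tunnel) and not a description of the dataplane scheduler. Both function names are hypothetical.

```python
def crypt_processes(tunnels):
    """Each tunnel has a pair of SAs, and each SA is served by one crypt process."""
    return 2 * tunnels

def fits_on_free_cores(tunnels, free_cores):
    """True when every crypt process can have a free core to itself (simplified model)."""
    return crypt_processes(tunnels) <= free_cores

print(crypt_processes(2))         # 4
print(fits_on_free_cores(2, 4))   # True: four crypt processes, four free cores
print(fits_on_free_cores(2, 0))   # False: crypt must share cores with forwarding
```

With the four free cores left by the cpu-affinity configuration, two tunnels are the most that this model places entirely on dedicated cores.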

Now there are 4 crypt processes, and the vRouter total crypto throughput is increased to 5.4 Mpps.

Dataplane CPU activity

Core Interface    RX Rate   TX Rate  Idle
--------------------------------------------------------
1    dp0s7               0              250 µs
     dp0s7               0              250 µs
2    dp0s5               0              250 µs
     dp0s5               0              250 µs
3    dp0s5               0              250 µs
     dp0s5            2.6M              10 µs
4    dp0s6            1.5M              0 µs
5    dp0s6            1.5M              1 µs
6    [crypt]                     1.3M   0 µs
7    [crypt]                     1.3M   0 µs
8    [crypt]                     1.4M   0 µs
9    [crypt]                     1.3M   0 µs
vyatta@dut-1:~$ show vpn ipsec sa
Peer ID / IP                           Local ID / IP
------------                           -------------
1.1.1.5                                1.1.1.1

Tunnel Id         State Bytes Out/In  Encrypt      Hash     DH A-Time L-Time
------ ---------- ----- ------------- ------------ -------- -- ------ ------
1      5          up    40.9G/34.0G   aes128gcm128 null     2 106    86400
2      6          up    21.4G/17.0G   aes128gcm128 null     2 105    86400
vyatta@dut-1:~$ show dataplane
                   RX                    TX                  Slow Path
Interface          Packets Rate          Packets Rate        In      Out
--------------------------------------------------------------------------
[crypt]                                5549204792 5.4M
dp0s5           2770036752 2.6M                               31       36
dp0s6           3219935397 2.9M                               30        7
dp0s7                    0 0                                   0        6

If the vRouter had 9 cores, so that there were initially no free cores rather than a single one, the crypt processes would share cores with the dataplane forwarding threads, as shown below.

Dataplane CPU activity

Core Interface    RX Rate   TX Rate  Idle
--------------------------------------------------------
1    dp0s7               0              250 µs
     [crypt]                     1.3M   0 µs
2    dp0s7               0              250 µs
     [crypt]                     1.5M   0 µs
3    dp0s5               0              250 µs
4    dp0s5               0              250 µs
5    dp0s5               0              250 µs
6    dp0s5            1.5M              7 µs
7    dp0s6            1.5M              0 µs
8    dp0s6            1.5M              1 µs

Example 2

In the following example, the crypt processes have to share CPU cycles with dataplane forwarding. However, because the dp0s7 interface is receiving no traffic in this case, the performance matches that of the dedicated cores in the previous example, and the vRouter total crypto throughput is 2.8 Mpps.

vyatta@dut-1:~$ show dataplane
                   RX                    TX                  Slow Path
Interface          Packets Rate          Packets Rate        In      Out
------------------------------------------------------------------------------
[crypt]                                 500745504 2.8M
dp0s5            271005437 1.5M                                3       10
dp0s6            537440476 2.9M                               53        7
dp0s7                    0 0                                   0        7

Performance can be improved by creating a second tunnel for the TCP traffic, resulting in a total crypto throughput of 4.3 Mpps. Here the performance does not match that of the dedicated cores, because core 6 is now doing both packet forwarding and crypt processing.

Dataplane CPU activity
Core Interface    RX Rate   TX Rate  Idle
--------------------------------------------------------
1    dp0s7               0              250 µs
2    dp0s7               0              250 µs
3    dp0s5               0              250 µs
     [crypt]                     1.3M   0 µs
4    dp0s5               0              250 µs
     [crypt]                     1.4M   0 µs
5    dp0s5               0              250 µs
     [crypt]                   796.8K   2 µs
6    dp0s5            2.3M              0 µs
     [crypt]                   796.8K   0 µs
7    dp0s6            1.5M              2 µs
8    dp0s6            1.5M              1 µs
vyatta@dut-1:~$ show dataplane
                   RX                    TX                  Slow Path 
Interface          Packets Rate          Packets Rate        In      Out
------------------------------------------------------------------------------
[crypt]                                410930913 4.3M
dp0s5            665245216 2.3M                              10       15
dp0s6           1166023120 2.9M                              53        7
dp0s7                    0 0                                  0        7
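
The totals reported by show dataplane can be cross-checked against the per-core [crypt] rates. The following Python sketch sums the rates copied from the two-tunnel outputs in Example 1 and Example 2. The helper pps is hypothetical, and it handles only the K and M suffixes that appear on this page.

```python
SUFFIX = {"K": 1e3, "M": 1e6}

def pps(rate):
    """Convert a rate string such as '796.8K' or '1.3M' to packets per second."""
    if rate[-1] in SUFFIX:
        return float(rate[:-1]) * SUFFIX[rate[-1]]
    return float(rate)

shared = ["1.3M", "1.4M", "796.8K", "796.8K"]   # Example 2, two tunnels
dedicated = ["1.3M", "1.3M", "1.4M", "1.3M"]    # Example 1, two tunnels

print(round(sum(map(pps, shared)) / 1e6, 1))     # 4.3
print(round(sum(map(pps, dedicated)) / 1e6, 1))  # 5.3
```

The shared-core sum agrees with the reported 4.3 Mpps. The dedicated-core sum is 5.3 Mpps against a reported 5.4 Mpps, a difference that is consistent with the per-core figures being rounded to one decimal place.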