About this series
Ever since I first saw VPP - the Vector Packet Processor - I have been deeply impressed with its performance and versatility. For those of us who have used Cisco IOS/XR devices, like the classic ASR (aggregation services router), VPP will look and feel quite familiar, as many of the approaches are shared between the two. One thing notably missing is the higher level control plane; that is to say, there is no OSPF or ISIS, BGP, LDP and the like. This series of posts details my work on a VPP plugin called the Linux Control Plane, or LCP for short, which creates Linux network devices that mirror their VPP dataplane counterparts. IPv4 and IPv6 traffic, and associated protocols like ARP and IPv6 Neighbor Discovery, can now be handled by Linux, while the heavy lifting of packet forwarding is done by the VPP dataplane. Or, said another way: this plugin allows Linux to use VPP as a software ASIC for fast forwarding, filtering, NAT, and so on, while keeping control of the interface state (links, addresses and routes) itself. When the plugin is completed, running software like FRR or Bird on top of VPP and achieving >100Mpps and >100Gbps forwarding rates will be well within reach!
In the previous post, I added support for VPP to consume Netlink messages that describe interfaces, IP addresses and ARP/ND neighbor changes. This post completes the table-stakes Netlink handler by adding IPv4 and IPv6 route messages, and ends up with a router in the DFZ consuming 133K IPv6 prefixes and 870K IPv4 prefixes.
My test setup
The goal of this post is to show what code needed to be written to extend the Netlink Listener plugin I wrote in the fourth post, so that it can consume route additions/deletions, a thing that is common in dynamic routing protocols such as OSPF and BGP.
The setup from my third post is still there, but it’s no longer a focal point for me. I use it (the regular interface + subints and the BondEthernet + subints) just to ensure my new code doesn’t have a regression.
Instead, I’m creating two VLAN interfaces now:
- The first is in my home network’s servers VLAN. There are three OSPF speakers there: chbtl0.ipng.ch and chbtl1.ipng.ch are my main routers; they run DANOS and are in the Default Free Zone (or DFZ for short). rr0.chbtl0.ipng.ch is one of AS50869’s three route-reflectors. Every one of the 13 routers in AS50869 exchanges BGP information with these, and it cuts down on the total amount of iBGP sessions I have to maintain – see here for details on Route Reflectors.
- The second is an L2 connection to a local BGP exchange, with only three members (IPng Networks AS50869, Openfactory AS58299, and Stucchinet AS58280). In this VLAN, Openfactory was so kind as to configure a full transit session for me, and I’ll use it in my test bench.
The test setup offers me the ability to consume OSPF, OSPFv3 and BGP.
Starting point
Based on the state of the plugin after the fourth post, operators can create VLANs (including .1q, .1ad, QinQ and QinAD subinterfaces) directly in Linux. They can change link attributes (like setting the admin state ‘up’ or ‘down’, or changing the MTU on a link), they can add/remove IP addresses, and the system will add/remove IPv4 and IPv6 neighbors. But notably, route messages are not yet consumed, as this example shows:
pim@hippo:~/src/lcpng$ sudo ip link add link e1 name servers type vlan id 101
pim@hippo:~/src/lcpng$ sudo ip link set up mtu 1500 servers
pim@hippo:~/src/lcpng$ sudo ip addr add 194.1.163.86/27 dev servers
pim@hippo:~/src/lcpng$ sudo ip ro add default via 194.1.163.65
The plugin handles the first three commands just fine, but ignores the fourth:
linux-cp/nl [debug ]: dispatch: ignored route/route: add family inet type 1 proto 3
table 254 dst 0.0.0.0/0 nexthops { gateway 194.1.163.65 idx 197 }
In this post, I’ll implement that last missing piece in two functions called lcp_nl_route_add() and lcp_nl_route_del(). Here we go!
Netlink Routes
Reusing the approach from the work-in-progress [Gerrit], I introduce two FIB sources: one for manual routes (ie. the ones that an operator might set with ip route add), and another one for dynamic routes (ie. what a routing protocol like Bird or FRR might set); this is in lcp_nl_proto_fib_source(). Next, I need a bunch of helper functions that can translate the Netlink message information into VPP primitives (a sketch of two of them follows the list):
- lcp_nl_mk_addr46() converts a Netlink nl_addr to a VPP ip46_address_t.
- lcp_nl_mk_route_prefix() converts a Netlink rtnl_route to a VPP fib_prefix_t.
- lcp_nl_mk_route_mprefix() converts a Netlink rtnl_route to a VPP mfib_prefix_t (for multicast routes).
- lcp_nl_mk_route_entry_flags() generates fib_entry_flag_t from the Netlink route type, table and proto metadata.
- lcp_nl_proto_fib_source() selects the most appropriate FIB source by looking at the rt_proto field from the Netlink message (see /etc/iproute2/rt_protos for a list of these). Anything RTPROT_STATIC or better is fib_src, while anything above that becomes fib_src_dynamic.
- lcp_nl_route_path_parse() converts a Netlink rtnl_nexthop to a VPP fib_route_path_t and adds that to a growing list of paths. Similar to Netlink’s nexthops being a list, so are the individual paths in VPP, so that lines up perfectly.
- lcp_nl_route_path_add_special() adds a blackhole/unreach/prohibit route to the list of paths, in the special case where there is not yet a path for the destination.
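To make the shape of these helpers a bit more concrete, here is a minimal sketch of two of them, assuming libnl-3 and the VPP FIB types; the exact signatures, and the fib_src / fib_src_dynamic variables, are illustrative rather than the plugin’s actual code:

```c
/* Sketch only: convert a libnl nl_addr into a VPP ip46_address_t, and pick
 * a FIB source based on the Netlink route protocol. fib_src and
 * fib_src_dynamic are assumed to be fib_source_t values registered
 * elsewhere in the plugin. */
#include <string.h>
#include <sys/socket.h>
#include <linux/rtnetlink.h>
#include <netlink/addr.h>
#include <vnet/ip/ip46_address.h>
#include <vnet/fib/fib_source.h>

extern fib_source_t fib_src;         /* manual routes, e.g. 'ip route add' */
extern fib_source_t fib_src_dynamic; /* routes from Bird, FRR, ... */

void
lcp_nl_mk_addr46 (struct nl_addr *rna, ip46_address_t *ia)
{
  ip46_address_reset (ia);
  if (nl_addr_get_family (rna) == AF_INET)
    memcpy (&ia->ip4, nl_addr_get_binary_addr (rna), sizeof (ia->ip4));
  else
    memcpy (&ia->ip6, nl_addr_get_binary_addr (rna), sizeof (ia->ip6));
}

fib_source_t
lcp_nl_proto_fib_source (uint8_t rt_proto)
{
  /* Everything at or below RTPROT_STATIC counts as 'manual'; everything
   * above it (Bird, FRR, ...) is 'dynamic'. */
  return (rt_proto <= RTPROT_STATIC) ? fib_src : fib_src_dynamic;
}
```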
With these helpers, I will have enough to manipulate VPP’s forwarding information base or FIB for short. But in VPP, the FIB consists of any number of tables (think of them as VRFs or Virtual Routing/Forwarding domains). So first, I need to add these:
- lcp_nl_table_find() selects the matching {table-id,protocol} (v4/v6) tuple from an internally kept hash of tables.
- lcp_nl_table_add_or_lock() if a table with key {table-id,protocol} (v4/v6) hasn’t been used yet, creates one in VPP and stores it for future reference. Otherwise, it increments a table reference counter so I know how many FIB entries VPP will have in this table (see the sketch after this list).
- lcp_nl_table_unlock() given a table, decreases the refcount on it, and if no more prefixes are in the table, removes it from VPP.
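The table bookkeeping might look roughly like the sketch below. The lcp_nl_table_t structure and the *_db helpers are assumptions for illustration; fib_table_find_or_create_and_lock() and fib_table_unlock() are VPP’s standard FIB table API, and I’m reusing the fib_src_dynamic source from the earlier sketch:

```c
/* Sketch: reference-counted bookkeeping around VPP's FIB table API. */
#include <vppinfra/mem.h>
#include <vnet/fib/fib_table.h>

typedef struct
{
  uint32_t nlt_id;          /* Linux table id, e.g. 254 for 'main' */
  fib_protocol_t nlt_proto; /* FIB_PROTOCOL_IP4 or FIB_PROTOCOL_IP6 */
  uint32_t nlt_fib_index;   /* VPP FIB index for this table */
  uint32_t nlt_refs;        /* number of prefixes we put in this table */
} lcp_nl_table_t;

/* Assumed to look up / store tables in a hash keyed on {table-id,protocol}. */
extern lcp_nl_table_t *lcp_nl_table_find (uint32_t id, fib_protocol_t proto);
extern void lcp_nl_table_db_add (lcp_nl_table_t *nlt);
extern void lcp_nl_table_db_del (lcp_nl_table_t *nlt);
extern fib_source_t fib_src_dynamic;

lcp_nl_table_t *
lcp_nl_table_add_or_lock (uint32_t id, fib_protocol_t proto)
{
  lcp_nl_table_t *nlt = lcp_nl_table_find (id, proto);

  if (!nlt)
    {
      nlt = clib_mem_alloc (sizeof (*nlt));
      nlt->nlt_id = id;
      nlt->nlt_proto = proto;
      nlt->nlt_refs = 0;
      /* Creates the VPP FIB table if it doesn't exist yet, and locks it. */
      nlt->nlt_fib_index =
	fib_table_find_or_create_and_lock (proto, id, fib_src_dynamic);
      lcp_nl_table_db_add (nlt);
    }
  nlt->nlt_refs++;
  return nlt;
}

void
lcp_nl_table_unlock (lcp_nl_table_t *nlt)
{
  if (--nlt->nlt_refs == 0)
    {
      /* Last prefix is gone: drop our lock so VPP can remove the table. */
      fib_table_unlock (nlt->nlt_fib_index, nlt->nlt_proto, fib_src_dynamic);
      lcp_nl_table_db_del (nlt);
      clib_mem_free (nlt);
    }
}
```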
All of this code was heavily inspired by the pending [Gerrit] but a few finishing touches were added, and wrapped up in this [commit].
Deletion
Our main function lcp_nl_route_del() will remove a route from the given table-id/protocol. I do this by applying rtnl_route_foreach_nexthop() callbacks to the list of Netlink message nexthops, converting each of them into VPP paths in a lcp_nl_route_path_parse_t structure. If the route is for unreachable/blackhole/prohibit in Linux, I add that path too. Then, I remove the VPP paths from the FIB and reduce the table’s refcount, or remove the table if it’s now empty. This is reasonably straightforward.
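A sketch of that flow is below. The lcp_nl_route_path_parse_t layout and the helper signatures are illustrative (for brevity, the table helpers here hand back a FIB index rather than the table structure from the earlier sketch); rtnl_route_foreach_nexthop() is libnl’s nexthop iterator and fib_table_entry_path_remove2() is VPP’s FIB API:

```c
/* Sketch of the deletion flow; helper names follow the text above but their
 * signatures are assumptions. */
#include <netlink/route/route.h>
#include <netlink/route/nexthop.h>
#include <vnet/fib/fib_table.h>

typedef struct
{
  fib_route_path_t *paths;    /* vector of VPP paths, one per Netlink nexthop */
  fib_protocol_t route_proto; /* FIB_PROTOCOL_IP4 or FIB_PROTOCOL_IP6 */
} lcp_nl_route_path_parse_t;

extern void lcp_nl_mk_route_prefix (struct rtnl_route *r, fib_prefix_t *p);
extern void lcp_nl_route_path_parse (struct rtnl_nexthop *nh, void *arg);
extern void lcp_nl_route_path_add_special (struct rtnl_route *r,
					   lcp_nl_route_path_parse_t *np);
extern uint32_t lcp_nl_table_find (uint32_t id, fib_protocol_t proto);
extern void lcp_nl_table_unlock (uint32_t id, fib_protocol_t proto);
extern fib_source_t lcp_nl_proto_fib_source (uint8_t rt_proto);

void
lcp_nl_route_del (struct rtnl_route *rr)
{
  lcp_nl_route_path_parse_t np = { 0 };
  fib_prefix_t pfx;
  uint32_t fib_index;

  lcp_nl_mk_route_prefix (rr, &pfx);
  np.route_proto = pfx.fp_proto;

  fib_index = lcp_nl_table_find (rtnl_route_get_table (rr), pfx.fp_proto);
  if (fib_index == ~0u)
    return; /* we never installed anything into this table */

  /* Convert each Netlink nexthop into a VPP fib_route_path_t. */
  rtnl_route_foreach_nexthop (rr, lcp_nl_route_path_parse, &np);

  /* blackhole/unreach/prohibit routes carry no nexthop at all. */
  lcp_nl_route_path_add_special (rr, &np);

  if (vec_len (np.paths) > 0)
    fib_table_entry_path_remove2 (
      fib_index, &pfx,
      lcp_nl_proto_fib_source (rtnl_route_get_protocol (rr)), np.paths);

  vec_free (np.paths);
  lcp_nl_table_unlock (rtnl_route_get_table (rr), pfx.fp_proto);
}
```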
Addition
Adding routes to the FIB is done with lcp_nl_route_add(). It immediately becomes obvious that not all routes are relevant for VPP. A prime example are those in table 255: these are ’local’ routes, which have already been set up by the IPv4 and IPv6 address addition functions in VPP. There are some other route types that are invalid, so I’ll just skip those. Link-local IPv6 and IPv6 multicast are also skipped, because they too are added when interfaces get their IP addresses configured. But for the other routes, similar to deletion, I’ll extract the paths from the Netlink message’s nexthops list, by constructing an lcp_nl_route_path_parse_t while walking those Netlink nexthops, and optionally add a special route (in case the route was for unreachable/blackhole/prohibit in Linux – those won’t have a nexthop). Then, I insert the VPP paths found in the Netlink message into the FIB or the multicast FIB, respectively.
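A sketch of the add-side filtering, under the same illustrative assumptions as the deletion sketch above (whose types and helper declarations it reuses); the multicast FIB branch is omitted for brevity:

```c
/* Sketch of lcp_nl_route_add(): filter out routes VPP doesn't need, then
 * install the remaining ones. Reuses lcp_nl_route_path_parse_t and the
 * helper declarations from the deletion sketch above. */
#include <linux/rtnetlink.h>
#include <netlink/route/route.h>
#include <vnet/fib/fib_table.h>
#include <vnet/ip/ip6_packet.h>

/* Assumed to create/lock the table on first use and return its FIB index. */
extern uint32_t lcp_nl_table_add_or_lock (uint32_t id, fib_protocol_t proto);

void
lcp_nl_route_add (struct rtnl_route *rr)
{
  lcp_nl_route_path_parse_t np = { 0 };
  fib_prefix_t pfx;
  uint32_t fib_index;
  int type = rtnl_route_get_type (rr);

  /* Table 255 holds the kernel's 'local' routes; VPP installed those
   * already when the interface addresses were configured. */
  if (rtnl_route_get_table (rr) == RT_TABLE_LOCAL)
    return;

  /* Only unicast and the 'special' route types are interesting here. */
  if (type != RTN_UNICAST && type != RTN_BLACKHOLE &&
      type != RTN_UNREACHABLE && type != RTN_PROHIBIT)
    return;

  lcp_nl_mk_route_prefix (rr, &pfx);
  np.route_proto = pfx.fp_proto;

  /* Link-local and multicast IPv6 are handled on address configuration. */
  if (pfx.fp_proto == FIB_PROTOCOL_IP6 &&
      (ip6_address_is_link_local_unicast (&pfx.fp_addr.ip6) ||
       ip6_address_is_multicast (&pfx.fp_addr.ip6)))
    return;

  fib_index = lcp_nl_table_add_or_lock (rtnl_route_get_table (rr),
					pfx.fp_proto);

  rtnl_route_foreach_nexthop (rr, lcp_nl_route_path_parse, &np);
  lcp_nl_route_path_add_special (rr, &np);

  if (vec_len (np.paths) > 0)
    fib_table_entry_path_add2 (
      fib_index, &pfx,
      lcp_nl_proto_fib_source (rtnl_route_get_protocol (rr)),
      FIB_ENTRY_FLAG_NONE, np.paths);

  vec_free (np.paths);
}
```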
Control Plane: Bird
So with this newly added code, the example above of setting a default route springs to life. But I can do better! At IPng Networks, my routing suite of choice is Bird2, and I have some code to generate configurations for it and push those configs safely to routers. So, let’s take a closer look at a configuration on the test machine running VPP + Linux CP with this new Netlink route handler.
router id 194.1.163.86;
protocol device { scan time 10; }
protocol direct { ipv4; ipv6; check link yes; }
These first two protocols are internal implementation details. The first, called device, periodically scans the network interface list in Linux, to pick up new interfaces. You can compare it to issuing ip link and acting on additions/removals as they occur. The second, called direct, generates directly connected routes for interfaces that have IPv4 or IPv6 addresses configured. It turns out that if I add 194.1.163.86/27 as an IPv4 address on an interface, it’ll generate several Netlink messages: one for the RTM_NEWADDR which I discussed in my fourth post, and also an RTM_NEWROUTE for the connected prefix, 194.1.163.64/27 in this case. The latter helps the kernel understand that if we want to send a packet to a host in that prefix, we should not send it to the default gateway, but rather to a nexthop on the device. Those are interchangeably called direct or connected routes. Ironically, these are called RTS_DEVICE routes in Bird2 [ref], even though they are generated by the direct routing protocol.
That brings me to the third protocol, one for each address type:
protocol kernel kernel4 {
ipv4 {
import all;
export where source != RTS_DEVICE;
};
}
protocol kernel kernel6 {
ipv6 {
import all;
export where source != RTS_DEVICE;
};
}
We’re asking Bird to import any route it learns from the kernel, and we’re asking it to export any route that’s not an RTS_DEVICE route. The reason for this is that when we create IPv4/IPv6 addresses, the ip command already adds the connected route, and this keeps Bird from inserting a second, identical route for those connected routes. And with that, I have a very simple view, given for example these two interfaces:
pim@hippo:~/src/lcpng$ sudo ip netns exec dataplane ip route
45.129.224.232/29 dev ixp proto kernel scope link src 45.129.224.235
194.1.163.64/27 dev servers proto kernel scope link src 194.1.163.86
pim@hippo:~/src/lcpng$ sudo ip netns exec dataplane ip -6 route
2a0e:5040:0:2::/64 dev ixp proto kernel metric 256 pref medium
2001:678:d78:3::/64 dev servers proto kernel metric 256 pref medium
pim@hippo:/etc/bird$ birdc show route
BIRD 2.0.7 ready.
Table master4:
45.129.224.232/29 unicast [direct1 20:48:55.547] * (240)
dev ixp
194.1.163.64/27 unicast [direct1 20:48:55.547] * (240)
dev servers
Table master6:
2a0e:5040:1001::/64 unicast [direct1 20:48:55.547] * (240)
dev stucchi
2001:678:d78:3::/64 unicast [direct1 20:48:55.547] * (240)
dev servers
Control Plane: OSPF
Considering the servers network above has a few OSPF speakers in it, I will introduce this router there as well. The configuration is very straightforward in Bird; let’s just add the OSPF and OSPFv3 protocols as follows:
protocol ospf v2 ospf4 {
ipv4 { export where source = RTS_DEVICE; import all; };
area 0 {
interface "lo" { stub yes; };
interface "servers" { type broadcast; cost 5; };
};
}
protocol ospf v3 ospf6 {
ipv6 { export where source = RTS_DEVICE; import all; };
area 0 {
interface "lo" { stub yes; };
interface "servers" { type broadcast; cost 5; };
};
}
Here, I tell OSPF to export all connected routes, and accept any route given to it. The only difference between IPv4 and IPv6 is that the former uses OSPF version 2 of the protocol, and IPv6 uses version 3. And, as with the kernel routing protocol above, each instance has to have its own unique name, so I make the obvious choice.
Within a few seconds, the OSPF Hello packets can be seen going out of the servers
interface,
and adjacencies form shortly thereafter:
pim@hippo:~/src/lcpng$ sudo ip netns exec dataplane ip ro | wc -l
83
pim@hippo:~/src/lcpng$ sudo ip netns exec dataplane ip -6 ro | wc -l
74
pim@hippo:~/src/lcpng$ birdc show ospf nei ospf4
BIRD 2.0.7 ready.
ospf4:
Router ID Pri State DTime Interface Router IP
194.1.163.3 1 Full/Other 39.588 servers 194.1.163.66
194.1.163.87 1 Full/DR 39.588 servers 194.1.163.87
194.1.163.4 1 Full/Other 39.588 servers 194.1.163.67
pim@hippo:~/src/lcpng$ birdc show ospf nei ospf6
BIRD 2.0.7 ready.
ospf6:
Router ID Pri State DTime Interface Router IP
194.1.163.87 1 Full/DR 32.221 servers fe80::5054:ff:feaa:2b24
194.1.163.3 1 Full/BDR 39.504 servers fe80::9e69:b4ff:fe61:7679
194.1.163.4 1 2-Way/Other 38.357 servers fe80::9e69:b4ff:fe61:a1dd
And all of these were inserted into the VPP forwarding information base, taking for example the IPng router in Amsterdam with loopback addresses 194.1.163.32 and 2001:678:d78::8:
DBGvpp# show ip fib 194.1.163.32
ipv4-VRF:0, fib_index:0, flow hash:[src dst sport dport proto flowlabel ] epoch:0 flags:none locks:[adjacency:1, recursive-resolution:1, default-route:1, lcp-rt:1, nat-hi:2, ]
194.1.163.32/32 fib:0 index:70 locks:2
lcp-rt-dynamic refs:1 src-flags:added,contributing,active,
path-list:[49] locks:142 flags:shared,popular, uPRF-list:49 len:1 itfs:[16, ]
path:[69] pl-index:49 ip4 weight=1 pref=32 attached-nexthop: oper-flags:resolved,
194.1.163.67 TenGigabitEthernet3/0/1.3
[@0]: ipv4 via 194.1.163.67 TenGigabitEthernet3/0/1.3: mtu:1500 next:5 flags:[] 9c69b461a1dd6805ca324615810000650800
forwarding: unicast-ip4-chain
[@0]: dpo-load-balance: [proto:ip4 index:72 buckets:1 uRPF:49 to:[0:0]]
[0] [@5]: ipv4 via 194.1.163.67 TenGigabitEthernet3/0/1.3: mtu:1500 next:5 flags:[] 9c69b461a1dd6805ca324615810000650800
DBGvpp# show ip6 fib 2001:678:d78::8
ipv6-VRF:0, fib_index:0, flow hash:[src dst sport dport proto flowlabel ] epoch:0 flags:none locks:[adjacency:1, default-route:1, ]
2001:678:d78::8/128 fib:0 index:130058 locks:2
lcp-rt-dynamic refs:1 src-flags:added,contributing,active,
path-list:[116] locks:220 flags:shared,popular, uPRF-list:106 len:1 itfs:[16, ]
path:[141] pl-index:116 ip6 weight=1 pref=32 attached-nexthop: oper-flags:resolved,
fe80::9e69:b4ff:fe61:a1dd TenGigabitEthernet3/0/1.3
[@0]: ipv6 via fe80::9e69:b4ff:fe61:a1dd TenGigabitEthernet3/0/1.3: mtu:1500 next:5 flags:[] 9c69b461a1dd6805ca3246158100006586dd
forwarding: unicast-ip6-chain
[@0]: dpo-load-balance: [proto:ip6 index:130060 buckets:1 uRPF:106 to:[0:0]]
[0] [@5]: ipv6 via fe80::9e69:b4ff:fe61:a1dd TenGigabitEthernet3/0/1.3: mtu:1500 next:5 flags:[] 9c69b461a1dd6805ca3246158100006586dd
In the snippet above we can see elements of the Linux CP Netlink Listener plugin doing its work. It found the right nexthop and the right interface, enabled the FIB entry, and marked it with the correct FIB source lcp-rt-dynamic. And, with OSPF and OSPFv3 now enabled, VPP has gained visibility into all of my internal network:
pim@hippo:~/src/lcpng$ traceroute nlams0.ipng.ch
traceroute to nlams0.ipng.ch (2001:678:d78::8) from 2001:678:d78:3::86, 30 hops max, 24 byte packets
1 chbtl1.ipng.ch (2001:678:d78:3::1) 0.3182 ms 0.2840 ms 0.1841 ms
2 chgtg0.ipng.ch (2001:678:d78::2:4:2) 0.5473 ms 0.6996 ms 0.6836 ms
3 chrma0.ipng.ch (2001:678:d78::2:0:1) 0.7700 ms 0.7693 ms 0.7692 ms
4 defra0.ipng.ch (2001:678:d78::7) 6.6586 ms 6.6443 ms 6.9292 ms
5 nlams0.ipng.ch (2001:678:d78::8) 12.8321 ms 12.9398 ms 12.6225 ms
Control Plane: BGP
But the holy grail, and what got me started on this whole adventure, is to be able to participate in the Default Free Zone using BGP. So let’s put these plugins to the test and load up a so-called full table, which means: all the routing information needed to reach any part of the internet. As of August 2021, there are about 870'000 such prefixes for IPv4, and about 133'000 prefixes for IPv6. We’ve passed the magic 1M number, which I’m sure makes some silicon vendors anxious, because lots of older kit in the field won’t scale beyond a certain size. VPP is totally immune to this problem, so here we go!
template bgp T_IBGP4 {
local as 50869;
neighbor as 50869;
source address 194.1.163.86;
ipv4 { import all; export none; next hop self on; };
};
protocol bgp rr4_frggh0 from T_IBGP4 { neighbor 194.1.163.140; }
protocol bgp rr4_chplo0 from T_IBGP4 { neighbor 194.1.163.148; }
protocol bgp rr4_chbtl0 from T_IBGP4 { neighbor 194.1.163.87; }
template bgp T_IBGP6 {
local as 50869;
neighbor as 50869;
source address 2001:678:d78:3::86;
ipv6 { import all; export none; next hop self ibgp; };
};
protocol bgp rr6_frggh0 from T_IBGP6 { neighbor 2001:678:d78:6::140; }
protocol bgp rr6_chplo0 from T_IBGP6 { neighbor 2001:678:d78:7::148; }
protocol bgp rr6_chbtl0 from T_IBGP6 { neighbor 2001:678:d78:3::87; }
And with these two blocks, I’ve added six new protocols – three of them are IPv4 route-reflector clients, and three of them are IPv6 ones. Once this configuration is committed, Bird will be able to find these IP addresses due to the OSPF routes being loaded into the FIB, and once it does that, each of the route-reflector servers will download a full routing table into Bird’s memory. In turn, Bird will use the kernel4 and kernel6 protocols to export them into Linux (essentially performing an ip ro add ... via ... for each), and the kernel will then generate a Netlink message, which the Linux CP Netlink Listener plugin will pick up, and the rest, as they say, is history.
I gotta tell you - the first time I saw this working end to end, I was elated. Just seeing blocks of 6800-7000 of these being pumped into VPP’s FIB every 40ms was just .. magical. And the performance is pretty good, too: 7000 messages per 40ms is 175K/sec, which means a VPP operator can not only consume, but also program into the FIB, a full IPv4 and IPv6 table in about 6 seconds, whoa!
DBGvpp#
linux-cp/nl [warn ]: process_msgs: Processed 6550 messages in 40001 usecs, 2607 left in queue
linux-cp/nl [warn ]: process_msgs: Processed 6368 messages in 40000 usecs, 7012 left in queue
linux-cp/nl [warn ]: process_msgs: Processed 6460 messages in 40001 usecs, 13163 left in queue
...
linux-cp/nl [warn ]: process_msgs: Processed 6418 messages in 40004 usecs, 93606 left in queue
linux-cp/nl [warn ]: process_msgs: Processed 6438 messages in 40002 usecs, 96944 left in queue
linux-cp/nl [warn ]: process_msgs: Processed 6575 messages in 40002 usecs, 99986 left in queue
linux-cp/nl [warn ]: process_msgs: Processed 6552 messages in 40004 usecs, 94767 left in queue
linux-cp/nl [warn ]: process_msgs: Processed 5890 messages in 40001 usecs, 88877 left in queue
linux-cp/nl [warn ]: process_msgs: Processed 6829 messages in 40003 usecs, 82048 left in queue
...
linux-cp/nl [warn ]: process_msgs: Processed 6685 messages in 40004 usecs, 13576 left in queue
linux-cp/nl [warn ]: process_msgs: Processed 6701 messages in 40003 usecs, 6893 left in queue
linux-cp/nl [warn ]: process_msgs: Processed 6579 messages in 40003 usecs, 314 left in queue
DBGvpp#
Thanks to a cooperative multitasking approach, the Netlink message queue producer continuously reads Netlink messages from the kernel and puts them in a queue, while the consumer processes at most 8000 messages or 40ms worth of work, whichever comes first, after which it yields control back to VPP. So you can see here that when the kernel is flooding the Netlink messages of the learned BGP routing table, the plugin correctly consumes what it can, the queue grows (in this case to just about 100K messages) and then quickly shrinks again.
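For illustration, the consumer side of that batching could be structured roughly like the sketch below; the queue accessors, the dispatcher and the logging call are assumptions rather than the plugin’s actual names, while vlib_time_now() is VPP’s clock:

```c
/* Sketch of the batched Netlink consumer: drain the queue, but cap the work
 * at 8000 messages or 40ms per turn before yielding back to VPP.
 * lcp_nl_queue_pop(), lcp_nl_queue_len() and lcp_nl_dispatch() are assumed
 * accessors, not the plugin's actual names. */
#include <vlib/vlib.h>
#include <netlink/object.h>

#define LCP_NL_BATCH_SIZE     8000  /* messages per turn */
#define LCP_NL_BATCH_DURATION 40e-3 /* seconds per turn */

extern struct nl_object *lcp_nl_queue_pop (void);
extern uint32_t lcp_nl_queue_len (void);
extern void lcp_nl_dispatch (struct nl_object *obj);

void
lcp_nl_process_msgs (vlib_main_t *vm)
{
  f64 start = vlib_time_now (vm);
  uint32_t n_msgs = 0;
  struct nl_object *obj;

  while ((obj = lcp_nl_queue_pop ()))
    {
      lcp_nl_dispatch (obj);
      n_msgs++;

      /* Yield once we've spent our message or time budget for this turn;
       * whatever is left in the queue is picked up on the next turn. */
      if (n_msgs >= LCP_NL_BATCH_SIZE ||
	  vlib_time_now (vm) - start > LCP_NL_BATCH_DURATION)
	break;
    }

  clib_warning ("process_msgs: Processed %u messages in %.0f usecs, %u left in queue",
		n_msgs, (vlib_time_now (vm) - start) * 1e6, lcp_nl_queue_len ());
}
```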
And indeed, Bird, IP and VPP all seem to agree, we did a good job:
pim@hippo:~/src/lcpng$ birdc show route count
BIRD 2.0.7 ready.
1741035 of 1741035 routes for 870479 networks in table master4
396518 of 396518 routes for 132479 networks in table master6
Total: 2137553 of 2137553 routes for 1002958 networks in 2 tables
pim@hippo:~/src/lcpng$ sudo ip netns exec dataplane ip -6 ro | wc -l
132430
pim@hippo:~/src/lcpng$ sudo ip netns exec dataplane ip ro | wc -l
870494
pim@hippo:~/src/lcpng$ vppctl sh ip6 fib sum | awk '$1~/[0-9]+/ { total += $2 } END { print total }'
132479
pim@hippo:~/src/lcpng$ vppctl sh ip fib sum | awk '$1~/[0-9]+/ { total += $2 } END { print total }'
870529
Results
The functional regression test I made on day one, the one that ensures end-to-end connectivity to and from the Linux host interfaces works for all 5 interface types (untagged, .1q tagged, QinQ, .1ad tagged and QinAD) and for both physical and virtual interfaces (like TenGigabitEthernet3/0/0 and BondEthernet0), still works. Great.
Here’s a screencast [asciinema, gif] showing me playing around a bit with the configuration shown above, demonstrating that RIB and FIB synchronisation works pretty well in both directions, making the combination of these two plugins sufficient to run a VPP router in the Default Free Zone. Whoohoo!
Future work
Atomic Updates - When running VPP + Linux CP in a Default Free Zone BGP environment, IPv4 and IPv6 prefixes will be constantly updated as the internet topology morphs and changes. One thing I noticed is that these updates are often deletes followed by adds with the exact same nexthop (ie. something in Germany flapped, and this is not deduplicated), which shows up as many pairs of messages like so:
linux-cp/nl [debug ]: route_del: netlink route/route: del family inet6 type 1 proto 12 table 254 dst 2a10:cc40:b03::/48 nexthops { gateway fe80::9e69:b4ff:fe61:a1dd idx 197 }
linux-cp/nl [debug ]: route_path_parse: path ip6 fe80::9e69:b4ff:fe61:a1dd, TenGigabitEthernet3/0/1.3, []
linux-cp/nl [info ]: route_del: table 254 prefix 2a10:cc40:b03::/48 flags
linux-cp/nl [debug ]: route_add: netlink route/route: add family inet6 type 1 proto 12 table 254 dst 2a10:cc40:b03::/48 nexthops { gateway fe80::9e69:b4ff:fe61:a1dd idx 197 }
linux-cp/nl [debug ]: route_path_parse: path ip6 fe80::9e69:b4ff:fe61:a1dd, TenGigabitEthernet3/0/1.3, []
linux-cp/nl [info ]: route_add: table 254 prefix 2a10:cc40:b03::/48 flags
linux-cp/nl [info ]: process_msgs: Processed 2 messages in 225 usecs
See how 2a10:cc40:b03::/48 is first removed, and then immediately reinstalled with the exact same nexthop fe80::9e69:b4ff:fe61:a1dd on interface TenGigabitEthernet3/0/1.3? Although it only takes 225µs, it’s still a bit sad to parse and create paths, just to remove them from the FIB and re-insert the exact same thing. But more importantly, if a packet destined for this prefix arrives in that 225µs window, it will be lost. So I think I’ll build a peek-ahead mechanism to capture specifically this occurrence, and let the two del+add messages cancel each other out; a rough sketch of the idea follows below.
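Purely as a thought experiment, such a peek-ahead could look something like this: before dispatching a route delete, peek at the next queued message, and if it immediately re-adds the same prefix in the same table, skip the delete and let the add refresh the existing FIB entry (a fuller version could drop both messages when the paths are identical, too). The queue peek and comparison helpers here are assumptions; nl_object_get_msgtype() is libnl’s API:

```c
/* Thought-experiment sketch of del+add coalescing in the consumer.
 * lcp_nl_queue_peek() and lcp_nl_route_same_dst() are assumed helpers. */
#include <linux/rtnetlink.h>
#include <netlink/object.h>
#include <netlink/route/route.h>

extern struct nl_object *lcp_nl_queue_peek (void); /* next msg, not popped */
extern int lcp_nl_route_same_dst (struct rtnl_route *a, struct rtnl_route *b);
extern void lcp_nl_route_del (struct rtnl_route *rr);

void
lcp_nl_dispatch_route_del (struct rtnl_route *rr)
{
  struct nl_object *next = lcp_nl_queue_peek ();

  /* If the very next message re-adds the same prefix in the same table,
   * skip this delete: the add will update the entry in place, and no
   * packets are dropped in the window between del and add. */
  if (next && nl_object_get_msgtype (next) == RTM_NEWROUTE &&
      lcp_nl_route_same_dst (rr, (struct rtnl_route *) next) &&
      rtnl_route_get_table (rr) ==
	rtnl_route_get_table ((struct rtnl_route *) next))
    return;

  lcp_nl_route_del (rr);
}
```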
Prefix updates towards lo - When writing the code, I borrowed a bunch from the pending [Gerrit], but that one has a nasty crash which was hard to debug and which I haven’t yet fully understood. When an add/del occurs for a route towards IPv6 localhost (these are typically seen when Bird shuts down eBGP sessions: if it no longer has a path to a prefix, it’ll mark it as ‘unreachable’ rather than deleting it), the result is an addition with a nexthop that has no gateway but an interface index of 1 (which, in Netlink, is ’lo’). This makes VPP intermittently crash, so I currently commented this out while I gain a better understanding. Result: blackhole/unreachable/prohibit specials can not be set using the plugin. Beware! (disabled in this [commit]).
Credits
I’d like to make clear that the Linux CP plugin is a collaboration between several great minds, and that my work stands on other software engineers’ shoulders. In particular, most of the Netlink socket handling and Netlink message queueing was written by Matthew Smith, and I’ve had a little bit of help along the way from Neale Ranns and Jon Loeliger. I’d like to thank them for their work!
Appendix
VPP config
We only use one TenGigabitEthernet device on the router, and create two VLANs on it:
IP="sudo ip netns exec dataplane ip"
vppctl set logging class linux-cp rate-limit 1000 level warn syslog-level notice
vppctl lcp create TenGigabitEthernet3/0/1 host-if e1 netns dataplane
$IP link set e1 mtu 1500 up
$IP link add link e1 name ixp type vlan id 179
$IP link set ixp mtu 1500 up
$IP addr add 45.129.224.235/29 dev ixp
$IP addr add 2a0e:5040:0:2::235/64 dev ixp
$IP link add link e1 name servers type vlan id 101
$IP link set servers mtu 1500 up
$IP addr add 194.1.163.86/27 dev servers
$IP addr add 2001:678:d78:3::86/64 dev servers
Bird config
I’m using a purposefully minimalist configuration for demonstration purposes, posted here in full for posterity:
log syslog all;
log "/var/log/bird/bird.log" { debug, trace, info, remote, warning, error, auth, fatal, bug };
router id 194.1.163.86;
protocol device { scan time 10; }
protocol direct { ipv4; ipv6; check link yes; }
protocol kernel kernel4 { ipv4 { import all; export where source != RTS_DEVICE; }; }
protocol kernel kernel6 { ipv6 { import all; export where source != RTS_DEVICE; }; }
protocol ospf v2 ospf4 {
ipv4 { export where source = RTS_DEVICE; import all; };
area 0 {
interface "lo" { stub yes; };
interface "servers" { type broadcast; cost 5; };
};
}
protocol ospf v3 ospf6 {
ipv6 { export where source = RTS_DEVICE; import all; };
area 0 {
interface "lo" { stub yes; };
interface "servers" { type broadcast; cost 5; };
};
}
template bgp T_IBGP4 {
local as 50869;
neighbor as 50869;
source address 194.1.163.86;
ipv4 { import all; export none; next hop self on; };
};
protocol bgp rr4_frggh0 from T_IBGP4 { neighbor 194.1.163.140; }
protocol bgp rr4_chplo0 from T_IBGP4 { neighbor 194.1.163.148; }
protocol bgp rr4_chbtl0 from T_IBGP4 { neighbor 194.1.163.87; }
template bgp T_IBGP6 {
local as 50869;
neighbor as 50869;
source address 2001:678:d78:3::86;
ipv6 { import all; export none; next hop self ibgp; };
};
protocol bgp rr6_frggh0 from T_IBGP6 { neighbor 2001:678:d78:6::140; }
protocol bgp rr6_chplo0 from T_IBGP6 { neighbor 2001:678:d78:7::148; }
protocol bgp rr6_chbtl0 from T_IBGP6 { neighbor 2001:678:d78:3::87; }
Final note
You may have noticed that the [commit] links are all to git commits in my private working copy. I want to wait until my previous work is reviewed and submitted before piling on more changes. Feel free to contact vpp-dev@ for more information in the mean time :-)