- Author: Pim van Pelt <firstname.lastname@example.org>
- Reviewed: Pascal Dornier <email@example.com>
- Status: Draft - Review - Approved
I did this test back in February, but can now finally publish the results! This little SBC is definitely going to be a hit in the ISP industry. See more information about it here.
PC Engines develops and sells small single board computers for networking to a worldwide customer base. This article discusses a new/unreleased product which PC Engines has developed, which has specific significance in the network operator community: an SBC which comes with three RJ45/UTP based network ports, and one SFP optical port.
Due to the use of Intel i210-IS on the SFP port and i211-AT on the three copper ports, and due to it having no moving parts (fans, hard disks, etc), this SBC is an excellent choice for network appliances such as out-of-band or serial consoles in a datacenter, or routers in a small business or home office.
The APU series boards typically ship with 2GB or 4GB of DRAM, 2, 3 or 4 Intel i211-AT network interfaces, and a four core AMD GX-412TC (running at 1GHz). This review is about the following APU6 unit, which comes with 4GB of DRAM (this preproduction unit came with 2GB, but that will be fixed in the production version), 3x i211-AT for the RJ45 network interfaces, and one i210-IS with an SFP cage.
One other significant difference is visible: the trusty rusty DB9 connector that exposes the first serial RS232 port is replaced with a modern Silicon Labs CP2104 (USB vendor:product 10c4:ea60), which exposes the serial port as TTL/serial on a micro USB connector rather than RS232. Neat!
The small form-factor pluggable (SFP) is a compact, hot-pluggable network interface module used for both telecommunication and data communications applications. An SFP interface on networking hardware is a modular slot for a media-specific transceiver in order to connect a fiber-optic cable or sometimes a copper cable. Such a slot is typically called a cage.
The SFP port accepted every optics brand and configuration I tried: copper, regular 850nm/1310nm/1550nm duplex, BiDi as commonly used in FTTH deployments, and CWDM for use behind an OADM. I tried six different vendors and module types, all successfully, regardless of vendor or brand. See the links in the table below for the output of an optical diagnostics tool (which uses the SFF-8472 standard for SFP/SFP+ management).
Each module provided link and passed traffic. The loadtest below was done with the BiDi optics in one interface and a boring RJ45 copper cable in another. It’s going to be fantastic to be able to use these APU6’s in a datacenter setting as remote / out-of-band serial devices, specifically nowadays where UTP is becoming a scarcity and everybody has fiber infrastructure in their racks.
|Vendor||Model||Type||DOM output|
|Generic||Unknown (no DOM)||850nm duplex||sfp1.txt|
|Cisco||SFP-GE-BX-D||1490nm Bidirectional (FTTH CPE)||sfp3.txt|
|Cisco||SFP-GE-BX-U||1310nm Bidirectional (FTTH COR)||sfp3.txt|
|Cisco||BT-OC24-20A||1550nm OC24 SDH||sfp4.txt|
|Finisar||FTRJ1319P1BTL-C7||1310nm 20km (w/ 6dB attenuator)||sfp5.txt|
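As an illustration of what those SFF-8472 diagnostics dumps contain, here is a minimal decoding sketch. The field offsets follow the SFF-8472 A2h (diagnostics) page layout; the sample bytes below are fabricated for the example, not read from any of the modules above:

```python
import math

def decode_dom(a2h: bytes):
    """Decode two SFF-8472 DOM fields from the A2h diagnostics page.

    Per SFF-8472: temperature is at bytes 96-97 (signed, 1/256 degC),
    RX power is at bytes 104-105 (unsigned, units of 0.1 uW).
    """
    temp_raw = int.from_bytes(a2h[96:98], "big", signed=True)
    temp_c = temp_raw / 256.0
    rx_raw = int.from_bytes(a2h[104:106], "big")
    rx_mw = rx_raw * 0.0001  # 0.1 uW -> mW
    rx_dbm = 10 * math.log10(rx_mw) if rx_mw > 0 else float("-inf")
    return temp_c, rx_dbm

# Fabricated sample page: 40.5 degC, 0.5012 mW (~ -3 dBm) RX power.
page = bytearray(256)
page[96:98] = int(40.5 * 256).to_bytes(2, "big")
page[104:106] = round(0.5012 / 0.0001).to_bytes(2, "big")
temp, rx = decode_dom(bytes(page))
print(f"temp={temp:.1f}C rx={rx:.1f}dBm")
```

Real tools decode many more fields (TX power, bias current, alarm thresholds), but they all follow this same fixed-offset scheme.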
The choice of Intel i210/i211 network controllers on this board allows operators to use Intel’s DPDK, with relatively high performance compared to regular (kernel) based routing. I loadtested Linux (Ubuntu 20.04), OpenBSD (6.8), and two lesser known but way cooler open source DPDK appliances: Danos (ref) and VPP (ref).
It is specifically worth calling out that while Linux and OpenBSD struggled, both DPDK appliances had absolutely no problems filling a bidirectional gigabit stream of “regular internet traffic” (referred to as imix), and came close to line rate with 64 byte UDP packets. The line rate of gigabit ethernet is 1.48Mpps in one direction, and my loadtests stressed both directions.
For the loadtests, I used Cisco’s T-Rex (ref) in stateless mode, with a custom Python controller that ramps traffic up and down between the loadtester and the device under test (DUT): it sends traffic out port0 to the DUT, expecting that traffic to be presented back from the DUT into its port1, and vice versa (out from port1 -> DUT -> back into port0). The loadtester first sends a few seconds of warmup traffic, to ensure the DUT is passing traffic and to offer a chance to inspect it before the actual rampup. Then the loadtester ramps up linearly from zero to 100% of line rate (in our case, one gigabit in both directions), and finally holds the traffic at full line rate for a certain duration. If at any time the loadtester fails to see the traffic it is emitting return on its second port, it flags the DUT as saturated, and this is noted as the maximum bits/second and packets/second the DUT can sustain.
```
usage: trex-loadtest.bin [-h] [-s SERVER] [-p PROFILE_FILE] [-o OUTPUT_FILE]
                         [-wm WARMUP_MULT] [-wd WARMUP_DURATION]
                         [-rt RAMPUP_TARGET] [-rd RAMPUP_DURATION]
                         [-hd HOLD_DURATION]

T-Rex Stateless Loadtester -- firstname.lastname@example.org

optional arguments:
  -h, --help            show this help message and exit
  -s SERVER, --server SERVER
                        Remote trex address (default: 127.0.0.1)
  -p PROFILE_FILE, --profile PROFILE_FILE
                        STL profile file to replay (default: imix.py)
  -o OUTPUT_FILE, --output OUTPUT_FILE
                        File to write results into, use "-" for stdout (default: -)
  -wm WARMUP_MULT, --warmup_mult WARMUP_MULT
                        During warmup, send this "mult" (default: 1kpps)
  -wd WARMUP_DURATION, --warmup_duration WARMUP_DURATION
                        Duration of warmup, in seconds (default: 30)
  -rt RAMPUP_TARGET, --rampup_target RAMPUP_TARGET
                        Target percentage of line rate to ramp up to (default: 100)
  -rd RAMPUP_DURATION, --rampup_duration RAMPUP_DURATION
                        Time to take to ramp up to target percentage of line rate,
                        in seconds (default: 600)
  -hd HOLD_DURATION, --hold_duration HOLD_DURATION
                        Time to hold the loadtest at target percentage, in seconds
                        (default: 30)
```
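The warmup/rampup/hold logic can be sketched roughly as follows. This is a simplified stand-in, not the controller's actual code: `measure_rx_rate` is a hypothetical stub simulating a DUT that saturates at 80% of line rate, where the real controller talks to T-Rex's stateless API instead:

```python
# Simplified sketch of the loadtest state machine described above.
# make_dut() builds a stand-in for the real T-Rex measurement calls;
# here it simulates a DUT that saturates at 80% of line rate.
def make_dut(saturation=0.80):
    def measure_rx_rate(tx_fraction):
        return min(tx_fraction, saturation)
    return measure_rx_rate

def loadtest(measure_rx_rate, rampup_steps=100, loss_threshold=0.999):
    """Ramp TX linearly from 0 to 100% of line rate; flag saturation
    as soon as the DUT returns less traffic than it was sent."""
    for step in range(1, rampup_steps + 1):
        tx = step / rampup_steps
        rx = measure_rx_rate(tx)
        if rx < tx * loss_threshold:
            return tx  # saturation point, as a fraction of line rate
    return 1.0         # DUT held full line rate

sat = loadtest(make_dut(0.80))
print(f"saturated at {sat:.0%} of line rate")
```

The small `loss_threshold` headroom keeps measurement jitter from flagging a healthy DUT as saturated one step too early.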
It’s worth pointing out that almost all systems are pps-bound, not bps-bound. A typical rant of mine is that network vendors are imprecise when they specify throughput: “up to 40Gbit” more often than not means “under carefully crafted conditions”. Such conditions include utilizing jumboframes (9216 bytes rather than the usual 1500 byte MTU found on ethernet), which is easier on the router than a typical internet mixture (closer to 1100 bytes), and much easier yet than forwarding 64 byte packets, for instance in a DDoS attack. They also often mean only one direction, and only exactly one source/destination IP address/port, which is quite a bit easier than looking up each destination in a forwarding table containing 1M destinations; for context, a current internet backbone router carries ~845K IPv4 destinations and ~105K IPv6 destinations.
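The pps/bps relationship is simple arithmetic: on the wire, each Ethernet frame carries 20 bytes of overhead (7 bytes of preamble, 1 start-of-frame delimiter, and a 12 byte inter-frame gap) on top of the frame itself. A quick check of the 1.48Mpps figure for 64 byte frames, and of what the same gigabit means at other frame sizes:

```python
def line_rate_pps(frame_bytes, link_bps=1_000_000_000):
    # Per-frame wire overhead: 7B preamble + 1B SFD + 12B IFG = 20B.
    return link_bps / ((frame_bytes + 20) * 8)

for size in (64, 1100, 1500, 9216):
    print(f"{size:5d}B frames: {line_rate_pps(size) / 1e6:.3f} Mpps")
```

A gigabit of 64 byte frames is ~1.488 Mpps of lookups, while the same gigabit of jumboframes is only ~0.014 Mpps, which is why jumboframe benchmarks flatter a router so much.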
|Product||Loadtest||Throughput (pps)||Throughput (bps)||% of linerate||Details|
|Linux||imix||150.21 Kpps||452.81 Mbps||45.28%||apu6-linux-imix.json|
|OpenBSD||imix||145.52 Kpps||444.51 Mbps||44.45%||apu6-openbsd-imix.json|
|VPP||imix||654.40 Kpps||2.00 Gbps||199.90%||apu6-vpp-imix.json|
|Danos||imix||655.53 Kpps||2.00 Gbps||200.24%||apu6-danos-imix.json|
|Linux||64b||96.93 Kpps||65.14 Mbps||6.51%||apu6-linux-64b.json|
|OpenBSD||64b||152.09 Kpps||102.20 Mbps||10.22%||apu6-openbsd-64b.json|
|VPP||64b||1.78 Mpps||1.19 Gbps||119.49%||apu6-vpp-64b.json|
|Danos||64b||2.30 Mpps||1.55 Gbps||154.62%||apu6-danos-64b.json|
For more information on the methodology and the scripts that drew these graphs, take a look at my buddy Michal’s GitHub Page, which, given time, will probably turn into its own subsection of this website (I can only imagine the value of a corpus of loadtests of popular equipment in the consumer arena).
The unit was shipped to me free of charge by PC Engines for the purposes of load- and systems integration testing. Other than that, this is not a paid endorsement and views of this review are my own.
Considering the target audience, I wonder if there is a possibility to break out the I2C pins from the SFP cage into a header on the board, so that users can connect them through to the CPU’s I2C controller (or bitbang directly on GPIO pins), and use the APU6 as an SFP flasher. I think that would come in incredibly handy in a datacenter setting.
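Flashing an SFP over I2C means rewriting fields in its A0h EEPROM page, and SFF-8472 protects that page with two simple checksums: CC_BASE at byte 63 covers bytes 0-62, and CC_EXT at byte 95 covers bytes 64-94, each being the low 8 bits of the byte sum. A minimal sketch of recomputing them after an edit (the vendor string below is made up for the example):

```python
def fix_checksums(a0h: bytearray) -> bytearray:
    """Recompute SFF-8472 A0h page checksums after editing fields.

    CC_BASE (byte 63) = sum of bytes 0..62, modulo 256.
    CC_EXT  (byte 95) = sum of bytes 64..94, modulo 256.
    """
    a0h[63] = sum(a0h[0:63]) & 0xFF
    a0h[95] = sum(a0h[64:95]) & 0xFF
    return a0h

# Example: write a (made-up) 16-byte vendor name field and re-checksum.
page = bytearray(128)
page[20:36] = b"ACME OPTICS     "
fix_checksums(page)
print(hex(page[63]), hex(page[95]))
```

Any SFP flasher, bit-banged over GPIO or otherwise, would need this step, since many NIC drivers reject modules whose checksums don't verify.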
The DPDK based router implementations are CPU bound, and could benefit from a little bit more power. I am duly impressed by the throughput seen in terms of packets/sec/watt, but considering that a typical router needs a (configuration) controlplane as well as a (forwarding) dataplane, we are short about 30% of CPU cycles. If a controlplane (like Bird or FRR (ref)) is dedicated one core, that leaves three cores for forwarding, with which we obtain roughly 154% of linerate; we would need a factor of 200/154 == 1.30 more to obtain line rate in both directions. That said, the APU6 has absolutely no problems saturating a gigabit in both directions under normal (==imix) conditions.
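The core-budget arithmetic, spelled out (assuming throughput scales roughly linearly with forwarding cores, which is an assumption, not a measurement):

```python
# Measured (Danos, 64b test): three forwarding cores reach ~154% of
# unidirectional line rate; full duplex needs 200%.
achieved = 154.0  # % of line rate, with one core reserved for controlplane
target = 200.0    # % of line rate == one gigabit in both directions
speedup_needed = target / achieved
print(f"needed speedup: {speedup_needed:.2f}x")
```

So a CPU roughly 30% faster, or one extra forwarding core, would close the gap for the worst-case 64 byte workload.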
Appendix 1 - Terminology
|OADM||optical add drop multiplexer – a device used in wavelength-division multiplexing systems for multiplexing and routing different channels of light into or out of a single mode fiber (SMF)|
|ONT||optical network terminal - The ONT converts fiber-optic light signals to copper based electric signals, usually Ethernet.|
|OTO||optical telecommunication outlet - The OTO is a fiber optic outlet that allows easy termination of cables in an office and home environment. Installed OTOs are referred to by their OTO-ID.|
|CARP||common address redundancy protocol - Its purpose is to allow multiple hosts on the same network segment to share an IP address. CARP is a secure, free alternative to the Virtual Router Redundancy Protocol (VRRP) and the Hot Standby Router Protocol (HSRP).|
|SIT||simple internet transition - Its purpose is to interconnect isolated IPv6 networks across the global IPv4 Internet via tunnels.|
|STB||set top box - a device that enables a television set to become a user interface to the Internet and also enables a television set to receive and decode digital television (DTV) broadcasts.|
|GRE||generic routing encapsulation - a tunneling protocol developed by Cisco Systems that can encapsulate a wide variety of network layer protocols inside virtual point-to-point links over an Internet Protocol network.|
|L2VPN||layer2 virtual private network - a service that emulates a switched Ethernet (V)LAN across a pseudo-wire (typically an IP tunnel)|
|DHCP||dynamic host configuration protocol - an IPv4 network protocol that enables a server to automatically assign an IP address to a computer from a defined range of numbers.|
|DHCP6-PD||Dynamic host configuration protocol: prefix delegation - an IPv6 network protocol that enables a server to automatically assign network prefixes to a customer from a defined range of numbers.|
|NDP NS/NA||neighbor discovery protocol: neighbor solicitation / advertisement - an ipv6 specific protocol to discover and judge reachability of other nodes on a shared link.|
|NDP RS/RA||neighbor discovery protocol: router solicitation / advertisement - an ipv6 specific protocol to discover and install local address and gateway information.|
|SBC||single board computer - a complete computer with all peripherals and components directly attached to the board.|