Networking sessions in Red Hat Summit 2016

I recently attended Red Hat Summit 2016, which took place in San Francisco, CA, on June 27-30. Red Hat Summit is a great place to interact with customers, partners, and product leads, and learn about Red Hat and the company’s direction.

While Red Hat is still mostly known for its Enterprise Linux (RHEL) business, it also offers products and solutions in the cloud computing, virtualization, middleware, storage, and systems management spaces. And networking is really a key piece in all of these.

In this short post I want to highlight a few networking-related sessions presented during the event. While video recordings are not available, slide decks can be downloaded in PDF format (links included below).

  • Software-defined networking (SDN) fundamentals for NFV, OpenStack, and containers
    • Session overview: With software-defined networking (SDN) gaining traction, administrators are faced with technologies that they need to integrate into their infrastructure. Red Hat Enterprise Linux offers a robust foundation for SDN implementations that are based on open source, standards-based technologies and designed for deploying containers, OpenStack, and network function virtualization (NFV). We’ll dissect the technology stack involved in SDN and introduce the latest Red Hat Enterprise Linux options designed to address the packet processing requirements of virtual network functions (VNFs), such as Open vSwitch (OVS), single root I/O virtualization (SR-IOV), PCI Passthrough, and DPDK-accelerated OVS.
    • Slides

————————————————————————-

  • Use Linux on your whole rack with RDO and open networking
    • Session overview: OpenStack networking is never easy–each new release presents new challenges that are hard to keep up with. Come see how open networking using Linux can help simplify and standardize your RDO deployment. We will demonstrate spine/leaf topology basics, Layer-2 and Layer-3 trade-offs, and building your deployment in a virtual staging environment–all in Linux. Let us demystify your network.
    • Slides

————————————————————————-

  • Extending full stack automation to the physical network
    • Session overview: In this session, we’ll talk about the unique operational challenges facing organizations considering how to encompass the physical network infrastructure when implementing agile practices. We’ll focus on the technical and cultural challenges facing this transition, including how Ansible is uniquely architected to serve as the right foundational framework for powering this change. We’ll touch on why it’s more important than ever that organizations embrace the introduction of new automated orchestration capabilities and start moving away from traditional command-and-control network device administration being done hop by hop. You’ll see some of the theories in action and touch on expanding configuration automation to include elements of state validation of configuration changes. Finally, we’ll touch on the changing role of network engineering and operations teams and why their expertise is needed now more than ever to lead this transition.
    • Slides

————————————————————————-

  • Telco breakout: Reliability, availability, and serviceability at cloud scale
    • Session overview: Many operators are faced with fierce market competition that is attracting their customers with personalized alternatives. Technologies, like SDN, NFV, and 5G, hold the key to adapting to the networks of the future. However, operators are also looking to ensure that they can continue to offer the service-level guarantees their customers expect. With the advent of cloud-based service infrastructures, building secure, fault-tolerant, and reliable networks that deliver five nines (99.999%) service availability in the same way they have done for years has become untenable. The goal of zero downtime is still the same, as every hour of it is costly to service providers and their customers. As we continually move to new levels of scale, service providers and their customers expect that infrastructure failures will occur and are proactively changing their development and operational strategies. This session will explore these industry challenges and how service providers are applying new technologies and approaches to achieve reliability, availability, and serviceability at cloud scale. Service providers and vendors will join us to share their views on this complex topic and explain how they are applying and balancing the use of open source innovations, resilient service and application software, automation, DevOps, service assurance, and analytics to add value for their customers and business partners.
    • Slides

————————————————————————-

  • Red Hat Enterprise Linux roadmap
    • Session overview: Red Hat Enterprise Linux is the premier Linux distribution, known for reliability, security, and performance. Red Hat Enterprise Linux is also the underpinning of Red Hat’s efforts in containers, virtualization, Internet of Things (IoT), network function virtualization (NFV), Red Hat Enterprise Linux OpenStack Platform, and more. Learn what’s new and emerging in this powerful operating system, and how new function and capability can help in your environment.
    • Slides

————————————————————————-

  • Repeatable, reliable OpenStack deployments: Pipe dream or reality?
    • Session overview: Deploying OpenStack is an involved, complicated, and error-prone process, especially when deploying a physical Highly Available (HA) cluster with other software and hardware components, like Ceph. Difficulties include everything from hardware selection to the actual deployment process. Dell and Red Hat have partnered together to produce a solution based on Red Hat Enterprise Linux OSP Director that streamlines the entire process of setting up an HA OpenStack cluster. This solution includes a jointly developed reference architecture that includes hardware, simplified Director installation and configuration, Ceph storage backed by multiple back ends including Dell SC and PS series storage arrays, and other enterprise features–such as VM instance HA and networking segregation flexibility. In this session, you’ll learn how this solution drastically simplifies standing up an OpenStack cloud.
    • Slides

————————————————————————-

  • Running a policy-based cloud with Cisco Application Centric Infrastructure, Red Hat OpenStack, and Project Contiv
    • Session overview: Infrastructure managers are constantly asked to push the envelope in how they deliver cloud environments. In addition to speed, scale, and flexibility, they are increasingly focused on both security and operational management and visibility as adoption increases within their organizations. This presentation will look at ways Cisco and Red Hat are partnering together to deliver policy-based cloud solutions to address these growing challenges. We will discuss how we are collaborating in the open source community and building products based on this collaboration. It will cover topics including:
      • Group-Based Policy for OpenStack
      • Cisco Application Centric Infrastructure (ACI) with Red Hat OpenStack
      • Project Contiv and its integration with Cisco ACI
    • Slides

NFV and Open Networking with RHEL OpenStack Platform

(This is a summary version of a talk I gave at Intel Israel Telecom and NFV event on December 2nd, 2015. Slides are available here)

I was honored to be invited to speak at a local Intel event about Red Hat and what we are doing in the NFV space. I only had 30 minutes, so I tried to provide a high-level overview of our offering, covering some main points:

  • Upstream first approach and why we believe it is a fundamental piece in the NFV journey; this is not a marketing pitch but really how we deliver our entire product portfolio
  • NFV and OpenStack; I touched on the fact that many service providers are asking for OpenStack-based solutions, and that OpenStack is the de facto choice for the VIM. That said, there are some limitations today (both cultural and technical) with OpenStack, and clearly we have a way to go to make it a better engine for telco needs
  • Full open source approach to NFV; it’s not just OpenStack but also other key projects such as QEMU/KVM, Open vSwitch, DPDK, libvirt, and the underlying Linux operating system. It’s hard to coordinate across these different communities, but this is what we are trying to do, with active participants on all of those
  • Red Hat product focus and alignment with OPNFV
  • Main use-cases we see in the market (atomic VNFs, vCPE, vEPC) with a design example of vPGW using SR-IOV
  • What telco and NFV specific features were introduced in RHEL OpenStack Platform 7 (Kilo) and what is planned for OpenStack Platform 8 (Liberty); as a VIM provider we want to offer our customers and the Network Equipment Providers (NEPs) maximum flexibility for packet processing options with PCI Passthrough, SR-IOV, Open vSwitch and DPDK-accelerated Open vSwitch based solutions.

Thanks to Intel Israel for a very interesting and well-organized event!


LLDP traffic and Linux bridges

In my previous post I described my Cumulus VX lab environment, which is based on Fedora and KVM. One of the first things I noticed after bringing up the setup is that although I have L3 connectivity between the emulated Cumulus switches, I can’t get LLDP to operate properly between the devices.

For example, a basic ICMP ping between the directly connected interfaces of leaf1 and spine3 is successful, but no LLDP neighbor shows up:

cumulus@leaf1$ ping 13.0.0.3
PING 13.0.0.3 (13.0.0.3) 56(84) bytes of data.
64 bytes from 13.0.0.3: icmp_req=1 ttl=64 time=0.210 ms
64 bytes from 13.0.0.3: icmp_req=2 ttl=64 time=0.660 ms
64 bytes from 13.0.0.3: icmp_req=3 ttl=64 time=0.635 ms
cumulus@leaf1$ lldpcli show neighbors 

LLDP neighbors:
-------------------------------------

Reading through the Cumulus Networks documentation, I discovered that LLDP is turned on by default on all active interfaces. It is possible to tweak things, such as timers, but the basic neighbor discovery functionality should be there by default.

Looking at the output of lldpcli show statistics, I also discovered that LLDP messages are being sent out of the interfaces, but never received:

cumulus@leaf1$ lldpcli show statistics 

Interface:    eth0
  Transmitted:  11
  Received:     0
  Discarded:    0
  Unrecognized: 0
  Ageout:       0
  Inserted:     0
  Deleted:      0

Interface:    swp1
  Transmitted:  11
  Received:     0
  Discarded:    0
  Unrecognized: 0
  Ageout:       0
  Inserted:     0
  Deleted:      0

Interface:    swp2
  Transmitted:  11
  Received:     0
  Discarded:    0
  Unrecognized: 0
  Ageout:       0
  Inserted:     0
  Deleted:      0

So what’s going on?

Remember that leaf1 and spine3 are not really directly connected. They are bridged together using a Linux bridge device.

This is where I discovered that by design, Linux bridges silently drop LLDP messages (sent to the LLDP_Multicast address 01-80-C2-00-00-0E) and other control frames in the 01-80-C2-00-00-xx range.

An explanation can be found in the 802.1AB standard, which states that “the destination address shall be 01-80-C2-00-00-0E. This address is within the range reserved by IEEE Std 802.1D-2004 for protocols constrained to an individual LAN, and ensures that the LLDPDU will not be forwarded by MAC Bridges that conform to IEEE Std 802.1D-2004.”

It is possible to change this behavior on a per-bridge basis, though, by using:

# echo 16384 > /sys/class/net/<bridge_name>/bridge/group_fwd_mask
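
Where does the magic number 16384 come from? group_fwd_mask is a bitmask in which bit N allows the bridge to forward frames destined to 01-80-C2-00-00-0N. LLDP uses 01-80-C2-00-00-0E, so bit 14 (0x0E) has to be set, which a quick shell check confirms:

```shell
# group_fwd_mask: bit N enables forwarding of frames sent to 01-80-C2-00-00-0N.
# LLDP's destination address ends in 0E, so we need bit 14 set:
printf '%d\n' $((1 << 0x0E))
# prints 16384
```

Note that the kernel refuses to set the bits for addresses 00, 01, and 02 (STP, MAC PAUSE, and LACP) in this mask, so those frames are always handled locally.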

Retesting with leaf1 and spine3:

# echo 16384 > /sys/class/net/virbr1/bridge/group_fwd_mask
cumulus@leaf1$ lldpcli show neighbor
LLDP neighbors:

Interface:    swp1, via: LLDP, RID: 1, Time: 0 day, 00:00:02  
  Chassis:     
    ChassisID:    mac 00:00:00:00:00:33
    SysName:      spine3
    SysDescr:     Cumulus Linux version 2.5.5 running on  QEMU Standard PC (i440FX + PIIX, 1996)
    MgmtIP:       3.3.3.3
    Capability:   Bridge, off
    Capability:   Router, on
  Port:        
    PortID:       ifname swp1
    PortDescr:    swp1
cumulus@leaf1$ lldpcli show statistics 

Interface:      eth0
  Transmitted:  117
  Received:     0
  Discarded:    0
  Unrecognized: 0
  Ageout:       0
  Inserted:     0
  Deleted:      0

Interface:      swp1
  Transmitted:  117
  Received:     72
  Discarded:    0
  Unrecognized: 0
  Ageout:       0
  Inserted:     1
  Deleted:      0

Interface:      swp2
  Transmitted:  117
  Received:     0
  Discarded:    0
  Unrecognized: 0
  Ageout:       0
  Inserted:     0
  Deleted:      0


LLDP now operates as expected between leaf1 and spine3. Remember that this is a per-bridge setting, so in order to get this fixed across the entire setup, the command needs to be issued for the rest of the bridges (virbr2, virbr3, virbr4) as well.
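
A small loop (using the bridge names from this lab; run as root) takes care of all four bridges at once:

```shell
# Enable LLDP forwarding on all four fabric bridges in the lab (run as root)
for br in virbr1 virbr2 virbr3 virbr4; do
    echo 16384 > /sys/class/net/$br/bridge/group_fwd_mask
done
```

Like any sysfs setting, this does not persist across host reboots, so you may want to reapply it from a boot-time script.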

Hands on with Fedora, KVM and Cumulus VX

Cumulus Linux is a network operating system based on Debian that runs on top of industry standard networking hardware. By providing a software-only solution, Cumulus is enabling disaggregation of data center switches similar to the x86 server hardware/software disaggregation. In addition to the networking features you would expect from a network operating system like L2 bridging, Spanning Tree Protocol, LLDP, bonding/LAG, L3 routing, and so on, it enables users to take advantage of the latest Linux applications and automation tools, which is in my opinion its true power.

Cumulus VX is a community-supported virtual appliance that enables network engineers to preview and test Cumulus Networks technology. The appliance is available in different formats (for VMware, VirtualBox, KVM, and Vagrant environments), and since I am running Fedora on my laptop the easiest thing for me was to use the KVM qcow2 image to try it out.

My goal is to build a four node leaf/spine topology. To form the fabric, each leaf will be connected to each spine, so we will end up with two “fabric facing” interfaces on each switch. In addition, I want to have a separate management interface on each device I can use for SSH access as well as automation purposes (Ansible being an immediate suspect), and a loopback interface to be used as the router-id.

[Figure: base leaf/spine topology]

Prerequisites

  • Install KVM and related virtualization packages. I am running Fedora 22 and used yum groupinstall “Virtualization*” to obtain the latest versions of libvirt, virt-manager, qemu-kvm and associated dependencies.
  • From the Virtual Machine Manager, create four basic isolated networks (without IP, DHCP or NAT settings). Those will serve as transport for the point-to-point links between our switches. I named them as follows:
    • net1
    • net2
    • net3
    • net4
  • Download the KVM qcow2 image from the Cumulus website. At the time of writing the image is based on Cumulus Linux v2.5.5. You will want to copy it four times and name the copies as follows:
    • leaf1.qcow2
    • leaf2.qcow2
    • spine3.qcow2
    • spine4.qcow2

Creating the VMs

While creating each VM you will need to specify the network settings: which interfaces you want created, which networks they should be part of, and what their L2 (MAC) information is. To ease troubleshooting, I came up with my own convention for the interface MAC addresses.

Leaf1:

  • Leaf1 should have three interfaces:
    • One belonging to the “default” network – a network created by virt-manager with DHCP and NAT enabled, which will be used for management access.
    • One belonging to net1, which is going to be used for the connection between leaf1 and spine3. Behind the scenes, virt-manager created a Linux bridge for this network.
    • One belonging to net2, which is going to be used for the connection between leaf1 and spine4. Behind the scenes, virt-manager created a Linux bridge for this network.
  • Make sure to adjust the path to specify the location of the image.
sudo virt-install --os-variant=generic --ram=256 --vcpus=1 --network=default,model=virtio,mac=00:00:00:00:00:11 --network network=net1,model=virtio,mac=00:00:01:00:00:13 --network network=net2,model=virtio,mac=00:00:01:00:00:14 --boot hd --disk path=/home/nyechiel/Downloads/VX/leaf1.qcow2,format=qcow2 --name=leaf1

Leaf2:

  • Leaf2 should have three interfaces:
    • One belonging to the “default” network – a network created by virt-manager with DHCP and NAT enabled, which will be used for management access.
    • One belonging to net3, which is going to be used for the connection between leaf2 and spine3. Behind the scenes, virt-manager created a Linux bridge for this network.
    • One belonging to net4, which is going to be used for the connection between leaf2 and spine4. Behind the scenes, virt-manager created a Linux bridge for this network.
  • Make sure to adjust the path to specify the location of the image.
sudo virt-install --os-variant=generic --ram=256 --vcpus=1 --network=default,model=virtio,mac=00:00:00:00:00:22 --network network=net3,model=virtio,mac=00:00:02:00:00:23 --network network=net4,model=virtio,mac=00:00:02:00:00:24 --boot hd --disk path=/home/nyechiel/Downloads/VX/leaf2.qcow2,format=qcow2 --name=leaf2

Spine3:

  • Spine3 should have three interfaces:
    • One belonging to the “default” network – a network created by virt-manager with DHCP and NAT enabled, which will be used for management access.
    • One belonging to net1, which is going to be used for the connection between leaf1 and spine3. Behind the scenes, virt-manager created a Linux bridge for this network.
    • One belonging to net3, which is going to be used for the connection between leaf2 and spine3. Behind the scenes, virt-manager created a Linux bridge for this network.
  • Make sure to adjust the path to specify the location of the image.
sudo virt-install --os-variant=generic --ram=256 --vcpus=1 --network=default,model=virtio,mac=00:00:00:00:00:33 --network network=net1,model=virtio,mac=00:00:03:00:00:31 --network network=net3,model=virtio,mac=00:00:03:00:00:32 --boot hd --disk path=/home/nyechiel/Downloads/VX/spine3.qcow2,format=qcow2 --name=spine3

Spine4:

  • Spine4 should have three interfaces:
    • One belonging to the “default” network – a network created by virt-manager with DHCP and NAT enabled, which will be used for management access.
    • One belonging to net2, which is going to be used for the connection between leaf1 and spine4. Behind the scenes, virt-manager created a Linux bridge for this network.
    • One belonging to net4, which is going to be used for the connection between leaf2 and spine4. Behind the scenes, virt-manager created a Linux bridge for this network.
  • Make sure to adjust the path to specify the location of the image.
sudo virt-install --os-variant=generic --ram=256 --vcpus=1 --network=default,model=virtio,mac=00:00:00:00:00:44 --network network=net2,model=virtio,mac=00:00:04:00:00:41 --network network=net4,model=virtio,mac=00:00:04:00:00:42 --boot hd --disk path=/home/nyechiel/Downloads/VX/spine4.qcow2,format=qcow2 --name=spine4

Verifying the hypervisor topology

Before we log in to any of the newly created VMs, I would first like to verify the configuration and make sure that we have the right connectivity in place. Using ifconfig on my Fedora system and looking at the MAC addresses, I correlated the Linux bridges created by virt-manager (virbr0, virbr1, virbr2, virbr3, virbr4) with the virtual Ethernet (vnet) devices. This gives me the hypervisor’s point of view and is going to be really useful for troubleshooting purposes. I came up with this topology:

[Figure: hypervisor view of the topology]

Useful commands to use here are brctl show and brctl showmacs. For example, let’s examine the link between leaf1 and spine3 (note that libvirt based the MAC on the configured guest MAC address with high byte set to 0xFE):

$ ip link show vnet1 | grep link
   link/ether fe:00:01:00:00:13 brd ff:ff:ff:ff:ff:ff
$ ip link show vnet10 | grep link
   link/ether fe:00:03:00:00:31 brd ff:ff:ff:ff:ff:ff
$ brctl show virbr1
bridge name     bridge id               STP enabled     interfaces
virbr1          8000.525400d32feb       yes             virbr1-nic
                                                        vnet1
                                                        vnet10
$ brctl showmacs virbr1
port no mac addr                is local?       ageing timer
  2     fe:00:01:00:00:13       yes             18.34
  3     fe:00:03:00:00:31       yes             24.61
  1     52:54:00:d3:2f:eb       yes              0.00
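
Given that convention, the host-side (vnet) MAC can be derived from any configured guest MAC by swapping the first octet for fe. A small sanity check, using leaf1’s net1 address from the virt-install command above:

```shell
# libvirt derives the tap device MAC by replacing the guest MAC's first octet with fe
guest_mac=00:00:01:00:00:13
echo "$guest_mac" | sed 's/^../fe/'
# prints fe:00:01:00:00:13
```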

Verifying the fabric topology

Now that we have the basic networking setup between the VMs and we understand the topology, we can jump into the switches and confirm their view. The switches can be accessed with the username “cumulus” and the password “CumulusLinux!”. This is also the password for root.

Using console access to the VMs and the ifconfig command we can learn a couple of things:

  1. eth0 is the base interface on each switch, used for management purposes. It picked up an address from the 192.168.122.0/24 range, which is what virt-manager used to set up the “default” network. SSH to this address is enabled by default on the standard TCP port 22.
  2. The “fabric” interfaces are swp1 and swp2.

Based on this information we can build up our final topology, which is a representation of the actual fabric:

[Figure: fabric topology]

Now what?

Now that we have the basic topology setup and the right diagrams to support us, we can go on and configure things. Cumulus has a good level of documentation, so I will let you take it from here. You can configure things manually using the CLI (which is really a bash system with standard Linux commands) or use automation tools to control the switch.

Using the CLI and following the documentation, it was pretty straightforward for me to configure hostnames and IP addresses and bring up OSPF and BFD (using Quagga) between the switches. Next I plan to play with the more advanced stuff (personally I want to test out BGP and IPv6 configurations), and try to automate things using Ansible. Happy testing!


Reflections on the networking industry, part 2: On CLI, APIs and SNMP

In the previous post I briefly described the fact that many networks today are closed and vertically designed. While standard protocols are being adopted by vendors, true interoperability is still a challenge. Sure, you can bring up a BGP peer between platforms from different vendors and exchange route information (otherwise we couldn’t scale the Internet), but management and configuration is still, in most cases, vendor specific.

Every network engineer out there has got to respect the CLI. We sometimes love it and sometimes hate it, but we all tend to master it. It remains the glorious way of interacting with a network device, even in 2015. Some common properties of CLIs are:

  1. They are vendor, and sometimes even device, specific;
  2. They are not standardized; there is no standard for setting up the data or for displaying the text;
  3. They don’t have a strict notion of versioning or guarantee backward compatibility;
  4. They can change between software releases.

All of the above make CLIs an acceptable solution up to a certain scale. With large-scale networks, automation is a key part and usually mandatory. But given the properties mentioned above, automating network device configuration based on CLI commands isn’t a trivial task.
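
To make that concrete, here is a toy illustration. Both “show interface” lines below are invented for two imaginary vendors, but they describe the same interface state:

```shell
# Two hypothetical CLI output lines describing the same fact, spelled
# differently by two imaginary vendors:
vendor_a="GigabitEthernet0/1 is up, line protocol is up"
vendor_b="Interface ge-0/0/1, Enabled, Physical link is Up"

# A pattern written against vendor A's wording silently misses vendor B:
for line in "$vendor_a" "$vendor_b"; do
    case "$line" in
        *"line protocol is up"*) echo "parsed: interface is up" ;;
        *)                       echo "missed: pattern not found" ;;
    esac
done
```

Every vendor, and sometimes every software release, needs its own parser, which is exactly why screen-scraping CLIs does not scale.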

Today, you can see more and more vendors supporting other protocols, such as NETCONF or REST, for interacting with their devices. The impression is that you suddenly have a proper API and a standard method to communicate with the devices. The reality is that with such protocols you do have a standard transport for interacting with a device, but you still do not have a standard API: each device/vendor still represents data differently, as brilliantly described by Jason Edelman in this blog post.

We, as an industry, must agree on a standard way of representing network data. No more vendor-specific implementations, but true, open models. The last major try was SNMP, the Simple Network Management Protocol, which is anything but simple. Most people just turn it off, or use it to capture (read: poll) very basic information from a device. Anything more complex than that, not to mention device configuration, requires the installation of vendor-specific MIBs – and we are back to the same problem.

Reflections on the networking industry, part 1: Welcome to vendor land

I have been involved with networking for quite some time now; I have had the opportunity to design, implement and operate different networks across different environments such as enterprise, data-center, and service provider – which inspired me to create this series of short blog posts exploring the computer networking industry. My view on the history, challenges, hype and reality, and most importantly – what’s next and how we can do better.

Part 1: Welcome to vendor land

Protocols and standards were always a key part of networking and were born out of necessity: we need different systems to be able to talk to each other.

The modern networking suite is built around Ethernet and the TCP/IP stack, including TCP, UDP, and ICMP – all riding on top of IPv4 or IPv6. There is a general consensus that Ethernet and TCP/IP won the race against the other alternatives. This is great, right? Well, the problem is not with Ethernet or the TCP/IP stack, but with their “ecosystem”: a long list of complementary technologies and protocols.

Getting the industry to agree on the base layer 2, layer 3 and layer 4 protocols and their header format was indeed a big thing, but we kind of stopped there. Say you have got a standards-based Ethernet link. How would you bring it up and negotiate its speed? And what about monitoring, loop prevention, or neighbor discovery? Except for the very basic, common denominator functionality, vendors came out with their own set of proprietary protocols for solving these issues. Just off the top of my head: ISL, VTP, DTP, UDLD, PAgP, CDP, and PVST are all examples of the “Ethernet ecosystem” from one (!) vendor.

True, you can find standard alternatives for the mentioned protocols today. Vendors are embracing open standards and tend to replace their proprietary implementation with a standard one if available. But why not start with the standard one to begin with?

If you think that these are just historical examples from a different era, think again. Even in the 2010s, more and more protocols are being developed and/or adopted by single vendors only. I usually like to point out MC-LAG as an example of a fairly recent and very common architecture with no standards-based implementation. This feature alone can lead you to choose one vendor (or even one specific hardware model from one vendor) across your entire network, resulting in a perfect vendor lock-in.