Crossbow for Christmas
Posted on December 29, 2008
After 2 years of waiting, Project Crossbow has arrived! It was integrated into Nevada Build 105 on Dec 4th, and BFUs became available around the middle of the month. SX:CE isn’t available just yet, but I hope it will be up in about a week. Crossbow is huge. This is a monumental improvement to Solaris and continues to push the bar out of reach of its competitors.
Simply put, Crossbow redefines the nature of network virtualization. To date, virtualization was limited to creating traditional “virtual interfaces” like so:
root@quadra ~$ ifconfig e1000g1:1 plumb 10.0.0.50 netmask 255.255.255.0 up
root@quadra ~$ ifconfig -a
lo0: flags=2001000849 mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
e1000g1: flags=201000843 mtu 1500 index 2
        inet 10.0.0.18 netmask ffffff00 broadcast 10.0.0.255
        ether 0:1b:21:25:3e:7b
e1000g1:1: flags=201000843 mtu 1500 index 2
        inet 10.0.0.50 netmask ffffff00 broadcast 10.0.0.255
Creating virtual interfaces like this gets the job done but has a number of drawbacks, all based on the fact that it’s not a real interface. Stats are screwed up, you can’t snoop the interface, you can’t tune it, etc.
Crossbow changes all that. Now we can create Virtual NICs (VNICs) which are, for all intents and purposes, real interfaces. They have their own network stack and queues, they can be tuned, they can be snooped, they can be VLAN’ed, etc. Anything you can do to a real interface you can do to a VNIC.
While VNICs are handy things to have in the global zone, they really shine when used with virtualization such as Solaris Containers (zones) or Xen guests, because we can now hand off interfaces that are fully controllable from within the virtual environment without having to dedicate a physical NIC to each one. The result is virtualized environments that feel way more like real servers.
If you’re not already familiar with the dladm command, it’s time for you to get acquainted. dladm is short for “Data Link Administration”, and now complements ifconfig. For some time now it’s been used for managing WiFi, 802.3ad Link Aggregation (“teaming” or “trunking”, depending on your pedigree), and more recently VLANs. It’s even replacing the old (and crappy) ndd with dladm’s “link properties”… a welcome improvement.
As of snv_105 several new options are available, namely sub-commands for creating VNICs and Etherstubs. A VNIC is a virtual network interface with all the trimmings of a real network interface. For the moment, it appears the max number of VNICs is 799, but that’s not set in stone, and frankly if you need more than that you need to re-architect. Etherstubs are in-software switches which can be used in concert with VNICs to create entirely virtualized in-software networks! In short, a standard VNIC will be associated with a physical GLDv3 network adapter, but we can also create a VNIC associated with an Etherstub to keep anything from ever touching the wire.
Let’s ponder this. Why would you want a VNIC that uses a software switch (etherstub)? Seems completely useless, right? Not entirely. On a traditional network you would create a DMZ with a firewall and other goodies which routes to a private internal network… imagine that you can now do all of that inside a single system!
Ok, so let’s get cracking. Once you have snv_105 installed, we’ll create a VNIC associated with physical e1000g1, then an etherstub and 3 more VNICs that are internal using that etherstub:
root@quadra ~$ dladm show-link
LINK        CLASS      MTU    STATE    OVER
e1000g1     phys       1500   up       --
e1000g2     phys       1500   down     --
e1000g0     phys       1500   unknown  --
root@quadra ~$ dladm create-vnic -l e1000g1 vnic0
root@quadra ~$ dladm create-etherstub etherstub0
root@quadra ~$ dladm create-vnic -l etherstub0 vnic1
root@quadra ~$ dladm create-vnic -l etherstub0 vnic2
root@quadra ~$ dladm create-vnic -l etherstub0 vnic3
root@quadra ~$ dladm show-link
LINK        CLASS      MTU    STATE    OVER
e1000g1     phys       1500   up       --
e1000g2     phys       1500   down     --
e1000g0     phys       1500   unknown  --
vnic0       vnic       1500   up       e1000g1
etherstub0  etherstub  9000   unknown  --
vnic1       vnic       9000   up       etherstub0
vnic2       vnic       9000   up       etherstub0
vnic3       vnic       9000   up       etherstub0
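If you find yourself building these internal networks often, the create commands are easy to script. Here is a minimal sketch that only prints the dladm commands (a dry run), so you can eyeball them before piping them to sh. The helper names vnic_plan and vnic_teardown are my own invention and not part of Solaris, and the teardown order assumes the VNICs have to be removed before the etherstub they sit on:

```shell
#!/bin/sh
# Dry run: print (do not execute) the dladm commands for one etherstub
# plus a numbered range of VNICs on top of it.
# vnic_plan and vnic_teardown are hypothetical helpers, not Solaris tools.
vnic_plan() {
    stub=$1 first=$2 last=$3
    echo "dladm create-etherstub $stub"
    i=$first
    while [ "$i" -le "$last" ]; do
        echo "dladm create-vnic -l $stub vnic$i"
        i=$((i + 1))
    done
}

# Matching teardown, in reverse: VNICs first, then the etherstub
# (assumption: the etherstub cannot be deleted while VNICs still use it).
vnic_teardown() {
    stub=$1 first=$2 last=$3
    i=$last
    while [ "$i" -ge "$first" ]; do
        echo "dladm delete-vnic vnic$i"
        i=$((i - 1))
    done
    echo "dladm delete-etherstub $stub"
}

vnic_plan etherstub0 1 3
vnic_teardown etherstub0 1 3
```

Once the printed commands look right, something like `vnic_plan etherstub0 1 3 | sh` as root would actually run them.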
So we have a variety of VNICs at our disposal. We now treat these like regular interfaces, using ifconfig to plumb them and assign IPs:
root@quadra ~$ ifconfig -a
lo0: flags=2001000849 mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
e1000g1: flags=201000843 mtu 1500 index 2
        inet 10.0.0.18 netmask ffffff00 broadcast 10.0.0.255
        ether 0:1b:21:25:3e:7b
root@quadra ~$ ifconfig vnic0 plumb 10.0.0.19 up
root@quadra ~$ ifconfig vnic1 plumb 10.100.0.2 netmask 255.255.255.0 up
root@quadra ~$ ifconfig vnic2 plumb 10.100.0.3 netmask 255.255.255.0 up
root@quadra ~$ ifconfig vnic3 plumb 10.100.0.4 netmask 255.255.255.0 up
root@quadra ~$ ifconfig -a
lo0: flags=2001000849 mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
e1000g1: flags=201000843 mtu 1500 index 2
        inet 10.0.0.18 netmask ffffff00 broadcast 10.0.0.255
        ether 0:1b:21:25:3e:7b
vnic0: flags=201000843 mtu 1500 index 7
        inet 10.0.0.19 netmask ff000000 broadcast 10.255.255.255
        ether 2:8:20:3a:70:5a
vnic1: flags=201000843 mtu 9000 index 8
        inet 10.100.0.2 netmask ffffff00 broadcast 10.100.0.255
        ether 2:8:20:f2:56:4d
vnic2: flags=201000843 mtu 9000 index 9
        inet 10.100.0.3 netmask ffffff00 broadcast 10.100.0.255
        ether 2:8:20:bc:b1:a1
vnic3: flags=201000843 mtu 9000 index 10
        inet 10.100.0.4 netmask ffffff00 broadcast 10.100.0.255
        ether 2:8:20:55:11:56
Please notice that they all have individual MAC addresses! There are several methods for how the MAC is chosen, but I won’t go into them here.
If you are using Solaris Containers these VNICs would be given to a Zone as an “IP-Instance” (exclusive mode), a feature which was added some time ago but until now was only usable by dedicating a physical interface. The same should apply to Xen or other virtualization tools.
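For zones, the hand-off might look something like the sketch below. It assumes a zone named webzone already exists; the zone name, the file path, and the zone_net_cfg helper are all hypothetical. The function only prints zonecfg subcommands, so nothing is changed until you feed the output to zonecfg yourself:

```shell
#!/bin/sh
# Print the zonecfg subcommands that switch a zone to an exclusive
# IP instance and hand it one VNIC. Nothing is executed here.
# zone_net_cfg is a hypothetical helper name.
zone_net_cfg() {
    vnic=$1
    printf '%s\n' \
        'set ip-type=exclusive' \
        'add net' \
        "set physical=$vnic" \
        'end' \
        'verify' \
        'commit'
}

# Review the output, then apply it to an existing zone, for example:
#   zone_net_cfg vnic1 > /tmp/webzone.cfg
#   zonecfg -z webzone -f /tmp/webzone.cfg
zone_net_cfg vnic1
```

With an exclusive IP instance the address is configured from inside the zone, which is why no address appears in the net resource here.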
Finally, in our whirlwind tour of this amazing technology, let’s look at my favorite feature of Crossbow.
Crossbow is both Network Virtualization (we looked at that above) and Network Resource Control. With Crossbow we have a real network resource control capability that is free from the terror that is IPQoS.
There are three types of resource controls at present: max bandwidth (rate limiting), priority (relative to other traffic), and CPUs. Please note that these controls are not cumulative, but rather apply at any given point in time. These controls can be applied either to an entire link (NIC or VNIC) or alternatively to a particular network flow.
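For the whole-link case, the control is just another dladm link property. A rough sketch, assuming the maxbw link property takes the same M suffix the flow properties report; link_cap and link_uncap are hypothetical helper names, and they only print the commands rather than run them:

```shell
#!/bin/sh
# Dry run: print the dladm command that caps an entire link (NIC or VNIC).
# link_cap and link_uncap are hypothetical helpers, not Solaris tools.
link_cap() {
    link=$1 mbps=$2
    echo "dladm set-linkprop -p maxbw=${mbps}M $link"
}

# Print the command that puts maxbw back to its default (no limit).
link_uncap() {
    echo "dladm reset-linkprop -p maxbw $1"
}

link_cap vnic0 100
link_uncap vnic0
```

Capping the VNIC itself limits everything the zone or guest behind it can push, regardless of protocol or port.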
Let me pause here. If you’re not familiar with a “network flow”, it is a defined collection of network communication. For instance, a flow might refer to all HTTP (port 80) traffic to a given IP address, or perhaps all TCP traffic, or perhaps a combination of FTP, SMTP, and HTTP ports. If you’ve worked with firewall rules you’re familiar with the concept: a flow simply gives us a way to apply some action to a specific stream of traffic.
Crossbow adds the new command flowadm to define and control network flows. Here is an example:
root@quadra ~$ flowadm add-flow -l vnic0 -a transport=tcp,local_port=80 httpflow
root@quadra ~$ flowadm add-flow -l vnic0 -a transport=tcp,local_port=443 httpsflow
root@quadra ~$ flowadm show-flow
FLOW        LINK    IP ADDR    PROTO  PORT  DSFLD
httpflow    vnic0   --         tcp    80    --
httpsflow   vnic0   --         tcp    443   --
flowadm relies on attributes that describe a flow, and properties which assign some resource control. We’ll add bandwidth control to the flows above by modifying the “maxbw” property:
root@quadra ~$ flowadm show-flowprop
FLOW        PROPERTY   VALUE   DEFAULT   POSSIBLE
httpflow    maxbw      50      --        50M
httpflow    priority   --      --
httpsflow   maxbw      80      --        80M
httpsflow   priority   --      --
Here the maxbw is specified in Mbps. The docs show that percentages, Kbps, etc. are supported, but they don’t seem to work right now.
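The properties themselves are set with flowadm set-flowprop. Until percentages work, the math is easy enough to do by hand. Here is a minimal sketch that turns a percentage of the link speed into a whole number of Mbps and prints the matching command. The 1000 Mbps link speed is an assumption on my part, and pct_to_mbps and flow_cap are hypothetical helper names:

```shell
#!/bin/sh
# pct_to_mbps LINK_MBPS PERCENT: whole Mbps, rounded down.
pct_to_mbps() {
    echo $(( $1 * $2 / 100 ))
}

# flow_cap is a hypothetical helper: print (do not run) the set-flowprop
# command giving FLOW a PERCENT share of a link running at LINK_MBPS.
flow_cap() {
    flow=$1 link_mbps=$2 pct=$3
    echo "flowadm set-flowprop -p maxbw=$(pct_to_mbps "$link_mbps" "$pct")M $flow"
}

# Assuming a 1000 Mbps link, 5% and 8% work out to the 50M and 80M
# values in the show-flowprop output above.
flow_cap httpflow 1000 5
flow_cap httpsflow 1000 8
```

Integer division rounds down, so very small shares on a slow link can come out as 0; check the printed value before running the command.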
maxbw will rate limit to the specified throughput, while priority can be set to “low”, “normal”, “high” or “rt” (real time). Using these controls carefully you can partition off bandwidth pretty nicely.
In addition to all this, extended accounting has been extended to incorporate accounting based on links or flows, but I’ll save that for another day.
Congrats to everyone on the Crossbow team. This is a major achievement and an amazing technological advance!