<?xml version="1.0" encoding="iso-8859-1"?>
<rss version="2.0">
	
	<channel>
		<title>The Blog of Ben Rockwood</title>
		<link>http://www.cuddletech.com/blog/index.php</link>
		<description>use unix or die.</description>
		<language>en</language>
		<managingEditor>benr@cuddletech.com</managingEditor>
                <copyright>Copyright 2009</copyright>
		<generator>Pivot Pivot - 1.30 RC2: 'Rippersnapper'</generator>
		<pubDate>Mon, 05 Jan 2009 08:13:18 +0000</pubDate>
		<ttl>60</ttl>
		
		
		
		
		<item>
			<title>Exploring Tokyo</title>
			<link>http://www.cuddletech.com/blog/pivot/entry.php?id=1004</link>
			<comments>http://www.cuddletech.com/blog/pivot/entry.php?id=1004#comm</comments>
                        <description><![CDATA[ <p>
For once, "Tokyo" refers in fact to a physical place, not some code project.  Shocking but true.
</p>
<p>
Just prior to Christmas, I took a week-long trip to Tokyo, Japan on Joyent business.  This was interesting for me because it was both my first time to Japan and in fact my first time leaving the country.  Given that I am a California native, I've had little reason to leave.  We commonly say here "You're within a 4 hour drive of almost any environment on earth".  California is just a great place and I figured if I ever did leave the country, it should be some place particularly interesting, not just Mexico or Canada.
</p>
<p>
The first thing about traveling to Japan is that jet lag sucks and travel is painful.  Sure, like everyone has a tragic "I took a 36 hour flight" story, but a 12 hour flight in coach just sucks.  The flight is 12 hours there and 9 hours back, thanks to the jet stream... shocking that it takes a full 3 hours off, but it's true.  When you take all travel concerns into account (including, in my case, a connection through LAX) I lost about 3 days to travel.  I wanted to go to the Tokyo OpenSolaris Users Group OpenSolaris 2008.11 release event, and it was funny that I was scrambling Wed morning (PST) to make it in time for a Thursday evening (JST) event.
</p>
<p>
Once you get there, "jet lag" takes on a new meaning.  Typically I think of "jet lag" as a minor deviation of your sleep schedule, like going coast-to-coast.  But in Japan the time is so far off that you get hit hard about 5PM (JST), then get a second wind around 7PM, and then have trouble sleeping till 3-4AM.  The first morning I was there I woke up at 5AM and by 6 gave up on trying to sleep.
</p>
<p>
While I can't talk much about my work there, I was in a data center for 2 days straight, then did a hand off to our other staff at home while we were on standby for another 2 days.  We used that time for customer meetings and taking in as much of Tokyo as possible.  Lesson to my fellow administrators: when you're in a strange place and up against a deadline... pre-stage, pre-stage, pre-stage.  I actually took a 2.5" USB powered drive with ZFS Datasets ready for mount and use.  ZFS rules.
</p>
<p>
Anyway... I thought I'd share some miscellaneous thoughts in general about Tokyo for those who've never ventured to Japan:
</p>
<ol>
<li>They say that going to Japan is like going to another planet.  Not true.  It was very much like any other large metropolis... people just don't speak English.
<li>I was told that in a city like Tokyo, which does a lot of international business, most people know English pretty well.  Bullshit.  In the large hotels in Shinjuku, ya, but everywhere else they don't know English.  Due to the amount that Japanese culture has integrated English words, they might know a couple of words, but it really comes down to hand gestures.  If you walk into McDonalds and say "how are you today?" you get a blank smile.  In my hotel (a really nice one actually, in Ariake), even the front desk barely knew any English.
<li>"Large Coffee" in Japanese is "Oh-key ko-he"... life was difficult before this.
<li>You always see Japanese crowded into packed areas in the media, so you think Japanese like being crowded.  Wrong.  They like space too... but when you need to rely on public transportation to get anywhere and you can squeeze into a train, you bear it and cram.
<li>Japanese don't look at anyone else.  At least, young people don't.  In America we're constantly sizing up everyone around us, looking, thinking, perhaps even commenting.... not in Japan.  In America if you walk past someone that is alone, you commonly say something like "hey", "yo", "how's it going?", nod, or otherwise acknowledge their existence.  In Japan you can be around hundreds of people and feel absolutely isolated and alone.  Consequently, it's a really depressing and lonely place if you're alone.
<li>...unless you wear a kilt.  I wore a kilt one day when there and people couldn't believe what they were seeing, women especially.  After 3 days of feeling like I didn't exist this was a welcome reaffirmation of my humanity. :)
<li>Elderly Japanese (70+?) are much more friendly... they'll commonly give you a smile or say something back if you say hello (in Japanese obviously).
<li>The American understanding of "Hello" in Japanese is "Konnichiwa"... but in fact, that means "Good Afternoon".  There are variations for morning, afternoon and evening.  The morning greeting is commonly followed with the word "gozaimasu", which adds some formality, like saying "Good morning sir" instead of "Morning" ("Ohayoo gozaimasu").
<li>Japanese pronunciation is more important than even the words themselves.  I asked the front desk where I could find a "Key-mo-noh" (Kimono)... this turned into a confusing number of gestures and ultimately a dash for a Casio pocket translator.  The word was right but due to my bad pronunciation we could not connect.
<li>In America we give people a hard time about "butchering our language"... it felt somehow redeeming to have people giving me a look of despair and amusement as I butchered theirs.
<li>Learning Japanese is really tough.  Pronunciation is the key to spoken Japanese... but writing is a whole separate problem, as they have 3 separate major writing systems: Kanji (iconic, drawn from Chinese), Katakana (syllabic, meaning characters that you can sound out), and Hiragana (the American equivalent is cursive).  The kick in the teeth is that commonly in Japanese they will use all 3 in a single sentence.
<li>Tokyo is huge.  Taxis are expensive, especially if you're traveling more than a couple miles.  Supposedly a taxi ride from the Narita airport on the edge of town (feels way out of town actually) to the heart of the city will run you US$500 and takes about an hour.
<li>Navigating trains in Tokyo is really complex.  There are hundreds of stops and the kicker is that unlike most places there is not a single central train authority that runs all the trains.... there are several different train companies with their own lines, so you commonly cross over from one to another.  As a result there were many people who have lived there for 5+ years and had considerable trouble navigating the train system unless they were familiar with that particular route.
<li>Tokyo is clean.  Super clean.  And, ironically, finding a trash can is hard to do.  All the taxis and buses have clean white doily things on the head-rests, and people just don't litter.  You see the occasional cigarette butt, but that's about it.
<li>Bathrooms are fun in Japan.  They use electric dryers exclusively, commonly a "toaster" like contraption in which you insert your hands, and a stream of high-pressure air blows across your hands as you slowly pull them up... bone dry hands, totally awesome.  Even bathrooms in Japan don't have trash-cans.
<li>Toto toilets are scary and wonderful things.  You know, you've seen those images of Japanese toilets with an instrument panel, right?  I could write a whole series just on those things, but suffice it to say the first time you sit down on a toilet seat that's warm, it freaks you out.
<li>The ability to order Sushi like a pro in the US doesn't mean jack sh*t in Japan.
<li>All Japanese are short.  Totally wrong.  I'm 6'4", everyone wanted pictures of me towering over the little Japanese.  Just plain wrong, I didn't notice any difference between California and Japan in terms of variation in height.  In fact, there were several Japanese construction workers that were massive and definitely not to be messed with.
<li>If Toyota released all its Japanese cars in the US, GM and Ford would be out of business.  I saw several Toyotas that put Mercedes to shame.  You have to see it to believe it.
<li>Japanese quality is awesome.  If I traveled there regularly I'd probably buy all my clothes in Japan.
<li>Adjusting to coinage is odd.  The smallest Japanese bill is 1,000 yen (round it to US$10; less due to conversion, but ballpark).  $5 and down is all coinage.  In the US we tend to discard change (collected in jars, or whatever)... but there, you have to adjust to using coinage frequently or you walk around with a bulging pocket all the time.
<li>Mint... apparently mint isn't big in Japan, you hardly see it.  If it's green it's almost certainly green tea flavored.  Strawberry, however, is very popular.
<li>Japanese aren't big on candybars or chocolate in general.  At least, not like we are in the US.  In a mini-mart in the US we have one or more aisles dedicated just to chocolate, commonly in candy-bar form.  Over there you find only a couple varieties.  Kit Kat and Snickers are the only US bars I saw.
<li>Yes, Hentai is as common as they say.  Also, Japanese Manga is telephone book sized, not little things like we read in the US.
<li>Strange observation... I was hard pressed to find a Japanese magazine about business or computers.  I found one magazine about PCs, but most were about TV or culture.  I wanted to pick up some economic/news magazines but couldn't find 'em.
<li>Vending Machines.  You hear that they are <b>everywhere</b>.  This is true, there is almost always one within eyeshot... however the notion that you can "buy anything in a vending machine" is overblown.  Most of the vending machines were just drinks and maybe a can of nuts or something.  I didn't see any vending machines for portable electronics, or books, or all the weird stuff you hear about.  I'm sure they exist, but some people make it sound like you can buy a Sony Walkman in a vending machine in the middle of a park.
<li>Dress.  Dress varies based on what area ("Ward") of Tokyo you are in, but in general they dress much nicer than in the US.  Men most commonly wear a 2 button suit.  Young women wear short skirts with knee or thigh high tights and either leg-warmers or tall boots.  Teenage boys tend toward jeans and a T-shirt.
<li>Video Games.  If you walk into an arcade, all the games are played sitting down!  What we commonly consider an "up-right" game has a little bench.  The "crane-pickup" games are really popular and have kool prizes.  One arcade had these games filled with food items like ice-cream bars and such.
<li>Couples.  I was really amazed at how many couples I saw!  In the US it's generally difficult to tell who is a couple because we've lost the tradition of holding hands.  A man and woman in San Francisco exiting a restaurant may be a couple, or brother-sister, or friends, or co-workers... it's hard to tell.  In Tokyo there were tons of couples holding hands and cuddling on trains.
<li>Gambling.  Gambling is big in Tokyo.  Commonly in the form of slot machines and a game called "Pachinko".  They don't have card games, and thus most people didn't seem to think of it as gambling, but these things are everywhere!
<li>Mini-marts.  Mini-marts are big there, particularly 7-11 and Circle K.  People buy lunch, breakfast, and dinner at these places, typically before or after getting on a train.  They sell a lot of Ramen (yes, they do sell "Cup o' Noodle" in Japan) and provide hot-water to fill it up before leaving.  Other meal items include every variation of rice and seafood you can think of, including sushi.
<li>Sushi.  I wondered how much better sushi was there than here.  I wasn't shocked, the sushi in Japan is unlike anything you've had in the US.  I've eaten at some of the high-end places in San Francisco and they don't come close to your average box-lunch sushi there.
</ol>
<p>
I could go on for a while but will leave it there.  I commonly reflected on the movie "Lost in Translation" while in Tokyo.  I even got to quickly venture into Shinjuku to the Tokyo Hyatt where it was largely filmed (the "bar" that he hangs out in has a 1,000 Yen cover charge JUST to sit there.  A Guinness in a pub can cost me 1800 yen.  But man oh man it was a beautiful lounge.)  The theme of being disconnected and alone in Tokyo rings true from the film.
</p>
<p>
I didn't get to see as much of the city as I wanted to.  I especially wish I'd had time to see the legendary Akihabara (Japanese Geek Central), but time didn't permit.  Nonetheless I'm happy with what I was able to take in.  We spent one day without a guide just taking the train some place and exploring around the station, the other day with a guide in between customer meetings.
</p>
<p>
I'm absolutely indebted to Alain Hoang who helped guide us and answer our questions.  He's an amazing sysadmin and one of the nicest guys I've ever met.  If it weren't for his help we would have probably never ventured further than we could walk.  Besides that, he deserves a medal for helping me stumble through some basic Japanese and better understand the culture.
</p>
<p>
I don't know if I'll ever have reason to return to Tokyo.  I certainly would enjoy being able to, especially if it weren't so close to Christmas (I returned the day before Christmas Eve), but given the cost I doubt I would ever return to vacation.  Nevertheless, I've picked up an odd desire to continue learning Japanese and katakana... I've got an odd feeling I'll be back again one day.  Who knows.
</p>
<p>
So, in short, if you ever have the opportunity to visit Tokyo I encourage you to take it, but make sure you pad the trip with at least 5 days to take in as much as possible.</p> ]]></description>
			<guid isPermaLink="false">1004@http://www.cuddletech.com/</guid>
			<category>cuddletech</category>
			<pubDate>Mon, 05 Jan 2009 07:08:00 -0000</pubDate>
		</item>
		
		
		
		<item>
			<title>SysAdmin Advent Calendar Blog In Review</title>
			<link>http://www.cuddletech.com/blog/pivot/entry.php?id=1003</link>
			<comments>http://www.cuddletech.com/blog/pivot/entry.php?id=1003#comm</comments>
                        <description><![CDATA[ <p>
If you missed this year's excellent <a href="http://sysadvent.blogspot.com/">Systems Administration Advent Calendar Blog</a> you missed some great content.  But do not despair!  It's all there for your reading pleasure.  Articles on scripting, new technology, primers, and workflow are there to help you into the new year.  I even contributed an entry: <a href="http://sysadvent.blogspot.com/2008/12/day-17-time-management.html">Day 17 - Time Management</a>.
</p>
<p>
A warm round of applause goes to <a href="http://semicomplete.com/">Jordan Sissel</a> for organizing it and rallying various bloggers to participate.</p> ]]></description>
			<guid isPermaLink="false">1003@http://www.cuddletech.com/</guid>
			<category>cuddletech</category>
			<pubDate>Wed, 31 Dec 2008 23:58:00 -0000</pubDate>
		</item>
		
		
		
		<item>
			<title>SA Pro: Episode 3</title>
			<link>http://www.cuddletech.com/blog/pivot/entry.php?id=1002</link>
			<comments>http://www.cuddletech.com/blog/pivot/entry.php?id=1002#comm</comments>
                        <description><![CDATA[ <p>
Just before the end of the year, here's the third episode of SA Pro, featuring a 1 hour interview with <a href="http://omniti.com/is/theo-schlossnagle">OmniTI Founder & CEO Theo Schlossnagle</a>.
</p>
<img src="http://cuddletech.com/sapro/SApro-Cover002.png">
<ul>
<li><a href="http://cuddletech.com/sapro/SApro-Episode002.m4a">SApro-Episode002 AAC</a>
<li><a href="http://cuddletech.com/sapro/SApro-Episode002.mp3">SApro-Episode002 MP3</a>
</ul>
<p>
It's a bit long, I admit, but Theo is an amazing guy and refreshing to talk with.  Fire it up while you tweak on something fun for New Years.</p> ]]></description>
			<guid isPermaLink="false">1002@http://www.cuddletech.com/</guid>
			<category>cuddletech</category>
			<pubDate>Wed, 31 Dec 2008 23:45:00 -0000</pubDate>
		</item>
		
		
		
		<item>
			<title>Crossbow Experiments and Elation</title>
			<link>http://www.cuddletech.com/blog/pivot/entry.php?id=1001</link>
			<comments>http://www.cuddletech.com/blog/pivot/entry.php?id=1001#comm</comments>
                        <description><![CDATA[ <p>
I wanted to play a little deeper with Crossbow, and in particular get my mind around Etherstubs and inter-stub routing.  So I devised the following experimental architecture:
</p>
<pre>
Etherstub0
        |----> vnic0    ---> zone001
        |----> vnic1    ---> zone002
        +----> vnic2  --
Etherstub1               +-> router01
        |----> vnic3  --/
        |----> vnic4    ---> zone003
        +----> vnic5    ---> zone004
</pre>
<p>
The idea is to have 2 zones on one etherstub (virtual switch) on one subnet, 2 on another, and then an additional zone that sits on both acting as a router.
</p>
<p>
So I set forth to do this.  I created a template zone, cloned it out and brought them all up.  I created all the vnics, assigned them to the appropriate etherstubs, gave them to the zones as exclusive-ip interfaces, and then configured each zone's networking stack by plumbing and ifconfig'ing.
</p>
<pre>
root@quadra ~$ dladm create-etherstub etherstub0
root@quadra ~$ dladm create-etherstub etherstub1
root@quadra ~$
root@quadra ~$ dladm create-vnic -l etherstub0 vnic0
root@quadra ~$ dladm create-vnic -l etherstub0 vnic1
root@quadra ~$ dladm create-vnic -l etherstub0 vnic2
root@quadra ~$ dladm create-vnic -l etherstub1 vnic3
root@quadra ~$ dladm create-vnic -l etherstub1 vnic4
root@quadra ~$ dladm create-vnic -l etherstub1 vnic5
root@quadra ~$
root@quadra ~$ dladm show-link
LINK        CLASS    MTU    STATE    OVER
e1000g1     phys     1500   up       -- 
e1000g2     phys     1500   down     -- 
e1000g0     phys     1500   unknown  --
etherstub0  etherstub 9000  unknown  --
etherstub1  etherstub 9000  unknown  --
vnic0       vnic     9000   up       etherstub0
vnic1       vnic     9000   up       etherstub0
vnic2       vnic     9000   up       etherstub0
vnic3       vnic     9000   up       etherstub1
vnic4       vnic     9000   up       etherstub1
vnic5       vnic     9000   up       etherstub1
</pre>
<p>
Here is the zone configuration:
</p>
<pre>
zonecfg:template0> info
zonename: template0
zonepath: /quadra/zones/template0
brand: native
autoboot: false
bootargs:
pool:
limitpriv:
scheduling-class:
ip-type: exclusive
inherit-pkg-dir:
        dir: /lib
inherit-pkg-dir:
        dir: /platform
inherit-pkg-dir:
        dir: /sbin
inherit-pkg-dir:
        dir: /usr
inherit-pkg-dir:
        dir: /opt
net:
        address not specified
        physical: vnic0
        defrouter not specified
</pre>
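<p>
The cloning step isn't shown above, so here is a minimal sketch of how one clone might be stamped out from the template.  It assumes the zonecfg <i>create -t</i> and zoneadm <i>clone</i> subcommands behave as documented on your build; the command file path is hypothetical, and the zone and vnic names simply follow the layout in the diagram:
</p>
<pre>
# Hypothetical command file: copy template0's config, then swap zonepath and vnic
cat > /tmp/zone002.cfg <<'EOF'
create -t template0
set zonepath=/quadra/zones/zone002
select net physical=vnic0
set physical=vnic1
end
commit
EOF

zonecfg -z zone002 -f /tmp/zone002.cfg   # define the new zone from the file
zoneadm -z zone002 clone template0       # copy the installed template zone
zoneadm -z zone002 boot                  # bring it up
</pre>
<p>
The router zone would need a second net resource (one vnic per etherstub), which this sketch leaves out.
</p>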
<p>
I then decided on the following IP scheme:
</p>
<pre>
IPs:
vnic0   10.0.90.10      /24
vnic1   10.0.90.11
vnic2   10.0.90.12
vnic3   10.0.91.12
vnic4   10.0.91.11
vnic5   10.0.91.10
</pre>
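<p>
Applying the scheme inside a zone is the usual plumb-and-ifconfig routine.  A sketch for zone001, run from within the zone, assuming the /24 netmask from the table; this configures the interface by hand and makes no attempt to persist across a reboot:
</p>
<pre>
# Inside zone001: plumb the assigned vnic and give it its address from the table
ifconfig vnic0 plumb 10.0.90.10 netmask 255.255.255.0 up

# Confirm the address took
ifconfig vnic0
</pre>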
<p>
Zones up, and it looks like this:
</p>
<pre>
root@quadra ~$ zoneadm list -vc
  ID NAME             STATUS     PATH                           BRAND    IP
   0 global           running    /                              native   shared
   3 zone001          running    /quadra/zones/zone001          native   excl
   4 zone002          running    /quadra/zones/zone002          native   excl
   5 zone003          running    /quadra/zones/zone003          native   excl
   6 zone004          running    /quadra/zones/zone004          native   excl
   7 router01         running    /quadra/zones/router01         native   excl
   - template0        installed  /quadra/zones/template0        native   excl
</pre>
<p>
Now we play!  
</p>
<p>
First things first... can a zone touch an interface other than the one explicitly assigned to it?  And, do dladm commands work in a zone?
</p>
<pre>
root@zone001 ~$ dladm show-vnic
root@zone001 ~$ dladm show-vnic vnic0
dladm: invalid vnic name 'vnic0': object not found
root@zone001 ~$ dladm show-vnic vnic1
dladm: invalid vnic name 'vnic1': object not found
root@zone001 ~$ dladm show-vnic vnic2
dladm: invalid vnic name 'vnic2': object not found
root@zone001 ~$ dladm show-ether
root@zone001 ~$ dladm show-usage
dladm: show-usage requires a file
root@zone001 ~$ dladm create-etherstub zonestub0
dladm: etherstub creation failed: object not found

root@template0 ~$ ifconfig vnic2 plumb 
ifconfig: cannot open link "vnic2": DLPI link does not exist
root@template0 ~$ ifconfig vnic1 plumb
</pre>
<p>
Ok, so dladm is useless inside a zone and I can't plumb an interface that isn't assigned to it.  Good.
</p>
<p>
Now, to set up our router.  All we should have to do is enable IPv4 Forwarding on a zone with 2 interfaces, one on each network:
</p>
<pre>
root@router01 ~$ routeadm -e ipv4-forwarding
root@router01 ~$ routeadm -u
root@router01 ~$ routeadm
              Configuration   Current              Current
                     Option   Configuration        System State
---------------------------------------------------------------
               IPv4 routing   enabled              enabled
               IPv6 routing   disabled             disabled
            IPv4 forwarding   enabled              enabled
            IPv6 forwarding   disabled             disabled

           Routing services   "route:default ripng:default"

Routing daemons:

                      STATE   FMRI
                   disabled   svc:/network/routing/legacy-routing:ipv4
                   disabled   svc:/network/routing/legacy-routing:ipv6
                   disabled   svc:/network/routing/ripng:default
                   disabled   svc:/network/routing/ripng:quagga
                     online   svc:/network/routing/ndp:default
                   disabled   svc:/network/routing/zebra:quagga
                   disabled   svc:/network/routing/rip:quagga
                   disabled   svc:/network/routing/ospf:quagga
                   disabled   svc:/network/routing/ospf6:quagga
                   disabled   svc:/network/routing/bgp:quagga
                     online   svc:/network/routing/route:default
                   disabled   svc:/network/routing/rdisc:default

root@router01 ~$ ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
vnic2: flags=201100843<UP,BROADCAST,RUNNING,MULTICAST,ROUTER,IPv4,CoS> mtu 9000 index 2
        inet 10.0.90.12 netmask ffffff00 broadcast 10.0.90.255
        ether 2:8:20:27:5f:6
vnic3: flags=201100843<UP,BROADCAST,RUNNING,MULTICAST,ROUTER,IPv4,CoS> mtu 9000 index 3
        inet 10.0.91.12 netmask ffffff00 broadcast 10.0.91.255
        ether 2:8:20:e9:65:94
lo0: flags=2002000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv6,VIRTUAL> mtu 8252 index 1
        inet6 ::1/128
</pre>
<p>
Easy enough.  In the old days you would enable the "ROUTER" flag on each interface and such, but now it's all nicely wrapped by <i>routeadm</i>.  Yeah!
</p>
<p>
I won't bore you with the ping scenario details, but thanks to <i>in.routed</i> running in each zone by default the gateway just appeared auto-magically:
</p>
<pre>
root@zone004 ~$ netstat -nr

Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface 
-------------------- -------------------- ----- ----- ---------- --------- 
default              10.0.91.12           UG        1          0 vnic5     
10.0.91.0            10.0.91.10           U         1          1 vnic5     
127.0.0.1            127.0.0.1            UH        1          0 lo0       

Routing Table: IPv6
  Destination/Mask            Gateway                   Flags Ref   Use    If   
--------------------------- --------------------------- ----- --- ------- ----- 
::1                         ::1                         UH      1       0 lo0   
root@zone004 ~$ ping -s 10.0.90.10
PING 10.0.90.10: 56 data bytes
64 bytes from 10.0.90.10: icmp_seq=0. time=0.549 ms
64 bytes from 10.0.90.10: icmp_seq=1. time=0.091 ms
^C
----10.0.90.10 PING Statistics----
2 packets transmitted, 2 packets received, 0% packet loss
round-trip (ms)  min/avg/max/stddev = 0.091/0.320/0.549/0.324
root@zone004 ~$ traceroute 10.0.90.10
traceroute to 10.0.90.10 (10.0.90.10), 30 hops max, 40 byte packets
 1  10.0.91.12 (10.0.91.12)  0.087 ms  0.041 ms  0.034 ms
 2  10.0.90.10 (10.0.90.10)  0.086 ms  0.056 ms  0.052 ms
</pre>
<p>
How kool is that!  I could take this further by adding in a public interface to the router and routing it as well, but I'd need to bring IP NAT into the mix and I'm not terribly interested in that tonight.
</p>
<p>
Of course, one other test of interest is whether snoop will work properly.  We know it works with IP Instances, but does it still work fine with vnics and etherstubs?  Yes!
</p>
<pre>
root@zone001 ~$ snoop
Using device vnic0 (promiscuous mode)
  10.0.91.10 -> zone001      ICMP Echo request (ID: 496 Sequence number: 5)
     zone001 -> 10.0.91.10   ICMP Echo reply (ID: 496 Sequence number: 5)
  10.0.91.10 -> zone001      ICMP Echo request (ID: 496 Sequence number: 6)
     zone001 -> 10.0.91.10   ICMP Echo reply (ID: 496 Sequence number: 6)
</pre>
<p>
Furthermore, Etherstub <i>does</i> act as a switch.  Other zones on the same etherstub will not see traffic unless it's addressed to them.
</p>
<p>
As a sidenote, you'll notice that Etherstubs default to jumbo frames (MTU 9000).  You should be able to modify this, however the link-property shows as read-only... I'll look into that later.
</p>
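<p>
To look at that property yourself, something like the following from the global zone should do it.  This is only a sketch: it assumes <i>mtu</i> is exposed as a link property on your build, and the columns printed vary between releases:
</p>
<pre>
# Global zone: show the MTU link property for one of the etherstubs
dladm show-linkprop -p mtu etherstub0
</pre>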
<p>
Ever wanted to roll out a functioning, routing, VLAN'ed, multicast network of hundreds of nodes to test your dream setup but only have a laptop?  Now you can.  All my test zones are consuming only 12MB of disk each, and I've got 300GB free on my home SATA RAIDZ2... so do the math. :)
</p>
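<p>
Spelling the math out as a rough sketch (assuming 1GB = 1024MB, and ignoring pool overhead and whatever each zone writes after it boots), that is on the order of 25,000 zones' worth of disk:
</p>
<pre>
free_mb=$(( 300 * 1024 ))            # 300GB free, expressed in MB
per_zone_mb=12                       # disk consumed per test zone
zones=$(( free_mb / per_zone_mb ))   # integer division
echo "$zones"                        # prints 25600
</pre>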
<p>
BTW.... I did all this from architecture to implementation and fully tested in 1 hour, including the time it took to install and configure all the zones.  Solaris rules.
</p>
<p>
Can't resist... let's try IP Filter within the Zone just to see that it's happy.  I'll use a simple ruleset that blocks everything but SSH:
</p>
<pre>
root@zone001 ~$ cat /etc/ipf/ipf.conf 
#
# ipf.conf
pass in quick proto tcp from any to 10.0.90.10/32 port = 22
block in log from any to 10.0.90.10/32

root@zone001 ~$ svcadm enable ipfilter
</pre>
<p>
Now we'll test from another node:
</p>
<pre>
root@zone004 ~$ ssh 10.0.90.10
The authenticity of host '10.0.90.10 (10.0.90.10)' can't be established.
RSA key fingerprint is 2e:fc:c7:36:33:70:db:16:d7:74:35:04:1a:3f:02:bb.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '10.0.90.10' (RSA) to the list of known hosts.
Password: 

root@zone004 ~$ ftp 10.0.90.10
^C
</pre>
<p>
Sweet.  Now just a look at the IP Filter stats to make sure it's not a fluke:
</p>
<pre>
root@zone001 ~$ ipfstat 
bad packets:            in 0    out 0
 IPv6 packets:          in 0 out 0
 input packets:         blocked 5 passed 25 nomatch 8 counted 0 short 0
output packets:         blocked 0 passed 15 nomatch 15 counted 0 short 0
 input packets logged:  blocked 5 passed 0
</pre>
<p>
Perfect!  It's actually blocking the packets.  IP Filter works as you expect it to, in a zone on a vnic.  Super sweet.</p> ]]></description>
			<guid isPermaLink="false">1001@http://www.cuddletech.com/</guid>
			<category>OpenSolaris</category>
			<pubDate>Wed, 31 Dec 2008 10:34:00 -0000</pubDate>
		</item>
		
		
		
		<item>
			<title>2008 Year in Review</title>
			<link>http://www.cuddletech.com/blog/pivot/entry.php?id=1000</link>
			<comments>http://www.cuddletech.com/blog/pivot/entry.php?id=1000#comm</comments>
                        <description><![CDATA[ <p>
Here it is, big post 1,000.  I'm fairly proud of that given that the vast bulk of all my blog entries are technical and not just brainless linkdumps.  There is still a lot to blog about and I've still written a great many entries that ended with "more to come...", nevertheless it's a good milestone.
</p>
<img src="http://www.opensolaris.org/os/project/website/content/home_page_draft17/os_fan_button_white_lrg_rnd.gif">
<p>
Looking back at 2008, we've had a very good and productive year in OpenSolaris land.  COMSTAR arrived, Crossbow arrived, ZFS is getting stronger all the time, we got a new iSCSI Target, the first and second release in the 6 month cycle of Indiana went out on schedule, and Solaris 10 is now more or less on par with Nevada.  Technically there is a lot to be proud of and excited about.
</p>
<p>
On the non-technical side we had another OpenSolaris Developers Summit and the first annual OpenSolaris Storage Summit.  Ian Murdock gave a keynote at CommunityOne and there was a heavy emphasis on OpenSolaris at JavaOne.  We did several good conferences this year, although not as many as in years prior.  We had a dominant year at SNIA's Developers Conference, helping solidify Sun's role in the future of storage development.  
</p>
<img src="http://www.sun.com/images/k3/k3_storage7110_1.jpg" width="400">
<p>
On the Sun side, the mighty FISHworks released to the world and the response to the resulting offerings has been tremendous thus far and sets a new standard in storage particularly in the realm of the Sun-created buzzword "OpenStorage".  Business for Sun is poor but there are several areas of growth and although I think the MySQL acquisition was a massive blunder it may all pan out in the end.
</p>
<p>
On the OpenSolaris governance side, it's been a sad year.  Rather than moving forward the OGB decided to rehash old ground and fall right back into the same pitfalls.  An all Sun OGB proved to be less effective than a mixed OGB.  OpenSolaris governance in general is more closed off and insular than ever, but that's indirectly what Simon Phipps and others were shooting for.
</p>
<p>
The Silicon Valley OpenSolaris Users Group fell into significant decline compared to previous years, but that tends to be a Valley trend as technologies lose their initial buzz and become more established... the Silicon Valley Linux Users Group felt the same kind of decline, although not as sharply.
</p>
<img src="http://www.sun.com/servers/midrange/sunfire_e2900/images/main/k2a_e2900_1.jpg" width="400">
<p>
As we look to 2009, I think the word is "established".  OpenSolaris is here, Nevada is strong, we've proven that it's not going to disappear.  We now need to set the tone for the future by definitively establishing the future of Solaris 11 (or lack of one), the upgrade path from Solaris 10 (if there is one beyond HP-UX like Update-forever), and wrapping extension technologies like xVM, Sun Cluster, and others around OpenSolaris.  In general, customers are still largely unclear on where this is all ultimately going and what it means to them.  If you have a big SPARC box like a Sun Fire E2900 in production running Sun Cluster, what does the future hold?  S10 till you retire it?  OpenSolaris makes a lot of sense to new adopting customers, but then a lot of them are running it on non-Sun hardware (Dell, HP, and Supermicro are popular)... how do we monetize them in a compelling way?   And how do we continue to ramp Sun support of Nevada?  To date most experiences with Sun Support over post-S10 releases are horrible as a lot of Sun's Support organization simply doesn't know it well enough.
</p>
<img src="http://www.propertysolutionsnelson.co.nz/_gallery/paving/paving%20stones%20amongst%20stones-lg.JPG" width="400">
<p>
So, the paving stones are on the ground; they now need to be shifted into a resting position so we can start walking people across the path.  It's time to unify offerings and improve Sun's sales, marketing and support around it.
</p>
<p>
Here's to 2009!</p> ]]></description>
			<guid isPermaLink="false">1000@http://www.cuddletech.com/</guid>
			<category>Sun</category>
			<pubDate>Tue, 30 Dec 2008 20:55:00 -0000</pubDate>
		</item>
		
		
		
		<item>
			<title>Crossbow for Christmas</title>
			<link>http://www.cuddletech.com/blog/pivot/entry.php?id=999</link>
			<comments>http://www.cuddletech.com/blog/pivot/entry.php?id=999#comm</comments>
                        <description><![CDATA[ <p>
After 2 years of waiting, <a href="http://opensolaris.org/os/project/crossbow/">Project Crossbow</a> has arrived!  It integrated into Nevada Build 105 on Dec 4th, and BFUs became available around the middle of the month.  SX:CE isn't available just yet, but should be up in about a week, I hope.  Crossbow is huge.  This is a monumental improvement to Solaris and continues to push the bar out of reach of its competitors.
</p>
<p>
Simply put, Crossbow redefines the nature of network virtualization.  To date, virtualization was limited to creating traditional "virtual interfaces" like so:
</p>
<pre>
root@quadra ~$ ifconfig e1000g1:1 plumb 10.0.0.50 netmask 255.255.255.0 up
root@quadra ~$ ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000 
e1000g1: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 2
        inet 10.0.0.18 netmask ffffff00 broadcast 10.0.0.255
        ether 0:1b:21:25:3e:7b 
e1000g1:1: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 2
        inet 10.0.0.50 netmask ffffff00 broadcast 10.0.0.255
</pre>
<p>
Creating virtual interfaces like this gets the job done but has a number of drawbacks, all based on the fact that it's not a <i>real</i> interface.  Stats are screwed up, you can't snoop the interface, you can't tune it, etc.
</p>
<p>
Crossbow changes all that.  Now we can create <i>Virtual NICs</i> (VNICs) which are, for all intents and purposes, real interfaces.  They have their own network stack and queues, they can be tuned, they can be snooped, they can be VLAN'ed, etc.  Anything you can do to a real interface you can do to a VNIC.
</p>
<p>
While VNICs are handy things to have in the global zone, they really shine when used with virtualization such as Solaris Containers (zones) or Xen guests, because we now can hand off interfaces that are fully controllable from within the virtual environment without having to dedicate a physical NIC to each one.  The result is virtualized environments that feel way more like real servers.
</p>
<p>
If you're not already familiar with the <i>dladm</i> command, it's time for you to get acquainted.  dladm is short for "Data Link Administration", and now complements <i>ifconfig</i>.  For some time now it's been used for managing WiFi, 802.3ad Link Aggregation ("teaming" or "trunking", depending on your pedigree), and more recently VLANs.  It's even replacing the old (and crappy) <i>ndd</i> with <i>dladm</i>'s "link properties"... a welcome improvement.
</p>
<p>
As of snv_105 several new options are available, namely sub-commands for creating VNICs and Etherstubs.  A VNIC is a virtual network interface with all the trimmings of a real network interface.  For the moment, it appears the max number of VNICs is 799, but that's not set in stone, and frankly if you need more than that you need to re-architect.  Etherstubs are in-software switches which can be used in concert with VNICs to create entirely virtualized in-software networks!  In short, a standard VNIC will be associated with a physical GLDv3 network adapter, but we can also create a VNIC associated with an Etherstub to keep anything from ever touching the wire.
</p>
<p>
Let's ponder this.  Why would you want a VNIC that uses a software switch (etherstub)?  Seems completely useless, right?  Not entirely.  On a traditional network you would create a DMZ with firewall and other goodies which routes to a private internal network... imagine that you can now do that all inside a single system!
</p>
<p>
Ok, so let's get cracking.  Once you have snv_105 installed, we'll create a VNIC associated with physical e1000g1, then an etherstub and 3 more VNICs that are internal using that etherstub:
</p>
<pre>
root@quadra ~$ dladm show-link
LINK        CLASS    MTU    STATE    OVER
e1000g1     phys     1500   up       --
e1000g2     phys     1500   down     --
e1000g0     phys     1500   unknown  --

root@quadra ~$ dladm create-vnic -l e1000g1 vnic0
root@quadra ~$ dladm create-etherstub etherstub0
root@quadra ~$ dladm create-vnic -l etherstub0 vnic1
root@quadra ~$ dladm create-vnic -l etherstub0 vnic2
root@quadra ~$ dladm create-vnic -l etherstub0 vnic3
root@quadra ~$ dladm show-link
LINK        CLASS    MTU    STATE    OVER
e1000g1     phys     1500   up       --
e1000g2     phys     1500   down     --
e1000g0     phys     1500   unknown  --
vnic0       vnic     1500   up       e1000g1
etherstub0  etherstub 9000  unknown  --
vnic1       vnic     9000   up       etherstub0
vnic2       vnic     9000   up       etherstub0
vnic3       vnic     9000   up       etherstub0
</pre>
<p>
So we have a variety of VNICs at our disposal.  We now treat these like regular interfaces, using ifconfig to plumb them and assign IPs:
</p>
<pre>
root@quadra ~$ ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000 
e1000g1: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 2
        inet 10.0.0.18 netmask ffffff00 broadcast 10.0.0.255
        ether 0:1b:21:25:3e:7b 

root@quadra ~$ ifconfig vnic0 plumb 10.0.0.19 up
root@quadra ~$ ifconfig vnic1 plumb 10.100.0.2 netmask 255.255.255.0 up
root@quadra ~$ ifconfig vnic2 plumb 10.100.0.3 netmask 255.255.255.0 up
root@quadra ~$ ifconfig vnic3 plumb 10.100.0.4 netmask 255.255.255.0 up

root@quadra ~$ ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000 
e1000g1: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 2
        inet 10.0.0.18 netmask ffffff00 broadcast 10.0.0.255
        ether 0:1b:21:25:3e:7b 
vnic0: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 7
        inet 10.0.0.19 netmask ff000000 broadcast 10.255.255.255
        ether 2:8:20:3a:70:5a 
vnic1: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 9000 index 8
        inet 10.100.0.2 netmask ffffff00 broadcast 10.100.0.255
        ether 2:8:20:f2:56:4d 
vnic2: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 9000 index 9
        inet 10.100.0.3 netmask ffffff00 broadcast 10.100.0.255
        ether 2:8:20:bc:b1:a1 
vnic3: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 9000 index 10
        inet 10.100.0.4 netmask ffffff00 broadcast 10.100.0.255
        ether 2:8:20:55:11:56
</pre>
<p>
Please notice that they all have individual MAC addresses!  There are several methods for how the MAC is chosen, but I won't go into them here.
</p>
<p>
If you are using Solaris Containers these VNICs would be given to a Zone as an "IP-Instance" (exclusive mode), a feature which was added some time ago but until now was only usable by dedicating a physical interface.  The same should apply to Xen or other virtualization tools.
</p>
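<p>
As a minimal sketch of that hand-off, assuming a zone named <i>webzone</i> already exists (the zone name is hypothetical, purely for illustration) and that vnic1 is the VNIC created earlier, an exclusive IP-Instance configuration might look something like this:
</p>
<pre>
root@quadra ~$ zonecfg -z webzone
zonecfg:webzone> set ip-type=exclusive
zonecfg:webzone> add net
zonecfg:webzone:net> set physical=vnic1
zonecfg:webzone:net> end
zonecfg:webzone> verify
zonecfg:webzone> commit
zonecfg:webzone> exit
</pre>
<p>
In exclusive mode no address is set in zonecfg.  The zone plumbs and configures vnic1 itself, from the inside, just as it would a physical NIC.
</p>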
<p>
Finally, in our whirlwind tour of this amazing technology, let's look at my favorite feature of Crossbow.
</p>
<p>
Crossbow is both Network Virtualization (we looked at that above) and Network Resource Control.  With Crossbow we have a real network resource control capability that is free from the terror that is IPQoS.
</p>
<p>
There are three types of resource controls at present: max bandwidth (rate limiting), priority (relative to other traffic), and CPUs.  Please note that these controls are not cumulative, but rather apply to any given point in time.  These controls can be applied either to an entire link (NIC or VNIC) or alternatively to a particular network flow.
</p>
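<p>
For the whole-link case, a minimal sketch would set the control as a link property through <i>dladm</i>.  This assumes the vnic1 created earlier, and the value of 100 is picked purely for illustration (a bare number is taken as Mbps):
</p>
<pre>
root@quadra ~$ dladm set-linkprop -p maxbw=100 vnic1
root@quadra ~$ dladm show-linkprop -p maxbw vnic1
</pre>
<p>
The second command simply reads the property back so you can confirm what the link is actually capped at.
</p>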
<p>
Let me pause here.  If you're not familiar with a "network flow", it is a defined collection of network communication.  For instance, a flow might refer to all HTTP (port 80) traffic to a given IP address, or perhaps all TCP traffic, or perhaps a combination of FTP, SMTP, and HTTP ports.  If you've worked with firewall rules you're familiar with the concept; a flow simply gives us a way to apply some action to a specific stream of traffic.
</p>
<p>
Crossbow adds the new command <i>flowadm</i> to define and control network flows.  Here is an example:
</p>
<pre>
root@quadra ~$ flowadm add-flow -l vnic0 -a transport=tcp,local_port=80 httpflow
root@quadra ~$ flowadm add-flow -l vnic0 -a transport=tcp,local_port=443 httpsflow
root@quadra ~$ flowadm show-flow
FLOW        LINK        IP ADDR                        PROTO  PORT    DSFLD
httpflow    vnic0       --                             tcp    80      --
httpsflow   vnic0       --                             tcp    443     --
</pre>
<p>
<i>flowadm</i> relies on <b>attributes</b> that describe a flow, and <b>properties</b> which assign some resource control.  We'll add bandwidth control to the flows above by modifying the "maxbw" property:
</p>
<pre>
root@quadra ~$ flowadm show-flowprop
FLOW         PROPERTY        VALUE          DEFAULT        POSSIBLE
httpflow     maxbw              50          --             50M 
httpflow     priority        --             --             
httpsflow    maxbw              80          --             80M 
httpsflow    priority        --             --      
</pre>
<p>
Here the maxbw is specified in Mbps.  Docs show that percentages, Kbps, etc are supported, but they don't seem to work right now.
</p>
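<p>
The listing above only shows the end result; the commands that set the values are not included.  Assuming the <i>set-flowprop</i> sub-command of flowadm, and using bare numbers since those are read as Mbps, they would be along these lines:
</p>
<pre>
root@quadra ~$ flowadm set-flowprop -p maxbw=50 httpflow
root@quadra ~$ flowadm set-flowprop -p maxbw=80 httpsflow
</pre>
<p>
To take a limit off again, <i>flowadm reset-flowprop -p maxbw httpflow</i> should return the property to its default.
</p>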
<p>
maxbw will rate limit to the specified throughput; priority can be set to "low", "normal", "high" or "rt" (real time).  Using these controls carefully you can partition off bandwidth pretty nicely.
</p>
<p>
In addition to all this, extended accounting now covers links and flows as well, but I'll save that for another day.
</p>
<p>
Congrats to everyone on the Crossbow team.  This is a major achievement and an amazing technological advance!</p> ]]></description>
			<guid isPermaLink="false">999@http://www.cuddletech.com/</guid>
			<category>OpenSolaris</category>
			<pubDate>Mon, 29 Dec 2008 09:20:00 -0000</pubDate>
		</item>
		
		
		
		<item>
			<title>Merry Christmas</title>
			<link>http://www.cuddletech.com/blog/pivot/entry.php?id=998</link>
			<comments>http://www.cuddletech.com/blog/pivot/entry.php?id=998#comm</comments>
                        <description><![CDATA[ <p>
To all my fellow administrators, a very merry Christmas to you and yours.  
</p>
<p><i><a href="http://www.youtube.com/watch?v=Uvkp2QY4qEI">'Carol of the Bells' by The Bird and the Bee</a></i></p>
<object width="425" height="344"><param name="movie" value="http://www.youtube.com/v/Uvkp2QY4qEI&hl=en&fs=1"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/Uvkp2QY4qEI&hl=en&fs=1" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"></embed></object>
<br  /><br />
<img src="http://www.thegodmobile.info/images/Mary-Jesus1.jpg" width="300">
<p>
<b>Luke Chapter 2:</b>
</p>
<i>
<p>
In those days a decree went out from Caesar Augustus that all the world should be registered.
</p>
<p>
2 This was the first registration when Quirinius was governor of Syria.
</p>
<p>3 And all went to be registered, each to his own town. 
</p>
<p>4 And Joseph also went up from Galilee, from the town of Nazareth, to Judea, to the city of David, 
which is called Bethlehem, because he was of the house and lineage of David,
</p>
<p>
5 to be registered with Mary, his betrothed, who was with child.  
</p>
<p>
6 And while they were there, the time came for her to give birth.  
</p>
<p>
7 And she gave birth to her firstborn son and wrapped him in swaddling cloths and laid him 
in a manger, because there was no place for them in the inn.
</p>
</i>
<p>
To my fellow brothers and sisters in Christ, let us reflect on our Lord and the blessings He has poured out on us and our families.  Let us also be reminded of His birth and fullness of His coming... God, second person of the trinity, become flesh and blood to reconcile us to Him.  He pooped a diaper, got hungry, swung  a hammer in the hot sun, was tempted, ministered,  died for our sin, and rose again.  I pray that He give you all a restful Christmas season, and then return with joy to the work He has put before us.</p> ]]></description>
			<guid isPermaLink="false">998@http://www.cuddletech.com/</guid>
			<category>cuddletech</category>
			<pubDate>Thu, 25 Dec 2008 11:22:00 -0000</pubDate>
		</item>
		
		
		
		<item>
			<title>OpenSolaris 2008.11 Properly Released</title>
			<link>http://www.cuddletech.com/blog/pivot/entry.php?id=997</link>
			<comments>http://www.cuddletech.com/blog/pivot/entry.php?id=997#comm</comments>
                        <description><![CDATA[ <p>
<a href="http://opensolaris.com">OpenSolaris 2008.11</a> is now fully and properly released.  At <a href="http://opensolaris.com">opensolaris.com</a> you'll find several video interviews featuring Sun's all-star cast, with names like John Fowler, Tim Cramer, David Comey, and Dr. Stephen Hahn.  There is a demo of <b>Time Slider</b>, Sun's Time Machine-like functionality added to GNOME's file browser, which leverages ZFS snapshot navigation in an easy-to-use graphical way.  There are also presentations with both Intel and AMD on leveraging their technologies both present and forthcoming.
</p>
<p>
If you haven't downloaded the new release yet, make it your weekend project.  It runs as a LiveCD so grab it and play.</p> ]]></description>
			<guid isPermaLink="false">997@http://www.cuddletech.com/</guid>
			<category>OpenSolaris</category>
			<pubDate>Wed, 10 Dec 2008 20:06:00 -0000</pubDate>
		</item>
		
		
		
		<item>
			<title>Ode to Dave</title>
			<link>http://www.cuddletech.com/blog/pivot/entry.php?id=996</link>
			<comments>http://www.cuddletech.com/blog/pivot/entry.php?id=996#comm</comments>
                        <description><![CDATA[ <p>
David Stewart, in a super-snazzy suit no less, at Tokyo Tech Days 2008, photo by Jim Gris.
</p>
<img src="http://farm4.static.flickr.com/3055/3079996814_dfcb01f778.jpg?v=0">
<p>
Is there a reason for this post?  Nope... Intel Dave is just awesome.  Anyone that can make me not hate Intel has <i>got</i> to have some kind of super powers.</p> ]]></description>
			<guid isPermaLink="false">996@http://www.cuddletech.com/</guid>
			<category>OpenSolaris</category>
			<pubDate>Fri, 05 Dec 2008 01:09:00 -0000</pubDate>
		</item>
		
		
		
		<item>
			<title>OpenSolaris 2008.11 Released</title>
			<link>http://www.cuddletech.com/blog/pivot/entry.php?id=995</link>
			<comments>http://www.cuddletech.com/blog/pivot/entry.php?id=995#comm</comments>
                        <description><![CDATA[ <p>
The crew was so busy getting everything ready for OpenSolaris 2008.11 that they forgot to tell anyone they released it...... so, guess what: <a href="http://www.opensolaris.com/get/index.jsp">Download OpenSolaris 2008.11 Now!</a>
</p>
<p>
For a great run through of new features, especially from a desktop perspective, please watch Roman's excellent <a href="http://webcast-west.sun.com/interactive/09B12437/index.html">Whats New in OpenSolaris 2008.11 Screencast</a>.
</p>
<p>
I know the docs team spent a lot of time on the docs kit for OpenSolaris 2008.11, but I can't seem to find it... so, think of it as an easter egg, search and be amazed.  I'm sure the docs are within the ISO itself.
</p>
<p>
If you're interested in graffiti or modern art, or whatever, behold the new OpenSolaris shirt:
</p>
<img src="http://www.opensolaris.org/os/community/advocacy/mktgdownloads/tshirts/tshirt-design.jpg">
<p>
... judge for yourself.</p>
<p>
<b>UPDATE:</b>  I'm told by several sources that a big launch/announcement event is scheduled for next week, complete with blog blitz, etc.  I therefore infer that they wanted to get the release out as close to Nov as possible and thus released early and quiet until the marketing hubbub could occur.</p> ]]></description>
			<guid isPermaLink="false">995@http://www.cuddletech.com/</guid>
			<category>OpenSolaris</category>
			<pubDate>Wed, 03 Dec 2008 18:59:00 -0000</pubDate>
		</item>
		
		
		
		<item>
			<title>SA Pro: Tom Limoncelli Interview</title>
			<link>http://www.cuddletech.com/blog/pivot/entry.php?id=994</link>
			<comments>http://www.cuddletech.com/blog/pivot/entry.php?id=994#comm</comments>
                        <description><![CDATA[ <p>
The 2nd <i>SA Pro</i> podcast has arrived...
</p>
<a href="http://cuddletech.com/sapro/"><img src="http://cuddletech.com/sapro/SApro-Cover001.png" border="0"></a>
<p>
Tom Limoncelli, of <i>Time Management for System Administrators</i> and <i>The Practice of System and Network Administration</i> fame, and I talk about his books, experience at Bell Labs, and time management in general in this 60 minute interview.
</p>
<p>
Available in MP3, AAC, and OggVorbis, <a href="http://cuddletech.com/sapro/">Download SA Pro here</a>, or directly:
</p>
<ul>
<li><a href="http://cuddletech.com/sapro/SApro-Episode001.mp3">Download in MP3 Format</a>
<li><a href="http://cuddletech.com/sapro/SApro-Episode001.m4a">Download in AAC Format</a>
<li><a href="http://cuddletech.com/sapro/SApro-Episode001.ogg">Download in OggVorbis Format</a>
</ul>
<p>
<b>Special Bonus:</b> For my faithful readers, here is the <a href="http://cuddletech.com/sapro/SApro-Episode02-Intro.mp3">original cut intro</a>.</p> ]]></description>
			<guid isPermaLink="false">994@http://www.cuddletech.com/</guid>
			<category>cuddletech</category>
			<pubDate>Mon, 01 Dec 2008 05:42:00 -0000</pubDate>
		</item>
		
		
		
		<item>
			<title>Thumpers and SMART: When You Suspect A Failed Disk</title>
			<link>http://www.cuddletech.com/blog/pivot/entry.php?id=993</link>
			<comments>http://www.cuddletech.com/blog/pivot/entry.php?id=993#comm</comments>
                        <description><![CDATA[ <p>
While not an uncommon problem for storage arrays, Thumpers (Solaris/ZFS) in particular are susceptible to "mostly dead" disk issues.  This is a situation in which a disk has not failed but IO performance or log messages give you that gut feeling that a drive needs to be swapped out.  One would think that Solaris FMA (Fault Management Architecture) should detect these and handle them, but until the Fishworks team made a series of putbacks to the Nevada 90's builds it almost never did.  So when our SA gut says "swap it" but Solaris doesn't seem to agree, what do we do?
</p>
<p>
Your drives aren't as stupid as they look, thanks to SMART (<a href="http://en.wikipedia.org/wiki/Self-Monitoring,_Analysis,_and_Reporting_Technology">Self-Monitoring, Analysis, and Reporting Technology</a>).  The state of SMART for SATA drives on Solaris is pretty crappy (improved via Fishworks work, but that's a different entry).  Thankfully the "Sun Fire X4500 Software" CD includes an amazing utility named "hd", provided by the SUNWhd package.  This utility can do a wide variety of things, but most importantly it a) can output a logical-to-physical drive map (helps you know which disk is which), and b) can query SMART data of the drives.
</p>
<p>
If you have a Thumper and have not installed SUNWhd, here is the example that will make you <a href="http://www.sun.com/servers/x64/x4500/downloads.jsp">download it now</a>:
</p>
<pre>
---------------------SunFireX4500------Rear----------------------------

36:   37:   38:   39:   40:   41:   42:   43:   44:   45:   46:   47:
c5t3  c5t7  c4t3  c4t7  c7t3  c7t7  c6t3  c6t7  c1t3  c1t7  c0t3  c0t7
^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++
24:   25:   26:   27:   28:   29:   30:   31:   32:   33:   34:   35:
c5t2  c5t6  c4t2  c4t6  c7t2  c7t6  c6t2  c6t6  c1t2  c1t6  c0t2  c0t6
^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++
12:   13:   14:   15:   16:   17:   18:   19:   20:   21:   22:   23:
c5t1  c5t5  c4t1  c4t5  c7t1  c7t5  c6t1  c6t5  c1t1  c1t5  c0t1  c0t5
^++   ^++   ^++   ^++   ^++   ^++   ^--   ^--   ^++   ^--   ^++   ^++
 0:    1:    2:    3:    4:    5:    6:    7:    8:    9:   10:   11:
c5t0  c5t4  c4t0  c4t4  c7t0  c7t4  c6t0  c6t4  c1t0  c1t4  c0t0  c0t4
^b+   ^b+   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++   ^++
-------*-----------*-SunFireX4500--*---Front-----*-----------*----------
</pre>
<p>
For that alone, it's worth it.  But wait... there's more!
</p>
<p>
With the -r or -R flags, the same <i>hd</i> utility can pull all the SMART data off all the drives.  The -R output gives you a single line per disk for easy browsing:
</p>
<pre>
$ /opt/SUNWhd/hd/bin/hd -R
                1  2    3           4 5 7 8  9    10 12         [ temp 194 ] 196...   <--- Key
 0 c5t0         0 500  55877894808 30 0 0 31 20135 0 30 673 673  26  20  35 0 0 0 0
 1 c5t4         0 655  55877304979 30 0 0 33 20134 0 30 65 65  26  21  36 0 0 0 0
 2 c4t0         0 824  55878746782 29 0 0 32 20134 0 29 70 70  25  21  35 0 0 0 0
 3 c4t4         0 662  55877763735 29 1 0 29 20134 0 29 68 68  26  21  36 1 0 0 0
 4 c7t0         0 260  55876977290 30 0 0 32 20135 0 30 71 71  26  21  36 0 0 0 0
 5 c7t4         0 1201  55877436058 30 0 0 32 20135 0 30 71 71  26  21  36 0 1 0 0
 6 c6t0         0 758  55878484644 30 0 0 32 20135 0 30 55 55  27  22  36 0 0 0 0
 7 c6t4         0 950  55877239437 30 23 0 31 20134 0 30 72 72  26  21  36 24 0 0 0
 8 c1t0         0 1442  55876780678 29 5 0 33 20134 0 29 68 68  27  21  36 5 1 0 0
 9 c1t4         0 1616  55877763727 29 27 0 33 20134 0 29 67 67  26  20  36 29 18 4 0
10 c0t0         0 955  55876911756 29 0 0 32 20134 0 29 68 68  27  20  36 0 0 0 0
11 c0t4         0 1428  55877567125 29 6 0 31 20134 0 29 63 63  28  21  37 6 0 0 0
</pre>
<p>
Please note that the "Key" line at the top of the output is my addition.  We'll get back to that.
</p>
<p>
To better understand this output, let's look at the more verbose -r output for just a single disk.  Let's first look at a healthy disk:
</p>
<pre>
15 c4t5
======
Revision: 16
Offline status 130
Selftest status 0
Seconds to collect 10419
Time in minutes to run short selftest 1
Time in minutes to run extended selftest 174
Offline capability 91
SMART capability 3
Error logging capability 1
Checksum 0x8b
Identification                     Status Current Worst         Raw data
  1 Raw read error rate            0xb        100   100                0
  2 Throughput performance         0x5        110   110              789
  3 Spin up time                   0x7        104   104      55878484641
  4 Start/Stop count               0x12       100   100               29
  5 Reallocated sector count       0x33       100   100                0
  7 Seek error rate                0xb        100   100                0
  8 Seek time performance          0x5        136   136               31
  9 Power on hours count           0x12        98    98            20134
 10 Spin retry count               0x13       100   100                0
 12 Device power cycle count       0x32       100   100               29
192 Power off retract count        0x32       100   100               71
193 Load cycle count               0x12       100   100               71
194 Temperature                    0x2        189   189  29/ 23/ 38 (degrees C cur/min/max)
196 Reallocation event count       0x32       100   100                0
197 Current pending sector count   0x22       100   100                0
198 Scan uncorrected sector count  0x8        100   100                0
199 Ultra DMA CRC error count      0xa        200   253                0
</pre>
<p>
You can find explanations of these <a href="http://www.siguardian.com/products/siguardian/on_line_help/s_m_a_r_t_attribute_meaning.html">here</a> and <a href="http://smartlinux.sourceforge.net/smart/attributes.php">there</a>, and even the <a href="http://www.t13.org/Documents/UploadedDocuments/docs2005/e05148r0-ACS-SMARTAttributesAnnex.pdf">Official T13 SMART Attributes Annex</a> (PDF)... but here is my short reference for the most important values to watch:
</p>
<ul>
<li>1 Raw read error rate: Count of non-corrected read errors, i.e. how often errors appear while reading raw data from the disk. More errors (i.e. a lower attribute value) mean the disk surface is in worse condition.
<li>2 Throughput performance: Overall (general) throughput performance of HDD
<li>5 Reallocated sector count: Quantity of remapped sectors 
<li>192 Power off retract count: Number of recorded power-off cycles (Fujitsu: Emergency Retract Cycle Count)
<li>193 Load cycle count: Number of cycles into Landing Zone position 
<li>196 Reallocation event count: Quantity of remapping operations
<li>197 Current pending sector count: Current quantity of unstable sectors (waiting for remapping)
<li>198 Scan uncorrected sector count: Quantity of uncorrected errors (This is perhaps the single best value to watch.) 
</ul>
<p>
In my experience thus far, #1 and #5 are important to watch and a good indication that things are heading south, but non-zero values at reasonable levels are not unusual.  The values to <i>really</i> watch are 196, 197 and 198.  If any of these values is non-zero, things are bad.  Chief of all, 198.  If there was any single value that would cause me to "swap to be on the safe side", it would be 198.
</p>
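<p>
That rule of thumb is easy to script against the single-line -R output.  What follows is only a sketch, and the column positions are an assumption read off the sample output above rather than taken from any SUNWhd documentation: the last four numbers on each line appear to be attributes 196, 197, 198 and 199, which makes them fields 18 through 21.  The function name <i>flag_disks</i> is made up for this example.
</p>
<pre>
# Print every disk whose SMART attributes 196, 197 or 198 are non-zero.
# Assumed "hd -R" layout (inferred from the sample output, not from docs):
#   $1 = slot, $2 = device, $18 = attr 196, $19 = attr 197, $20 = attr 198
flag_disks() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 21 && ($18 + $19 + $20) > 0 {
        printf "%s realloc_events=%d pending=%d scan_uncorrected=%d\n", $2, $18, $19, $20
    }'
}

# Usage: /opt/SUNWhd/hd/bin/hd -R | flag_disks
</pre>
<p>
Run against the twelve disks listed earlier, this would print c4t4, c7t4, c6t4, c1t0, c1t4 and c0t4.  Treat the result as a list of drives to look at more closely with -r, not as an automatic eject list.
</p>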
<p>
Here is an example (-r) of a really jacked up drive:
</p>
<pre>
22 c0t1
======
Revision: 16
Offline status 132
Selftest status 0
Seconds to collect 10419
Time in minutes to run short selftest 1
Time in minutes to run extended selftest 174
Offline capability 91
SMART capability 3
Error logging capability 1
Checksum 0xf1
Identification                     Status Current Worst         Raw data
  1 Raw read error rate            0xb         53    53          5133961
  2 Throughput performance         0x5        109   109              829
  3 Spin up time                   0x7        104   104      55878353565
  4 Start/Stop count               0x12       100   100               29
  5 Reallocated sector count       0x33         1     1                8
  7 Seek error rate                0xb        100   100                0
  8 Seek time performance          0x5        136   136               31
  9 Power on hours count           0x12        98    98            20134
 10 Spin retry count               0x13       100   100                0
 12 Device power cycle count       0x32       100   100               29
192 Power off retract count        0x32       100   100               65
193 Load cycle count               0x12       100   100               65
194 Temperature                    0x2        183   183  30/ 22/ 38 (degrees C cur/min/max)
196 Reallocation event count       0x32       100   100                8
197 Current pending sector count   0x22         1     1             1891
198 Scan uncorrected sector count  0x8          1     1            56254
199 Ultra DMA CRC error count      0xa        200   253                0
</pre>
<p>
This drive might as well have been run over by a Mack truck.  56,254 scanned uncorrected sectors?  Eject... immediately.
</p>
<p>
If you're a savvy storage admin, your keen mind is probably telling you to go and review Google's FAST paper: <a href="http://labs.google.com/papers/disk_failures.pdf">Failure Trends in a Large Disk Drive Population</a>.  This paper used Google's massive deployment to examine correlations between disk failures and, in particular, SMART data that might have predicted the failure.  It's important to note that Google, wisely, considers a "failure" as any event in which an admin swaps the drive (errors, dead, whatever).
</p>
<p>
If you haven't read the paper, do it... now.  But here are a couple of choice quotes relating to SMART:
</p>
<ul>
<li><b>Scan Errors</b>: "After the first scan error, drives are 39 times more likely to fail within 60 days"
<li><b>Reallocation Counts</b>: "After the first reallocation, drives are over 14 times more likely to fail within 60 days"
<li><b>Offline Reallocations</b>: "After the first offline reallocation, drives have over 21 times higher chances of failure within 60 days"
<li><b>Probational Counts</b>: "after the first event, drives are 16 times more likely to fail within 60 days"
<li><b>Conclusions</b>: "Despite those strong correlations, we find that failure prediction models based on SMART parameters alone are likely to be severely limited in their prediction accuracy, given that a large fraction of our failed drives have shown no SMART error signals whatsoever."
</ul>
<p>
While many people have read this paper and simply walked away saying "Yup, SMART is useless, yet again" I want to disagree.  When you combine Google's research with your SA instinct, we arrive at a good balance.  To put it another way, I don't think  you should poll SMART every 5 minutes and swap a drive because you get a non-zero value, but when you feel like there is something wrong with disks in your system and just don't have proof, SMART is the answer.</p> ]]></description>
			<guid isPermaLink="false">993@http://www.cuddletech.com/</guid>
			<category>cuddletech</category>
			<pubDate>Fri, 28 Nov 2008 07:27:00 -0000</pubDate>
		</item>
		
		
		
	</channel>
</rss>
