The Cuddletech Veritas Volume Manager Series

             Sun Midrange Servers: An Overview 

                      by: B. Rockwood
                    benr@cuddletech.com




Introduction
------------

I wanted to create a tutorial about the Sun Midrange line of servers, but
found it difficult to write out.  When I did write it out, it was no better
than the manuals you can get from Sun.  So many of the tips and tricks are
best told through stories and mistakes, and that makes for really dull
writing (and reading!).  So I opted to use audio instead; MP3, to be more
accurate.  I have created two one-hour audio commentaries on this
server line, which cover every aspect of the machines I could fit in.
If you've worked with these machines you'll find it an interesting
review, and if you haven't it will complement the written docs very well,
while providing a good bent toward the systems' storage and I/O capabilities.
All in all, I think it's something most people will get something from.


Below you will find the outline for the MP3s.  You can use it to help guide you
through the tutorials.

Part 1 [56:40 - 10.2M]: Introduction, FRUs, Parts, etc.
Part 2 [52:56 - 9.5M]: Patching, Firmware, DR, AP

The Outline
Audio Commentary for Sun Midrange Servers:
-----------------------------------------

Warnings and Disclaimers
Lame Story
What is the Midrange?
What are the other classes of systems?
	- Workstation/Desktop
	- Workgroup (220R, 250, SunFire 280R, 440R, 450)
			( -R = Rack)
	- Midrange
	- High End
Specs of the Midrange:
	3000/3500 = 5 Slot, 8 Proc, w/Internal FC Disks	
	4000/4500 = 8 Slot, 14 Proc
	5000/5500 = Same as 4000 (Yes! They exist! http://www.sun.com/servers/midrange/e5500/index.html)
	6000/6500 = 16 Slot, 30 Proc
Price of the Midrange: 
	 [ w/400MHz CPU ]
	3500: 48k (2 Proc/1 CPU/1 IO)
		166k (6 Proc/3 CPU/2 IO)
	4500: 132k (4 Proc/2 CPU/2 IO) 		Small Config
		314k (12 Proc/6 CPU/2 IO)	Large Config
	6500: 291k (8 Proc/4 CPU/2 IO)
		665k (24 Proc/12 CPU/4 IO)
Comparison: 
	StarFire: 1.151M (20 Proc, 10G Mem)
			1.7M (36 Proc, 18G Mem)
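
To make the price comparison concrete, here is a minimal sketch that works out the list price per processor for the configurations above. The prices and processor counts are copied from this outline; the function and variable names are just illustrative, and only the first StarFire config is used.

```python
# Prices in thousands of US dollars, copied from the outline above (400MHz CPUs).
# Each entry is (model, processors, price in thousands).
CONFIGS = [
    ("3500", 2, 48),
    ("3500", 6, 166),
    ("4500", 4, 132),
    ("4500", 12, 314),
    ("6500", 8, 291),
    ("6500", 24, 665),
    ("StarFire", 20, 1151),
]


def cost_per_proc(price_k, procs):
    """Return the list price per processor, in thousands of dollars."""
    return price_k / procs


for model, procs, price_k in CONFIGS:
    print(f"{model:9s} {procs:2d} procs  {cost_per_proc(price_k, procs):6.1f}k per proc")
```

The per-processor figure is only a rough yardstick, since each price also covers the chassis, I/O boards and memory in that config.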
		
Components:
	Chassis
	Peripheral Power Supply (PPS)
	PCMs
	Clock Board
	CPU Board
	IO Board 	
	Internal Dev (CD/Tape)
	Key

------------------------------------------------------------------
Key Components:
	Clock Board:
		Keyboard/Mouse Connectors
		Serial A/B Connectors
		Reset Buttons
		PROM
		Central TOD (Time of Day) and Proc Clock
		Board IS NOT hot pluggable
	CPU Board:
		2 Procs/Board
		2G Max/Board
		CPUs are torqued
		Mem has interleaved slots
		Mixing Mem is bad.
		NEW: 2G Mem Modules for Midrange! 16K/SIMM.
	IO Board:
		3 Types of Board:
			- SBus(+) I/O Board 
			- Graphics(+) I/O Board w/FC
			- PCI I/O Board
		Official (Undocumented) Board Names:
			- I/O Graphics Board = Type 1
			- I/O Graphics+ Board = Type 2
			- Dual PCI I/O Board = Type 3
			- Dual SBus I/O Board = Type 4
			- Dual SBus+ I/O Board = Type 5
		Standard Services:
			SBus and Graphics:
				-MII Connector (external tx)
				-FastEther Connector
				-Fast/Wide SCSI-2 (20MB/s)
				-2x GBIC FC-AL Slots
				- +SBus slots (and/or Graphics)
			PCI Board:
				-Fast Ether
				-Fast/Wide SCSI-2 (20MB/s) 
		Blanks and Load Boards 35/45 vs 6500
		Buses/SYSIO ASICs (Appendix B)
			-Slot 0, SCSI, FastEther = SBus1 = SYSIO B
			-Slot 1, 2, FC = SBus0 = SYSIO A 
		Note on FC, GBIC 0 is ALWAYS on the Right
		SCSI Notes: First SCSI must be terminated. Internal
			devs are on first SCSI.
	Disk Boards:
		Cabled to an external SCSI
		
	Power Supplies:
		Monitored via prtdiag:
			PCMs = Supply 0 - X
			PPS = Peripheral Power Supply
			AC Power = Plug
		PCM powers each Pair of Boards
		PCMs back each other up.
		PPS powers internal devs (CD/Tape/Key Fan, etc)
		PPS and PCMs share power channel on backplane for redundancy.
		PPS and PCMs are hot swappable.
			- Check Manual Matrix to see if yours are hot swappable.
			 	(Page 7-9)

	Skins:
		Side Skins are removed by just sliding 'em
		Front bezel is removed by pushing in on bottom sides, and
			rotating up.
	
	Board LEDs:
		Normal Operation: Left Solid, Center Out, Right Flashing
			Left = DC Power
			Center = Fault
			Right = Running/Status
		Check Manual Matrix Page 9-4 for LEDs
	
	Key Switch Position:
		From Left, Counterclockwise:
			Standby: Off
			On: Normal Start
			Diag: Normal Start, but uses kernel specified
				by OBP var "diag-file"
				Corresponds to OBP var "diag-switch"
			(See Sun InfoDoc 21434)
			Secure/Locked: No Flashes, No Breaks
		Key Position CAN change while live.

	
----------------------------------------------------------------
Diags Tools:
	
	prtdiag: (/usr/platform/sun4u/sbin)
		For Help See Man Page or Sun InfoDoc 23479
		Monitors all systems for faults	
	
	OBP Vars: diag-level OBP var
		- "min" little POST output, 5-10 minute boots
		- "max" lots o POST output, 10-20 minute boots

	SunVTS: Performs stress tests and validation
		Should be run on ALL new systems, for 48 hours for
			break-in
		Most VARs will do this for you.

	SyMon/Sun Management Center (ie:Sun "MC")

	Processor Control: psrinfo -v for info and status.
			   psradm to disable/enable procs.	

	prtconf: Shows system configuration	
	
	luxadm: Shows FC-AL setup
	
----------------------------------------------------------------
Patching and Firmware:
	
	Patch ID: 103346 (current rev -27)	
	
	Flash-Prom Updates mod the following:
		- OBP (for CPU Boards)
		- POST (for CPU Boards)
		- FCode (for I/O Boards)
		- iPOST (for I/O Boards)
	
	All Midrange systems share the same flash-prom patch.
	When flashing, all NVRAM settings (OBP Vars) "can" be lost.

	OBP/POST/FCode/iPOST revs can be checked w/"prtdiag -v"	

	Quick checks can be done by checking the last number in the
		3-part, dot-separated version number.  The last 2 digits
		correspond to the patch rev.  (ie: OBP 3.2.26, POST
		3.9.26 means last patch update was 103346-26)
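
The quick check above can be sketched as a small helper. This is only an illustration of the rule as stated in this outline (last field of the version number = patch rev of 103346); the function names are made up, and raising an error when OBP and POST disagree is my own assumption about how you would want to handle that case.

```python
PATCH_ID = "103346"  # flash-PROM patch ID given in this outline


def patch_rev(version):
    """Return the last dot-separated field of a firmware version string."""
    parts = version.strip().split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected a version like 3.2.26, got {version!r}")
    return int(parts[2])


def last_patch(obp_version, post_version):
    """Map OBP and POST versions to a patch string such as 103346-26."""
    revs = {patch_rev(obp_version), patch_rev(post_version)}
    if len(revs) != 1:
        # Assumption: a mismatch means the boards need a closer look.
        raise ValueError("OBP and POST revisions disagree; check each board")
    return f"{PATCH_ID}-{revs.pop():02d}"


print(last_patch("3.2.26", "3.9.26"))  # prints 103346-26
```

The version strings themselves would be read by eye from "prtdiag -v"; the sketch does not try to parse that output.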

	Process: 	Run "flash-update-xx"	
			Check revs from output
			Let prog install new firmware, watch process:
				Erase, Verify, Program, Verify... next
			POWER CYCLE (!!)
			Check/Re-edit PROM
	
	Different Board PROM/Firmware versions can be brought up-to-date
		from OBP with "update-proms" command.

	On this note: Common OBP error "Clock TOD does not match TOD
		on any IO boards"
		- Fix this problem from OBP via the command:
		 	"copy-clock-tod-to-io-boards"
		(for more info check SunSolve SRDB ID: 14006)
			
-----------------------------------------------------------------
HotSwap/HA:
	
	Dynamic Reconfiguration (DR):
		Doc'ed in: Solaris X Hardware Answerbook
		Keeps getting better each OS....
			Dependent on OS/Kernel NOT Hardware/Patch
		Installed by default (generally)
		Controlled via "cfgadm"
		 - Check manpage
		cfgadm run with no args displays board
			status.
		For removability check w/: cfgadm -v -l
		 ("non-detachable" means it can't be removed)
	 	cfgadm break down:
			Receptacle: Slot.  Inserted or not? (physical)
			Occupant: Configured or Un-? (logical)
		3 Part Process: Insertion, Connection and Configuration
		To Add a Board:
			DR Must be enabled
			cfgadm -v -c insert sysctlXX:slotXX
			 Inserts the board
			cfgadm -v -c connect sysctlXX:slotXX
			 Connects board/Powers Up
			cfgadm -v -c configure sysctlXX:slotXX
			 Adds board to system
			drvconfig;devlinks;ports;tapes;disks
			 Adds new devices and manages char/block devs.
	 	To Remove a Board:
			DR Must Be enabled
			cfgadm -v -c unconfigure sysctlXX:slotXX
			 Unconfigures the board, removes from OS
			cfgadm -v -c disconnect sysctlXX:slotXX
			 Powers board down.
			cfgadm -v -c remove sysctlXX:slotXX
			Now you can remove the board.
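
The add and remove sequences above can be sketched as a helper that just builds the command list in the right order. It only prints the commands and does not run anything. "sysctlXX:slotXX" is the same placeholder used in the outline, and the function name is hypothetical.

```python
# Steps and their order are taken from the outline above.
ADD_STEPS = ("insert", "connect", "configure")
REMOVE_STEPS = ("unconfigure", "disconnect", "remove")
POST_ADD = "drvconfig;devlinks;ports;tapes;disks"  # rebuild device entries


def dr_commands(action, ap_id):
    """Return the shell commands, in order, for adding or removing a board."""
    if action == "add":
        steps, tail = ADD_STEPS, [POST_ADD]
    elif action == "remove":
        steps, tail = REMOVE_STEPS, []
    else:
        raise ValueError("action must be 'add' or 'remove'")
    return [f"cfgadm -v -c {step} {ap_id}" for step in steps] + tail


for cmd in dr_commands("add", "sysctlXX:slotXX"):
    print(cmd)
```

Note that removal is the add sequence reversed: the logical step (unconfigure) comes first and the physical step (remove) last.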
		Requirement Page found at:
			sunsolve2.sun.com/sunsolve/enterprise-dr/
		Requirements to make DR work:
			- /etc/system file mods: (Solaris 2.6)
				(For IO Boards:)
				set soc:soc_enable_detach_suspend=1
				set pln:pln_enable_detach_suspend=1
				set socal:socal_enable_suspend=1
			- /etc/system file mods: (Solaris 7 and 8)
				(For IO Boards:)
				set soc:soc_enable_detach_suspend=1
				set pln:pln_enable_detach_suspend=1
				(For CPU Boards:)
				set kernel_cage_enable=1
			- For 7 and 8 OBP EPROM var changes:
				memory-interleave=min
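
As a rough aid for the /etc/system requirements above, here is a sketch that reports which "set" lines are missing from a given file's text. The settings per release are copied from this outline; the function name and sample text are hypothetical, and the match is a plain exact-line comparison (it assumes the lines are written exactly as shown). Lines starting with "*" are skipped as /etc/system comments.

```python
IO_COMMON = [
    "set soc:soc_enable_detach_suspend=1",
    "set pln:pln_enable_detach_suspend=1",
]

# Required /etc/system lines per Solaris release, as listed in this outline.
SETTINGS = {
    "2.6": IO_COMMON + ["set socal:socal_enable_suspend=1"],
    "7": IO_COMMON + ["set kernel_cage_enable=1"],
    "8": IO_COMMON + ["set kernel_cage_enable=1"],
}


def missing_settings(release, etc_system_text):
    """Return required 'set' lines not present in the given /etc/system text."""
    present = {
        line.strip()
        for line in etc_system_text.splitlines()
        if not line.lstrip().startswith("*")  # skip comment lines
    }
    return [s for s in SETTINGS[release] if s not in present]


# Hypothetical file contents, for illustration only.
SAMPLE = "* DR settings\nset soc:soc_enable_detach_suspend=1\n"
print(missing_settings("8", SAMPLE))
```

On a real box you would pass in the contents of /etc/system; remember that changes there only take effect after a reboot.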
		Restrictions:
			-Can't remove first CPU board.
			-Can't remove central IO Board (power to clockboard)
			-Bend a pin and you crash
			-Insert too SLOWLY and panic
			-Inserting a failed board will crash system
		Always be looking for this boot-time message:
			"Hot Plug not supported in this system"

AP (Alternate Pathing):
		Allows DR to work with IO boards....
		Check the OS Hardware Reference for AP manuals...
		

------------------------------------------------------------
Other Info:
	
	See Sun InfoDoc 16184: "Enterprise systems troubleshooting guide"
	See Sun InfoDoc 23476: "Hardware Diagnostics for Sun Systems"
	See Sun Feature: "Using Device Path Names to Identify System Devices"