IPMI & SMASH Agentless Server Standards

    January 24, 2006

Many vendors and industry analysts have noted that the biggest expense for IT is ongoing network and systems management.

These costs continue to rise, forcing IT managers and system vendors to look at new approaches. There is an abundance of technologies available at the software and operating system level to help network managers maximize the uptime of their servers, but these typically come at a higher purchase or operational cost. In response, vendors and standards organizations have been working hard to develop common embedded – or agentless – technologies that can help. Because these technologies offer a consistent command line interface to complementary management features that are typically pre-integrated for free by the system vendor, companies can use them to begin controlling operational expenses.

First, Buy Servers that include IPMI to Standardize Hardware Management

Typical server management solutions have tended to focus on software agent technology. Agentless management offers an opportunity to make this approach more flexible. One agentless standard, the Intelligent Platform Management Interface (IPMI), is a low-cost way to increase the availability of a server at the hardware level. IPMI defines how administrators monitor system hardware and sensors, control system components and retrieve logs of important system events to conduct remote management and recovery. IPMI monitors hardware health conditions such as temperature, fan speed, voltage, hardware errors (memory, network, etc.) and chassis intrusion.

The foundation of IPMI lies in specialized firmware that runs on a dedicated chip or controller – sometimes referred to as a Service Processor or BMC (Baseboard Management Controller) – that is typically located on the system motherboard or blade. This approach creates an agentless management subsystem that runs separately within the system, independent of the type or condition of the CPUs, the BIOS and the OS. These autonomous characteristics remove the limitations encountered with all OS-dependent management agents (the agent-based approach), such as when the OS is not responding or is not loaded. And because IPMI is almost always pre-integrated, its cost-to-benefit ratio offers a great opportunity for IT shops to control costs.

All IPMI functions are accomplished by sending commands to the BMC, using standardized instructions identified in the specification. Messages all use a request/response protocol and commands are grouped by functionality. The BMC receives and logs event messages in the System Event Log (SEL), and maintains a Sensor Data Record (SDR) that describes the sensors in a system.
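To make the request/response framing more concrete, here is a minimal Python sketch of how a request to the BMC might be assembled. The field layout, the two's-complement checksums and the specific values (BMC address 0x20, a remote console software ID of 0x81, the application network function 0x06 with the Get Device ID command 0x01) reflect a common reading of the IPMI specification rather than anything stated in this article, and the function names are hypothetical. Treat it as an illustration and check the specification before relying on it.

```python
def checksum(data: bytes) -> int:
    """Two's-complement checksum: covered bytes plus checksum sum to 0 mod 256."""
    return (-sum(data)) & 0xFF


def build_request(net_fn: int, cmd: int, data: bytes = b"",
                  rs_addr: int = 0x20, rq_addr: int = 0x81,
                  rq_seq: int = 1, lun: int = 0) -> bytes:
    """Assemble one request frame (hypothetical helper, assumed layout)."""
    # Responder address, then network function (upper 6 bits) and LUN (lower 2).
    header = bytes([rs_addr, (net_fn << 2) | lun])
    # Requester address, sequence number (upper 6 bits) and LUN, command, data.
    body = bytes([rq_addr, (rq_seq << 2) | lun, cmd]) + data
    return header + bytes([checksum(header)]) + body + bytes([checksum(body)])


NETFN_APP = 0x06          # assumed: application network function
CMD_GET_DEVICE_ID = 0x01  # assumed: Get Device ID command

frame = build_request(NETFN_APP, CMD_GET_DEVICE_ID)
print(frame.hex(" "))  # 20 18 c8 81 04 01 7a
```

The BMC would answer with a response frame carrying the same sequence number plus a completion code, which is what lets a console match each response to its request.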

There is an abundance of features in IPMI that are well suited to remote manageability. For example, when remote access to the system text console is required, the Serial Over LAN (SOL) feature can be very useful. SOL redirects the local serial interface over an IPMI session, allowing remote access to the Emergency Management Services (EMS) Special Administration Console (SAC) for Windows, or to the Linux serial console. The IPMI firmware accomplishes this by intercepting the data destined for the serial port and resending it over the LAN. This offers a standard way to remotely view the boot, OS loader or emergency management consoles, irrespective of vendor, to diagnose and repair server-related issues. It also allows configuration of various components during the boot phase.

Administrators can also use IPMI to proactively monitor the health of components and ensure that preset thresholds, such as server temperatures, are not exceeded. This helps IT maintain uptime by avoiding unscheduled outages. Remember that IPMI’s autonomous implementation allows it to function no matter the condition of other devices or components (so long as the NIC is working and power is available within the server). Messages can be sent to dispatch technicians while IPMI continues to monitor and control other system components to minimize overall system impact. IPMI’s predictive failure capabilities aid in IT lifecycle management as well: by examining the System Event Log (SEL), administrators can more easily identify components that are likely to fail.
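The threshold idea can be sketched in a few lines of Python. This is only an illustration of the comparison a monitoring console might perform on readings it has already retrieved from the BMC. The sensor names, values and thresholds are made up, and the `SensorReading` type and `over_threshold` function are hypothetical, not part of IPMI or any vendor tool.

```python
from dataclasses import dataclass


@dataclass
class SensorReading:
    name: str
    value: float
    upper_critical: float  # preset threshold; illustrative value only


def over_threshold(readings):
    """Return the readings whose value exceeds the preset upper threshold."""
    return [r for r in readings if r.value > r.upper_critical]


# Made-up sample data standing in for readings fetched from the BMC.
readings = [
    SensorReading("CPU1 Temp", 71.0, 85.0),
    SensorReading("Ambient Temp", 47.5, 45.0),
    SensorReading("12V Rail", 12.1, 12.6),
]

for r in over_threshold(readings):
    print(f"ALERT: {r.name} at {r.value} exceeds {r.upper_critical}")
```

In practice the thresholds would come from the sensor descriptions held by the BMC rather than from hard-coded values.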

This real-time view of the health of a server complements existing agent-based monitoring applications. An administrator can view a server’s operational status via a centrally located console or application, typically provided by the server vendor or supplied as a feature within an existing management application. An example is the IPMISH utility that Dell provides with its PowerEdge servers. This IPMI-enabled monitoring application communicates directly with the IPMI firmware, so it is always available to the remote administrator for viewing event logs and sensor information.

IPMI’s hardware monitoring features also provide additional levels of security. Chassis intrusions can be detected by configuring IPMI to report when the server has been opened. In addition, multi-layer privileges and passwords, together with authentication and on-the-wire encryption, let IT managers allow or deny access to specific IPMI features.

Finally, Standardize the Command Line Interface – SMASH

In emergency or ad hoc situations, however, system administrators often need to manage various systems interactively using a specific command. The challenge is that servers from different vendors use different commands to do the same task. Enter SMASH. The Systems Management Architecture for Server Hardware (SMASH) is a recent development that gives administrators a consistent command line interface for server monitoring and management tasks, irrespective of the vendor. The Distributed Management Task Force (DMTF) released the new SMASH Command Line Protocol (CLP) standard in June 2005 to address the lack of a consistent command line for management and monitoring information in heterogeneous server environments.

It is useful to think of IPMI and SMASH as going hand in hand. Even though IPMI offers a standardized message interface from the server to various applications and consoles, the separate server command line interface can vary from vendor to vendor. Think of SMASH, then, as the primary command line interface (CLI) available out of the box, accessing the IPMI hardware health management features in the box. This command consistency across different vendors’ servers or blades is important, not just for reducing the learning curve for managing new systems, but especially when considering the investment in scripts that many IT shops have made. Adopting SMASH will help keep server scripts consistent moving forward. Servers that support SMASH give administrators access through generic telnet or SSH software clients, which open interactive sessions. Once logged in (using the server’s native security features), administrators can use the SMASH “SHOW” command to list the system resources that can be managed by the SMASH “START” and “STOP” commands. Some examples follow:

stop /system1 – power down server discovered as system1
show /system1 – display all the managed elements for system1
start alarm1 – switch on the front LED on the current/default server
show log1/record* – display all the system event log records (from IPMI SEL)
start textredirectsap1 – view the system/OS console (can be provided by IPMI Serial Over LAN, or SOL, function)
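Because the commands above share a simple verb-plus-target shape, scripts that generate them can stay vendor-neutral. The Python sketch below is one hypothetical way to compose such command lines before sending them over a telnet or SSH session. It covers only the three verbs named in this article, the helper names are invented for illustration, and the target /system2 is assumed rather than taken from any real server.

```python
# Verbs limited to the three named in this article; the CLP standard defines more.
ARTICLE_VERBS = {"show", "start", "stop"}


def clp_command(verb: str, target: str) -> str:
    """Compose one command line of the form '<verb> <target>' (hypothetical helper)."""
    verb = verb.lower()
    if verb not in ARTICLE_VERBS:
        raise ValueError(f"unsupported verb: {verb!r}")
    if not target or any(ch.isspace() for ch in target):
        raise ValueError("target must be a non-empty address without whitespace")
    return f"{verb} {target}"


def inventory_script(targets):
    """Build the same 'show' line for every target, whatever the server vendor."""
    return [clp_command("show", t) for t in targets]


for line in inventory_script(["/system1", "/system2"]):
    print(line)
```

The point of the sketch is the design choice rather than the code itself: when every vendor accepts the same verbs and target addresses, one script body serves the whole server fleet.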

Bottom Line: IPMI + SMASH = Lower Server Management Costs

Overall, by using servers with IPMI and SMASH, IT can lower ongoing operational costs by:

Offering a consistent command line that changes little over time – reducing training requirements and, in turn, mistakes

Using fewer scripts that now perform management tasks across multiple server vendors

Buying fewer management tools – which lowers purchase and training costs

Predicting hardware failures to schedule downtime during non-peak hours

Arriving at the remote site with the right parts by diagnosing the issue before dispatching service personnel

Reducing the Mean Time to Repair (MTTR)

More than 50 percent of servers shipped in 2005 had IPMI pre-integrated – available right out of the box for free. Well-known vendors that are market leaders in their fields, such as Avocent, Dell, HP, IBM and Intel, have offered IPMI pre-integrated within their products for some time now. When considering your next server or blade purchase, choose vendors that include IPMI and that have a stated direction for delivering SMASH-based components and technologies. In fact, many of these vendors will be offering SMASH as early as 2006.

The complete review specification for IPMI is available at the Intel IPMI website: http://developer.intel.com/design/servers/ipmi/spec.htm

More information on the DMTF can be found at http://www.dmtf.org

More information on the SMASH command line can be found at http://www.dmtf.org/standards/published_documents/DSP0214.pdf

Steve Rokov is Director of Technical Marketing for Avocent’s Embedded Software & Solutions (AESS) Group. He can be reached via email at steve.rokov@avocent.com.