Thursday, February 9, 2012

ATA Commands in Python

In software development sometimes you spend time on an implementation which you are unreasonably proud of, but ultimately decide not to use in the product. This is one such story.

I needed to retrieve information from an attached disk, such as its model and serial number. There are commands which can do this, like hdparm, sdparm, and smartctl, but initially I tried to avoid building in a dependency on any such tools by interrogating it from the hard drive directly. In pure Python.

Snicker all you want, it did work. My first implementation used an older Linux API to retrieve this information, the HDIO_GET_IDENTITY ioctl. This ioctl maps more or less directly to an ATA IDENTIFY or SCSI INQUIRY from the drive. The implementation uses the Python struct module to define the data structure sent along with the ioctl.

def GetDriveId(dev):
  """Return information from interrogating the drive.

  This routine issues a HDIO_GET_IDENTITY ioctl to a block device,
  which only root can do.

  Args:
    dev: name of the device, such as 'sda' or '/dev/sda'

  Returns:
    (serial_number, fw_version, model) as strings
  """
  # from /usr/include/linux/hdreg.h, struct hd_driveid
  # 10H = misc stuff, mostly deprecated
  # 20s = serial_no
  # 3H  = misc stuff
  # 8s  = fw_rev
  # 40s = model
  # ... plus a bunch more stuff we don't care about.
  struct_hd_driveid = '@ 10H 20s 3H 8s 40s'
  HDIO_GET_IDENTITY = 0x030d
  if dev[0] != '/':
    dev = '/dev/' + dev
  with open(dev, 'r') as fd:
    buf = fcntl.ioctl(fd, HDIO_GET_IDENTITY, ' ' * 512)
    fields = struct.unpack_from(struct_hd_driveid, buf)
    serial_no = fields[10].strip()
    fw_rev = fields[14].strip()
    model = fields[15].strip()
    return (serial_no, fw_rev, model)

No no wait, stop snickering, it does work! It has to run as root, which is one reason why I eventually abandoned this approach.

$ sudo python hdio.py
('5RY0N6BD', '3.ADA', 'ST3250310AS')

HDIO_GET_IDENTITY is deprecated in Linux 2.6, and logs a message saying that sg_io should be used instead. sg_io is an API to send SCSI commands to a device. sg_io also didn't require my entire Python process to run as root, I'd "only" have to give it CAP_SYS_RAWIO. So of course I changed the implementation... still in Python. Stop snickering.


class AtaCmd(ctypes.Structure):
  """ATA Command Pass-Through
     http://www.t10.org/ftp/t10/document.04/04-262r8.pdf"""

  _fields_ = [
      ('opcode', ctypes.c_ubyte),
      ('protocol', ctypes.c_ubyte),
      ('flags', ctypes.c_ubyte),
      ('features', ctypes.c_ubyte),
      ('sector_count', ctypes.c_ubyte),
      ('lba_low', ctypes.c_ubyte),
      ('lba_mid', ctypes.c_ubyte),
      ('lba_high', ctypes.c_ubyte),
      ('device', ctypes.c_ubyte),
      ('command', ctypes.c_ubyte),
      ('reserved', ctypes.c_ubyte),
      ('control', ctypes.c_ubyte) ]


class SgioHdr(ctypes.Structure):
  """<scsi/sg.h> sg_io_hdr_t."""

  _fields_ = [
      ('interface_id', ctypes.c_int),
      ('dxfer_direction', ctypes.c_int),
      ('cmd_len', ctypes.c_ubyte),
      ('mx_sb_len', ctypes.c_ubyte),
      ('iovec_count', ctypes.c_ushort),
      ('dxfer_len', ctypes.c_uint),
      ('dxferp', ctypes.c_void_p),
      ('cmdp', ctypes.c_void_p),
      ('sbp', ctypes.c_void_p),
      ('timeout', ctypes.c_uint),
      ('flags', ctypes.c_uint),
      ('pack_id', ctypes.c_int),
      ('usr_ptr', ctypes.c_void_p),
      ('status', ctypes.c_ubyte),
      ('masked_status', ctypes.c_ubyte),
      ('msg_status', ctypes.c_ubyte),
      ('sb_len_wr', ctypes.c_ubyte),
      ('host_status', ctypes.c_ushort),
      ('driver_status', ctypes.c_ushort),
      ('resid', ctypes.c_int),
      ('duration', ctypes.c_uint),
      ('info', ctypes.c_uint)]

def SwapString(str):
  """Swap 16 bit words within a string.

  String data from an ATA IDENTIFY appears byteswapped, even on little-endian
  achitectures. I don't know why. Other disk utilities I've looked at also
  byte-swap strings, and contain comments that this needs to be done on all
  platforms not just big-endian ones. So... yeah.
  """
  s = []
  for x in range(0, len(str) - 1, 2):
    s.append(str[x+1])
    s.append(str[x])
  return ''.join(s).strip()

def GetDriveIdSgIo(dev):
  """Return information from interrogating the drive.

  This routine issues a SG_IO ioctl to a block device, which
  requires either root privileges or the CAP_SYS_RAWIO capability.

  Args:
    dev: name of the device, such as 'sda' or '/dev/sda'

  Returns:
    (serial_number, fw_version, model) as strings
  """

  if dev[0] != '/':
    dev = '/dev/' + dev
  ata_cmd = AtaCmd(opcode=0xa1,  # ATA PASS-THROUGH (12)
                   protocol=4<<1,  # PIO Data-In
                   # flags field
                   # OFF_LINE = 0 (0 seconds offline)
                   # CK_COND = 1 (copy sense data in response)
                   # T_DIR = 1 (transfer from the ATA device)
                   # BYT_BLOK = 1 (length is in blocks, not bytes)
                   # T_LENGTH = 2 (transfer length in the SECTOR_COUNT field)
                   flags=0x2e,
                   features=0, sector_count=0,
                   lba_low=0, lba_mid=0, lba_high=0,
                   device=0,
                   command=0xec,  # IDENTIFY
                   reserved=0, control=0)
  ASCII_S = 83
  SG_DXFER_FROM_DEV = -3
  sense = ctypes.c_buffer(64)
  identify = ctypes.c_buffer(512)
  sgio = SgioHdr(interface_id=ASCII_S, dxfer_direction=SG_DXFER_FROM_DEV,
                 cmd_len=ctypes.sizeof(ata_cmd),
                 mx_sb_len=ctypes.sizeof(sense), iovec_count=0,
                 dxfer_len=ctypes.sizeof(identify),
                 dxferp=ctypes.cast(identify, ctypes.c_void_p),
                 cmdp=ctypes.addressof(ata_cmd),
                 sbp=ctypes.cast(sense, ctypes.c_void_p), timeout=3000,
                 flags=0, pack_id=0, usr_ptr=None, status=0, masked_status=0,
                 msg_status=0, sb_len_wr=0, host_status=0, driver_status=0,
                 resid=0, duration=0, info=0)
  SG_IO = 0x2285  # <scsi/sg.h>
  with open(dev, 'r') as fd:
    if fcntl.ioctl(fd, SG_IO, ctypes.addressof(sgio)) != 0:
      print "fcntl failed"
      return None
    if ord(sense[0]) != 0x72 or ord(sense[8]) != 0x9 or ord(sense[9]) != 0xc:
      return None
    # IDENTIFY format as defined on pg 91 of
    # http://t13.org/Documents/UploadedDocuments/docs2006/D1699r3f-ATA8-ACS.pdf
    serial_no = SwapString(identify[20:40])
    fw_rev = SwapString(identify[46:53])
    model = SwapString(identify[54:93])
    return (serial_no, fw_rev, model)

For the unbelievers out there, this one works too.

$ sudo python sgio.py
('5RY0N6BD', '3.ADA', 'ST3250310AS')

So, there you go. HDIO_GET_IDENTITY and SG_IO implemented in pure Python. They do work, but in the process of working on this and reading existing code it became clear that low level ATA handling is fraught with peril. The disk industry has been iterating this interface for decades, and there is a ton of gear out there that made questionable choices in how to interpret the spec. Most of the code in existing utilities is not to implement the base operations but instead to handle the quirks from various manufacturers. I decided that I didn't want to go down that path, and will instead rely on forking sdparm and smartctl as needed.

I'll just leave this post here for search engines to find. I'm sure there is a ton of demand for this information.

Stop snickering.

Thursday, February 2, 2012

ISC DHCP VIVO config

In addition to its role in assigning IPv4 addresses, DHCP has an options mechanism to send other bits of data between client and server. For example there are options to provide the DNS and NTP server addresses along with the client's assigned IP address. In its request, the client lists the options it would like to receive. The server fills in whichever ones it can.

DHCP has always allowed for vendor extensions of the available options, inheriting this support from the earlier BOOTP protocol. The original vendor extension mechanism was very simple: use option 43, and put whatever you like there. A mechanism was provided to encode multiple sub-options within the option 43 data, but made no attempt to coordinate between different vendors use of the space. Vendors immediately began creating conflicts, using the same numeric codes for different means simultaneously. This led to a variety of heroic hacks in which the client would populate its request with magic values which the server would use to figure out the set of vendor options to supply.

DHCP6 defined a more complex encoding, where each vendor includes their unique IANA Enterprise Number as part of its option. Options from different vendors can be accommodated simultaneously. This Vendor-Identifying Vendor Options (VIVO) encoding was also added back to DHCP4 as options 124 and 125. DHCP4 thus has two separate vendor option mechanisms in common use.


ISC DHCPd

The ISC DHCP server can support VIVO options using several different mechanisms. The first, and so far as I can tell most common, is to specify the byte-level payload. The administrator pores over RFCs and vendor documentation to come up with the magic string of bytes to send and types it into dhcpd.conf, where it immediately becomes magic voodoo that everyone is afraid to touch for fear of breaking something.

Avoiding magic byte strings by specifying the format of the options is more difficult to get working, but easier to maintain and understand. We'll consider an example here.

  • Vendor: Frobozzco
  • IANA Enterprise Number (IEN): 12345
  • Code #1: a text string containing the location within the maze.
  • Code #2: an integer describing the percentage likelihood of being eaten by a grue.
    In practice this is always 100%, which many clients simply hard-code.

To implement this in the DHCP config, we define an option space for frobozzco. This just creates a namespace; we bind that namespace to the numeric IEN later. We have to tell DHCP how wide to make the code and length fields. DHCP4 usually uses 1 byte fields, while DHCP6 generally uses 2 bytes. Most of the time vendors don't specify the width they use, and if so you should assume the normal sizes for the protocol. The example below comes from a DHCP6 config, so the code and length are both declared as two bytes. After we've declared all of the option codes, we bind the frobozzco option space to its numeric IANA Enterprise Number. Use of IEN for DHCP is called the Vendor Specific Information Option, so the syntax in the DHCP configuration labels this vsio.{option space name}

option space frobozzco code width 2 length width 2;
option frobozzco.maze-location code 1 = text;
option frobozzco.grue-probability code 2 = uint8;
option vsio.frobozzco code 12345 = encapsulate frobozzco;

Owing to the long and sordid history of numbering conflicts, most vendor extensions define a secret handshake. The client inserts a specific value into a field in its request to trigger the server to respond with the options for that vendor. Frobozzco has decreed that clients should send the string "look north" as a vendor-class option in the request. A DHCP6 vendor-class consists of the vendor's IEN followed by the content. In our case the content consists of another two byte length field, followed by the string. ISC DHCP 4.x doesn't define a type for handling vendor-class in the config but we can construct one using a record, which is a collection of fields defined inside brackets.

option dhcp6.vendor-class code 16 = {integer 32, integer 16, string};

# length=14 bytes, Frobozzco IEN, content=look north
send dhcp6.vendor-class 12345 14 "look north";

Finally, we have to provide a script for dhclient to run to handle the received options. We'll get to the client script a bit later, for now just assume it is in /usr/local/sbin/dhclient-script. Putting it all together, the dhclient6.conf should look like this.

script "/usr/local/sbin/dhclient-script";

option space frobozzco code width 2 length width 2;
option frobozzco.maze-location code 1 = text;
option frobozzco.grue-probability code 2 = uint8;
option vsio.frobozzco code 12345 = encapsulate frobozzco;

option dhcp6.vendor-class code 16 = {integer 32, integer 16, string};

interface "eth0" {
    also request dhcp6.vendor-opts;
    send dhcp6.vendor-class 12345 10 "look north";
}

dhclient-script

On the client we also must provide the script for dhclient to run. The OS vendor will have provided one, often in /sbin or /usr/sbin. We'll copy it, and add handling.

dhclient passes in environment variables for each DHCP option. The name of the variable is "new_<option space name>_<option name>" For the example config above, we'd define a shell script function to write our two options to files in /tmp.

make_frobozzco_files() {                                                       
  mkdir /tmp/frobozzco                                            
  if [ "x${new_frobozzco_maze_location}" != x ] ; then         
    echo ${new_frobozzco_maze_location} > /tmp/frobozzco/maze_location                 
  fi                                                                      
  if [ "x${new_frobozzco_grue_probability}" != x ] ; then
    echo ${new_frobozzco_grue_probability} > /tmp/frobozzco/grue_probability
  fi                                                             
}

The dhclient-script provided with the OS will have handling for DNS nameservers. Adding a call to make_frobozzco_files at the same points in the script which handle /etc/resolv.conf is a reasonable approach to take.

I'm mostly blogging this for my own future use, to be able to find how to do something I remember doing before. There you go, future me.