Coding Relic: CPE WAN Management Protocol: transaction flow

Technical Report 69 from the Broadband Forum is a management protocol called the CPE WAN Management Protocol (CWMP). It was first published in 2004, revised a number of times since, and aimed at the operation of DSL modems placed in customer homes. Over time it has broadened to support more types of devices which an Internet Service Provider might operate outside of its own facilities, in the residences and businesses of its customers.

There are a few key points about CWMP:

It was defined during the peak popularity of the Simple Object Access Protocol (SOAP). CWMP messages are encoded as SOAP XML.
Like SNMP and essentially every other network management protocol, it separates definition of the protocol from definition of the variables it manages. SNMP calls them MIBs, CWMP calls them data models.
It recognizes that firewalls will be present between the customer premises and the ISP, and that the ISP can expect to control its own firewall but not necessarily other firewalls between it and the customer.
It makes a strong distinction between the Customer Premises Equipment (CPE) being managed, and the Auto Configuration Server (ACS) which does the managing. It does not attempt to be a generic protocol which can operate bidirectionally, it exists specifically to allow an ACS to control CPE devices.

A few years ago I helped write an open source tr-69 agent called catawampus. The name was chosen based mainly on its ability to contain the letters C W M P in the proper order. I’d like to write up some of the things learned from working on that project, in one or more blog posts.

Connection Lifecycle

One unusual thing about CWMP is connection management between the ACS and CPE. Connections are initiated by the CPE, but RPC commands are then sent by the ACS. Keeping with the idea that it is not a general purpose bidirectional protocol, all commands are sent by the ACS and responded to by the CPE.

tr-69 runs atop an HTTP (usually HTTPS) connection. The CPE has to know the URL of its ACS. There are mechanisms to tell a CPE device what ACS URL to use, for example via a DHCP option from the DHCP server, but honestly in almost all cases the URL of the ISP’s ACS is simply hard-coded into the firmware of devices supplied by the ISP.

Thus:

The CPE device in the customer premises initiates a TCP connection to the ACS, and starts the SSL/TLS handshake. Once the connection is established, the CPE sends an Inform message to the ACS using an HTTP POST. This is encoded using SOAP XML, and tells the ACS the serial number and other information about the CPE in the <DeviceId> stanza.

<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:cwmp="urn:dslforum-org:cwmp-1-2"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:soap-enc="http://schemas.xmlsoap.org/soap/encoding/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <soap:Header>
    <cwmp:ID soap:mustUnderstand="1">catawampus.1529004153.967958</cwmp:ID>
  </soap:Header>
  <soap:Body>
    <cwmp:Inform>
      <DeviceId>
        <Manufacturer>CatawampusDotOrg</Manufacturer>
        <OUI>ABCDEF</OUI>
        <ProductClass>FakeCPE</ProductClass>
        <SerialNumber>0123456789abcdef</SerialNumber>
      </DeviceId>
      <Event soap-enc:arrayType="EventStruct[1]">
        <EventStruct>
          <EventCode>0 BOOTSTRAP</EventCode\>
        </EventStruct>
      </Event>
      <CurrentTime>2018-06-14T19:34:47.297063Z</CurrentTime>
      <ParameterList soap-enc:arrayType="cwmp:ParameterValueStruct[1]">
        <ParameterValueStruct>
          <Name>InternetGatewayDevice.ManagementServer.ConnectionRequestURL</Name>
          <Value xsi:type="xsd:string">http://[redacted]:7547/ping/7fd86a7302ec5f</Value>
        </ParameterValueStruct>
      </ParameterList>
    </cwmp:Inform>
  </soap:Body>
</soap:Envelope>

Several fields are highlighted above: the EventCode tells the ACS why the CPE device is connecting. It might have just booted, it might be a periodic connection at a set interval, or it might be because of an exceptional condition. The ParameterList, also highlighted, is a list of parameters the CPE can include to tell the ACS about exceptional conditions.

The ACS sends back an InformResponse in response to the POST.

<soapenv:Envelope
    xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:cwmp="urn:dslforum-org:cwmp-1-2"
    xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <soapenv:Header>
    <cwmp:ID soapenv:mustUnderstand="1">catawampus.1529004153.967958</cwmp:ID>
    <cwmp:HoldRequests>0</cwmp:HoldRequests>
  </soapenv:Header>
  <soapenv:Body>
    <cwmp:InformResponse>
      <MaxEnvelopes>1</MaxEnvelopes>
    </cwmp:InformResponse>
  </soapenv:Body>
</soapenv:Envelope>

If the CPE has other conditions to communicate to the ACS, such as successful completion of a software update, it performs additional POSTs containing those messages. When it has run out of things to send, it does a POST with an empty body. At this point the ACS takes over. The CPE continues sending HTTP POST transactions with an empty body, and the ACS sends a series of RPCs to the CPE in the response. There are RPC messages to get/set parameters, schedule a reboot or software update, etc. All transactions are sent by the ACS and the CPE responds.

<soapenv:Envelope
    xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:cwmp="urn:dslforum-org:cwmp-1-2"
    xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <soapenv:Header>
    <cwmp:ID soapenv:mustUnderstand="1">TestCwmpId</cwmp:ID>
  </soapenv:Header>
  <soapenv:Body>
    <cwmp:SetParameterValues>
      <ParameterList>
        <ns2:ParameterValueStruct xmlns:ns2="urn:dslforum-org:cwmp-1-2">
          <Name>StringParameter</Name>
          <Value xmlns:xs="http://www.w3.org/2001/XMLSchema" xsi:type="xs:string">param</Value>
        </ns2:ParameterValueStruct>
      </ParameterList>
      <ParameterKey>myParamKey</ParameterKey>
    </cwmp:SetParameterValues>
  </soapenv:Body>
</soapenv:Envelope>

The ACS can send multiple RPCs in one session with the CPE. Only one RPC can be outstanding at a time, the ACS has to wait for a response from the CPE before sending the next.

When the session ends, it is up to the CPE to re-establish it. One of the parameters in a management object is the PeriodicInformInterval, the amount of time the CPE should wait between initiating sessions with the ACS. By default it is supposed to be infinite, meaning the CPE will only check in once at boot and the ACS is expected to set the interval to whatever value it wants during that first session. In practice we found that not to work very well and set the default interval to 15 minutes. It was too easy for something to go wrong and result in a CPE which would be out of contact with the ACS until the next power cycle.

There is also a mechanism by which the ACS can connect to the CPE on port 7547 and do an HTTP GET. The CPE responds with an empty payload, but is supposed to immediately initiate an outgoing session to the ACS. In practice, this mechanism doesn't work very well because intervening firewalls, like the ISP's own residential gateway within the home, will often block the connection. This is an area where the rest of the industry has moved on: we now routinely have a billion mobile devices maintaining a persistent connection back to their notification service. CPE devices could do something similar, perhaps even using the same infrastructure.

Friday, June 15, 2018

CPE WAN Management Protocol: transaction flow

Connection Lifecycle