Calculating ifInOctets

SNMP is a great utility for systems monitoring. Some systems support SNMP better than others. I was working on a gateway product and ran into the limitations of the ifInOctets OID. The same limitations apply to ifOutOctets; both are defined in RFC 1213, which describes ifInOctets as:

the total number of octets received on the interface, including framing characters.

To turn the counter into useful information, we have to compare it to the previous value and run some calculations. We want to express the result in bits per second, so we need to keep track of the time between polls, and because the counter is expressed in octets, we multiply by eight to convert to bits. This results in the following calculation:

(ifInOctetsCurrent - ifInOctetsPrevious) * 8 / pollingSeconds
//Gives us bits/second (bps), which would then be scaled
//to a larger unit (Kbps, Mbps) to make it more legible.
//This provides a measure of the incoming bandwidth used
//by an interface.
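
As a minimal sketch of the calculation above, the following JavaScript turns two counter readings into bits/second and then scales the result to a more legible unit. The function names `bitsPerSecond` and `formatBps` are made up for illustration, and the 1000-based unit steps are an assumption about how you want to display the value.

```javascript
// Hypothetical helper names; only the formula itself comes from the text.
function bitsPerSecond(currentOctets, previousOctets, pollingSeconds) {
  // octets -> bits (x8), then divide by the elapsed time in seconds
  return (currentOctets - previousOctets) * 8 / pollingSeconds;
}

// Scale a bits/second value to a larger unit (decimal, 1000-based steps).
function formatBps(bps) {
  const units = ["bps", "Kbps", "Mbps", "Gbps"];
  let value = bps;
  let i = 0;
  while (value >= 1000 && i < units.length - 1) {
    value /= 1000;
    i++;
  }
  return value.toFixed(2) + " " + units[i];
}

// Example: 750,000,000 octets received over a 60 second polling period.
console.log(formatBps(bitsPerSecond(750000000, 0, 60)));
```

This version does not yet handle a counter that wraps between polls.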

That's great, and it works, but we have another concern. ifInOctets is expressed as a 32-bit unsigned integer. 2^32 = 4,294,967,296. Remembering zero, this provides a range of 0-4,294,967,295, which means the counter can count at most 2^32 octets, or about 34 gigabits, before it wraps. If we have a very fast network connection with sustained throughput, this limit will affect our polling, which is how I came to be aware of it.

I was trying to track a gateway sustaining 74-92 Mbps, but about every 6 minutes I would get an invalid reading. 92 Mbps * 60 seconds * 6 minutes = about 33 gigabits. What was happening was that ifInOctets was rolling over and I was trying to compare two incompatible values!

RFC 2233 provides an excellent description:

As the speed of network media increase, the minimum time in which a 32 bit counter will wrap decreases. For example, a 10Mbs stream of back-to-back, full-size packets causes ifInOctets to wrap in just over 57 minutes; at 100Mbs, the minimum wrap time is 5.7 minutes, and at 1Gbs, the minimum is 34 seconds. Requiring that interfaces be polled frequently enough not to miss a counter wrap is increasingly problematic.

What would have been nice is if SNMPv2 had provided an OID to track the number of times ifInOctets rolled over. Instead, it provides a new ifHCInOctets (high capacity) value, which is expressed as a 64-bit unsigned integer. The most troubling part about ifHCInOctets is that not all systems support it. Disappointingly, the system I'm working on, which has a 300 Mbps fiber uplink, does not support the high capacity table. If this system ran at capacity, ifInOctets would roll over about every 1.9 minutes. I'm therefore forced to poll this system every minute for the sake of accuracy in reporting.
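
The wrap times quoted above all follow from one division: the counter holds 2^32 octets (2^32 * 8 bits), so the minimum wrap time is that many bits divided by the sustained line rate. Here is a small sketch for checking the numbers yourself; `wrapSeconds` is a name invented for this example, and the rates in the loop are just the ones mentioned in this post.

```javascript
const COUNTER_RANGE = 2 ** 32; // number of distinct values in a 32-bit counter

// Minimum seconds before a 32-bit octet counter wraps at a sustained rate.
function wrapSeconds(bitsPerSecond) {
  return (COUNTER_RANGE * 8) / bitsPerSecond;
}

// Rates mentioned in this post, in Mbps.
for (const mbps of [10, 92, 100, 300, 1000]) {
  const seconds = wrapSeconds(mbps * 1e6);
  console.log(mbps + " Mbps wraps in " + (seconds / 60).toFixed(1) + " minutes");
}
```

To be safe, the polling interval should be comfortably shorter than the wrap time for the fastest rate the interface can sustain.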

As a point of comparison, 2^64 is a massive number, about 1.8x10^19. A 32-bit counter paired with a 32-bit rollover count covers exactly the same range, since 2^32 x 2^32 = 2^64; the far larger 1.797x10^308 is (2^32)^32 = 2^1024, which is more than a single rollover counter would provide. Still, if SNMPv2 had used a simple rollover counter alongside the original 32-bit number, it would've supported the same massive increase in bandwidth accounting, and the rollover count could also have been used to identify system reboots or any other event that caused the counter(s) to reset to 0.

The problem at hand is: how to calculate bits/second when ifInOctets rolls over?

var current    = ifInOctetsCurrent
var previous   = ifInOctetsPrevious
var pollPeriod = time //seconds between polling
var maxValue   = 4294967295 //2^32-1

var bps // bits/second -- our target value.

if(current < previous){
  //counter rolled over; need to adjust value.
  //the counter passes maxValue and then 0, so one
  //full wrap adds maxValue + 1 (2^32) to the count.
  current = maxValue + 1 + current

  //caution - don't store the adjusted ${current}
  //value to ${previous} in preparation for
  //the next cycle! Use the original value!
}

bps = (current - previous) * 8 / pollPeriod
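
The caution about not storing the adjusted value is easy to get wrong, so here is a rough sketch of a poller that keeps state between samples. It adds 2^32 to a negative difference (one full wrap) and always remembers the raw reading. `createRateTracker` and its argument names are hypothetical, and the sketch assumes the counter wrapped at most once between polls.

```javascript
// Hypothetical tracker: feed it raw ifInOctets readings, one per poll.
function createRateTracker() {
  const COUNTER_RANGE = 4294967296; // 2^32
  let previousRaw = null;  // last raw counter value, never the adjusted one
  let previousTime = null; // time of the last poll, in seconds

  return function sample(rawOctets, timeSeconds) {
    let bps = null; // stays null until there are two readings to compare
    if (previousRaw !== null && timeSeconds > previousTime) {
      let delta = rawOctets - previousRaw;
      if (delta < 0) {
        delta += COUNTER_RANGE; // assumes exactly one rollover
      }
      bps = delta * 8 / (timeSeconds - previousTime);
    }
    // Store the raw reading so the next comparison starts from real data.
    previousRaw = rawOctets;
    previousTime = timeSeconds;
    return bps;
  };
}

const sample = createRateTracker();
sample(4294967000, 0);          // first reading, nothing to compare yet
console.log(sample(454, 60));   // counter wrapped between these two polls
console.log(sample(1204, 120)); // normal poll after the wrap
```

Because the adjustment is applied to the difference and not to the stored reading, the poll after a rollover needs no special handling.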

There's one more thing we want to watch out for: a system reboot, which will cause a counter reset. We should track uptime alongside the counters so rates can be computed correctly when reboots occur... except that uptime is itself a 32-bit counter of timeticks (1/100 of a second). That precision is unnecessary, and it causes the 32-bit unsigned integer to roll over about every 497 days. Here's a good forum post on the matter. Spoiler: use snmpEngineTime instead and hope your device doesn't have a buggy SNMP engine.
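
One way to fold the reboot check into the delta calculation is sketched below. The sample shape `{ octets, uptimeSeconds }` and the function name `octetDelta` are assumptions made for this example; `uptimeSeconds` stands for whatever uptime source you trust (snmpEngineTime, per the spoiler above).

```javascript
// Hypothetical sample shape: { octets, uptimeSeconds }.
// Returns the octets counted between two polls, or null when the two
// readings cannot be compared.
function octetDelta(previous, current) {
  const COUNTER_RANGE = 4294967296; // 2^32
  if (current.uptimeSeconds < previous.uptimeSeconds) {
    // Uptime went backwards: assume the device restarted and its
    // counters reset to 0, so discard this interval.
    return null;
  }
  let delta = current.octets - previous.octets;
  if (delta < 0) {
    delta += COUNTER_RANGE; // ordinary rollover, assuming a single wrap
  }
  return delta;
}

// Rollover with uptime still increasing: a usable delta.
console.log(octetDelta({ octets: 4294967000, uptimeSeconds: 1000 },
                       { octets: 454, uptimeSeconds: 1060 }));
// Uptime dropped: treated as a reboot.
console.log(octetDelta({ octets: 5000, uptimeSeconds: 1000 },
                       { octets: 100, uptimeSeconds: 30 }));
```

If the uptime value itself wraps, this sketch treats it as a reboot and drops a single interval, which seems a reasonable trade for not reporting a bogus spike.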

And just FYI, here is how RFC 1155 defines the Counter type:

   This application-wide type represents a non-negative integer
   which monotonically increases until it reaches a maximum
   value, when it wraps around and starts increasing again
   from zero.  This memo specifies a maximum value of
   2^32-1 (4294967295 decimal) for counters.