Spotted something interesting (and confusing) today.
This is on Zenoss 4.2.5 ZenUP SUP 732
We've got a type of device that has 10 interfaces on it. An snmpwalk of the ifTable in the MIB successfully lists all 10 interfaces.
However, when I modelled it using zenoss.snmp.InterfaceMap, it only found 5 interfaces.
This was consistently repeatable, with the snmpwalk showing 10 and Zenoss only modelling 5.
After much dissection of code and experimentation I found that the maxRepetitions value sent to the SNMP collection client was 5, and this made me suspect that it had some bearing on my interface count. maxRepetitions is calculated by dividing the device configuration property zMaxOIDPerRequest by the number of OIDs that need to be gathered. Taking the default zMaxOIDPerRequest of 40 and dividing it by the number of table OIDs collected for the ifTable, which is 8, gives our value of 5 (40/8). maxRepetitions seems to control the amount of data passed back in one response to the getbulk call made via twistedsnmp.
This makes sense so far: to collect interface information on a device, the SNMP collection makes multiple requests of 5 repetitions each, with each repetition collecting 8 OIDs (8 x 5 = 40), until all the interface data has been collected.
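To make the arithmetic above concrete, here is a minimal sketch in Python. The function names are my own and are not taken from the Zenoss code, and the guard that keeps the result at 1 or more is an assumption added so the example cannot divide by zero.

```python
import math


def max_repetitions(max_oids_per_request, column_count):
    """Rows requested per getbulk call: zMaxOIDPerRequest / number of table OIDs.

    The max(1, ...) guard is an assumption for this sketch, not confirmed
    Zenoss behaviour.
    """
    return max(1, max_oids_per_request // column_count)


def requests_needed(row_count, max_oids_per_request, column_count):
    """Requests a client needs if each one returns max_repetitions rows."""
    reps = max_repetitions(max_oids_per_request, column_count)
    return math.ceil(row_count / reps)


# Default zMaxOIDPerRequest of 40 and the 8 ifTable OIDs from the post
print(max_repetitions(40, 8))        # 5
print(requests_needed(10, 40, 8))    # 2 requests for the 10-interface device
print(requests_needed(20, 40, 8))    # 4 requests for a 20-interface device
```

So a 10-interface device should need two data-carrying requests at the default setting, which is why seeing exactly 5 interfaces looks like the collection stopping after the first one.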
This works as expected for almost all of our devices, some with many interfaces (e.g. 20+ interfaces) that would require multiple requests. But for our problematic devices we are only getting 5 – this suggests that for our problematic devices, only one request is being made whereas for other devices the collection code seems to know to make multiple requests until all the interface data has been collected.
If I set the zMaxOIDPerRequest value to be larger, I can successfully get more interfaces on the problematic device, and the number of interfaces modelled can be predictably controlled by changing zMaxOIDPerRequest to multiples of 8 (i.e. set it to 16 and we get 2 interfaces modelled; set it to 80 and we get 10 interfaces modelled).
My confusion / question is: why would a certain type of device have this problem when for others the collection code will happily make multiple requests to collect all the interface data?
Is there something in the SNMP response that indicates that there is more data to be retrieved? And for some reason our problematic devices are not returning this info? It's the same collection code running over both, so there must be some difference at the device end?
Does anyone have any thoughts on what this could be, or has anyone experienced this sort of thing before?
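The pattern observed on the problematic device can be written down as a small sketch. It assumes that only the first getbulk response ends up being used for this device, which is what the numbers suggest but is not something I have confirmed; the function names are hypothetical.

```python
IFTABLE_OID_COUNT = 8  # number of ifTable OIDs collected, as stated in the post


def rows_in_single_request(max_oids_per_request, column_count=IFTABLE_OID_COUNT):
    """Interfaces modelled if only one getbulk response is used (assumption)."""
    return max_oids_per_request // column_count


def min_setting_for(row_count, column_count=IFTABLE_OID_COUNT):
    """Smallest zMaxOIDPerRequest that fits row_count interfaces in one request."""
    return row_count * column_count


for setting in (16, 40, 80):
    print(setting, rows_in_single_request(setting))
# 16 2
# 40 5
# 80 10

print(min_setting_for(10))  # 80
```

This matches the values seen when changing the property, and it also shows why the workaround needs the interface count up front.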
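For context, a GetBulk response carries no explicit "more data" flag. A walking client normally keeps asking from the last OID it received until a returned OID falls outside the requested subtree or the agent reports the end of the MIB view. The sketch below simulates that loop against a well-behaved, entirely hypothetical in-memory agent holding one ifTable column (ifDescr) with 10 rows. It is not the twistedsnmp or Zenoss implementation, and it does not reproduce whatever the problematic device actually does.

```python
from bisect import bisect_right

# ifDescr column of the ifTable (1.3.6.1.2.1.2.2.1.2)
IF_DESCR = (1, 3, 6, 1, 2, 1, 2, 2, 1, 2)

# Hypothetical agent contents: 10 ifDescr rows, then the first OID of the
# next column so the walk has something to run into.
AGENT_OIDS = sorted(
    [IF_DESCR + (i,) for i in range(1, 11)]
    + [(1, 3, 6, 1, 2, 1, 2, 2, 1, 3, 1)]
)


def get_bulk(after_oid, max_repetitions):
    """Return up to max_repetitions OIDs that sort after after_oid."""
    start = bisect_right(AGENT_OIDS, after_oid)
    return AGENT_OIDS[start:start + max_repetitions]


def walk_column(column, max_repetitions):
    """Walk one column, returning (rows, number_of_requests_sent)."""
    rows, requests, cursor = [], 0, column
    while True:
        batch = get_bulk(cursor, max_repetitions)
        requests += 1
        if not batch:
            return rows, requests      # nothing left: end of MIB view
        for oid in batch:
            if oid[:len(column)] != column:
                return rows, requests  # left the subtree: walk complete
            rows.append(oid)
        cursor = batch[-1]             # continue from the last OID seen


rows, requests = walk_column(IF_DESCR, max_repetitions=5)
print(len(rows), requests)  # 10 3
```

In this simulation the third request is the one that discovers the end of the column. If a device answers the follow-up request oddly, a client using this kind of loop could plausibly stop early, but that is speculation about this device.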
(Obviously a quick fix for this is for the problematic devices to set the zMaxOIDPerRequest value to be high enough to collect all the interfaces but this requires knowing how many interfaces there are for each device before modelling – not ideal!)
Yeah, suspicion has moved to the device being problematic. I've not had the opportunity to check for dropped packets etc., but running some manual snmpbulkwalk requests I get some strange behaviour, which suggests it's the device rather than Zenoss.
As upping zMaxOIDPerRequest is a simple solution for this device type and doesn't cause us any issues, we're not investigating much further in terms of why, and are just running with the higher value to get all the interfaces. Thanks for the help.