[snmp] SNMP v3 discovery bug in 5.1

Tim Panton thp at westhawk.co.uk
Fri Sep 14 18:14:00 BST 2007


Ah, that sounds right (it is definitely running out of transmitters).

Thanks!

I'll look at your options, but I think option 1 looks better at first glance.

I'll discuss it with Birgit when she is back on Monday.

For future reference, there is a function I use to debug transmitter
problems: AbstractSnmpContext.getDebugString().
Here's a snippet from the javadoc:

 * Returns the thread usage of the AbstractSnmpContext.
 * It returns a String in the form of <code>=PO=QR--------------0</code>.
 *
 * <p>
 * The String represents the array of transmitters.
 * Each character represents a transmitter slot.
 * The transmitters form a thread pool of a maximum size, MAXPDU.
 * Each transmitter is used to wait for one PDU response at a given
 * moment in time.
 * When the response is received the transmitter will stop running, but
 * is not destroyed. It will be reused.
 * </p>

I'm not sure how much this would have helped you, but it can
save a lot of stepping through in the debugger.
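
As a rough illustration (a sketch, not part of the stack), a string like
this could be summarised with a small helper. It assumes that an upper-case
letter marks a slot whose transmitter is currently busy and that every other
character is something else (idle, unallocated, or bookkeeping). The javadoc
excerpt above does not spell out that legend, so check the full javadoc
before relying on it. The class and method names are made up.

```java
// Hypothetical helper for reading a transmitter debug string.
// Not part of the Westhawk stack.
class TransmitterUsage {

    // Counts the upper-case letters in the string. This sketch ASSUMES
    // that a letter marks a slot whose transmitter is currently busy.
    static int busySlots(String debugString) {
        int busy = 0;
        for (int i = 0; i < debugString.length(); i++) {
            char c = debugString.charAt(i);
            if (c >= 'A' && c <= 'Z') {
                busy++;
            }
        }
        return busy;
    }

    public static void main(String[] args) {
        // In real use the string would come from context.getDebugString().
        String sample = "=PO=QR--------------0";
        System.out.println(busySlots(sample) + " slots look busy in " + sample);
    }
}
```

Logging that count each time a request is added would show whether the
number of busy slots keeps climbing while the agent is unreachable.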

Tim.


On 14 Sep 2007, at 16:41, Josh Bers wrote:

> Tim and Birgit,
>
> I don't think that the latest code will fix it since 5.1 includes version
> 1.17 of UsmDiscoveryBean.java.
>
> However, I believe that I have found the problem. It is in
> SnmpContextv3Basis.java#addPdu(Pdu, boolean).
> Here is the problem (it is one of transmitter thread starvation):
>
> This method first calls super.addPdu(pdu). The parent method reserves a
> Transmitter, however, this is risky since discovery has yet to be
> performed. When discovery is performed, a timeout PduException is thrown,
> however, it is not caught by the stack to clean up the transmitter
> assignment with removePdu(reqId).
>
> So the fix could be:
>
> 1. Move discovery in front of calling super.addPdu that way an exception
> will not result in a tied up transmitter in the context.
>
> 2. Catch the PduException thrown by discovery and call removePdu(reqId)...
>
> My preference is toward 1, where the ordering of creating and sending out
> a req id is consistent: discovery req id is < original PDU req id. The way
> the code currently operates it sends out the discovery req with id >
> original req id which is then sent later... Somewhat confusing.
>
> What do you think?
>
> Josh
>
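
The starvation Josh describes, and his two proposed fixes, can be modelled
with a toy context. This is only a sketch under stated assumptions: it is
not the code in SnmpContextv3Basis, and ToyContext, DiscoveryTimeout, the
three addPdu variants and the pool size of 3 are all invented for
illustration. The only behaviour taken from the report is that a slot is
reserved before discovery runs and that a discovery timeout leaves it
reserved.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the timeout exception raised during discovery.
class DiscoveryTimeout extends Exception {
    DiscoveryTimeout() { super("discovery timed out"); }
}

// Toy model of a context with a fixed pool of transmitter slots.
class ToyContext {
    static final int MAX_SLOTS = 3;   // small pool so exhaustion shows quickly
    private final Set<Integer> reserved = new HashSet<>();
    boolean agentReachable = false;

    int slotsInUse() { return reserved.size(); }

    // Stands in for the parent addPdu: reserves a slot if one is free.
    private boolean reserve(int reqId) {
        if (reserved.size() >= MAX_SLOTS) return false;
        reserved.add(reqId);
        return true;
    }

    void removePdu(int reqId) { reserved.remove(reqId); }

    private void discover() throws DiscoveryTimeout {
        if (!agentReachable) throw new DiscoveryTimeout();
    }

    // Reported behaviour: reserve first, so a timeout leaks the slot.
    boolean addPduReserveFirst(int reqId) throws DiscoveryTimeout {
        if (!reserve(reqId)) return false;
        discover();
        return true;
    }

    // Option 1: discover first, reserve only once discovery succeeded.
    boolean addPduDiscoverFirst(int reqId) throws DiscoveryTimeout {
        discover();
        return reserve(reqId);
    }

    // Option 2: reserve first, but release the slot if discovery fails.
    boolean addPduWithCleanup(int reqId) throws DiscoveryTimeout {
        if (!reserve(reqId)) return false;
        try {
            discover();
        } catch (DiscoveryTimeout e) {
            removePdu(reqId);
            throw e;
        }
        return true;
    }
}

class StarvationDemo {
    public static void main(String[] args) {
        ToyContext leaky = new ToyContext();
        ToyContext option1 = new ToyContext();
        ToyContext option2 = new ToyContext();
        // Five requests while the agent is unreachable.
        for (int reqId = 1; reqId <= 5; reqId++) {
            try { leaky.addPduReserveFirst(reqId); } catch (DiscoveryTimeout e) { /* ignored, as in the report */ }
            try { option1.addPduDiscoverFirst(reqId); } catch (DiscoveryTimeout e) { /* nothing reserved yet */ }
            try { option2.addPduWithCleanup(reqId); } catch (DiscoveryTimeout e) { /* slot already released */ }
        }
        System.out.println("reserve first:  " + leaky.slotsInUse() + " slots held");
        System.out.println("discover first: " + option1.slotsInUse() + " slots held");
        System.out.println("with cleanup:   " + option2.slotsInUse() + " slots held");
    }
}
```

In this model the reserve-first context ends up with every slot held, so
requests keep failing even after the agent becomes reachable again, while
both fixes leave the pool empty. Only option 1 also allocates the request
ids in the order they go out on the wire.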
>> -----Original Message-----
>> From: Tim Panton [mailto:thp at westhawk.co.uk]
>> Sent: Friday, September 14, 2007 5:04 AM
>> To: List for discussion of the Westhawk SNMP stack; Josh Bers
>> Cc: 'Birgit Arkesteijn'; 'Jonathan Tung'; 'Stephane Blais'
>> Subject: Re: [snmp] SNMP v3 discovery bug in 5.1
>>
>>
>>
>> On 13 Sep 2007, at 21:27, Josh Bers wrote:
>>
>>> Birgit,
>>>
>>> Here is output (tail end) from our run of a program that isolates the
>>> problem of SNMP v3 not working correctly when an agent that is
>>> unreachable, becomes reachable. We started with the network cable
>>> unplugged and tried to fetch sysUpTime from 128.89.68.66, every 5
>>> seconds we kick off a thread to fetch (authPriv enabled). Around
>>> request #35 we plug it back in
>>>
>>> (search for Thread 41 started at: 1189713664742)
>>>
>>> It seems that there is an issue with context being destroyed while
>>> still sending/receiving PDU's from the remote agent.... not clear. Why
>>> isn't the DiscoveryBean using the context pool mechanism and creating
>>> its own context directly?
>>>
>>> ps. this was run using westhawk 5.1 from a windows XP box with java
>>> 1.4.2 talking to a linux box:
>>
>> Hi,
>> I think this bug is fixed in the current CVS -
>> see :
>>
>> http://westhawksnmp.cvs.sourceforge.net/westhawksnmp/westhawksnmp/src/uk/co/westhawk/snmp/beans/UsmDiscoveryBean.java?r1=1.16&r2=1.17
>>
>> Tim.
>
>


