From birgit at westhawk.co.uk Tue Apr 3 19:06:56 2007
From: birgit at westhawk.co.uk (Birgit Arkesteijn)
Date: Tue Apr 3 18:12:13 2007
Subject: [snmp] snmpv3 context receiver thread destoyed
In-Reply-To: <004d01c7730b$5aab78d0$010010ac@JBERS2>
References: <004d01c7730b$5aab78d0$010010ac@JBERS2>
Message-ID: <20070403170655.GF3459@westhawk.co.uk>

Hi Josh,

One of the last changes I made to the code (and checked in on
SourceForge) was that the DiscoveryPdu (via the UsmDiscoveryBean) uses
the retry_intervals from the original pdu.

I know it isn't an official release, but you could check out the code
from SF and use that one. See
http://sourceforge.net/projects/westhawksnmp/

To answer your question:
The 5 retries all have the same request Id. The first one that comes
back is "accepted" (for want of a better word), the next ones are
ignored.
In the latter case you would get a message:
Pdu of msgId XXX is already answered.

Hope this answers your question.

Cheers, Birgit

On Fri, Mar 30, 2007 at 04:38:15PM -0400, Josh Bers wrote:
> Birgit and Tim,
>
> I believe that the problem I am seeing is compounded by the long
> timeouts enforced by the UsmDiscoveryBean ~20 seconds..
>
> say I am polling for data every 5 seconds, but the device is down so
> the discovery bean retries 4 times at 2 4 8 seconds
>
> that means that i'll have a backlog of USmDiscoveryBean requests
> outstanding when one finally, goes through..and then another..
> How does the stack respond to multiple USM discovery requests coming
> back from the same snmp agent?
>
>
> Josh
> _______________________________________________
> snmp mailing list
> snmp@snmp.westhawk.co.uk
> http://snmp.westhawk.co.uk/mailman/listinfo/snmp

--
-- Birgit Arkesteijn, birgit@westhawk.co.uk,
-- Westhawk Ltd, Albion Wharf, 19 Albion Street, Manchester M1 5LN, UK
-- Company no: 1769350
-- Registered Office:
-- 15 London Road, Stockton Heath, Warrington WA4 6SJ. UK.
-- tel.: +44 (0)161 237 0660
--

From birgit at westhawk.co.uk Tue Apr 3 19:16:54 2007
From: birgit at westhawk.co.uk (Birgit Arkesteijn)
Date: Tue Apr 3 18:38:19 2007
Subject: [snmp] Recieved Engine ID is not correct
In-Reply-To: <20070330053824.29604.qmail@webmail26.rediffmail.com>
References: <20070330053824.29604.qmail@webmail26.rediffmail.com>
Message-ID: <20070403171654.GG3459@westhawk.co.uk>

Hi Tausif,

The discovery process is done in two steps:
- engineId
- timeline info (boots & timestamp), only when using authentication.

When a message comes in, the engineId is checked first. If the engineId
doesn't correspond, the stack throws a DecodingException, just as you
have reported. The timeliness information is only checked when the
engineId is correct.

I'm sure your server (or the process that simulates a server) will
increment engineBoots and set timeStamp to zero. However, that doesn't
matter one way or another in this case.

The only thing that matters is that once the stack is running and has
done a discovery of your server, the engineId should not change.
There is no such thing as a "re-discovery" mechanism in SNMPv3.

Hope this helps,
Cheers, Birgit

On Fri, Mar 30, 2007 at 05:38:24AM -0000, tausif tausif wrote:
> Hi Birgit,
>
> Thanks for the reply.
> I have a few more doubts regarding the same.
> I think that after re-installation of the snmp engine it is possible
> that a different engineId gets set, and in that case the engineBoots
> value will be assigned 1 and the time must be zero? (please correct me
> if I'm wrong) I have not seen the time-stamp value in NET-SNMP's conf
> files though.
>
> In this case my snmp-manager should not treat it as an unauthorized
> packet/call and should process my request. I checked the
> TimeWindow.isEngineIdOK() code, which just matches the engineId and
> does not bother about the engineBoots and timeStamp. Shouldn't it check
> these variables as well? I think if we check these variables as well,
> our code should work fine?
>
> Waiting for your response
> Thanks & regards
> Tausif
>
> On Mon, 26 Mar 2007 Birgit Arkesteijn wrote :
> >Hi Tausif,
> >
> >When using SNMPv3, the stack discovers the engineId by asking the
> >authoritative engine for it. If authentication or privacy is used, it
> >will then ask for the timeline details as well.
> >
> >It does this once at the beginning. (It will repeat it only when
> >unsuccessful.) The details are stored for the remainder of the stack
> >lifetime.
> >
> >In your case, the stack complains that the discovered engineId and the
> >one sent (by the authoritative engine) as part of a subsequent request
> >are not the same.
> >
> >However, you said so yourself: there is a mismatch.
> >
> >The engineId is supposed to uniquely identify an authoritative engine,
> >i.e. it's not supposed to change.
> >
> >The only option I see is to stop/start the stack.
> >
> >If that doesn't work:
> >Call AsnObject.setDebug(5), the stack will then print the engineId (and
> >timeline details if using authentication or privacy) it stores. See
> >what you get when doing discovery.
> >
> >Cheers, Birgit

From jbers at bbn.com Tue Apr 3 16:06:48 2007
From: jbers at bbn.com (Josh Bers)
Date: Tue Apr 3 20:08:49 2007
Subject: [snmp] snmpv3 context receiver thread destoyed
In-Reply-To: <20070403170655.GF3459@westhawk.co.uk>
Message-ID: <000601c77623$3e0de760$010010ac@JBERS2>

Birgit,

Yeah, I'd prefer an official release but if this fixes the problem we
might update to the CVS head... Did you and/or Tim take a look at the
debug output? Any ideas about what's going wrong?

> To answer your question:
> The 5 retries all have the same request Id.
> The first one that comes back is "accepted" (for want of a better
> word), the next ones are ignored.
> In the latter case you would get a message: Pdu of msgId XXX
> is already answered.

Birgit, the problem that I'm expecting is that many Discovery PDUs with
different msgIds will be outstanding (being retried), say req ids 55,
57, and 59. Let's say the original request of 59 comes back because the
agent becomes reachable. Now retry #2 for reqid 57 goes through... Does
the stack handle this case of multiple USM discovery PDUs with
responses? What happens to the timeline, engine id, etc.?

Josh
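
The retry rule quoted in this thread (all retries of one PDU share a
request id, the first response is accepted, later ones are ignored) can
be illustrated with a small, self-contained Java sketch. This is not the
Westhawk implementation: the class RetryDedupSketch and its methods are
hypothetical names, and only the message text "Pdu of msgId XXX is
already answered" is taken from the thread. It also shows that responses
to PDUs with different request ids are tracked independently of each
other.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical model, not Westhawk code: the first response per request id wins. */
final class RetryDedupSketch {
    // Request ids that have been sent at least once (retries reuse the same id).
    private final Set<Integer> outstanding = new HashSet<>();
    // Request ids for which a response has already been accepted.
    private final Set<Integer> answered = new HashSet<>();

    /** Registers a request id; calling it again for a retry changes nothing. */
    void send(int reqId) {
        outstanding.add(reqId);
    }

    /** Describes what happens to an incoming response with the given request id. */
    String onResponse(int reqId) {
        if (!outstanding.contains(reqId)) {
            return "no outstanding Pdu with reqId " + reqId;
        }
        // Set.add returns false when the id was already present, i.e. already answered.
        if (!answered.add(reqId)) {
            return "Pdu of msgId " + reqId + " is already answered";
        }
        return "accepted";
    }

    public static void main(String[] args) {
        RetryDedupSketch stack = new RetryDedupSketch();
        stack.send(57);
        stack.send(59);
        System.out.println(stack.onResponse(59)); // first answer for 59
        System.out.println(stack.onResponse(57)); // first answer for 57
        System.out.println(stack.onResponse(57)); // a late retry answer for 57
    }
}
```

In this model a late answer to a retry never replaces the accepted one;
whether it still updates timeline data is a separate question that the
sketch does not cover.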

From birgit at westhawk.co.uk Wed Apr 4 13:18:35 2007
From: birgit at westhawk.co.uk (Birgit Arkesteijn)
Date: Wed Apr 4 12:23:51 2007
Subject: [snmp] snmpv3 context receiver thread destroyed
In-Reply-To: <000601c77623$3e0de760$010010ac@JBERS2>
References: <20070403170655.GF3459@westhawk.co.uk> <000601c77623$3e0de760$010010ac@JBERS2>
Message-ID: <20070404111835.GD5651@westhawk.co.uk>

Hi Josh,

An interesting problem ... hence a very long answer.

Incoming discovery responses aren't treated any differently from other
responses. This is because other (non-discovery) responses contain
timeline info as well, to keep the stack synchronised.

BTW, discovery is done blocked.

Here is a very short outline of what happens when a response
(discovery or not) is received and decoded:

**** AsnDecoderv3.processSNMPv3() calls:
- TimeWindow.isEngineIdOK():
  + if no engineId yet: store it
  + if engineId: compare
  returns true/false
  (debug > 4)
  When false -> throws
  DecodingException("Received engine Id xxx is not correct")

- TimeWindow.isOutsideTimeWindow():
  returns true/false, does not store
  + if no timeline info yet: return false
  + if timeline info: check if in window (return true/false)
  When true -> throws
  DecodingException("Message is outside time window")

- TimeWindow.updateTimeWindow():
  stores when:
  + (no timeline yet) & (bootsA > 0 || timeA > 0)
  + received timeline data newer than previous one
  (debug > 4)

**** In other words:
1 The engineId is stored the first time.
2 Timeline info is only stored if newer.
  However, if older timeline info is received after newer timeline
  info, the older info might cause a DecodingException.
  I don't think that's your case, since I cannot see a
  DecodingException in your debug output.

**** As to your debug output:
I'm trying to highlight the most important lines of your debug output.
I removed "uk.co.westhawk." to make the lines shorter. The first number
is the line number:

284: Bringing down the loopback interface.

671: pdu.DiscoveryPdu.setResponseException(): Timed out
4284: pdu.DiscoveryPdu.setResponseException(): Timed out
4290: pdu.GetPdu_vec.setResponseException(): Timed out

4297: Bringing up the loopback interface.

4298: stack.SnmpContextv3.addPdu(): msgId=57, Pdu reqId=57
4304: stack.SnmpContextv3.addPdu(): msgId=58, Pdu reqId=58
4374: pdu.DiscoveryPdu.sendme(): Sent Pdu reqId 58, retries 1

4436: stack.SnmpContextv3.processIncomingResponse():
      msgId=58, Pdu reqId=58

4462: stack.TimeWindow.setSnmpEngineId():
      hostaddr '127.0.0.1',
      port '161',
      snmpEngineId '800007E580F3A63E214C07C945'',
      key '127.0.0.1: 161'

4466: stack.TimeWindow.setTimeLine():
      snmpEngineId 800007E580F3A63E214C07C945,
      node stack.TimeWindowNode[
        engineId=800007E580F3A63E214C07C945,
        engineBoots=8,
        engineTime=409479,
        latestReceivedEngineTime=409479]

4469: stack.SnmpContextv3.processIncomingResponse(): rid2=58
4472: beans.UsmDiscoveryBean.startDiscovery(): Done
4474: stack.SnmpContextv3.actualEncodePacket(): msgId=57, Pdu reqId=57

4571: pdu.DiscoveryPdu.sendme(): Sent Pdu reqId 56, retries 5

4633: stack.SnmpContextv3.processIncomingResponse():
      msgId=56, Pdu reqId=56

4659: pdu.DiscoveryPdu.setResponseException(): Timed out

4667: stack.SnmpContextv3.processIncomingResponse(): rid2=56
4670: beans.UsmDiscoveryBean.startDiscovery(): Done

4758: pdu.GetPdu_vec.setResponseException(): Timed out
4774: pdu.GetPdu_vec.setResponseException(): Timed out
4784: pdu.GetPdu_vec.setResponseException(): Timed out
4794: pdu.GetPdu_vec.setResponseException(): Timed out
4795: Done unblocked timed out.

4892: WARNING: Method: public abstract void
      fcncnm.ndm.commonobjects.EventStreamAPI.updateStatus(fcncnm.ndm.commonobjects.Status)
      throws fcncnm.ndm.commonobjects.CNPException,
      couldn't add arg: DeviceStatus:
      MeId: NMS ID = 25/SECRET/HOST/bogus/local
      ME type = fcncnm.ndm.ams.adaptors.ics.ICSHost
      ME instance = localhost,knode82'
      date='Wed Mar 28 12: 05:11 EDT 2007'

4955: SEVERE: Couldn't find ip address bogus-ip cause:
      "java.net.UnknownHostException: bogus-ip: bogus-ip"

4956: java.net.UnknownHostException: bogus-ip: bogus-ip

*** Comments on your debug output:
Two of the discovery packets are answered (58, 56).
However, all your original GetPdu_vec keep timing out. Message 57
(which I assume is one of your GetPdu_vec) should have stood a chance.

As far as I can tell, the GetPdu_vec isn't even sent once (I would
expect a "GetPdu_vec.sendme()" in the log file), because it times out.

That's due to the waitForSelf(). The situation will improve if you use
the notification mechanism, since that starts the clock when the
message is actually sent (see Pdu.transmit()), whereas waitForSelf()
immediately starts waiting.

(BTW, your debug says "Done unblocked timed out", but by calling
"waitForSelf()" you are sending the GetPdu_vec blocked.)

I realise that our debugging isn't sufficient, since not all the
messages mention the request ID. That makes it very hard to track when
a request is added, sent and timed out. It won't help you now, but I'll
put that on my list.

BTW, what level of debug are you using? I would have expected more
IOExceptions.

*** Overall conclusion:
The situation will improve dramatically if you use the latest CVS
source, since the discovery packets will use the same retry intervals
as the original pdu. The discovery requests and your requests won't
drift so much.

When you send the GetPdu_vec via the notification mechanism, I expect
the pdu to be sent after the discovery succeeds, whereas now it has
already timed out.

Hope the above helps.

Cheers, Birgit

On Tue, Apr 03, 2007 at 03:06:48PM -0400, Josh Bers wrote:
> Birgit,
>
> Yeah, I'd prefer an official release but if this fixes the problem we
> might update to the CVS head... Did you and/or Tim take a look at the
> debug output? Any ideas about what's going wrong?
>
> > To answer your question:
> > The 5 retries all have the same request Id. The first one
> > that comes back is "accepted" (for want of a better word), the
> > next ones are ignored.
> > In the latter case you would get a message: Pdu of msgId XXX
> > is already answered.
> >
>
> Birgit, the problem that I'm expecting is that many Discovery PDUs with
> different msgIds will be outstanding (being retried), say req ids 55,
> 57, and 59. Let's say the original request of 59 comes back because
> the agent becomes reachable. Now retry #2 for reqid 57 goes through...
> Does the stack handle this case of multiple USM discovery PDUs with
> responses? What happens to the timeline, engine id, etc.?
>
> Josh

From jbers at bbn.com Wed Apr 4 11:30:46 2007
From: jbers at bbn.com (Josh Bers)
Date: Wed Apr 4 15:32:58 2007
Subject: [snmp] snmpv3 context receiver thread destroyed
In-Reply-To: <20070404111835.GD5651@westhawk.co.uk>
Message-ID: <000d01c776c5$d8694140$010010ac@JBERS2>

Birgit,

Thank you for the detailed analysis. I am going to try doing a notify
instead of waitForSelf() to see if that improves things. However...

I still do not understand why the Receiver thread for the localhost
context dies, which would seem to explain why all subsequent
GetPdu_vecs time out.
Based on your response I can understand that the set of GetPdu_vecs
that are lined up before the agent comes up are timing out. However,
once the timeline is in and no more UsmDiscoveryBeans need to run,
there should not be any delay in sending the PDUs. Yet you are not
seeing any sendme() calls. Are the transmitters getting killed too?

BTW, we were using a debug level of 15.

Josh
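
The timing difference discussed in this thread, between a wait that
starts immediately and a clock that starts only when the message is
actually sent, can be made concrete with a little arithmetic. The Java
sketch below is a hypothetical model and not Westhawk code: the class
and method names are made up, the 5 second timeout and 50 ms round trip
are assumed values, and only the roughly 20 second discovery delay comes
from the thread.

```java
/** Hypothetical timing model, not Westhawk code. */
final class TimeoutClockSketch {

    /**
     * Clock starts when the request is queued: time spent waiting behind
     * discovery counts against the timeout.
     */
    static boolean timesOutFromQueue(long queuedMs, long roundTripMs, long timeoutMs) {
        return queuedMs + roundTripMs > timeoutMs;
    }

    /**
     * Clock starts when the request is actually transmitted: time spent
     * queued (queuedMs) is deliberately ignored.
     */
    static boolean timesOutFromSend(long queuedMs, long roundTripMs, long timeoutMs) {
        return roundTripMs > timeoutMs;
    }

    public static void main(String[] args) {
        long queuedMs = 20_000;   // ~20 s waiting for discovery, as reported in the thread
        long roundTripMs = 50;    // assumed agent round trip once it is reachable
        long timeoutMs = 5_000;   // assumed caller timeout

        System.out.println("clock from queue times out: "
                + timesOutFromQueue(queuedMs, roundTripMs, timeoutMs));
        System.out.println("clock from send times out:  "
                + timesOutFromSend(queuedMs, roundTripMs, timeoutMs));
    }
}
```

Under these assumed numbers the request whose clock starts at queue time
has timed out before it is ever transmitted, which matches the absence
of a GetPdu_vec.sendme() line in the log excerpt.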

From birgit at westhawk.co.uk Wed Apr 4 16:55:58 2007
From: birgit at westhawk.co.uk (Birgit Arkesteijn)
Date: Wed Apr 4 16:01:17 2007
Subject: [snmp] snmpv3 context receiver thread destroyed
In-Reply-To: <000d01c776c5$d8694140$010010ac@JBERS2>
References: <20070404111835.GD5651@westhawk.co.uk> <000d01c776c5$d8694140$010010ac@JBERS2>
Message-ID: <20070404145558.GE5651@westhawk.co.uk>

Hi Josh,

I don't have much time now, so just a quick answer:

In your debug output, the lines:
4298: snmp.stack.SnmpContextv3.addPdu(): msgId=57, Pdu reqId=57
4304: snmp.stack.SnmpContextv3.addPdu(): msgId=58, Pdu reqId=58
are the last where 'addPdu()' is called.

The call chain is:
- Pdu.send() calls
- Pdu.send(int error_status, int error_index) calls
- SnmpContextBasisFace.addPdu(this);

Pdu.send() is called from your side. Are you sure you keep sending
PDUs? Could you (maybe) add some debug every time you call send()?

Cheers, Birgit

On Wed, Apr 04, 2007 at 10:30:46AM -0400, Josh Bers wrote:
> Birgit,
>
> Thank you for the detailed analysis. I am going to try doing a notify
> instead of waitForSelf() to see if that improves things. However...
>
> I still do not understand why the Receiver thread for the localhost
> context dies, which would seem to explain why all subsequent
> GetPdu_vecs time out.
> Based on your response I can understand that the set of GetPdu_vecs
> that are lined up before the agent comes up are timing out. However,
> once the timeline is in and no more UsmDiscoveryBeans need to run,
> there should not be any delay in sending the PDUs. Yet you are not
> seeing any sendme() calls. Are the transmitters getting killed too?
>
> BTW, we were using a debug level of 15.
>
> Josh
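
The three-step processing outlined earlier in this thread (engineId
check, time-window check, timeline update) can be sketched in
self-contained Java as follows. This is a simplified model written for
illustration and not the Westhawk source: the class TimeWindowSketch and
its fields are hypothetical, the exception is unchecked only to keep the
sketch short, and the stored latest engine time stands in for the local
clock estimate. The 150 second window and the "newer" rule follow
RFC 3414; the two exception messages are the ones quoted in the thread.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of the outlined checks; not Westhawk code. */
final class TimeWindowSketch {
    /** Time window in seconds, as defined by RFC 3414. */
    static final int WINDOW_SECONDS = 150;

    /** Unchecked stand-in for the stack's DecodingException. */
    static final class DecodingException extends RuntimeException {
        DecodingException(String message) { super(message); }
    }

    private final Map<String, String> engineIdByHost = new HashMap<>();
    // engineId -> {engineBoots, latestReceivedEngineTime}
    private final Map<String, int[]> timelineByEngineId = new HashMap<>();

    /** Stores the engineId the first time, compares it afterwards. */
    boolean isEngineIdOk(String hostKey, String engineId) {
        String known = engineIdByHost.putIfAbsent(hostKey, engineId);
        return known == null || known.equals(engineId);
    }

    /** Only checks, never stores. No timeline yet means "not outside". */
    boolean isOutsideTimeWindow(String engineId, int boots, int time) {
        int[] stored = timelineByEngineId.get(engineId);
        if (stored == null) {
            return false;
        }
        return boots < stored[0]
                || (boots == stored[0] && time < stored[1] - WINDOW_SECONDS);
    }

    /** Stores the first non-zero timeline, then only newer data. */
    void updateTimeWindow(String engineId, int boots, int time) {
        int[] stored = timelineByEngineId.get(engineId);
        if (stored == null) {
            if (boots > 0 || time > 0) {
                timelineByEngineId.put(engineId, new int[] {boots, time});
            }
        } else if (boots > stored[0] || (boots == stored[0] && time > stored[1])) {
            stored[0] = boots;
            stored[1] = time;
        }
    }

    /** Runs the three steps in the order given in the outline. */
    void process(String hostKey, String engineId, int boots, int time) {
        if (!isEngineIdOk(hostKey, engineId)) {
            throw new DecodingException("Received engine Id " + engineId + " is not correct");
        }
        if (isOutsideTimeWindow(engineId, boots, time)) {
            throw new DecodingException("Message is outside time window");
        }
        updateTimeWindow(engineId, boots, time);
    }

    /** Latest stored engine time, or -1 when no timeline is stored. */
    int latestTime(String engineId) {
        int[] stored = timelineByEngineId.get(engineId);
        return stored == null ? -1 : stored[1];
    }

    public static void main(String[] args) {
        TimeWindowSketch tw = new TimeWindowSketch();
        String id = "800007E580F3A63E214C07C945"; // engineId from the log excerpt
        tw.process("127.0.0.1:161", id, 8, 409479);
        tw.process("127.0.0.1:161", id, 8, 409400); // older but inside the window: not stored
        System.out.println(tw.latestTime(id));
    }
}
```

In this model a second discovery response that arrives late is harmless
as long as its timeline is no more than 150 seconds behind the stored
one; it is simply not stored.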
> > -- tel.: +44 (0)161 237 0660 > > -- > > _______________________________________________ > > snmp mailing list > > snmp@snmp.westhawk.co.uk > > http://snmp.westhawk.co.uk/mailman/listinfo/snmp > > > > > > _______________________________________________ > snmp mailing list > snmp@snmp.westhawk.co.uk > http://snmp.westhawk.co.uk/mailman/listinfo/snmp > -- -- Birgit Arkesteijn, birgit@westhawk.co.uk, -- Westhawk Ltd, Albion Wharf, 19 Albion Street, Manchester M1 5LN, UK -- Company no: 1769350 -- Registered Office: -- 15 London Road, Stockton Heath, Warrington WA4 6SJ. UK. -- tel.: +44 (0)161 237 0660 -- From snookala at alcatel-lucent.com Thu Apr 12 10:45:16 2007 From: snookala at alcatel-lucent.com (Nookala, Sridevi (Sridevi)) Date: Thu Apr 12 15:56:41 2007 Subject: [snmp] SNMP proxy - trap forwarding Message-ID: Dear westhawk team, We use westhawk stack for northbound trap forwarding. My server listens for traps from multiple agents. Some forward v1 and some v2 My server acts as a proxy. I simply pass thru v1 / v2 traps to northbound destinations. By pass thru I mean I make a new v1trappdu for incoming v1 and new v2 trap pdu for incoming v2. I copy the varbinds from in the incoming trap. In this process, the Source of the trap is changed to my Server's IP address. Folks /Customers at the northbound don't want to see the proxy ip address and want to see the actual agent address. How can I achieve this with the stack ? westhawk api I hope my question is clear This is urgent and any response will be greatly valued and welcomed. 
The platform is Sun Solaris 9 and the stack version is snmp 4.13.

-thx,
Sri

From snookala at alcatel-lucent.com Thu Apr 12 12:23:10 2007
From: snookala at alcatel-lucent.com (Nookala, Sridevi (Sridevi))
Date: Thu Apr 12 17:34:25 2007
Subject: [snmp] RE: SNMP proxy - trap forwarding
Message-ID:

Dear westhawk,

Can somebody also enlighten me on whether an application acting as a
proxy forwarder should add varbinds beyond those that came in the
original trap?

Any links/pointers to RFCs would be great.

-thx,
Sri

From snookala at alcatel-lucent.com Thu Apr 12 13:25:40 2007
From: snookala at alcatel-lucent.com (Nookala, Sridevi (Sridevi))
Date: Sat Apr 14 16:08:32 2007
Subject: [snmp] RE: SNMP proxy - trap forwarding
Message-ID:

Dear westhawk,

I was reading on Google, and if an application is acting as a proxy and
forwarding traps, it looks like it needs to tag/append extra varbinds
with the actual remote agent information from RFC2576-MIB.

Extra OID: .1.3.6.1.6.3.18.1.3.0

This OID will take the value of the actual remote agent address.

This applies especially when proxying v2 traps.

I need confirmation from SNMP experts on whether we should be tagging
this extra information.

Thanks and waiting for your response,

Sri

From snookala at alcatel-lucent.com Thu Apr 12 15:31:33 2007
From: snookala at alcatel-lucent.com (Nookala, Sridevi (Sridevi))
Date: Sat Apr 14 16:08:57 2007
Subject: [snmp] How to Explicilty specify the SRC IP in an outgoing v2 trap
Message-ID:

Dear westhawk,

I want to be more precise.

I have a sender (S1) sending traps to R1 (receiver 1).

R1 now has to send them on to R2 (a different receiver, on a different
host/port).

When I receive a trap from sender S1, I get the srcIP out of the
incoming trap.

If it's a v1 trap that S1 sent, I can use OneTrapPduv1 and call
setIpAddress to set the IP address.

If it's a v2 trap, do I have a way to explicitly specify the SRC IP?

In my case, when R1 (Sun machine) has to forward to R2, I need to put
in the IP address of the actual sender S1.

-thx,

Any help is really appreciated.
Sridevi

From birgit at westhawk.co.uk Mon Apr 16 18:09:29 2007
From: birgit at westhawk.co.uk (Birgit Arkesteijn)
Date: Mon Apr 16 17:14:54 2007
Subject: [snmp] SNMP proxy - trap forwarding
In-Reply-To:
References:
Message-ID: <20070416160929.GK22441@westhawk.co.uk>

Hi Sri,

I'm not sure I can answer all of your questions.

Trap v1 has property IpAddress, the equivalent in TrapPduv2 would be
snmpTrapEnterprise (see http://www.ietf.org/rfc/rfc3418.txt).

The javadoc documentation on class
uk.co.westhawk.snmp.stack.TrapPduv2
explains the Trap v2 requirements and has pointers to two of the RFCs:
http://www.ietf.org/rfc/rfc3416.txt
http://www.ietf.org/rfc/rfc3418.txt

TrapPduv2 requires:
- sysUpTime.0
- snmpTrapOID.0

> In this process, the Source of the trap is changed to my Server's IP
> address.

Don't you mean you change the 'destination' of the trap to your
Server's IP address?
In fact, you probably have to make sure the 'source' of the trap is the
original source and not the proxy.

I don't have experience with
1.3.6.1.6.3.18.1.3.0 (= snmpTrapAddress.0)
"The value of the agent-addr field of a Trap PDU which is
forwarded by a proxy forwarder application using an SNMP
version other than SNMPv1. The value of this object SHOULD
contain the value of the agent-addr field from the original
Trap PDU as generated by an SNMPv1 agent."
but you could try.

However, the section in RFC 2576 relates to translating SNMPv1 traps to
SNMPv2 traps. This is not what you are doing. You are forwarding (not
translating) an SNMPv2 trap via a proxy.

I'm not up to speed with how a proxy works exactly. Because you create
a new trap PDU, the server will probably think the trap comes from the
proxy.

The question is, when your server receives a trap (v1 or v2) from
your proxy and checks where it came from, will it check
- the socket (or the datagram packet)
- the PDU varbind list
?

You can change the PDU varbind list in the stack; however, you cannot
change the datagram's address in the stack. The latter is part of the
transport layer, and outside the scope of the stack.

Hope the above helps.

Cheers, Birgit

--
-- Birgit Arkesteijn, birgit@westhawk.co.uk,
-- Westhawk Ltd, Albion Wharf, 19 Albion Street, Manchester M1 5LN, UK
-- Company no: 1769350
-- Registered Office:
-- 15 London Road, Stockton Heath, Warrington WA4 6SJ. UK.
-- tel.: +44 (0)161 237 0660
--
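
The "PDU varbind list" option from the reply to Sri can be sketched in
plain Java. This is only an illustration of the idea discussed in the
thread, not Westhawk code: the class TrapForwardSketch, the VarBind
holder and the method forwardV2Varbinds are hypothetical names made up
for this example, and no stack API is called. It assumes the northbound
receiver is willing to read snmpTrapAddress.0 from the varbind list,
which the thread leaves unconfirmed. The three OIDs are the standard
ones for sysUpTime.0, snmpTrapOID.0 and snmpTrapAddress.0.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: plain Java, no Westhawk classes. All names in this class
// are hypothetical and exist purely for illustration.
final class TrapForwardSketch {
    static final String SYS_UPTIME_0 = "1.3.6.1.2.1.1.3.0";
    static final String SNMP_TRAP_OID_0 = "1.3.6.1.6.3.1.1.4.1.0";
    static final String SNMP_TRAP_ADDRESS_0 = "1.3.6.1.6.3.18.1.3.0";

    // Minimal stand-in for a varbind: an OID and a printable value.
    static final class VarBind {
        final String oid;
        final String value;

        VarBind(String oid, String value) {
            this.oid = oid;
            this.value = value;
        }

        @Override
        public String toString() {
            return oid + " = " + value;
        }
    }

    // Copies the incoming v2 trap varbinds in their original order and
    // appends snmpTrapAddress.0 carrying the original sender's address.
    // If the incoming list already carries snmpTrapAddress.0 (for example
    // from an earlier proxy), that value is kept and nothing is appended.
    static List<VarBind> forwardV2Varbinds(List<VarBind> incoming, String senderIp) {
        // A v2 trap starts with sysUpTime.0 followed by snmpTrapOID.0.
        if (incoming.size() < 2
                || !SYS_UPTIME_0.equals(incoming.get(0).oid)
                || !SNMP_TRAP_OID_0.equals(incoming.get(1).oid)) {
            throw new IllegalArgumentException(
                    "a v2 trap must start with sysUpTime.0 and snmpTrapOID.0");
        }
        List<VarBind> outgoing = new ArrayList<>(incoming);
        boolean alreadyTagged = false;
        for (VarBind vb : incoming) {
            if (SNMP_TRAP_ADDRESS_0.equals(vb.oid)) {
                alreadyTagged = true;
                break;
            }
        }
        if (!alreadyTagged) {
            outgoing.add(new VarBind(SNMP_TRAP_ADDRESS_0, senderIp));
        }
        return outgoing;
    }

    public static void main(String[] args) {
        List<VarBind> incoming = new ArrayList<>();
        incoming.add(new VarBind(SYS_UPTIME_0, "12345"));
        incoming.add(new VarBind(SNMP_TRAP_OID_0, "1.3.6.1.6.3.1.1.5.3")); // linkDown
        incoming.add(new VarBind("1.3.6.1.2.1.2.2.1.1.2", "2"));           // ifIndex.2
        // 192.0.2.10 is a documentation address standing in for sender S1.
        for (VarBind vb : forwardV2Varbinds(incoming, "192.0.2.10")) {
            System.out.println(vb);
        }
    }
}
```

With a real stack, the returned list would be copied into the outgoing
trap PDU. The source address of the UDP datagram is untouched by this
and still shows the proxy, as noted in the reply.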