= problem =
People were unable to connect properly to a conference room at about 17:00 UTC on Thursday 2010-03-04.
= analysis =
It looks like this was caused by sockets not being released in a timely fashion. Many RTP connections were still open for calls that had long since ended, even though the settings in /etc/asterisk/sip.conf indicate they should have been released. Over 200 RTP-related connections were still open, which could overrun our port allocation, even though we have never normally had enough traffic to cause this.
= enhancement recommendation =
Monitor talk.fedoraproject.org and alert when more than some number of connections (50? 100?) are active in our RTP port range, 10000-10500.
This is the first time I've filed a ticket like this so if it's not complete, I apologize -- just find me online and I'll see if I can clarify. Mike McGrath and Jared Smith (jsmith) were in on the conversation so they may be able to help too.
I just found this plugin:
rafaelgomes@mordor:~/Downloads$ ./check_netstat.pl -p ">80" -w ">5" -c ">10"
CRITICAL -tcp80_out is 21(more then 10)
The problem is that it doesn't support port ranges. I am not a good enough Perl developer to hack on it, but I am trying to understand how it works so I can suggest some changes.
I am thinking about writing my own shell script, but that would be a last resort.
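For reference, a port-range check like the one requested could be sketched as follows. This is only a rough illustration in Python rather than shell, and it makes several assumptions that are not from the ticket: that RTP sockets show up as UDP entries in /proc/net/udp and /proc/net/udp6 on a Linux host, that the thresholds are 50 (warning) and 100 (critical), and that the function names count_ports_in_range, evaluate and main are made up for this example. Only the port range 10000-10500 and the usual Nagios exit codes (0 OK, 1 WARNING, 2 CRITICAL) are taken as given.

```python
import os

RTP_PORTS = range(10000, 10501)  # 10000-10500 inclusive, as stated in the ticket
WARN, CRIT = 50, 100             # assumed thresholds; the ticket only suggests "50? 100?"


def count_ports_in_range(proc_net_lines, ports=RTP_PORTS):
    """Count sockets whose local port falls inside `ports`.

    Expects lines in the /proc/net/udp format, where the second
    whitespace-separated field is the local address as HEXIP:HEXPORT.
    """
    count = 0
    for line in proc_net_lines:
        fields = line.split()
        if len(fields) < 2 or ":" not in fields[1]:
            continue  # header line or malformed entry
        try:
            port = int(fields[1].rsplit(":", 1)[1], 16)
        except ValueError:
            continue
        if port in ports:
            count += 1
    return count


def evaluate(count, warn=WARN, crit=CRIT):
    """Map a socket count to a Nagios-style (exit_code, message) pair."""
    if count > crit:
        return 2, f"CRITICAL - {count} sockets in RTP range (more than {crit})"
    if count > warn:
        return 1, f"WARNING - {count} sockets in RTP range (more than {warn})"
    return 0, f"OK - {count} sockets in RTP range"


def main():
    """Read the kernel's UDP socket tables and return the Nagios exit code."""
    lines = []
    for path in ("/proc/net/udp", "/proc/net/udp6"):
        if os.path.exists(path):
            with open(path) as handle:
                lines.extend(handle.readlines())
    code, message = evaluate(count_ports_in_range(lines))
    print(message)
    return code
```

To use it as a plugin, the script would end with sys.exit(main()) after importing sys, so that the exit status reaches Nagios. Counting sockets by local port is only an approximation of "active RTP connections"; it is enough to catch the kind of socket leak described in the analysis.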
I forgot the plugin link:
http://exchange.nagios.org/directory/Plugins/Network-Connections%2C-Stats-and-Bandwidth/check_netstat/details
I have noticed that this service may be shut down, so I will not put any more effort into this.
According to a report sent to the sysadmin team, this server will be disabled. Because of that, I will close this ticket.