IP9100 crashed by Cisco 2900XL
emlsnws

Joined: 13 Jan 2007
Posts: 29
Location: UK Gloucestershire

Posted: Wed Aug 08, 2007 5:17 am    Post subject: IP9100 crashed by Cisco 2900XL

Hi,

Just reporting that my IP9100A (Yoics firmware 2.41d) was recently connected to my home network through a Cisco 2900XL enterprise-grade switch instead of the Netgear FS116 I previously used.

I then spent a few days hard-resetting the IP9100A each time I noticed that it had crashed. The symptoms were interesting: the device could still be polled, but it returned the last video images taken before the crash (usr/yoics0.jpg and usr/yoics1.jpg); in my case a night-time image was returned all through the next day. Legacy web page access was 'kind-of' working; it was very slow and sometimes froze. I could not revive it with a reboot commanded from the web page; a power-cycle was necessary.

Previously the IP9100A worked 100%. It was polled by a PC once a second, for 2 ports, 24/7, with no problems for months. Reconnecting it the old way (i.e. using the FS116 again and eliminating the 2900XL) restored proper operation.

I'm not seeking a solution from Yoics [it probably isn't under their control], I'm just letting everyone know. I don't know much about Cisco switches, but I did notice the port-status lights all flashing together once every second or two. Perhaps there's a regular packet sent around by the Cisco which the IP9100 can't deal with, so that it ends up crashing after some hours. I reckon the time until a crash was around 8-12 hours.

I don't have much info on the switch itself, but I am going to look at the documentation to see if some auto-network-sniff or discovery modes are on by default. That might be one cause.

So, until we know more about what packets a Cisco switch issues, or how to set up the switch to avoid the problem, I would avoid connecting IP9100As to them!

Cheers all,
Simon.

emlsnws

Posted: Fri Aug 17, 2007 1:06 am

Answering my own message... first sign of madness :lol:

I have tinkered a little more. Fortunately I have two IP9100As: my 'backup', which has 2.34 firmware and has never been reflashed, and my 'main' one, which has Yoics firmware and is the one I reported the problem with. I use two video camera inputs.

I connected the 'backup' 9100 to the Cisco switch, and split the two video inputs between them. Each 9100 is thus being 'hit' once per second.
(Edit for clarification: the Yoics 9100 was connected to the Netgear switch that was connected to the Cisco switch. Thus it was 'insulated' in some way).
This setup has now worked 24/7 for about 5 days with no crashes or freezes. In the longer term, however, I don't wish to keep my backup unit operating all the time.

My next step is to return to the single 9100, being hit twice a second, and check that the problem returns. I'll also try reducing the poll rate - that might help.


Last edited by emlsnws on Tue Aug 21, 2007 5:36 am; edited 1 time in total

emlsnws

Posted: Tue Aug 21, 2007 5:34 am

I've updated the previous post to correct a subtle point. The working configuration had the 'backup' 9100 (with legacy firmware) connected to the Cisco switch and the Yoics one connected 'through' the Netgear DS116. Thus the Yoics unit might not have 'seen' any unusual packets that the Cisco was issuing. No crashes have been seen for about 1 week.

The next step is to put both 9100A's back onto the Cisco switch to re-create the problem.

I'll keep the thread posted.

emlsnws

Posted: Tue Sep 18, 2007 4:32 am

Another update, and probably my last post on the subject.

I confirmed that reconnecting the Yoics 9100A to the Cisco resulted in crashes, as described at the top. Then reconnecting it through a Netgear DS116 switch caused the problems to cease.

I've decided to retire the 2900XL and replace it with a Gigabit switch that has fewer ports, to support more home network use. Hopefully I will not see the crashing problem again!

Googling on the subject, I found that 'keepalive' is configurable on these switches. I cannot access my switch's information, so I cannot confirm this.

Cisco also seem to have a Configuration Test Protocol (CTP), and this might permit multiple Cisco switches to talk to one another. The Yoics firmware might not handle packets from this protocol gracefully, so that a crash eventually results.

References for CTP:
http://www.cisco.com/univercd/cc/td/doc/product/software/ios102/rpcr/74056.htm See the '9000' entry.
http://www.velocityreviews.com/forums/t299885-keepalive-on-ethernet-interface.html
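
For anyone wanting to check whether frames of this type are actually reaching the camera's port, here is a minimal Python sketch that picks the EtherType out of a raw Ethernet frame and flags the 0x9000 value (the '9000' entry in the Cisco reference above). It assumes untagged Ethernet II frames; the function names, sample frames and MAC address are made up for the example, and nothing here confirms that such frames are what upsets the IP9100A.

```python
import struct

ETHERTYPE_LOOPBACK = 0x9000  # the '9000' entry: Ethernet configuration test (loopback)
ETHERTYPE_ARP = 0x0806       # ARP, used here only as a contrasting example


def ethertype(frame: bytes) -> int:
    """Return the EtherType of an untagged Ethernet II frame."""
    if len(frame) < 14:
        raise ValueError("too short to hold an Ethernet header")
    # 6 bytes destination MAC, 6 bytes source MAC, 2 bytes EtherType (big-endian)
    _dst, _src, etype = struct.unpack("!6s6sH", frame[:14])
    return etype


def is_loopback_frame(frame: bytes) -> bool:
    """True if the frame carries EtherType 0x9000."""
    return ethertype(frame) == ETHERTYPE_LOOPBACK


# Made-up frames for illustration only (hypothetical MAC address, zero payload).
MAC = bytes.fromhex("020000000001")
loop_frame = MAC + MAC + struct.pack("!H", ETHERTYPE_LOOPBACK) + bytes(46)
arp_frame = b"\xff" * 6 + MAC + struct.pack("!H", ETHERTYPE_ARP) + bytes(46)

print(is_loopback_frame(loop_frame))
print(is_loopback_frame(arp_frame))
```

On a Linux box attached to the same switch, a tcpdump capture filter such as `ether proto 0x9000` should show whether the switch sends these frames and how often.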

emlsnws

Posted: Wed Nov 14, 2007 2:28 am

Hello again,

Now my IP9100A has started crashing again. As before, it's being polled by the Linux program 'motion'. I used to use two polls a second but have now changed it to one poll per second (for each of the 2 ports in use). It runs 24/7. (Still using older Yoics firmware, not the latest).

It's connected to a D-Link Gigabit switch (I can post the model number later) and has been so for about 2 months, without any trouble.

Previously I suspected the network switch, but now I'm not so sure....
I will have to reconnect it to the DS116 switch I used to use, and see if the crashing stops. I have had trouble in the past with those little power adapters failing, so I might look there, after eliminating the network switch.

Hmm!
Simon.

emlsnws

Posted: Mon Nov 26, 2007 8:39 am

Another update to this problem. It has been crashing quite a bit while connected through the old DS116 switch.

I have noticed that after today's 9100A crash, the Linux machine polling it had 38 (I counted them) HTTP connections in FIN_WAIT1 state, most with 106 bytes in the Send-Q, and 2 more sockets in the ESTABLISHED state.
Pings to the 'crashed' 9100A resulted in OK responses but HTTP requests timed out.

Have I hit a limit in the number of sockets or something in the 9100A? Or perhaps in Linux but I suspect that's unlikely.
I did notice that after exiting Motion, these FIN_WAIT1 sockets remained.
I will endeavour to get these sockets sorted out/closed down in the program, as it's undesirable, but that will mean I have to get back into the programming side and work out a patch for Motion.

Motion is killed and restarted every midnight. I had guessed that this might be the reason for 'stuck' ports, if it goes down in an unclean fashion, but the most recent crash occurred after far fewer than 20 (= 40 ports / 2 channels used) restarts of Motion. It actually took less than 1 day and about 4 restarts.

Cheers,
Simon.


Last edited by emlsnws on Mon Nov 26, 2007 3:53 pm; edited 1 time in total

emlsnws

Posted: Tue Nov 27, 2007 2:22 am

I'm continuing to document my work on this 'problem' here, hope that's ok. It may well be a problem with my system or software instead of the 9100A and the Yoics code.

I've brought into Motion a patch that I made for HTTP Keep-Alive connections. I am fairly sure it works, although it is beta code.
Unfortunately the patched Motion resulted in a 9100A crash after around 2.5 hours. The failure mode was exactly the same as before: HTTP frozen, but still responding to pings. I did see from my logs this time that, around 90 minutes in, Motion noticed some corrupt JPEG data:
[3] Corrupt JPEG data: found marker 0xde instead of RST6
[3] Unsupported marker type 0xde
This was on one of the IP9100A channels only. Images were received for about an hour more before the stoppage occurred.
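
As background on that log message: in a JPEG stream the restart markers RST0 to RST7 are the marker codes 0xD0 to 0xD7 and are used in rotation, so the decoder knows which one to expect next. The Python sketch below only illustrates that numbering; it is not taken from Motion or libjpeg, and the function names are made up for the example.

```python
RST_BASE = 0xD0  # marker code of RST0; RST1..RST7 follow as 0xD1..0xD7


def expected_rst(interval_index: int) -> int:
    """Marker code expected at the end of restart interval number interval_index (0-based)."""
    return RST_BASE + (interval_index % 8)


def describe_marker(code: int) -> str:
    """Name a marker code if it is a restart marker, otherwise say so."""
    if RST_BASE <= code <= RST_BASE + 7:
        return f"RST{code - RST_BASE}"
    return f"not a restart marker (0x{code:02x})"


print(describe_marker(expected_rst(6)))  # the marker the decoder was waiting for
print(describe_marker(0xDE))             # the marker it reported finding
```

So the decoder expected 0xD6 and found 0xDE, which is not a restart marker at all. That would fit image data being damaged or going missing somewhere between the camera and Motion, though it does not say where.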

It was slightly different in that the output of "netstat -ao" on the Linux machine showed 2 HTTP Keepalive sockets, each with about 13000 bytes of data queued (excerpt follows):

#netstat -ao
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State Timer
(snip)
tcp 0 13161 192.168.1.8:39288 192.168.1.200:http FIN_WAIT1 unkn-4 (29.17/0/0)
tcp 0 13161 192.168.1.8:39287 192.168.1.200:http FIN_WAIT1 unkn-4 (28.44/0/0)
tcp 0 221 192.168.1.8:34041 192.168.1.200:http FIN_WAIT1 on (0.12/8/0)
tcp 0 221 192.168.1.8:34042 192.168.1.200:http FIN_WAIT1 on (0.26/8/0)
(snip)
tcp 0 0 192.168.1.8:55674 192.168.1.180:http ESTABLISHED keepalive (3535.23/0/0)
(snip)

Since I did not restart the Linux machine (192.168.1.8), I assume the sockets in FIN_WAIT1 with and without unkn-4 are due to the previous runs of Motion. The IP9100A is 192.168.1.200. I could look into changing the default timer values of the sockets opened by Motion.
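
A quick way to summarise an excerpt like the one above is to count the sockets per state and total the Send-Q for the camera's address. The following Python sketch does that on the lines pasted from the excerpt; the function name is made up, and the field positions are an assumption based on the column header shown there.

```python
from collections import Counter

# Lines copied from the netstat excerpt above.
SAMPLE = """\
tcp 0 13161 192.168.1.8:39288 192.168.1.200:http FIN_WAIT1 unkn-4 (29.17/0/0)
tcp 0 13161 192.168.1.8:39287 192.168.1.200:http FIN_WAIT1 unkn-4 (28.44/0/0)
tcp 0 221 192.168.1.8:34041 192.168.1.200:http FIN_WAIT1 on (0.12/8/0)
tcp 0 221 192.168.1.8:34042 192.168.1.200:http FIN_WAIT1 on (0.26/8/0)
tcp 0 0 192.168.1.8:55674 192.168.1.180:http ESTABLISHED keepalive (3535.23/0/0)
"""


def summarise(netstat_text, camera="192.168.1.200"):
    """Count TCP sockets per state and sum Send-Q for connections to `camera`."""
    states = Counter()
    queued = 0
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expected columns: Proto Recv-Q Send-Q Local Foreign State ...
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        send_q, foreign, state = fields[2], fields[4], fields[5]
        if foreign.rsplit(":", 1)[0] != camera:
            continue  # a different host, e.g. 192.168.1.180
        states[state] += 1
        queued += int(send_q)
    return states, queued


states, queued = summarise(SAMPLE)
print(dict(states), queued)
```

Running the same function over the full `netstat -ao` output at intervals would show whether the number of FIN_WAIT1 sockets towards the camera keeps growing between crashes.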

Also, before I go too far, I will have to work out an opportune time to upgrade to the latest Yoics firmware, as I'm aware I am still running an older one.

Simon.

emlsnws

Posted: Wed Nov 28, 2007 2:56 am

It looks like the problem is that the receive window on the IP9100A has gone to zero. My knowledge of this field is small, but increasing as I research it...

I should catch this problem if & when it occurs again as I have set up tcpdump to continuously record all traffic to & from the IP9100A.

There's also a monitoring task running on the machine, so if the fps of the IP9100A's camera (as observed by the Motion program) drops to 0, the tcpdump will be killed after a pause (a few minutes). Then I will have a record of what had been going on :) I'll just have to decipher it.....

emlsnws

Posted: Tue Dec 04, 2007 2:30 pm

Right, I've collected data with tcpdump for 3 crashes under various conditions. It doesn't seem to be the receive window going to 0 after all.

I don't have time tonight to write it all up, but I will do soon. I have tcpdump raw and decoded output at the time of the 'anomaly' and beforehand.

As I've seen enough crashes for now, I have exchanged the little switching power supply for the one belonging to the 'backup' 9100A. I have other similar power units I could use if necessary.

So I will keep the 9100A running with the new(er) power supply for now and watch for more crashes. They will probably continue!!! We will see.

mycal

Joined: 03 Dec 2006
Posts: 601

Posted: Wed Dec 05, 2007 9:37 pm

Is this for the 2.45 build?

-M

emlsnws

Posted: Fri Dec 07, 2007 6:27 am

Hi mycal,

I did say (buried above) that it's the old 2.41d version. Still plodding on with that!

I changed the wall wart power supply about two days ago, for the one from my backup IP9100A, and no crashes yet. Didn't really want to say yet, as it will spook it. I want to see it run for over a week before I get happy.

Will keep the thread updated, as always. It could do with re-naming, can you do that? Could just call it 'IP9100A crashing occasionally' or something.

Cheers,
Simon.

emlsnws

Posted: Tue Dec 11, 2007 7:08 am

Well, another update.

After 5 days operating with a different 9100's wall cube, it crashed again.

I did notice that if I put some force on the power input socket at the right angle, I could turn off the IP9100A - it seems like I interrupted the spring connection inside the power jack. Since the 9100 lives in a garage where temperatures are cold and fluctuate, I thought that was a possibility for a cause.

So, being an engineer I opened the 9100A and connected the power input by some wires and a 2.5mm power plug. Much more reliable, or so I thought. After less than 90 minutes it had crashed again. However I did use the first (original) power supply for this job.

Then I tried a 5V 4A power supply from something else, with the same 2.5mm power plug. The 9100 also crashed when connected to that! And it didn't take many hours to do so.

So I am thinking that the power supply is actually ok and I should be looking in other places. The 9100A is on a UPS together with the network switch and PC that are connected to it.

Next, I will drag out the logs I took and put them on this thread for inspection.

mycal, should I update to 2.45?

emlsnws

Posted: Thu Jan 03, 2008 3:05 am

I have updated to 2.45a Yoics firmware over the holiday period.

It still crashed, _but_ the crash corresponded to the Motion program re-starting at midnight. It does this every day under cron control, and crashes have often occurred at midnight (but at other times too).
Perhaps when Motion re-starts, it interrupts the protocol and the IP9100 is left 'hanging' and doesn't recover. A patch to Motion to finish what it was doing before re-starting is certainly possible; I have written other patches for it.

I've set up another experiment now, removing the Motion restart every midnight. Still running with the replacement power supply and wired-through power socket for connection reliability.

I have the logs for 2.41 crashing but don't think it's worth posting them. I intend to capture some more with 2.45a, which will be far more useful.

Simon.

emlsnws

Posted: Sat Jan 05, 2008 3:09 pm

So, here are some logs taken from my troublesome IP9100A as it crashes. The 9100 is on 192.168.1.200 and is running 2.45a. Set to 640x480 and channels 1,2 'up'.

The polling system is 192.168.1.8 and is an Athlon, Fedora Core 6 system, also running mythtv (although not heavily loaded or even active at the time). The polling program is 'motion' SVN 292 which is the latest development version.

I use tcpdump to obtain the logs, and a shell script to monitor the Motion program's frame rate from the IP9100A channel 1 every 30s, and when it goes to 0, after a delay the tcpdump is halted.
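
The monitoring arrangement described above could be sketched roughly as follows. This is Python rather than the shell script actually used, and read_fps is a hypothetical callable standing in for however the frame rate is read from Motion; the interface name, output file name and the 3 minute grace period are placeholders.

```python
import subprocess
import time


def start_capture(camera_ip="192.168.1.200", outfile="ip9100a.pcap", iface="eth0"):
    """Start tcpdump recording all traffic to and from the camera."""
    # -n: no name lookups, -i: interface, -w: write raw packets to a file
    return subprocess.Popen(
        ["tcpdump", "-n", "-i", iface, "-w", outfile, "host", camera_ip]
    )


def watch_and_stop(capture, read_fps, poll_s=30.0, grace_s=180.0, sleep=time.sleep):
    """Poll read_fps() until it reports 0, wait grace_s, then stop the capture."""
    while read_fps() > 0:
        sleep(poll_s)
    sleep(grace_s)       # keep recording for a while after the stall
    capture.terminate()  # SIGTERM, so tcpdump can close its output file
    capture.wait()
```

A caller would do something like `watch_and_stop(start_capture(), read_fps)`. Passing `sleep` in as an argument is only there so the loop can be exercised without real delays.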

I saw the other thread where mycal talked about internal temperatures causing problems with the 9100A, so my unit now has its top cover removed, and is sitting in a UK garage at about 10°C so overheating shouldn't be a problem. (Unless there is perhaps some kind of aging going on, making the devices more unreliable with time. Hope not.)

I've also eliminated the motion midnight restart, as it has been disabled, and anyway these crashes have occurred well away from midnight.

These are tcpdump logs showing all traffic to/from 192.168.1.200. I have much more data if it will be of interest. Not full packet dumps but can re-do them if needed. I am only posting the area immediately before it goes wrong, as it is very repetitive (continual reading of images).

I use Static IPs throughout.

Here we go:

This log is with ip9100a with channels 1,2 being sampled and both being polled once a second by the Motion program. Ports 45787 & 45788 are used for the 2 channels.
The test ran for ~3 hours before the error occurred.
Quote:

21:00:58.466388 IP 192.168.1.200.http > 192.168.1.8.45787: . 924677:926125(1448) ack 5550 win 5792 <nop,nop,timestamp 1650050 1478264796>
21:00:58.466397 IP 192.168.1.8.45787 > 192.168.1.200.http: . ack 926125 win 1962 <nop,nop,timestamp 1478264797 1650050>
21:00:58.466660 IP 192.168.1.200.http > 192.168.1.8.45787: . 926125:927573(1448) ack 5550 win 5792 <nop,nop,timestamp 1650050 1478264796>
21:00:58.466667 IP 192.168.1.8.45787 > 192.168.1.200.http: . ack 927573 win 1919 <nop,nop,timestamp 1478264797 1650050>
21:00:58.467696 IP 192.168.1.200.http > 192.168.1.8.45787: . 927573:929021(1448) ack 5550 win 5792 <nop,nop,timestamp 1650050 1478264797>
21:00:58.467717 IP 192.168.1.8.45787 > 192.168.1.200.http: . ack 929021 win 1876 <nop,nop,timestamp 1478264798 1650050>
21:00:58.467736 IP 192.168.1.200.http > 192.168.1.8.45787: P 929021:929445(424) ack 5550 win 5792 <nop,nop,timestamp 1650050 1478264797>
21:00:58.467741 IP 192.168.1.8.45787 > 192.168.1.200.http: . ack 929445 win 1862 <nop,nop,timestamp 1478264798 1650050>
21:00:59.195287 IP 192.168.1.8.45788 > 192.168.1.200.http: P 5439:5550(111) ack 1483477 win 2003 <nop,nop,timestamp 1478265526 1650024>
21:00:59.231027 IP 192.168.1.200.http > 192.168.1.8.45788: . ack 5550 win 5792 <nop,nop,timestamp 1650127 1478265526>
21:00:59.480223 IP 192.168.1.8.45787 > 192.168.1.200.http: P 5550:5661(111) ack 929445 win 2003 <nop,nop,timestamp 1478265811 1650050>
21:00:59.510968 IP 192.168.1.200.http > 192.168.1.8.45787: . ack 5661 win 5792 <nop,nop,timestamp 1650156 1478265811>
21:01:04.193725 IP 192.168.1.8.45788 > 192.168.1.200.http: P 5550:5661(111) ack 1483477 win 2003 <nop,nop,timestamp 1478270526 1650127>
21:01:04.194090 IP 192.168.1.200.http > 192.168.1.8.45788: . ack 5661 win 5792 <nop,nop,timestamp 1650627 1478270526>
21:01:04.478634 IP 192.168.1.8.45787 > 192.168.1.200.http: P 5661:5772(111) ack 929445 win 2003 <nop,nop,timestamp 1478270811 1650156>
21:01:04.478954 IP 192.168.1.200.http > 192.168.1.8.45787: . ack 5772 win 5792 <nop,nop,timestamp 1650655 1478270811>
21:02:14.191912 IP 192.168.1.8.45788 > 192.168.1.200.http: P 5661:5772(111) ack 1483477 win 2003 <nop,nop,timestamp 1478340547 1650627>
21:02:14.192255 IP 192.168.1.200.http > 192.168.1.8.45788: . ack 5772 win 5792 <nop,nop,timestamp 1657623 1478340547>
21:02:14.476920 IP 192.168.1.8.45787 > 192.168.1.200.http: P 5772:5883(111) ack 929445 win 2003 <nop,nop,timestamp 1478340832 1650655>
21:02:14.477253 IP 192.168.1.200.http > 192.168.1.8.45787: . ack 5883 win 5792 <nop,nop,timestamp 1657652 1478340832>
21:02:19.188348 arp who-has 192.168.1.8 tell 192.168.1.200
21:02:19.188363 arp reply 192.168.1.8 is-at 00:50:8d:e5:d0:7c (oui Unknown)
21:02:19.190896 IP 192.168.1.8.45788 > 192.168.1.200.http: P 5772:5883(111) ack 1483477 win 2003 <nop,nop,timestamp 1478345548 1657623>
21:02:19.191199 IP 192.168.1.200.http > 192.168.1.8.45788: . ack 5883 win 5792 <nop,nop,timestamp 1658123 1478345548>
21:02:19.475263 IP 192.168.1.8.45787 > 192.168.1.200.http: P 5883:5994(111) ack 929445 win 2003 <nop,nop,timestamp 1478345832 1657652>
21:02:19.475579 IP 192.168.1.200.http > 192.168.1.8.45787: . ack 5994 win 5792 <nop,nop,timestamp 1658151 1478345832>
21:03:29.474275 IP 192.168.1.8.45787 > 192.168.1.200.http: P 5994:6105(111) ack 929445 win 2003 <nop,nop,timestamp 1478415854 1658151>
21:03:29.474621 IP 192.168.1.200.http > 192.168.1.8.45787: . ack 6105 win 5792 <nop,nop,timestamp 1665147 1478415854>
21:03:34.472690 IP 192.168.1.8.45787 > 192.168.1.200.http: P 6105:6216(111) ack 929445 win 2003 <nop,nop,timestamp 1478420854 1665147>
21:03:34.473029 IP 192.168.1.200.http > 192.168.1.8.45787: . ack 6216 win 5792 <nop,nop,timestamp 1665646 1478420854>
21:03:34.473325 arp who-has 192.168.1.8 tell 192.168.1.200
21:03:34.473336 arp reply 192.168.1.8 is-at 00:50:8d:e5:d0:7c (oui Unknown)



This log is with 2 IP9100 channels 'up' and only one being polled. Port 39577 is used to poll channel 1. The test ran for about 6 hours before the error occurred.
Quote:

17:53:23.310663 IP 192.168.1.200.http > 192.168.1.8.39577: . 43643941:43645389(1448) ack 270840 win 5792 <nop,nop,timestamp 2276814 1553434675>
17:53:23.310698 IP 192.168.1.8.39577 > 192.168.1.200.http: . ack 43645389 win 2003 <nop,nop,timestamp 1553434676 2276814>
17:53:23.310948 IP 192.168.1.200.http > 192.168.1.8.39577: . 43645389:43646837(1448) ack 270840 win 5792 <nop,nop,timestamp 2276814 1553434675>
17:53:23.310959 IP 192.168.1.8.39577 > 192.168.1.200.http: . ack 43646837 win 2003 <nop,nop,timestamp 1553434676 2276814>
17:53:23.311899 IP 192.168.1.200.http > 192.168.1.8.39577: . 43646837:43648285(1448) ack 270840 win 5792 <nop,nop,timestamp 2276814 1553434676>
17:53:23.311934 IP 192.168.1.8.39577 > 192.168.1.200.http: . ack 43648285 win 2003 <nop,nop,timestamp 1553434677 2276814>
17:53:23.312163 IP 192.168.1.200.http > 192.168.1.8.39577: . 43648285:43649733(1448) ack 270840 win 5792 <nop,nop,timestamp 2276814 1553434676>
17:53:23.312174 IP 192.168.1.8.39577 > 192.168.1.200.http: . ack 43649733 win 2003 <nop,nop,timestamp 1553434677 2276814>
17:53:23.313332 IP 192.168.1.200.http > 192.168.1.8.39577: P 43649733:43651181(1448) ack 270840 win 5792 <nop,nop,timestamp 2276815 1553434677>
17:53:23.313348 IP 192.168.1.8.39577 > 192.168.1.200.http: . ack 43651181 win 2003 <nop,nop,timestamp 1553434678 2276815>
17:53:24.302923 IP 192.168.1.8.39577 > 192.168.1.200.http: P 270840:270951(111) ack 43651181 win 2003 <nop,nop,timestamp 1553435668 2276815>
17:53:24.342924 IP 192.168.1.200.http > 192.168.1.8.39577: . ack 270951 win 5792 <nop,nop,timestamp 2276919 1553435668>
17:53:29.301377 IP 192.168.1.8.39577 > 192.168.1.200.http: P 270951:271062(111) ack 43651181 win 2003 <nop,nop,timestamp 1553440668 2276919>
17:53:29.301700 IP 192.168.1.200.http > 192.168.1.8.39577: . ack 271062 win 5792 <nop,nop,timestamp 2277417 1553440668>
17:54:39.299557 IP 192.168.1.8.39577 > 192.168.1.200.http: P 271062:271173(111) ack 43651181 win 2003 <nop,nop,timestamp 1553510690 2277417>
17:54:39.299950 IP 192.168.1.200.http > 192.168.1.8.39577: . ack 271173 win 5792 <nop,nop,timestamp 2284415 1553510690>
17:54:44.298019 IP 192.168.1.8.39577 > 192.168.1.200.http: P 271173:271284(111) ack 43651181 win 2003 <nop,nop,timestamp 1553515690 2284415>
17:54:44.298352 IP 192.168.1.200.http > 192.168.1.8.39577: . ack 271284 win 5792 <nop,nop,timestamp 2284914 1553515690>
17:54:44.299881 arp who-has 192.168.1.8 tell 192.168.1.200
17:54:44.299892 arp reply 192.168.1.8 is-at 00:50:8d:e5:d0:7c (oui Unknown)



Things might be a little clearer in the one-channel log without the complication of the second channel, but AFAICT the same things are happening.

From what I can tell, all is going well at first: both logs show the tail-end of the last image going from .200 -> .8 (IP9100A to server), which are the multiple packets in that direction.
Then suddenly (at 21:01:04 and 17:53:29) there is a 5s delay (look at the time stamps) and the server sends another HTTP request (these are 111 bytes every time), which is acknowledged. 70s after that (1 min 10s by the time stamps) the server sends another HTTP request, which is ack'ed, and another one 5s later, which is also ack'ed.
At about the time of that last request, the 9100A (.200) seems to have lost the server's MAC address and issues an ARP request. This might indicate it has reset?
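
The intervals between requests can be pulled out of the tcpdump text mechanically. Below is a minimal Python sketch that finds the client's requests (the packets from 192.168.1.8 with the P flag) and works out the time between consecutive ones on each port. The regular expression assumes the line format shown in the logs above, the function name is made up, and the sample is a few lines from the two-channel log with the TCP options trimmed off.

```python
import re
from collections import defaultdict

# hh:mm:ss.frac IP src > dst: flags ...
LINE = re.compile(r"^(\d+):(\d+):(\d+\.\d+) IP (\S+) > (\S+): (\S+) ")

SAMPLE = """\
21:00:59.195287 IP 192.168.1.8.45788 > 192.168.1.200.http: P 5439:5550(111) ack 1483477 win 2003
21:00:59.231027 IP 192.168.1.200.http > 192.168.1.8.45788: . ack 5550 win 5792
21:01:04.193725 IP 192.168.1.8.45788 > 192.168.1.200.http: P 5550:5661(111) ack 1483477 win 2003
21:02:14.191912 IP 192.168.1.8.45788 > 192.168.1.200.http: P 5661:5772(111) ack 1483477 win 2003
21:02:19.188348 arp who-has 192.168.1.8 tell 192.168.1.200
21:02:19.190896 IP 192.168.1.8.45788 > 192.168.1.200.http: P 5772:5883(111) ack 1483477 win 2003
"""


def request_gaps(log_text, client="192.168.1.8"):
    """Seconds between consecutive pushed (P) packets from each client port."""
    times = defaultdict(list)
    for line in log_text.splitlines():
        m = LINE.match(line)
        if not m:
            continue  # ARP lines and anything else that is not an IP line
        hh, mm, ss, src, _dst, flags = m.groups()
        if src.startswith(client + ".") and "P" in flags:
            times[src].append(int(hh) * 3600 + int(mm) * 60 + float(ss))
    return {src: [b - a for a, b in zip(ts, ts[1:])] for src, ts in times.items()}


for src, gaps in request_gaps(SAMPLE).items():
    print(src, [round(g, 1) for g in gaps])
```

On the sample lines the intervals for port 45788 come out at roughly 5 s, 70 s and 5 s, against the intended poll rate of one request per second. The sketch does not handle a capture that runs past midnight.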


Conclusions? I have yet to disable all but Channel 1 on the 9100A, or try (say) Channel 1 and Channel 4. I've tried a slower poll rate in the past, 2s and 5s I think, with no improvement.

I have a nagging doubt about the Linux kernel, as I found some kernel-dev messages about TCP/IP hangups (but can I go back and find them? I doubt it...)
I may change the Linux version, as this is the third Linux I have used with this IP9100A. I have some mythtv reasons to go to a new version, and will probably try FC8.
I started with NSLU2-Linux, then Grafpup (Puppy) Linux, and am now on FC6. The hardware has also changed each time, and the funny thing is I don't remember as many crashes with the NSLU2 at all, but I remember a few with Grafpup. Maybe I just noticed them more. Which makes me wonder, as a hardware engineer, if the IP9100 is aging in some way and becoming more unreliable. As I mentioned at the top of this long thread, I have a backup 9100A which I could substitute in. My main unit was bought second hand so I don't know how old it is (I've used it continuously for about a year).
Or it could be coincidence and I have a Motion program or Linux TCP/IP issue. Who knows!!

I would appreciate any opinions / ideas etc.


mycal

Posted: Sun Jan 06, 2008 11:20 pm

From a quick glance it looks like the crash happens on a new image request on a keepalive connection. Correct me if I'm wrong.

Also, the 2K window on the server side is strange. It looks like it is falling slowly, and I think it may be the cause of the stall, as the 9100 looks like it goes into slow start and never recovers.

Is there a new connection for each request or are you using the keepalive option? I know you guys added the keep-alive option; I'm wondering if the same results happen without it.

Also, what are those 111 bytes? A new request? It looks like it keeps flowing and the 9100 is acking. What is that data?

If the keepalive option is the cause you may want to limit the # of requests you do on a single socket to 1K and then reconnect. I know Apache does this, but the 9100 webserver is not that smart so you might have to do it on the client end.
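
The idea of capping the number of requests per socket on the client side might look something like this. It is a Python sketch using the standard http.client module, not code from Motion (which is written in C); the class name and the injectable connect argument are made up for the example, and 1000 is just the '1K' figure suggested above.

```python
import http.client


class RecyclingClient:
    """Reuse one keep-alive connection, but reconnect after max_requests requests."""

    def __init__(self, host, max_requests=1000, connect=http.client.HTTPConnection):
        self.host = host
        self.max_requests = max_requests
        self._connect = connect  # factory for connections; replaceable for testing
        self._conn = None
        self._used = 0

    def get(self, path):
        # Open a fresh connection on first use or once the cap has been reached.
        if self._conn is None or self._used >= self.max_requests:
            if self._conn is not None:
                self._conn.close()
            self._conn = self._connect(self.host)
            self._used = 0
        self._conn.request("GET", path)
        body = self._conn.getresponse().read()  # read fully before reusing the socket
        self._used += 1
        return body
```

Usage would be along the lines of `client = RecyclingClient("192.168.1.200")` followed by `client.get("/usr/yoics0.jpg")` once per poll. Whether recycling the socket actually avoids the stall on the 9100 is untested here.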

All times are GMT - 7 Hours
Page 1 of 2