07-18-2014, 01:42 PM
Shadow
Why programming is sometimes hard

I fixed some networking issues in our engine (specifically Drox Operative) a while back and the path I had to take to find the solution was interesting, so I thought I would write up a little of what happened.

The problem was that once a game progressed long enough, people were having a hard time joining the server hosting that particular sector. Easy enough, I thought, but no matter what I did I couldn't reproduce the problem. I even got some of our gamers to send me their save games that they were having trouble with, but still no luck.

Even though I couldn't reproduce the problem, I looked at the save games to see if there was anything interesting going on. What I found was that one of the initial networking messages was so large that the networking system needed to split it into more than 10 fragments (the system would send 10+ smaller messages). At the time, the system would throw out the entire message if it received an out-of-order fragment or if any fragment got lost. I figured that if the packet loss was very high, a message of that size would almost never get through. So I set my handy packet loss tool to drop 50% of the packets and, sure enough, I could almost never get that message through.
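To put a rough number on "almost never": a minimal sketch in Python (not the engine's code), assuming each fragment travels in its own packet, losses are independent, and reordering is ignored. The function name `delivery_probability` is made up for illustration.

```python
def delivery_probability(fragment_count, packet_loss):
    """Chance that every fragment of one message arrives.

    Assumes one packet per fragment and independent losses, and that a
    single lost fragment discards the whole message (the old behavior).
    """
    return (1.0 - packet_loss) ** fragment_count


# 10 fragments at 50% packet loss
print(delivery_probability(10, 0.5))  # 0.0009765625
```

Under those assumptions a 10-fragment message gets through about once in every 1024 tries, and anything longer does worse.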

So I went about making the system smarter about fragments. I made it store all of the fragments, so that if it missed a fragment or got one out of order, everything was fine and it would just wait for the other side to resend the message. As long as the resent message was the same as the first, it would ignore repeated fragments and just use the ones it still needed. This way, sooner or later, even with bad packet loss, it would get all of the fragments.
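The idea can be sketched roughly like this. It is a Python illustration, not the engine's actual code: the class name, the `add` signature, and the assumption that every fragment carries a message id, its index, and the total fragment count are all hypothetical.

```python
class FragmentReassembler:
    """Collects fragments of one message across repeated sends."""

    def __init__(self):
        self.message_id = None
        self.fragment_count = 0
        self.fragments = {}  # fragment index -> payload bytes

    def add(self, message_id, index, fragment_count, payload):
        """Store one fragment; return the whole message once complete, else None."""
        if message_id != self.message_id or fragment_count != self.fragment_count:
            # A different message showed up: drop what we had and start over.
            self.message_id = message_id
            self.fragment_count = fragment_count
            self.fragments = {}
        if not 0 <= index < fragment_count:
            return None  # malformed fragment, ignore it
        # Keep the first copy of each fragment; repeats are ignored.
        self.fragments.setdefault(index, payload)
        if len(self.fragments) < self.fragment_count:
            return None  # still waiting on at least one fragment
        return b"".join(self.fragments[i] for i in range(self.fragment_count))


reassembler = FragmentReassembler()
# First send: fragment 1 is lost and fragment 2 arrives before fragment 0.
reassembler.add(7, 2, 3, b"c")
reassembler.add(7, 0, 3, b"a")
# Resend: fragment 0 is a repeat, fragment 1 fills the gap.
reassembler.add(7, 0, 3, b"a")
message = reassembler.add(7, 1, 3, b"b")
print(message)  # b'abc'
```

Each resend only has to fill in the holes, so the chance of finishing goes up with every attempt instead of starting from zero each time.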

So now everything should work great. I tested it and the new code did exactly what it should, but joining still didn't work. Even weirder, I turned the packet loss tool off and it still didn't work, even though it used to work when the tool was off. At least now I had a reproducible situation.

I debugged the problem and saw that each fragment arrived fine until a little after 8192 bytes. It wasn't exactly 8192, but it was close enough to that power of 2 that I was suspicious. I turned the packet loss tool back on and started getting data past 8192 bytes, but I noticed that the same number of fragments got through each time. So the networking was only delivering a certain number of bytes before eating everything else. I did a little googling and found out that Windows defaults to an 8192-byte buffer for incoming UDP packets.
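For anyone curious what adjusting that buffer looks like, here is a minimal sketch using the standard `SO_RCVBUF` socket option. It is written in Python purely for illustration; the helper name `make_udp_socket` and the 256 KB figure are assumptions, not what the engine uses.

```python
import socket


def make_udp_socket(recv_buffer_bytes=256 * 1024):
    """Create a UDP socket and ask the OS for a larger receive buffer.

    The 256 KB default is an arbitrary illustrative value. The OS may
    round, double, or cap the request, so read the value back afterwards.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, recv_buffer_bytes)
    return sock


sock = make_udp_socket()
granted = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
print("receive buffer is now", granted, "bytes")
sock.close()
```

Reading the value back matters because the number the OS actually grants is not guaranteed to match what was requested.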

OK, so I'd found the problem. I found the correct commands, and now the networking is told to use a much larger buffer so it can hold at least one large message. I tested again and it still didn't work! I started debugging again. Now I saw fragments coming through way past 8192 bytes, so that was fixed, but at around 32K one of the numbers went negative. Again, that sounded like a power of 2, in this case the limit of a signed 16-bit value. Sure enough, I found that fragments were using a signed 16-bit value for the offset. Again, easy enough: I changed it to a 32-bit value, so that should never be a problem again, assuming we never generate messages that are over 2GB in size.
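The wraparound is easy to show in a few lines. This is a Python sketch that mimics two's-complement storage by hand; the helper names and the 1024-byte fragment size are assumptions for illustration, not values from the engine.

```python
FRAGMENT_SIZE = 1024  # assumed fragment payload size, for illustration only


def as_int16(value):
    """Return what a signed 16-bit field would hold after storing value."""
    return ((value + 0x8000) & 0xFFFF) - 0x8000


def as_int32(value):
    """Return what a signed 32-bit field would hold after storing value."""
    return ((value + 0x80000000) & 0xFFFFFFFF) - 0x80000000


for fragment_index in (30, 31, 32, 33):
    offset = fragment_index * FRAGMENT_SIZE
    print(fragment_index, offset, as_int16(offset), as_int32(offset))
```

With these numbers the 16-bit column is fine through fragment 31 (offset 31744) and turns negative at fragment 32 (offset 32768), while the 32-bit column keeps matching the true offset up to 2,147,483,647 bytes, which is where the 2GB limit comes from.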

I tested again and things finally worked as they should! So in the end, my initial packet loss changes had nothing to do with the real problem we were running into. The fixes to the actual problem were about 2 lines of code, compared to probably hundreds dealing with the packet loss. However, it's still a nice change because it handles packet loss much better on large messages. I'm still not quite sure why I couldn't reproduce the problem in the first place, though.
Steven Peeler
Depths of Peril, Kivi's Underworld, Din's Curse, Drox Operative, Zombasite, Din's Legacy, & Drox Operative 2