I mentioned last week that I had a small design flaw in my networking to look at this week. Working on that flaw, increasing the capacity of the server, and fixing the few bugs that came up along the way took up the whole week.
The design flaw I found was that the networking code would send at most one UDP packet per update. That one packet could contain a lot of individual messages, but with 35 clients the server couldn't send messages fast enough to keep the network message queue from overflowing after a while.
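As a rough sketch of the fix (not the actual Miranda code), the idea is to keep building packets until every pending message has been packed, instead of stopping after the first one. The names `Message`, `Packet`, and `buildPackets`, and the byte limit, are all made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types: a message is an opaque payload, a packet is a batch of messages.
using Message = std::string;
using Packet = std::vector<Message>;

// Pack every pending message into as many packets as needed, each carrying
// at most maxPacketBytes of payload. A single message larger than the limit
// still gets a packet to itself rather than being dropped.
std::vector<Packet> buildPackets(const std::vector<Message>& pending,
                                 std::size_t maxPacketBytes)
{
    assert(maxPacketBytes > 0);
    std::vector<Packet> packets;
    Packet current;
    std::size_t used = 0;
    for (const Message& m : pending) {
        // Close the current packet when this message would not fit in it.
        if (!current.empty() && used + m.size() > maxPacketBytes) {
            packets.push_back(std::move(current));
            current.clear();
            used = 0;
        }
        current.push_back(m);
        used += m.size();
    }
    if (!current.empty()) {
        packets.push_back(std::move(current));
    }
    return packets;
}
```

Each update would then send every packet returned, so the outgoing queue drains as fast as messages arrive instead of one packet at a time.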
That was easy enough to fix, but I was rather surprised to find that the message queue was still overflowing. The confirmation packets, which tell the sender that a packet was received so it can be removed from the queue, were sometimes showing up very, very late. There seemed to be no good reason for this. I could see the confirmation packets being sent, but sometimes the receiver just wouldn't send one for a really long time. I spent most of a day trying to figure out what was going on before I finally spotted a < rather than a > in the check for when to send those packets. Yes, programming is super-picky.
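To show why late confirmations make the queue overflow, here is a minimal sketch of a sender-side queue where a message stays queued until it is confirmed. `ReliableQueue`, its methods, and the sequence numbering are assumptions for illustration, not the real implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical sender-side queue: a message stays here until confirmed.
class ReliableQueue {
public:
    explicit ReliableQueue(std::size_t capacity) : capacity_(capacity)
    {
        assert(capacity > 0);
    }

    // Queue a message for sending. Returns false when the queue is full,
    // which is the overflow that late confirmations eventually cause.
    bool enqueue(const std::string& payload)
    {
        if (pending_.size() >= capacity_) {
            return false;
        }
        pending_.emplace(nextSeq_++, payload);
        return true;
    }

    // Called when a confirmation for seq arrives. Returns true if that
    // message was still pending and has now been removed.
    bool acknowledge(std::uint32_t seq)
    {
        return pending_.erase(seq) > 0;
    }

    std::size_t pendingCount() const { return pending_.size(); }

private:
    std::size_t capacity_;
    std::uint32_t nextSeq_ = 0;
    std::map<std::uint32_t, std::string> pending_;
};
```

The only thing that ever frees a slot is `acknowledge`, so if confirmations stall while new messages keep arriving, `enqueue` starts failing no matter how quickly packets go out.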
With that fixed, it turned out it still didn't work. Occasionally the confirmation packets would still show up really late. Investigating some more, I discovered that in those situations the server was receiving a lot of messages in a single update and it took about 5 seconds to process them all - long enough for the queue to overflow. Previously I had some code that made sure the networking code didn't run for too long before letting everything else in the game run, but I had removed it because I didn't think I needed it anymore. As soon as I saw what was happening, I remembered that the same thing had happened before and that this was the solution I had come up with. I put that code back in, and suddenly I could get 50 players in-game and have them stay.
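A time limit like that can be sketched roughly as follows. This is only an illustration under assumed names (`processWithBudget`, the handler callback, and the budget value are all hypothetical): stop handling incoming messages once the budget for this update is spent and leave the rest for the next update.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <deque>

// Handle queued messages until the queue is empty or the time budget for
// this update runs out. Unhandled messages stay queued for the next update.
// Returns how many messages were handled.
template <typename Msg, typename Handler>
std::size_t processWithBudget(std::deque<Msg>& incoming, Handler handle,
                              std::chrono::steady_clock::duration budget)
{
    assert(budget >= std::chrono::steady_clock::duration::zero());
    const auto deadline = std::chrono::steady_clock::now() + budget;
    std::size_t handled = 0;
    while (!incoming.empty()) {
        // Always handle at least one message so a tiny budget cannot
        // stall the queue forever.
        if (handled > 0 && std::chrono::steady_clock::now() >= deadline) {
            break;
        }
        handle(incoming.front());
        incoming.pop_front();
        ++handled;
    }
    return handled;
}
```

With a budget of a few milliseconds per update, a burst of messages gets spread over several updates, so the code that sends confirmations still gets to run in between.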
Getting 50 players in-game is one thing, but the goal is 1000, which requires some expansion of the basic capabilities of the network messaging. I spent an evening increasing the queue size from just 150 to 2047. (This was more involved than just changing a number. Once the number was changed, I found some things that suddenly ran quite slowly when faced with managing 13x as many messages.) That probably still won't be enough, but I'm not sure how large it will need to be, so I made some notes to myself on how to tweak that later.
The next problem was that while I was trying to get 50 players in-game reliably I was running out of CPU on the PC running the debug server software. I was surprised to discover that what was actually using the CPU was not any of my server software, but Visual Studio. Miranda uses OutputDebugString to print out debugging information in the Visual Studio debugger. Just transferring the debugging messages to the debugger was using 4x as much CPU as all the server software combined.
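One possible way to cut that cost, shown here only as a sketch and not necessarily the approach taken in Miranda, is to filter messages by verbosity before they ever reach `OutputDebugString`. The `DebugLog` class and `LogLevel` enum are hypothetical; `OutputDebugStringA` is the real Win32 call, and the non-Windows branch exists only so the sketch builds elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical verbosity levels, lowest number is most important.
enum class LogLevel { Error = 0, Info = 1, Verbose = 2 };

// Hypothetical wrapper that drops messages above a chosen verbosity
// before they are handed to the debugger.
class DebugLog {
public:
    explicit DebugLog(LogLevel maxLevel) : maxLevel_(maxLevel) {}

    // Returns true if the text was forwarded, false if it was suppressed.
    bool write(LogLevel level, const char* text)
    {
        assert(text != nullptr);
        if (level > maxLevel_) {
            ++suppressed_;
            return false;
        }
#ifdef _WIN32
        OutputDebugStringA(text);   // goes to the attached debugger
#else
        std::fputs(text, stderr);   // stand-in when not on Windows
#endif
        return true;
    }

    std::size_t suppressedCount() const { return suppressed_; }

private:
    LogLevel maxLevel_;
    std::size_t suppressed_ = 0;
};
```

A load test could then run with `DebugLog log(LogLevel::Error);` so only errors cross over to the debugger, while a normal debugging session keeps the verbose output.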
And that was 6:00 Friday afternoon.