Recording game video is not as easy a problem as you might think. You likely want to record at 1920x1080 resolution and 60FPS, so that's 1920x1080x3x60 = 373,248,000 bytes every second (373MB/s). My disk drive is pretty quick at around 85MB/s, and some newer hard drives can break 100MB/s, but that is still nowhere near enough. My first video recording system wrote every frame as a PNG, and I liked to call it the crash-the-game button. It would record about ten seconds of video before it ran out of memory and crashed. However, once I had half a video recording system, I kind of wanted to finish it. Actually, that's not true. The real reason for all this work is that I'm not satisfied with the quality of video produced by screen capture programs. The trailer for Miranda needs to look as good as I can make it.
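The arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch: the function name is made up for the example, and the 85MB/s figure is the disk speed quoted in the text, treated as 85,000,000 bytes per second.

```python
def raw_rate_bytes_per_second(width, height, bytes_per_pixel, fps):
    """Uncompressed video data rate for a given resolution and framerate."""
    return width * height * bytes_per_pixel * fps

# 1920x1080, 3 bytes per pixel (RGB), 60 frames per second.
rate_60 = raw_rate_bytes_per_second(1920, 1080, 3, 60)

# Disk throughput quoted in the text, assumed to mean 85 million bytes/s.
disk_rate = 85_000_000

# How many times faster than the disk the raw stream is.
shortfall = rate_60 / disk_rate

print(rate_60)               # 373248000
print(round(shortfall, 1))   # 4.4
```

In other words, the raw stream is more than four times what that drive can sustain, which is why everything that follows is about shrinking the data.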
I still wanted to write PNG sequences because Sony Movie Studio reads them so nicely, but trying to write PNGs in real time is a non-starter. It takes about half a second for one thread on my Core i7 to write one PNG with the PNG compression set to fastest. So I abandoned that and replaced it with a system that uses all cores to do the initial processing of frames, then one thread to write the processed results to a temporary file on disk as fast as it can. After recording is done, that same thread reads back the temporary file and writes the final PNGs out over a couple of minutes. If you prefer H264 output instead, you might want to look at this.
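The shape of that pipeline (many workers, one writer, one temporary file, a read-back pass afterwards) can be sketched roughly as follows. This is not Miranda's code. The record format, the function names, and the use of zlib as the "initial processing" step are all assumptions made for the example, and an in-memory buffer stands in for the temporary file on disk.

```python
import io
import queue
import struct
import threading
import zlib
from concurrent.futures import ThreadPoolExecutor

# Hypothetical record header: frame index, then payload length.
HEADER = struct.Struct("<II")

def process_frame(index, pixels):
    # Stand-in for the per-frame work done on the worker cores.
    return index, zlib.compress(pixels, 1)

def writer_loop(results, temp_file):
    # Single writer: append processed frames in whatever order they arrive.
    while True:
        item = results.get()
        if item is None:  # sentinel: recording has finished
            break
        index, payload = item
        temp_file.write(HEADER.pack(index, len(payload)))
        temp_file.write(payload)

def record(frames, temp_file, workers=4):
    results = queue.Queue()
    writer = threading.Thread(target=writer_loop, args=(results, temp_file))
    writer.start()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for index, pixels in enumerate(frames):
            future = pool.submit(process_frame, index, pixels)
            future.add_done_callback(lambda f: results.put(f.result()))
    # Leaving the "with" block waits for every worker to finish.
    results.put(None)
    writer.join()

def read_back(temp_file):
    # Second pass, after recording: recover the frames in index order.
    # A real version would encode each one as a PNG here.
    temp_file.seek(0)
    frames = {}
    while True:
        header = temp_file.read(HEADER.size)
        if not header:
            break
        index, length = HEADER.unpack(header)
        frames[index] = zlib.decompress(temp_file.read(length))
    return [frames[i] for i in sorted(frames)]

# Tiny demonstration with fake frames and an in-memory "temporary file".
demo_frames = [bytes([i]) * 1000 for i in range(8)]
buffer = io.BytesIO()
record(demo_frames, buffer)
restored = read_back(buffer)
```

Storing the frame index with each record matters because the workers finish out of order, so the writer cannot assume frames arrive in sequence.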
The trick then was getting the data rate low enough that it could write the temporary frames to disk in real time. I played with FRAPS previously, and it works really well - I understand now why the files it writes are so large. At first I read that FRAPS uses YUV420 encoding, and some more reading led me here for a fast YUV-RGB conversion algorithm. YUV420 uses 1 byte per pixel for Y and 1 byte per 4 pixels for each of U and V, which works out to 1.5 bytes per pixel instead of 3 - a 50% savings over RGB! One thing that was not at all clear in my YUV research is how you calculate the U and V for the 4 combined pixels. After some experimentation I found the cheapest way to do that was to average the RGB values of the four pixels and convert the result. The other advantage of YUV420 is that it gathers all the Y, U and V values for all the pixels into three separate planes, and those planes are much more compressible. I applied ZLIB to the YUV data and got a total of 85% savings over raw RGB - no problem writing that data in real time. The problem was that the video frames looked pretty awful, and I abandoned it immediately. YUV420 is probably fine for live-action video, but the loss of color detail makes text in video games look pretty poor. Around this time I found some source to a FRAPS codec which shows how FRAPS does its magic.
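A minimal sketch of that YUV420 step, assuming full-range BT.601 (JPEG-style) coefficients, might look like this. The post does not say which conversion formula was actually used, so the coefficients, the function names, and the plane order (Y, then U, then V) are assumptions for illustration only. The chroma for each 2x2 block comes from the averaged RGB of its four pixels, as described above.

```python
import zlib

def rgb_to_yuv(r, g, b):
    # Full-range BT.601 coefficients (an assumption, see the note above).
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.168736 * r - 0.331264 * g + 0.5 * b + 128
    v = 0.5 * r - 0.418688 * g - 0.081312 * b + 128
    return tuple(min(255, max(0, round(c))) for c in (y, u, v))

def encode_yuv420(rgb, width, height):
    """rgb is a flat bytes object of R,G,B triples; width and height must be even."""
    def pixel(col, row):
        i = (row * width + col) * 3
        return rgb[i], rgb[i + 1], rgb[i + 2]

    # One Y value per pixel.
    y_plane = bytearray(rgb_to_yuv(*pixel(col, row))[0]
                        for row in range(height) for col in range(width))

    # One U and one V value per 2x2 block, from the block's average RGB.
    u_plane, v_plane = bytearray(), bytearray()
    for row in range(0, height, 2):
        for col in range(0, width, 2):
            block = [pixel(col + dx, row + dy) for dy in (0, 1) for dx in (0, 1)]
            average = [sum(p[c] for p in block) / 4 for c in range(3)]
            _, u, v = rgb_to_yuv(*average)
            u_plane.append(u)
            v_plane.append(v)

    # Planar layout: all Y values, then all U values, then all V values.
    return bytes(y_plane + u_plane + v_plane)

# A 4x2 mid-gray test frame: 24 bytes of RGB become 12 bytes of YUV420.
frame = bytes([128]) * (4 * 2 * 3)
planes = encode_yuv420(frame, 4, 2)
compressed = zlib.compress(planes, 1)  # fastest zlib level
```

The planar output is exactly half the size of the RGB input before zlib even runs, and keeping each channel in its own plane is what makes the later compression step more effective.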
The next optimization was to cut the recording framerate in half; 30FPS is adequate for what I need.
Next I tried YUV444, which just converts the RGB data to YUV without throwing away any color resolution (or saving any data size). With ZLIB, YUV's compressibility gave me a 74% reduction over the raw RGB. That data rate could be handled by a fast hard drive, and the result looks perfect. The only problem was that having all those frames compressed in parallel by every thread on the i7 is a pretty serious load on the CPU. That said, it did work alright.
As an experiment I also tried running compression on raw RGB data. That only gave me a 61% savings, so the relatively inexpensive conversion to YUV is definitely worth it.
Next I gave it a try on my new SSD. It turns out that the SSD can handle raw 1920x1080x3x30FPS video (about 187MB/s) with almost no CPU load. Solved!
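The comparison between compressing interleaved RGB and compressing planar YUV444 can be reproduced with a small harness like the one below. It is a sketch under stated assumptions: the BT.601 full-range coefficients, the function names, and zlib level 1 are all choices made for the example. The savings it reports depend entirely on the frame content, so it will not necessarily reproduce the 61% and 74% figures quoted here.

```python
import zlib

def rgb_to_yuv(r, g, b):
    # Full-range BT.601 coefficients (an assumption for this sketch).
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.168736 * r - 0.331264 * g + 0.5 * b + 128
    v = 0.5 * r - 0.418688 * g - 0.081312 * b + 128
    return tuple(min(255, max(0, round(c))) for c in (y, u, v))

def to_planar_yuv444(rgb):
    """Convert flat R,G,B triples to three full-resolution planes: Y, U, V."""
    y_plane, u_plane, v_plane = bytearray(), bytearray(), bytearray()
    for i in range(0, len(rgb), 3):
        y, u, v = rgb_to_yuv(rgb[i], rgb[i + 1], rgb[i + 2])
        y_plane.append(y)
        u_plane.append(u)
        v_plane.append(v)
    return bytes(y_plane + u_plane + v_plane)

def savings_percent(raw, compressed):
    # Percentage saved relative to the raw RGB size.
    return 100.0 * (1 - len(compressed) / len(raw))

def compare(rgb, level=1):
    """Savings from zlib on interleaved RGB versus zlib on planar YUV444."""
    return {
        "rgb": savings_percent(rgb, zlib.compress(rgb, level)),
        "yuv444": savings_percent(rgb, zlib.compress(to_planar_yuv444(rgb), level)),
    }

# A flat-colored 64x64 test frame; real game frames compress far less well.
test_frame = bytes([10, 200, 30]) * (64 * 64)
result = compare(test_frame)
```

To get meaningful numbers, feed `compare` real captured frames rather than a synthetic one.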
Once I had video recording working I discovered my next problem. When you're playing a game and a frame is late, the game takes that lateness into account on the next frame and moves everything accordingly, so the delay is much less noticeable. When that happens while recording and you then play the video back at a perfect 30FPS, that beautiful, smooth video game pan becomes unacceptably jerky. I have read that some recording programs actually alter the time code in the video to "fix" this, but that wasn't an option for me.
The scene I was recording is one of the most challenging Miranda produces (from a GPU point of view). That meant that even at 30FPS, my aging laptop would occasionally produce a late frame, resulting in an ugly skip in the final video.
Miranda caps its framerate at 30FPS with a check in the main loop. It measures how long each frame takes to produce, then sleeps for just long enough to maintain 30FPS. This seems like it should work, but it didn't. To try to figure out what was going wrong I fired up my frame profiler. It turns out that the late frames were caused either by the shader compiler or by the OpenGL SwapBuffers call, which randomly takes a long time.
I had heard that SwapBuffers is how OpenGL implements VSync, and that turns out to be exactly true. I turned off VSync and instantly SwapBuffers started consistently taking very little time. The two synchronization systems were no longer fighting with each other.
Next I realized I could make the framerate smoother by scheduling frames at regular intervals rather than just timing the last frame. With scheduled frames, if one frame is late, the next one is early. It turns out that works quite well. I also discovered to my surprise that using Sleep to control the framerate is usually accurate to within 600 microseconds. Nobody is going to notice a 600 microsecond late frame.
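The difference between the two pacing strategies can be sketched like this. It is not Miranda's main loop: the function names are invented for the example, and a simulated clock stands in for the real timer and Sleep call so that the behavior is repeatable.

```python
import time

FRAME_INTERVAL = 1.0 / 30.0  # target: 30FPS

def run_relative(render_frame, frame_count, clock=time.perf_counter, sleep=time.sleep):
    # Original approach: time each frame and sleep away what is left of its slot.
    # A late frame pushes every later frame back by the amount it overran.
    for n in range(frame_count):
        frame_start = clock()
        render_frame(n)
        remaining = FRAME_INTERVAL - (clock() - frame_start)
        if remaining > 0:
            sleep(remaining)

def run_scheduled(render_frame, frame_count, clock=time.perf_counter, sleep=time.sleep):
    # Scheduled approach: every frame has a fixed deadline measured from the start.
    # After a late frame, the next one simply gets a shorter slot (it is "early").
    start = clock()
    for n in range(frame_count):
        render_frame(n)
        deadline = start + (n + 1) * FRAME_INTERVAL
        remaining = deadline - clock()
        if remaining > 0:
            sleep(remaining)

class FakeClock:
    """Simulated time source so the comparison does not depend on real sleeps."""
    def __init__(self):
        self.now = 0.0
    def read(self):
        return self.now
    def advance(self, seconds):
        self.now += seconds

def simulate(runner, durations):
    fake = FakeClock()
    runner(lambda n: fake.advance(durations[n]), len(durations),
           clock=fake.read, sleep=fake.advance)
    return fake.now

# Four frames; the second one takes 50ms and blows its 33.3ms slot.
durations = [0.010, 0.050, 0.010, 0.010]
relative_total = simulate(run_relative, durations)
scheduled_total = simulate(run_scheduled, durations)
```

In this simulation the scheduled loop finishes four frames in exactly 4/30 of a second despite the late frame, while the relative loop ends about 17ms behind and never catches up.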
Miranda previously capped itself at 60FPS during regular gameplay, but it does that no longer. You can turn on VSync to synchronize with your screen's refresh rate, or turn it off to run flat out. When recording video, I turn VSync off and schedule frames to maintain a stable 30FPS.
I mentioned that some late frames were caused by the shader compiler. That thing is slow. I've seen it take 240ms to compile one shader, so at some point I'll need to do some work on my current compile-on-first-use implementation.
To get the second shot for the trailer, I also put in a new sky that looks a lot like this, along with a sky-sphere to replace the old flat sky plane. It's a big improvement over the old solid color sky and gets rid of the occasional background bleed-through I would see on the tops of mountains.
Shot 2 is in the can. Hopefully the rest of the shots go a little quicker.