On The Lack of Video Processing Delay
I recently (as of this writing, though probably not as of when you’re reading this!) went to a Pink concert at Staples Center in L.A., and was impressed by multiple things. First, the sound quality was actually good. Here in San Diego, the Sports Arena where concerts of this type are held has horrible acoustics, so actually hearing a rock concert was a nice treat. Second, the production quality was impressive. The whole show was exquisitely choreographed, with numerous set changes and way more acrobatics than I had expected (check out this video that somebody else took at the show to see what I mean). Third, the processing delay for the visual effects was impressive, and by that I mean virtually imperceptible. That third one really grabbed my attention.
At Cirrascale, we have a few customers who deal with live event-based material, but it’s not usually delay-sensitive. These customers do things like record live events for later playback or analysis (so practically any delay is tolerable so long as the data eventually makes it to disk) or convert live events into streams that can be viewed online (presumably by people not simultaneously watching the actual event!). In these scenarios, an acceptable lag between when something happens in real life and when it is done being processed (encoded, saved, transmitted, or whatever) is usually measured in seconds, if not minutes. This is markedly different from processing material that is included in a live event.
When I was working at Commodore (and to be honest, even before that, since I liked to geek out on this stuff), I often used the NewTek Video Toaster to create product and technology demonstrations, and worked with a lot of people and companies that were using it for live events. While the Video Toaster could do really cool stuff, one of the things that wasn’t so cool was the processing delay it introduced to the video stream. The workflow had a lot of analog elements to it, so there was very little delay introduced by the cameras recording an event or the monitors displaying the output; most of the delay was due to the processing of the live video stream. A standard definition NTSC stream (roughly 720×480 pixels at just under 60 fields per second) passing through the Video Toaster and a Time Base Corrector could be delayed by amounts approaching 100ms, as you can sort of see in this demo. Keep in mind that the effects being performed were also relatively simple, such as a crude scaling of the video, or altering the chroma or luminance to give false color, posterization, or some other all-over effect on the video frames.
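A quick back-of-the-envelope sketch (my own numbers, not NewTek specifications) shows what a ~100ms delay means in NTSC terms: at roughly 59.94 fields per second, it works out to about six fields, or three full frames, of buffering.

```python
# Back-of-the-envelope: how much buffering does a ~100ms delay imply
# for an NTSC signal? (Assumed figures for illustration, not NewTek specs.)

NTSC_FIELD_RATE = 59.94                    # interlaced fields per second
field_period_ms = 1000 / NTSC_FIELD_RATE   # duration of one field

delay_ms = 100                             # rough Toaster + TBC delay from the text
fields_of_delay = delay_ms / field_period_ms

print(f"One NTSC field lasts {field_period_ms:.2f} ms")
print(f"A {delay_ms} ms delay is about {fields_of_delay:.1f} fields "
      f"(~{fields_of_delay / 2:.1f} full frames) of buffering")
```

In other words, the Toaster-era pipeline was effectively holding onto a few whole frames of video before anything came out the other side.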
Contrast that with the Pink concert. Many times the performance included projecting the singer’s face or body onto multiple large screens behind the stage, so the audience would see Pink dancing and singing (or doing acrobatics!) in front of a larger version of herself on multiple different-sized screens around the stage.
It’s a common thing to do at concerts, award shows, and other performances, but it makes any delay in the video chain very obvious. I have almost no idea what setup would be used for a theatrical production like the concert I went to, but conceptually it’s far more complex than what I dealt with on the Video Toaster. Products like Event Presenter and Tricaster show some fairly simplistic workflows, which look more like what you’d have at a small event.
Extrapolating from there, I imagine it’s something like capturing the images from cameras, applying some effects filters to them, rendering those images to multiple geometries (since there are multiple screens on the stage), and then decimating each image into smaller pieces (I’m guessing each screen on the stage is really a number of smaller screens that are color-matched and stitched together) for display on the final display devices. Even if the images being handled are the equivalent of 480p, that’s a lot of data to process and transport! Assuming the video is manipulated in an uncompressed format at around 16bpp (it’s likely really 24 or 32bpp, but I like to be cautious), we’re talking tens of gigabytes per second of data manipulation going on.
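To put rough numbers on that (this is my own sketch; the camera, screen, and pass counts are pure guesses), a single uncompressed 480p stream is actually modest, but multiplying it across several cameras, several output geometries, and repeated read/write effect passes blows the aggregate memory traffic up quickly.

```python
# Rough bandwidth sketch for an uncompressed ~480p pipeline.
# All multipliers below (cameras, screens, passes) are guesses for illustration.

WIDTH, HEIGHT = 720, 480      # ~480p frame
BYTES_PER_PIXEL = 2           # 16bpp, the conservative figure from the text
FPS = 60

per_stream = WIDTH * HEIGHT * BYTES_PER_PIXEL * FPS   # bytes/second, one stream
print(f"One stream: {per_stream / 1e6:.1f} MB/s")     # ~41.5 MB/s

cameras = 8    # guessed number of camera feeds
screens = 6    # guessed number of output geometries on stage
passes = 10    # guessed effect/scaling passes per output

# Aggregate memory traffic: every pass reads and writes every frame
# for every output geometry, for every camera feed.
aggregate = per_stream * cameras * screens * passes * 2   # x2: read + write
print(f"Aggregate traffic: {aggregate / 1e9:.1f} GB/s")
```

Under those (admittedly hand-wavy) assumptions, the aggregate lands right around 40 GB/s, which is how even standard-definition video can add up to tens of gigabytes per second of data being pushed around.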
If my quick Googling is to be believed, all of that data movement and processing needs to be done in under 20ms to be perceived as happening in sync with the actual performance. There are obviously products that can handle that, such as the NewTek Tricaster (the spiritual successor to the aforementioned Video Toaster) and the Showkube KShow, but even there, a scan of the Tricaster support forum has users complaining of 7-frame and even 3-frame delays…so we’re back to the 100ms range again, albeit with some pretty cool functionality. Whatever was being used at the Pink concert must have been way beefier than those products!
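Translating those forum complaints into wall-clock time (assuming the delays are counted in ~29.97fps NTSC frames, which may not be exactly what those users meant): a 3-frame delay is already back at the 100ms mark, and 7 frames is well past it.

```python
# Convert frame-counted delays to milliseconds.
# Assumes ~29.97 fps (NTSC) frames; the forum posters may have meant otherwise.

NTSC_FRAME_RATE = 29.97

def frames_to_ms(frames, fps=NTSC_FRAME_RATE):
    """Delay in milliseconds for a given number of buffered frames."""
    return frames * 1000 / fps

for frames in (3, 7):
    print(f"{frames} frames ≈ {frames_to_ms(frames):.0f} ms")
```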
Of course, while trying to satisfy my curiosity about how this low-delay production was actually done, I realized that it could be just a viewing effect. Staples Center’s floor is roughly the size of a hockey rink (about 200 feet long), and my seat was about two-thirds of a rink-length away, or roughly 134 feet.
With sound traveling at roughly 1,125 feet per second in air, that 134 feet means the sound from the stage is delayed by nearly 120ms. This assumes I’m hearing the sound projected from the stage, which is likely not the case, but let’s pretend it is. The net result is that the processing delay may not have been reduced since the Video Toaster days, but the capabilities certainly have been.
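That acoustic delay is easy to check, using ~1,125 feet per second for the speed of sound in room-temperature air:

```python
# Acoustic delay from the stage to my seat.
SPEED_OF_SOUND_FT_S = 1125   # ~343 m/s in air at ~20°C

distance_ft = 134            # ~2/3 of a 200-foot rink
delay_ms = distance_ft / SPEED_OF_SOUND_FT_S * 1000
print(f"Sound delay over {distance_ft} ft: {delay_ms:.0f} ms")   # ~119 ms
```

So even a video chain lagging by 100ms or so would still look in sync with what I was hearing from my seat.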
As Moore’s Law suggests, technological innovation continues at a rapid pace. The growing capabilities of the products and technology I get to work with here at Cirrascale continually wow me, and seeing even simple things like “delay-less” video effects during a live performance is yet another reminder that advancements manifest themselves in many ways. In my day-to-day work it’s faster processors, more complex GPUs, and I/O devices with lower latency and higher throughput that solve problems related to HPC and storage applications. It’s interesting and refreshing to see advancements in other areas when I’m out and about in my daily life, and to know that things I work with every day are helping make that possible.