You Deserve Better than Grainy Giraffes
When the park announced that April was pregnant with what would be the zoo’s first-ever baby giraffe, the internet took notice. Since February, with April’s fifteen-month gestation period drawing to a close and delivery imminent, eager fans have been able to monitor the expectant mother’s progress via the GiraffeCam, a 24/7 live video stream from April’s enclosure. Before long, April had gone viral, complete with a Twitter account, a #giraffewatch hashtag, a spoof video, and even an elaborate April Fools’ Day conspiracy theory.
As I was checking in on April today—never let it be said that anyone involved with NGCodec is not up-to-the-minute on important current events—I found myself thinking how much nicer it would be if the video quality of the live stream were better. Exciting though the stream is, it’s undeniably a bit grainy and it tends to freeze. Considering how long we’ve been waiting for this birth, I think I speak for all of us when I say that we’d like to see it clearly.
The NGCodec team has been doing a lot of work lately on improving the quality of live video through the use of FPGA hardware encoding. You may recall our announcement late last year about porting our RealityCodec H.265/HEVC video encoder to Amazon Web Services (AWS) Elastic Compute Cloud (EC2) F1 instances. Practically speaking, this milestone means we are making strides toward better live video encoding and, as a result, higher-quality video. The following excerpt from our recent white paper, “Live Video Encoding Using New AWS F1 Acceleration: The Benefits of Xilinx FPGA for Live Video Encoding,” gives an overview of the current state of software-based live video encoding and describes the ways in which hardware encoding offers significant advantages over it:
In a live video broadcast over the internet, a single video stream is sent from the source to the cloud. It is then transcoded, that is, decoded in the cloud and re-encoded at multiple bit rates for adaptive bitrate (ABR) streaming, before being sent on to the end viewer. Today, this is achieved purely through software, typically open source encoding projects such as x264 or x265, using many central processing units (CPUs). The difficulty with this approach for live video is that there is a limit to the amount of parallelism that can be exploited to make the video smaller; this limit is set by the number of cores within the server in question. Because the source frame rate, measured in frames per second (FPS), must be maintained to avoid jerky playback, the encoder must never take longer to process a frame than that frame rate allows. As such, the highest quality settings in the software encoder cannot be used. For our purposes, we will look at the x265 open source software video encoder as an example.
Encoding software like x265 contains a great many presets, allowing the user to customize settings and trade overall computing requirements for the end size of the video. For file-based videos, this technology can produce very high-quality results with the x265 ‘veryslow’ preset: because there is no real-time constraint, encoding can take many times longer than the duration of the video, yielding the best compression, but at a considerable cost in encoding resources.
For live video, by contrast, software encoding is simply unable to achieve the maximum quality offered by the encoder technology. Fig. 2 compares 1080p50 source video encoded with different x265 presets (for video quality) and the resulting frames that can be encoded per second on the AWS c4.8xlarge instance type. The necessary tradeoffs to satisfy computing requirements mean significant reductions in quality. Instead of running the encoder at a slow setting, which will produce the best end quality, sacrifices are necessary to achieve the target frame rate. The fundamental problem with software-based encoding for live videos is that the best compression—that is, the highest quality video for the lowest bit rate—and the finest end result in video quality are unattainable with the available compute level. By comparison, NGCodec’s encoder can achieve 80 FPS and surpass the quality of even the x265 ‘veryslow’ preset.
[...] For live video, the primary benefit to video encoding with FPGA F1 instances is that we can achieve a higher quality video at the same bit rate, and do it at a desirable 60 frames per second. A second benefit, relevant only in certain cases, is lower latency and reduced lag time between live stream source and end viewing. Third, the cost of encoding is significantly reduced. Finally, we can support up to 32 independent encoded streams of video on a single F1 instance.
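To make the real-time constraint in the excerpt concrete, here is a minimal Python sketch of how one might pick the slowest x265 preset that still keeps up with a live source. The preset names are x265's own, but the function name and the throughput figures in the example are hypothetical placeholders for illustration; they are not the measurements from Fig. 2.

```python
from typing import Optional

# x265 preset names, ordered from slowest (best compression) to fastest.
X265_PRESETS_SLOW_TO_FAST = [
    "placebo", "veryslow", "slower", "slow", "medium",
    "fast", "faster", "veryfast", "superfast", "ultrafast",
]


def slowest_realtime_preset(
    measured_fps: dict[str, float], source_fps: float
) -> Optional[str]:
    """Return the slowest preset whose measured encode rate keeps up with the source.

    measured_fps maps a preset name to the frames per second it sustained on
    the target machine. Presets that were not measured are skipped. Returns
    None if no measured preset reaches source_fps.
    """
    for preset in X265_PRESETS_SLOW_TO_FAST:
        fps = measured_fps.get(preset)
        if fps is not None and fps >= source_fps:
            return preset
    return None


# Hypothetical throughput numbers, for illustration only.
example_measurements = {
    "veryslow": 2.0,
    "slow": 12.0,
    "medium": 30.0,
    "veryfast": 55.0,
    "ultrafast": 120.0,
}

chosen = slowest_realtime_preset(example_measurements, source_fps=50.0)
print(chosen)
```

With these made-up numbers, a 1080p50 source would be stuck with a fast preset, which is exactly the quality sacrifice the excerpt describes: the slow presets exist, but a live stream cannot wait for them.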
In a practical sense, the gain is ultimately that NGCodec can enable customers to achieve a higher-quality video by taking advantage of the greater compute capability of an FPGA. We are able to reduce source video to 0.13 percent of its original size with virtually no perceived loss of quality.
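For a sense of scale, here is a rough back-of-the-envelope calculation in Python. It rests on assumptions of mine rather than figures from the white paper: that the 0.13 percent is measured against uncompressed video, and that the source is 1080p at 60 frames per second with 8-bit 4:2:0 sampling (1.5 samples per pixel).

```python
def raw_bitrate_bps(width, height, fps, bits_per_sample=8, samples_per_pixel=1.5):
    """Bit rate of uncompressed video in bits per second.

    samples_per_pixel=1.5 corresponds to 4:2:0 chroma subsampling
    (an assumption here, not a figure from the white paper).
    """
    return width * height * samples_per_pixel * bits_per_sample * fps


# Assumed source format: 1920x1080 at 60 frames per second.
raw = raw_bitrate_bps(1920, 1080, 60)

# 0.13 percent of the original, as quoted above.
compressed = raw * 0.13 / 100

# Equivalent compression ratio (original : compressed).
ratio = raw / compressed

print(f"raw:        {raw / 1e9:.2f} Gbit/s")
print(f"compressed: {compressed / 1e6:.2f} Mbit/s")
print(f"ratio:      about {ratio:.0f}:1")
```

Under those assumptions, roughly 1.5 Gbit/s of raw video comes down to a little under 2 Mbit/s, a compression ratio of about 769:1.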
I’m confident that hardware encoding with FPGAs is going to be a game-changer for live video, enabling significant increases in picture quality and improved stream fluidity. My only regret is that we won’t be able to take advantage of it in time to celebrate the arrival of April’s baby in glorious high definition. You deserve better than grainy giraffes. We all do.