
Exploring the Technical Challenges of Scaling Live Streaming Services

Increased use of GUI-based online services for content management and creation has shifted focus away from the traditional webmaster who also wore the hat of in-house IT technician. Many companies no longer have dedicated IT support staff, so technical problems and streaming-server administration are offloaded onto content creators without appropriate expertise in streaming technology or its array of technical jargon. This compounds an existing problem in our era of information overload: masses of confusing, out-of-date, or beginner-level tutorials and web articles. The content creator has much to gain from a clear and definitive resource that aligns best practice with the most cost-effective deployment of new technologies. For example: should I trust HTML5 video to replace an old embedded Windows Media Player plugin, and if so, what video format should I use? What are the competing benefits of live streaming versus existing on-demand video content?

For many small to mid-scale companies, the primary concern is selecting the best investment in streaming technology: one that maximizes compatibility with partner sites and reaches customers with minimal encode time and a high-quality viewing experience. The accessibility of new codec technologies and the promise of better compression to reduce bandwidth costs has an unfortunate potential downside. The enterprise must ensure that a codec format will be supported on the majority of customer platforms, leading many to continue supporting multiple formats to avoid, for instance, customer firewalls blocking one particular type. This exemplifies a wider problem in a rapidly changing industry. An enterprise may save on bandwidth costs by encoding exclusively in a modern codec X, yet B2B communications, which still account for a great deal of customer support investment (e.g., voice support on a troubleshooting issue), may not be possible because a customer using older software or hardware may not support codec X. In an industry famous for vendors feuding to push their own proprietary standards, many viewed Google's acquisition of On2 Technologies and the open-sourcing of its VP8 codec as an attempt to give the industry an open solution, much as Flash video once provided a de facto standard.

A decade ago, the choice for small and mid-scale companies providing streaming media was easy: always go with RealNetworks. Real's early entry into streaming technology and the astounding viral market penetration of their G2 player gave them a commanding market share lead up until 2002. Today, the streaming media landscape is very different: competitive and, unfortunately for many, a minefield of standards, formats, and vendors, resulting in time-consuming work and compromised quality due to the constant deployment of new technologies.

Technical Challenges of Scaling Live Streaming Services

We discuss these server-side problems in the context of video delivery using conventional protocols for static data such as HTTP and streaming/video on demand systems using these protocols. We do not address peer-to-peer systems for live streaming; the use of P2P for video streaming imposes different constraints and optimization problems for video delivery.

As more viewers tune into the video, problems arise in video delivery and service availability. These problems specifically arise when video platforms have significantly more viewers than their expected audience at design time. While network bandwidth is certainly a concern with delivering video to lots of viewers, we will see that server-side concerns with video delivery and service availability are even more challenging technical problems.

When a service is small and its video platform consists of a single web server, it is simple to make video live: just start a broadcast from the camera, or encode stored video into an incoming data feed to the server. Viewers can then watch by sending a request to the server and receiving video data as a series of HTTP responses or RTP packets over UDP.
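As a concrete sketch of this single-server setup, the server simply relays the incoming feed to each viewer in fixed-size chunks, much as it would over HTTP chunked transfer encoding. The function name and the 64 KiB chunk size are illustrative assumptions, not from any particular server:

```python
# Minimal sketch of single-server delivery: read the incoming feed
# (any file-like byte source) and relay it to viewers in fixed-size
# chunks, as over HTTP chunked transfer encoding.
import io

CHUNK_SIZE = 64 * 1024  # 64 KiB per chunk, an illustrative default

def iter_video_chunks(source, chunk_size=CHUNK_SIZE):
    """Yield successive chunks from the incoming video feed."""
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        yield chunk

# A viewer's request handler would write these chunks to the response
# socket as they arrive from the encoder. Here a BytesIO stands in
# for an encoded feed of 150 KiB.
feed = io.BytesIO(b"\x00" * (150 * 1024))
chunks = list(iter_video_chunks(feed))
print(len(chunks))  # 3 chunks: 64 KiB + 64 KiB + 22 KiB
```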

Scaling live video streaming to large audiences is still a fairly young area of research. At the time of this writing, YouTube reported that 60 hours of new video are uploaded to their service every minute, while live platforms like Twitch can grow dramatically in scale over short periods of time with the popularity of a new game or sporting event.

Bandwidth Requirements

Minimizing bandwidth overhead is critical to providing a cost-effective service. This section describes how the bandwidth requirements for live streaming can be calculated, along with techniques for reducing the required bit rate.

The required bit rate can be calculated using equation 1, which states that the bit rate is the product of the video resolution, the frame rate, and the color depth, scaled by the compression efficiency of the video codec. In general, an increase in any of these variables increases the perceived quality of the video, but also the complexity and cost of the codec and the encoding equipment. If the cost of the codec and encoding equipment is fixed, there comes a point at which a further increase in perceived quality is not worth the additional bit rate. This can be seen as a trade-off curve between perceived video quality and its cost. For a given bandwidth, the aim is to position the video towards the left of the trade-off curve, leaving room for audio and ensuring that quality is not compromised at times of high network congestion. If higher quality is desired, it is preferable to wait until spare bandwidth is available at a lower cost.

The bit rate required for live streaming depends on the quality of service desired. At one extreme, a paging service might only require the ability to view the received data and scroll backwards and forwards. At the opposite extreme, a live sports broadcast might require full-screen, full-motion video at near-broadcast quality. The former might be accomplished at 10-30 kbps with a very low-quality video codec, whereas the latter might require 1.5-15 Mbps with a high-quality video codec. This is a very wide range, so it is useful to provide some typical bit rate figures for different services. Peon and Ho, who have analyzed the media characteristics of a number of multimedia applications, recommend 40-220 kbps for acceptable-quality video and 220-1000 kbps for near-TV-quality video. This might be taken as an average of 60-500 kbps for acceptable to high-quality video, a useful rule of thumb for calculating the bit rate required for a live streaming session.
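Since equation 1 itself is not reproduced here, the relationship it describes can be sketched as follows. The 720p example figures and the 200:1 compression ratio are illustrative assumptions, not values from the original text:

```python
# Sketch of the bit-rate relationship: raw rate is resolution x frame
# rate x color depth, reduced by the codec's compression efficiency.

def required_bitrate_bps(width, height, fps, bits_per_pixel, compression_ratio):
    """Approximate encoded bit rate in bits per second."""
    raw_bps = width * height * fps * bits_per_pixel
    return raw_bps / compression_ratio

# 1280x720 at 30 fps, 24-bit color, assuming ~200:1 codec compression:
bps = required_bitrate_bps(1280, 720, 30, 24, 200)
print(f"{bps / 1e6:.1f} Mbps")  # about 3.3 Mbps
```

Increasing any factor in the numerator raises quality and cost together, which is exactly the trade-off curve discussed above.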

Content Delivery Network (CDN) Optimization

Using CDNs to optimally serve different versions of objects, in the face of dynamic and unpredictable demand for popular content, requires research into algorithms and policies for cache consistency. This is an important issue for live events, where creating multiple versions of a specific object on the fly to match different network bandwidths, end-system resources, or display resolutions may not be practical. If viewers change the version of an object they are receiving during the event, for example switching from a high-resolution to a low-resolution video stream, it benefits content providers for this change to occur at a CDN proxy near the viewer rather than at the origin server. By that time, however, the low-resolution object may have been replaced in the cache by a newer object, forcing a reload from the origin server to obtain the requested version. Making such version changes cacheable at the edge can be accomplished with HTTP/1.1 mechanisms: the Vary header specifies the primary condition under which a cached response can be reused, and indicates that the object has variants, while a time-to-live specified in the Expires header bounds how long a variant remains valid. This increases the complexity of cache consistency, but provides powerful capabilities for it. A specific policy matching object versions to viewer capabilities can be implemented in edge content markup preprocessors, where persistence of different objects in the cache is achieved by generating a derived URL from the original URL plus information on the object variant.
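The derived-URL idea in the last sentence can be sketched as a cache-key function: the edge combines the original URL with the variant attributes named by Vary, so multiple resolutions of one object coexist in the cache. The header name `X-Resolution` and the key scheme are assumptions for illustration, not a specific CDN's API:

```python
# Hedged sketch of edge variant caching: derive a distinct cache key
# per (URL, variant) pair so versions don't evict one another.

def derive_cache_key(url, request_headers, vary_fields):
    """Build a cache key from the URL and the request headers listed in Vary."""
    parts = [url]
    for field in vary_fields:
        parts.append(f"{field}={request_headers.get(field, '')}")
    return "|".join(parts)

key_hi = derive_cache_key("/event/stream.mp4",
                          {"X-Resolution": "1080p"}, ["X-Resolution"])
key_lo = derive_cache_key("/event/stream.mp4",
                          {"X-Resolution": "480p"}, ["X-Resolution"])
print(key_hi != key_lo)  # True: each variant persists under its own key
```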

Server Load Balancing

Server load balancing is an architecture in which incoming network traffic is distributed across a group of backend servers, also known as a server cluster. Using a dedicated device, a software component, or a hardware and software combination, it divides user requests among the servers in the cluster according to a scheduling algorithm, allowing each server to carry an equal share of the application traffic. If a server fails, the load balancer redirects traffic to the remaining healthy servers; when a new server is added to the group, the load balancer automatically starts sending requests to it.

Load balancing offers several advantages. It provides redundancy: in the event of a server failure, the cluster can continue to operate. It provides scalability: more servers can be added to the cluster when required to increase capacity. It also improves application performance, responsiveness, and availability, because requests are evenly distributed among several servers, yielding higher throughput and more reliable delivery of data.
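The scheduling and failover behavior described above can be sketched as a simple round-robin balancer that skips servers marked unhealthy. The class, the server names, and the health-tracking mechanism are illustrative assumptions, not a specific product's design:

```python
# Minimal round-robin load balancer sketch with failover: rotate
# through the cluster, skipping servers that have been marked down.
from itertools import cycle

class RoundRobinBalancer:
    def __init__(self, servers):
        self._ring = cycle(servers)
        self.healthy = set(servers)

    def mark_down(self, server):
        """Record a server failure so it is skipped by the scheduler."""
        self.healthy.discard(server)

    def next_server(self):
        """Return the next healthy server in rotation."""
        for _ in range(len(self.healthy) + 8):  # bounded scan of the ring
            s = next(self._ring)
            if s in self.healthy:
                return s
        raise RuntimeError("no healthy servers available")

lb = RoundRobinBalancer(["s1", "s2", "s3"])
print([lb.next_server() for _ in range(4)])  # ['s1', 's2', 's3', 's1']
lb.mark_down("s2")
print([lb.next_server() for _ in range(3)])  # ['s3', 's1', 's3'], s2 skipped
```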

Strategies for Overcoming Scaling Challenges

One of the key methods reuses traditional broadcasting techniques combined with state-of-the-art video coding. It has been shown that different users experience very different QoS on the Internet. By allowing the client player to adaptively select the most appropriate bit rate for the content, the user's QoS can be improved. If the network condition is bad and the client selects a very high bit rate, playback will stop frequently to rebuffer; the bit rate can instead be adapted downwards so that the user can continue viewing the content. Adaptive bit rate selection is based on client-driven rate-distortion optimization. Using Figure 1 as an example, the bit rate starts off high and is gradually reduced as buffering time increases, falling back to the lowest-quality representation when throughput cannot sustain anything higher. Satisfactory rates can be guaranteed to users viewing real-time streams because changes in quality of service feed back directly and immediately into the bit rate selection. This real-time rate selection maintains smooth video playback while providing the highest possible video quality without interruptions or frequent stops to rebuffer. The method can be used to stream live events with multiple bit rate streams of the same content.
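The client-driven selection described above can be sketched as choosing the highest encoded bitrate that fits within a safety fraction of the measured throughput, leaving a margin for congestion. The bitrate ladder and the 0.8 safety factor are illustrative assumptions:

```python
# Sketch of client-side adaptive bitrate selection: pick the highest
# available representation that the measured throughput can sustain.

LADDER_KBPS = [300, 750, 1500, 3000, 6000]  # available representations

def select_bitrate(measured_kbps, ladder=LADDER_KBPS, safety=0.8):
    """Return the highest ladder rung not exceeding safety * throughput."""
    budget = measured_kbps * safety
    candidates = [r for r in ladder if r <= budget]
    return candidates[-1] if candidates else ladder[0]

print(select_bitrate(4000))  # 3000: highest rung under the 3200 kbps budget
print(select_bitrate(200))   # 300: falls back to the lowest rung
```

Re-evaluating this choice on every throughput measurement gives the direct, immediate feedback between QoS and bit rate selection that the method relies on.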

Adaptive Bitrate Streaming

Continuing to explore video-on-demand responsibilities, this section focuses on key strides that have allowed the industry to improve quality and reliability. Chief among these is the advent of Adaptive Bitrate Streaming (ABS), which allows drastic improvements in Quality of Service (QoS) for a better end-user experience and is directly scalable in principle.

In the past, delivery of video via HTTP had a fixed quality per unit time, with statically assigned files being sent and sometimes later retrieved from cache. Such video segments corresponded to entire files or to timed sections of separate files, neither of which is ideal for initial playback delay. Traditional techniques include HTTP pseudo-streaming and pure progressive download. The latter is straightforward and ensures playback starts straight away; unfortunately, the user is limited by the file's quality-versus-download-speed ratio, since there is only one file to watch. Pseudo-streaming can be more effective, since the user is able to seek through a file and playback will start from that point. However, this technique still involves a single file download and does not address the issue of quality versus download speed. The common result, and a common user frustration, is a stop-start scenario in which the video buffer can only temporarily outpace playback.
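In practice, ABS exposes the available representations through a manifest from which the player picks. A minimal HLS-style master playlist is sketched below; the bandwidth figures, resolutions, and paths are illustrative values, not taken from any real deployment:

```
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360
low/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1500000,RESOLUTION=1280x720
mid/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=4000000,RESOLUTION=1920x1080
high/index.m3u8
```

Each entry points to a playlist of short segments at that bitrate, so the player can switch representations at segment boundaries rather than being tied to a single file.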

Caching and Content Preloading

The following simple pegging algorithm can be used to adjust video quality. Let R be the incoming data rate at the video player. If the cache has the video file at quality q (measured in average bits per frame), then the player should switch to quality q′ if R exceeds the bit rate of the file at quality q′ and the higher-quality file is expected to finish downloading before it is needed.
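A sketch of that switching rule follows: move up to a higher quality only when the incoming rate covers the target file's bit rate with some headroom, so the file downloads before it is needed. The kbps units, the headroom factor, and the function name are illustrative assumptions:

```python
# Sketch of the pegging rule: switch up to quality q' only if the
# incoming data rate R covers q's bit rate with headroom, so the
# higher-quality file finishes downloading before it is needed.

def should_switch(incoming_rate_kbps, target_quality_kbps, headroom=1.1):
    """True when throughput covers the target rate plus a safety margin."""
    return incoming_rate_kbps >= target_quality_kbps * headroom

print(should_switch(2000, 1500))  # True: 2000 >= 1650
print(should_switch(1600, 1500))  # False: 1600 < 1650
```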

In the context of live video streaming, on-demand streaming with a partial cache can simulate live streaming by placing a sliding expiration window of duration t (e.g., 10 minutes) on files. Although this is not true live video streaming, a stream consisting of a series of short, sequentially encoded video clips will begin playback of the first clip as soon as it is available, thus simulating live behavior. At expiration time t, the cache considers the file complete and will not need to retrieve it again.
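The sliding expiration window can be sketched as follows: the cache keeps refetching a file until a full window of duration t has passed since it was first seen, after which the file is treated as complete. The class, the injected clock, and the 600-second window are illustrative assumptions:

```python
# Sketch of a sliding expiration window for simulated-live caching.
import time

WINDOW_SECONDS = 600  # t = 10 minutes, as in the example above

class SlidingWindowCache:
    def __init__(self, window=WINDOW_SECONDS, clock=time.time):
        self.window = window
        self.clock = clock
        self._entries = {}  # url -> first-seen timestamp

    def needs_refetch(self, url):
        """True until the file has been cached for a full window."""
        first_seen = self._entries.setdefault(url, self.clock())
        return self.clock() - first_seen < self.window

fake_now = [0.0]
cache = SlidingWindowCache(clock=lambda: fake_now[0])
print(cache.needs_refetch("clip-1.ts"))  # True: still inside the window
fake_now[0] = 700.0
print(cache.needs_refetch("clip-1.ts"))  # False: window elapsed, file complete
```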

Adaptive Bitrate Streaming: Because delivery bandwidth grows with the number of viewers, as shown in Figure 1, adaptive bitrate video players stream the highest bitrate possible for the viewer's network connection. If the link bandwidth is insufficient, the player can stall or rebuffer, a situation common in congested networks and particularly undesirable during live events. This can cause users to become frustrated and leave the site, creating a negative feedback loop in which users in search of higher-quality video cause stalls and rebuffering, degrading the quality of video for all users. An alternative strategy is to dynamically monitor congestion and use the incoming data rate as a signal to trade off video quality for smooth playback, maintaining a high-quality user experience for as many viewers as possible.

Distributed Cloud Infrastructure

Traditional live streaming services have mainly used inexpensive DNS round-robin scheduling to allocate client requests evenly across a few static media servers, but this does not cater well to the highly dynamic load characteristics of live streaming scenarios. P2P approaches to live streaming, on the other hand, often give a good account of scalability, since any peers that join the system to lighten the load on server resources also add capacity to it. However, in P2P live streaming, each peer acts as both a client and a server, contributing resources (upload bandwidth) as well as consuming them; this often requires incentives for contribution and still provides no guarantees on QoS and stream continuity.

More recently, cloud computing infrastructures have revolutionized the way large-scale systems are built, and Altas et al. explore various approaches to using the cloud for live streaming systems. A static allocation of resources is compared to an on-demand allocation where increases in load trigger the initiation of more server instances, with cloud resources constantly monitored and adapted. This is investigated in both single-cloud and multi-cloud scenarios, aiming to find the most cost-effective approach that still yields the required QoS. Static allocation may be deemed cost-effective but promises poor QoS if server resources cannot handle the load; the resources consumed by constant monitoring and adaptation across multiple clouds may not justify the improved QoS; and a single cloud may not provide the required resources during peak loads. These cloud options are also weighed against the option of using a CDN.

Overall, there are many different ways to scale live streaming services, and a one-size-fits-all recommendation cannot be given. The best strategy is likely to make selective use of the methods discussed, with a focus on the specific characteristics of the service to be provided.

Load Testing and Performance Optimization

Load testing is the process of simulating expected usage load on a system to gauge its response and identify bottlenecks. It is a vital practice for live streaming services, which face the difficult task of near-real-time delivery of video data, often to a great many users with a shared interest (e.g., everyone watching a World Cup event). Typical application- and server-level load testing is not enough, as video delivery has many more facets and potential pitfalls. Specific media-based tests need to be performed, using a combination of network emulation and synthetic video data, to determine capacity and server scalability. Current research on this topic has introduced the concept of hybrid load testing, which aims to provide accurate predictions of stream quality at various load levels and to determine the cost-effectiveness of implementing QoS mechanisms that ensure continued service quality at higher server loads.
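At the application level, such a test can be sketched as firing N concurrent requests and recording per-request latencies to find where response times degrade. The stubbed handler below simulates work; in a real test the client would hit the actual streaming endpoint, and media-specific checks would be layered on top:

```python
# Hedged sketch of an application-level load test: issue concurrent
# requests against a handler and collect latencies.
import time
from concurrent.futures import ThreadPoolExecutor

def simulated_request(_):
    start = time.perf_counter()
    time.sleep(0.01)  # stand-in for network transfer + server processing
    return time.perf_counter() - start

def run_load_test(num_requests, concurrency):
    """Return per-request latencies measured under the given concurrency."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(simulated_request, range(num_requests)))

latencies = run_load_test(num_requests=50, concurrency=10)
print(f"p95 latency: {sorted(latencies)[int(0.95 * len(latencies))]:.3f}s")
```

Sweeping the concurrency level upwards and plotting the latency percentiles gives the capacity curve that the hybrid approaches above try to predict.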

Conclusion

In conclusion, I have found that many systems capable of providing large-scale live video do so with varying levels of success. Streaming live video on the Internet is becoming a big business with many potential business cases; advertising and pay-per-view are obvious ways to fund video content. With success, these systems may in future make it desirable to store broadcasts, in a manner similar to TV, for later viewing on demand. Above all, businesses want to keep customers on their websites longer and have something to show in return for bandwidth costs. Large-scale live video can be a very interesting addition to many sites.

This paper has discussed the various technologies used in modern large-scale web-based live video systems. It aimed to describe the problems encountered by engineers and the solutions used to combat these problems, and to give a better understanding of the complexity involved in these systems.

Streaming live video is a very new and fast-evolving technology. As the demand for live video increases, so will the complexity of the systems used to deliver it. Although this paper has covered many areas of large-scale live video, it has by no means covered them all: aspects such as security, real-time monitoring, and many types of specific content have been omitted due to size constraints. I hope that this paper can be used as a stepping stone to further understanding the complex requirements of providing large-scale live video.
