2012-02-24 14:40:48 | Technology
Downsides of HTTP adaptive bit rate streaming
I wrote a number of articles about the shortcomings of the first generation of HTTP Adaptive Bit Rate Streaming (HABS) technologies, including Apple HLS, Adobe HDS, Microsoft Smooth Streaming and MPEG DASH. Jet-Stream R&D identified many issues that hinder mass adoption of these technologies. There are serious design flaws in HTTP Adaptive Bit Rate streaming that these vendors and CDNs apparently never anticipated and never encountered:
1) File size, block size
Since HABS content is segmented into tiny objects, many CDNs and CDN storage vendors struggle with the block sizes of their storage solutions. VOD files can be gigabytes in size, while fragmented content can be as small as a kilobyte. CDN storage vendors can't tune their storage for such varied object sizes: some used 64 MB block sizes, which waste expensive storage space when storing segmented content, while bringing the block size down makes their storage highly inefficient for large objects. The result is that they either suffer poor performance and poor efficiency for large VOD objects or for segmented content, or they propose splitting content over multiple storage vaults, adding expensive complexity. (Jet-Stream solves this by dynamically allocating various storage solutions within the CDN.)
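To make the waste concrete, here is a minimal sketch of the arithmetic. The 64 MB block size comes from the text; the 400 KB segment size is an assumed, typical figure for illustration:

```python
# Hypothetical illustration of storage waste when a fixed block size
# must serve both large VOD files and tiny HABS segments.
# The 400 KB segment size is an assumption, not a measurement.

def allocated_bytes(object_size: int, block_size: int) -> int:
    """Bytes actually consumed on disk: size rounded up to whole blocks."""
    blocks = -(-object_size // block_size)  # ceiling division
    return blocks * block_size

BLOCK = 64 * 1024 * 1024   # 64 MB blocks, as some vendors used
SEGMENT = 400 * 1024       # a ~400 KB media segment (assumed)

waste = allocated_bytes(SEGMENT, BLOCK) - SEGMENT
print(f"one 400 KB segment occupies {allocated_bytes(SEGMENT, BLOCK) // 2**20} MB,"
      f" wasting {waste / 2**20:.1f} MB")
```

At these assumed sizes, over 99% of each block holding a segment is dead space, which is why neither block size works for both workloads.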
2) Number of objects
Instead of managing millions of VOD objects, CDNs now face the challenge of managing billions of segmented objects. The result is that some CDNs and some CDN vendor technologies run out of database capacity, and their viewing systems and APIs can no longer handle file listings (if they offered this ability in the first place). (Jet-Stream invented the concept of 'logical assets' to prevent this problem.)
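One way to picture the idea (this sketch is our illustration, not Jet-Stream's actual implementation; the path layout is assumed) is to key the catalog on one record per title rather than one per segment, so listings scale with the number of titles instead of billions of chunks:

```python
# Sketch of a 'logical asset' grouping (path layout is an assumption):
# the catalog tracks one record per title, not one per segment.

from collections import defaultdict

def group_logical_assets(object_paths):
    """Map each stored object to its logical asset (its parent directory)."""
    assets = defaultdict(list)
    for path in object_paths:
        asset_id = path.rsplit("/", 1)[0]   # e.g. "/vod/movie1"
        assets[asset_id].append(path)
    return assets

paths = [
    "/vod/movie1/playlist.m3u8",
    "/vod/movie1/seg_00001.ts",
    "/vod/movie1/seg_00002.ts",
    "/vod/movie2/playlist.m3u8",
    "/vod/movie2/seg_00001.ts",
]
catalog = group_logical_assets(paths)
print(len(catalog), "logical assets for", len(paths), "objects")
```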
3) Number of log entries
Each downloaded segment generates a log entry. Instead of processing millions of access log entries per time unit, CDNs now face the challenge of processing billions of access log entries per time unit. The result is that some CDNs and some CDN vendor technologies run out of log processing capacity, effectively losing the ability to generate statistics for their clients and to bill their customers for the traffic they use. (Jet-Stream invented and worked with various vendors to add true session logging capabilities to caches to solve this problem.)
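A minimal sessionization sketch illustrates the kind of roll-up session logging performs (the field names and the 30-second idle gap are assumptions, not any vendor's actual scheme): per-segment log entries collapse into one record per viewing session.

```python
# Roll per-segment log entries up into per-session records.
# The 30 s idle gap that ends a session is an assumed heuristic.

IDLE_GAP = 30  # seconds without a request ends a session (assumed)

def sessionize(entries):
    """entries: (timestamp, client_ip, asset, bytes) tuples, sorted by time."""
    open_sessions = {}   # (ip, asset) -> current open session
    closed = []
    for ts, ip, asset, nbytes in entries:
        key = (ip, asset)
        s = open_sessions.get(key)
        if s and ts - s["last"] <= IDLE_GAP:
            s["last"] = ts
            s["bytes"] += nbytes
            s["hits"] += 1
        else:
            if s:
                closed.append(s)
            open_sessions[key] = {"ip": ip, "asset": asset, "start": ts,
                                  "last": ts, "bytes": nbytes, "hits": 1}
    closed.extend(open_sessions.values())
    return closed

log = [(0, "10.0.0.1", "movie1", 500_000),
       (2, "10.0.0.1", "movie1", 500_000),
       (4, "10.0.0.1", "movie1", 500_000),
       (120, "10.0.0.1", "movie1", 500_000)]   # long gap: a new session
print(len(sessionize(log)), "sessions from", len(log), "log lines")
```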
4) Stateless delivery means loss of session reporting
HTTP adaptive streaming is a stateless delivery technology: there is no correlation between the manifest file and the segments. This means that CDNs lose the ability to report viewing sessions based upon access logs. This is a major step back for content owners. (Jet-Stream invented and worked with various vendors to add true session logging capabilities to caches to solve this problem.)
5) Stateless delivery means loss of concurrent streams reporting
Since caches don't serve a single object or a single stream per user, but users instead download chunks serially and refresh playlists, caches can only report the number of concurrent downloads, which is not the same as the number of concurrent viewers. Effectively, the CDN loses the ability to report a reliable number of concurrent and peak viewers, and loses the ability to report bandwidth usage per session as well. Note that CDNs cannot rely on the client to obtain session information.
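The best a CDN can do from chunk downloads alone is a heuristic estimate like the sketch below (the 10-second activity window is an assumption, roughly a segment duration): a client counts as "viewing" at time t if it fetched any segment within the last window. This is exactly the approximation, not a true viewer count.

```python
# Heuristic estimate of concurrent viewers from chunk requests.
# The 10 s activity window is an assumed value, not a standard.

WINDOW = 10  # seconds; roughly one segment duration (assumed)

def concurrent_viewers(requests, t):
    """requests: (timestamp, client_id) pairs; count distinct clients
    that made any request in the interval (t - WINDOW, t]."""
    return len({client for ts, client in requests if t - WINDOW < ts <= t})

reqs = [(1, "a"), (3, "b"), (5, "a"), (9, "c"), (25, "a")]
print(concurrent_viewers(reqs, 10))   # a, b and c were all recently active
print(concurrent_viewers(reqs, 30))   # only a is still fetching
```

A client that pauses, stalls, or simply buffers ahead is miscounted either way, which is why such figures are unreliable.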
6) Stateless delivery means loss of anti-deep-linking features and loss of separate manifest distribution
Most CDNs offer basic or advanced anti-deep-linking features: delivery nodes deny access to content by default unless there is a valid token in the URL. With HABS, however, only access to the manifest file can be limited, because there is no session information linking the manifest and the segments. Locking down the caches would mean that end users with a valid token could obtain the manifest file but would be unable to access the media chunks, even with a valid token. The result is that CDNs have to allow direct access to chunks and can only protect access to the manifest files. This opens a leak: anyone could create a manifest file, point its segment URLs at the CDN, and give anyone access to the content. CDN caching vendors could try to dynamically add manifest-segment session management to better lock down access. That, however, destroys the claim that HABS can be distributed through any basic HTTP cache. It also destroys the possibility of letting third parties (such as telcos and enterprises) further cache HTTP streams, and the assumption that manifest files can be used as offline referrer files distributed outside the CDN.
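One stateless workaround is to sign every URL, segments included, rather than only the manifest. The sketch below (the signing scheme, parameter names and shared secret are assumptions for illustration) puts an HMAC over each path plus an expiry into the URL, so an edge can validate segment requests without per-session state:

```python
# Sketch: extend token protection from the manifest to every segment URL.
# Scheme, parameter names ("e", "tok") and the secret are assumptions.

import hashlib
import hmac
import time

SECRET = b"example-shared-secret"   # assumed shared between portal and edges

def sign_url(path: str, expires: int) -> str:
    """Append an expiry and an HMAC token to a content path."""
    msg = f"{path}:{expires}".encode()
    token = hmac.new(SECRET, msg, hashlib.sha256).hexdigest()[:16]
    return f"{path}?e={expires}&tok={token}"

def verify_url(path: str, expires: int, token: str, now: int) -> bool:
    """Edge-side check: token must match and the URL must not be expired."""
    if now > expires:
        return False
    expected = hmac.new(SECRET, f"{path}:{expires}".encode(),
                        hashlib.sha256).hexdigest()[:16]
    return hmac.compare_digest(expected, token)

url = sign_url("/live/seg_00042.ts", expires=int(time.time()) + 300)
print(url)
```

Note this trades one problem for another, exactly as the paragraph above argues: a dumb third-party cache that doesn't know the secret can no longer serve the segments, and a manifest distributed offline needs its URLs re-signed.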
7) The Thundering Herd: http live streaming renders caches useless
HTTP adaptive streaming vendors claim that you can use any basic reverse proxy to cache, scale and distribute live streams. In reality, that is a lie. Caches are not smart. If one user requests an object from a cache, and the cache doesn't have the content, the cache pulls the object in from the origin and passes it through to the user. When two users request the same object at the same time, and the cache doesn't have the content, any cache will pass both requests through to the origin simultaneously. When ten thousand users request the same object at the same time, each cache will pass all of those requests through to the origin server. Imagine what happens to the origin server. Imagine dozens of edges hammering an origin. Imagine that this happens every two seconds. That is what HTTP adaptive bit rate live streaming does. Caches don't offload origin servers; they kill the origin by forwarding a killing herd of end-user requests to it. That is why this problem is called the Thundering Herd. Caches and origins fall down like dominoes. Proxy servers and caches simply don't have the intelligence to queue requests, so the effect of edge caches is zero. Caching servers were never designed for live streaming, but to cache static web sites, where the thundering herd problem hardly occurs. The entire assumption that you can use any cache from any vendor to support scalable HTTP live streaming is wrong. Virtually all vendors who claim that their proprietary caches are a great tool for CDNs can't queue cache requests, and in our opinion that gives a clear indication of their actual knowledge of and experience with CDNs and HTTP adaptive streaming. Most transparent caching vendors don't queue requests either. (Jet-Stream worked with a number of partners to add queuing to their caching systems.)
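The queuing cure for the thundering herd is often called request coalescing or single-flight. The sketch below (our illustration of the general technique, not any specific vendor's cache) lets only the first request for a missing key hit the origin; all concurrent requests for the same key wait for that one fetch and share its result:

```python
# Single-flight / request-coalescing sketch: for each missing key,
# exactly one request goes to the origin; followers wait and reuse it.

import threading

class CoalescingCache:
    def __init__(self, fetch_from_origin):
        self.fetch = fetch_from_origin
        self.cache = {}
        self.inflight = {}           # key -> Event signalling fetch done
        self.lock = threading.Lock()

    def get(self, key):
        while True:
            with self.lock:
                if key in self.cache:
                    return self.cache[key]
                ev = self.inflight.get(key)
                if ev is None:                 # we are the leader
                    ev = self.inflight[key] = threading.Event()
                    leader = True
                else:
                    leader = False
            if leader:
                value = self.fetch(key)        # the only origin hit
                with self.lock:
                    self.cache[key] = value
                    del self.inflight[key]
                ev.set()
                return value
            ev.wait()                          # followers block, then retry

origin_hits = []
cache = CoalescingCache(lambda k: origin_hits.append(k) or f"data:{k}")

threads = [threading.Thread(target=cache.get, args=("seg1",)) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(origin_hits), "origin request(s) for 50 client requests")
```

With coalescing, fifty simultaneous cache misses cost the origin one request instead of fifty, which is precisely the intelligence the paragraph above says plain reverse proxies lack.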
8) No QoS
In the nineties, IP video streaming over the web was pioneering fun. In 2000, when broadband emerged, we entered a new era of higher-quality video over the web, though still no serious business. Now we are entering another era, in which consumers and content producers expect the same quality of service they get from traditional cable: service level agreements, delivery capacity guarantees, delivery performance guarantees, and end-user quality guarantees. QoE technologies do not live up to these requirements. QoE means that the video doesn't buffer, but there is absolutely no guarantee that the image quality remains true HD. QoE is not a replacement for QoS; it is a major step backward.
9) Overloading networks
We are getting more and more feedback from operators that HTTP adaptive streams are counterproductive: these streams generate more traffic load, more traffic overhead and more signaling problems in their networks than RTMP, MMS or RTSP streams do. HABS is an aggressive technology that fills up pipes whenever it can; regular streams do not, because they are capped. The effect is that fewer end users can consume the content and that other traffic types are pushed away by HABS traffic. HABS has some similarities to P2P traffic patterns, which also create havoc in access networks.
So why are people so enthusiastic about HTTP adaptive streaming? Let's look at who exactly is so enthusiastic:
Content providers? No.
Telecom Operators? No, they want to offer a premium service over global CDNs.
Streaming technology vendors? No. Microsoft, Wowza and Adobe have stated that HTTP adaptive streaming is just another technology, not intended to replace other streaming technologies.
Global CDNs? Yes. Why? They can't offer QoS anyway, because global CDNs are built for the Internet: a best-effort delivery infrastructure. They want to believe that QoE can replace QoS because that is in their interest. They have a second reason: their platforms aren't really optimized for QoS delivery. Caching is a best-effort delivery technology, DNS is a best-effort geo load balancing technology, and at their core these companies are website acceleration companies, not premium media delivery companies.
CDN hardware vendors? Yes. Why? Their strength is in building high-performance boxes, but their knowledge of the various streaming protocols and technologies is limited. They can lower their costs yet still sell expensive appliances, delivering mediocre, basic best-effort caching and DNS-based CDN solutions.
In our view, all HTTP Adaptive Bit Rate Streaming solutions on the market today are in their 1.0 release state. There are serious design flaws that the industry needs to address before HABS can truly be adopted.