[opendtv] Re: Apple to Provide Live Video Streaming of September 1 Event

  • From: Kilroy Hughes <Kilroy.Hughes@xxxxxxxxxxxxx>
  • To: "opendtv@xxxxxxxxxxxxx" <opendtv@xxxxxxxxxxxxx>
  • Date: Sat, 4 Sep 2010 21:13:10 +0000

Yes, the Pantos draft has been posted at IETF for quite a while and reposted 
as-is when the six month time to live on drafts expires.  It's a convenient way 
to make most of the Apple spec available so people can make content and 
playlists, but it is not an attempt to kick off standardization activity.  MPEG is 
where that party is being held, with Apple, 3GPP, OIPF, DTG, DVB, DECE, and other 
companies and organizations who care about internet video streaming trying to 
come up with a single standard.

FYI - The Pantos format mostly duplicates what Move Networks did for years ... 
chopping up multiplexed transport streams at video access points (I, IDR, 
etc.), and requesting sequential chunks from different bitrate files to adapt 
to actual network throughput.  Apple limits the format to MPEG-2 TS, and it is typically 
implemented by chopping 4 to 8 full length files into ten second files with 
file names that include their sequence number and quality or bitrate.  Audio 
and other tracks have to be identical to allow concatenating file chunk n from 
the 1 Mbps mux with file chunk n+1 from the 2 Mbps mux (for example) without 
causing audio and subtitle decoders to freak out at binary discontinuities.  
The new info in the Pantos draft is documenting their application of WinAmp M3U 
playlists with custom tags to name the file chunks so a client can make the 
right HTTP URL requests to get the next ten second chunk of the current file, 
or a different file if its buffer is getting too full or too empty. 
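
To make the chunk selection concrete, here is a minimal sketch in Python of how a client might pick its next request.  The file naming pattern, the buffer thresholds, and the function name are all made up for illustration; they are not taken from the Pantos draft or from any real player, which would read the chunk names from the playlist.

```python
def next_chunk_url(base_url, bitrates_kbps, level, last_seq, buffer_s,
                   low_s=10.0, high_s=30.0):
    """Pick the bitrate level and URL for the next chunk to request.

    bitrates_kbps is sorted from lowest to highest quality, level is the
    index currently being played, last_seq is the sequence number of the
    last chunk fetched, and buffer_s is how much media the client has
    buffered.  low_s and high_s are hypothetical thresholds.
    """
    if buffer_s < low_s and level > 0:
        # Buffer is draining: drop to the next lower bitrate file.
        level -= 1
    elif buffer_s > high_s and level < len(bitrates_kbps) - 1:
        # Buffer is filling up: there is headroom for a higher bitrate.
        level += 1
    # Hypothetical naming scheme: bitrate and sequence number in the name.
    name = f"movie_{bitrates_kbps[level]}k_{last_seq + 1:05d}.ts"
    return level, f"{base_url}/{name}"


if __name__ == "__main__":
    print(next_chunk_url("http://example.com/v", [500, 1000, 2000], 1, 41, 5.0))
```

Because chunk n+1 is requested from whichever bitrate file the client just chose, the audio and other tracks in every mux have to line up exactly, as noted above.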

That all works pretty well for VOD, where you have a static playlist, but gets 
dicey for live encoding like the Apple event, where every client has to request 
a new version of the playlist about every ten seconds to see what the next 
chunk is, or if there is a next chunk.  If you get enough clients hammering on 
the server for a new playlist and getting busy signals, it starts to look like 
a Denial of Service attack.
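
A back-of-envelope sketch of that polling load, assuming every live client refetches the playlist once per chunk duration and that requests are spread evenly over time (both simplifications; the audience size below is an invented number, not a figure from the Apple event):

```python
def playlist_requests_per_second(clients, chunk_s):
    """Average playlist fetches per second hitting the origin or CDN,
    assuming each client polls once per chunk duration."""
    return clients / chunk_s


if __name__ == "__main__":
    audience = 100_000  # hypothetical number of concurrent viewers
    for chunk_s in (10, 2, 1):
        rate = playlist_requests_per_second(audience, chunk_s)
        print(f"{chunk_s:>2} s chunks: {rate:,.0f} playlist requests/s")
```

This counts only playlist polls; the chunk downloads themselves come on top of it.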

For good adaptation to network dynamics, 2 second chunks have proven to be 
adequate and allow smaller gradation changes in bitrate and video quality for 
more visually seamless presentation, faster start and jump time, etc.  For live 
streaming, low latency is considered important (like 1 second) so chunks need 
to be under one second, and the polling problem is an order of magnitude worse. 
 Note:  for a ten second live chunk, it takes ten seconds plus some delay for 
the decoded picture buffer for the encoder to output the elementary stream for 
transport packetization and multiplexing, updating the playlist, etc., then 
maybe 5 or 8 seconds to transmit, then probably double buffering (or more) in 
the client; so decoding runs at least 3 chunk durations behind real time (30 
seconds in this example), and needing a round trip to fetch a new playlist 
after a chunk is (maybe) available on the server before it can be requested by 
the client adds additional latency. 
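
The latency note above can be written out as a rough lower bound.  This Python sketch assumes one chunk duration to encode and package a chunk plus a client that holds a fixed number of whole chunks in its buffer; the parameter names are mine, and the playlist round trip is just an optional additive term:

```python
def min_live_latency_s(chunk_s, client_buffer_chunks=2, playlist_rtt_s=0.0):
    """Rough lower bound on live delay behind real time, in seconds.

    One chunk duration is spent producing the chunk, then the client
    buffers client_buffer_chunks more before decoding.  playlist_rtt_s
    is any extra delay from fetching an updated playlist.
    """
    return chunk_s * (1 + client_buffer_chunks) + playlist_rtt_s


if __name__ == "__main__":
    for chunk_s in (10, 2, 1):
        print(f"{chunk_s:>2} s chunks: at least {min_live_latency_s(chunk_s):.0f} s behind")
```

With double buffering this gives the 3 chunk durations mentioned above, i.e. 30 seconds for ten second chunks.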

Another problem is combinatorial complexity.
If you want to provide a few screen resolutions ranging from mobile to HDTV, a 
few audio and video codecs (e.g. H.263 and AMR mandatory in cell phones, 
Dolby/DTS in surround sound systems, AVC Baseline or High Profile Levels 1, 2, 
3), subtitles and captions, different languages, stereo/multichannel audio, 
stereo video, camera angles ... and maybe 8 different encodings of each video 
stream at different bitrates; that typically works out to around 40 - 50 tracks 
to approximate a DVD distributed in Europe, or a "3 screens" Internet video 
spanning various devices, resolutions, codecs, languages, accessibility 
requirements, etc. 

Rough math for a 2 hour movie:
40 individually addressable track files (e.g. ISO MPEG-4 fragmented movie files)
4,000 full length multiplex files (each mux with only the video, audio, and 
subtitle tracks requested for download/stream)
4,000,000 individual Move/Pantos files made by chopping up the full length mux 
files
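
A small Python sketch of how such counts multiply.  The split into video, audio, and subtitle choices below is a hypothetical one picked only to reproduce the order of magnitude (it adds up to 50 tracks, the upper end of the 40 - 50 range); it is not how the figures above were derived, and the chunk duration is an assumption as well:

```python
import math


def mux_count(*choices_per_track_type):
    """Number of distinct multiplexes when one track of each type is
    combined (e.g. one video, one audio, one subtitle track)."""
    return math.prod(choices_per_track_type)


def chunk_file_count(mux_files, duration_s, chunk_s):
    """Number of chunk files when every full length mux is chopped
    into chunks of chunk_s seconds."""
    return mux_files * math.ceil(duration_s / chunk_s)


if __name__ == "__main__":
    muxes = mux_count(10, 20, 20)  # hypothetical: 10 video, 20 audio, 20 subtitle
    print(muxes)                                   # 4000
    print(chunk_file_count(muxes, 2 * 3600, 10))   # 2880000
    print(chunk_file_count(muxes, 2 * 3600, 2))    # 14400000
```

Ten second chunks land in the same ballpark as the rough 4 million figure; two second chunks push the count well past it.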

Cache hit ratio (efficiency on the CDN) is roughly inversely proportional to 
the number of files, i.e. with individually addressable tracks, most clients 
could be requesting the same video track segments from cache while requesting 
different audio and subtitle tracks.  For muxed files, each combination of 
audio, video, and subtitles (all 4 million) is unique, even if they share some 
track(s), resulting in lower cache hit ratio.  As long as everyone is an 
English speaker with good hearing and uses a specific collection of 
devices/codecs ... I guess that isn't a problem.  TV and cell phone guys "think 
different".

Kilroy Hughes

-----Original Message-----
From: opendtv-bounce@xxxxxxxxxxxxx [mailto:opendtv-bounce@xxxxxxxxxxxxx] On 
Behalf Of Hughes Gary-DJWV76
Sent: Friday, September 03, 2010 9:06 AM
To: opendtv@xxxxxxxxxxxxx
Subject: [opendtv] Re: Apple to Provide Live Video Streaming of September 1 
Event

A couple of points..

- it has been submitted to IETF as a draft, not an RFC. May seem like a
nit, but in the standards world this is important. A number of SDOs are
scrambling to figure out their role in this brave new world (MPEG, DVB,
OIPF, 3GPP, ...)

- it is not streaming in the push, isochronous sense. It is a series of
short file downloads initiated by the client (pull model) and relies on
the client to pace the transfers and reassemble a stream from the
fragments. Again this may seem like a nit, but if you build servers it
is the difference between building/buying a semi-trailer and a couple of
dozen Hyundai sedans.

- use of RTP/RTCP would require synchronizing multiple elementary
streams and requires the active participation of the origin server. More
importantly it requires the public CDNs to build an overlay network
specific to the protocols in use. Using short HTTP GETs fits in nicely
with the existing caching HTTP based CDNs (this applies to Microsoft
Smooth Streaming and Adobe HTTP Dynamic Streaming as well).

Even so, watching it from home (via Comcast internet) was not a banner
experience. Video quality was so-so, ok for news, not up to
entertainment standards. It kept dropping back to a 'B roll' feed of an
auditorium, unannounced, and I'd have to restart the session to get back
to Steve on stage. Presumably their new server farm was overloaded.

gary

Gary Hughes
Video Architect
Distinguished Member, Technical Staff
Motorola On Demand Video, MA34
80 Central St.
Boxborough, MA  01719
Email: ghughes@xxxxxxxxxxxx
Office: 978 266 7269
Mobile: 978 339 3615
Fax: 978 264 9108
 
 

> -----Original Message-----
> From: opendtv-bounce@xxxxxxxxxxxxx 
> [mailto:opendtv-bounce@xxxxxxxxxxxxx] On Behalf Of Manfredi, Albert E
> Sent: Thursday, September 02, 2010 5:48 PM
> To: opendtv@xxxxxxxxxxxxx
> Subject: [opendtv] Re: Apple to Provide Live Video Streaming 
> of September 1 Event
> 
> Craig Birkmaier wrote:
> 
> > Perhaps Bert would like to comment on the approach Apple has taken
> > with HTTP Live streaming, as it appears to be a viable work around
> > to UDP based streaming, making it possible to deliver multiple levels
> > of quality through the existing HTTP router infrastructure...
> 
> I never paid a lot of attention to this, because it seems to 
> be specifics added to streaming mechanisms that have been 
> around a long time. This is not a fundamentally new protocol. 
> It seems more like nitty gritty specified to make sure 
> everything plays together as Apple intends. Others have done 
> similar things in the past, their own way.
> 
> Among the specifics are encryption of the streams and 
> encryption of the playlists. Clients need to download the 
> encrypted playlist before they can start a session. The 
> precise mechanisms for this are described.
> 
> There is an Internet Draft, dated 2009, and updated most 
> recently in June 2010, that explains this new protocol.
> 
> http://tools.ietf.org/html/draft-pantos-http-live-streaming-04
> 
> It is based on HTTP, meaning RFC 2616. As such, the video 
> streams are sent over TCP only, which means they cannot use 
> IP multicast. Since IP multicast is, let's say, non-existent 
> among different ISPs, I don't think that limitation amounts 
> to much. But surely, we have all used other variants of HTTP 
> live streaming, including the possibility of viewing at 
> various quality levels, haven't we?
> 
> One of the specifics is that Apple is restricting this to 
> MPEG-2 TS formatting. For example, if they had based the 
> streaming on RTP/RTCP, MPEG-2 TS would not have been 
> required. But that's a tradeoff, because doing it this way 
> allows leveraging off HTTP, which must use a TCP Transport 
> Layer. So, to keep the packets flowing at a steady rate, 
> while using TCP, might as well go with MPEG-2 TS.
> 
> Bert
>  
>  
> ----------------------------------------------------------------------
> You can UNSUBSCRIBE from the OpenDTV list in two ways:
> 
> - Using the UNSUBSCRIBE command in your user configuration 
> settings at FreeLists.org 
> 
> - By sending a message to: opendtv-request@xxxxxxxxxxxxx with 
> the word unsubscribe in the subject line.
> 
> 
 
 