Yes, the Pantos draft has been posted at IETF for quite a while and is reposted as-is whenever the six-month time to live on drafts expires. It's a convenient way to make most of the Apple spec available so people can make content and playlists, but it is not an attempt to kick off standardization activity. MPEG is where that party is being held, with Apple, 3GPP, OIPF, DTG, DVB, DECE, and other companies and organizations who care about internet video streaming trying to come up with a single standard.

FYI - The Pantos format mostly duplicates what Move Networks did for years: chopping up multiplexed transport streams at video access points (I, IDR, etc.), and requesting sequential chunks from different bitrate files to adapt to actual network throughput. Apple limits it to MPEG-2 TS, and it is typically implemented by chopping 4 to 8 full length files into ten second files with file names that include their sequence number and quality or bitrate. Audio and other tracks have to be identical to allow concatenating file chunk n from the 1 Mbps mux with file chunk n+1 from the 2 Mbps mux (for example) without causing audio and subtitle decoders to freak out at binary discontinuities.

The new info in the Pantos draft is the documentation of their application of WinAmp MP3 playlists (the M3U format) with custom tags to name the file chunks, so a client can make the right HTTP URL requests to get the next ten second chunk of the current file, or of a different file if its buffer is getting too full or too empty. That all works pretty well for VOD, where you have a static playlist, but gets dicey for live encoding like the Apple event, where every client has to request a new version of the playlist about every ten seconds to see what the next chunk is, or if there is a next chunk. If you get enough clients hammering on the server for a new playlist and getting busy signals, it starts to look like a Denial of Service attack.
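
To make the chunk naming, bitrate switching, and playlist polling concrete, here is a rough Python sketch. It is only an illustration under assumptions of mine: the file naming pattern, the three-step bitrate ladder, and the buffer thresholds are all hypothetical and are not taken from the Pantos draft or from any Apple or Move implementation.

```python
# Hypothetical values for illustration only; none come from the draft.
BITRATES_KBPS = [500, 1000, 2000]   # assumed bitrate ladder
CHUNK_SECONDS = 10                  # ten second chunks, as described above


def chunk_url(base, bitrate_kbps, seq):
    """Build a chunk URL whose name carries bitrate and sequence number."""
    return f"{base}/movie_{bitrate_kbps}k_{seq:05d}.ts"


def next_bitrate(current, buffer_seconds, low=10.0, high=30.0):
    """Step down when the buffer runs low, step up when it is comfortably full."""
    i = BITRATES_KBPS.index(current)
    if buffer_seconds < low and i > 0:
        return BITRATES_KBPS[i - 1]
    if buffer_seconds > high and i < len(BITRATES_KBPS) - 1:
        return BITRATES_KBPS[i + 1]
    return current


def playlist_polls_per_second(clients, chunk_seconds=CHUNK_SECONDS):
    """Live case: every client re-fetches the playlist about once per chunk."""
    return clients / chunk_seconds


if __name__ == "__main__":
    rate = next_bitrate(1000, buffer_seconds=5.0)      # buffer nearly empty
    print(chunk_url("http://example.com/vod", rate, 42))
    print(playlist_polls_per_second(100_000))          # playlist requests/s
```

Under these assumptions, 100,000 live viewers on ten second chunks generate on the order of 10,000 playlist requests per second before a single media chunk is fetched, which is the polling load described above.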
For good adaptation to network dynamics, 2 second chunks have proven to be adequate, and they allow smaller gradation changes in bitrate and video quality for more visually seamless presentation, faster start and jump time, etc. For live streaming, low latency is considered important (like 1 second), so chunks need to be under one second, and the polling problem is an order of magnitude worse.

Note: for a ten second live chunk, it takes ten seconds, plus some delay for the decoded picture buffer, for the encoder to output the elementary stream for transport packetization and multiplexing, updating the playlist, etc., then maybe 5 or 8 seconds to transmit, then probably double buffering (or more) in the client; so decoding runs at least 3 chunk durations behind real time (30 seconds in this example). Needing a round trip to fetch a new playlist after a chunk is (maybe) available on the server, before the client can request it, adds additional latency.

Another problem is combinatorial complexity. If you want to provide a few screen resolutions ranging from mobile to HDTV, a few audio and video codecs (e.g. H.263 and AMR mandatory in cell phones, Dolby/DTS in surround sound systems, AVC Baseline or High Profile Levels 1, 2, 3), subtitles and captions, different languages, stereo/multichannel audio, stereo video, camera angles ... and maybe 8 different encodings of each video stream at different bitrates; that typically works out to around 40 - 50 tracks to approximate a DVD distributed in Europe, or a "3 screens" Internet video spanning various devices, resolutions, codecs, languages, accessibility requirements, etc.

Rough math for a 2 hour movie:

40 individually addressable track files (e.g.
ISO MPEG-4 fragmented movie files)

4,000 full length multiplex files (each mux with only the video, audio, and subtitle tracks requested for download/stream)

4,000,000 individual Move/Pantos files made by chopping up the full length mux files

Cache hit ratio (efficiency on the CDN) is roughly inversely proportional to the number of files, i.e. with individually addressable tracks, most clients could be requesting the same video track segments from cache while requesting different audio and subtitle tracks. For muxed files, each combination of audio, video, and subtitles (all 4 million) is unique, even if they share some track(s), resulting in a lower cache hit ratio.

As long as everyone is an English speaker with good hearing and uses a specific collection of devices/codecs ... I guess that isn't a problem. TV and cell phone guys "think different".

Kilroy Hughes

-----Original Message-----
From: opendtv-bounce@xxxxxxxxxxxxx [mailto:opendtv-bounce@xxxxxxxxxxxxx] On Behalf Of Hughes Gary-DJWV76
Sent: Friday, September 03, 2010 9:06 AM
To: opendtv@xxxxxxxxxxxxx
Subject: [opendtv] Re: Apple to Provide Live Video Streaming of September 1 Event

A couple of points..

- it has been submitted to IETF as a draft, not an RFC. May seem like a nit, but in the standards world this is important. A number of SDOs are scrambling to figure out their role in this brave new world (MPEG, DVB, OIPF, 3GPP, ...)

- it is not streaming in the push, isochronous sense. It is a series of short file downloads initiated by the client (pull model) and relies on the client to pace the transfers and reassemble a stream from the fragments. Again this may seem like a nit, but if you build servers it is the difference between building/buying a semi-trailer and a couple of dozen Hyundai sedans.

- use of RTP/RTCP would require synchronizing multiple elementary streams and requires the active participation of the origin server.
  More importantly it requires the public CDNs to build an overlay network specific to the protocols in use. Using short HTTP GETs fits in nicely with the existing caching HTTP based CDNs (this applies to MS Smoothstream and Adobe Dynamic Streaming as well)

Even so, watching it from home (via Comcast internet) was not a banner experience. Video quality was so-so, ok for news, not up to entertainment standards. It kept dropping back to a 'B roll' feed of an auditorium, unannounced, and I'd have to restart the session to get back to Steve on stage. Presumably their new server farm was overloaded.

gary

Gary Hughes
Video Architect
Distinguished Member, Technical Staff
Motorola On Demand Video, MA34
80 Central St.
Boxborough, MA 01719
Email: ghughes@xxxxxxxxxxxx
Office: 978 266 7269
Mobile: 978 339 3615
Fax: 978 264 9108

> -----Original Message-----
> From: opendtv-bounce@xxxxxxxxxxxxx [mailto:opendtv-bounce@xxxxxxxxxxxxx] On Behalf Of Manfredi, Albert E
> Sent: Thursday, September 02, 2010 5:48 PM
> To: opendtv@xxxxxxxxxxxxx
> Subject: [opendtv] Re: Apple to Provide Live Video Streaming of September 1 Event
>
> Craig Birkmaier wrote:
>
> > Perhaps Bert would like to comment on the approach Apple has taken with HTTP Live streaming, as it appears to be a viable workaround to UDP based streaming, making it possible to deliver multiple levels of quality through the existing HTTP router infrastructure...
>
> I never paid a lot of attention to this, because it seems to be specifics added to streaming mechanisms that have been around a long time. This is not a fundamentally new protocol. It seems more like nitty gritty specified to make sure everything plays together as Apple intends. Others have done similar things in the past, their own way.
>
> Among the specifics are encryption of the streams and encryption of the playlists. Clients need to download the encrypted playlist before they can start a session.
> The precise mechanisms for this are described.
>
> There is an Internet Draft, dated 2009, and updated most recently in June 2010, that explains this new protocol.
>
> http://tools.ietf.org/html/draft-pantos-http-live-streaming-04
>
> It is based on HTTP, meaning RFC 2616. As such, the video streams are sent over TCP only, which means they cannot use IP multicast. Since IP multicast is, let's say, non-existent among different ISPs, I don't think that limitation amounts to much. But surely, we have all used other variants of HTTP live streaming, including the possibility of viewing at various quality levels, haven't we?
>
> One of the specifics is that Apple is restricting this to MPEG-2 TS formatting. For example, if they had based the streaming on RTP/RTCP, that MPEG-2 TS would not have been required. But that's a tradeoff, because doing it this way allows leveraging off HTTP, which must use a TCP Transport Layer. So, to keep the packets flowing at a steady rate, while using TCP, might as well go with MPEG-2 TS.
>
> Bert
>
>
> ----------------------------------------------------------------------
> You can UNSUBSCRIBE from the OpenDTV list in two ways:
>
> - Using the UNSUBSCRIBE command in your user configuration settings at FreeLists.org
>
> - By sending a message to: opendtv-request@xxxxxxxxxxxxx with the word unsubscribe in the subject line.
>

----------------------------------------------------------------------
You can UNSUBSCRIBE from the OpenDTV list in two ways:

- Using the UNSUBSCRIBE command in your user configuration settings at FreeLists.org

- By sending a message to: opendtv-request@xxxxxxxxxxxxx with the word unsubscribe in the subject line.