Network Working Group                                            R. Blom
Internet-Draft                                                  Y. Cheng
Intended status: Informational                               F. Lindholm
Expires: September 10, 2009                                  J. Mattsson
                                                              M. Naslund
                                                              K. Norrman
                                                       Ericsson Research
                                                           March 9, 2009


           SRTP Store-and-Forward Use Cases and Requirements
                draft-mattsson-srtp-store-and-forward-02

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on September 10, 2009.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.





Blom, et al.           Expires September 10, 2009               [Page 1]


Internet-Draft     SRTP SaF Use Cases and Requirements        March 2009


Abstract

   The Secure Real-time Transport Protocol (SRTP) was designed to allow
   simple and efficient protection of RTP.  To provide this, encryption
   and authentication of media and control signaling are tightly coupled
   to the RTP session, and the information in the RTP header.  Hence, in
   general, it is not possible to perform store-and-forward of protected
   media.

   Based on a use case analysis, this document gives requirements that
   SRTP and new SRTP transforms need to satisfy in order to allow secure
   store-and-forward operation.  A first outline of how to introduce the
   needed new functionality and transforms in SRTP is also presented.


Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  4
   2.  Terminology  . . . . . . . . . . . . . . . . . . . . . . . . .  5
   3.  Selected SRTP Background Facts . . . . . . . . . . . . . . . .  6
   4.  Use Cases  . . . . . . . . . . . . . . . . . . . . . . . . . .  7
     4.1.  Trust Model and Assumptions  . . . . . . . . . . . . . . .  7
     4.2.  Media Distribution Use Cases . . . . . . . . . . . . . . .  7
       4.2.1.  Streaming Pre-encrypted Media  . . . . . . . . . . . .  7
       4.2.2.  Video on Demand  . . . . . . . . . . . . . . . . . . .  8
       4.2.3.  Caching Protected Media in the Network . . . . . . . .  8
       4.2.4.  Recording Encrypted Media at Home  . . . . . . . . . .  8
     4.3.  Answering Machine Use Cases  . . . . . . . . . . . . . . .  9
       4.3.1.  Storing/Caching Encrypted Media  . . . . . . . . . . .  9
       4.3.2.  Transport Protection . . . . . . . . . . . . . . . . .  9
       4.3.3.  Playback of Media Stream . . . . . . . . . . . . . . . 10
       4.3.4.  Multiple Callers . . . . . . . . . . . . . . . . . . . 10
     4.4.  Centralized Conferencing Use Case  . . . . . . . . . . . . 10
   5.  Requirements . . . . . . . . . . . . . . . . . . . . . . . . . 11
   6.  Solution Outline . . . . . . . . . . . . . . . . . . . . . . . 12
     6.1.  Overview . . . . . . . . . . . . . . . . . . . . . . . . . 13
     6.2.  SRTP Store-and-Forward Cryptographic Contexts  . . . . . . 14
     6.3.  Store-and-Forward Packet Format  . . . . . . . . . . . . . 15
     6.4.  Replay Protection  . . . . . . . . . . . . . . . . . . . . 16
   7.  Commented Example Usage  . . . . . . . . . . . . . . . . . . . 16
   8.  Implications on SRTP . . . . . . . . . . . . . . . . . . . . . 17
   9.  Security Considerations  . . . . . . . . . . . . . . . . . . . 18
     9.1.  Media protection Transform . . . . . . . . . . . . . . . . 18
     9.2.  Replay Protection  . . . . . . . . . . . . . . . . . . . . 18
   10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 18
   11. IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 18
   12. References . . . . . . . . . . . . . . . . . . . . . . . . . . 19
     12.1. Normative References . . . . . . . . . . . . . . . . . . . 19
     12.2. Informative References . . . . . . . . . . . . . . . . . . 19
   Appendix A.  Draft Compound Transform Details  . . . . . . . . . . 19
     A.1.  Processing . . . . . . . . . . . . . . . . . . . . . . . . 20
       A.1.1.  Sender . . . . . . . . . . . . . . . . . . . . . . . . 20
       A.1.2.  Middlebox  . . . . . . . . . . . . . . . . . . . . . . 20
       A.1.3.  Receiver . . . . . . . . . . . . . . . . . . . . . . . 21
   Appendix B.  Key Management  . . . . . . . . . . . . . . . . . . . 21
     B.1.  Key Management Example for Media Distribution  . . . . . . 22
     B.2.  Key Management Example for Answering Machine . . . . . . . 23
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 24

1.  Introduction

   The Secure Real-time Transport Protocol (SRTP) [RFC3711] is a profile
   of the Real-time Transport Protocol (RTP) [RFC3550], and it provides
   confidentiality, message authentication, and replay protection to
   both RTP and RTCP (Real-time Transport Control Protocol).

   SRTP was designed to protect real-time point-to-point communications
   and is, as presently defined, not intended for communication
   solutions that include non-trusted store-and-forward middleboxes,
   i.e. middleboxes that should not have access to cleartext media but
   should still have access to the other data needed to retransmit
   media according to standard RTP procedures.

   Media in need of end-to-end (e2e) protection could e.g. be real-time
   voice and video information/media clips for internal use by personnel
   in enterprises or authorities.  There are also multimedia telephony
   applications utilizing mailboxes and other store-and-forward
   functions that need e2e protection.  Protection e2e could also be
   needed to protect subscribed media like commercial-free radio and
   television that is distributed over the Internet.

   A typical use case is a store-and-forward media distribution system.
   Many such systems require that media be confidentiality protected
   e2e between the media source and the media rendering device, to
   prevent illegitimate interception or sharing.  At the same time
   the communication should be hop-by-hop (hbh) protected to prevent
   malicious users from performing denial of service attacks by sending
   bogus data to store-and-forward middleboxes.  Methods like the
   Packet-switched Streaming Service (PSS) [3GPP.26.234] exhibit the
   properties needed for secure store-and-forward operation, but they
   are part of larger frameworks tailored for very specific use cases.
   Thus, it would be desirable to be able to offer use of SRTP as a
   general lightweight mechanism to achieve this type of protection.

   Trying to use SRTP with store-and-forward middleboxes reveals two
   main problems:

   The first problem is that the incoming and outgoing RTP streams are
   in general independent, so received RTP packets cannot simply be
   stored and later retransmitted.  In particular, this implies that
   SRTP with the currently defined transforms cannot be applied.  For
   details, see Section 3.

   It should be noted that store-and-forward of media in most cases
   requires that side information is available when retransmitting
   received media.  Such side information, e.g.  RTP timestamp
   information, may come from the RTP header, RTCP messages, and session
   definition data.

   The second problem is that to provide both e2e and hbh protection,
   two independent security contexts with associated protection
   mechanisms have to coexist; a feature unavailable in SRTP as
   currently specified.  To resolve these problems, SRTP needs
   enhancements that in an efficient and coherent way support store-and-
   forward use cases.

   The objective of this document is to explore use cases for an SRTP
   store-and-forward solution, derive the associated requirements, and
   present and discuss an approach to a solution.


2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   Definitions of terms and notation will, unless otherwise indicated,
   be as defined in [RFC3711].

   o  The term authentication will be used to denote message
      authentication and message integrity protection.

   o  By RTP transport protection or simply transport protection, we
      mean protection (confidentiality, authentication, etc.) of
      streamed RTP packets.  This is provided by SRTP according to
      [RFC3711].

   o  By media protection, we similarly mean protection of the
      application payloads carried in RTP.  SRTP provides media
      protection, but only during transport (see above).

   o  A store-and-forward e2e session is defined as the set of store-
      and-forward e2e protected data produced under a single e2e
      context.  A store-and-forward e2e session may comprise several so-
      called store-and-forward sources, i.e. several distinct logical
      e2e media streams to be protected by the same e2e context.

   o  A store-and-forward hbh session is defined as the set of store-
      and-forward hbh protected data produced under a single hbh
      context.

3.  Selected SRTP Background Facts

   SRTP as currently specified has the features described below, which
   explain why it cannot be directly used in store-and-forward
   applications.  They also indicate how an SRTP store-and-forward
   solution could be designed.

   o  All current SRTP transforms use the RTP header as input.  AES-CTR
      uses the SSRC and the packet index to calculate the IV
      (Initialization Vector), AES-f8 uses even more header parameters,
      and HMAC-SHA1 authenticates the full RTP header.  The SSRC is
      typically determined by the key management protocol, and the
      packet index includes the RTP sequence number, whose initial
      value should be chosen randomly according to RTP [RFC3550].  All
      this means that there is no standard-compliant way to receive
      SRTP protected packets in one stream and later simply retransmit
      them as they were received.

   o  Even if the SRTP relevant RTP parameters like SSRC and the SRTP
      index could be determined beforehand for the retransmission
      stream, it would not allow a client to randomly seek in a stream
      without renegotiating the session, as it would lead to
      misalignment between the packet index used for streaming and the
      packet index used by SRTP at the originator.  If the user jumps to
      a different part of the stream, it is impossible to continue
      increasing the RTP sequence number stepwise while at the same time
      keeping it equal to the sequence number needed for decryption.
      Jumping backward (e.g. media rewind) would cause even more
      problems as the retransmitted packets would be discarded by the
      SRTP replay protection.

   o  The encryption key and the authentication key are both derived
      from the same master key in SRTP, see Figure 1.  This means that a
      client able to derive, e.g., the authentication key will always
      also have access to the encryption key, making it impossible to
      use, say, the session encr_key for e2e protection and the session
      auth_key for hbh protection.
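
   The dependency described in the first bullet can be made concrete
   with a short sketch.  The Python fragment below computes the AES-CTR
   IV from the session salt, the SSRC, and the packet index in the way
   [RFC3711] defines it.  The salt, SSRC, and sequence number values,
   as well as the function names, are illustrative assumptions and are
   not taken from this document.

```python
def packet_index(roc: int, seq: int) -> int:
    """48-bit SRTP packet index: i = 2^16 * ROC + SEQ."""
    return ((roc & 0xFFFFFFFF) << 16) | (seq & 0xFFFF)


def aes_cm_iv(session_salt: int, ssrc: int, index: int) -> int:
    """128-bit IV: (k_s * 2^16) XOR (SSRC * 2^64) XOR (i * 2^16)."""
    iv = (session_salt << 16) ^ (ssrc << 64) ^ (index << 16)
    return iv & ((1 << 128) - 1)


# 112-bit session salt (example value, assumed for illustration).
salt = 0xF0F1F2F3F4F5F6F7F8F9FAFBFCFD

# IV used by the originator when the packet was first protected.
iv_original = aes_cm_iv(salt, ssrc=0x11223344, index=packet_index(0, 1000))

# IV a receiver would compute if a middlebox forwarded the same
# ciphertext in a new RTP stream with its own SSRC and numbering.
iv_forwarded = aes_cm_iv(salt, ssrc=0x55667788, index=packet_index(0, 17))

print(f"{iv_original:032x}")
print(f"{iv_forwarded:032x}")
print("same keystream:", iv_original == iv_forwarded)
```

   Since the two IVs differ, a receiver of the forwarded stream would
   generate a different keystream and fail to recover the payload, even
   though it holds the correct session keys.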

                       Packet index -------+
                                           |
                                           v
      +------------+                 +------------+ Session encr_key
      |            |   Master key    |            +------------------>
      |  External  +---------------->|     Key    | Session auth_key
      |    Key     |                 | Derivation +------------------>
      | Management +---------------->|            | Session salt_key
      |            |   Master salt   |            +------------------>
      +------------+                 +------------+

                       Figure 1: SRTP key derivation
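
   As a rough illustration of Figure 1, the following Python sketch
   derives the three session keys from one master key and master salt
   using the label scheme of [RFC3711] (labels 0x00, 0x01, and 0x02).
   Note the assumptions: HMAC-SHA-256 is used as a stand-in PRF because
   it is available in the Python standard library, whereas [RFC3711]
   uses AES in counter mode, so the output values are not real SRTP
   session keys.  The function name and the example key values are
   likewise only illustrative.

```python
import hashlib
import hmac

LABEL_ENCR, LABEL_AUTH, LABEL_SALT = 0x00, 0x01, 0x02


def derive_session_key(master_key: bytes, master_salt: bytes, label: int,
                       index: int = 0, kdr: int = 0,
                       n_bytes: int = 16) -> bytes:
    """Derive one session key from the master key and master salt."""
    r = index // kdr if kdr else 0          # r = index DIV kdr
    key_id = (label << 48) | r              # key_id = label || r
    x = key_id ^ int.from_bytes(master_salt, "big")
    x_bytes = x.to_bytes(len(master_salt), "big")
    # Stand-in PRF (assumption): HMAC-SHA-256 instead of AES-CM.
    return hmac.new(master_key, x_bytes, hashlib.sha256).digest()[:n_bytes]


# Example master key (128 bits) and master salt (112 bits).
master_key = bytes.fromhex("e1f97a0d3e018be0d64fa32c06de4139")
master_salt = bytes.fromhex("0ec675ad498afeebb6960b3aabe6")

encr_key = derive_session_key(master_key, master_salt, LABEL_ENCR)
auth_key = derive_session_key(master_key, master_salt, LABEL_AUTH,
                              n_bytes=20)
salt_key = derive_session_key(master_key, master_salt, LABEL_SALT,
                              n_bytes=14)

for name, key in (("encr", encr_key), ("auth", auth_key),
                  ("salt", salt_key)):
    print(name, key.hex())
```

   Whoever holds the master key and master salt can run all three
   derivations.  A middlebox that is given what it needs to derive the
   session auth_key can therefore derive the session encr_key as well.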


4.  Use Cases

   The use cases below were chosen to illustrate media streaming
   scenarios where the current SRTP specification [RFC3711] does not
   provide sufficient functionality.  These use cases provide context
   and general rationale for the requirements presented in Section 5.

   Note that the necessary key distribution and media session setup is
   out of scope for this document, and will thus not be discussed in any
   detail in the use cases below.  As key management is an integral part
   of a complete store-and-forward solution, the necessary key
   distribution and media session setup for some of the use cases are
   discussed in Appendix B.

4.1.  Trust Model and Assumptions

   The trust model assumed in this document includes two parties who
   wish to communicate securely via one or more honest but curious
   middleboxes.  This means that the communicating parties trust the
   middlebox to deliver the media as expected, but they do not trust it
   with cleartext data.  In the use cases below, there is no example of
   multiple (sequential) middleboxes, but it is a natural generalization
   and it seems warranted to cover this case as well.

4.2.  Media Distribution Use Cases

4.2.1.  Streaming Pre-encrypted Media

   A content creator wants to distribute high value content to clients.
   The content creator distributes the media via a streaming server
   that should not have access to cleartext media, typically because the
   content creator does not trust it.  In one scenario, the content
   creator streams the media to the streaming server where the media is
   stored in a protected format.  In another scenario, the protected
   media may be delivered to the streaming server via e.g. file
   transfer.  These use cases correspond to use of pre-encryption in
   media distribution.  In both cases, protected media is available in
   the streaming server for later transmission to different clients.

   Even when the streaming server could be trusted with cleartext data,
   there are reasons to avoid performing encryption in the streaming
   server itself.  One reason to use pre-encryption is to relieve the
   streaming server of the task of encrypting the media.  If the media
   is pre-encrypted, the streaming server only needs to add integrity
   protection to the encrypted media before streaming it to the clients.
   Clients are trusted by the content creator and have access to the
   encryption key.  When a client receives a packet, its authenticity is
   checked using a security context shared with the streaming server,
   and decryption is performed using a security context shared with the
   content creator.

4.2.2.  Video on Demand

   Some protected content is offered as video on demand where users can
   watch selected video clips at any time.  The media is unicast and the
   clients are offered random seek functionality, which allows them to
   quickly jump to any part of the video.  Other features offered may
   include rendering with speed translation, as in fast forward and slow
   motion.  These features can be used to skip parts of the video or
   jump backward to see interesting parts again.  The problem here is
   jumping back and forth and performing rendering speed translations in
   an e2e protected media stream.

4.2.3.  Caching Protected Media in the Network

   High value encrypted media (e.g. Internet Protocol Television (IPTV)
   and radio) is broadcast in a network.  Only clients trusted
   by the content creator have access to the encryption key.  A network
   node is caching the media, but is not trusted by the content creator
   and has therefore no access to the encryption keys.  A client that
   missed the beginning of a program might stream the media from the
   network cache instead of listening to the broadcast.  Due to the
   trust model where the content creator only trusts the clients, the
   media needs to be e2e protected.  Nevertheless, the media also needs
   to be hbh integrity protected to protect against denial-of-service
   (DoS) attacks.

4.2.4.  Recording Encrypted Media at Home

   High value encrypted media (e.g. IPTV and radio) is broadcast in a
   network.  Only clients trusted by the content creator have access
   to the encryption key.  A user is recording the media on a HDD (Hard
   Disk Drive), but does not yet have a license, or has a license that
   does not allow cleartext copying.  The media is therefore stored in
   protected format on the HDD.  There is, however, a strong need for
   the HDD to be able to check the integrity of the media before it is
   stored.  Otherwise, a DoS attack may fill the HDD with garbage.

4.3.  Answering Machine Use Cases

4.3.1.  Storing/Caching Encrypted Media

   Operators commonly provide an answering machine service to their
   customers.  In this case, the communicating parties (the caller and
   the callee) may not wish to disclose the media to any other party,
   and hence want to apply encryption between each other.  This requires
   that they are able to establish a shared key.  The answering machine
   acts as a store-and-forward middlebox, which stores encrypted data
   and retransmits it to the callee.  The answering machine may act as a
   streaming server when sending the data to the callee, and will then
   not use the exact same RTP headers on the outgoing SRTP traffic as
   was used on the incoming SRTP traffic.  SRTP as specified in
   [RFC3711] will not work in this case, since parts of the RTP header
   are input to the encryption/authentication transforms.

   An alternative forwarding of the recorded media from the answering
   machine to the callee could be by file transfer, e.g. sending the
   recorded media in the format that was used to store it.  Such
   forwarding would not be according to SRTP, but would still yield end-
   to-end protection of the media.  Note however, that decryption and
   rendering would be similar to part of an enhanced SRTP solution.

4.3.2.  Transport Protection

   To prevent the answering machine from being filled with bogus data,
   it is necessary for the answering machine to authenticate the sender
   of the traffic and, further, to verify the authenticity of the
   incoming traffic.  This poses a problem for SRTP as of [RFC3711] in
   that message authentication requires a session key shared with the
   answering machine, but the encryption key shall, as discussed above,
   not be available to it.  This implies that there is a need for two
   independent security contexts, one end-to-end and one hop-by-hop.

   When the callee retrieves the media from the answering machine,
   message authentication is also beneficial.  There are two
   possibilities.  Since the answering machine is trusted to maintain
   and redistribute the media, it may be sufficient to provide message
   authentication between the answering machine and the callee.  Here
   too, it would be necessary to have a separation between the
   e2e protection and the hbh protection.  A second option is that
   authentication is applied from the caller to the callee.  However, if
   the authentication is applied in that way, the answering machine will
   not be able to verify the integrity of the incoming traffic from the
   caller.  It is of course also possible that message authentication is
   desired for any combination of endpoints, i.e. between the caller and
   the callee, between the caller and the answering machine, and between
   the answering machine and the callee.

4.3.3.  Playback of Media Stream

   When a user listens to the messages stored on the answering machine,
   it is useful to be able to rewind and/or fast forward in the media
   stream.  For SRTP as of [RFC3711], this is not possible.  The reason
   is that even if the same payloads can be reinserted in the
   stream by the answering machine, the RTP sequence number is steadily
   increasing on a per packet basis.  Since the synchronization of the
   encryption transforms is based on the RTP sequence number, the
   decryption will fail.  In addition, message authentication will fail
   since the authentication according to [RFC3711] shall cover the
   header of the RTP packet.  This implies that the payload and the
   media have to be protected by a mechanism that is independent of
   parameters used in the transport protocol.
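
   If, in order to keep decryption aligned, the answering machine
   instead reused the original packet indices when rewinding, the
   receiver's replay protection would discard the packets, as noted in
   Section 3.  The Python sketch below shows this with a simplified
   sliding-window replay check.  The class name, the 64-entry window
   size, and the index values are assumptions made for illustration.

```python
class ReplayWindow:
    """Simplified sliding-window replay check on packet indices."""

    def __init__(self, size: int = 64) -> None:
        self.size = size
        self.highest = -1   # highest index accepted so far
        self.seen = 0       # bit k set: index (highest - k) was seen

    def accept(self, index: int) -> bool:
        if index > self.highest:
            # New highest index: slide the window forward.
            shift = index - self.highest
            mask = (1 << self.size) - 1
            self.seen = ((self.seen << shift) | 1) & mask
            self.highest = index
            return True
        offset = self.highest - index
        if offset >= self.size:
            return False    # older than the window: rejected
        if self.seen & (1 << offset):
            return False    # already received: rejected
        self.seen |= 1 << offset
        return True


window = ReplayWindow()

# Normal forward playback of packets 100..299.
forward = [window.accept(i) for i in range(100, 300)]

# Rewind: the same packets are sent again with their original indices.
rewind_far = window.accept(150)    # far behind the window
rewind_near = window.accept(295)   # inside the window, but a duplicate

print(all(forward), rewind_far, rewind_near)
```

   Both rewind attempts are rejected, which is the intended behavior of
   replay protection but makes it unsuitable for a protection layer
   that has to survive playback operations.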

4.3.4.  Multiple Callers

   Several messages may be left on the answering machine, received in
   different sessions and possibly from different callers.  The result
   of this is that different keys were used to encrypt the media.
   Depending on how the callee retrieves the messages from the answering
   machine, different options are possible.  One option is to retrieve
   each message as a separate stream, and in this case, a separate
   session is required per message.  Another option is to somehow switch
   security contexts within an ongoing hbh session.

4.4.  Centralized Conferencing Use Case

   Another use case is a conference bridge that is not to be trusted
   with the cleartext media.  In this case, the conference bridge cannot
   act as a mixer, but in some cases this may be an acceptable
   restriction.  An example is Push-To-Talk solutions, where only one
   user at a time is allowed to talk.  In this setting, the media may be
   repackaged by the conferencing server into RTP packets with different
   headers compared to the incoming traffic.  As described in Section 3,
   this causes authentication and decryption to fail in SRTP.

5.  Requirements

   The use cases above show that, to enable store-and-forward operation,
   an enhanced SRTP has to efficiently support the following
   requirements:

   o  Transport independent media protection

      It SHALL be possible to have media protection that is independent
      of RTP parameters.

      To allow retransmission of received protected media, a transform
      for protecting the RTP payload that is independent of RTP
      transport parameters is needed.

      The media protection MUST cover both message authentication and
      confidentiality protection.

      It SHALL be possible to protect several store-and-forward e2e
      streams with a single e2e master key.

      The requirements imply that the media protection format has to
      include an SRTP SaF Source (SSS) field for robust operation.  The
      SSS can be thought of as an e2e SSRC.

   o  Media source authentication

      It SHALL be possible to provide source authentication of the media
      stream.

      In a group setting, source authentication is here meant to ensure
      that the message originated from a member of the group.  This
      requirement is fulfilled if media has authentication protection in
      a transport independent manner.

   o  Support of playback of protected media streams

      A client SHALL be able to do random seek in a protected media
      stream.

      Note that as playback functions like retransmission and random
      seek capability are features in the described use cases, replay
      protection cannot be required for transport independent media
      protection.

   o  Transport protection

      It SHALL be possible to provide transport protection that is
      independent of the media protection.

      The transport protection MUST be able to provide confidentiality,
      authentication, and replay protection for RTP and at least
      authentication and replay protection for RTCP.

      This requirement maps well against SRTP as of [RFC3711].
      Transport protection is also a means to provide replay protection
      of the media on a hop-by-hop basis.

   o  Separation of security contexts

      It MUST be possible to have independent security contexts for the
      transport independent media protection and the transport
      protection.

      This means in particular that there have to be two distinct master
      keys, one for e2e media protection and one for hbh transport
      protection.


   o  Change of transport independent media protection security context

      It MUST be possible to signal to the receiver the current media
      protection security context to use.  It MUST be possible to change
      this security context within an ongoing hbh session.

      This is needed to allow single stream multiplexing of e.g.
      protected media "clips" which were generated using different
      transport independent media protection security contexts.

      The requirements imply that the media protection format has to
      include a Crypto Context Indicator (CCI) field for robust
      operation.  The CCI can be thought of as a generalized MKI and may
      be defined to also include all the MKI based functionality defined
      in [RFC3711].
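
   The last requirement can be pictured as a lookup that the receiver
   performs per packet.  In the Python sketch below, a hypothetical
   table maps CCI values to e2e contexts, so that messages from two
   callers can be delivered in one hbh session.  The CCI is modeled as
   a plain integer, and the class names, the table interface, and the
   placeholder payloads are assumptions; this document does not define
   the CCI encoding.

```python
import secrets
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class E2EContext:
    """Keying material for one e2e (media protection) context."""
    owner: str
    master_key: bytes
    master_salt: bytes


class ContextTable:
    """Receiver-side map from CCI value to e2e context."""

    def __init__(self) -> None:
        self._contexts: Dict[int, E2EContext] = {}

    def install(self, cci: int, context: E2EContext) -> None:
        self._contexts[cci] = context

    def select(self, cci: int) -> E2EContext:
        try:
            return self._contexts[cci]
        except KeyError:
            raise LookupError(f"no e2e context for CCI {cci}") from None


def new_context(owner: str) -> E2EContext:
    return E2EContext(owner, secrets.token_bytes(16),
                      secrets.token_bytes(14))


table = ContextTable()
table.install(1, new_context("caller A"))
table.install(2, new_context("caller B"))

# Packets of one hbh session; each carries the CCI of the e2e context
# under which its payload was protected (payloads are placeholders).
stream: List[Tuple[int, bytes]] = [
    (1, b"<clip A, packet 1>"),
    (1, b"<clip A, packet 2>"),
    (2, b"<clip B, packet 1>"),
]

used = [table.select(cci).owner for cci, _ in stream]
print(used)
```

   The e2e context changes between the second and third packet without
   any change to the hbh session that carries them.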


6.  Solution Outline

   In this section, a first outline of how to introduce the needed new
   functionality and transforms in SRTP is presented.  For a more
   complete description, including a packet format specification and a
   detailed transform description, see [Naslund].

6.1.  Overview

   The requirements stated above appear possible to meet by making a
   few minor additions to SRTP.  These additions mainly concern new SRTP
   transforms and the introduction of media and transport protection
   crypto context definitions, together with key handling and key
   derivation.

   A high-level description of the proposed new SRTP functionality is as
   follows: The first step is to perform a transport independent media
   protection operation.  The coverage of this transform is the RTP
   payload only.  This operation could either be done with an
   Authenticated Encryption (AE) transform, or with separate encryption
   and authentication transforms.  The media protection should rely on
   two explicit values for cryptographic synchronization, the Packet
   Unique Value (PUV) and the SRTP SaF Source (SSS), which are forwarded
   in the payload.

   After the steps making up the transport independent media protection
   have been performed, the protection processing proceeds as currently
   defined by [RFC3711], which results in the addition of the required
   transport protection.

   Keying for transport protection is performed as described in
   [RFC3711] and uses the SRTP internal key derivation function.  The
   key derivation function operates on a master key and a master salt,
   where the master key is denoted hbh key.

   The keying for the media protection is defined in an equivalent way,
   producing keying material for the media transform.  The e2e keying
   material is based on another master key, the e2e key, which is
   independent of the hbh key.  Also for the e2e context, a master salt
   is defined.  The e2e keying material could preferably be derived
   using the key derivation function defined in [RFC3711].

   Note that with the approach taken, only the media protection
   endpoints will have to implement the new SRTP functionality with
   combined media and transport transform and handling of two security
   contexts.  In the following, we will denote such a combined transform
   a Compound Transform (CT).  The store-and-forward middlebox can rely
   solely on [RFC3711], using already existing functionality for store-
   and-forward operation, given that the transport transform in the
   compound transform is equivalent to a transform defined for
   [RFC3711].  However, there are some practical reasons why also the
   middlebox needs to have some "knowledge" of the e2e part of the
   protection, see below.

   To summarize: By a compound transform, we mean the combination of
   media protection transform (using the e2e key) and one of the defined
   transforms of [RFC3711] for the transport protection (using the hbh
   key).  The compound transform should be defined in this way to allow
   an intermediary to reuse a [RFC3711] compliant implementation of SRTP
   to first receive and then resend the media.
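
   The layering of the compound transform can be sketched in Python as
   follows.  This is only a toy model under stated assumptions: the
   cryptography is a stand-in built from HMAC-SHA-256 (not the
   transforms of [RFC3711] or [Naslund]), the hbh step provides
   authentication only, and the field sizes (32-bit SSS, 64-bit PUV,
   10-byte tags), key values, and function names are hypothetical.

```python
import hashlib
import hmac
import struct

TAG_LEN = 10  # bytes of truncated HMAC kept as authentication tag


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Stand-in keystream: HMAC-SHA-256 over nonce || block counter."""
    out = b""
    counter = 0
    while len(out) < length:
        block = struct.pack(">I", counter)
        out += hmac.new(key, nonce + block, hashlib.sha256).digest()
        counter += 1
    return out[:length]


def _xor(data: bytes, stream: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, stream))


def _tag(key: bytes, data: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha256).digest()[:TAG_LEN]


def e2e_protect(enc_key, auth_key, sss, puv, media):
    """Media protection: depends on SSS and PUV only, not on RTP."""
    sync = struct.pack(">IQ", sss, puv)  # assumed 32-bit SSS, 64-bit PUV
    body = sync + _xor(media, _keystream(enc_key, sync, len(media)))
    return body + _tag(auth_key, body)


def e2e_unprotect(enc_key, auth_key, blob):
    body, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    if not hmac.compare_digest(tag, _tag(auth_key, body)):
        raise ValueError("e2e authentication failed")
    sync, ciphertext = body[:12], body[12:]
    return _xor(ciphertext, _keystream(enc_key, sync, len(ciphertext)))


def hbh_protect(hbh_key, rtp_header, payload):
    """Transport protection stand-in: tag over RTP header and payload."""
    data = rtp_header + payload
    return data + _tag(hbh_key, data)


def hbh_unprotect(hbh_key, packet, header_len=12):
    data, tag = packet[:-TAG_LEN], packet[-TAG_LEN:]
    if not hmac.compare_digest(tag, _tag(hbh_key, data)):
        raise ValueError("hbh authentication failed")
    return data[:header_len], data[header_len:]


def rtp_header(seq, ssrc):
    """Minimal 12-byte RTP header: V=2, PT=96, timestamp 0."""
    return struct.pack(">BBHII", 0x80, 96, seq, 0, ssrc)


# Example keys; the e2e keys are never given to the middlebox.
e2e_enc, e2e_auth = b"E" * 16, b"A" * 16
hbh1, hbh2 = b"1" * 16, b"2" * 16

# Sender: e2e protection first, then transport protection for hop 1.
stored = e2e_protect(e2e_enc, e2e_auth, sss=7, puv=1, media=b"voice frame")
pkt_in = hbh_protect(hbh1, rtp_header(1000, 0x11223344), stored)

# Middlebox: verifies hop 1, keeps the e2e blob as is, and later sends
# it in a new RTP stream (new SSRC and sequence number) on hop 2.
_, blob = hbh_unprotect(hbh1, pkt_in)
pkt_out = hbh_protect(hbh2, rtp_header(17, 0x55667788), blob)

# Receiver: verifies hop 2, then removes the e2e protection.
_, blob_rx = hbh_unprotect(hbh2, pkt_out)
media = e2e_unprotect(e2e_enc, e2e_auth, blob_rx)
print(media)
```

   The receiver recovers the media although the RTP header was replaced
   on the way, because the e2e step takes its synchronization values
   from the payload and not from the RTP header.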

   For RTCP, the solution principles described for RTP apply.  However,
   the main application of RTCP is to control the traffic over one hop,
   which means that e2e encryption cannot be applied in general.  Note,
   though, that there are RTCP application messages that might benefit
   from e2e integrity protection.

6.2.  SRTP Store-and-Forward Cryptographic Contexts

   SRTP maintains a cryptographic context, containing master key(s),
   cryptographic transforms, etc., for the associated SRTP session.
   Exactly how the parameters in the cryptographic context are agreed
   upon is a session setup issue and out of scope of SRTP.  SRTP assumes
   that a cryptographic context, or rather the master key therein, is
   shared only between mutually trusted parties.

                      e2e context (media protection)
             <----------------------------------------------->
           +---+                   +---+                   +---+
           | S |                   | M |                   | R |
           +---+                   +---+                   +---+
             <----------------------> <---------------------->
                  hbh context 1             hbh context 2
              (transport protection)   (transport protection)

          Figure 2: Context sharing (Sender, Middlebox, Receiver)

   The SRTP cryptographic context concept is reusable for the proposed
   solution.  Conceptually, the originator and the intended end-receiver
   share an e2e media security context, while an hbh transport security
   context is shared by an endpoint and an intermediary or by two
   intermediaries, see Figure 2.

   To comply with the trust model of the use cases above, the master
   key(s) in the e2e context MUST be cryptographically independent of,
   and MUST NOT be deducible from, the master key of any hbh context.
   The key management protocol(s) used MUST therefore be able to
   negotiate keys satisfying these requirements.

   The identification of the hbh context should be as defined in
   [RFC3711], while the used e2e context is either implicitly identified
   in the session setup or its identification relies on the proposed
   crypto context indicator (CCI).

   A sender will use two cryptographic contexts: an e2e context used for
   payload protection to the end-receiver, and an hbh context used to
   secure the SRTP transport to the (first) intermediary.  Similarly,
   the end-receiver will use two contexts.  An intermediary node,
   however, will use only one standard SRTP context for each session.
   In other words, an e2e context is used to achieve transport
   independent media protection as required in Section 5, and an hbh
   context is similarly used to achieve transport protection.

   For both e2e and hbh contexts, it is assumed that cryptographic
   context parameters, such as master key and salt (if needed), are
   included.  From these, session keys/salts are derived similarly to
   [RFC3711].
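
   The distribution of contexts among the nodes of Figure 2 can be
   sketched as follows.  The class names CryptoContext and Node are
   hypothetical, and the context is reduced to a master key and a master
   salt; a real SRTP context holds considerably more state.

```python
import secrets
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class CryptoContext:
    """Minimal stand-in for an SRTP cryptographic context."""
    scope: str  # "e2e" or "hbh"
    master_key: bytes = field(default_factory=lambda: secrets.token_bytes(16))
    master_salt: bytes = field(default_factory=lambda: secrets.token_bytes(14))


@dataclass
class Node:
    name: str
    hbh: CryptoContext
    e2e: Optional[CryptoContext] = None  # None for an intermediary

    def can_read_media(self) -> bool:
        # Only holders of the e2e context can recover the plaintext media.
        return self.e2e is not None


e2e = CryptoContext("e2e")
hbh_1 = CryptoContext("hbh")  # sender <-> middlebox
hbh_2 = CryptoContext("hbh")  # middlebox <-> receiver

sender = Node("S", hbh=hbh_1, e2e=e2e)
receiver = Node("R", hbh=hbh_2, e2e=e2e)
# The middlebox uses one standard SRTP context per session (one per hop).
middlebox_in = Node("M (session with S)", hbh=hbh_1)
middlebox_out = Node("M (session with R)", hbh=hbh_2)
```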

   If several senders' payloads are multiplexed within the same stream
   from a server to a receiver (as discussed in Section 4.3.4) the
   receiver may need to switch between e2e contexts within an ongoing
   hbh session.  This can be implemented using a mechanism similar to
   the SRTP MKI field in the e2e context (what is referred to as CCI
   above).  The hbh context would, however, not need any change but
   could rely on an MKI field according to the current definition in
   [RFC3711].
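
   A minimal sketch of the context switching described above, assuming
   the CCI is a small integer carried in each packet.  The class name
   E2EContextTable and the dictionary layout of a context are
   assumptions made for illustration only.

```python
from typing import Dict, Optional


class E2EContextTable:
    """Maps an in-band CCI value to the e2e context it identifies."""

    def __init__(self) -> None:
        self._by_cci: Dict[int, dict] = {}

    def register(self, cci: int, context: dict) -> None:
        if cci in self._by_cci:
            raise ValueError(f"CCI {cci} already in use in this hbh session")
        self._by_cci[cci] = context

    def lookup(self, cci: int) -> Optional[dict]:
        return self._by_cci.get(cci)


table = E2EContextTable()
table.register(1, {"sender": "S1", "e2e_key": b"\x01" * 16})
table.register(2, {"sender": "S2", "e2e_key": b"\x02" * 16})

# Packets from different senders arrive within the same hbh session; the
# receiver selects the e2e context per packet based on the CCI.
selected = [table.lookup(cci)["sender"] for cci in (1, 2, 1)]
```

   The hbh processing is unaffected by this lookup; it would run before
   the CCI is consulted.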

6.3.  Store-and-Forward Packet Format

   The packet format is composed of an "inner" e2e (sender-receiver)
   part embedded in an "outer" hbh (sender-middlebox or middlebox-
   receiver) part.  Between these parts, the CCI field is placed.

   With fields and processing as defined above, the SRTP store-and-
   forward packet format should look approximately like Figure 3.

   +-----------+-------------------+-----+-----+-----+-----+-----+-----+
   |  hbh RTP  +   e2e Encrypted   | e2e | e2e | e2e |     | hbh | hbh |
   |   Header  +      Payload      | PUV | SSS | MAC | CCI | MKI | MAC |
   +-----------+-------------------+-----+-----+-----+-----+-----+-----+

              Figure 3: SRTP store-and-forward packet format

   The additional fields added by the inner e2e security processing are:

   o  e2e SSS: SRTP SaF Source is a value used by the SRTP SaF transform
      as an identifier for the SaF source within a SaF e2e session.
      Thus, SSS MUST be unique for all SaF sources within the SaF e2e
      session.

   o  e2e PUV: Packet Unique Value for the e2e transform.  The e2e PUV
      shall be unique for each e2e encrypted payload being generated by
      a SaF source within a SaF e2e session.

   o  e2e MAC: This field is used to carry payload authentication data
      e2e.

   o  CCI: Crypto Context Identifier is used to signal from sender to
      receiver, which e2e cryptographic context to use.

   The hbh RTP header, hbh MAC, and hbh MKI are in one-to-one
   correspondence with respective fields of [RFC3711] and will not be
   discussed further.
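
   The layout of Figure 3 can be sketched as a pack/unpack pair.  This
   document does not fix any field widths, so all lengths below are
   hypothetical values chosen for illustration, as are the names
   SafPacket, pack, and unpack.  The RTP header is assumed to be the
   fixed 12-byte header without CSRC entries or extensions.

```python
import struct
from typing import NamedTuple

# Hypothetical field widths in bytes; not defined by the draft.
PUV_LEN, SSS_LEN, E2E_MAC_LEN = 4, 4, 10
CCI_LEN, MKI_LEN, HBH_MAC_LEN = 1, 4, 10
RTP_HEADER_LEN = 12
TRAILER_LEN = (PUV_LEN + SSS_LEN + E2E_MAC_LEN
               + CCI_LEN + MKI_LEN + HBH_MAC_LEN)


class SafPacket(NamedTuple):
    rtp_header: bytes
    e2e_payload: bytes
    e2e_puv: int
    e2e_sss: int
    e2e_mac: bytes
    cci: int
    hbh_mki: bytes
    hbh_mac: bytes


def pack(p: SafPacket) -> bytes:
    """Serialize the fields in the order shown in Figure 3."""
    return b"".join([
        p.rtp_header, p.e2e_payload,
        struct.pack("!II", p.e2e_puv, p.e2e_sss),
        p.e2e_mac, struct.pack("!B", p.cci), p.hbh_mki, p.hbh_mac,
    ])


def unpack(data: bytes) -> SafPacket:
    """Split a packet into its fields, reading the trailer from the end."""
    if len(data) < RTP_HEADER_LEN + TRAILER_LEN:
        raise ValueError("packet too short")
    body, trailer = data[:-TRAILER_LEN], data[-TRAILER_LEN:]
    puv, sss = struct.unpack_from("!II", trailer, 0)
    pos = PUV_LEN + SSS_LEN
    e2e_mac = trailer[pos:pos + E2E_MAC_LEN]
    pos += E2E_MAC_LEN
    cci = trailer[pos]
    pos += CCI_LEN
    hbh_mki = trailer[pos:pos + MKI_LEN]
    pos += MKI_LEN
    hbh_mac = trailer[pos:pos + HBH_MAC_LEN]
    return SafPacket(body[:RTP_HEADER_LEN], body[RTP_HEADER_LEN:],
                     puv, sss, e2e_mac, cci, hbh_mki, hbh_mac)


example = SafPacket(b"\x80" + bytes(11), b"ciphertext", 7, 0xDEADBEEF,
                    bytes(10), 3, b"\x00\x00\x00\x01", bytes(range(10)))
wire = pack(example)
```

   A middlebox that only implements [RFC3711] would treat everything
   between the RTP header and the hbh MKI as opaque payload.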

6.4.  Replay Protection

   When the RTP data is hbh transport protected between server and
   receiver, replay protection on the transport level is provided, as
   the hbh protection offers the same security features as [RFC3711].
   As mentioned, the server is assumed to be trusted not to replay data
   on the media level unless the user requests it; this is in line with
   the trust model.

   It is possible to implement replay protection on the media level for
   e2e transforms where the e2e PUV is a counter.  This has to be done
   at the application layer by the applications that require it.
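
   As one possible application-layer mechanism, the sketch below keeps
   a sliding window over a counter-valued e2e PUV, in the spirit of the
   replay list of [RFC3711].  The class name ReplayWindow and the window
   size are assumptions; an application that supports random seek would
   have to reset or bypass such a window.

```python
class ReplayWindow:
    """Sliding-window replay check keyed on a counter-valued e2e PUV."""

    def __init__(self, size: int = 64) -> None:
        self.size = size
        self.highest = -1  # highest PUV accepted so far
        self.bitmap = 0    # bit i set => PUV (highest - i) was already seen

    def check_and_update(self, puv: int) -> bool:
        """Return True and record the PUV if it is fresh, else False."""
        if puv > self.highest:
            shift = puv - self.highest
            mask = (1 << self.size) - 1
            self.bitmap = ((self.bitmap << shift) | 1) & mask
            self.highest = puv
            return True
        offset = self.highest - puv
        if offset >= self.size:
            return False  # older than the window; cannot be judged, reject
        if self.bitmap & (1 << offset):
            return False  # already seen: replay
        self.bitmap |= 1 << offset
        return True


window = ReplayWindow()
results = [window.check_and_update(p) for p in (0, 2, 1, 1, 2, 100, 3)]
```

   In this run the second occurrences of PUV 1 and 2 are rejected as
   replays, and PUV 3 is rejected because it has fallen out of the
   window once PUV 100 has been accepted.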


7.  Commented Example Usage

   In this example use case, it is assumed that a single sender S wants
   to send a single e2e protected media stream to a receiver R via an
   intermediary M. For this, S will use SRTP with a compound transform
   (CT) as defined above.

   1.  S defines an e2e crypto context and forwards it to R. S and M
       agree upon an hbh crypto context.  Each crypto context defines a
       master key, i.e., k_e2e and k_hbh respectively.  Note that for
       store-and-forward operation, the e2e crypto context has to be
       decided unilaterally by the sender.

       The compound transform defines transport authentication and NULL
       transport encryption, which corresponds to a transform defined
       for [RFC3711].  The e2e protection is configured to use both
       integrity and confidentiality protection.

       How these crypto contexts are set up (which key management
       protocol to use, etc.) is out of scope.  Still, it can be noted
       that in principle it could be done by having, e.g., two MIKEY
       [RFC3830] exchanges, one between S and M and one between S and R.

   2.  S sets up an SRTP session with M, to have data forwarded to R. S
       offers the compound transform to M. M, knowing that it will act
       as an intermediary, accepts the offer (even though it doesn't
       have access to the e2e crypto context).  M records that the media
       received is e2e protected.  M also records the identity of the
       compound transform used.

   3.  To receive the media stream, M initiates SRTP as in [RFC3711]
       using a transform equivalent to the hbh transform in the compound
       transform offered by S.

   4.  S starts to transmit SRTP towards M, in effect using k_e2e for
       e2e media protection and k_hbh for hbh transport authentication.

   5.  M receives the SRTP packets and verifies the hbh authenticity of
       each packet.  M stores the (protected) payloads together with
       relevant side information to be used when the media is forwarded.
       Note that M would perform exactly the same operations when
       storing unprotected media for later forwarding.

   6.  Later, R sets up a session with M to render the stored media.  As
       R contacts a middlebox, R offers use of a compound transform,
       preferably having the same e2e transform as was used by S (the
       e2e transform may be part of the e2e crypto context).  If R only
       offers a CT with a different e2e transform, or if R only offers
       use of standard SRTP, M will decline the offer and propose a
       compatible compound transform.  An hbh crypto context, which is
       independent of the first one, is agreed between R and M.

   7.  M, knowing that the stored payloads are e2e protected, initiates
       use of SRTP as in [RFC3711] specifying a transform equal to the
       hbh transform in the CT agreed between R and M. M then transmits
       the authenticated media stream to R.

   8.  When receiving the SRTP packets from M, R first verifies the hbh
       transport authentication and then checks e2e media authentication
       and decrypts the payloads to retrieve the plaintext media.
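
   The decision M takes in step 6 can be sketched as below.  The names
   Offer and middlebox_answer are hypothetical, and "E2E-A" is a
   placeholder transform name; the real offer/answer syntax depends on
   the key management and signaling protocols in use.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class Offer(NamedTuple):
    hbh_transform: str
    e2e_transform: Optional[str] = None  # None means plain SRTP, no CT


def middlebox_answer(stored_e2e: str,
                     offers: Sequence[Offer]) -> Tuple[str, Offer]:
    """Accept a CT whose e2e part matches the stored media, else propose one."""
    for offer in offers:
        if offer.e2e_transform == stored_e2e:
            return "accept", offer
    # No compatible offer: decline and propose a compatible compound
    # transform, keeping the first offered hbh transform when available.
    fallback_hbh = offers[0].hbh_transform if offers else "HMAC-SHA1-80"
    return "propose", Offer(fallback_hbh, stored_e2e)


matching = middlebox_answer("E2E-A", [Offer("HMAC-SHA1-80", "E2E-A")])
plain_srtp = middlebox_answer("E2E-A", [Offer("HMAC-SHA1-80")])
other_ct = middlebox_answer("E2E-A", [Offer("HMAC-SHA1-80", "E2E-B")])
```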


8.  Implications on SRTP

   As the SRTP specification allows new transforms, the new transforms
   can be added with only minor implications.

   The handling of dual security contexts (in the endpoints) is however
   a new feature, which will have to be introduced in SRTP.

   The Key Derivation Function defined in [RFC3711] can be reused for
   both the e2e and the hbh security contexts.


9.  Security Considerations

9.1.  Media Protection Transform

   Any fixed keystream output generated from the same inputs (i.e., key
   and IV) MUST be used to encrypt only once.  Reusing such a keystream
   (commonly called a "two-time pad") would almost certainly compromise
   security.

   The new e2e transform accomplishes packet-uniqueness by inclusion of
   the PUV and stream-uniqueness by inclusion of the SSS in the IV
   formation.  Thus, the SSS MUST be unique among all the RTP streams
   within the same RTP session that share the same e2e master key.
   Master keys MAY be shared between streams belonging to the same RTP
   session, but it is RECOMMENDED that each stream have its own master
   key.
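
   To illustrate why unique (SSS, PUV) pairs avoid keystream reuse, the
   sketch below forms an IV in a way loosely modelled on the AES-CM IV
   of [RFC3711], with the SSS in place of the SSRC and the PUV in place
   of the packet index.  The exact layout, the 32-bit field widths, and
   the function name e2e_iv are assumptions; this document does not
   define the IV formation.

```python
def e2e_iv(session_salt: bytes, sss: int, puv: int) -> bytes:
    """Form a 16-byte IV from a 14-byte salt, a 32-bit SSS and a 32-bit PUV.

    Hypothetical layout: (salt << 16) XOR (SSS << 64) XOR (PUV << 16).
    The low 16 bits stay zero and are left for a per-packet block counter.
    """
    if len(session_salt) != 14:
        raise ValueError("salt must be 14 bytes")
    if not (0 <= sss < 2**32 and 0 <= puv < 2**32):
        raise ValueError("SSS and PUV must fit in 32 bits")
    value = int.from_bytes(session_salt, "big") << 16
    value ^= (sss << 64) | (puv << 16)
    return value.to_bytes(16, "big")


salt = bytes(range(14))
ivs = {e2e_iv(salt, sss, puv) for sss in range(3) for puv in range(50)}
```

   For a fixed salt, the SSS and PUV occupy disjoint bit ranges, so two
   different (SSS, PUV) pairs can never yield the same IV under the same
   key.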

   With the above conditions fulfilled, the security level of the
   compound transform will equal the level offered by [RFC3711].

9.2.  Replay Protection

   Replay protection is only provided on an hbh basis.  Note that the
   requirements on random seek in the media stream rule out any general
   replay protection mechanism applied on an e2e basis, and that this
   threat falls outside the assumed trust model.  Still, the e2e PUV
   offers the possibility to implement application-specific replay
   protection mechanisms.


10.  Acknowledgements

   The authors would like to thank Daniel Catrein, Frank Hartung, and
   Magnus Westerlund for their support and valuable comments.


11.  IANA Considerations

   To signal that the new transforms are used, each relevant key
   management protocol needs to register the new transforms including
   numbering scheme and syntax with IANA.


12.  References

12.1.  Normative References

   [Naslund]  Naslund, "The Use of the Secure Real-time Transport
              Protocol (SRTP) in Store-and-Forward Applications",
              March 2009.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, July 2003.

   [RFC3711]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
              Norrman, "The Secure Real-time Transport Protocol (SRTP)",
              RFC 3711, March 2004.

12.2.  Informative References

   [3GPP.26.234]
              3GPP, "Transparent end-to-end Packet-switched Streaming
              Service (PSS); Protocols and codecs", 3GPP TS 26.234
              8.1.0, December 2008.

   [RFC3830]  Arkko, J., Carrara, E., Lindholm, F., Naslund, M., and K.
              Norrman, "MIKEY: Multimedia Internet KEYing", RFC 3830,
              August 2004.


Appendix A.  Draft Compound Transform Details

   This informative appendix proposes a way to define the compound
   transform such that it fits well in the SRTP framework.  We assume
   the transform is defined to provide

   o  Integrity and confidentiality e2e (the media part)

   o  Integrity hbh (the transport part)

   Clearly, other combinations are also possible in the form of any or
   all of the 15 possible (non-trivial) combinations of the security
   services confidentiality and integrity for the hbh as well as the e2e
   part.  However, we feel that integrity and confidentiality on e2e
   basis combined with hbh integrity will be sufficient in most cases.
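
   The count of 15 follows from four independent on/off choices (two
   services for each of the two parts) minus the case where nothing is
   protected.  A short enumeration, with service names chosen here only
   for readability:

```python
from itertools import product

SERVICES = ("e2e confidentiality", "e2e integrity",
            "hbh confidentiality", "hbh integrity")

combinations = [
    {name for name, enabled in zip(SERVICES, flags) if enabled}
    for flags in product((False, True), repeat=len(SERVICES))
    if any(flags)  # drop the trivial case with no protection at all
]

# The combination assumed in this appendix.
proposed = {"e2e confidentiality", "e2e integrity", "hbh integrity"}
```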

   Below, we make the natural (and necessary) assumption that the sender
   is made aware (e.g. by session setup signaling) that the media will
   be delivered/stored in a middlebox.  Similarly, we assume the
   middlebox is aware that it is acting as a middlebox.

A.1.  Processing

   Recall that standard SRTP processing has the following principal
   form.

   1.  The sender determines keys, transforms, and other parameters from
       the cryptographic context.

   2.  The sender encrypts the RTP payload (optional).

   3.  The sender integrity protects the entire RTP packet (optional).

   On the receiver side, the decryption/integrity verification is
   reversed.

   In the following, we describe the processing taking place in sender,
   middlebox, and receiver as triggered by the use of the CT transform
   indicated by the cryptographic contexts involved.

A.1.1.  Sender

    S1  The sender determines keys and other parameters for e2e and hbh
        protection.  The crypto context states that the CT transform
        shall be used.

    S2  The sender applies the e2e part of CT to the payload.
        Conceptually treating the e2e part as an encryption transform,
        this agrees with the normal SRTP processing.

    S3  The sender next applies the hbh part of CT.  Again, this agrees
        with adding standard SRTP integrity protection.

A.1.2.  Middlebox

A.1.2.1.  Message Storage

   MS1  The middlebox determines hbh keys and other parameters in the
        same way as standard SRTP does.  The crypto context states that
        the CT transform shall be used.  Since the middlebox is aware of
        its role as a (receiving) middlebox, the middlebox configures
        itself to verify integrity but not to decrypt the payload.  To
        fit with the normal SRTP processing, the middlebox may therefore
        conceptually configure itself to perform hbh integrity
        verification but use NULL decryption as supported by SRTP.

   MS2  The middlebox next applies the hbh part of CT according to
        standard SRTP integrity verification and replay protection.

   MS3  The middlebox extracts the payload (which is the output of the
        e2e transform as generated by the sender) and stores it for
        later retrieval by the receiver.

A.1.2.2.  Message Delivery

   MD1  The middlebox determines hbh keys and other parameters in the
        same way as standard SRTP does.  The crypto context states that
        the CT transform shall be used.  Since the middlebox is aware of
        its role as a (sending) middlebox, the middlebox configures
        itself to not encrypt the payload but only to add integrity
        protection.

   MD2  The middlebox applies NULL encryption to the payload.

   MD3  The middlebox applies hbh integrity.

A.1.3.  Receiver

    R1  The receiver determines keys and other parameters for e2e and
        hbh protection.  The crypto context states that the CT transform
        shall be used.

    R2  The receiver applies the hbh part of CT according to standard
        SRTP procedures.

    R3  The receiver applies the e2e part of CT to the payload.
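
   The processing steps of this appendix can be tied together in a
   simplified sketch.  Several assumptions are made so that it runs with
   the standard library only: an HMAC-SHA256 based keystream stands in
   for AES-CM, tags are HMAC-SHA1 truncated to 10 bytes, the PUV is a
   4-byte counter appended to the ciphertext, and the rollover counter,
   SSS, CCI, and MKI are left out.  All function names are hypothetical.

```python
import hashlib
import hmac

TAG_LEN = 10  # truncated tag length, as with HMAC-SHA1-80 in SRTP


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Stand-in stream cipher (HMAC-SHA256 in counter mode), NOT AES-CM.
    out, block = b"", 0
    while len(out) < length:
        out += hmac.new(key, nonce + block.to_bytes(4, "big"),
                        hashlib.sha256).digest()
        block += 1
    return out[:length]


def _tag(key: bytes, data: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha1).digest()[:TAG_LEN]


def e2e_protect(k_enc: bytes, k_auth: bytes, puv: int, payload: bytes) -> bytes:
    """S2: encrypt the payload, then append the PUV and the e2e MAC."""
    nonce = puv.to_bytes(4, "big")
    stream = _keystream(k_enc, nonce, len(payload))
    ct = bytes(a ^ b for a, b in zip(payload, stream))
    return ct + nonce + _tag(k_auth, ct + nonce)


def e2e_unprotect(k_enc: bytes, k_auth: bytes, blob: bytes) -> bytes:
    """R3: verify the e2e MAC, then decrypt."""
    body, mac = blob[:-TAG_LEN], blob[-TAG_LEN:]
    if not hmac.compare_digest(mac, _tag(k_auth, body)):
        raise ValueError("e2e authentication failed")
    ct, nonce = body[:-4], body[-4:]
    stream = _keystream(k_enc, nonce, len(ct))
    return bytes(a ^ b for a, b in zip(ct, stream))


def hbh_protect(k_hbh: bytes, header: bytes, payload: bytes) -> bytes:
    """S3 and MD2/MD3: NULL encryption plus hbh integrity over the packet."""
    packet = header + payload
    return packet + _tag(k_hbh, packet)


def hbh_unprotect(k_hbh: bytes, packet: bytes, header_len: int = 12) -> bytes:
    """MS2/MS3 and R2: verify hbh integrity and return the payload."""
    body, mac = packet[:-TAG_LEN], packet[-TAG_LEN:]
    if not hmac.compare_digest(mac, _tag(k_hbh, body)):
        raise ValueError("hbh authentication failed")
    return body[header_len:]


# The middlebox only ever holds k_hbh1 and k_hbh2.
k_enc, k_auth = b"E" * 16, b"A" * 20
k_hbh1, k_hbh2 = b"1" * 20, b"2" * 20
header = b"\x80" + bytes(11)

sent = hbh_protect(k_hbh1, header, e2e_protect(k_enc, k_auth, 1, b"hello"))
stored = hbh_unprotect(k_hbh1, sent)             # message storage
forwarded = hbh_protect(k_hbh2, header, stored)  # message delivery
media = e2e_unprotect(k_enc, k_auth, hbh_unprotect(k_hbh2, forwarded))
```

   Note that the stored payload is forwarded unchanged; only the outer
   hbh tag differs between the two hops.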


Appendix B.  Key Management

   This informative appendix discusses possible ways to establish SRTP
   cryptographic contexts for store-and-forward scenarios.  As described
   above there are two cryptographic contexts, i.e., an e2e context and
   an hbh context, and they should be independent of each other.

   An hbh context is identified by the triplet <SSRC, destination IP
   address, destination port number> as defined in [RFC3711].  All
   currently available key management protocols that support SRTP, e.g.
   MIKEY, SDES, and DTLS-SRTP, can be used between sender/receiver and
   middlebox or between two middleboxes for negotiating hbh master keys
   and other security parameters.

   An e2e context must also be identified, and the identifier can be any
   transport independent value that uniquely determines the
   cryptographic context between a sender and a receiver.  For instance,
   the sender could assign a unique id to the content to be transmitted
   and use such a Content ID (CID) to identify the e2e context.  The CID
   is then sent to the receiver at session setup time, together with the
   e2e master key and other parameters.  Note that the CID discussed
   here is not the same as the proposed CCI.  The CCI may be thought of
   as a mutable, short, in-band alias for the CID and is only used on
   an hbh basis.  A middlebox can change the CCI chosen by the sender or
   by the previous middlebox in order to avoid CCI collision.  The
   mapping between CID and CCI is sent out-of-band at each hop.  The
   receiver can thus map the CCI received in SRTP packets to the correct
   CID and retrieve the corresponding e2e cryptographic context.
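
   A minimal sketch of how a middlebox might maintain the per-hop
   mapping and resolve CCI collisions.  The class name HopCciMap, the
   one-byte CCI space, the CID strings, and the "next free value"
   policy are assumptions made for illustration.

```python
from typing import Dict


class HopCciMap:
    """Per-hop mapping between content IDs (CID) and in-band CCI values."""

    def __init__(self, cci_space: int = 256) -> None:
        self.cci_space = cci_space
        self.cid_to_cci: Dict[str, int] = {}
        self.cci_to_cid: Dict[int, str] = {}

    def assign(self, cid: str, preferred_cci: int) -> int:
        """Keep the upstream CCI if it is free on this hop, else remap."""
        if cid in self.cid_to_cci:
            return self.cid_to_cci[cid]
        cci = preferred_cci % self.cci_space
        for _ in range(self.cci_space):
            if cci not in self.cci_to_cid:
                self.cid_to_cci[cid] = cci
                self.cci_to_cid[cci] = cid
                return cci
            cci = (cci + 1) % self.cci_space
        raise RuntimeError("no free CCI on this hop")


outgoing = HopCciMap()
cci_a = outgoing.assign("cid:alice-msg-1", preferred_cci=5)
cci_b = outgoing.assign("cid:bob-msg-1", preferred_cci=5)  # collides, remapped
```

   The resulting (CID, CCI) pairs are what would be signaled out-of-band
   to the next hop.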

   Therefore, for the e2e context, additional information, i.e., the CID
   and the (CID, CCI) mapping, needs to be transmitted along with the
   key management protocol messages.  Below we give two examples,
   addressing the media distribution and answering machine use cases,
   respectively.  In the examples we use MIKEY over SIP, but other key
   management protocols that support SRTP can also be used.

B.1.  Key Management Example for Media Distribution

   An example of a session setup sequence for a media distribution use
   case (e.g., video on demand) is shown in Figure 4.  An end user (R)
   sends a SIP INVITE to the media service (S) to request the delivery
   of certain content.  S replies with a 200 OK message, which includes
   the CID and a MIKEY message containing the e2e master key and other
   parameters.  By this MIKEY exchange, S and R agree on the e2e
   context.

   Assuming the requested content is not available on the streaming
   server (M), S needs to send the content to M first, using the SRTP
   compound transform.  It initiates the session by sending an INVITE to
   M, with a MIKEY message for setting up the hbh context between S and
   M. The mapping between CID and CCI is also sent to M. Similarly, M
   sends an INVITE to R and sets up the hbh context between M and R.

+---+                            +---+                             +---+
| S |                            | M |                             | R |
+---+                            +---+                             +---+

                                INVITE
  <-------------------------------------------------------------------
                        200 OK {CID, MIKEY e2e}
  ------------------------------------------------------------------->
                                  ACK
  <-------------------------------------------------------------------
  INVITE {MIKEY hbh S-M, <CID, CCI>}
  --------------------------------->
                                    INVITE {MIKEY hbh M-R, <CID, CCI>}
                                    --------------------------------->
                                                  200 OK
                                    <---------------------------------
                200 OK
  <---------------------------------
                 ACK
  --------------------------------->
                                                   ACK
                                    --------------------------------->

          Figure 4: Session setup sequence for media distribution

B.2.  Key Management Example for Answering Machine

   Typically, a caller (S) tries to reach the intended callee (R)
   directly.  If R is not online, S is notified and redirected to an
   answering machine (M).  S then knows it should run SRTP with the
   compound transform.  To signal that, S sends an INVITE with two MIKEY
   messages, one for setting up the e2e context between S and R, and the
   other for the hbh context between S and M. M cannot process the first
   MIKEY message but stores it.  By processing the second MIKEY message,
   M agrees on the hbh context with S.

   Later when R gets online and tries to retrieve stored data from M, R
   sends an INVITE to M and negotiates the hbh context between them.  In
   the reply, M includes the MIKEY message that was received from S and
   the mapping between CID and CCI.  From this MIKEY message, R gets
   knowledge of the e2e context.  A session setup sequence is shown in
   Figure 5.

+---+                            +---+                             +---+
| S |                            | M |                             | R |
+---+                            +---+                             +---+

  INVITE {MIKEY e2e,
          MIKEY hbh S-M, <CID, CCI>}
  --------------------------------->
                200 OK
  <---------------------------------
                 ACK
  --------------------------------->
                                          INVITE {MIKEY hbh M-R}
                                   <----------------------------------
                                     200 OK {MIKEY e2e, <CID, CCI>}
                                   ---------------------------------->
                                                   ACK
                                   <----------------------------------

          Figure 5: Session setup sequence for answering machine


Authors' Addresses

   Rolf Blom
   Ericsson Research
   SE-164 80 Stockholm
   Sweden

   Phone: +46 10 71 31 707
   Email: rolf.j.blom@ericsson.com


   Yi Cheng
   Ericsson Research
   SE-164 80 Stockholm
   Sweden

   Phone: +46 10 71 17 589
   Email: yi.cheng@ericsson.com


   Fredrik Lindholm
   Ericsson AB
   SE-164 80 Stockholm
   Sweden

   Phone: +46 10 71 31 705
   Email: fredrik.lindholm@ericsson.com


   John Mattsson
   Ericsson Research
   SE-164 80 Stockholm
   Sweden

   Phone: +46 10 71 43 501
   Email: john.mattsson@ericsson.com


   Mats Naslund
   Ericsson Research
   SE-164 80 Stockholm
   Sweden

   Phone: +46 10 71 33 739
   Email: mats.naslund@ericsson.com


   Karl Norrman
   Ericsson Research
   SE-164 80 Stockholm
   Sweden

   Phone: +46 10 71 44 502
   Email: karl.norrman@ericsson.com