RFC 9607 | SCIP RTP Payload Format | July 2024
Hanson, et al. | Standards Track
This document describes the RTP payload format of the Secure Communication Interoperability Protocol (SCIP). SCIP is an application-layer protocol that provides end-to-end session establishment, payload encryption, packetization and de-packetization of media, and reliable transport. This document provides a globally available reference that can be used for the development of network equipment and procurement of services that support SCIP traffic. The intended audience is network security policymakers; network administrators, architects, and original equipment manufacturers (OEMs); procurement personnel; and government agency and commercial industry representatives.¶
This IETF specification depends upon a second technical specification that is not available publicly, namely [SCIP210]. The IETF was therefore unable to conduct a security review of that specification, independently or when carried inside Audio/Video Transport (AVT). Implementers need to be aware that the IETF hence cannot verify any of the security claims contained in this document.¶
This is an Internet Standards Track document.¶
This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 7841.¶
Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at https://www.rfc-editor.org/info/rfc9607.¶
Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
This document details usage of the "audio/scip" and "video/scip" pseudo-codecs [MediaTypes] as a secure session establishment protocol and media transport protocol over RTP.¶
It discusses how:¶
The United States, along with its NATO Partners, has implemented SCIP in secure voice, video, and data products operating on commercial, private, and tactical IP networks worldwide using the scip media subtype. The SCIP data traversing the network is encrypted, and network equipment in line with the session cannot interpret the traffic stream in any way. SCIP-based RTP traffic is opaque and can vary significantly in structure and frequency, making traffic profiling impossible. Also, as the SCIP protocol continues to evolve independently of this document, any network device that attempts to filter traffic (e.g., by deep packet inspection) may cause unintended consequences in the future when changes to the SCIP traffic are not recognized by the network device.¶
The SCIP protocol defined in SCIP-210 [SCIP210] includes built-in support for packetization and de-packetization, retransmission, capability exchange, version negotiation, and payload encryption. Since the traffic is encrypted, neither the RTP transport nor middleboxes can usefully parse or modify SCIP payloads; modifications are detected as integrity violations, resulting in retransmission and, eventually, communication failure.¶
Because knowledge of the SCIP payload format is not needed to transport SCIP signaling or media through middleboxes, SCIP-210 represents an informative reference. While older versions of the SCIP-210 specification are publicly available, the authors strongly encourage network implementers to treat SCIP payloads as opaque octets. When handled correctly, such treatment does not require referring to SCIP-210, and any assumptions about the format of SCIP messages defined in SCIP-210 are likely to lead to protocol ossification and communication failures as the protocol evolves.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
The best current practices for writing an RTP payload format specification, as per [RFC2736] and [RFC8088], were followed.¶
When referring to the Secure Communication Interoperability Protocol, the uppercase acronym "SCIP" is used. When referring to the media subtype scip, lowercase "scip" is used.¶
The following abbreviations are used in this document.¶
The Secure Communication Interoperability Protocol (SCIP) allows the negotiation of several voice, data, and video applications using various cryptographic suites. SCIP also provides several important characteristics that have led to its broad acceptance as a secure communications protocol.¶
SCIP began in the United States as the Future Narrowband Digital Terminal (FNBDT) Protocol in the late 1990s. A combined U.S. Department of Defense and vendor consortium formed a governing organization named the Interoperability Control Working Group (ICWG) to manage the protocol. In time, the group expanded to include NATO, NATO partners, and European vendors under the name International Interoperability Control Working Group (IICWG), which was later renamed the SCIP Working Group.¶
First-generation SCIP devices operated on circuit-switched networks. SCIP was then expanded to radio and IP networks. The scip media subtype transports SCIP secure session establishment signaling and secure application traffic. The built-in negotiation and flexibility provided by the SCIP protocols make SCIP a natural choice for many scenarios that require various secure applications and associated encryption suites. SCIP has been adopted by NATO in STANAG 5068. SCIP standards are currently available to participating government and military communities and select OEMs of equipment that support SCIP.¶
However, SCIP must operate over global networks (including private and commercial networks). Without access to necessary information to support SCIP, some networks may not support the SCIP media subtypes. Issues may occur simply because information is not as readily available to OEMs, network administrators, and network architects.¶
This document provides essential information about the "audio/scip" and "video/scip" media subtypes that enable network equipment manufacturers to include settings for "scip" as a known audio and video media subtype in their equipment. This enables network administrators to define and implement a compatible security policy that includes audio and video media subtypes "audio/scip" and "video/scip", respectively, as permitted codecs on the network.¶
All current IP-based SCIP endpoints implement "scip" as a media subtype. Registration of scip as a media subtype provides a common reference for network equipment manufacturers to recognize SCIP in an SDP payload declaration.¶
The "scip" media subtype identifies and indicates support for SCIP traffic that is being transported over RTP. Transcoding, lossy compression, or other data modifications MUST NOT be performed by the network on the SCIP RTP payload. The network, including the VoIP network, MUST relay the "audio/scip" and "video/scip" media subtype data streams transparently and treat them as "clear-channel data", similar to the Clearmode media subtype defined by [RFC4040].¶
[RFC4040] is referenced because Clearmode does not define specific RTP payload content, packet size, or packet intervals, but rather enables Clearmode devices to signal that they support a compatible mode of operation and defines a transparent channel on which devices may communicate. This document takes a similar approach. Network devices that implement support for SCIP need to enable SCIP endpoints to signal that they support SCIP and provide a transparent channel on which SCIP endpoints may communicate.¶
SCIP is an application-layer protocol that is defined in SCIP-210. The SCIP traffic consists of encrypted SCIP control messages and codec data. The payload size and interval will vary considerably depending on the state of the SCIP protocol within the SCIP device.¶
Figure 1 below illustrates the RTP payload format for SCIP.¶
The SCIP codec produces an encrypted bitstream that is transported over RTP. Unlike other codecs, SCIP does not have its own upper layer syntax (e.g., no Network Adaptation Layer (NAL) units), but rather encrypts the output of the audio and video codecs that it uses (e.g., G.729D, H.264 [RFC6184], etc.). SCIP achieves this by encapsulating the encrypted codec output that has been previously formatted according to the relevant RTP payload specification for that codec. SCIP endpoints MAY employ mechanisms, such as inter-media RTP synchronization as described in [RFC8088], Section 3.3.4, to synchronize "audio/scip" and "video/scip" streams.¶
Figure 2 below illustrates notionally how codec packets and SCIP control messages are packetized for transmission over RTP.¶
As described above, the SCIP RTP payload format is variable and cannot be described in detail in this document. Details can be found in SCIP-210. SCIP will continue to evolve and, as such, the SCIP RTP traffic MUST NOT be filtered by network devices based upon what currently is observed or documented. The focus of this document is for network devices to consider the SCIP RTP payload as opaque and allow it to traverse the network. Network devices MUST NOT modify SCIP RTP packets.¶
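As a rough illustration of opaque handling, the following Python sketch locates the end of the RTP header using the layout in [RFC3550], Section 5.1 and forwards the datagram byte-for-byte. The function names `split_rtp_packet` and `relay` are hypothetical and are not defined by this document or by SCIP-210; the sketch assumes a well-formed RTP version 2 packet and never inspects the payload.

```python
import struct
from typing import Tuple

RTP_FIXED_HEADER_LEN = 12  # fixed header size per RFC 3550, Section 5.1


def split_rtp_packet(packet: bytes) -> Tuple[bytes, bytes]:
    """Return (header, rest) where rest is the payload plus any padding.

    Only header fields are read; the payload is never interpreted.
    """
    if len(packet) < RTP_FIXED_HEADER_LEN:
        raise ValueError("packet shorter than the fixed RTP header")
    first = packet[0]
    if first >> 6 != 2:
        raise ValueError("not an RTP version 2 packet")
    has_extension = bool(first & 0x10)
    csrc_count = first & 0x0F
    header_len = RTP_FIXED_HEADER_LEN + 4 * csrc_count
    if has_extension:
        if len(packet) < header_len + 4:
            raise ValueError("truncated header extension")
        # The length field counts 32-bit words after the 4-byte extension header.
        (ext_words,) = struct.unpack_from("!H", packet, header_len + 2)
        header_len += 4 + 4 * ext_words
    if len(packet) < header_len:
        raise ValueError("truncated RTP header")
    return packet[:header_len], packet[header_len:]


def relay(packet: bytes) -> bytes:
    """Hypothetical relay step: validate the framing, then forward unchanged."""
    split_rtp_packet(packet)  # raises on malformed framing; result is unused
    return packet
```

The relay returns the original bytes rather than reassembling header and payload, so no transcoding or rewriting step can be introduced by accident.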
The SCIP RTP header fields SHALL conform to [RFC3550].¶
SCIP traffic may be continuous or discontinuous. The Timestamp field MUST increment based on the sampling clock for discontinuous transmission as described in [RFC3550], Section 5.1. The Timestamp field for continuous transmission applications is dependent on the sampling rate of the media as specified in the media subtype's specification (e.g., Mixed Excitation Linear Prediction Enhanced (MELPe)). Note that during a SCIP session, both discontinuous and continuous traffic are highly probable.¶
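The timestamp rule can be sketched as a small calculation: the timestamp follows the sampling clock, so it keeps advancing across gaps in a discontinuous stream. The Python helper below is illustrative only; the name `rtp_timestamp_after` and its millisecond-based interface are assumptions, and the clock rates in the usage lines are the 8000 Hz and 90000 Hz values this document specifies for "audio/scip" and "video/scip".

```python
def rtp_timestamp_after(start_ts: int, elapsed_ms: int, clock_rate_hz: int) -> int:
    """Return the RTP timestamp after elapsed_ms of sampling-clock time.

    The timestamp is a 32-bit counter, so the result wraps modulo 2**32.
    """
    if elapsed_ms < 0 or clock_rate_hz <= 0:
        raise ValueError("elapsed time must be >= 0 and clock rate must be > 0")
    ticks = elapsed_ms * clock_rate_hz // 1000  # integer math avoids float rounding
    return (start_ts + ticks) % 2**32


# 20 ms at the audio clock rate advances the timestamp by 160 ticks.
audio_ts = rtp_timestamp_after(1000, 20, 8000)
# A 5-second gap in transmission still advances the timestamp by 40000 ticks.
after_gap_ts = rtp_timestamp_after(audio_ts, 5000, 8000)
```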
The Marker bit SHALL be set to zero for discontinuous traffic. The Marker bit for continuous traffic is based on the underlying media subtype specification. The underlying media is opaque within SCIP RTP packets.¶
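For illustration, the following Python sketch packs the 12-byte fixed RTP header from [RFC3550], Section 5.1 with the Marker bit defaulting to zero, as required for discontinuous traffic. The function name `build_rtp_header` and the field values in the usage line are hypothetical; the sketch assumes no padding, no header extension, and no CSRC entries.

```python
import struct


def build_rtp_header(payload_type: int, sequence_number: int, timestamp: int,
                     ssrc: int, marker: bool = False) -> bytes:
    """Pack a fixed RTP header (version 2, no padding, no extension, no CSRCs)."""
    if not 0 <= payload_type <= 127:
        raise ValueError("payload type must fit in 7 bits")
    byte0 = 2 << 6  # V=2, P=0, X=0, CC=0
    byte1 = (int(marker) << 7) | payload_type  # M bit followed by the 7-bit PT
    return struct.pack("!BBHII", byte0, byte1,
                       sequence_number & 0xFFFF,
                       timestamp & 0xFFFFFFFF,
                       ssrc & 0xFFFFFFFF)


# Marker bit left at zero; payload type 96 matches the SDP examples in this document.
header = build_rtp_header(96, sequence_number=1, timestamp=160, ssrc=0x11223344)
```

For continuous traffic, a caller would pass `marker` according to the underlying media subtype specification, which this sketch does not model.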
The bitrate of SCIP may be adjusted depending on the capability of the underlying codec (such as MELPe [RFC8130], G.729D [RFC3551], etc.). The number of encoded audio frames per packet may also be adjusted to control congestion. Discontinuous transmission may also be used if supported by the underlying codec.¶
Since UDP does not provide congestion control, applications that use RTP over UDP SHOULD implement their own congestion control above the UDP layer [RFC8085] and MAY also implement a transport circuit breaker [RFC8083]. Work in the RTP Media Congestion Avoidance Techniques (RMCAT) working group [RMCAT] describes the interactions and conceptual interfaces necessary between the application components that relate to congestion control, including the RTP layer, the higher-level media codec control layer, and the lower-level transport interface, as well as components dedicated to congestion control functions.¶
Use of the packet loss feedback mechanisms in AVPF [RFC4585] and SAVPF [RFC5124] is OPTIONAL because SCIP itself manages retransmissions of some errored or lost packets. Specifically, the payload-specific feedback messages defined in [RFC4585], Section 6.3 are OPTIONAL when transporting video data.¶
The SCIP application-layer protocol uses RTP as a basic transport for the "audio/scip" and "video/scip" payloads. Additional RTP mechanisms that do not modify the SCIP payload are considered OPTIONAL in this document and are discretionary for a SCIP device vendor to implement. Some examples include, but are not limited to:¶
The SCIP RTP payload format is identified using the scip media subtype, which is registered in accordance with [RFC4855] and per the media type registration template from [RFC6838]. A clock rate of 8000 Hz SHALL be used for "audio/scip". A clock rate of 90000 Hz SHALL be used for "video/scip".¶
The mapping of the above-defined payload format media subtype and its parameters SHALL be implemented according to Section 3 of [RFC4855].¶
Since SCIP includes its own facilities for capabilities exchange, it is only necessary to negotiate the use of SCIP within SDP Offer/Answer; the specific codecs to be encapsulated within SCIP are then negotiated via the exchange of SCIP control messages.¶
The information carried in the media type specification has a specific mapping to fields in the Session Description Protocol (SDP) [RFC8866], which is commonly used to describe RTP sessions. When SDP is used to specify sessions employing the SCIP codec, the mapping is as follows:¶
An example mapping for "audio/scip" is:¶
   m=audio 50000 RTP/AVP 96
   a=rtpmap:96 scip/8000
An example mapping for "video/scip" is:¶
   m=video 50002 RTP/AVP 97
   a=rtpmap:97 scip/90000
An example mapping for both "audio/scip" and "video/scip" is:¶
   m=audio 50000 RTP/AVP 96
   a=rtpmap:96 scip/8000
   m=video 50002 RTP/AVP 97
   a=rtpmap:97 scip/90000
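The mappings shown above could be generated programmatically along the following lines. This Python sketch is illustrative: the helper `scip_media_lines` is hypothetical, it assumes the RTP/AVP profile and a dynamic payload type in the range 96 to 127, and it takes the clock rates from this document (8000 Hz for audio, 90000 Hz for video).

```python
from typing import List

# Clock rates this document specifies for "audio/scip" and "video/scip".
SCIP_CLOCK_RATES = {"audio": 8000, "video": 90000}


def scip_media_lines(media: str, port: int, payload_type: int) -> List[str]:
    """Return the "m=" and "a=rtpmap" lines for one scip media description."""
    if media not in SCIP_CLOCK_RATES:
        raise ValueError("scip is registered only for audio and video")
    if not 96 <= payload_type <= 127:
        raise ValueError("expected a dynamic payload type (96-127)")
    rate = SCIP_CLOCK_RATES[media]
    return [
        f"m={media} {port} RTP/AVP {payload_type}",
        f"a=rtpmap:{payload_type} scip/{rate}",
    ]


# Combined audio and video description, one SDP line per list entry.
sdp_lines = scip_media_lines("audio", 50000, 96) + scip_media_lines("video", 50002, 97)
```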
In accordance with the SDP Offer/Answer model [RFC3264], the SCIP device SHALL list the SCIP payload type number in order of preference in the "m" media line.¶
For example, an SDP Offer with scip as the preferred audio media subtype:¶
   m=audio 50000 RTP/AVP 96 0 8
   a=rtpmap:96 scip/8000
   a=rtpmap:0 PCMU/8000
   a=rtpmap:8 PCMA/8000
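A network device or endpoint that wants to recognize scip in such an offer only needs the "m" line and the "a=rtpmap" attributes. The Python sketch below is a minimal reading of a single media section, not a full SDP parser; the function name `ordered_encodings` is hypothetical, and the static payload table lists only the two [RFC3551] assignments used in the example.

```python
from typing import Dict, List

# Partial table of static payload types from RFC 3551 (only those used here).
STATIC_PAYLOAD_NAMES = {0: "PCMU", 8: "PCMA"}


def ordered_encodings(media_section: str) -> List[str]:
    """Return encoding names in the preference order given on the "m" line."""
    formats: List[int] = []
    rtpmap: Dict[int, str] = {}
    for raw in media_section.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            # m=<media> <port> <proto> <fmt> ...
            formats = [int(fmt) for fmt in line.split()[3:]]
        elif line.startswith("a=rtpmap:"):
            pt, encoding = line[len("a=rtpmap:"):].split(None, 1)
            rtpmap[int(pt)] = encoding.split("/")[0]
    return [rtpmap.get(pt, STATIC_PAYLOAD_NAMES.get(pt, "unknown")) for pt in formats]


offer = """m=audio 50000 RTP/AVP 96 0 8
a=rtpmap:96 scip/8000
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
"""
# Encoding names are compared case-insensitively.
scip_preferred = ordered_encodings(offer)[0].lower() == "scip"
```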
RTP packets using the payload format defined in this specification are subject to the security considerations discussed in the RTP specification [RFC3550], and in any applicable RTP profile such as RTP/AVP [RFC3551], RTP/AVPF [RFC4585], RTP/SAVP [RFC3711], or RTP/SAVPF [RFC5124]. However, as "Securing the RTP Framework: Why RTP Does Not Mandate a Single Media Security Solution" [RFC7202] discusses, it is not an RTP payload format's responsibility to discuss or mandate what solutions are used to meet the basic security goals like confidentiality, integrity, and source authenticity for RTP in general. This responsibility lies with anyone using RTP in an application. They can find guidance on available security mechanisms and important considerations in "Options for Securing RTP Sessions" [RFC7201]. Applications SHOULD use one or more appropriate strong security mechanisms. The rest of this Security Considerations section discusses the security-impacting properties of the payload format itself.¶
This RTP payload format and its media decoder do not exhibit any significant non-uniformity in the receiver-side computational complexity for packet processing, and thus do not inherently pose a denial-of-service threat due to the receipt of pathological data, nor does the RTP payload format contain any active content.¶
SCIP only encrypts the contents transported in the RTP payload; it does not protect the RTP header or RTCP packets. Applications requiring additional RTP header and/or RTCP security might consider mechanisms such as SRTP [RFC3711]; however, these additional mechanisms are considered OPTIONAL in this document.¶
The "audio/scip" and "video/scip" media subtypes have previously been registered in the "Media Types" registry [MediaTypes]. IANA has updated these registrations to reference this document.¶
The SCIP protocol is maintained by the SCIP Working Group. The current SCIP-210 specification [SCIP210] may be requested from the email address below.¶
An older public version of the SCIP-210 specification can be downloaded from https://www.iad.gov/SecurePhone/index.cfm. A U.S. Department of Defense Root Certificate should be installed to access this website.¶