BACKGROUNDERS | Updated: Tue, Feb 21, 2006 12:27 pm PDT


Last revised by Ali Farshchian, Feb 21, 06 at 11:27 am PDT

Session Initiation Protocol (SIP) is a signaling protocol developed by the IETF MMUSIC Working Group. It is a proposed standard for initiating, modifying, and terminating interactive user sessions that involve multimedia elements such as video, voice, instant messaging, online games, and virtual reality. In November 2000, SIP was accepted as a 3GPP signaling protocol and a permanent element of the IMS architecture. Along with H.323, it is one of the leading signaling protocols for Voice over IP.

One goal for SIP was to provide a superset of the call processing functions and features present in the public switched telephone network (PSTN). It therefore supports familiar telephone-like operations: dialing a number, causing a phone to ring, and hearing ringback tones or a busy signal. The implementation and terminology, however, differ from those of the PSTN.
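
As a rough illustration of how telephone-like operations appear in SIP, the sketch below maps a few response codes to what a caller would perceive. The codes (100, 180, 200, 486) are taken from RFC 3261 rather than from this article, and the table name, function name, and descriptions are hypothetical, chosen only for the example.

```python
# Illustrative only: the response codes are defined in RFC 3261, but the
# descriptions and all names here are assumptions made for this sketch.
TELEPHONE_ANALOGY = {
    100: "call is being processed",
    180: "callee's phone is ringing; caller hears ringback",
    200: "callee answered",
    486: "busy signal",
}


def caller_experience(status_codes):
    """Translate a sequence of SIP response codes into caller-side events."""
    return [
        TELEPHONE_ANALOGY.get(code, f"unmapped response {code}")
        for code in status_codes
    ]


if __name__ == "__main__":
    # A call that rings and is then answered.
    for event in caller_experience([100, 180, 200]):
        print(event)
```

A real endpoint reacts to many more responses than these four; the table is deliberately minimal.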

SIP also implements many of the more advanced call processing features present in Signalling System 7 (SS7), though the two protocols themselves could hardly be more different. SS7 is a highly centralized protocol, characterized by a highly complex central network architecture and dumb endpoints (traditional telephone handsets). SIP is a peer-to-peer protocol. As such, it requires only a very simple (and thus highly scalable) core network, with intelligence distributed to the network edge and embedded in endpoints (terminating devices built in either hardware or software). Many SIP features are implemented in the communicating endpoints, whereas traditional SS7 features are implemented in the network.

Although many other VoIP signaling protocols exist, SIP is characterized by its proponents as having roots in the IP community rather than the telecom industry. SIP has been standardized and governed primarily by the IETF while the H.323 VoIP protocol has been traditionally more associated with the ITU. However, the two organizations have endorsed both protocols in some fashion.

SIP works in concert with several other protocols and is involved only in the signaling portion of a communication session. SIP acts as a carrier for the Session Description Protocol (SDP), which describes the media content of the session, for example which ports to use and which codec is being used. In typical use, SIP “sessions” are simply packet streams of the Real-time Transport Protocol (RTP), which carries the actual voice or video content.
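
To make the division of labor concrete, the following minimal sketch pulls the media address, port, and codec out of an SDP body of the kind SIP would carry. The sample body, the addresses, and the function name `parse_sdp_media` are assumptions made for illustration. The parser handles only the `c=`, `m=`, and `a=rtpmap` lines and ignores most of the SDP grammar, such as port ranges and per-media connection lines.

```python
# Hypothetical SDP body; the values are illustrative, not from the article.
SAMPLE_SDP = """\
v=0
o=alice 2890844526 2890844526 IN IP4 192.0.2.10
s=-
c=IN IP4 192.0.2.10
t=0 0
m=audio 49170 RTP/AVP 0
a=rtpmap:0 PCMU/8000
"""


def parse_sdp_media(sdp_text):
    """Return (connection_address, media_list) from a simple SDP body.

    Simplification: the last c= line seen wins, whether it appears at
    session level or media level.
    """
    address = None
    media = []
    for line in sdp_text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a "type=value" line
        if key == "c":
            # c=<nettype> <addrtype> <address>
            address = value.split()[-1]
        elif key == "m":
            # m=<media> <port> <proto> <format list>
            kind, port, proto, *formats = value.split()
            media.append({
                "kind": kind,
                "port": int(port),
                "proto": proto,
                "formats": formats,
                "codecs": {},
            })
        elif key == "a" and value.startswith("rtpmap:") and media:
            # a=rtpmap:<payload type> <encoding name>/<clock rate>
            payload_type, encoding = value[len("rtpmap:"):].split(None, 1)
            media[-1]["codecs"][payload_type] = encoding
    return address, media


if __name__ == "__main__":
    address, media = parse_sdp_media(SAMPLE_SDP)
    print(address, media[0]["port"], media[0]["codecs"])
```

In this sketch the address and port tell the peer where to send RTP packets, and the `rtpmap` entry names the codec. SIP itself never touches the media stream.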

The first proposed standard version (SIP 2.0) was defined in RFC 2543. The protocol was revised and clarified in RFC 3261, which obsoletes RFC 2543, although many implementations still use interim draft versions. Note that the version number remains 2.0.

SIP is similar to HTTP and shares some of its design principles: it is human-readable and structured as requests and responses. SIP proponents also claim that it is simpler than H.323. However, some counter that although SIP originally had simplicity as a goal, in its current state it has become as complex as H.323. SIP shares many HTTP status codes, such as the familiar “404 Not Found”. Neither SIP nor H.323 is limited to voice communication; both can mediate any kind of communication session, from voice to video to future, as yet unrealized applications.
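
The HTTP-like, human-readable structure can be seen by parsing a message with a few lines of string handling. The sketch below is a simplification under stated assumptions: the sample request, the host names, and the function names are hypothetical, and the parser ignores repeated headers, header line folding, and compact header forms, all of which a real SIP stack must handle.

```python
CRLF = "\r\n"

# Hypothetical request, assembled line by line so the CRLF framing is explicit.
SAMPLE_REQUEST = CRLF.join([
    "INVITE sip:bob@example.com SIP/2.0",
    "Via: SIP/2.0/UDP pc33.example.com;branch=z9hG4bK776asdhds",
    "To: Bob <sip:bob@example.com>",
    "From: Alice <sip:alice@example.com>;tag=1928301774",
    "Call-ID: a84b4c76e66710",
    "CSeq: 314159 INVITE",
    "Content-Length: 0",
]) + CRLF + CRLF


def parse_start_line(line):
    """Classify a start line as a request line or a status line."""
    parts = line.split(" ", 2)
    if parts[0].startswith("SIP/"):
        # Status line: <version> <status code> <reason phrase>
        version, code, reason = parts
        return {"type": "response", "version": version,
                "status": int(code), "reason": reason}
    # Request line: <method> <request URI> <version>
    method, uri, version = parts
    return {"type": "request", "method": method,
            "uri": uri, "version": version}


def parse_message(text):
    """Split a SIP message into start line, headers, and body."""
    head, _, body = text.partition(CRLF + CRLF)
    lines = head.split(CRLF)
    start = parse_start_line(lines[0])
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalize to lowercase.
        headers[name.strip().lower()] = value.strip()
    return start, headers, body


if __name__ == "__main__":
    start, headers, _ = parse_message(SAMPLE_REQUEST)
    print(start["method"], start["version"], headers["cseq"])
    print(parse_start_line("SIP/2.0 404 Not Found"))
```

Note that the version token in both the request line and the status line reads `SIP/2.0`, and that the status line has the same shape as an HTTP status line.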

This text is licensed under the GNU Free Documentation License. It uses material from the Wikipedia article “SIP”.
