Why TCP/IP Is Not Sufficient for VoIP
The connection-oriented/connectionless dichotomy
Traditional voice networks are classified as connection-oriented networks, in which a path from the source to the destination is established prior to any information transfer. When the end user takes the telephone off-hook, they notify the network that service is requested. The network then returns dial tone, and the end user dials the destination number. When the destination party answers, the end-to-end connection is confirmed through the various switching offices along the path. When the conversation is complete, the two parties hang up, and their network resources can be re-allocated for someone else's conversation.
One of the disadvantages of this process is the consumption of resources spent setting up the call (a process called signaling, which we will consider in a future tutorial). One of the advantages, however, is that once that call has been established, and a path through the network defined, the characteristics of that path, such as propagation delay, information sequencing, etc. should remain constant for the duration of the call. Since these constants add to the reliability of the system, the term reliable network is often used to describe a connection-oriented environment. The Transmission Control Protocol (TCP) is an example of a connection-oriented protocol.
In contrast, traditional data networks are classified as connectionless networks, in which the full source and destination addresses are attached to a packet of information, and then that packet is dropped into the network for delivery to the ultimate destination. An analogy to connectionless networks is the postal system, in which we drop a letter into the mailbox, and if all works according to plan, the letter is transported to the destination. We do not know the path that the packet (or letter) will take, and depending upon the route, the delay could vary greatly. It is also possible that our packet may get lost or be mis-delivered within the network, and therefore not reach the destination at all. For these reasons, the terms best efforts and unreliable are often used to describe a connectionless environment. The Internet Protocol (IP) and the User Datagram Protocol (UDP) are examples of connectionless protocols.
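The difference between the two models shows up directly in the socket interface. The following is a minimal sketch in Python, assuming a host with a working loopback interface; the function names udp_send_receive and tcp_send_receive are illustrative only and do not come from any VoIP product. The UDP sender attaches the destination address to each datagram and sends it with no setup, while the TCP client must complete a connection before any data moves.

```python
import socket

LOOPBACK = ("127.0.0.1", 0)   # port 0 asks the OS to pick a free port
TIMEOUT_SECONDS = 2.0         # assumption: generous for a loopback exchange


def udp_send_receive(message: bytes) -> bytes:
    """Connectionless: each datagram carries its own destination address."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(LOOPBACK)
        receiver.settimeout(TIMEOUT_SECONDS)  # a datagram may never arrive
        # No call setup: address the datagram and drop it into the network.
        sender.sendto(message, receiver.getsockname())
        data, _source = receiver.recvfrom(2048)
        return data
    finally:
        sender.close()
        receiver.close()


def tcp_send_receive(message: bytes) -> bytes:
    """Connection-oriented: a connection is set up before any data moves."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        listener.bind(LOOPBACK)
        listener.listen(1)
        listener.settimeout(TIMEOUT_SECONDS)
        client.settimeout(TIMEOUT_SECONDS)
        # Setup phase: the handshake must finish before data is sent.
        client.connect(listener.getsockname())
        server_side, _source = listener.accept()
        try:
            server_side.settimeout(TIMEOUT_SECONDS)
            client.sendall(message)
            client.shutdown(socket.SHUT_WR)   # signal the end of the stream
            chunks = []
            while True:
                chunk = server_side.recv(2048)
                if not chunk:                 # empty read: peer finished
                    break
                chunks.append(chunk)
            return b"".join(chunks)
        finally:
            server_side.close()
    finally:
        client.close()
        listener.close()
```

On loopback both calls normally return the message unchanged. Across a real network only the TCP version would detect and retransmit a lost packet; the UDP version would simply time out.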
Recall from your Internet History 101 class that the Internet protocols, including TCP, IP, and UDP, were developed in the 1970s and 1980s to support three key applications: file transfers (using the File Transfer Protocol, or FTP), electronic mail (using the Simple Mail Transfer Protocol, or SMTP), and remote host computer access (using the TELNET protocol). All of these applications were data- (not voice-) oriented, and were therefore based upon IP's connectionless network design. Layering TCP on top of IP gave the entire system enhanced reliability (albeit with additional protocol overhead), but the rigors of a true connection-oriented, switched infrastructure (like the telephone network) were not necessary to support these applications.
Teaching an old dog new tricks
Fast forward a few decades to the new millennium, where visions of voice, fax, and video over IP dominate. These applications are sensitive to sequencing and delay issues, and the idea of a "best efforts" service, especially if the voice conversation must go through (such as a call to the police or fire department), will not gather many supporters.
Which brings us to the challenging question: How do we support connection-oriented applications (such as voice and video) over a connectionless environment (such as IP), without completely redesigning the network infrastructure? The solution is to enhance IP with additional protocols that fill in some of its data-centric gaps. These include:
- Multicast Internet Protocol (Multicast IP), defined in RFCs 1112 and 2236.
Multicast allows information from a single source to be sent to multiple destinations (as may be required for conferencing).
- Real-time Transport Protocol (RTP), defined in RFC 3550.
RTP provides functions such as payload identification, sequence numbering, and timestamps on the information.
- RTP Control Protocol (RTCP), also defined in RFC 3550.
RTCP monitors the quality of the RTP session by exchanging reception statistics among the participants.
- Resource Reservation Protocol (RSVP), defined in RFC 2205.
RSVP requests the allocation of network resources, to assure adequate bandwidth between sender and receiver.
- Real-Time Streaming Protocol (RTSP), defined in RFC 2326.
RTSP supports the delivery of real-time data, including retrieval of information from a media server or support for conferencing.
- Session Description Protocol (SDP), defined in RFC 2327.
SDP conveys information about the media streams for a particular session, including session name, time the session will be active, what media (voice, video, etc.) is to be used, the bandwidth required, and so on.
- Session Announcement Protocol (SAP), defined in RFC 2974.
SAP packets are periodically transmitted to identify open sessions that may be of interest to the end user community.
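To make the RTP functions in the list above concrete (payload identification, sequence numbering, and timestamps), here is a minimal Python sketch of the 12-byte fixed RTP header. The names RtpHeader, pack_rtp, unpack_rtp, and reorder are hypothetical helpers written for this illustration. The sketch assumes no padding and no header extension, and it ignores sequence-number wraparound, so treat it as an outline rather than a complete implementation.

```python
import struct
from typing import Iterable, List, NamedTuple, Tuple

RTP_VERSION = 2
# Network byte order: two flag bytes, 16-bit sequence number,
# 32-bit timestamp, 32-bit synchronization source (SSRC) identifier.
HEADER_FORMAT = "!BBHII"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 12 bytes


class RtpHeader(NamedTuple):
    payload_type: int      # identifies the codec (0 is G.711 mu-law)
    sequence: int          # lets the receiver detect loss and reordering
    timestamp: int         # sampling instant, used to time playout
    ssrc: int              # identifies the sending source
    marker: bool = False


def pack_rtp(header: RtpHeader, payload: bytes) -> bytes:
    """Build a packet with no padding, no extension, and no CSRC entries."""
    byte0 = RTP_VERSION << 6
    byte1 = (int(header.marker) << 7) | (header.payload_type & 0x7F)
    fixed = struct.pack(
        HEADER_FORMAT,
        byte0,
        byte1,
        header.sequence & 0xFFFF,
        header.timestamp & 0xFFFFFFFF,
        header.ssrc & 0xFFFFFFFF,
    )
    return fixed + payload


def unpack_rtp(packet: bytes) -> Tuple[RtpHeader, bytes]:
    """Split a packet into its header fields and payload."""
    if len(packet) < HEADER_SIZE:
        raise ValueError("packet shorter than the fixed RTP header")
    byte0, byte1, sequence, timestamp, ssrc = struct.unpack(
        HEADER_FORMAT, packet[:HEADER_SIZE]
    )
    if byte0 >> 6 != RTP_VERSION:
        raise ValueError("unsupported RTP version")
    csrc_count = byte0 & 0x0F
    # Skip any CSRC entries; header extensions are not handled here.
    payload = packet[HEADER_SIZE + 4 * csrc_count:]
    header = RtpHeader(byte1 & 0x7F, sequence, timestamp, ssrc, bool(byte1 >> 7))
    return header, payload


def reorder(packets: Iterable[bytes]) -> List[Tuple[RtpHeader, bytes]]:
    """Restore sending order by sequence number (wraparound is ignored)."""
    return sorted((unpack_rtp(p) for p in packets), key=lambda hp: hp[0].sequence)
```

As a usage example, a sender carrying 20 ms frames of 8 kHz audio would advance the timestamp by 160 for each packet. A receiver that gets sequence numbers 2, 0, 1 can call reorder to restore 0, 1, 2 before playout, which is the kind of sequencing that IP alone does not provide.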
Copyright (C) 2005 DigiNet (R) Corporation
So is TCP/IP adequate for VoIP? Strictly speaking, no. But with the addition of new protocols to support time-sensitive applications such as voice and video, the existing IP infrastructure can be all things to all people, supporting both connection-oriented and connectionless applications. In the next several tutorials, we will examine some of these new protocols in more detail.
Mark A. Miller, P.E. is President of DigiNet (R) Corporation, a Denver-based consulting engineering firm. He is the author of many books on networking technologies, including Voice over IP Technologies, and Internet Technologies Handbook, both published by John Wiley & Sons.