Transport services and end-to-end communication between hosts

Transport services and protocols

Figure 7.1  Multiplexing function of TCP and UDP transport protocols.

The two main transport protocols used in conjunction with IP are TCP (transmission
control protocol) and UDP (user datagram protocol). They provide respectively for a
connection-oriented transport service (COTS) and a connectionless transport service (CLTS).
By connection-oriented, we mean that a connection must be established before transport of
user data can commence. When setting up the connection, no data may be sent until there
is confirmation that the destination device exists, is switched on, and is ready to receive it.
Subsequently, during the data-transfer phase of a transport connection, the blocks of user
data sent (at layer 4, called segments) must be acknowledged — thereby guaranteeing reliable
delivery. In contrast, a connectionless transport protocol carries each message (segment of data)
independently and without a connection set-up phase. A connectionless transport protocol is
akin to posting a letter — there is no guarantee the message will arrive: you may have spelt
the address wrongly; it might get lost or corrupted. But while connectionless transport is
unreliable, it has the advantage of being simpler, less demanding of network capacity and
quicker when conveying short (one-segment) messages. If all you want to say is ‘Help!’, it is quicker just to say ‘Help!’ rather than first having to set up a connection — something like: ‘Are you listening?’. . .‘Yes’. . .‘May I start?’. . .‘Yes, I’m ready’. . .‘Help!!’. Just because connectionless transport is unreliable doesn’t mean it is a bad thing!
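The contrast can be seen directly with Python's socket API in a small, hypothetical local demonstration (both endpoints run in one process over the loopback interface): UDP simply addresses and sends a datagram with no set-up phase, while TCP must establish a connection before any data moves.

```python
import socket

# Connectionless (UDP): no set-up phase -- just address the datagram and send.
udp_rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp_rx.bind(("127.0.0.1", 0))            # OS picks a free port
addr = udp_rx.getsockname()

udp_tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp_tx.sendto(b"Help!", addr)            # no handshake, no delivery guarantee
datagram, _ = udp_rx.recvfrom(1024)

# Connection-oriented (TCP): a connection must be established first
# (the three-way handshake happens inside connect()/accept()).
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
tcp_tx = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcp_tx.connect(listener.getsockname())   # connection set-up phase
conn, _ = listener.accept()
tcp_tx.sendall(b"Help!")                 # data-transfer phase: delivery is acknowledged
stream = conn.recv(1024)

for s in (udp_rx, udp_tx, listener, tcp_tx, conn):
    s.close()
```

Note that even in this sketch the UDP datagram carries no acknowledgement: if it were lost, the sender would never know.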

Protocol layers and their configuration parameters: critical
for communication between hosts
A pair of hosts wishing to communicate data with one another across a data network must be
equipped with suitable networking and transport protocols, and have the parameters of these
protocols configured in a compatible manner. The same transport protocol must be used at
each end of the communication, since the two hosts are peer partners at the transport layer.
But the network interfaces and protocols used at the two ends may be different, as Figure 7.2
illustrates. The diagram will help us to understand the benefits of layered protocols and to
consider the prerequisites for successful end-to-end communication.
Figure 7.2 illustrates the same example network as Figure 7.1, except that it also shows the
lower layer communications protocols.

Figure 7.2  Detailed network topology and communications protocol functions involved in the example of Figure 7.1.

In Figure 7.2 it is apparent that the path from one host
to the other traverses two routers along the way. We shall assume that each host is connected
to its corresponding router by means of a LAN interface, and that the connection between
the two interfaces is by means of a point-to-point leaseline. As a result, the routers perform
not only IP layer forwarding but also physical interface and datalink layer adaptation. The
datalink layer protocol used in the LAN between the hosts and their corresponding routers
will most likely be IEEE 802.2 (LLC — logical link control protocol), while the datalink layer
protocol used on the point-to-point link between the two routers will probably be either PPP
(point-to-point protocol) or HDLC (high-level data link control). The two physical LAN
interfaces will most likely conform to ethernet (e.g., IEEE 802.3, 10baseT, 100baseT, etc.),
while the point-to-point connection will have a leaseline interface format, e.g., X.21, V.35 etc.
as we discussed in Chapter 3.
The communication of one of the application protocols in the host on the left-hand side
of Figure 7.2 with the corresponding peer application protocol in the host on the right-hand
side (e.g., email-to-email) has to pass through 18 protocol processing steps along its complete
path!! First, the transport layer (TCP/UDP) creates segments of data containing the email
message. Each segment of data is then packed into a packet (at layer 3 — the network layer)
and then into a frame (at layer 2 — the data link layer) before being coded into the appropriate
physical format for transmission. But after each hop of the way, each frame is unpacked for
the IP packet forwarding process, before being re-packed into another frame and recoded
for the next hop. At the final destination, the frames and packets are all unpacked and the
individual segments are reassembled in the correct order for presentation of the message to
the destination email application.
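A toy sketch of this encapsulation chain may make the idea concrete. The header strings and function names below are invented labels, not real protocol formats; they only mirror the segment-into-packet-into-frame nesting and the per-hop re-framing described above.

```python
def to_segment(data: bytes) -> bytes:
    return b"TCP|" + data                 # layer 4: transport header added

def to_packet(segment: bytes) -> bytes:
    return b"IP|" + segment               # layer 3: network header added

def to_frame(packet: bytes, link: str) -> bytes:
    return link.encode() + b"|" + packet  # layer 2: link-specific framing

def forward(frame: bytes, next_link: str) -> bytes:
    packet = frame.split(b"|", 1)[1]      # router: unpack frame to the IP packet...
    return to_frame(packet, next_link)    # ...then re-frame for the next hop

frame = to_frame(to_packet(to_segment(b"email")), "LLC")  # host -> router 1 (LAN)
frame = forward(frame, "PPP")                             # router 1 -> router 2 (leaseline)
frame = forward(frame, "LLC")                             # router 2 -> host (LAN)
```

Only the frame changes hop by hop; the IP packet and TCP segment inside travel end-to-end unchanged.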
Each of the 18 protocol functions of Figure 7.2 has to be configured correctly, or the
communication will fail!
The configuration of the parameters of each of the various different protocol functions
(18 in all) of Figure 7.2, as in any real network, has to be undertaken carefully. Typically
the different functions will comprise individual software and/or hardware modules, each of
which will have been added to the network at a different time — as a new router or host


Transport services and protocols

has been connected to the network, or as a new interface or link has been added. Most
of the modules are likely to be pre-configured in a default configuration, so that much of the
network can be built in a plug-and-play fashion, but there are still configuration duties for the
human administrator to undertake and check (if the plug-and-play approach does not bring
immediate success):
• physical layer interface: cable lengths and types, connector pin-outs and DTE/DCE functional assignments must all be correct;
• datalink layer: the MAC address is usually a factory-configured globally unique identifier
in the case of ethernet or other LANs, but HDLC or PPP will require address assignments.
On a PC, configuration changes are usually to be found in a window labelled something
like ‘control panel/system/equipment manager/network interface adapters’;
• IP layer: TTL (time-to-live) and other parameters must align, IP addresses must be assigned
manually or a protocol such as DHCP or BOOTP must be organised to undertake dynamic
allocation. Routing protocols must be set up and configured in the routers. IP parameters in host devices (e.g., PCs or workstations) are typically to be found in a ‘control panel/networking’ configuration window; and
• TCP or UDP transport layer capabilities and parameters must align with the peer host.
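As a rough illustration of the compatibility checks above, one might model a host's configuration and test two hosts against one another. The class and field names are invented for this sketch, and only the end systems' transport and network settings are compared (link-layer and physical parameters need only match per hop, not end-to-end).

```python
from dataclasses import dataclass

@dataclass
class HostConfig:
    transport: str      # "TCP" or "UDP" -- must match the peer host
    ip_address: str     # assigned manually or via DHCP/BOOTP
    datalink: str       # e.g. "LLC" -- need only match the local LAN

def can_interwork(a: HostConfig, b: HostConfig) -> bool:
    # Peer partners at the transport layer: the same protocol is required...
    if a.transport != b.transport:
        return False
    # ...and both must have an IP address configured at the network layer.
    return bool(a.ip_address) and bool(b.ip_address)

left = HostConfig("TCP", "192.168.0.10", "LLC")
right = HostConfig("TCP", "10.0.0.20", "LLC")
```

Note that `left` and `right` interwork even though, in Figure 7.2, their datalink and physical layers would differ hop by hop.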
Things may seem to be getting frightfully complex. There are lots of different protocols to
configure in all the different devices — and much scope for error — mainly because of the
large number of devices and software functions involved in a data network. Anyone who has
tried to configure a network, or simply tried to configure the network settings in his own PC to
enable him to connect to the Internet will know about the potential for mistakes. Nonetheless,
all experienced data network engineers and software developers will tell you that the use of
layered protocols makes things much easier to administer than they might otherwise have
been — at least the configuration parameters are not embedded in the application software! It
would be a nightmare to have to keep recompiling application software each time a network
parameter or topology change was undertaken!!
But maybe you are wondering how a software application can initiate data communication
with a distant partner? The answer is by invoking a transport service. This is done at an
internal software interface called a port or a socket. (This is the service access point (SAP)
of the transport layer.) An application protocol (such as one of those listed in the right hand
column of Table 7.1) uses the port or socket.

Transport layer multiplexing — ports and sockets
Table 7.1 lists the most commonly used port (or socket) numbers made available by TCP
(transmission control protocol) and UDP (user datagram protocol). The use of different port
numbers allows a single point-to-point communications path between two end-devices to be
used simultaneously for a number of different applications listed in the table.
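A minimal sketch of this demultiplexing: incoming segments are sorted to applications purely on their destination port number. The mapping below uses a few well-known assignments; the function name is invented.

```python
# Well-known destination ports identify the application protocol.
WELL_KNOWN_PORTS = {25: "SMTP", 53: "DNS", 80: "HTTP"}

def demultiplex(segments):
    """Deliver each (dest_port, payload) segment to the right application."""
    delivered = {}
    for dest_port, payload in segments:
        app = WELL_KNOWN_PORTS.get(dest_port, "unknown")
        delivered.setdefault(app, []).append(payload)
    return delivered

# Three segments share one network path but belong to two applications.
inbound = [(80, b"GET /"), (25, b"MAIL FROM"), (80, b"GET /img")]
by_app = demultiplex(inbound)
```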

Transport of communication flows
Today, TCP (transmission control protocol) and UDP (user datagram protocol) are the most
widely used data transport protocols. Despite their age (the current versions date back to 1981
and 1980 respectively) they have proved to be very durable for all manner of data communications applications. But to meet the needs of modern multimedia communications, there has



Table 7.1  TCP (transmission control protocol) and UDP (user datagram protocol) port numbers: application protocols and services

FTP (file transfer protocol)
SSH (secure shell) remote login
SMTP (simple mail transfer protocol)
DNS (domain name service)
TACACS (terminal access controller access control system) database
BOOTP (bootstrap protocol) / DHCP (dynamic host configuration protocol) server
BOOTP (bootstrap protocol) / DHCP (dynamic host configuration protocol) client
TFTP (trivial file transfer protocol)
World wide web HTTP (hypertext transfer protocol)
Sun remote procedure call (RPC)
NNTP (network news transfer protocol)
NTP (network time protocol)
NetBIOS name service
NetBIOS datagram service
NetBIOS session service
SNMP (simple network management protocol)
SNMP trap
BGP (border gateway protocol)
IRC (Internet relay chat)
Novell IPX (internetwork packet exchange)
HTTPS (secure hypertext transfer protocol)
rsh (BSD — Berkeley software distribution) remote shell
RLOGIN (remote login)
cmd (UNIX ‘r’ commands)
RIP (routing information protocol)
UUCP (UNIX-to-UNIX copy program)
LDP (label distribution protocol): LDP hello uses UDP, LDP sessions use TCP
SOCKS (OSI session layer security)
RADIUS (remote authentication dial-in user service) authentication server
RADIUS accounting server (Radacct)
L2F (layer 2 forwarding)
NFS (network file system)
DLSw (data link switching) read port
DLSw (data link switching) write port
SIP (session initiation protocol)
X-windows system display
SAP (session announcement protocol)
Note: For a complete and up-to-date listing see also www.iana.org/assignments/port-numbers

been recent intensive effort to develop both the transport and network layer protocols further — in order to support the much more stringent demands made by the transport of real-time voice and video signals. Communication of real-time signals requires the carriage of a continuous signal (a communication flow or stream), with predictable and very high quality of service — with low and almost constant propagation delay.



Communication flows or streams require very high grade connections and are carried in
IP-based networks by means of a new type of connection-oriented network layer protocol
based on label switching (formerly called tag switching). We encountered the basic principles
of label switching in Chapter 3. Label switching takes place by assigning a flow label to IP
packets making up a given communications flow. By doing so, the normal IP-header processing
for determining the routing of packets at each router can be dispensed with, and replaced by
a much faster label switching function. Such a label switching function is defined by MPLS
(multiprotocol label switching) and is capable of direct incorporation into the flow label field of
the IPv6 header (as we saw in Chapter 5). In effect, MPLS provides an optional connection-oriented extension to the IP network layer protocol by means of an additional shim layer
between the network layer (layer 3) and the data link layer (layer 2). We shall discuss it in
greater detail later in the chapter.
As a connection-oriented network service (CONS), MPLS is used extensively to provide
VPN (virtual private network) services — in which public Internet service providers (ISPs)
offer ‘leaseline-like’ point-to-point links as router-to-router connections for private IP router
networks. In addition, MPLS is often combined with special transport layer protocols used
for bandwidth reservation and quality of service (QOS) assurance along the entire path
of the label-switched connection. The most important of these transport layer protocols is
the resource reservation protocol (RSVP). RSVP is a control channel protocol, used before
or during the set-up phase of a connection to reserve bandwidth and other network resources
for the handling and forwarding of the packets making up the communication flow or stream.
Unlike the TCP and UDP transport protocols, RSVP is not used directly for the end-to-end
carriage of user-information, and it may often be used in addition to TCP. We return to discuss
RSVP later in the chapter.

7.2 User datagram protocol (UDP)
The user datagram protocol (UDP) is the IP-suite transport protocol intended for connectionless data transport. It is defined in RFC 0768. The segment format of UDP is illustrated in
Figure 7.3: there are only four header fields: source and destination port fields, a length field
and a checksum.
The source and destination port (also called socket) fields are coded according to Table 7.1.
These provide for a multiplexing function to be undertaken by UDP — allowing segments
intended for different application software protocols to share the same network path but still
be kept apart from one another.
The length field indicates the length of the UDP segment (as shown in Figure 7.3), including
the four header fields as well as the UDP data field.
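Assuming the standard RFC 768 layout (four 16-bit fields in network byte order), a UDP segment can be sketched with Python's struct module. The function name is invented, and the optional checksum is left at zero here.

```python
import struct

def udp_segment(src_port: int, dst_port: int, data: bytes) -> bytes:
    """Build a UDP segment: four 16-bit header fields, then the data."""
    length = 8 + len(data)                   # 8-octet header plus UDP data field
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)  # checksum = 0
    return header + data

seg = udp_segment(53000, 53, b"query")       # e.g. a DNS query to port 53
```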

Figure 7.3

Segment format of the user datagram protocol (UDP).

Transmission control protocol (TCP)


The UDP checksum (which is optional) is applied not only across the UDP segment but also
includes the pseudo-UDP-header fields taken from the IP header (see Figure 7.9). The pseudoheader reflects the manner in which some application software sets up a communications path
using UDP and simultaneously issues commands to the IP forwarding software. The application
software packs the data and includes a ‘label’ (the UDP header and parts of the IP header)
to indicate the coding of the data (IP protocol and UDP port number) as well as the address
(IP-address) to which it is to be delivered.¹
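The checksum calculation itself can be sketched as follows: a one's-complement sum is taken over the pseudo-header (source and destination IP addresses, a zero octet, the protocol number 17 for UDP, and the UDP length) followed by the UDP segment, in line with RFC 768. The function names are invented for the sketch.

```python
import struct

def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum with end-around carry folding."""
    if len(data) % 2:
        data += b"\x00"                       # pad to a 16-bit boundary
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total > 0xFFFF:                     # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return total

def udp_checksum(src_ip: bytes, dst_ip: bytes, udp_segment: bytes) -> int:
    # Pseudo-header: source IP, destination IP, zero octet, protocol 17, length.
    pseudo = src_ip + dst_ip + struct.pack("!BBH", 0, 17, len(udp_segment))
    return ~ones_complement_sum(pseudo + udp_segment) & 0xFFFF
```

The defining property: once the computed checksum is written into the segment, the one's-complement sum over pseudo-header and segment folds to 0xFFFF at the receiver.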
The user datagram protocol (UDP) is little more than ‘IP with a port number added’.
For some purposes, a similar multiplexing effect to the UDP port number can be obtained
using only the Internet protocol (IP) and relying upon the protocol field for multiplexing.
This is the approach used, for example, for the carriage of the OSPF (open shortest path first)
routing protocol.

7.3 Transmission control protocol (TCP)
Transmission control protocol (TCP) is the IP-suite transport protocol which provides for reliable connection-oriented transport of data with guaranteed delivery. It is defined in RFC 0793.
The protocol has come to be the more commonly used of the two available IP-suite transport
protocols (TCP and UDP) and has for many people become synonymous with IP — hence the
frequently used terminology: TCP/IP. TCP need only be supported by the two hosts at either
end of the connection, and the requirements of these hosts are detailed in RFC 1122 (which
is essentially an implementation guide for TCP/IP).

Basic operation of the transmission control protocol (TCP)
As shown in Figure 7.4, the transmission control protocol (TCP) provides for a three-stage
process of data transport. In the first phase, a connection is set up between the two hosts which
are to communicate. This stage is called synchronisation. Following successful synchronisation
of the two hosts (shown as the ‘A-end’ and the ‘B-end’ in Figure 7.4), there is a phase of
data transfer. In order that the data transfer is reliable (i.e., that the delivery of segments can
be guaranteed), TCP undertakes acknowledgement of segments, as well as flow control and
congestion control during the data transfer phase. After the completion of data transfer, either
of the two ends may close the connection. If at any time there is a serious and unresolvable
network or connection error, the connection is reset (i.e., closed immediately).
The acknowledgement of segments confirms to the sender of the data that the receiver has
received them successfully — without errors being detected by the checksum. Any segments
lost during transmission are retransmitted (either automatically by the sender having not
received an acknowledgement or on request of the receiver).
The transmitting end takes the prime responsibility for ensuring that data is reliably
delivered, using a technique known as retransmission. The transmitter sets a retransmission
timer for each TCP segment sent. Should the timer exceed a pre-determined period of time,
known as the retransmission timeout (RTO), without having received an acknowledgement
(ACK) for the segment, then the segment is automatically retransmitted (i.e., re-sent). A
typical value for the retransmission timeout (RTO) is 3 seconds. Retransmission of the same
segment continues until either an acknowledgement (ACK) is received or until the connection

¹ Because the IP address fields are included in the UDP checksum, the protocol has to be adapted when used in
conjunction with IP version 6 (IPv6) because of the longer length of IPv6 addresses (128 bits) — IPv4 addresses
are only 32 bits long.



Figure 7.4  Basic operational phases of a TCP connection.

timeout expires — in which case the connection is reset. The reset may be necessary to recover
from a remote host or network failure.
To prevent inappropriate retransmission of packets on long delay paths (which might otherwise only lead to network congestion without improving the reliability of delivery), the
retransmission timeout (RTO) is corrected according to the round trip time (RTT). In theory,
the round trip time (RTT) is the minimum time taken for a message to be sent in one direction and immediately acknowledged in the other direction. In practice, it is determined by
the transmitter by means of the Jacobson algorithm (described in RFC 1323 and RFC 2988),
using a sample of measured RTTs of previous segments. The actual calculation continually
adapts two parameters called the smoothed RTT (SRTT) and the RTT variance (RTTVAR). The
RTO is then accordingly set (or adjusted) to correspond to a slightly longer time period than
the RTT. In other words, RTO = RTT + small margin (Figure 7.5).
The aim of constantly adjusting the RTO is to avoid the network performance degrading effects of excessive retransmission, without compromising the reliability of guaranteed
segment delivery.
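The adaptive calculation can be sketched along the lines of RFC 2988 (the gains, the initial 3-second RTO and the 1-second floor follow the RFC; the class name is invented):

```python
class RtoEstimator:
    ALPHA, BETA = 1 / 8, 1 / 4          # gains recommended by RFC 2988

    def __init__(self):
        self.srtt = None                # smoothed RTT: no sample taken yet
        self.rttvar = None              # RTT variance
        self.rto = 3.0                  # initial RTO of 3 seconds

    def sample(self, r: float) -> float:
        """Update SRTT/RTTVAR from an RTT sample r and return the new RTO."""
        if self.srtt is None:           # first measurement
            self.srtt, self.rttvar = r, r / 2
        else:
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - r)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * r
        # RTO = smoothed RTT plus a margin, floored at 1 second.
        self.rto = max(1.0, self.srtt + 4 * self.rttvar)
        return self.rto
```

In the "RTO = RTT + small margin" wording of the text, the margin here is four times the measured RTT variance.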
The use of the TCP timestamp option (TSOPT — described in RFC 1323) is helpful in
calculating the RTT more exactly. The timestamp (i.e., the time the original data segment was
sent) is returned in the acknowledgement (ACK) of the data segment. This avoids a possible
problem (following retransmission of a given segment) of not knowing for which segment an
acknowledgement has been received (the first transmission or the retransmission). When not
using a timestamp, no new calculation of RTT is undertaken for retransmitted segments. In
other words: the RTT of a retransmitted segment is not a valid sample for recalculation of the
RTT. (This is called Karn’s algorithm.)
It is important to note that not every TCP segment is individually acknowledged (ACK).
Instead, cumulative acknowledgement is undertaken. The acknowledgement does not indicate
directly which segments have been received, but instead which segment is the next expected!
By inference, all previous segments have been received. The use of such a cumulative acknowledgement procedure has a number of advantages. First, not so many acknowledgement (ACK)
messages need be sent, thereby reducing the possibility of network congestion. Second, the
return path need not be kept permanently free for sending acknowledgements. In this way
it is possible to use the connection in a duplex mode — for sending data in both directions


Figure 7.5  TCP acknowledgement of data receipt: the round trip time (RTT) and retransmission timeout (RTO).

simultaneously. A data segment sent along the ‘return’ path can simultaneously be used to
acknowledge all previous received data in the ‘forward’ path. Finally, since the ACK message
indicates the ‘next expected segment’, it provides a useful ‘trick’ for the re-ordering of segments
(i.e., requesting their immediate retransmission). The re-ordering is signalled by duplicate
ACKs — four identical ACKs (i.e., with the same value) in a row are interpreted by the
transmitter as a request for a fast retransmit.
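The cumulative ACK and duplicate-ACK behaviour described above can be sketched as follows. This is a simplification for illustration: segments are a fixed 100 octets and the class names are invented.

```python
class Receiver:
    def __init__(self):
        self.next_expected = 0

    def receive(self, seq: int, length: int) -> int:
        if seq == self.next_expected:          # in-order: advance cumulatively
            self.next_expected += length
        return self.next_expected              # ACK = next expected octet

class Transmitter:
    def __init__(self):
        self.last_ack = None
        self.dup_count = 0

    def on_ack(self, ack: int) -> bool:
        """Return True when a fast retransmit should be triggered."""
        if ack == self.last_ack:
            self.dup_count += 1
        else:
            self.last_ack, self.dup_count = ack, 0
        return self.dup_count == 3             # fourth identical ACK in a row

rx, tx = Receiver(), Transmitter()
# Five 100-octet segments arrive, but the one starting at octet 200 is lost.
acks = [rx.receive(seq, 100) for seq in (0, 100, 300, 400, 500)]
fast_retransmit = [tx.on_ack(a) for a in acks]
```

Each out-of-order arrival repeats ACK 200, and the fourth identical ACK triggers the fast retransmit of the missing segment.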
Rather than acknowledging each received segment, the cumulative acknowledgement is
undertaken regularly according to a timer (an acknowledgement must be sent at least every
0.5 seconds). In addition, at least every second segment of maximum segment size (MSS) must
be acknowledged. Having too many outstanding (unacknowledged) packets is not a good
thing, since if a retransmission is necessary, the repeated transmission of large packets adds
significantly to the network load — and thus to the chance of congestion.
At the transmitting end of the connection, segments are sent either when the maximum
segment size (MSS) is reached, when the software application forces the transmission (for
example, with a push, PSH, command) or after a given timeout expires. Push commands
are used in terminal-to-host communications protocols (e.g., Telnet). They allow terminal
commands to a mainframe computer (something typed and entered with a carriage return
key) to be immediately sent to the host. If a push command is not used, TCP segments are
forwarded regularly during the connection, according to a timer. If necessary (when there is
no ‘real data’ to be sent), keepalive segments may be sent to advise the remote application
that the connection is still live.
For data flow control and congestion control, the transmission control protocol (TCP) maintains a transmit window. The size of the window determines how many unacknowledged octets
of data the transmitter is allowed to send before it must cease transmission and wait for an
acknowledgement. By limiting the window, TCP ensures that the receiver input buffer is not
overwhelmed with data — so preventing a receive buffer overflow. Should overflow occur,
segments would not be acknowledged, and would be retransmitted, so the reliability of data delivery is not at stake — but network congestion is. Avoiding retransmission of data (when
possible) is a prudent manner of preventing network congestion.
The size of the window can be adjusted by the receiver during the course of the TCP
connection, as will become clearer later — when we review the TCP segment format and
discuss TCP flow and congestion control.
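A bare-bones sketch of such window-limited sending (the class is invented; real TCP counts unacknowledged octets against the receiver's advertised window in just this way, though with many more refinements):

```python
class WindowedSender:
    def __init__(self, window: int):
        self.window = window                  # octets advertised by the receiver
        self.outstanding = 0                  # octets sent but not yet ACKed

    def can_send(self, length: int) -> bool:
        return self.outstanding + length <= self.window

    def send(self, length: int) -> bool:
        if not self.can_send(length):
            return False                      # must wait for an acknowledgement
        self.outstanding += length
        return True

    def on_ack(self, acked_octets: int) -> None:
        self.outstanding = max(0, self.outstanding - acked_octets)

sender = WindowedSender(window=300)
sent = [sender.send(100) for _ in range(4)]   # fourth send exceeds the window
sender.on_ack(200)                            # an ACK frees window space
```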

TCP connection set-up: the three-way handshake
The setting-up of a TCP connection takes place by means of a three-way handshake, as
illustrated in Figure 7.6. The process uses a TCP synchronisation message. Unlike a telephone
call, the connection does not comprise a dedicated physical circuit, and unlike X.25 (which we
discussed in Chapter 3), there is not really a virtual circuit either. Instead the connection set-up
establishes simply that a transmission path exists between the ‘caller’ and the ‘destination’
and that both are ready to ‘talk’ to one another. The three messages which form the three-way
handshake effectively have the following functions:
• first synchronisation message (‘can we talk on port x? If so, I will number messages in
order, starting with message number y’);
• synchronisation acknowledgement message (‘yes, I understand the protocol used on this
port number and am ready to talk now. My messages to you will be numbered too, starting
with message number z’); and
• handshake acknowledgement (‘I confirm that I have understood your messages’). The
third and final message of the handshake confirms to both parties that both directions
of transmission are working correctly — that duplex communication is possible. Without
duplex communication, the protocol will not work.
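The exchange can be sketched as a toy simulation, in which dictionaries stand in for TCP segments and only the initial-sequence-number (ISN) and acknowledgement-number bookkeeping of the three messages is modelled:

```python
import random

def three_way_handshake():
    # First message: SYN carries the caller's randomly chosen ISN y.
    y = random.randrange(2**32)
    syn = {"flags": "SYN", "seq": y}

    # Second message: SYN-ACK carries the destination's ISN z
    # and acknowledges the caller's ISN (y + 1).
    z = random.randrange(2**32)
    syn_ack = {"flags": "SYN+ACK", "seq": z, "ack": (syn["seq"] + 1) % 2**32}

    # Third message: the final ACK confirms both directions work (z + 1).
    ack = {"flags": "ACK", "seq": (y + 1) % 2**32, "ack": (syn_ack["seq"] + 1) % 2**32}
    return syn, syn_ack, ack

syn, syn_ack, ack = three_way_handshake()
```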

Figure 7.6  TCP connection set-up (synchronisation) — by means of a 3-way handshake.


Figure 7.7  Transmission control protocol (TCP) segment format.

At the time of connection set-up, the initial connection parameter values are set: the retransmission timeout (RTO) is typically set to 3 secs; the round-trip time (RTT) is set to 0 secs; and
the initial window (IW) size is set to a value in octets equivalent to one segment. In addition,
the initial sequence numbers (ISNs) are set. The sequence numbers (sequence and acknowledgement numbers) are the segment message numbers. Segments are numbered consecutively
according to how many octets they contain. The ISNs are random numbers chosen by the two hosts.
Transmission control protocol (TCP) segment format
Figure 7.7 illustrates the TCP segment format. The standard segment² header is 20 octets long.

Source and destination port fields
The source port field (possible values 0–65 535) identifies the application software in the
transmitting host. The destination port field (possible values 0–65 535) identifies the intended
destination application software in the receiving host. The two fields need not carry the same
value: the destination port of a request normally identifies a well-known service, while the
source port is chosen by the requesting host. The most commonly used port values (also
called socket numbers) are listed in Table 7.1.

Sequence number and acknowledgement number fields
The sequence number commences at the initial sequence number (ISN) number, which is
chosen randomly by the transmitter, and each subsequent data message (i.e., segment) has a
sequence number accordingly greater than the previous segment. The increment in the sequence
number depends upon the number of octets in the previous segment. The sequence number
thus counts octets, but its value also uniquely identifies a particular segment, thereby enabling
the receiver to:
• check that it has received all octets (and therefore all segments) of the user data;
• unambiguously acknowledge the receipt of all segments back to the transmitter;
• order the retransmission of any missing segments; and
• re-sequence segments into the correct order, should they be received out-of-order.

² At protocol layer 4 (transport layer) a block of data is called a segment, whereas at protocol layer 3 (network layer) we refer to packets and at protocol layer 2 (datalink layer) we refer to frames — the data starts life as a segment, gets packed into a packet, which in turn gets packed into a frame, as we saw in Chapter 3 (Figure 3.28).
The acknowledgement number field is used to return the sequence number of the next awaited
octet (by inference indicating the cumulative octets successfully received so far).
Both the sequence number and the acknowledgement number can be incremented up to
the value 2³² − 1 (equivalent to around 4 billion octets — around 4 Gbytes), before the value
is wrapped (i.e., reset to zero) and counting starts from ‘0’ again. Sooner or later, the
total number of octets sent may exceed 4 Gbytes, whereupon a previously used sequence
number will be re-used (see Table 7.2).
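Sequence-number arithmetic modulo 2³² can be sketched as follows. The helper names are invented; the comparison trick treats differences of less than half the number space as ‘later’.

```python
MOD = 2**32

def next_seq(seq: int, octets: int) -> int:
    """Advance a sequence number by the octets sent, wrapping at 2**32."""
    return (seq + octets) % MOD

def seq_after(a: int, b: int) -> bool:
    """True if sequence number a comes after b, allowing for wrap-around."""
    return 0 < (a - b) % MOD < MOD // 2

# Sending 500 octets from near the top of the space wraps through zero.
seq = next_seq(MOD - 100, 500)
```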
The re-use of sequence numbers can cause a problem, if there is any chance that two
segments might be in existence at the same time with the same segment number. Bear in
mind here, that the TCP specification defines a maximum segment lifetime (MSL) of 120
seconds — thus limiting the data transfer rate to 4 Gbytes in 120 seconds. But working at its
maximum speed of data transfer, a Gigabit ethernet interface will be able to transfer 4 Gbytes of
data in around 34 seconds. In other words, high speed interfaces present a real risk of multiple
different segments bearing the same sequence number. A solution to the problem is defined
in RFCs 1185 and 1323. The solution is called PAWS (protect against wrapped sequences).
PAWS (protect against wrapped sequences) is intended to be used on very high-speed
TCP connections. Instead of simply increasing the size of the TCP sequence number (and
thereby reducing the frequency of sequence number wrapping), the PAWS solution to wrapped
sequences is to include a timestamp with each transmitted segment. The timestamp creates
a clear distinction between segments with duplicate wrapped sequence numbers without having
to increase the sequence number field length.³
Table 7.2  Time elapsed per TCP sequence number wrap cycle at full speed of transmission

Network or interface type        Bit rate        Time to sequence number wrapping
Digital leaseline                56 kbit/s       7.1 days
Digital leaseline                64 kbit/s       6.2 days
Digital leaseline                1.5 Mbit/s      6.2 hours
Digital leaseline                2 Mbit/s        4.7 hours
10baseT ethernet                 10 Mbit/s       57.3 mins
Digital leaseline                34 Mbit/s       16.8 mins
Digital leaseline                45 Mbit/s       12.7 mins
Fast ethernet (100baseT)         100 Mbit/s      5.7 mins
STM-1 (OC-3) line                155 Mbit/s      3.7 mins
Gigabit ethernet (1000baseX)     1000 Mbit/s     34 secs
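The wrap times of Table 7.2 follow directly from dividing the 2³² octets of sequence-number space (8 × 2³² bits) by the interface bit rate:

```python
def wrap_time_seconds(bit_rate: float) -> float:
    """Time to send 2**32 octets at the given bit rate (bits per second)."""
    return 8 * 2**32 / bit_rate

gigabit = wrap_time_seconds(1e9)           # Gigabit ethernet: ~34 seconds
ten_base_t = wrap_time_seconds(10e6)       # 10baseT ethernet: ~57 minutes
```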

³ Increasing the length of the sequence number field would have meant that devices with this ‘new version of
TCP’ would be incompatible with existing devices using ‘previous version TCP’. By instead using a protocol
option field — the timestamp field — protocol compatibility can be maintained. Of course there is a chance that
older devices may not correctly interpret and use the option — but there again, these are unlikely to be devices
with high speed interfaces!