2.11 Reducing errors — regeneration, error detection and correction

Figure 2.15 The principle of regeneration.

Note: As an aside, the example of Figure 2.16 is somewhat unusual, since it would not be normal to
send two simple parity bits; the second parity bit provides little extra information about errors
beyond that already provided by the first.

Figure 2.16 The principle of error detection.

In addition to regeneration, which is generally used on all digital line systems, it is common
in data communication also to apply error detection and/or error correction. Error detection is
carried out by splitting the data to be sent into a number of blocks. For each block of data
sent, a number of error check bits are appended to the back end of the block (Figure 2.16).
The error check bits (also sometimes called the FCS or frame check sequence) are used to
verify a successful and error-free transmission of the data block to which they relate. In the
example in Figure 2.16, the data block is 4 bytes long (32 bits) and there are two error check
bits: an even parity bit and an odd parity bit. The even parity bit is set so that the
total number of bits of binary value ‘1’ within the data block and the parity bit (33 bits in
total) is even. Since there are 17 bits of value ‘1’ in the data block, the even parity bit is
set to value ‘1’ to make the total number of ‘1’ values 18 and thus even. Similarly, the odd
parity bit is set to make the total number of ‘1’ values odd. If on receipt of the new extended
block, the total number of bits set at binary value ‘1’ does not correspond with the indication
of the parity bits, then there is an error in the message.
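To make the parity calculation concrete, here is a minimal Python sketch (an illustration only, assuming the data block is held as a string of ‘0’/‘1’ characters) of how the two parity bits of the Figure 2.16 example could be derived.

```python
def parity_bits(block: str) -> tuple[str, str]:
    """Compute even and odd parity bits for a block of '0'/'1' characters."""
    ones = block.count("1")
    even_parity = "1" if ones % 2 == 1 else "0"   # make the overall count of '1's even
    odd_parity = "1" if ones % 2 == 0 else "0"    # make the overall count of '1's odd
    return even_parity, odd_parity

# A 32-bit block containing seventeen '1's, as in the example of Figure 2.16
block = "1" * 17 + "0" * 15
assert parity_bits(block) == ("1", "0")
```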
Sophisticated error detection codes even allow for the correction of errors. Forward error
correction (FEC) codes allow an error (or a number of errors) in the data block to be corrected
by the receiver without further communication with the transmitter. Alternatively, without
using forward error correction, the detection of an error could simply lead to the re-transmission
of the affected data block. Different data protocols either use parity checks, forward error
correction (FEC), data retransmission or no error correction as their means of eliminating
bit errors.
Error detection and correction have nowadays become a sophisticated business. Standard
means of error detection and correction are provided by cyclic redundancy checks (CRCs),
Hamming codes, Reed-Solomon and Viterbi codes. These methods use extra bits and complex
mathematical algorithms to dramatically reduce the probability of errors surviving detection
and correction. They therefore greatly increase the dependability of transmission systems. The
most common detection and correction technique used by data protocols intended for terrestrial
networks is the cyclic redundancy check (CRC) code, which we discuss next.

Cyclic redundancy check (CRC) codes
Error detection and correction codes can be divided into cyclic and non-cyclic codes. Cyclic
codes are a special type of error-correcting block code, in which the valid codewords3 are
simple lateral shifts of one another.
To illustrate a cyclic code, let us consider a (10, 3) code (the codewords are 10 bits long in
total, comprising 7 bits of user data and 3 error check bits). If we assume that the following
is a valid codeword of our code:

c = (1 0 1 0 1 0 0 1 0 0)

then c = (0 1 0 1 0 0 1 0 0 1) must also be a valid codeword, if the code is cyclic. (All bits
are shifted one position to the left, with the extreme left-hand bit of the original codeword
reinserted on the right-hand end.) The other 8 codewords of this CRC are:
c = (1 0 1 0 0 1 0 0 1 0)        c = (0 1 0 0 1 0 1 0 1 0)
c = (0 1 0 0 1 0 0 1 0 1)        c = (1 0 0 1 0 1 0 1 0 0)
c = (1 0 0 1 0 0 1 0 1 0)        c = (0 0 1 0 1 0 1 0 0 1)
c = (0 0 1 0 0 1 0 1 0 1)        c = (0 1 0 1 0 1 0 0 1 0)
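All of these codewords can be generated mechanically; the short Python sketch below (purely illustrative) produces the ten cyclic shifts of the original codeword.

```python
def cyclic_shifts(codeword: str) -> list[str]:
    """Return every left cyclic shift of a codeword written as a '0'/'1' string."""
    return [codeword[i:] + codeword[:i] for i in range(len(codeword))]

# The ten codewords of the (10, 3) example, starting from c = 1010100100
for shift in cyclic_shifts("1010100100"):
    print(shift)
```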

Cyclic codes are capable of correcting larger numbers of errors within the data block than are
non-cyclic codes. A CRC-n is an n-bit cyclic redundancy check code. The value of the code
is set by first multiplying the data block by a given multiplier, then dividing (in binary, or
modulo 2) the result of the multiplication by a generator polynomial. The remainder is used
to set the value of the CRC field. A number of common CRC codes are shown in Table 2.5.
Table 2.5  Common cyclic redundancy check (CRC) codes

CRC-n type   Field multiplier                          Generator polynomial
CRC-1                                                  (simple parity check)
CRC-3        x^3 (in other words: 1000 Binary)         x^3 + x + 1 (1011 B)
CRC-4        x^4 (10 000 B)                            x^4 + x + 1 (10 011 B)
CRC-5        x^5 (100 000 B)                           x^5 + x^4 + x^2 + 1 (110 101 B)
CRC-6        x^6 (1 000 000 B)                         x^6 + x + 1 (1 000 011 B)
CRC-7        x^7 (10 000 000 B)                        x^7 + x^3 + 1 (10 001 001 B)
CRC-10       x^10 (10 000 000 000 B)                   x^10 + x^9 + x^5 + x^4 + x + 1 (11 000 110 011 B)
CRC-16       x^16 + x^15 + x^14 + x^13 + x^12 + x^11 + x^10 + x^9 + x^8 + x^7 + x^6 + x^5 + x^4 + x^3 + x^2 + x + 1 (11 111 111 111 111 111 B)   x^16 + x^12 + x^5 + 1 (10 001 000 000 100 001 B)

3 The codeword is a fancy name for the resulting pattern of bits which is transmitted after adding the error
check bits. The codeword thus comprises both the original user data block (the first 32 bits of Figure 2.16) and
the error check bits (the last 2 bits). One talks of (x, y) codewords, where x is the total number of bits in the
codeword (in the case of Figure 2.16, x = 34) and y is the number of error check bits (y = 2 in Figure 2.16).
Thus, the example of Figure 2.16 is a (34,2) codeword.


Other well-known cyclic redundancy check codes include BCH (Bose-Chaudhuri-Hocquenghem) and Reed-Solomon codes.
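To make the ‘multiply, then divide modulo 2’ procedure concrete, the following Python sketch performs the long division by XOR. It illustrates the principle only; real protocols add details (initial register values, bit ordering, final inversion) that are not shown here, and the sample data block is arbitrary.

```python
def crc_remainder(data_bits: str, generator_bits: str) -> str:
    """CRC by mod-2 (XOR) long division of the data block by the generator."""
    n = len(generator_bits) - 1           # degree of the generator polynomial
    dividend = list(data_bits + "0" * n)  # multiply the data block by x^n first
    for i in range(len(data_bits)):
        if dividend[i] == "1":            # subtract (XOR) the generator wherever a '1' leads
            for j, g in enumerate(generator_bits):
                dividend[i + j] = str(int(dividend[i + j]) ^ int(g))
    return "".join(dividend[-n:])         # the remainder becomes the CRC field

# CRC-4 from Table 2.5: generator polynomial x^4 + x + 1 (10 011 B)
print(crc_remainder("11010011101100", "10011"))
```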

2.12 Synchronisation
The successful transmission of data depends not only on the accurate coding of the transmitted
signal (e.g., using the ASCII code, as we discussed earlier), but also on the ability of the
receiving device to decode the signal correctly. This calls for accurate synchronisation of the
receiver with the transmitter, so that the beginning and end of all received bits occur at regular
and predictable intervals. For the purpose of synchronisation a highly accurate clock must be
used in both transmitter and receiver.
It is usual for the receiver to sample the communication line at a rate much faster than
that of the incoming data, thus ensuring a rapid detection of any change in line signal state,
as Figure 2.17c shows. Theoretically it is only necessary to sample the incoming data at a
rate equal to the nominal bit rate of the signal, but this runs a risk of data corruption. If we
chose to sample near the beginning or end of each bit (Figures 2.17a and 2.17b) we might
lose or duplicate data as the result of a slight fluctuation in the time duration of individual
bits. Much faster sampling ensures rapid detection of the start of each ‘0’ to ‘1’ or ‘1’ to ‘0’
transition. In this case, the signal transitions are interpreted as bits, and the exact clock rate
of the transmitter can be determined.
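A rough Python sketch of the idea follows; it assumes (for illustration only) that the receiver has already digitised its oversampled view of the line into a list of 0/1 samples and simply looks for the state changes from which the transmitter's clock can be estimated.

```python
def find_transitions(samples: list[int]) -> list[int]:
    """Return the sample indices at which the line changes state ('0' to '1' or '1' to '0')."""
    return [i for i in range(1, len(samples)) if samples[i] != samples[i - 1]]

# Eight samples per bit (an illustrative oversampling factor) for the pattern 1 0 1 1
samples = [1] * 8 + [0] * 8 + [1] * 16
print(find_transitions(samples))   # -> [8, 16]: the bit boundaries that carry transitions
```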
Variations in the clock rate arising from different time durations of individual received bits
come about because signals are liable to encounter time shifts during transmission, which may
or may not be the same for all bits within the message. These variations are usually random
and they combine to create an effect known as jitter. As we showed in Figure 2.13d, jitter can
lead to bit errors.
The purpose of synchronisation is to remove all short-, medium- and long-term effects of
timing or clocking differences between the transmitter and the receiver. Short-term variations
are called jitter; long-term variations are called wander.
In the short term, synchronisation between transmitter and receiver takes place at a bit
level, by bit synchronisation. This keeps the transmitting and receiving clocks in step, so
that bits start at predictable instants and have pulses of constant duration. (Recall the errors
which occurred as a result of incorrect pulse durations in Figures 2.14c and 2.14d.) Medium-term
character or word synchronisation prevents confusion between the last few bits of one character
and the first few bits of the next. If we interpret the bits wrongly, we end up with the wrong
characters, as we found out in our earlier example, when the message ‘Greetings’ was turned
into gibberish. Finally there is frame synchronisation, which ensures data reliability and
integrity over longer time periods.

Figure 2.17 Effect of sampling rate.


Figure 2.18 Commonly used line codes for digital line systems (V = violation).

Bit synchronisation using a line code
The sequence of ones and zeros (marks and spaces) making up a digital signal is not usually
sent directly to line, but is first arranged according to a line code. The line code serves
the purpose of bit synchronisation.
One of the biggest problems to be overcome by bit synchronisation is that if either a
long string of 0’s or 1’s were sent to line consecutively, then the line would appear to be
either permanently ‘on’ or permanently ‘off’ — effectively a direct current (DC) condition
is transmitted to line. This is not advisable for two reasons. First, the power requirement is
increased and the attenuation is greater for direct current (DC) as opposed to alternating current
(AC). Second, it may be unclear how many bits of value ‘0’ have been sent consecutively. It
may even be unclear to the receiver whether the line is actually still ‘alive’. The problem gets
worse as the number of consecutive 0’s or 1’s increases. Line codes therefore seek to ensure
that a minimum frequency of line state changes is maintained.
Figure 2.18 illustrates some of the most commonly used line codes. Generally they all seek
to eliminate long sequences of 1’s or 0’s, and try to be balanced codes, i.e., producing a net
zero direct current voltage. Thus, for example, the three-state codes AMI and HDB3 try to
negate positive pulses with negative ones. This reduces the problems of transmitting power
across the line.
The simplest line code illustrated in Figure 2.18 is a non-return to zero (NRZ) code in
which ‘1’ = ‘on’ and ‘0’ = ‘off’. This is perhaps the easiest to understand. All our previous
examples of bit patterns were, in effect, shown in a non-return to zero (NRZ) line code format.
In NRZI (non-return-to-zero inverted) it is the presence or absence of a transition (a transition is a change of line state, either from ‘1’ to ‘0’ or from ‘0’ to ‘1’) which represents a ‘0’
or a ‘1’. Such a code is technically quite simple (even if confusing to work out) and may be
advantageous where the line spends much of its time in an ‘idle’ mode in which a string of
‘0s’ or ‘1s’ would otherwise be sent. Such is the case, for example, between an asynchronous
terminal and a mainframe computer. NRZI is used widely by the IBM company for such
connections.
A return-to-zero (RZ) code works in a similar manner to NRZ, except that marks return to
zero midway through the bit period, and not at the end of the bit. Such coding has the advantage
of lower required power and constant mark pulse length in comparison with basic NRZ. The
length of the pulse relative to the total bit period is known as the duty cycle. Synchronisation
and timing adjustment can thus be achieved without affecting the mark pulse duration.
A variation of the NRZ and RZ codes is the CMI (coded mark inversion) code recommended
by ITU-T. In CMI, a ‘0’ is represented by the two signal amplitudes A1, A2 which are
transmitted consecutively, each for half the bit duration. ‘1s’ are sent as full bit duration
pulses of one of the two line signal amplitudes, the amplitude alternating between A1 and A2
between consecutive marks.
In the Manchester code, a higher pulse density helps to maintain synchronisation between
the two communicating devices. Here the transition from high-to-low represents a ‘1’ and the
reverse transition (from low-to-high) a ‘0’. The Manchester code is used in ethernet LANs.
In the differential Manchester code a voltage transition at the bit start point is generated
whenever a binary ‘0’ is transmitted but remains the same for binary ‘1’. The IEEE 802.5
specification of the token ring LAN demands differential Manchester coding.
In the Miller code, a transition either low-to-high or high-to-low represents a ‘1’. No
transition means a ‘0’.
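The Manchester rule is easy to express in code. The sketch below follows the convention described above (high-to-low for ‘1’, low-to-high for ‘0’); some standards define the two transitions the other way round, so treat this as an illustration rather than a definitive encoder.

```python
def manchester_encode(bits: str) -> list[int]:
    """Encode '0'/'1' characters as pairs of half-bit line states."""
    half_bits = {"1": [1, 0],   # '1' = high-to-low transition
                 "0": [0, 1]}   # '0' = low-to-high transition
    encoded: list[int] = []
    for b in bits:
        encoded.extend(half_bits[b])
    return encoded

print(manchester_encode("1100"))   # -> [1, 0, 1, 0, 0, 1, 0, 1]
```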
The AMI (alternate mark inversion) and HDB3 (high density bipolar) codes defined by
ITU-T (recommendation G.703) are both three-state, rather than simple two-state (on/off),
codes. In these codes, the two extreme states (if you like, ‘+’ and ‘−’) are used to represent
marks (value ‘1’) and the mid-state is used to represent spaces (value ‘0’). The three states
are often realised as ‘positive’ and ‘negative’ values, with a mid-value of ‘0’; or, in the case
of optical fibres, where light is used, the three states could be ‘off’, ‘low intensity’ and
‘high intensity’. In both AMI and HDB3 line codes, alternate marks are sent as positive and
negative pulses. Alternating the polarity of the pulses helps to prevent direct current being
transmitted to line. (In a two-state code, a string of marks would have the effect of sending
a steady ‘on’ value to line.)
The HDB3 code (used widely in Europe and on international transmission systems) is an
extended form of AMI in which the number of consecutive zeros that may be sent to line
is limited to 3. Limiting the number of consecutive zeros brings two benefits: first, a null
signal is avoided, and second, a minimum mark density can be maintained (even during idle
conditions such as pauses in speech). A high mark density aids the regenerator timing and
synchronisation.
In HDB3, the fourth zero in a string of four is marked (i.e., forcibly set to 1) but this
is done in such a way that the ‘zero’ value of the original signal may be recovered at the
receiving end. The recovery is achieved by marking the fourth zero in violation, that is to say
in the same polarity as the previous mark, rather than in the opposite polarity (opposite
polarity of consecutive marks being the normal procedure).
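As an illustration of the three-state principle, the sketch below encodes a bit string using plain AMI, representing the line states as +1, 0 and −1. The HDB3 zero-substitution rule (forcing the fourth consecutive zero to a violation) is deliberately not shown, since it also involves balancing rules that go beyond this simple example.

```python
def ami_encode(bits: str) -> list[int]:
    """Alternate mark inversion: spaces stay at 0, marks alternate between +1 and -1."""
    encoded, last_mark = [], -1
    for b in bits:
        if b == "0":
            encoded.append(0)               # a space is sent as the mid-state
        else:
            last_mark = -last_mark          # alternate the polarity of successive marks
            encoded.append(last_mark)
    return encoded

print(ami_encode("101100"))   # -> [1, 0, -1, 1, 0, 0]
```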
Other line codes used on WAN lines (particularly in North America) include B8ZS (bipolar
8 zero substitution) and ZBTSI (zero byte time slot interchange).

Character synchronisation — synchronous and asynchronous data transfer
Character synchronisation ensures that the receiver knows which is the first bit of each
character code pattern. Misplacing the first bit can change the interpretation of the character.
(In our earlier example, the message ‘Greetings’ became gibberish.)


Let us consider the following sequence of 9 received bits:

(last bit received)   0 0 1 1 0 0 1 1 0   (first bit received)

As a ‘raw stream’ of received bits, it is difficult to determine which 8 bits (when grouped
together) represent an ASCII character. If we assume that the first 8 bits represent a character,
then the value is ‘01100110’ and the character decoded is ‘f’ (see Table 2.3). On the other
hand, if the first bit shown is the last bit of the previous character, then the code is ‘00110011’
and the decoded character is ‘3’. So how do we determine whether ‘f’ or ‘3’ is meant? The
answer is by means of character synchronisation.
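The ambiguity can be seen directly by decoding the nine bits both ways. The Python fragment below keeps the display convention of the example (last bit received on the left, first on the right).

```python
received = "001100110"            # the nine bits of the example, last bit on the left

first_eight = received[1:]        # take the first eight bits received as one character
previous_split = received[:-1]    # or assume the right-most bit ended the previous character

print(chr(int(first_eight, 2)))       # 'f'  (0110 0110)
print(chr(int(previous_split, 2)))    # '3'  (0011 0011)
```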
Character synchronisation (or byte synchronisation) can be achieved using either asynchronous mode transmission or synchronous mode transmission.

Asynchronous transmission
In asynchronous data transfer each data character (represented, say, by an 8-bit ‘byte’) is
preceded by a few additional bits, which are sent to mark (or delineate) the start of the
8-bit string to the receiver. This ensures that character synchronisation of the transmitting
and receiving devices is maintained.
When a character (consisting of 8 bits) is ready to be sent, the transmitter precedes the
8-bit pattern with an extra start bit (value ‘0’), then it sends the 8 bits, and finally it suffixes
the pattern with two ‘stop bits’, both set to ‘1’.4 The total pattern appears as in Figure 2.19,
where the user’s eight bit pattern 00110011 is being sent.
In asynchronous transmission, the line is not usually in constant use and the spacing
of characters need not be regular. The idle period between character patterns (the quiescent
period ) is filled by a string of 1’s which serve to ‘exercise’ the line. The receiver can recognise
the start of a new character by the presence of the start bit transition (from state ‘1’ to state ‘0’).
The following 8 bits then represent the character pattern and are followed by the two stop bits
(Figure 2.19).
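A small sketch of the framing rule follows, building the 11-bit pattern of Figure 2.19 for the user byte 00110011; the helper name is of course illustrative.

```python
def frame_async(byte_bits: str, stop_bits: int = 2) -> str:
    """Frame one character: a start bit of '0', the eight data bits, then stop bits of '1'."""
    return "0" + byte_bits + "1" * stop_bits

frame = frame_async("00110011")
print(frame)                            # -> 00011001111 (11 bits sent to line)
print(f"{8 / len(frame):.0%} useful")   # -> 73%: only 8 of the 11 bits carry user data
```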
The advantage of asynchronous transmission lies in its simplicity. The start and stop bits
sent between characters help to maintain synchronisation without requiring very accurate
clock hardware in either the transmitter or the receiver. As a result, asynchronous devices

Figure 2.19 Asynchronous data transfer.
4
Usually nowadays, only one stop bit is used. This reduces the overall number of bits which need to be sent
to line to convey the same information by 9%.


Figure 2.20 Synchronous data transfer.

can be made quite simply and cheaply. Asynchronous transmission is widely used between
computer terminals and the computers themselves because of the simplicity and cheapness of
terminal design. Given that human operators type at indeterminate speeds and sometimes leave
long pauses between characters, asynchronous transmission is ideally suited to this use. The
disadvantage of asynchronous transmission lies in its relatively inefficient use of the available
bit speed. As we can see from Figure 2.19, out of 11 bits sent along the line, only 8 (i.e.,
73%) represent useful information.

Synchronous transmission
In synchronous mode transmission, data characters (usually a fixed number of bits, or one or
more bytes) are transmitted at a regular periodic rate.
Synchronous data transfer is the most commonly used mode in modern data communications. In synchronous data transfer, the data transmitted and received must be clocked at a
steady rate. A highly accurate clock is used at both ends, and a separate circuit may be used to
transmit the timing between the two. Provided all the data bit patterns are of equal length,
the start of each is known to follow immediately after the previous character. The advantage of
synchronous transmission is that much greater line efficiency is achieved (since no start and
stop bits need be sent for each character). The disadvantage is that the complexity of the
clocking hardware increases the cost as compared with asynchronous transmission equipment.
Byte synchronisation is established at the very beginning of the transmission or after a
disturbance or line break using a special synchronisation (SYN) pattern, and only minor adjustments are needed thereafter. Usually an entire block of user information is sent between the
synchronisation (SYN) patterns, as Figure 2.20 shows. The SYN byte shown in Figure 2.20
is a particular bit pattern, chosen so that it can be distinguished from the user data.
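The following sketch shows the idea of byte alignment in a simplified form: it scans a received bit string for the ASCII SYN control character (0x16, bit pattern 00010110) and returns the byte-aligned data that follows. A real receiver works on a continuous stream and re-checks its alignment, so treat this purely as an illustration.

```python
SYN = "00010110"   # the ASCII SYN control character (0x16)

def align_to_syn(bitstream: str) -> str:
    """Find the SYN pattern and return the user data that follows it, byte-aligned."""
    start = bitstream.find(SYN)
    if start < 0:
        raise ValueError("no SYN pattern found")
    return bitstream[start + len(SYN):]

# Some leading line noise, then a SYN byte, then the two user characters 'A' and 'B'
stream = "110" + SYN + "01000001" + "01000010"
print(align_to_syn(stream))   # -> '0100000101000010'
```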

2.13 Packet switching, protocols and statistical multiplexing
The need for packet switching
Until the 1970s, wide area networks (i.e., nationwide or international networks spanning long
distances) were predominantly circuit-switched networks. Circuit switching is the technology
of telephone communication in which a circuit of a fixed bandwidth is ‘permanently’ allocated to the communicants for the entire duration of a conversation or other communication.
(In telephone networks the ‘permanently’ allocated circuit bandwidth is 3.1 kHz. In modern
digital telephone networks (called ISDN — integrated services digital network) the bandwidth
is 64 kbit/s). But while such networks can be used for the carriage of data, they are not ideally
suited to data communication.


The main limitation of circuit-switched networks when used for data transport is their
inability to provide variable bandwidth connections. When only a narrow bandwidth (or low
bit-rate) is required compared with that of the standard circuit bandwidth, then the circuit is
used inefficiently (under-utilised). Conversely, when short bursts of much higher bandwidth are
required (for example, when a large computer file is to be sent from one computer to another
or downloaded from the Internet), there may be considerable data transmission delays, since
the circuit is unable to carry all the data quickly. A more efficient means of data conveyance,
packet switching, emerged in the 1970s. Packet switching has become the basis of most modern
data communications, including the Internet protocol (IP), X.25 ‘packet-switched’ networks,
frame relay and local area networks (LANs).

Packets and packet formats
Packet switching is so-called because the user’s overall message is broken up into a number
of smaller packets, each of which is sent separately. Packets are carried from node to node
across the network in much the same way in which packages make their way from one post office to
the next in a postal delivery network (recall Figure 1.1 of Chapter 1). To ensure that the data
packets reach the correct destinations, each packet of data must be labelled with the address
of its intended destination. In addition, a number of fields of ‘control information’ are added
(like the stickers on a parcel ‘for internal post office use’). For example, each packet of data
can be protected against errors by adding a frame check sequence (FCS) of error check bits.
A SYN byte in the packet header also serves the purpose of synchronisation. Figure 2.21
illustrates the typical format of a data packet, showing not only the FCS and SYN fields, but
also some of the other control fields which are added to the user data (or payload).
The flag delimits the packet from the previous packet and provides for synchronisation. In
conjunction with the packet length field, it prepares the receiver for the receipt of the packet,
enabling the receiver to determine when the frame check sequence (FCS) for detecting errors
in the packet will start.
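As a rough illustration of how such a packet might be represented in software, the sketch below groups the fields of Figure 2.21 into a simple record; the field names, widths and example values are assumptions for illustration, not those of any particular protocol.

```python
from dataclasses import dataclass

@dataclass
class Packet:
    flag: str                  # delimiter / synchronisation pattern at the start of the packet
    length: int                # packet length, so the receiver knows where the FCS will start
    destination_address: str   # the port to which the network must deliver the packet
    source_address: str        # the originator, needed e.g. to report non-delivery
    payload: bytes             # the user data being carried
    fcs: int                   # frame check sequence protecting the packet against errors

# Purely illustrative values
example = Packet(flag="01111110", length=5, destination_address="B",
                 source_address="A", payload=b"hello", fcs=0x1D0F)
```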
The destination address tells the network to which destination port the packet must be
delivered. The source address identifies the originator of the packet. This information is
important in order that the sender can be informed if the packet cannot be delivered. It

Figure 2.21 Typical data packet format.