Title of Invention

"METHOD OF PREVENTING DISRUPTION OF MEDIA STREAM TRANSMITTED ACROSS LOSSY NETWORK"

Abstract Devices and techniques for overcoming lost packets and avoiding congestion when conducting media conferences over packet switched networks are described herein. To avoid the problem of lost packets, redundant information is inserted into the media stream that permits the receiver to reconstruct any lost packets from the redundant information. Congestion avoidance techniques include adjusting the bitrate of the media stream to find the highest bitrate that can be supported without packet loss due to congestion. When increasing the bitrate to a higher rate, the additional bits can come from the redundant information used for lost packet recovery so that any lost packets caused by network congestion will not adversely affect the bitstream.
Full Text SYSTEM AND METHOD FOR LOST PACKET RECOVERY
WITH CONGESTION AVOIDANCE
BACKGROUND
[0001] Increasingly, videoconferencing systems have begun to use packet-
switched networks, such as the Internet, rather than circuit-switched networks,
such as PSTN and ISDN. One problem with the use of packet-switched networks
for videoconferencing is that all packet-switched networks will experience some
degree of packet loss. Video quality is signficantly impaired with even a modest
packet loss.
[0002] Existing solutions to the problem of packet loss include the use of
various error concealment techniques and picture refresh mechanisms. Though
helpful, these techniques are often inadequate. They also are frequently integrated
with specific video codecs, requiring the techniques to be re-invented for each
advance in codec technology.
[0003] Packet-switched networks typically have at least two fundamental
types of packet loss. Physical loss can occur when layers 1 and 2 of networks lose
information that is not recovered. This type of loss is more common, for example,
when radio links (e.g., WiFi networks such as 802.11a, 802.11b, 802.11g, and
802.11n) are employed. However, physical layer loss can occur on wired and
fiber links as well. A second type of packet loss can result from network
congestion.
[0004] To overcome this second type of packet loss, some
videoconferencing devices employ various techniques of congestion avoidance.
One such technique is described in U.S. Patent Application 10/305,485, entitled
"System and Method for Dynamic Bandwidth Allocation for Videoconferencing
in Lossy Packet Switched Systems," filed November 26, 2002, which is
incorporated by reference in its entirety. Such technqiues may be referred to as
"Dynamic Bandwidth Allocation" or "DBA." These types of DBA algorithms
inherently risk introducing packet loss whenever they attempt to increase the
bandwidth, which can result in impaired media. Other types of congestion
avoidance techniques include VCON's PacketAssist and other standards relating

to either lost packet recovery or congestion control. For instance, RFC 2733
provides a method for lost packet recovery which employs XOR (parity) packets.
RFC 3448 provides a method a unicast congestion control, as does RFC 4340.
[0005] Additionally, lost data recovery techniques such as forward erasure
correction and forward error correction can be employed on layers 1 and 2 of
networks. These techniques have also been used with stored media such as RAID
disks and CDROMs/DVDs. 3GPP has recently standardized forward erasure
correction. Forward error correction can also provided for video streams
transmitted via H.320.
[0006] Heretofore, there has been no implementation integrating lost
packet recovery (e.g., using forward erasure correction) with congestion
avoidance. Described herein is an algorithm that includes such integration.
SUMMARY
[0007] In one aspect, the present invention relates to a method of
preventing disruption of a media stream transmitted across a lossy network. The
method can include applying a lost packet recovery algorithm to the media stream.
The lost packet recovery mechanism can operate by inserting redundant
information into a transmitted data stream that includes the media stream. The
redundant information can be generated using forward erasure correction and/or
Reed-Solomon encoding. The method can further include applying a congestion
avoidance algorithm to the transmitted data stream. The congestion avoidance
algorithm can include temporarily increasing a data rate of the transmitted data
stream to determine whether the network can support a higher data rate. The data
rate can be temporarily increased by increasing the amount of redundant
information inserted into the transmitted data stream.
[0008] The present invention can also relate to a method of recovering
media data received over a lossy network, wherein the media stream was
transmitted as part of a data stream comprising the media stream and redundant
information. The recovery method can include receiving a plurality of packets
containing the media stream and the redundant information and reconstructing lost

[0009] The present inverition can also related to a videoconferencing device
adapted to perform data recovery and congestion avoidance as described above.
BRIEF DESCRIPTION OF THE DRAWINGS
[0010] Figure 1 illustrates a lost packet recovery technique that may be
used to protect against packet loss.
[0011] Figure 2 illustrates a lost packet recovery technique in which
certain packets are lost and the data is reconstructed.
[0012] Figure 3 illustrates a lost packet recovery technique in which
certain packets are lost and the data is reconstructed.
[0013] Figure 4 illustrates a lost packet recovery encoder.
[0014] Figures 5-9 illustrate packet headers that can be used in
conjunction with the lost packet recovery techniques and encoders of Figs. 1-4.
[0015] Figure 10 illustrates a lost packet recovery decoder.
[0016] Figure 11 illustrates a state machine for a congestion avoidance
algorithm that can be used in conjunction with the lost packet recovery
techniques, encoders, and decoders of Figs. 1-10.
[0017] Figure 12 illustrates a data rate for an encoder employing lost
packet recovery and congestion avoidance using techniques, encoders, decoders
and algorithms according to Figs. 1-10.
DETAILED DESCRIPTION
[0018] Disclosed herein are techniques for overcoming the effect of lost
packets in network transmission that combine the use of forward erasure
correction and congestion avoidance. These techniques may be referred to as
"Lost Packet Recovery" or "LPR."
[0019] LPR is a scalable method for recovery of lost RTP media packets.
It is not media specific, although the examples described herein are intended for
video codecs. LPR techniques described herein can also be used to protect other
types of RTP media streams. LPR uses Reed-Solomon encoding to provide
forward erasure correction of lost media packets, which allows for recovery of the

[0020] When LPR is employed the RTP payload is re-packetized to create
data packets of approximately equal size. The re-packetized stream uses a distinct
RTP dynamic payload type (which differs from the assigned media payload
number), and is fully RTP compliant. A payload-specific header includes enough
information to reconstruct the original RTP packets. LPR assumes that the SSRC
and CSRC information does not change in the middle of a recovery set (as defined
below). Timestamps, sequence numbers, the original payload type, and the
marker bit are fully recovered, which ensures that the recovered RTP stream can
be decrypted.
[0021] Recovery packets are then added to this RTP stream. The recovery
packets are also RTP compliant. Each recovery packet also includes a payload
header that describes the protected data packets. Each packet's RTP payload
includes the recovery information. The number of recovery packets added
depends on the channel loss rate as estimated from RTCP receiver reports.
Channel loss rates of up to about 15% can be accommodated, albeit with a bit rate
overhead of about 50%.
[0022] It should be noted that the LPR protection is not considered to be
part of the overall media bit rate. Only the actual compressed payload is counted
as such. However, the aggregated bit rate can be reduced as necessary to avoid
network congestion. It should be understood that network congestion can have
undesirable affects other than or in addition to packet loss. For example, network
congestion can cause increased media latency and dropped connections (including
signaling connections). Therefore, the LPR protocol can have built-in messages
for congestion avoidance. One suitable congestion avoidance algorithm is
described in greater detail below, although a variety of congestion avoidance
algorithms may be used.
[0023] An example of LPR applied to a media stream will now be
described with reference to Figs. 1-3. The details of the example, including bit
rates, packet loss rates, packet sizes, ratio of data packets to recovery packets, etc.
are all mere examples and can vary depending on implementation details, channel
conditions, etc.

[0024] Consider a 300 kbps video channel that has a packet loss rate of
2%. Three packets 101, 102, and 103 are to be transmitted. An LPR mode of
(6+2) can be used, meaning that two recovery packets 104, 105 will be inserted
into each group of six data packets 106-111, creating recovery sets of eight
packets each. The recovery packets 104, 105 are about equal in size to the largest
data packet, but include some additional overhead. The recovery packets allow
regeneration of any two of the six data packets 106-111 as described in greater
detail below.
[0025] To stay within the 300 kbps channel limit, the media rate can be
reduced to approximately 220 kbps, leaving 80 kbps for the recovery packets.
The target size for each re-packetized data packet can be 500 bytes. Each
recovery set therefore carries about 32000 bits of information, representing a time
period of 107 ms at the channel rate of 300 kbps. The 107 ms period is called the
protection period.
[0026] Each of the packets (i.e., the re-packetized data packets 106-111
and the recovery packets 104, 105) contains the normal 40 bytes of IP+UDP+RTP
header overhead. For simplicity, this overhead is not shown. However, the
overhead associated with LPR itself is shown. For example, "3+420" indicates
that three overhead bytes plus 420 bytes of the original data are in the packet. The
recovery packets are 100% overhead. The details of the overhead are described in
greater detail below.
[0027] Figure 2 illustrates the recovery process. Assume that data packets
3 and 5 (108, 110) are lost. The LPR decoder creates regenerated packets 201 and
202 from the six packets that were received, i.e., data packets 1, 2, 4, and 6 and
recovery packets 1 and 2. Note that regeneration of the lost packets cannot occur
until the sixth packet is received. After regeneration, the original RTP packets
203-205 are restored from the data packets. Regeneration of the lost packets is
described in greater detail below.
[0028] Figure 3 illustrates a failed packet recovery where more than two
packets are lost. In this example Recovery Packet 1 is lost as well as Data Packets
3 and 5. Since only 5 packets of the group were received, packet regeneration is

packet can not be recovered in its entirety, then the partial packet information is
discarded. Therefore, data packets 1 and 2, 301, 302 are both shown as lost.
[0029] In the event that all of the original data packets cannot be
recovered, as illustrated in Fig. 3, it is typical for the receiving decoder to request
a fast update of the entire image. Alternatively, a walk-around refresh such as that
described in U.S. Patent 7,020,203, entitled "Dynamic Intra-coded Macroblock
Refresh Interval for Video Error Concealment," issued March 28, 2006, which is
hereby incorporated by reference in its entirety. In either case, the time required
to repair the image is not a function of the number of packets lost, but rather the
mean time between failures caused by losing more than the predetermined number
of packets in the recovery set (two, in this example).
[0030] For the given example of a channel at 300 kbps and 2% packet
loss, with each protection set of six data packets and two recovery data packets
covering 107 ms of channel data, the mean time between failures can be computed
to be approximately 258 seconds. Assuming that a fast update requires
approximately two seconds, the video transmitted on a channel protected by the
LPR example given will be error-free more than 99% of the time. This protection
comes at an additional overhead of 80 kbps, which is approximately 30% of the
channel rate.
[0031] Additional aspects of the LPR example given above may be better
understood with reference to Fig. 4, which illustrates an implementation of an
LPR encoder. Variations of this implementation are also possible. Video encoder
401 can be any of a variety of video encoders operating in a more or less
conventional matter. For example, the video encoder can operate according to the
H.261, H.263, H.264, MPEG-2, or MPEG-4 video compression standards.
Alternatively, the video encoder can operate according to any of a variety of
proprietary video coding techniques.
[0032] The output of video encoder 401 can be supplied to optional
encryption module 402, which can also be of any of a variety of known types.
The encrypted video data (or unencrypted video data, if encryption is not used)
can be supplied to RTP sender 403, which can also be conventional. The RTP

algorithem by LPR packetizer 406 as described in greater detail below. LPR
recovery packet generator 407 can then generate the recovery packets based on
information received from LPR packetizer 406 and video encoder 401, as
described in greater detail below. The LPR recovery packet generator can then
transmit the LPR data packets and the LPR recovery packets to the receiver (not
shown).
[0033] RTCP module 404 can receive packet loss and other channel
statistics and supply these to LPR mode decision module 405, which controls LPR
packetizer 406. LPR mode decision module 405 can decide whether to engage or
disengage LPR protection, as well as determine the LPR protection parameters to
be used. If LPR protection is disengaged, LPR packetizer 406 and LPR recovery
packet generator 407 can pass through the RTP packets generated by RTP sender
403 without alteration.
[0034] LPR mode decision module 405 can operate as follows. First the
protection period (as described above) can be determined. Longer protection
periods offer more efficient packet regeneration (e.g., fewer recovery kbps), but
create more latency. For most typical videoconferencing bit rates, 100 ms can be
a suitable protection period. At bit rates below 128 Kbps longer protection periods
might be necessary. For example, 150 ms can be suitable for a 64 Kbps rate. At
bit rates above 1 Mbps shorter protection periods might be advantageous. For
example, 50 ms can be suitable for rates of 2 Mbps or more. Other parameters can
also be considered in determining the protection period.
[0035] Next, the number of total packets per protection period can be
determined. In general, small numbers of packets per protection period may tend
to decrease the efficiency of the protection. Conversely, larger numbers of
packets per protection period may tend to increase the amount of computation
required to encode and decode the protection packets. Therefore, for most
channels, about thirteen or more packets per protection period is believed to be
suitable. The decision on the number of packets per protection period should also
take into account the resulting packet size for the desired data rate. The output
packet size (including IP overhead) should be less than 1260 bytes to traverse an

the number of recovery packets per protection period can be chosen. The number
of recovery packets can be chosen (using conventional probabilistic analysis
techniques) so that the mean time between protection breakdowns meets or
exceeds a predetermined specified value. Additionally, the aggregation of all
transmitted video channels (including the LPR protection packets and RTP
messages) can be selected so as not to exceed a specified congestion ceiling.
[0037] The tables below give example protection modes that can be used
for various bitrates under various loss conditions. Table 1 gives LPR parameters
that may be suitable for 4% packet loss conditions that can be used in conjunction
with congestion avoidance algorithms (as described below) that reduce the data
rate when packet loss exceeds 3%. Table 2 gives LPR parameters for 2% packet
loss conditions that may be used when congestion avoidance algorithms are not in
use.



[0038] LPR packetizer 406 can operate as follows. LPR packetizer 406
sub-divides the original media packets generated by RTP sender 403 so as to
improve the efficiency of packet regeneration coding. An LPR data packet
contains information from only one source packet. If the media channel is
encrypted, LPR re-packetization can be performed on the encrypted stream. In
general, the LPR stream itself is not re-encrypted, i.e., the LPR encapsulation
fields (headers) added to the encrypted packets and newly generated LPR
recovery packets are sent in the clear.
[0039] If an original source packet is sub-divided (e.g., Packet 1 101 in
Figs. 1-3), the first LPR data packet for this source packet is called the initial data
packet (e.g., Data 1 106 in Figs. 1-3). Subsequent data packets (if any) are called
follow-on data packets (e.g., Data 2 107 and Data 3 108 in Figs. 1-3). The
number of follow-on data packets is signaled in the initial data packet and
therefore cannot be changed once the initial data packet is sent. The follow-on
data packets and the initial data packet may be in different LPR protection groups.
[0040] All fields in the original RTP header except for the sequence
number, the payload type, and the P field can be carried unchanged in each LPR
data packet. This includes the time stamp, the marker bit, and the SSRC/CSRC
information. The payload number can be replaced with the negotiated LPR
payload type. Sequence numbers can be assigned sequentially to the LPR media
stream in the usual fashion. The P field of the LPR packet can be set to 0. LPR
packetizer 406 can pass through the original RTP sequence numbers to LPR
recovery packet generator 407.
[0041] Each initial data packet can include a seven-byte payload header
that is specific to LPR, an example of which is illustrated in Fig. 5. Each follow-
on data packet can include a three-byte payload header, an example of which is

header and precede the video (or other media) payload. The header sizes and
individual field sizes and their arrangement are exemplary only, and it will be
understood that other arrangements are possible.
[0042] The recovery packet field 501, 601 can be used to indicate the
number of recovery packets that will subsequently be sent in the recovery set. All
data packets in a recovery set should signal the same number of recovery packets.
The number of recovery packets signaled can be 0, indicating that no recovery
packets will be sent. This field can be a six-bit field, which would allow for up to
sixty-four recovery packets. The type field 502, 602 can be a two-bit field set to
either 00 or 01 to indicate an initial data packet or a follow-on data packet,
respectively. This allows the header type for LPR to be determined by examining
the first payload byte.
[0043] Data index field 503, 603 can be an 8-bit field. The first data
packet in a recovery set can be assigned a data index of 1. The index can be
incremented for subsequent data packets in the set. The follow-on packets field
504 can be an 8-bit field used to specify the number of follow-on packets
associated with the initial packet. The data packet field 604 can be an eight-bit
field used to signal the number of data packets included in the recovery set. All
data packets should signal the same number of data packets.
[0044] Original sequence field 505 can be a sixteen-bit field that indicates
the original RTP packet sequence number, which will be used to recreate the
original RTP packets. Original p-bit field 506 can be a one-bit field used to carry
the padding bit carried by the original RTP packet. Original payload type field
507 can be a seven-bit field used to carry the payload type as carried by the
original RTP packet.
[0045] LPR recovery packet generator 407 can operate as follows. LPR
recovery packet generator 407 can replace the RTP sequence numbers of the
original media with the final LPR sequence numbers in the LPR data packets.
LPR recovery packet generator 407 can also inserts the LPR recovery packets at
the end of each recovery set.
[0046] Each recovery packet can also include a nine-byte header such as

RTP header and can Precede the video (or other media) payload. As with the
header described above, the field types and sizes are exemplary and other
arrangements may also be used.
[0047] Recovery index field 701 can be a six-bit field used to indicate the
sequence of the recovery packet in the recovery set. The first recovery packet can
be assigned a recovery index of 1, which can be incremented for subsequent
recovery packets in the recovery set. Type field 702 can be a two-bit field set to
10 to indicate the packet is a recovery packet. The recovery packets field can be
an eight-bit field that includes the number of data packets in the recovery set. All
recovery packets should signal the same number of recovery packets signalled by
the corresponding data packets. Data packets field 704 can be an eight-bit field
that indicates the number of data packets sent in the recovery set. The recovery
packets should indicate the same number of data packets indicated in the
corresponding data packets.
[0048] Protected timestamps field 705 can be a 32 bit field that holds
Reed-Solomon encoded RTP timestamps for each data packet in the recovery set.
This can combined using the same Reed-Solomon encoding as used in the
recovery payload itself. This field can allow the RTP timestamp of a missing data
packet to be regenerated. The 4 bytes are in network order.
[0049] Protected marker, flag, and size field 706 (futher illustrated in Fig.
8) can hold the Reed-Solomon encoded RTP marker bit 802, a flag value 801, and
data packet size 803. The flag value can be 1 if the LPR data packet is an initial
packet, and can be set to 0 if the LPR data packet is a follow-on packet. The size
can include the IP, UDP, and RTP header as well as the LPR header. The data
packet size can be in network order.
[0050] The protected field is combined using the same Reed-Solomon
encoding as is used in the recovery payload itself. The field allows the LPR
header type, marker and data packet size to be regenerated for any missing data
packet that can be recovered.
[0051] The payload of the recovery packet can be encoded using Reed-
Solomon 256 (RS256) encoding. The Original Sequence 505, Original P bit 506,

packet can be protected as if they were payload bytes. This can allow thi
information to be recovered if the initial data packet is lost. Additionally, the
recovery packet payload header can include encodings of three Phantom Data
fields: the length of the data packet itself, the marker bit, and the timestamp.
Again, this can allow this information to be recovered if all data packets for a sub-
divided RTP packet are lost.
[0052] Each payload byte in each data packet can contribute to the
corresponding byte in each recovery packet, according to the generating function
for RS256:
where: Di is the ith data packet (1≤i≤d); Rj is theyth recovery packet (1≤J≤r); Bi is
the ith base coefficient (1≤i≤d) having values as specified in table 3; and
B1f-1, are computed using Galois Field (28) arithmetic operations as
described below. It should be noted that the given base coefficient table restricts
the maximum number of data packets in a recovery set to 128, although other
numbers of data packets can be selected, and a suitable table can be computed
accordingly.


[0053]Galois field (28) arothmetic operations can be set up bto use two
helper tables: a Galois Log function (glog) table, and a Galois Exponentiation
function (gexp) table. These tables are:





These tables can then be used to compute various arithmetic opeations as
described below.
[0054] Addition and subtraction in Galois Field (28) are the same
operation, and are performed as follows:
ab =ab
α-b=ab
where  is the logical exclusive or (XOR) operator. As in ordinary arithmetic,
a0 = 0a =a.
[0055] Multiplication (a  b) in Galois Field (28) is performed as
follows:
a  b = gexp[glog[a] + glog[b]]
where + is the ordinary addition operator. As in ordinary arithmetic, a  0 =
0  a = 0. Also, a  1 = 1  a = a.
[0056] Division (a  b) in Galois Field (28) is performed as follows:
a  b = gexp[glog[a] - glog[b] + 255]

where + is the ordinary addition operator and is the ordinary subtraction
operator. As in ordinary arithmetic, a +1 = a. As is also usual, b must be non-
zero.
[0057] The power function (ab) in Galois Field (28) is performed as
follows:
ab = gexp[(glog[a] * b) % OxFF]
where + is the ordinary addition operator, - is the ordinary subtraction operator, *
is the ordinary multiplication operator, and % is the ordinary mod function. As in
ordinary arithmetic, a0 = 1 for non-zero a. In the RS256 reference code, 0 is never
raised to a power.
[0058] As described above, the LPR group size (in packets) can be set to
achieve a desired protection period (in milliseconds),. If the video codec does not
use the entire channel bit rate, then the LPR group will result in a longer latency
period. For example, a video channel that normally runs at 1.5 Mbps may drop to
750 Kbps in periods of light motion. This could result in a protection period twice
as long as desired. Such a situation can result in the selected LPR mode
continuing to meet the predetermined MTBF, which will, in this example, be
exceeded by a factor of 2.
[0059] In some cases, this extended protection period can be allowed.
However, this can result in a longer latency when packets are lost. Therefore, in
some cases, it may be desirable for the transmitter to adjust its LPR mode as the
actual video rate varies. Three such adjustments are described below, although
other adjustments are also possible.
[0060] One adjustment that can be made is to the protection group size. In
some embodiments, this setting can be changed at every protection group
boundary. Therefore, a transmitter can monitor the LPR output packet rate and
dynamically adapt the number of packets per group. When this adjustment is
employed, LPR protection overhead will tend to rise as the data rate falls.
However, because the data rate in such cases will typically be below the normal
limit the altered LPR mode will still generally remain below a predetermined
congestion ceiling

[0061] Another adjustment that can be made is for the transmitter to
change its output packet size upon transmission of an initial data packet. In
general the transmitter should not change the output packet size when transmitting
follow-on packets, as the number of follow-on packets will typically be signaled
in the initial data packet. Therefore, a transmitter can monitor the LPR output
packet rate and the output data rate and dynamically adapt the packet size to
maintain the number of packets per protection period.
[0062] Another adjustment that can be made is for the transmitter to send
empty data packets to complete a partial recovery set. An empty data packet can
include an initial data packet header as described above. Other information that
would normally be in the packet, such as sequence number payload type, number
of follow-on packets, etc., can all be set to zero. Because the empty data packets
are part of the protection group, the recovery packet encoding described above can
include these packets, which can also be recovered by the decoding process. Also,
empty data packets can be discarded by the LPR decoder (discussed below) to
prevent empty packets in the output media packet stream.
[0063] Fill packets can also be used to maintain a predetermined data rate.
These fill packets can be discarded by the receiver. An example fill packet is
illustrated in Fig. 9. The fill packet can include an extended type field 901 of six
. bits having a value of 1 and a type field 902 of two bits having a value of 11. The
fill packet can be an RTP packet using the LPR payload type for the media stream.
The timestamp value can be the same as the value in the previously sent RTP
packet. The sequence number can be incremented from the value in the
previously sent packet. The payload can end with the Fill header shown above.
The payload length of the field can be any length within a maximum specified
packet size.
[0064] An example of an LPR decoder is illustrated in Fig. 10. As with
the LPR encoder discussed above, variations on this embodiment are possible.
Operation of the LPR decoder can be basically an inverse of the LPR encoder
operation described above. Specifically, LPR packets can be recovered by LPR
recovery module 1001. These recovered LPR packets can be passed to the RTP

ino their received order. LPR regenerator 1003 can then recreate any lost
packets, relying on information received from the recovered LPR packets (from
LPR recovery block 1001) and RTCP information received from RTCP module
1006. These regenerated blocks can then be decrypted by decryption block 1004
(if encryption was used at the source), and finally the decrypted blocks can be
processed by video decoder 1005 (or other type of media decoder, as appropriate).
[0065] LPR recovery module 1001 and LPR renenerator 1003 can be the
only non-standard elements as compared to conventional systems, and these two
modules can operate as pass-throughs when LPR is not being used.
[0066] LPR recovery module 1001 can process all LPR RTP packets. In
the given example, all received and regenerated packets (data and recovery) can
be directly passed through to the RTP reordering buffer. The overhead data in
each data packet can allow these packets to be processed by LPR recovery module
1001 in any order. LPR recovery module 1001 can also provide LPR RTP
receiver report information to RTCP module 1006. This information can be based
on what is received (e.g., any packets that are not recovered can be identified as
lost in the receiver report).
[0067] If there are missing data packets in the recovery set that can be
recovered, the LPR recovery module can recover these packets as soon as
sufficient recovery packets have been received. As discussed above, if k data
packets are missing, k recovery packets must be received to recover the data. This
recovery process can involve first recovering the marker bit, the data packet
length, the initial packet flag, and the original timestamp. The data packet length
can be used to recover the original data packet payload. "Fixed" components of
the LPR RTP header (e.g., SSRC, CSRC, LPR payload type, etc.) can be taken
from one of the recovery packets. Variable portions of the LPR RTP header (e.g.,
marker bit, timestamp, and sequence number) can be set to the values recovered
from the recovery header.
[0068] The sequence number for LPR data packet i can be computed
according to the following equation:
Si=Sr-(d + r-i)

wher: i is the index recovered data packet, d is the number of data packets in the
recovery set, r is a received recovery packet index, and St is the sequence number
for packet i.
[0069] While receiving data packets, partial recovery packets can be built
up by recovery module 1001, using the RS256 generation function described
above. As corresponding recovery packets for the set are received, these packets
can be exclusive-or'ed (i.e., XOR'd) with the partial recovery packet. The
resulting residual packets will hold the contribution of any lost data packets to
each recovery packet. In the exemplary system, each residual packet will be a
linear combination of each of the missing data packets. Thus, there will be k
simultaneous equations (one for each of the k residual packets), each with k
unknowns (one for each of the missing k data packets).
[0070] As would be understood by one skilled in the art, these equations
can be represented as a k x k matrix, which can then be solved numerically, e.g.,
by Gaussian elimination. Because such operations are well-known, the details are
not reproduced herein. Multiplying the resulting inverted matrix by the residual
data can then recover the missing data packets.
[0071] To illustrate the above recovery process, the recovery scenario in
Figure 2 can be performed. For clarity ordinary arithmetic operations are used
rather than the actual Galois Field (28) operations. The data in the recovery
packets can generated using the following formula (derived from the RS256
generation function and tables described above:
R1, = Dl + D2 + D3 + D4 + D5 + D6
R2 = 2D1 + 4D2 + 6D3 + 9D4 +13D5 + 14D6
[0072] The partial recovery packets can be computed by the decoder on
the packets that are received. As noted in the example above, packets 3 and 4
(108, 110, Fig. 3) are not received. This results in:
Pl=D1+D2+D5+D6
P2 = 2D1 + 4D2 +13D5 + 14D6
Residuals can be computed by subtracting the partial recovery packets from the
received recovery packets:

r1=R2-P2 =D3+D1
r2=R2-P2 =6D3+9D1
The generating matrix for the residuals is therefore:
1 1
6 9
Inverting this matrix via Gaussian elimination results in the following:
3 -1/3
-2 1/3
Therefore, the information in the missing data packets (i.e., data packets 3 and 4)
can be recovered using the formulas:
D3 =3r1-r2/3
D4 =2r1-r2/3
[0073] When LPR is in use, RTP reordering buffer 1002 can be integrated
with the LPR decoder. This can permit a minimal delay implementation, because
in some situations out-of-order packets can be regenerated by LPR before they
actually arrive. Additionally, in some embodiments it may be desirable to acquire
the first d packets of the recovery set, running the Reed-Solomon decoder only if
some data packets were lost. This design may not produce minimal delay, but can
be more computationally efficient.
[0074] LPR regenerator module 1003 can recreate the original RTP stream
from the data packets as was described above. If some data packets for an RTP
packet are missing, then that RTP packet is omitted from the output stream.
Because the LPR stream has been re-ordered, the initial packet from each packet
set should be received first. If a follow-on packet is encountered first, then the
initial packet has been lost, and the follow-on packet can be discarded.
[0075] The output RTP packet uses the RTP header information of the
initial data packet. The payload type, sequence number, and timestamp are
replaced with the values in the initial data header. The payload is aggregated from
the initial packet and any follow-on data packets with the LPR headers removed.
If any follow-on packets are missing the output RTP packet can be suppressed. It
should be noted that the number of follow-on packets can be known because it can
be signaled in the initial data packet header.

[0076] Lost packet recovery techniques, like those described above, can be
combined with congestion avoidance algorithms to further enhance operation. An
example congestion avoidance algorithm is referenced above and is referred to as
Dynamic Bandwidth Allocation ("DBA"). DBA can attempt to avoid congestion
by downspeeding the bitrate based on the percent packet loss reported by the
receiver's RTP stack (1006, Fig. 10). The receiver RTP stack generates packet
loss statistics for the receiver channel during regular time periods (e.g., every 200
ms). This can allow fast adaptation to a dynamically changing pipe while
allowing for minimal LPR configurations with low MTBF requirements (which,
in turn can allow for lower overhead).
[0077] As illustrated in Fig. 12, the control loop for DBA can be
implemented as a state machine with the following states: max rate state 1101,
downspeeding state 1102, upspeed wait state 1103, upspeeding state 1104, and
constant loss state 1105. The various state transitions can be understood as
follows. Suppose the system is in the max rate state 1101, meaning that the
maximum data rate is being used. A packet loss above a predetermined threshold
can cause state transition 1106 to downspeeding state 1102.
[0078] While in the downspeeding state, the system can decrease the
speed and wait to see whether packet loss above the predetermined threshold
continues to be experienced. If there is no more packet loss above the threshold,
the system can follow transition 1108 to the upspeed wait state. If there continues
to be packet loss above the threshold, and the maximum number of downspeed
attempts have not been used (block 1107), the system can continue in the
downspeeding state 1102 and can thus continue decreasing the bitrate. If the
maximum number of downspeeding attempts has been used, the system can
transition to constant loss state 1105.
[0079] While in the upspeed wait state, additional packet loss above the
predetermined threshold can cause the system to follow transition 1109 to the
downspeeding state. Alternatively, if, after a predetermined period of time, there
is no packet loss above the threshold, the system can follow transition 1110 to the
upspeeding state 1104.

[0080] In the upspeeding state 1104, the system can increase the bitrate
and wait for additional packet loss above the predetermined threshold. If there is
no packet loss, the system can determine whether the bitrate is less than the
maximum bitrate. If so, the system can follow state transition 1111 to the
maximum rate state 1101. If not, the system can continue in the upspeed state
1104. If packet loss above the predetermined threshold is experienced while in
upspeed state 1104, the system can determine whether a predetermined maximum
number of upspeed attempts have been made (block 1112). If not, the system can
follow transition 1113 to the upspeed wait state. If the maximum number of
upspeed attempts has been used, the system can follow transition 1114 to the max
rate state 1101.
[0081] In each of the states described above, different packet loss
thresholds may be employed. Additionally, some or all state transitions may have
a delay associated with them. Upspeeds can be done in 15% step increases of the
bit rate. If the pipe is fully utilized and no packet loss is reported, a delay can be
incurred which can be equal to X timer cycles (200 ms * X) before the next
upspeed will occur. By default the value of X can be set to 2, resulting in a 400
ms delay. Any packet loss that is not a sequential packet loss report and not an
echo report can cause a state transition.
[0082] Figure 12 shows dynamically changing bandwidth availability and
the DBA adaptation. Figure 12 also shows the state transitions, echo delays, along
with three generalized loss scenarios: Burst Loss, Constrained Pipe, and Constant
Loss. LPR protection use is designated in Figure 12 by the shaded area during the
downspeeding state, the upspeed wait state, and the upspeeding state. The
upspeed attempts can use the LPR protection overhead to test for the congestion
ceiling, i.e., only "redundant" data may be used to probe whether network
congestion conditions would permit a higher bitrate. If this test of the congestion
ceiling fails, LPR protects any loss caused by the packet loss incurred because
only "redundant" data has been used. This allows probing of the congestion
ceiling with substantially reduced risk of video disruption.
[0083] The measured congestion ceiling can be applied to the data flow in

congestion ceiling can apply to the aggregated bit rate. Forward erasure
protection can be applied independently to each video stream. However, the same
level of protection can be applied to all streams. The system can also be adapted
to jointly protect the video streams with common forward erasure protection
packets or to apply different levels of protection to each video stream.
[0084] Forward erasure correction can also be applied to just a part of the
video stream. For instance, in cases where layered video encoding is employed,
the forward erasure protection could be applied to the base layer but not the
enhancement layers. It could also be adapted to protect the various layers to
different degrees (so called "unequal protection"). In these configurations, the
congestion avoidance applies to the entire data flow.
[0085] Example LPR probes for congestion ceiling management are given
in Table 6. These probes can be used to test the possibility of raising the data rate
by a predetermined value (e.g.", 10%) and can be activated for short periods of
time (e.g., 800 ms).

[0086] The techniques described herein can be receiver-driven, in that the
receiver feeds back the congestion ceiling and erasure protection level to the
sender. Alternatively, they can also be used in a transmitter-driven framework, or
even in a hybrid framework where the control loop is distributed between sender
and receiver. With some adaptation, it is also suitable for use with multicast

streams. The techniques described also can be combined with other techniques
such as packet retransmission, periodic picture refresh, and video error
concealment.
[0087] As described above, the combination of forward erasure correction
with congestion avoidance offers unique benefits that neither mechanism can
achieve on its own. One such benefit is that congestion ceiling can be periodically
measured by experimentally increasing the amount of forward erasure protection
packets in the bitstream and observing the effect on the delivered data flow.
These protection packets increase the overall data rate to a desired "probe" level
while simultaneously protecting the media from packet loss (in case the level
exceeds the congestion ceiling). A second benefit is that this method of
dynamically measuring the congestion ceiling can result in a much faster
determination of the allowable bandwidth than that provided by the passive
methods normally employed in congestion avoidance algorithms. A third benefit
is that the protected media flow can be delivered with less delay and lower
protection overhead when the entire flow (media and protection) is constrained to
stay below the dynamically measured congestion ceiling.
[0088] An additional benefit of the techniques described herein arises in
that as the video data rate increases, the system can become more and more
efficient at protecting against packet loss. Therefore, as bitrates increase, such as
with the use of high definition video, the techniques described herein become both
more useful and more efficient.
[0089] The integration of congestion avoidance with forward erasure
correction offers several unique advantages. For example, congestion avoidance
brings several unique advantages to forward erasure correction. As a data link
becomes congested, the transmission delay for media along that link increases.
Using forward erasure correction alone may result in an acceptable packet loss
rate, but will not reduce queueing delays. Integrating congestion avoidance into
the control loop results can achieve a desired quality of service (QoS) with much
lower latency. Additionally, the efficiency of forward erasure protection is less
for low-latency applications than it is for high-latency applications. If the source

of the data loss is simple congestion, reducing the bit rate can be more efficient
than adding additional forward erasure protection packets to the data flow.
[0090] Conversely, forward erasure correction can also add considerable
value to congestion avoidance alone. Congestion avoidance techniques typically
reduce the rate of the data flow more aggressively then they raise it. For example,
the congestion mechanism for TCP reduces the data rate by a multiplicative
factor, but raises it much more slowly by applying an additive factor. The reason
for this is that increasing the data flow rate always risks creating congestion, and
therefore creating packet loss. When forward erasure protection is integrated with
the congestion avoidance, this risk is greatly reduced. Thus, as noted above, the
data rate can be increased by simply increasing the density of forward erasure
protection packets.
[0091] Additionally, congestion avoidance techniques also are designed to
estimate the link speed. Cross-congestion from other data flows can frequently
result in packet loss. By the time the cross-congestion is observed, packet loss has
usually already occurred. Including forward erasure protection allows packets lost
due to such cross-congestion to be recovered.
[0092] The techniques disclosed herein can be employed in a wide variety
of systems, including any systems for the transmission of multimedia data, such as
video data, audio data, still image data, and other types of data. Such techniques
can be employed in typical general purpose computer systems, such as desktop
computers, notebook computers, handheld computers, servers, and the like.
Alternatively, these techniques can be employed in various appliance-type
systems such as conference room videoconferencing systems, desktop
videoconferencing systems, and the like. The techniques described can also be
employed in infrastructure type videoconferencing devices such as multi-point
control units (MCUs), bridges, media servers, etc.
[0093] The methods described herein can be coded into one or more
computer programs and stored on a computer-readable media, such as a compact
disk, a tape, stored in a volatile or non-volatile memory, etc. Accordingly,
instructions stored on a program storage device can be used to cause a

techniques disclosed herein. Although the disclosed communcation system has
been described as providing bi-directional communication between a near end and
a far end, it will be appreciated that the teachings of the present disclosure are
applicable to systems that provide one-way transmission.
[0094] The foregoing description of preferred and other embodiments is
not intended to limit or restrict the scope or applicability of the inventive concepts
conceived of by the Applicant. In exchange for disclosing the inventive concepts
contained herein, the Applicant desires all patent rights afforded by the appended
claims. Therefore, it is intended that the appended claims include all
modifications and alterations to the full extent that they come within the scope of
the following claims or the equivalents thereof.

WE CLAIM:
1. A method of preventing disruption of a media stream transmitted across a
lossy network, the method comprising:
applying a lost packet recovery algorithm to the media stream, wherein
the lost packet recovery algorithm inserts redundant information
into a transmitted data stream including the media stream to permit
reconstruction of one or more lost packets; and
applying a congestion avoidance algorithm to the transmitted data
stream, wherein the congestion avoidance algorithm includes
temporarily increasing a data rate of the transmitted data stream by
increasing the amount of redundant information inserted into the
transmitted data stream to determine whether the network can
support a higher data rate.
2. The method of claim 1 wherein the redundant information comprises one
or more recovery packets.
3. The method of claim 2 wherein the number of recovery packets depends
on the channel loss rate determined from RTCP receiver reports.
4. The method of claim 1 further comprising:
reducing the aggregate bit rate as necessary to avoid network
congestion.
5. The method of claim 1 wherein the lost packet recovery algorithm includes
forward erasure correction coding.
6. The method of claim 1 wherein the lost packet recovery algorithm includes
Reed-Solomon encoding.
7. The method of claim 1 wherein the media stream comprises video data.

8. A media encoding apparatus for transmitting media data over a network,
the media encoding apparatus comprising:
a media encoder;
an RTP sender connected to receive output from the media encoder and
package the output of the media encoder according to a real-time
transmission protocol as a plurality of source packets;
an RTCP module connected to receive channel information from the
network;
an LPR mode decision module, wherein the LPR mode decision module
is:
communicatively coupled to receive the channel information
from the RTCP module,
adapted to determine LPR protection parameters to be used from
the channel information, and
communicatively coupled to provide the LPR protection
parameters to the LPR packetizer;
an LPR packetizer connected to receive output from the RTP sender
and repacketize the output of the RTP sender as a plurality of LPR
data packets using parameters received from the LPR mode
decision module; and
an LPR recovery packet generator, wherein the LPR recovery packet
generator is:
coupled to the LPR packetizer and the media encoder,
adapted to generate LPR recovery packets based on information
received from the LPR packetizer and the video encoder,
and
adapted to transmit the LPR data packets and the LPR recovery
packets to a receiver.
9. The media encoding apparatus of claim 8 wherein the media encoder is a
video encoder.

10. The media encoding apparatus of claim 8 wherein the LPR mode decision
module determines LPR protection parameters by:
determining a protection period;
determining a total number of packets per protection period;
determining a number of recovery packets per protection period.
11. The media encoding apparatus of claim 8 wherein each LPR data packet
contains information from only one source packet.
12. The media encoding apparatus of claim 8 wherein a payload of the
recovery packet is encoded using Reed-Solomon encoding.
13. A method of determining LPR protection parameters in a media
conference conducted over a packet-based network, the method
comprising:
determining a protection period;
determining a total number of packets per protection period; and
determining a number of recovery packets per protection period.
14. The method of claim 13 wherein the protection period is determined
according to a bitrate of the media conference.
15. The method of claim 3 further comprising adjusting LPR protection
parameters as a function of the media rate.
16. The method of claim 15 wherein adjusting LPR protection parameters as a
function of the media rate comprises dynamically adapting the number of
packets per group.
17. The method of claim 15 wherein adjusting LPR protection parameters as a
function of the media rate comprises dynamically adapting the packet size.
18. The method of claim 15 wherein adjusting LPR protection parameters as a
function of the media rate comprises sending empty data packets.

19. A media decoding apparatus for receiving media data from a transmitter
over a network, the media decoding apparatus comprising:
an LPR recovery module coupled to receive LPR packets, including
LPR data packets and LPR recovery packets, from the network;
an LPR regenerator adapted to receive the LPR data packets and
regenerate any missing data packets ;
an RTCP module communicatively coupled to the LPR regenerator to
receive a report of lost packets from the LPR regenerator and
transmit lost packet information to the transmitter; and
a media decoder coupled to the LPR regenerator and adapted to receive
the LPR data packets and the regenerated data packets and decode
media information contained therein.
20. The media decoding apparatus of claim 19 wherein the media decoder is a
video decoder.
21. The media decoding apparatus of claim 19 wherein the LPR recovery
module and the LPR regenerator operate as pass throughs when LPR
algorithms are not in use.

Devices and techniques for overcoming lost packets and avoiding
congestion when conducting media conferences over packet switched networks
are described herein. To avoid the problem of lost packets, redundant information
is inserted into the media stream that permits the receiver to reconstruct any lost
packets from the redundant information. Congestion avoidance techniques
include adjusting the bitrate of the media stream to find the highest bitrate that can
be supported without packet loss due to congestion. When increasing the bitrate
to a higher rate, the additional bits can come from the redundant information used
for lost packet recovery so that any lost packets caused by network congestion
will not adversely affect the bitstream.

Documents:

01243-kol-2008-abstract.pdf

01243-kol-2008-claims.pdf

01243-kol-2008-correspondence others.pdf

01243-kol-2008-drawings.pdf

01243-kol-2008-form 1.pdf

01243-kol-2008-form 2.pdf

01243-kol-2008-form 3.pdf

01243-kol-2008-form 5.pdf

01243-kol-2008-gpa.pdf

1243-KOL-2008-(04-05-2012)-AMANDED CLAIMS.pdf

1243-KOL-2008-(04-05-2012)-CORRESPONDENCE.pdf

1243-KOL-2008-(04-05-2012)-FORM-13.pdf

1243-KOL-2008-(04-05-2012)-OTHERS.pdf

1243-KOL-2008-(06-05-2014)-ABSTRACT.pdf

1243-KOL-2008-(06-05-2014)-ANNEXURE TO FORM 3.pdf

1243-KOL-2008-(06-05-2014)-ASSIGNMENT.pdf

1243-KOL-2008-(06-05-2014)-CLAIMS.pdf

1243-KOL-2008-(06-05-2014)-CORRESPONDENCE-1.pdf

1243-KOL-2008-(06-05-2014)-CORRESPONDENCE.pdf

1243-KOL-2008-(06-05-2014)-FORM-1.pdf

1243-KOL-2008-(06-05-2014)-FORM-13.pdf

1243-KOL-2008-(06-05-2014)-FORM-2.pdf

1243-KOL-2008-(06-05-2014)-FORM-3.pdf

1243-KOL-2008-(06-05-2014)-FORM-5.pdf

1243-KOL-2008-(06-05-2014)-GPA.pdf

1243-KOL-2008-(06-05-2014)-OTHERS.pdf

1243-KOL-2008-(06-05-2014)-PETITION UNDER RULE 137.pdf

1243-KOL-2008-(08-07-2014)-CORRESPONDENCE.pdf

1243-KOL-2008-(15-09-2014)-CORRESPONDENCE.pdf

1243-KOL-2008-(18-11-2014)-CORRESPONDENCE.pdf

1243-kol-2008-form 18.pdf

abstract-01243-kol-2008.jpg


Patent Number 264212
Indian Patent Application Number 1243/KOL/2008
PG Journal Number 51/2014
Publication Date 19-Dec-2014
Grant Date 15-Dec-2014
Date of Filing 22-Jul-2008
Name of Patentee POLYCOM, INC.
Applicant Address 6001 AMERICA CENTER DRIVE,SAN KOSE,CALIFORNIA 95002 UNITED STATES OF AMERICA
Inventors:
# Inventor's Name Inventor's Address
1 BOTZKO, STEPHEN 150 WOBURN STREET, READING, MA 01867
2 HEIN, DAVID 3200 FLEECE FLOWER COVE, AUSTIN, TX 78735
3 FLOTT, RICHARD E, JR. 1112 AMESITE TRAIL, AUSTIN, TX 78726
4 SPEARMAN, JOHN PAUL 3007 AMMUNITION DR., AUSTIN, TX 78748
PCT International Classification Number G06F15/16
PCT International Application Number N/A
PCT International Filing date
PCT Conventions:
# PCT Application Number Date of Convention Priority Country
1 60/951399 2007-07-23 U.S.A.