Title of Invention

VIRTUAL SOURCE LOCATION INFORMATION BASED CHANNEL LEVEL DIFFERENCE QUANTIZATION AND DEQUANTIZATION METHOD

Abstract

Methods for Spatial Audio Coding (SAC) of a multi-channel audio signal and decoding of an audio bitstream generated by the SAC are provided. More particularly, methods of efficient quantization and dequantization of Channel Level Difference (CLD) used as a spatial parameter when SAC-based encoding of a multi-channel audio signal is performed are provided. A method of CLD quantization includes extracting sub-band-specific CLDs from an N-channel audio signal (N>1), and quantizing the CLDs by reference to a Virtual Source Location Information (VSLI)-based CLD quantization table designed using CLD quantization values derived from VSLI quantization values of the N-channel audio signal.
Full Text WO 2007/011157 PCT/KR2006/002824
[DESCRIPTION]
[Invention Title]
VIRTUAL SOURCE LOCATION INFORMATION BASED CHANNEL
LEVEL DIFFERENCE QUANTIZATION AND DEQUANTIZATION METHOD
[Technical Field]
The present invention relates to Spatial Audio Coding (SAC) of a multi-
channel audio signal and decoding of an audio bitstream generated by the SAC, and
more particularly, to efficient quantization and dequantization of Channel Level
Difference (CLD) used as a spatial parameter when SAC-based encoding of a multi-
channel audio signal is performed.
[Background Art]
Spatial Audio Coding (SAC) is technology for efficiently compressing a
multi-channel audio signal while maintaining compatibility with an existing stereo
audio system. In the Moving Picture Experts Group (MPEG), SAC technology has
been standardized under the name "MPEG Surround" since 2002, and is described in
detail in the ISO/IEC working document, ISO/IEC CD 14996-x (published on
February 18, 2005 and hereinafter referred to as "SAC standard document").
Specifically, the SAC approach is an encoding approach for improving
transmission efficiency by encoding an N-channel audio signal (N>2)
using both a down-mix signal, which is mixed into mono or stereo, and a set of
ancillary spatial parameters, which represent a human perceptual characteristic of the
multi-channel audio signal. The spatial parameters can include Channel Level
Difference (CLD) representing a level difference between two channels according to
time-frequency, Inter-channel Correlation/Coherence (ICC) representing correlation
or coherence between two channels according to time-frequency, Channel Prediction
Coefficient (CPC) for making it possible to reproduce a third channel from two
channels by prediction, and so on.
The CLD is a core element in restoring a power gain of each channel, and is
extracted in various ways in the process of SAC encoding. As illustrated in FIG. 1A,
on the basis of one reference channel, the CLD is expressed by a power ratio of the
reference channel to each of the other channels. For example, if there are six channel
signals L, R, C, LFE, Ls and Rs, five power ratios can be obtained based on one
reference channel, and CLD1 through CLD5 correspond to levels obtained by
applying a base-10 logarithm to each of the five power ratios.
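The reference-channel scheme of FIG. 1A can be sketched as follows. This is only an illustration: the function name `clds_against_reference`, the example power values, the choice of L as the reference channel, and the 10·log10 (decibel) scaling are assumptions; the description above specifies only a base-10 logarithm of the power ratio of the reference channel to each other channel.

```python
import math

def clds_against_reference(powers, reference):
    """Return {channel: CLD} for every channel except the reference.

    CLD is taken here as 10*log10(P_reference / P_channel). The factor 10
    (decibels) is an assumption; the source only says "base-10 logarithm".
    """
    p_ref = powers[reference]
    return {
        name: 10.0 * math.log10(p_ref / p)
        for name, p in powers.items()
        if name != reference
    }

# Hypothetical sub-band powers for the six channels named in the text.
powers = {"L": 1.0, "R": 0.5, "C": 2.0, "LFE": 0.1, "Ls": 0.25, "Rs": 0.25}
clds = clds_against_reference(powers, reference="L")
for name, value in clds.items():
    print(name, round(value, 2))
```

Six channels yield five values, corresponding to CLD1 through CLD5.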
Meanwhile, as illustrated in FIG. 1B, the multi-channel signal may be divided into a
plurality of channel pairs, each of the channel pairs is analyzed as a
stereo pair, and one CLD value is extracted in each analysis step. This is carried out by
step-by-step use of a plurality of One-To-Two (OTT) modules, which take two input
channels to one output channel. In each OTT, any one of the input stereo signals is
recognized as a reference channel, and a base-10 logarithmic value of a power ratio
of the reference channel to the other channel is output as a CLD value.
The CLD value has a dynamic range between -∞ and +∞. Hence, to express
the CLD value with a finite number of bits, efficient quantization is required.
Typically, CLD quantization is performed by using a normalized quantization table.
An example of such a quantization table is given in the SAC standard document (see
page 41, Table 57). However, because not all CLD values can be expressed
with a finite number of bits, the dynamic range of the CLD value is limited to a
predetermined level or less. Thereby, quantization error is introduced, and thus
spectrum information is distorted. For example, when 5 bits are used for the CLD
quantization, the dynamic range of the CLD value will be limited to the range
between -25 dB and +25 dB.
[Disclosure]
[Technical Problem]
The present invention is directed to Channel Level Difference (CLD)
quantization and dequantization methods capable of minimizing sound deterioration
in the process of Spatial Audio Coding (SAC)-based encoding of a multi-channel
audio signal.
The present invention is also directed to CLD quantization and
dequantization methods capable of minimizing sound deterioration using advantages
of quantization of Virtual Source Location Information (VSLI), which is replaceable
with CLD, in the process of SAC-based encoding of a multi-channel audio signal.
In addition, the present invention is directed to improving quality of sound
without additional complexity by providing a VSLI-based CLD quantization table,
which can be replaced by a CLD quantization table used for CLD quantization and
dequantization in a Moving Picture Experts Group (MPEG)-4 SAC system.
[Technical Solution]
A first aspect of the present invention provides a method for quantizing a
Channel Level Difference (CLD) parameter used as a spatial parameter when Spatial
Audio Coding (SAC)-based encoding of an N-channel audio signal (N>1) is
performed. The CLD quantization method comprises the steps of extracting CLDs
for each band from the N-channel audio signal, and quantizing the CLDs by
reference to a Virtual Source Location Information (VSLI)-based CLD quantization
table designed using CLD quantization values derived from VSLI quantization
values of the N-channel audio signal.
A second aspect of the present invention provides a computer-readable
recording medium on which is recorded a computer program for performing the CLD
quantization method.
A third aspect of the present invention provides a method for encoding an N-
channel audio signal (N>1) based on Spatial Audio Coding (SAC). The method
comprises the steps of down-mixing and encoding the N-channel audio signal,
extracting spatial parameters including Channel Level Difference (CLD), Inter-
channel Correlation/Coherence (ICC), and Channel Prediction Coefficient (CPC), for
each band, from the N-channel audio signal and quantizing the extracted spatial
parameters. In the step of quantizing the extracted spatial parameters, the CLD is
quantized by reference to a VSLI-based CLD quantization table designed using CLD
quantization values derived from VSLI quantization values of the N-channel audio
signal.
A fourth aspect of the present invention provides an apparatus for encoding
an N-channel audio signal (N>1) based on Spatial Audio Coding (SAC). The
apparatus comprises an SAC encoding means down-mixing the N-channel audio
signal to generate a down-mix signal and extracting spatial parameters including
Channel Level Difference (CLD), Inter-channel Correlation/Coherence (ICC), and
Channel Prediction Coefficient (CPC), for each band, from the N-channel audio
signal, an audio encoding means generating a compressed audio bitstream from the
down-mix signal generated by the SAC encoding means, a spatial parameter
quantizing means quantizing the spatial parameters extracted by the SAC encoding
means, and a spatial parameter encoding means encoding the quantized spatial
parameter levels. The spatial parameter quantizing means quantizes the CLD by
reference to a Virtual Source Location Information (VSLI)-based CLD quantization
table designed using CLD quantization values derived from VSLI quantization
values of the N-channel audio signal.
A fifth aspect of the present invention provides a method for dequantizing an
encoded Channel Level Difference (CLD) quantization value when an encoded N-channel
audio bitstream (N>1) is decoded based on Spatial Audio Coding (SAC).
The CLD dequantization method comprises the steps of performing Huffman
decoding on the encoded CLD quantization value, and dequantizing the decoded
CLD quantization value by using a Virtual Source Location Information (VSLI)-
based CLD quantization table designed using CLD quantization values derived from
VSLI quantization values of the N-channel audio signal.
A sixth aspect of the present invention provides a computer-readable
recording medium on which is recorded a computer program for performing the CLD
dequantization method.
A seventh aspect of the present invention provides a method for decoding an
encoded N-channel audio bitstream (N>1) based on Spatial Audio Coding (SAC).
The method comprises the steps of decoding the encoded N-channel audio bitstream,
dequantizing a quantization value of at least one spatial parameter received together
with the encoded N-channel audio bitstream, and synthesizing the decoded N-channel
audio bitstream based on the dequantized spatial parameter to restore an N-channel
audio signal. In the step of dequantizing a quantization value of at least one
spatial parameter, a Channel Level Difference (CLD) included in the spatial
parameter is dequantized by reference to a Virtual Source Location Information
(VSLI)-based CLD quantization table designed using CLD quantization values
derived from VSLI quantization values of the N-channel audio signal.
An eighth aspect of the present invention provides an apparatus for decoding
an encoded N-channel audio bitstream (N>1) based on Spatial Audio Coding (SAC).
The apparatus comprises means for decoding the encoded N-channel audio bitstream,
means for decoding quantization values of at least one spatial parameter received
together with the encoded N-channel audio bitstream, means for dequantizing the
quantization values of the spatial parameter, and means for synthesizing the decoded
N-channel audio bitstream based on the dequantized spatial parameter to restore an
N-channel audio signal. The means for dequantizing the quantization value of the
spatial parameter dequantizes a Channel Level Difference (CLD) included in the
spatial parameter by reference to a Virtual Source Location Information (VSLI)-
based CLD quantization table designed using CLD quantization values derived from
VSLI quantization values of the N-channel audio signal.
[Advantageous Effects]
The VSLI-based CLD quantization table created according to the present
invention can replace the CLD quantization table used in an existing SAC system.
By using the VSLI-based CLD quantization table according to the present invention,
sound deterioration caused by CLD quantization can be minimized. In addition, by using the
Huffman codebook proposed in the present invention to compress CLD indexes,
it is possible to reduce the bit rate required to transmit the CLD.
[Description of Drawings]
FIGS. 1A and 1B conceptually illustrate a process of extracting Channel
Level Difference (CLD) values from multi-channel signals;
FIG. 2 schematically illustrates a configuration of a spatial audio coding
(SAC) system to which the present invention is to be applied;
FIGS. 3A and 3B are views for explaining a concept of VSLI serving as a
reference of CLD quantization in accordance with the present invention; and
FIG. 4 is a graph showing CLD quantization values converted from VSLI
quantization values in accordance with the present invention.
[Mode for Invention]
Hereinafter, exemplary embodiments of the present invention will be
described in detail. However, the present invention is not limited to the exemplary
embodiments disclosed below, but can be implemented in various forms. Therefore,
these exemplary embodiments are provided for complete disclosure of the present
invention and to fully convey the scope of the present invention to those of ordinary
skill in the art.
FIG. 2 schematically illustrates a configuration of a spatial audio coding
(SAC) system to which the present invention is to be applied. As illustrated, the
SAC system can be divided into an encoding part that generates, encodes and
transmits a down-mix signal and spatial parameters from an N-channel audio
signal, and a decoding part that restores the N-channel audio signal from the down-mix
signal and spatial parameters transmitted from the encoding part. The encoding
part includes an SAC encoder 210, an audio encoder 220, a spatial parameter
quantizer 230, and a spatial parameter encoder 240. The decoding part includes an
audio decoder 250, a spatial parameter decoder 260, a spatial parameter dequantizer
270, and an SAC decoder 280.
The SAC encoder 210 generates a down-mix signal from the input N-
channel audio signal and analyzes spatial characteristics of the N-channel audio
signal, thereby extracting spatial parameters such as Channel Level Difference
(CLD), Inter-channel Correlation/Coherence (ICC), and Channel Prediction
Coefficient (CPC).
Specifically, the N-channel (N > 1) audio signal input into the SAC encoder 210
is decomposed into frequency bands by means of an analysis filter bank. In order to
split a signal into sub-bands of a frequency domain with low complexity, a
quadrature mirror filter (QMF) is used. Spatial characteristics related to spatial
perception are analyzed from sub-band signals, and spatial parameters such as CLD,
ICC, and CPC are selectively extracted according to an encoding operation mode.
Further, the sub-band signals are down-mixed and converted into a down-mix signal
of a time domain by means of a QMF synthesis bank.
Alternatively, the down-mix signal may be replaced by a down-mix signal
which is pre-produced by an acoustic engineer (or an artistic/hand-mixed down-mix
signal). In this case, the SAC encoder 210 adjusts and transmits the spatial
parameters on the basis of the pre-produced down-mix signal, thereby optimizing
multi-channel restoration at the decoder.
The audio encoder 220 compresses the down-mix signal generated by the
SAC encoder 210 or the artistic down-mix signal by using an existing audio
compression technique (e.g. Moving Picture Experts Group (MPEG)-4 Advanced
Audio Coding (AAC), MPEG-4 High Efficiency Advanced Audio Coding (HE-AAC),
MPEG-4 Bit Sliced Arithmetic Coding (BSAC), etc.), thereby generating a
compressed audio bitstream.
Meanwhile, the spatial parameters generated by the SAC encoder 210 are
transmitted after being quantized and encoded by the spatial parameter quantizer 230
and the spatial parameter encoder 240. The spatial parameter quantizer 230 is
provided with a quantization table, which is to be used to quantize each of the CLD,
ICC and CPC. As described below, in order to minimize sound deterioration caused
by quantizing the CLD using an existing normalized CLD quantization table, a
Virtual Source Location Information (VSLI)-based CLD quantization table can be
used in the spatial parameter quantizer 230.
The spatial parameter encoder 240 performs entropy encoding in order to
compress the spatial parameters quantized by the spatial parameter quantizer 230,
and preferably performs Huffman encoding on quantization indexes of the spatial
parameters using a Huffman codebook. As described below, the present invention
proposes a new Huffman codebook in order to maximize transmission efficiency of
CLD quantization indexes.
The audio decoder 250 decodes the audio bitstream compressed through the
existing audio compression technique (e.g. MPEG-4 AAC, MPEG-4 HE-AAC,
MPEG-4 BSAC, etc.).
The spatial parameter decoder 260 and the spatial parameter dequantizer 270
are modules for performing the inverse of the quantization and encoding performed
by the spatial parameter quantizer 230 and the spatial parameter encoder 240. The
spatial parameter decoder 260 decodes the encoded quantization indexes of the
spatial parameters on the basis of the Huffman codebook, and the spatial parameter
dequantizer 270 obtains the spatial parameters corresponding to the quantization
indexes from the quantization table. In analogy to the quantization and encoding of
the spatial parameters, the VSLI-based CLD quantization table and the Huffman
codebook proposed in the present invention are used for the processes of decoding
and dequantization of the spatial parameters.
The SAC decoder 280 restores the N multi-channel audio signals by
synthesis of the audio bitstream decoded by the audio decoder 250 and the spatial
parameters obtained by the spatial parameter dequantizer 270. Alternatively, when
decoding of the multi-channel audio signals is impossible, only the down-mix signal
can be decoded by using an existing audio decoder, so that independent service is
possible. Therefore, the SAC system can provide compatibility with an existing
mono or stereo audio coding system.
The present invention is concerned with providing CLD
quantization capable of minimizing sound deterioration resulting from quantization
by utilizing advantages of the quantization of the VSLI, which represents a spatial audio
image of the multi-channel audio signal. The present invention is based on the fact
that, in expressing an azimuth angle of the spatial audio image, human ears have
difficulty in recognizing an error of 3° or less. The VSLI expressed with the azimuth
angle has a limited dynamic range of 90°, so that quantization error caused by
limitation of the dynamic range upon quantization can be avoided. When the CLD
quantization table is designed on the basis of the advantages of the quantization of
the VSLI, sound deterioration resulting from the quantization can be minimized.
FIGS. 3A and 3B are views for explaining a concept of VSLI serving as a
reference of CLD quantization in accordance with the present invention. FIG. 3A
illustrates a stereo speaker environment in which two speakers are located at an angle
of 60°, and FIG. 3B is a view in which a stereo audio signal in the stereo speaker
environment of FIG. 3A is represented by power of a down-mixed signal and by
VSLI. As illustrated, the stereo or multi-channel audio signal can be represented by
the magnitude vector of a down-mix audio signal and the VSLI that can be obtained
by analyzing the power of each channel of the multi-channel audio signal. The multi-channel
audio signal represented in this way can be restored by projecting the
magnitude vector according to the location vector of a sound source.
As illustrated in FIGS. 3A and 3B, assuming that power of a signal of the
left speaker is PL, power of a signal of the right speaker is PR, and angles of the left
and right speakers are AL and AR respectively, the VSLI of the sound source can be
found by Equations 1 and 2.
Equation 1

The VSLI calculated in this way has a value between AL and AR. PL and PR
can be restored from the VSLI as follows: First, the VSLI is mapped to a value,
VSLI', between 0° and 90° using a Constant Power Panning (CPP) rule, as in
Equation 3.
Equation 3

By using the VSLI' mapped in this way and power PD of the down-mixed
signal, PL and PR are calculated using Equations 4 and 5.
Equation 4

As previously described, the subject matter of the present invention concerns
applying the advantages of quantization of the VSLI to quantization of the spatial
parameter, the CLD. In the stereo speaker environment of FIG. 3A, the CLD can be
expressed as in Equation 6.
Equation 6

The CLD can be derived from the VSLI according to Equation 7.
Equation 7

Further, as defined in Equation 8 below, the CLD can be obtained by taking
the natural logarithm, instead of the base-10 logarithm, of the VSLI.
Equation 8

The CLD values obtained by Equations 7 and 8 can be directly used as
spatial parameters of a general SAC system.
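The relation between VSLI, channel powers and CLD described in this passage can be checked numerically. The sketch below assumes a constant-power-panning form, PL = PD·cos²(VSLI') and PR = PD·sin²(VSLI'), together with CLD = 10·log(PR/PL) = 20·log(tan(VSLI')). These forms are assumptions, chosen because they reproduce the tabulated values of this description (for example 4.7712 at 60° and 0.9113 at 48°); the function names are hypothetical.

```python
import math

def powers_from_vsli(vsli_deg, p_down):
    """Split down-mix power into (P_L, P_R) by constant power panning (assumed form)."""
    theta = math.radians(vsli_deg)
    return p_down * math.cos(theta) ** 2, p_down * math.sin(theta) ** 2

def cld_from_vsli(vsli_deg, natural=False):
    """CLD as 20*log(tan(VSLI')), base 10 by default, base e if natural=True."""
    t = math.tan(math.radians(vsli_deg))
    return 20.0 * (math.log(t) if natural else math.log10(t))

# The power ratio restored from the angle gives the same CLD as the direct formula.
p_l, p_r = powers_from_vsli(60.0, p_down=1.0)
cld_via_powers = 10.0 * math.log10(p_r / p_l)
print(round(cld_via_powers, 4), round(cld_from_vsli(60.0), 4))
print(round(cld_from_vsli(48.0, natural=True), 3))
```

Under these assumptions the total power PL + PR always equals PD, so the angle carries only the level difference while the down-mix power carries the level itself.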
As previously described, because the CLD has a dynamic range between -∞
and +∞, problems occur in performing quantization using a finite number of bits.
The main problem is quantization error caused by limitation of the dynamic range.
Because the full dynamic range of the CLD cannot be expressed with a finite
number of bits, the dynamic range of the CLD is limited to a predetermined level or
less. As a result, quantization error is introduced, and the spectrum information is
distorted. If 5 bits are used for the CLD quantization, the dynamic range of the CLD
is limited to between -25 dB and +25 dB.
In contrast, because the VSLI has a finite dynamic range of 90°, such
quantization error caused by limitation of the dynamic range upon quantization can
be avoided.
In one embodiment, when the VSLI is quantized with the same 5 bits that are used
for the CLD quantization and a linear quantizer is applied, the number of quantization levels
is 31 and the quantization interval is 3°. The validity of the VSLI quantization
approach can be verified from the fact that people fail to recognize a difference of 3°
or less when recognizing the spatial image of an audio signal.
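A linear quantizer of this kind can be sketched as below. The index range of -15 to 15, centered on 45°, follows the indexing of Table 1; the function names and the clamping of out-of-range input are assumptions made for illustration.

```python
def quantize_vsli(vsli_deg, step_deg=3.0, center_deg=45.0, max_index=15):
    """Map an angle in [0, 90] degrees to one of 31 indexes in [-15, 15].

    Python's round() resolves exact ties to the even integer; the source does
    not say how ties at a decision level are handled.
    """
    index = round((vsli_deg - center_deg) / step_deg)
    return max(-max_index, min(max_index, index))

def dequantize_vsli(index, step_deg=3.0, center_deg=45.0):
    """Return the reconstruction angle in degrees for an index."""
    return center_deg + step_deg * index

for angle in (0.0, 44.0, 47.0, 90.0):
    idx = quantize_vsli(angle)
    print(angle, idx, dequantize_vsli(idx))
```

With a 3° interval the reconstruction error never exceeds 1.5°, which is below the 3° threshold cited above.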
When the advantages of this VSLI quantization are applied to the CLD
quantization of the stereo coding method, the CLD quantization table used in the
existing SAC system can be replaced by a VSLI-based quantization table.
In one embodiment, quantization values of the VSLI on which 5-bit linear
quantization is performed at a quantization interval of 3° and CLD conversion levels
corresponding to the VSLI quantization values are given in Table 1.
Table 1. VSLI Quantization values and CLD values

Index  VSLI quantization  CLD value    Index  VSLI quantization  CLD value
       value (°)                              value (°)
-15    0                  -324.2604    1      48                 0.9113
-14    3                  -25.6121     2      51                 1.8326
-13    6                  -19.5676     3      54                 2.7748
-12    9                  -16.0057     4      57                 3.7497
-11    12                 -13.4505     5      60                 4.7712
-10    15                 -11.4390     6      63                 5.8567
-9     18                 -9.7645      7      66                 7.0283
-8     21                 -8.3165      8      69                 8.3165
-7     24                 -7.0283      9      72                 9.7645
-6     27                 -5.8567      10     75                 11.4390
-5     30                 -4.7712      11     78                 13.4505
-4     33                 -3.7497      12     81                 16.0057
-3     36                 -2.7748      13     84                 19.5676
-2     39                 -1.8326      14     87                 25.6121
-1     42                 -0.9113      15     90                 324.2604
0      45                 0.0000
Further, a VSLI decision level for the VSLI quantization is decided by a
middle value between neighboring quantization values. The middle value is
converted into the CLD and used as a decision level of the CLD quantization. The
VSLI-based CLD quantization decision level has a value other than the middle value
between neighboring quantization values as seen in Table 2, unlike ordinary CLD
quantization in which the decision level has the middle value between neighboring
quantization values.
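The derivation of the decision levels can be sketched as follows, assuming the conversion CLD = 20·log10(tan(VSLI)), which reproduces the values of Table 1. The names `decision_levels` and `quantize_cld` are hypothetical, and the handling of a CLD that falls exactly on a decision level is a choice of this sketch.

```python
import bisect
import math

def cld_from_vsli(vsli_deg):
    """Assumed conversion: 20*log10(tan(angle)), which matches Table 1."""
    return 20.0 * math.log10(math.tan(math.radians(vsli_deg)))

# VSLI decision levels are the midpoints 1.5, 4.5, ..., 88.5 degrees between
# neighboring 3-degree quantization values; each is converted to a CLD.
decision_levels = [cld_from_vsli(1.5 + 3.0 * k) for k in range(30)]

def quantize_cld(cld):
    """Return the index in [-15, 15] whose decision interval contains cld."""
    return bisect.bisect_right(decision_levels, cld) - 15

# The converted decision level between 84 and 87 degrees differs from the
# plain midpoint of the two neighboring CLD quantization values.
vsli_based_level = cld_from_vsli(85.5)
cld_midpoint = (cld_from_vsli(84.0) + cld_from_vsli(87.0)) / 2.0
print(round(vsli_based_level, 2), round(cld_midpoint, 2))
print(quantize_cld(0.0), quantize_cld(20.0), quantize_cld(100.0))
```

Because all 30 decision levels are finite, any CLD, however large, maps to a valid index without clipping in the CLD domain.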
FIG. 4 is a graph showing CLD quantization values converted from VSLI
quantization values in accordance with the present invention. As illustrated, when
quantizing the VSLI at a uniform angle on the basis of 45°, the decision level
between the quantized angles is the middle value between two angles. However,
when this VSLI decision level is converted into a CLD value, it can be found that the
resulting CLD decision level has a value other than the middle value between two
neighboring CLD values. Table 2 below lists the decision levels of the VSLI
quantization and the corresponding CLD values.
Table 2
Tables 3 through 7 below are VSLI-based CLD quantization tables created
by using Tables 1 and 2, wherein Table 3 gives the CLD quantization values down
to the fourth decimal place, Table 4 down to the third decimal place, Table 5 down to
the second decimal place, Table 6 down to the first decimal place, and Table 7 to the
integer.
The CLD quantization value using the VSLI can be calculated by taking a
base-10 logarithm or natural logarithm. When taking the natural logarithm, e rather
than 10 is used as the base when spectrum information is restored by using the CLD
value.
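The effect of the logarithm base on restoration can be illustrated with a short sketch. It assumes that the CLD equals 10 times the logarithm of the power ratio (equivalently 20·log(tan(VSLI))), an assumption consistent with the tabulated values; the function name is hypothetical.

```python
import math

def power_ratio_from_cld(cld, natural=False):
    """Invert CLD = 10*log(P_R / P_L) (assumed form) for base 10 or base e."""
    base = math.e if natural else 10.0
    return base ** (cld / 10.0)

# Index 5 (VSLI of 60 degrees) is listed as 4.771 in the base-10 column and
# 10.986 in the natural-logarithm column; both restore the same power ratio.
print(round(power_ratio_from_cld(4.771), 3))
print(round(power_ratio_from_cld(10.986, natural=True), 3))
```

Restoring a natural-logarithm CLD with base 10, or the reverse, would give a different power ratio, which is why the base used at the decoder has to match the table.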
Table 3. VSLI-based CLD Quantization Table (Fourth Decimal Place)

Table 4. VSLI-based CLD Quantization Table (Third Decimal Place)

Index  CLD (base-10  CLD (natural   Index  CLD (base-10  CLD (natural
       logarithm)    logarithm)            logarithm)    logarithm)
-15    -65.140       -150.000       1      0.911         2.098
-14    -25.612       -58.974        2      1.832         4.219
-13    -19.567       -45.056        3      2.774         6.389
-12    -16.005       -36.854        4      3.749         8.633
-11    -13.450       -30.970        5      4.771         10.986
-10    -11.439       -26.339        6      5.856         13.485
-9     -9.764        -22.483        7      7.028         16.183
-8     -8.316        -19.149        8      8.316         19.149
-7     -7.028        -16.183        9      9.764         22.483
-6     -5.856        -13.485        10     11.439        26.339
-5     -4.771        -10.986        11     13.450        30.970
-4     -3.749        -8.633         12     16.005        36.854
-3     -2.774        -6.389         13     19.567        45.056
-2     -1.832        -4.219         14     25.612        58.974
-1     -0.911        -2.098         15     65.140        150.000
0      0.000         0.000

Table 5. VSLI-based CLD Quantization Table (Second Decimal Place)

Index  CLD (base-10  CLD (natural   Index  CLD (base-10  CLD (natural
       logarithm)    logarithm)            logarithm)    logarithm)
-15    -65.14        -150.00        1      0.91          2.09
-14    -25.61        -58.97         2      1.83          4.21
-13    -19.56        -45.05         3      2.77          6.38
-12    -16.00        -36.85         4      3.74          8.63
-11    -13.45        -30.97         5      4.77          10.98
-10    -11.43        -26.33         6      5.85          13.48
-9     -9.76         -22.48         7      7.02          16.18
-8     -8.31         -19.14         8      8.31          19.14
-7     -7.02         -16.18         9      9.76          22.48
-6     -5.85         -13.48         10     11.43         26.33
-5     -4.77         -10.98         11     13.45         30.97
-4     -3.74         -8.63          12     16.00         36.85
-3     -2.77         -6.38          13     19.56         45.05
-2     -1.83         -4.21          14     25.61         58.97
-1     -0.91         -2.09          15     65.14         150.00
0      0.00          0.00

Table 6. VSLI-based CLD Quantization Table (First Decimal Place)

Index  CLD (base-10  CLD (natural   Index  CLD (base-10  CLD (natural
       logarithm)    logarithm)            logarithm)    logarithm)
-15    -65.1         -150.0         1      0.9           2.0
-14    -25.6         -58.9          2      1.8           4.2
-13    -19.5         -45.0          3      2.7           6.3
-12    -16.0         -36.8          4      3.7           8.6
-11    -13.4         -30.9          5      4.7           10.9
-10    -11.4         -26.3          6      5.8           13.4
-9     -9.7          -22.4          7      7.0           16.1
-8     -8.3          -19.1          8      8.3           19.1
-7     -7.0          -16.1          9      9.7           22.4
-6     -5.8          -13.4          10     11.4          26.3
-5     -4.7          -10.9          11     13.4          30.9
-4     -3.7          -8.6           12     16.0          36.8
-3     -2.7          -6.3           13     19.5          45.0
-2     -1.8          -4.2           14     25.6          58.9
-1     -0.9          -2.0           15     65.1          150.0
0      0.0           0.0

Table 7. VSLI-based CLD Quantization Table (Integer)

Index  CLD (base-10  CLD (natural   Index  CLD (base-10  CLD (natural
       logarithm)    logarithm)            logarithm)    logarithm)
-15    -65           -150           1      0             2
-14    -25           -58            2      1             4
-13    -19           -45            3      2             6
-12    -16           -36            4      3             8
-11    -13           -30            5      4             10
-10    -11           -26            6      5             13
-9     -9            -22            7      7             16
-8     -8            -19            8      8             19
-7     -7            -16            9      9             22
-6     -5            -13            10     11            26
-5     -4            -10            11     13            30
-4     -3            -8             12     16            36
-3     -2            -6             13     19            45
-2     -1            -4             14     25            58
-1     -0            -2             15     65            150
0      0             0
Next, the decision levels on the VSLI-based CLD quantization tables
classified by decimal place are given in Tables 8, 9, 10, 11 and 12.

Table 8. VSLI-based CLD Quantization Decision Levels (Fourth Decimal Place)

Table 9. VSLI-based CLD Quantization Decision Levels (Third Decimal Place)

Table 10. VSLI-based CLD Quantization Decision Levels (Second Decimal Place)

Table 11. VSLI-based CLD Quantization Decision Levels (First Decimal Place)

Table 12. VSLI-based CLD Quantization Decision Levels (Integer)

As shown in Tables 7 and 12, when the CLD quantization values and the
CLD quantization decision levels are expressed as integers by taking the base-10
logarithm, it can be seen that there is a problem that some of the CLD quantization
values are identical to some of the CLD quantization decision levels. Hence, the
CLD quantization values and decision levels using the natural logarithm are
preferably used for actual quantization. In other words, when intending to use the
VSLI-based CLD quantization table and the VSLI-based CLD quantization decision
levels, both of which are expressed as integers, the CLD quantization values are
derived by taking the natural logarithm rather than the base-10 logarithm of the VSLI.
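The overlap between integer quantization values and integer decision levels can be reproduced with a short check. The sketch recomputes both sets under the assumed conversion CLD = 20·log(tan(VSLI)) and truncates toward zero, which matches the integer entries of Table 7; it covers only indexes 0 to 14 (the negative half mirrors them), and the function name is hypothetical.

```python
import math

def integer_tables(natural):
    """Truncated CLD quantization values and decision levels for the upper half.

    Assumes CLD = 20*log(tan(angle)). Truncation rather than rounding is used
    because it matches the integer entries of Table 7.
    """
    log = math.log if natural else math.log10

    def cld(angle_deg):
        return 20.0 * log(math.tan(math.radians(angle_deg)))

    values = {math.trunc(cld(45.0 + 3.0 * k)) for k in range(15)}   # 45..87 degrees
    levels = {math.trunc(cld(46.5 + 3.0 * k)) for k in range(15)}   # 46.5..88.5 degrees
    return values, levels

for name, natural in (("base-10", False), ("natural", True)):
    values, levels = integer_tables(natural)
    print(name, "integers shared by values and decision levels:", sorted(values & levels))
```

Under these assumptions the base-10 sets share several integers, while the natural-logarithm sets are disjoint, which agrees with the preference stated above.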
The VSLI-based CLD quantization table created in this way is employed in
the spatial parameter quantizer 230 and the spatial parameter dequantizer 270 of the
SAC system illustrated in FIG. 2, so that sound deterioration resulting from the CLD
quantization error can be minimized.
Further, the present invention proposes a Huffman codebook capable of
optimizing Huffman encoding of the CLD quantization indexes derived on the basis
of the above-described VSLI-based CLD quantization table.
In the SAC system, the multi-channel audio signal is processed after being
split into sub-bands of a frequency domain by means of a filter bank. When the
multi-channel audio signal is split into 20 sub-bands, a differential coding method is
applied to a quantization index of each sub-band, thereby classifying the quantization
indexes into the quantization index of the first sub-band and the other 19 differential
indexes between neighboring sub-bands. Alternatively, they may be divided into
differential indexes between neighboring frames. A probability distribution is
calculated with respect to each of the three types of indexes classified in this way,
and then the Huffman coding method is applied to each of the three types of indexes.
Thereby, Huffman codebooks described in Tables 13 and 14 below can be obtained.
Table 13 is the Huffman codebook for the index of the first sub-band, and Table 14 is
the Huffman codebook for the other indexes between neighboring sub-bands.
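The differential coding across sub-bands described above can be sketched as follows. The function names and the example index values are hypothetical; the sketch covers only the split into a first-band index and differences between neighboring sub-bands, not the Huffman stage or differences between frames.

```python
def differential_indexes(band_indexes):
    """Return (first_index, diffs), where diffs[i] = band[i + 1] - band[i]."""
    first = band_indexes[0]
    diffs = [b - a for a, b in zip(band_indexes, band_indexes[1:])]
    return first, diffs

def restore_indexes(first, diffs):
    """Invert differential_indexes by accumulating the differences."""
    restored = [first]
    for d in diffs:
        restored.append(restored[-1] + d)
    return restored

# Hypothetical CLD quantization indexes for 20 sub-bands.
bands = [3, 3, 4, 4, 4, 2, 1, 1, 0, 0, -1, -1, -1, -2, -2, 0, 1, 1, 1, 2]
first, diffs = differential_indexes(bands)
print(first, len(diffs))
print(restore_indexes(first, diffs) == bands)
```

Neighboring sub-bands tend to have similar indexes, so the differences cluster around zero and are well suited to short Huffman codewords.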
Table 13

Table 14

In this manner, the Huffman codebooks proposed in the present invention are
employed in the spatial parameter encoder 240 and the spatial parameter decoder 260
of the SAC system illustrated in FIG. 2, so that a bit rate required to transmit the
CLD quantization indexes can be reduced.
Alternatively, when the number of bits used for Huffman encoding of the 20
sub-bands exceeds 100, 5-bit Pulse Code Modulation (PCM) coding can be
performed on each sub-band.
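The fallback rule can be sketched as below. The function name, the return format, and the example codeword lengths are hypothetical; how the chosen mode is signaled to the decoder is not described here and is left out.

```python
def choose_cld_coding(huffman_code_lengths, pcm_bits=5, threshold=100):
    """Pick Huffman coding unless its total bit count exceeds the threshold.

    huffman_code_lengths holds one codeword length per sub-band. With 20
    sub-bands and 5 bits each, PCM always costs 100 bits.
    """
    huffman_total = sum(huffman_code_lengths)
    if huffman_total > threshold:
        return "pcm", pcm_bits * len(huffman_code_lengths)
    return "huffman", huffman_total

# Hypothetical codeword lengths: one first-band codeword plus 19 differential ones.
print(choose_cld_coding([5] + [2] * 19))
print(choose_cld_coding([8] + [12] * 19))
```

The threshold of 100 bits equals the cost of 5-bit PCM for 20 sub-bands, so the rule bounds the worst-case bit count of a frame's CLD indexes.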
[Industrial Applicability]
The present invention can be provided as a computer program stored on at
least one computer-readable medium in the form of at least one product such as a
floppy disk, hard disk, CD-ROM, flash memory card, PROM, RAM, ROM, or
magnetic tape. In general, the computer program can be written in any programming
language such as C, C++, or Java.
While the invention has been shown and described with reference to certain
exemplary embodiments thereof, it will be understood by those skilled in the art that
various changes in form and details may be made therein without departing from the
spirit and scope of the invention as defined by the appended claims.
[CLAIMS]
[Claim 1]
A Channel Level Difference (CLD) quantization method for quantizing a
CLD parameter used as a spatial parameter when Spatial Audio Coding (SAC)-based
encoding of an N-channel audio signal (N>1) is performed, the CLD quantization
method comprising the steps of:
extracting CLDs for each sub-band from the N-channel audio signal; and
quantizing the CLDs by reference to a Virtual Source Location Information
(VSLI)-based CLD quantization table designed using CLD quantization values
derived from VSLI quantization values of the N-channel audio signal.
[Claim 2]
The CLD quantization method according to claim 1, wherein the VSLI
quantization value is quantized at a predetermined quantization interval within a
range between 0° and 90°.
[Claim 3]
The CLD quantization method according to claim 2, wherein the
predetermined quantization interval is 3°.
[Claim 4]
The CLD quantization method according to claim 1, wherein the CLD
quantization values are derived from the VSLI quantization values according to the
following equation:

[Claim 5]
The CLD quantization method according to claim 1, wherein the CLD
quantization values are derived from the VSLI quantization values according to the
following equation:

[Claim 6]
The CLD quantization method according to claim 1, wherein a decision level
for the CLD quantization is derived from a VSLI decision level for VSLI
quantization.
[Claim 7]
The CLD quantization method according to claim 1, wherein the VSLI-based
CLD quantization table is as follows:



[Claim 8]
The CLD quantization method according to claim 7, wherein the VSLI-based
CLD quantization table is related to the CLD quantization decision levels as follows:


[Claim 9]
The CLD quantization method according to claim 1, further comprising the
step of performing Huffman encoding on quantization indexes of the CLD.
[Claim 10]

The CLD quantization method according to claim 9, wherein the Huffman
encoding is performed on a quantization index of a first sub-band by reference to a
Huffman codebook as follows:

Index  Number of Bits  Codeword (hexadecimal)    Index  Number of Bits  Codeword (hexadecimal)
0      5               0x17                      16     5               0x1d
1      8               0x64                      17     5               0x19
2      8               0x65                      18     5               0x1c
3      8               0xf0                      19     5               0x16
4      8               0xf1                      20     5               0x18
5      7               0x33                      21     5               0x14
6      7               0x79                      22     5               0x13
7      6               0x18                      23     5               0x15
8      6               0x22                      24     5               0x1b
9      6               0x23                      25     5               0x10
10     6               0x3d                      26     5               0x0e
11     5               0x0b                      27     5               0x0f
12     5               0x12                      28     5               0x0d
13     5               0x1a                      29     5               0x0a
14     4               0x04                      30     2               0x00
15     5               0x1f
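As an illustration of how the codebook above could be applied, the following sketch turns a codebook index into its bit pattern and checks that the tabulated codewords form a prefix code. The names `FIRST_SUBBAND_CODEBOOK`, `codeword_bits`, `is_prefix_free` and `kraft_sum` are hypothetical, and the mapping from a signed CLD quantization index to a codebook index (0 to 30) is not shown here.

```python
from fractions import Fraction

# (number of bits, codeword) for codebook indexes 0..30, copied from the table above.
FIRST_SUBBAND_CODEBOOK = [
    (5, 0x17), (8, 0x64), (8, 0x65), (8, 0xF0), (8, 0xF1), (7, 0x33),
    (7, 0x79), (6, 0x18), (6, 0x22), (6, 0x23), (6, 0x3D), (5, 0x0B),
    (5, 0x12), (5, 0x1A), (4, 0x04), (5, 0x1F), (5, 0x1D), (5, 0x19),
    (5, 0x1C), (5, 0x16), (5, 0x18), (5, 0x14), (5, 0x13), (5, 0x15),
    (5, 0x1B), (5, 0x10), (5, 0x0E), (5, 0x0F), (5, 0x0D), (5, 0x0A),
    (2, 0x00),
]


def codeword_bits(index):
    """Return the codeword for a codebook index as a string of '0'/'1' characters."""
    n_bits, code = FIRST_SUBBAND_CODEBOOK[index]
    # Zero-pad to the stated length so that leading zero bits are kept.
    return format(code, "0{}b".format(n_bits))


def is_prefix_free(words):
    """True if no codeword is a proper prefix of another codeword."""
    return not any(a != b and b.startswith(a) for a in words for b in words)


def kraft_sum(codebook):
    """Sum of 2**-length over all codewords; it is at most 1 for a prefix code."""
    return sum(Fraction(1, 2 ** n_bits) for n_bits, _ in codebook)
```

For example, `codeword_bits(30)` yields the 2-bit pattern `00`, the shortest codeword in the table. A Kraft sum of exactly 1 indicates that the code leaves no bit pattern unused.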
[Claim 11]
The CLD quantization method according to claim 10, wherein the Huffman
encoding is performed on quantization indexes of the remaining sub-bands other than
the first sub-band by reference to a Huffman codebook as follows:


Index  Number of Bits  Codeword (hexadecimal)    Index  Number of Bits  Codeword (hexadecimal)
0      2               0x00003                   16     10              0x00206
1      2               0x00001                   17     10              0x00006
2      3               0x00005                   18     11              0x0040e
3      3               0x00001                   19     11              0x0000e
4      4               0x00009                   20     12              0x0081f
5      4               0x00001                   21     12              0x0001f
6      5               0x00011                   22     13              0x0103c
7      5               0x00001                   23     13              0x0003d
8      6               0x00021                   24     14              0x0207a
9      6               0x00001                   25     14              0x00079
10     7               0x00041                   26     14              0x00078
11     7               0x00001                   27     15              0x040f6
12     8               0x00080                   28     16              0x081ef
13     8               0x00000                   29     17              0x103dd
14     9               0x00102                   30     17              0x103dc
15     9               0x00002
[Claim 12]
A computer-readable recording medium on which is recorded a computer
program for performing the CLD quantization method according to any one of
claims 1 through 11.
[Claim 13]

A method for encoding an N-channel audio signal (N>1) based on Spatial
Audio Coding (SAC), the method comprising the steps of:
down-mixing and encoding the N-channel audio signal;
extracting spatial parameters including Channel Level Difference (CLD),
Inter-channel Correlation/Coherence (ICC), and Channel Prediction Coefficient
(CPC), for each sub-band, from the N-channel audio signal; and
quantizing the extracted spatial parameters,
wherein, in the step of quantizing the extracted spatial parameters, the CLD is
quantized by reference to a Virtual Source Location Information (VSLI)-based CLD
quantization table designed using CLD quantization values derived from VSLI
quantization values of the N-channel audio signal.
[Claim 14]
An apparatus for encoding an N-channel audio signal (N>1) based on Spatial
Audio Coding (SAC), the apparatus comprising:
an SAC encoding means for down-mixing the N-channel audio signal to
generate a down-mix signal, and extracting spatial parameters including Channel
Level Difference (CLD), Inter-channel Correlation/Coherence (ICC), and Channel
Prediction Coefficient (CPC), for each sub-band, from the N-channel audio signal;
an audio encoding means for generating a compressed audio bitstream from
the down-mix signal generated by the SAC encoding means;
a spatial parameter quantizing means for quantizing the spatial parameters
extracted by the SAC encoding means; and

a spatial parameter encoding means for encoding the quantized spatial
parameters,
wherein the spatial parameter quantizing means quantizes the CLD by
reference to a Virtual Source Location Information (VSLI)-based CLD quantization
table designed using CLD quantization values derived from VSLI quantization
values of the N-channel audio signal.
[Claim 15]
The apparatus according to claim 14, wherein the VSLI-based CLD
quantization table is as follows:

Index  CLD (base-10 logarithm)  CLD (natural logarithm)    Index  CLD (base-10 logarithm)  CLD (natural logarithm)
-15    -65.1                    -150.0                     1      0.9                      2.0
-14    -25.6                    -58.9                      2      1.8                      4.2
-13    -19.5                    -45.0                      3      2.7                      6.3
-12    -16.0                    -36.8                      4      3.7                      8.6
-11    -13.4                    -30.9                      5      4.7                      10.9
-10    -11.4                    -26.3                      6      5.8                      13.4
-9     -9.7                     -22.4                      7      7.0                      16.1
-8     -8.3                     -19.1                      8      8.3                      19.1
-7     -7.0                     -16.1                      9      9.7                      22.4
-6     -5.8                     -13.4                      10     11.4                     26.3
-5     -4.7                     -10.9                      11     13.4                     30.9
-4     -3.7                     -8.6                       12     16.0                     36.8
-3     -2.7                     -6.3                       13     19.5                     45.0
-2     -1.8                     -4.2                       14     25.6                     58.9
-1     -0.9                     -2.0                       15     65.1                     150.0
0      0.0                      0.0
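The base-10 column of the table above, for indexes 1 to 14, appears numerically consistent with evaluating 20·log10(tan θ) at VSLI angles spaced 3° apart around 45° and truncating the result to one decimal place. The sketch below reproduces those entries under that reading. The angle mapping, the truncation rule and the function name `cld_from_vsli_index` are assumptions inferred from the tabulated numbers, not the equations recited in the claims. The end entries (indexes -15 and 15) are excluded because the tangent expression diverges at 0° and 90°; the table uses fixed values there.

```python
import math

# Base-10 column of the table above for indexes 1..14.
TABLE_BASE10 = [0.9, 1.8, 2.7, 3.7, 4.7, 5.8, 7.0,
                8.3, 9.7, 11.4, 13.4, 16.0, 19.5, 25.6]


def cld_from_vsli_index(index, step_deg=3.0):
    """Assumed mapping: angle = 45 deg + index * step, CLD = 20 * log10(tan(angle))."""
    if not -14 <= index <= 14:
        raise ValueError("indexes -15 and 15 use fixed table values instead")
    angle = math.radians(45.0 + index * step_deg)
    cld = 20.0 * math.log10(math.tan(angle))
    # Truncate toward zero to one decimal place (assumed from the table values).
    return math.trunc(cld * 10.0) / 10.0
```

Under the same assumptions, replacing `math.log10` with `math.log` gives values matching the natural-logarithm column.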

[Claim 16]
The apparatus according to claim 15, wherein the VSLI-based CLD
quantization table is related to CLD quantization decision levels as follows:

[Claim 17]

A method for dequantizing an encoded Channel Level Difference (CLD)
quantization value when an encoded N-channel audio bitstream (N>1) is decoded
based on Spatial Audio Coding (SAC), the method comprising the steps of:
performing Huffman decoding on the encoded CLD quantization value; and
dequantizing the decoded CLD quantization value by using a Virtual Source
Location Information (VSLI)-based CLD quantization table designed using CLD
quantization values derived from VSLI quantization values of the N-channel audio
signal.
[Claim 18]
The method according to claim 17, wherein the VSLI-based CLD
quantization table is as follows:



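Index  CLD (base-10 logarithm)  CLD (natural logarithm)    Index  CLD (base-10 logarithm)  CLD (natural logarithm)
-15    -65.1                    -150.0                     1      0.9                      2.0
-14    -25.6                    -58.9                      2      1.8                      4.2
-13    -19.5                    -45.0                      3      2.7                      6.3
-12    -16.0                    -36.8                      4      3.7                      8.6
-11    -13.4                    -30.9                      5      4.7                      10.9
-10    -11.4                    -26.3                      6      5.8                      13.4
-9     -9.7                     -22.4                      7      7.0                      16.1
-8     -8.3                     -19.1                      8      8.3                      19.1
-7     -7.0                     -16.1                      9      9.7                      22.4
-6     -5.8                     -13.4                      10     11.4                     26.3
-5     -4.7                     -10.9                      11     13.4                     30.9
-4     -3.7                     -8.6                       12     16.0                     36.8
-3     -2.7                     -6.3                       13     19.5                     45.0
-2     -1.8                     -4.2                       14     25.6                     58.9
-1     -0.9                     -2.0                       15     65.1                     150.0
0      0.0                      0.0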
[Claim 19]
The method according to claim 18, wherein the VSLI-based CLD
quantization table is related to CLD quantization decision levels as follows:


[Claim 20]
The method according to claim 17, wherein in the step of performing
Huffman decoding on the encoded CLD quantization value, the CLD quantization
value of a first sub-band is decoded by reference to a Huffman codebook as follows:

Index  Number of Bits  Codeword (hexadecimal)    Index  Number of Bits  Codeword (hexadecimal)
0      5               0x17                      16     5               0x1d
1      8               0x64                      17     5               0x19
2      8               0x65                      18     5               0x1c
3      8               0xf0                      19     5               0x16
4      8               0xf1                      20     5               0x18
5      7               0x33                      21     5               0x14
6      7               0x79                      22     5               0x13
7      6               0x18                      23     5               0x15
8      6               0x22                      24     5               0x1b
9      6               0x23                      25     5               0x10
10     6               0x3d                      26     5               0x0e
11     5               0x0b                      27     5               0x0f
12     5               0x12                      28     5               0x0d
13     5               0x1a                      29     5               0x0a
14     4               0x04                      30     2               0x00
15     5               0x1f
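A decoder could read this codebook bit by bit, as in the sketch below. It is a minimal illustration under stated assumptions: the bitstream is represented as a string of '0' and '1' characters for clarity, and the names `DECODE_LOOKUP` and `decode_index` are hypothetical. Returning the first matching codeword is safe only because the tabulated codewords are prefix-free.

```python
# (number of bits, codeword) for codebook indexes 0..30, copied from the table above.
FIRST_SUBBAND_CODEBOOK = [
    (5, 0x17), (8, 0x64), (8, 0x65), (8, 0xF0), (8, 0xF1), (7, 0x33),
    (7, 0x79), (6, 0x18), (6, 0x22), (6, 0x23), (6, 0x3D), (5, 0x0B),
    (5, 0x12), (5, 0x1A), (4, 0x04), (5, 0x1F), (5, 0x1D), (5, 0x19),
    (5, 0x1C), (5, 0x16), (5, 0x18), (5, 0x14), (5, 0x13), (5, 0x15),
    (5, 0x1B), (5, 0x10), (5, 0x0E), (5, 0x0F), (5, 0x0D), (5, 0x0A),
    (2, 0x00),
]

# Map each codeword, written as a zero-padded bit string, back to its index.
DECODE_LOOKUP = {
    format(code, "0{}b".format(n_bits)): index
    for index, (n_bits, code) in enumerate(FIRST_SUBBAND_CODEBOOK)
}
MAX_CODE_LENGTH = max(n_bits for n_bits, _ in FIRST_SUBBAND_CODEBOOK)


def decode_index(bits, pos=0):
    """Decode one codebook index starting at bits[pos]; return (index, next_pos)."""
    for length in range(1, MAX_CODE_LENGTH + 1):
        candidate = bits[pos:pos + length]
        if len(candidate) < length:
            break  # input ended before a complete codeword was read
        if candidate in DECODE_LOOKUP:
            return DECODE_LOOKUP[candidate], pos + length
    raise ValueError("no codeword found at bit position {}".format(pos))
```

Calling `decode_index` repeatedly with the returned position walks through consecutive codewords in the stream.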
[Claim 21]

The method according to claim 20, wherein the Huffman decoding is
performed on quantization indexes of the remaining sub-bands other than the first
sub-band by reference to a Huffman codebook as follows:

Index  Number of Bits  Codeword (hexadecimal)    Index  Number of Bits  Codeword (hexadecimal)
0      2               0x00003                   16     10              0x00206
1      2               0x00001                   17     10              0x00006
2      3               0x00005                   18     11              0x0040e
3      3               0x00001                   19     11              0x0000e
4      4               0x00009                   20     12              0x0081f
5      4               0x00001                   21     12              0x0001f
6      5               0x00011                   22     13              0x0103c
7      5               0x00001                   23     13              0x0003d
8      6               0x00021                   24     14              0x0207a
9      6               0x00001                   25     14              0x00079
10     7               0x00041                   26     14              0x00078
11     7               0x00001                   27     15              0x040f6
12     8               0x00080                   28     16              0x081ef
13     8               0x00000                   29     17              0x103dd
14     9               0x00102                   30     17              0x103dc
15     9               0x00002
[Claim 22]
A computer-readable recording medium on which is recorded a computer
program for performing the CLD dequantization method according to any one of
claims 17 through 21.
[Claim 23]

A method for decoding an encoded N-channel audio bitstream (N>1) based
on Spatial Audio Coding (SAC), the method comprising the steps of:
decoding the encoded N-channel audio bitstream;
dequantizing quantization values of at least one spatial parameter received
together with the encoded N-channel audio bitstream; and
synthesizing the decoded N-channel audio bitstream based on the dequantized
spatial parameter to restore an N-channel audio signal,
wherein, in the step of dequantizing quantization values of at least one spatial
parameter, a CLD included in the spatial parameter is dequantized by reference to a
Virtual Source Location Information (VSLI)-based CLD quantization table designed
using CLD quantization values derived from VSLI quantization values of the N-
channel audio signal.
[Claim 24]
An apparatus for decoding an encoded N-channel audio bitstream (N>1)
based on Spatial Audio Coding (SAC), the apparatus comprising:
means for decoding the encoded N-channel audio bitstream;
means for decoding quantization values of at least one spatial parameter
received together with the encoded N-channel audio bitstream;
means for dequantizing the quantization values of the spatial parameter; and
means for synthesizing the decoded N-channel audio bitstream based on the dequantized
spatial parameter to restore an N-channel audio signal,

wherein the means for dequantizing the quantization values of the spatial
parameter dequantizes a CLD included in the spatial parameter by reference to a
Virtual Source Location Information (VSLI)-based CLD quantization table designed
using CLD quantization values derived from VSLI quantization values of the N-
channel audio signal.
[Claim 25]
The apparatus according to claim 24, wherein the VSLI-based CLD
quantization table is as follows:

Index  CLD (base-10 logarithm)  CLD (natural logarithm)    Index  CLD (base-10 logarithm)  CLD (natural logarithm)
-15    -65.1                    -150.0                     1      0.9                      2.0
-14    -25.6                    -58.9                      2      1.8                      4.2
-13    -19.5                    -45.0                      3      2.7                      6.3
-12    -16.0                    -36.8                      4      3.7                      8.6
-11    -13.4                    -30.9                      5      4.7                      10.9
-10    -11.4                    -26.3                      6      5.8                      13.4
-9     -9.7                     -22.4                      7      7.0                      16.1
-8     -8.3                     -19.1                      8      8.3                      19.1
-7     -7.0                     -16.1                      9      9.7                      22.4
-6     -5.8                     -13.4                      10     11.4                     26.3
-5     -4.7                     -10.9                      11     13.4                     30.9
-4     -3.7                     -8.6                       12     16.0                     36.8
-3     -2.7                     -6.3                       13     19.5                     45.0
-2     -1.8                     -4.2                       14     25.6                     58.9
-1     -0.9                     -2.0                       15     65.1                     150.0
0      0.0                      0.0
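Dequantization by reference to the table above amounts to a lookup by quantization index. The sketch below shows one way to do this for the base-10 column. The names `CLD_BASE10_NONNEG` and `dequantize_cld` are hypothetical, and the code relies on the sign symmetry visible in the table (the entry for index -k is the negative of the entry for index k).

```python
# Base-10 column of the table above for indexes 0..15.
CLD_BASE10_NONNEG = [0.0, 0.9, 1.8, 2.7, 3.7, 4.7, 5.8, 7.0,
                     8.3, 9.7, 11.4, 13.4, 16.0, 19.5, 25.6, 65.1]


def dequantize_cld(index):
    """Return the tabulated CLD value for a quantization index in -15..15."""
    if not -15 <= index <= 15:
        raise ValueError("CLD quantization index must lie in -15..15")
    value = CLD_BASE10_NONNEG[abs(index)]
    # Negative indexes mirror the positive entries with the sign flipped.
    return -value if index < 0 else value
```

Storing only the non-negative half keeps the table to 16 entries; a decoder working with the natural-logarithm column would swap in that column's values.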

[Claim 26]
The apparatus according to claim 25, wherein the VSLI-based CLD
quantization table is related to CLD quantization decision levels as follows:




Patent Number 279072
Indian Patent Application Number 4847/KOLNP/2007
PG Journal Number 02/2017
Publication Date 13-Jan-2017
Grant Date 10-Jan-2017
Date of Filing 12-Dec-2007
Name of Patentee ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
Applicant Address 161 GAJEONG-DONG, YUSEONG-GU DAEJEON
Inventors:
# Inventor's Name Inventor's Address
1 KANG KYEONG OK SANSUNG PURUN APT. 101-605, JEONMIN-DONG, YUSEONG-GU, DAEJEON 305-727
2 CHON SANG BAE 302, 701-8, BONGCHEON 1-DONG, GWANAK-GU, SEOUL 151-051
3 SUNG KEONG MO GEUMHO PARK VILLAGE 203, 602-63, NAMHYEON-DONG, GWANAK-GU, SEOUL 151-801
4 SEO JEONG IL SEJONG APT. 107-801, JEONMIN-DONG, YUSEONG-GU, DAEJEON 305-728
5 HONG JIN WOO HANBIT APT. 130-702, EOEUN-DONG, YUSEONG-GU, DAEJEON 305-755
6 KIM KWANG KI DORMITORY OF INFORMATION AND COMMUNICATIONS UNIVERSITY, HWAAM-DONG, YUSEONG-GU, DAEJEON 305-348
7 BEACK SEUNG KWON DORMITORY OF INFORMATION AND COMMUNICATIONS UNIVERSITY, HWAAM-DONG, YUSEONG-GU, DAEJEON 305-348
8 HAHN MIN SOO EXPO APT 303-1103, JEONMIN-DONG, YUSEONG-GU, DAEJEON 305-761
PCT International Classification Number G11B 20/10
PCT International Application Number PCT/KR2006/002824
PCT International Filing date 2006-07-19
PCT Conventions:
# PCT Application Number Date of Convention Priority Country
1 102005-0096256 2005-10-12 Republic of Korea
2 102006-0066822 2006-07-18 Republic of Korea
3 102005-0065515 2005-07-19 Republic of Korea