Title of Invention

AN ENCODER, A DECODER AND METHOD FOR ENCODING/DECODING AN AUDIO SIGNAL

Abstract The present invention proposes a new method and a new apparatus for enhancement of audio source coding systems utilizing high frequency reconstruction (HFR). It utilizes a detection mechanism (703a) on the encoder side to assess what parts of the spectrum will not be correctly reproduced by the HFR method in the decoder. Information on this is efficiently coded (703b) and sent to the decoder, where it is combined with the output of the HFR unit.
Full Text METHODS FOR IMPROVING HIGH FREQUENCY RECONSTRUCTION
TECHNICAL FIELD
The present invention relates to source coding systems utilising high frequency reconstruction
(HFR) such as Spectral Band Replication, SBR [WO 98/57436] or related methods. It improves
performance of both high quality methods (SBR), as well as low quality copy-up methods [U.S.
Pat. 5,127,054]. It is applicable to both speech coding and natural audio coding systems.
BACKGROUND OF THE INVENTION
High frequency reconstruction (HFR) is a relatively new technology to enhance the quality of
audio and speech coding algorithms. To date it has been introduced for use in speech codecs,
such as the wideband AMR coder for 3rd generation cellular systems, and audio coders such as
mp3 or AAC, where the traditional waveform codecs are supplemented with the high frequency
reconstruction algorithm SBR (resulting in mp3PRO or AAC+SBR).
High frequency reconstruction is a very efficient method to code high frequencies of audio and
speech signals. As it cannot perform coding on its own, it is always used in combination with a
normal waveform based audio coder (e.g. AAC, mp3) or a speech coder. These are responsible
for coding the lower frequencies of the spectrum. The basic idea of high frequency reconstruction
is that the higher frequencies are not coded and transmitted, but reconstructed in the decoder
based on the lower spectrum with help of some additional parameters (mainly data describing the
high frequency spectral envelope of the audio signal) which are transmitted in a low bit rate bit
stream, which can be transmitted separately or as ancillary data of the base coder. The additional
parameters could also be omitted, but as of today the quality reachable by such an approach will
be worse compared to a system using additional parameters.

For audio coding in particular, HFR significantly improves the coding efficiency, especially in the
quality range "sounds good, but is not transparent". There are two main reasons for this:
• Traditional waveform codecs such as mp3 need to reduce the audio bandwidth for very low
bitrates, since otherwise the artefact level in the spectrum becomes too high. HFR regenerates
those high frequencies at very low cost and with good quality. Since HFR allows a low-cost
way to create high frequency components, the audio bandwidth coded by the audio coder can
be further reduced, resulting in fewer artefacts and better worst case behaviour of the total
system.
• HFR can be used in combination with downsampling in the encoder / upsampling in the
decoder. In this frequently used scenario the HFR encoder analyses the full bandwidth audio
signal, but the signal fed into the audio coder is sampled down to a lower sampling rate. A
typical example is HFR rate at 44.1 kHz, and audio coder rate at 22.05 kHz. Running the
audio encoder at a low sampling rate is an advantage, because it is usually more efficient at
the lower sampling rate. At the decoding side, the decoded low sample rate audio signal is
upsampled and the HFR part is added - thus frequencies up to the original Nyquist frequency
can be generated although the audio coder runs at e.g. half the sampling rate.
A basic parameter for a system using HFR is the so-called cross over frequency (COF), i.e. the
frequency where normal waveform coding stops and the HFR frequency range begins. The
simplest arrangement is to have the COF at a constant frequency. A more advanced solution that
has been introduced already is to dynamically adjust the COF to the characteristics of the signal
to be coded.
A main problem with HFR is that an audio signal may contain components in higher frequencies
which are difficult to reconstruct with the current HFR method, but could more easily be
reproduced by other means, e.g. by waveform coding methods or by synthetic signal generation.
A simple example is coding of a signal only consisting of a sine wave above the COF, Fig. 1.
Here the COF is 5.5kHz. As there is no useful signal available in the low frequencies, the HFR
method, based on extrapolating the lowband to obtain a highband, will not generate any signal.

Accordingly, the sine wave signal cannot be reconstructed. Other means are needed to code this
signal in a useful way. In this simple case, HFR systems providing flexible adjustment of COF
can already solve the problem to some extent. If the COF is set above the frequency of the sine
wave, the signal can be coded very efficiently using the core coder. This assumes, however, that
it is possible to do so, which might not always be the case. As mentioned earlier, one of the main
advantages of combining HFR with audio coding is the fact that the core coder can run at half the
sampling rate (giving higher compression efficiency). In a realistic scenario, such as a 44.1 kHz
system with the core running at 22.05kHz, such a core coder can only code signals up to around
10.5 kHz. However, apart from that, the problem gets significantly more complicated even for
parts of the spectrum within the reach of the core coder when considering more complex signals.
Real world signals may e.g. contain audible sine wave-like components at high frequencies
within a complex spectrum (e.g. little bells), Fig. 2. Adjusting the COF is not a solution in this
case, as most of the gain achieved by the HFR method would diminish by using the core coder
for a much larger part of the spectrum.
SUMMARY OF THE INVENTION
A solution to the problems outlined above, and the subject of this invention, is therefore a
highly flexible HFR system that not only allows the COF to be changed, but also allows a much more
flexible composition of the decoded/reconstructed spectrum by a frequency selective
composition of different methods.
The basis for the invention is a mechanism in the HFR system enabling a frequency dependent
selection of different coding or reconstruction methods. This could be done for example with the
64 band filter bank analysis/synthesis system as used in SBR. A complex filter bank providing
alias free equalisation functions can be especially useful.
The main inventive step is that the filter bank is now used not only to serve as a filter for the
COF and the following envelope adjustment. It is also used in a highly flexible way to select the
input for each of the filter bank channels out of the following sources:

waveform coding (using the core coder);
transposition (with following envelope adjustment);
waveform coding (using additional coding beyond Nyquist);
parametric coding;
any other coding/reconstruction method applicable in certain parts of the spectrum;
or any combination thereof.
Thus, waveform coding, other coding methods and HFR reconstruction can now be used in any
arbitrary spectral arrangement to achieve the highest possible quality and coding gain. It should
be evident however, that the invention is not limited to the use of a subband filterbank, but it can
of course be used with arbitrary frequency selective filtering.
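By way of illustration only, the frequency selective composition described above can be sketched as follows. The sketch assumes that every filter bank channel takes its input from exactly one source, although combinations of sources are equally possible. The names Source and compose_channels and the eight-channel example values are hypothetical and are not part of the described system.

```python
from enum import Enum


class Source(Enum):
    """Hypothetical labels for the per-channel input sources."""
    CORE = "waveform coding by the core coder"
    HFR = "transposition with envelope adjustment"
    EXTRA_WAVEFORM = "additional waveform coding"
    PARAMETRIC = "parametric coding"


def compose_channels(selection, candidates):
    """Pick, for every filter bank channel k, the value from the selected source."""
    return [candidates[src][k] for k, src in enumerate(selection)]


# Assumed example: COF between channels 3 and 4, and one tonal component
# in channel 6 that the HFR method cannot recreate from the lowband.
selection = [Source.CORE] * 4 + [Source.HFR, Source.HFR, Source.PARAMETRIC, Source.HFR]
candidates = {
    Source.CORE: [1.0, 0.8, 0.6, 0.5, 0.0, 0.0, 0.0, 0.0],
    Source.HFR: [0.0, 0.0, 0.0, 0.0, 0.3, 0.2, 0.2, 0.1],
    Source.PARAMETRIC: [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.9, 0.0],
}
composed = compose_channels(selection, candidates)
```

In this sketch the selection list plays the role of the side information that tells the decoder which method feeds which channel.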
The present invention comprises the following features:
a HFR method utilising the available lowband in said decoder to extrapolate a highband;
on the encoder side, using the HFR method to assess, within different frequency regions,
where the HFR method does not, based on the frequency range below COF, correctly
generate a spectral line or spectral lines similar to the spectral line or spectral lines of the
original signal;
coding the spectral line or spectral lines, for the different frequency regions;
transmitting the coded spectral line or spectral lines for the different frequency regions from
the encoder to the decoder;

decoding the spectral line or spectral lines;
adding the decoded spectral line or spectral lines to the different frequency regions of
the output from the HFR method in the decoder;
the coding is a parametric coding of said spectral line or spectral lines;
the coding is a waveform coding of said spectral line or spectral lines;
the spectral line or spectral lines, parametrically coded, are synthesised using a subband
filterbank;
the waveform coding of the spectral line or spectral lines is done by the underlying core
coder of the source coding system;
the waveform coding of the spectral line or spectral lines is done by an arbitrary waveform
coder.
BRIEF DESCRIPTION OF THE ACCOMPANYING DRAWINGS
The present invention will now be described by way of illustrative examples, not limiting the
scope or spirit of the invention, with reference to the accompanying drawings, in which:
Fig. 1 illustrates spectrum of original signal with only one sine above a 5.5kHz COF;
Fig. 2 illustrates spectrum of original signal containing bells in pop-music;
Fig. 3 illustrates detection of missing harmonics using prediction gain;

Fig. 4 illustrates the spectrum of an original signal;
Fig. 5 illustrates the spectrum without the present invention;
Fig. 6 illustrates the output spectrum with the present invention;
Fig. 7 illustrates a possible encoder implementation of the present invention;
Fig. 8 illustrates a possible decoder implementation of the present invention;
Fig. 9 illustrates a schematic diagram of an inventive encoder;
Fig. 10 illustrates a schematic diagram of an inventive decoder;
Fig. 11 is a diagram showing the organisation of the spectral range into scale factor bands and
channels in relation to the cross-over frequency and the sampling frequency; and
Fig. 12 is the schematic diagram for the inventive decoder in connection with an HFR
transposition method based on a filter bank approach.
DESCRIPTION OF PREFERRED EMBODIMENTS
The below-described embodiments are merely illustrative for the principles of the present
invention for improvement of high frequency reconstruction systems. It is understood that
modifications and variations of the arrangements and the details described herein will be apparent
to others skilled in the art. It is the intent, therefore, to be limited only by the scope of the
impending patent claims and not by the specific details presented by way of description and
explanation of the embodiments herein.
Fig. 9 illustrates an inventive encoder. The encoder includes a core coder 702. It is to be noted
here that the inventive method can also be used as a so-called add-on module for an existing core
coder. In this case, the inventive encoder includes an input for receiving an encoded input signal
output by a separate, stand-alone core coder 702.
The inventive encoder in Fig. 9 additionally includes a high frequency regeneration block 703c, a
difference detector 703a, a difference describer block 703b as well as a combiner 705.
In the following, the functional interdependence of the above-referenced means will be described.
In particular the inventive encoder is for encoding an audio signal input at an audio signal input
900 to obtain an encoded signal. The encoded signal is intended for decoding using a high
frequency regenerating technique which is suited for generating frequency components above a
predetermined frequency which is also called the cross-over frequency, based on the frequency
components below the predetermined frequency.
It is to be noted here that, as a high frequency regeneration technique, any of a broad variety of
techniques that have become known recently can be used. In this regard, the term "frequency
component" is to be understood in a broad sense. This term at least includes spectral coefficients
obtained by means of a time domain/frequency domain transform such as an FFT, an MDCT or
another transform. Additionally, the term "frequency component" also includes band pass signals,
i.e. signals obtained at the output of frequency-selective filters such as a low pass filter, a band
pass filter or a high pass filter.
Irrespective of whether the core coder 702 is part of the inventive encoder, or whether
the inventive encoder is used as an add-on module for an existing core coder, the encoder
includes means for providing an encoded input signal, which is a coded representation of an input
signal, and which is coded using a coding algorithm. In this regard, it is to be remarked that the
input signal represents a frequency content of the audio signal below a predetermined frequency,
i.e. below the so-called cross-over frequency. To illustrate the fact that the frequency content of
the input signal only includes a low-band part of the audio signal, a low pass filter 902 is shown
in Fig. 9. The inventive encoder can indeed have such a low pass filter. Alternatively, such a low
pass filter can be included in the core coder 702. As a further alternative, a core coder can perform
the function of discarding a frequency band of the audio signal by any other known means.
At the output of the core coder 702, an encoded input signal is present which, with regard to its
frequency content, is similar to the input signal but is different from the audio signal in that the
encoded input signal does not include any frequency components above the predetermined
frequency.
The high frequency regeneration block 703c is for performing the high frequency regeneration
technique on the input signal, i.e., the signal input into the core coder 702, or on a coded and
again decoded version thereof. In case the latter alternative is selected, the inventive encoder also
includes a core decoder 903 that receives the encoded input signal from the core coder and
decodes this signal so that exactly the same situation is obtained that is present at the
decoder/receiver side, on which a high frequency regeneration technique is to be performed for
enhancing the audio bandwidth for encoded signals that have been transmitted using a low bit
rate.
The HFR block 703c outputs a regenerated signal that has frequency components above the
predetermined frequency.
As it is shown in Fig. 9, the regenerated signal output by the HFR block 703c is input into a
difference detector means 703a. On the other hand, the difference detector means also receives
the original audio signal input at the audio signal input 900. The means for detecting differences
between the regenerated signal from the HFR block 703c and the audio signal from the input 900
is arranged for detecting differences between those signals that are above a predetermined
significance threshold. Several examples of preferred thresholds functioning as significance
thresholds are described below.
The difference detector output is connected to an input of a difference describer block 703b. The
difference describer block 703b is for describing detected differences in a certain way to obtain
additional information on the detected differences. This additional information is suitable for
being input into a combiner means 705 that combines the encoded input signal, the additional
information and several other signals that may be produced to obtain an encoded signal to be
transmitted to a receiver or to be stored on a storage medium. A prominent example of such
further information is the spectral envelope information produced by a spectral envelope
estimator 704. The spectral envelope estimator 704 is arranged for providing a spectral envelope
information of the audio signal above the predetermined frequency, i.e., above the cross-over
frequency. This spectral envelope information is used in a HFR module on the decoder side to
synthesize spectral components of a decoded audio signal above the predetermined frequency.
In a preferred embodiment of the present invention, the spectral envelope estimator 704 is
arranged for providing only a coarse representation of the spectral envelope. In particular, it is
preferred to provide only one spectral envelope value for each scale factor band. The use of scale
factor bands is known to those skilled in the art. In connection with transform coders such as
MP3 or MPEG-AAC, a scale factor band includes several MDCT lines. The detailed assignment
of spectral lines to scale factor bands is standardized, but may vary.
Generally, a scale factor band includes several spectral lines (for example MDCT lines, wherein
MDCT stands for modified discrete cosine transform), or bandpass signals, the number of which
varies from scale factor band to scale factor band. Generally, one scale factor band includes
more than two and normally more than ten or twenty spectral lines or band pass signals.
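The coarse envelope representation, one value per scale factor band, can be illustrated by the following minimal sketch. It assumes that the envelope value is simply the summed energy of the lines in a band; the function name band_energies, the band edges and the example values are hypothetical.

```python
def band_energies(lines, band_edges):
    """Return one energy value per scale factor band.

    lines      : spectral line or band pass sample values (real or complex)
    band_edges : ascending indices; band b covers lines[band_edges[b]:band_edges[b + 1]]
    """
    return [
        sum(abs(v) ** 2 for v in lines[lo:hi])
        for lo, hi in zip(band_edges, band_edges[1:])
    ]


# Assumed example: three bands of different widths (2, 4 and 2 lines).
lines = [1.0, 2.0, 0.5, 0.5, 0.5, 0.5, 3.0, 0.0]
edges = [0, 2, 6, 8]
envelope = band_energies(lines, edges)
```

Note that the band containing the single strong line (value 3.0) reports only a total energy, so the envelope alone does not reveal whether that energy is tonal or noise-like.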
In accordance with a preferred embodiment of the present invention, the inventive encoder
additionally includes a variable cross-over frequency. The control of the cross-over frequency is
performed by the inventive difference detector 703a. The control is arranged such that, when the
difference detector comes to the conclusion that a higher cross-over frequency would contribute
significantly to reducing artefacts that would be produced by a pure HFR, the difference detector
can instruct the low pass filter 902 and the spectral envelope estimator 704 as well as the core
coder 702 to move the cross-over frequency to a higher frequency, thereby extending the bandwidth
of the encoded input signal.
On the other hand, the difference detector can also be arranged for reducing the cross-over
frequency in case it finds out that a certain bandwidth below the cross-over frequency is
acoustically not important and can, therefore, easily be produced by an HFR synthesis in the
decoder rather than having to be directly coded by the core coder.
Bits that are saved by decreasing the cross-over frequency can, on the other hand, be used for the
case in which the cross-over frequency has to be increased, so that a kind of bit-saving option is
obtained, as is known from psychoacoustic coding methods. In these methods, mainly tonal
components that are hard to encode, i.e., that need many bits to be coded without artefacts, can
consume more bits when, on the other hand, white, noise-like signal portions that are easy to code,
i.e., that need only a low number of bits for being coded without artefacts, are also present in the
signal and are recognized by a certain bit-saving control.
To summarize, the cross-over frequency control is arranged for increasing or decreasing the
predetermined frequency, i.e., the cross-over frequency, in response to findings made by the
difference detector which, in general, assesses the effectiveness and performance of the HFR
block 703c to simulate the actual situation in a decoder.
Preferably, the difference detector 703a is arranged for detecting spectral lines in the audio signal
that are not included in the regenerated signal. To do this, the difference detector preferably
includes a predictor for performing prediction operations on the regenerated signal and the audio
signal, and means for determining a difference in obtained prediction gains for the regenerated
signal and the audio signal. In particular, frequency-related portions in the regenerated signal or
in the audio signal are determined, in which a difference in predictor gains is larger than the gain
threshold which is the significance threshold in this preferred embodiment.
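The thresholding step described above can be sketched as follows. The sketch assumes that a tonality value per filter bank channel is already available for the audio signal and for the regenerated signal, compares the strongest value per band in dB, and uses an arbitrary gain threshold of 10 dB. The function name missing_line_bands, the threshold and the example values are assumptions.

```python
import math


def missing_line_bands(q_orig, q_hfr, band_edges, threshold_db=10.0):
    """Return the indices of bands whose strongest tonality value drops by more
    than threshold_db when going from the original to the regenerated signal."""
    flagged = []
    for b, (lo, hi) in enumerate(zip(band_edges, band_edges[1:])):
        # A small floor avoids log10(0) for completely noise-like bands.
        peak_orig = 10.0 * math.log10(max(max(q_orig[lo:hi]), 1e-12))
        peak_hfr = 10.0 * math.log10(max(max(q_hfr[lo:hi]), 1e-12))
        if peak_orig - peak_hfr > threshold_db:
            flagged.append(b)
    return flagged


# Assumed example: channel 3 is strongly tonal in the original only.
q_orig = [1.0, 2.0, 1.0, 500.0, 1.0, 1.0]
q_hfr = [1.0, 2.0, 1.0, 1.5, 1.0, 1.0]
flagged = missing_line_bands(q_orig, q_hfr, band_edges=[0, 3, 6])
```

Only the second band is flagged here, since its peak differs by roughly 25 dB while the first band is identical in both signals.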
It is to be noted here that the difference detector 703a preferably works as a frequency-selective
element in that it assesses corresponding frequency bands in the regenerated signal on the one
hand and the audio signal on the other hand. To this end, the difference detector can include time-
frequency conversion elements for converting the audio signal and the regenerated signal. In case
the regenerated signal produced by the HFR block 703c is already present as a frequency-related
representation, which is the case in the preferred high frequency regeneration method applied for
the present invention, no such time domain/frequency domain conversion means are necessary.

In case a time domain/frequency domain conversion element has to be used, for example for
converting the audio signal, which is normally a time-domain signal, a filter bank approach is
preferred. An analysis filter bank includes a bank of suitably dimensioned adjacent band pass
filters, where each band pass filter outputs a band pass signal having a bandwidth defined by the
bandwidth of the respective band pass filter. The band pass signal can be interpreted as a
time-domain signal having a restricted bandwidth compared to the signal from which it has been
derived. The centre frequency of a band pass signal is defined by the location of the respective
band pass filter in the analysis filter bank, as is known in the art.
As will be described later, the preferred method for determining differences above a
significance threshold is a determination based on tonality measures and, in particular, on a tonal
to noise ratio, since such methods are suited to finding spectral lines or
noise-like portions in signals in a robust and efficient manner.
Detection of spectral lines to be coded
In order to be able to code the spectral lines that will be missing in the decoded output after HFR,
it is essential to detect these in the encoder. In order to accomplish this, a suitable synthesis of the
subsequent decoder HFR needs to be performed in the encoder. This does not imply that the
output of this synthesis needs to be a time domain output signal similar to that of the decoder. It
is sufficient to observe and synthesise an absolute spectral representation of the HFR in the
decoder. This can be accomplished by using prediction in a QMF filterbank with subsequent
peak-picking of the difference in prediction gain between the original and a HFR counterpart.
Instead of peak-picking of the difference in prediction gain, differences of the absolute spectrum
can also be used. For both methods the frequency dependent prediction gain or the absolute
spectrum of the HFR are synthesised by simply re-arranging the frequency distribution of the
components similar to what the HFR will do in the decoder.
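The re-arrangement mentioned above can be illustrated with a deliberately simple sketch. It assumes a plain copy-up patching in which the lowband channel values are repeated consecutively above the COF; an actual HFR method may map channels differently. The function name simulate_hfr_copy_up and the example values are hypothetical.

```python
def simulate_hfr_copy_up(values, cof_channel):
    """Predict the per-channel representation of the HFR output by repeating
    the lowband values (channels below cof_channel) in the highband.

    `values` may hold any per-channel measure, e.g. a tonality value or an
    absolute spectrum value. cof_channel must be at least 1.
    """
    lowband = list(values[:cof_channel])
    out = list(lowband)
    for k in range(cof_channel, len(values)):
        # Assumed patching rule: wrap around the lowband as often as needed.
        out.append(lowband[(k - cof_channel) % cof_channel])
    return out


# Assumed example: the original is tonal in channels 0 and 6, COF at channel 4.
original = [5, 1, 1, 1, 1, 1, 7, 1]
simulated = simulate_hfr_copy_up(original, cof_channel=4)
```

Comparing the two lists shows both effects of interest: the tone in channel 6 is absent from the simulated output, and a tone that was not in the original appears in channel 4.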

Once the two representations are obtained, the original signal and the synthesised HFR signal,
the detection can be done in several ways.
In a QMF filterbank, linear prediction of low order, e.g. LPC order 2, can be performed for the
different channels. Given the energy of the predicted signal and the total energy of the signal, the
tonal to noise ratio can be defined according to

$$ q = \frac{\Psi - E}{E}, \qquad \Psi = \sum_{n=0}^{N-1} |x(n)|^2 , $$

where $\Psi$ is the energy of the signal block, and E is the energy of the prediction error block, for a given
filterbank channel. This can be calculated for the original signal and, given that, a representation
of the tonal to noise ratio for different frequency bands in the HFR output in the decoder can
be obtained. The difference between the two on an arbitrary frequency selective basis (larger than
the frequency resolution of the QMF) can thus be calculated. This difference vector, representing
the difference of tonal to noise ratios between the original and the expected output from the
HFR in the decoder, is subsequently used to determine where an additional coding method is
required in order to compensate for the shortcomings of the given HFR technique, see Fig. 3. Here
the tonal to noise ratio corresponding to the frequency range between subband filterbank bands 15
and 41 is displayed for the original and a synthesised HFR output. The grid displays the scalefactor
bands of the frequency range, grouped in a Bark-scale manner. For every scalefactor band the
difference between the largest components of the original and the HFR output is calculated and
displayed in the third plot.
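A tonal to noise ratio of this kind can be sketched for one filterbank channel as follows. The sketch takes the ratio as predicted energy over prediction error energy, uses an order-2 predictor fitted with the covariance method, and adds a small regularisation so that the 2x2 system remains solvable for a pure tone. The function name tonal_to_noise_ratio, the regularisation constant and the test signals are assumptions rather than details taken from the text.

```python
import cmath
import random


def tonal_to_noise_ratio(x, reg=1e-6):
    """Return (psi - err) / err for one block of (complex) subband samples,
    where psi is the block energy and err the order-2 prediction error energy."""
    def corr(i, j):
        # Correlation over the samples that have two predecessors.
        return sum(x[n - i] * x[n - j].conjugate() for n in range(2, len(x)))

    psi = corr(0, 0).real
    if psi == 0.0:
        return 0.0  # silent block: nothing tonal to report

    # Normal equations of the least-squares predictor a1*x[n-1] + a2*x[n-2].
    c11 = corr(1, 1).real * (1.0 + reg)
    c22 = corr(2, 2).real * (1.0 + reg)
    c12, c21 = corr(1, 2), corr(2, 1)
    c01, c02 = corr(0, 1), corr(0, 2)
    det = c11 * c22 - c21 * c12
    if abs(det) == 0.0:
        return 0.0
    a1 = (c01 * c22 - c21 * c02) / det
    a2 = (c11 * c02 - c01 * c12) / det

    err = sum(abs(x[n] - a1 * x[n - 1] - a2 * x[n - 2]) ** 2 for n in range(2, len(x)))
    err = max(err, 1e-12 * psi)  # floor keeps the ratio finite for perfect prediction
    return (psi - err) / err


# Assumed test signals: a pure complex tone and seeded white noise.
tone = [cmath.exp(1j * 0.7 * n) for n in range(32)]
rng = random.Random(1)
noise = [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(64)]
q_tone = tonal_to_noise_ratio(tone)
q_noise = tonal_to_noise_ratio(noise)
```

The tone is almost perfectly predictable and yields a very large ratio, whereas the noise block yields a ratio well below one, which is the contrast the detection relies on.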
The above detection can also be performed using an arbitrary spectral representation of the
original and of a synthesised HFR output, for instance by peak-picking in an absolute spectrum
["Extraction of spectral peak parameters using a short-time Fourier transform modeling [sic]
and no sidelobe windows" Ph Depalle, T Helie, IRCAM] or similar methods, and then comparing
the tonal components detected in the original with the components detected in the synthesised
HFR output.
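As a rough illustration of the comparison of detected components, the following sketch uses a plain local-maximum peak picker, which is much simpler than the cited method and is swapped in here only for brevity. The function names pick_peaks and missing_peaks, the magnitude floor and the position tolerance are assumptions.

```python
def pick_peaks(magnitudes, floor):
    """Return the indices of local maxima that exceed `floor`."""
    return [
        k for k in range(1, len(magnitudes) - 1)
        if magnitudes[k] > floor
        and magnitudes[k] > magnitudes[k - 1]
        and magnitudes[k] >= magnitudes[k + 1]
    ]


def missing_peaks(orig, hfr, floor, tolerance=1):
    """Return peaks of the original that have no HFR peak within `tolerance` bins."""
    hfr_peaks = pick_peaks(hfr, floor)
    return [
        k for k in pick_peaks(orig, floor)
        if all(abs(k - j) > tolerance for j in hfr_peaks)
    ]


# Assumed example: the peak at bin 6 is not recreated in the synthesised HFR output.
orig_spectrum = [0.1, 0.2, 3.0, 0.2, 0.1, 0.1, 4.0, 0.1]
hfr_spectrum = [0.1, 0.2, 3.0, 0.2, 0.1, 0.2, 0.3, 0.1]
missing = missing_peaks(orig_spectrum, hfr_spectrum, floor=1.0)
```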
When a spectral line has been deemed missing from the HFR output, it needs to be coded
efficiently, transmitted to the decoder and added to the HFR output. Several approaches can be
used, e.g. interleaved waveform coding or parametric coding of the spectral line.
QMF/hybrid filterbank, interleaved wave form coding.
If the spectral line to be coded is situated below FS/2 of the core coder, it can be coded by the
core coder itself. This means that the core coder codes the entire frequency range up to COF and
also a defined frequency range surrounding the tonal component that will not be reproduced by the
HFR in the decoder. Alternatively, the tonal component can be coded by an arbitrary waveform
coder. With this approach the system is not limited by the FS/2 of the core coder, but can operate
on the entire frequency range of the original signal.
To this end, the core coder control unit 910 is provided in the inventive encoder. In case the
difference detector 703a determines a significant peak above the predetermined frequency but
below half the value of the sampling frequency (FS/2), it addresses the core coder 702 to core-
encode a band pass signal derived from the audio signal, wherein the frequency band of the band
pass signal includes the frequency, where the spectral line has been detected, and, depending on
the actual implementation, also a specific frequency band, which embeds the detected spectral
line. To this end, the core coder 702 itself or a controllable band pass filter within the core coder
filters the relevant portion out of the audio signal, which is directly forwarded to the core coder as
it is shown by a dashed line 912.
In this case, the core coder 702 works as the difference describer 703b in that it codes the spectral
line above the cross-over frequency that has been detected by the difference detector. The
additional information obtained by the difference describer 703b, therefore, corresponds to the
encoded signal output by the core coder 702 that relates to the certain band of the audio signal
above the predetermined frequency but below half the value of the sampling frequency (FS/2).
To better illustrate the frequency scheduling mentioned before, reference is made to Fig. 11. Fig.
11 shows the frequency scale starting from a 0 frequency and extending to the right in Fig. 11. At
a certain frequency value, one can see the predetermined frequency 1100, which is also called the
cross-over frequency. Below this frequency, the core coder 702 from Fig. 9 is active to produce
the encoded input signal. Above the predetermined frequency, only the spectral envelope
estimator 704 is active to obtain for example one spectral envelope value for each scale factor
band. From Fig. 11, it becomes clear that a scale factor band includes several channels which in
case of known transform coders correspond to frequency coefficients or band pass signals. Fig.
11 is also useful for showing the synthesis filter bank channels from the synthesis filter bank of
Fig. 12 that will be described later. Additionally, reference is made to half the value of the
sampling frequency FS/2, which is, in the case of Fig. 11, above the predetermined frequency.
In case a detected spectral line is above FS/2, the core coder 702 cannot work as the difference
describer 703b. In this case, as outlined above, completely different coding algorithms have
to be applied in the difference describer for coding, i.e. for obtaining additional information on,
spectral lines in the audio signal that will not be reproduced by an ordinary HFR technique.
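The frequency scheduling described above amounts to a simple decision per detected spectral line, which may be sketched as follows. The sketch assumes the theoretical limit FS/2 of the core coder as the boundary; the function name choose_line_coder and the returned labels are hypothetical.

```python
def choose_line_coder(line_hz, cof_hz, core_sample_rate_hz):
    """Decide which coder describes a spectral line at frequency line_hz."""
    if line_hz < cof_hz:
        # Already inside the range that the core coder codes anyway.
        return "core_lowband"
    if line_hz < core_sample_rate_hz / 2.0:
        # Above the COF but below FS/2: interleaved coding by the core coder.
        return "core_interleaved"
    # Above FS/2 of the core coder: another coding method is needed.
    return "separate_coder"


# Assumed example: COF at 5.5 kHz, core coder running at 22.05 kHz.
below_cof = choose_line_coder(3000.0, 5500.0, 22050.0)
between = choose_line_coder(8000.0, 5500.0, 22050.0)
above_nyquist = choose_line_coder(15000.0, 5500.0, 22050.0)
```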
In the following, reference is made to Fig. 10 to illustrate an inventive decoder for decoding an
encoded signal. The encoded signal is input at an input 1000 into a data stream demultiplexer
801. In particular, the encoded signal includes an encoded input signal (output from the core
coder 702 in Fig. 9), which represents a frequency content of an original audio signal (input into
the input 900 from Fig. 9) below a predetermined frequency. The encoding of the original signal
was performed in the core coder 702 using a certain known coding algorithm. The encoded signal
at the input 1000 also includes additional information describing detected differences between a
regenerated signal and the original audio signal, the regenerated signal being generated by a high
frequency regeneration technique (implemented in the HFR block 703c in Fig. 9) from the input
signal or a coded and decoded version thereof (embodiment with the core decoder 903 in Fig. 9).

In particular, the inventive decoder includes means for obtaining a decoded input signal, which is
produced by decoding the encoded input signal in accordance with the coding algorithm. To this
end, the inventive decoder can include a core decoder 803 as shown in Fig. 10. Alternatively, the
inventive decoder can also be used as an add-on module to an existing core decoder so that the
means for obtaining a decoded input signal would be implemented by using a certain input of a
subsequently positioned HFR block 804 as shown in Fig. 10. The inventive decoder also
includes a reconstructor for reconstructing detected differences based on the additional
information that has been produced by the difference describer 703b, which is shown in Fig. 9.
As a key component, the inventive decoder additionally includes a high frequency regeneration
means for performing a high frequency regeneration technique similar to the high frequency
regeneration technique that has been implemented by the HFR block 703c as shown in Fig. 9.
The high frequency regeneration block outputs a regenerated signal which, in a normal HFR
decoder, would be used for synthesizing the spectral portion of the audio signal that has been
discarded in the encoder.
In accordance with the present invention, a producer that includes the functionalities of blocks 806
and 807 from Fig. 8 is provided so that the audio signal output by the producer not only includes
a high frequency reconstructed portion but also includes any detected differences, preferably
spectral lines, that cannot be synthesized by the HFR block 804 but that were present in the
original audio signal.
As will be outlined later, the producer 806, 807 can use the regenerated signal output by the HFR
block 804 and simply combine it with the low band decoded signal output by the core decoder
803 and then insert spectral lines based on the additional information. Alternatively, and
preferably, the producer also does some manipulation of the HFR-generated spectral lines as will
be outlined with respect to Fig. 12. Generally, the producer not only simply inserts a spectral line
into the HFR spectrum at a certain frequency position but also accounts for the energy of the
inserted spectral line in attenuating HFR-regenerated spectral lines in the neighbourhood of the
inserted spectral line.

The above procedure is based on a spectral envelope parameter estimation performed in the
encoder. In a spectral band above the predetermined frequency, i.e., the cross-over frequency, in
which a spectral line is positioned, the spectral envelope estimator estimates the energy in this
band. Such a band is for example a scale factor band. Since the spectral envelope estimator
accumulates the energy in this band irrespective of whether the energy stems from noisy
spectral lines or from certain remarkable peaks, i.e., tonal spectral lines, the spectral envelope
estimate for the given scale factor band includes the energy of the spectral line as well as the
energy of the "noisy" spectral lines in the given scale factor band.
To use the spectral energy estimate information transmitted in connection with the encoded
signal as accurately as possible, the inventive decoder accounts for the energy accumulation
method in the encoder by adjusting the inserted spectral line as well as the neighbouring "noisy"
spectral lines in the given scale factor band so that the total energy, i.e., the energy of all lines in
this band corresponds to the energy dictated by the transmitted spectral envelope estimate for this
scale factor band.
Figure 12 shows a schematic diagram for the preferred HFR reconstruction based on an analysis
filter bank 1200 and a synthesis filter bank 1202. The analysis filter bank as well as the synthesis
filter bank consist of several filter bank channels, which are also illustrated in Fig. 11 with respect
to a scale factor band and the predetermined frequency. Filter bank channels above the
predetermined frequency, which is indicated by 1204 in Fig. 12, have to be reconstructed by
means of filter bank signals, i.e. filter bank channels below the predetermined frequency, as
indicated in Fig. 12 by lines 1206. It is to be noted here that in each filter bank channel, a band
pass signal having complex band pass signal samples is present. The high frequency
reconstruction block 804 in Fig. 10 and also the HFR block 703c in Fig. 9 include a
transposition/envelope adjustment module 1208, which is arranged for doing HFR with respect to
certain HFR algorithms. It is to be noted that the block on the encoder side does not necessarily
have to include an envelope adjustment module. It is preferred to estimate a tonality measure as a
function of frequency. Then, when the tonality differs too much, the difference in absolute
spectral envelope is irrelevant.
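As a rough illustration of a frequency-dependent tonality comparison, the following Python sketch computes a peak-to-mean energy ratio per band for the original and for the regenerated spectrum and flags the bands in which the two differ by more than a threshold. The measure itself, the band layout, the threshold value and all function names are assumptions made for illustration; the description does not prescribe a particular tonality estimator.

```python
import math

def band_tonality_db(energies):
    """Peak-to-mean energy ratio of one band in dB (hypothetical tonality measure)."""
    mean = sum(energies) / len(energies)
    if mean <= 0.0:
        return 0.0
    return 10.0 * math.log10(max(energies) / mean)

def detect_missing_tonal_bands(original, regenerated, borders, threshold_db=3.0):
    """Return indices of bands whose tonality is clearly higher in the original.

    original, regenerated: per-channel energies above the cross-over frequency.
    borders: band n covers channels borders[n] .. borders[n + 1] - 1 (assumed layout).
    threshold_db: significance threshold (illustrative value).
    """
    flagged = []
    for n in range(len(borders) - 1):
        lo, hi = borders[n], borders[n + 1]
        t_orig = band_tonality_db(original[lo:hi])
        t_regen = band_tonality_db(regenerated[lo:hi])
        if t_orig - t_regen > threshold_db:
            flagged.append(n)
    return flagged

# Band 1 holds a strong tone in the original that the regenerated signal lacks,
# although both versions of band 1 carry the same total energy.
original = [1.0, 1.0, 1.0, 1.0, 1.0, 40.0, 1.0, 1.0]
regenerated = [1.0, 1.2, 0.9, 1.1, 10.0, 11.0, 10.5, 11.5]
borders = [0, 4, 8]
print(detect_missing_tonal_bands(original, regenerated, borders))
```

In the toy data the second band has identical total energy in both signals, so an envelope comparison alone would see no difference, while the tonality comparison flags the band.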

The HFR algorithm can be a pure harmonic or an approximate harmonic HFR algorithm or can
be a low-complexity HFR algorithm, which includes the transposition of several consecutive
analysis filter bank channels below the predetermined frequency to certain consecutive synthesis
filter bank channels above the predetermined frequency. Additionally, the block 1208 preferably
includes an envelope adjustment function so that the magnitudes of the transposed spectral lines
are adjusted such that the accumulated energy of the adjusted spectral lines in one scale factor
band for example corresponds to the spectral envelope value for the scale factor band.
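The low-complexity variant described above can be sketched as follows: consecutive low-band channels are copied to consecutive high-band channels, and each scale factor band is then scaled so that its accumulated energy matches the envelope value for that band. The function names, the choice of source channels and the single-time-slot simplification are assumptions for illustration only.

```python
import math

def copy_up(lowband, num_high, source_start):
    """Transpose consecutive low-band channels to consecutive high-band channels.

    lowband: list of complex subband samples, one per channel, for one time slot.
    source_start: first low-band channel used as a source (hypothetical parameter).
    """
    span = len(lowband) - source_start
    return [lowband[source_start + (i % span)] for i in range(num_high)]

def adjust_envelope(highband, borders, envelope):
    """Scale each scale factor band so its accumulated energy equals envelope[n].

    borders: band n covers high-band channels borders[n] .. borders[n + 1] - 1.
    """
    out = list(highband)
    for n, target in enumerate(envelope):
        lo, hi = borders[n], borders[n + 1]
        energy = sum(abs(x) ** 2 for x in highband[lo:hi])
        gain = math.sqrt(target / energy) if energy > 0.0 else 0.0
        for l in range(lo, hi):
            out[l] = highband[l] * gain
    return out

lowband = [1 + 1j, 2 + 0j, 0 + 1j, 1 - 1j]
high = copy_up(lowband, num_high=4, source_start=2)
adjusted = adjust_envelope(high, borders=[0, 2, 4], envelope=[6.0, 1.5])
for n, (lo, hi) in enumerate([(0, 2), (2, 4)]):
    print(n, round(sum(abs(x) ** 2 for x in adjusted[lo:hi]), 6))
```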
From Fig. 12 it becomes clear that one scale factor band includes several filter bank channels. An
exemplary scale factor band extends from a filter bank channel l_low to a filter bank channel l_up.
With respect to the subsequent adaptation/sine insertion method, it is to be noted here that this
adaptation or "manipulation" is done by the producer 806, 807 in Fig. 10, which includes a
manipulator 1210 for manipulating HFR produced band pass signals. As an input, this
manipulator 1210 receives, from the reconstructor 805 in Fig. 10, at least the position of the line,
i.e. preferably the channel number l_s, in which the sine to be synthesized is to be positioned. Additionally,
the manipulator 1210 preferably receives a suitable level for this spectral line (sine wave) and,
preferably, also information on a total energy of the given scale factor band sfb 1212.
It is to be noted here that a certain channel l_s, into which the synthetic sine signal is to be inserted,
is treated differently from the other channels in the given scale factor band 1212, as will be outlined
below. This "treatment" of the HFR-regenerated channel signals as output by the block 1208 is,
as has been outlined above, done by the manipulator 1210, which is part of the producer 806, 807
from Fig. 10.
Parametric coding of spectral lines
An example of a filterbank based system using parametric coding of missing spectral lines is
outlined below.

When using an HFR method where the system uses adaptive noise floor addition according to
[PCT/SE00/00159], only the frequency location of the missing spectral line needs to be coded,
since the level of the spectral line is implicitly given by the envelope data and the noise-floor
data. The total energy of a given scalefactor band is given by the energy data, and the tonal/noise
energy ratio is given by the noise floor level data. Furthermore, in the high-frequency domain
the exact location of the spectral line is of less importance, since the frequency resolution of the
human auditory system is rather coarse at higher frequencies. This implies that the spectral lines
can be coded very efficiently, essentially with a vector indicating for each scalefactor band
whether a sine should be added in that particular band in the decoder.
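A minimal sketch of such a per-band indicator vector is given below, assuming one bit per scalefactor band packed most significant bit first. The bit layout and the function names are illustrative assumptions and do not describe an actual bitstream syntax.

```python
def pack_sine_flags(flags):
    """Pack one boolean per scalefactor band into bytes, first band in the MSB."""
    packed = bytearray((len(flags) + 7) // 8)
    for n, flag in enumerate(flags):
        if flag:
            packed[n // 8] |= 0x80 >> (n % 8)
    return bytes(packed)

def unpack_sine_flags(data, num_bands):
    """Recover the per-band flags; num_bands must be known to the decoder."""
    return [bool(data[n // 8] & (0x80 >> (n % 8))) for n in range(num_bands)]

# Ten scalefactor bands, with a sine to be added in bands 2 and 9.
flags = [False] * 10
flags[2] = True
flags[9] = True
payload = pack_sine_flags(flags)
assert unpack_sine_flags(payload, len(flags)) == flags
print(len(payload), "bytes for", len(flags), "bands")
```

With this layout the side information costs one bit per scalefactor band per frame, regardless of how many sines are signalled.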
The spectral lines can be generated in the decoder in several ways. One approach utilises the
QMF filterbank already used for envelope adjustment of the HFR signal. This is very efficient
since it is simple to generate sinewaves in a subband filterbank, provided that they are placed in
the middle of a filter channel in order to not generate aliasing in adjacent channels. This is not a
severe restriction since the frequency location of the spectral line is usually rather coarsely
quantised.
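The reason a mid-channel sine is simple to generate can be illustrated numerically. Assuming a complex subband filterbank in which the subband sample rate is twice the channel bandwidth, a tone at the channel centre advances by a quarter turn per subband sample, so its subband samples cycle through only four values. The sketch below is an illustration under that assumption and is not a description of a specific filterbank; the function name is hypothetical.

```python
import cmath
import math

def mid_channel_sine(num_samples, level=1.0):
    """Complex subband samples of a tone centred in a channel.

    Assumes a phase advance of pi/2 per subband sample (see the lead-in).
    """
    samples = []
    for k in range(num_samples):
        value = level * cmath.exp(1j * math.pi / 2 * k)
        # Round away floating-point residue so the four-value cycle is visible.
        samples.append(complex(round(value.real, 12), round(value.imag, 12)))
    return samples

print(mid_channel_sine(8))
```

Under this assumption only the values 1, j, -1 and -j occur (times the level), so the sine can be produced from a four-entry table without evaluating trigonometric functions at run time.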
If the spectral envelope data sent from the encoder to the decoder is represented by grouped
subband filterbank energies, in time and frequency, the spectral envelope vector may at a given
time be represented by:

and the noise-floor level vector may be described according to:

Here the energies and noise floor data are averaged over the QMF filterbank bands described by
a vector

containing the QMF-band entries from the lowest QMF-band used (lsb) to the highest (usb),
whose length is M +1, and where the limits of each scalefactor band (in QMF bands) are given
by:


where l_l is the lower limit and l_u is the upper limit of scalefactor band n. In the above, the
noise-floor level data vector has been mapped to the same frequency resolution as that of the energy
data.
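The grouping of QMF-band energies into scalefactor bands can be sketched as below. The border vector is assumed to hold M + 1 ascending QMF-band indices, with scalefactor band n covering the half-open range from borders[n] to borders[n + 1]. Whether the upper limit is inclusive in the actual system cannot be read from the text, so this convention and the function name are assumptions.

```python
def average_over_bands(qmf_energies, borders):
    """Average per-QMF-band energies over each scalefactor band.

    qmf_energies: energy per QMF band, indexed by absolute QMF-band number.
    borders: M + 1 ascending QMF-band indices (assumed half-open band limits).
    Returns a list of M averaged energies, one per scalefactor band.
    """
    averaged = []
    for n in range(len(borders) - 1):
        lower, upper = borders[n], borders[n + 1]
        band = qmf_energies[lower:upper]
        averaged.append(sum(band) / len(band))
    return averaged

# QMF bands 0..3 belong to the core coder range in this toy example.
qmf_energies = [0.0, 0.0, 0.0, 0.0, 4.0, 2.0, 1.0, 1.0, 1.0, 1.0]
borders = [4, 6, 10]  # two scalefactor bands: QMF bands 4-5 and 6-9
print(average_over_bands(qmf_energies, borders))
```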
If a synthetic sine is generated in one filterbank channel, this needs to be considered for all the
subband filter bank channels included in that particular scalefactor band, since this is the highest
frequency resolution of the spectral envelope in that frequency range. If this frequency resolution
is also used for signalling the frequency location of the spectral lines that are missing from the
HFR and need to be added to the output, the generation of and compensation for these synthetic
sines can be done as follows.
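Firstly, all the subband channels within the current scalefactor band need to be adjusted so the
average energy for the band is retained, according to:

$$y_{re}(l) = g_{hfr}(n)\,x_{re}(l), \qquad y_{im}(l) = g_{hfr}(n)\,x_{im}(l), \qquad l_l \le l \le l_u,\ l \ne l_s$$

where l_l and l_u are the limits for the scalefactor band where a synthetic sine will be added, x_re and
x_im are the real and imaginary subband samples, y_re and y_im are the adjusted subband samples, l is the channel index, and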

is the required gain adjustment factor, where n is the current scalefactor band. It is to be
mentioned here that the above equation is not valid for the spectral line / band pass signal of the
filter bank channel, in which the sine will be placed.

It is to be noted here that the above equation is only valid for the channels in the given scale
factor band extending from l_low to l_up, except the band pass signal in the channel having the
number l_s. This signal is treated by means of the following equation group.
The manipulator 1210 applies the following equation to the channel having the channel
number l_s, i.e. it modulates the band pass signal in the channel l_s by means of the complex
modulation signal representing a synthetic sine wave. Additionally, the manipulator 1210
weights the spectral line output from the HFR block 1208 and determines
the level of the synthetic sine by means of the synthetic sine adjustment factor g_sine. Therefore, the
following equation is valid only for a filterbank channel l_s into which a sine will be placed.
Accordingly, the sine is placed in QMF channel l_s, which lies within the limits of the current
scalefactor band. Here, k is the modulation vector index (ranging between 0 and 4), and the
modulation is complex conjugated for every other channel. This is required since every other channel in the QMF filterbank is
frequency inverted. The modulation vector for placing a sine in the middle of a complex subband
filterbank band is:

and the level of the synthetic sine is given by:

The above is illustrated in Figs. 4 to 6: the spectrum of the original is displayed in Fig. 4, and the
spectra of the output without and with the above are displayed in Figs. 5 and 6, respectively. In Fig. 5, the tone in
the 8 kHz range is replaced by broadband noise. In Fig. 6, a sine is inserted in the middle of the
scalefactor band in the 8 kHz range, and the energy for the entire scalefactor band is adjusted so that it
retains the correct average energy for that scalefactor band.
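The compensation described in this section can be sketched as follows. The sketch assumes that the envelope value e is the target average energy per channel and subband sample of the scalefactor band, that the noise-floor value q is the ratio of noise energy to tonal energy, and that the band energy is split accordingly between the rescaled HFR channels and the inserted sine. These conventions, the gain expressions derived from them, the choice of which channel parity is conjugated and all names are assumptions for illustration; they are not the equations of this specification.

```python
import math

PHI = [1 + 0j, 0 + 1j, -1 + 0j, 0 - 1j]  # quarter-turn modulation cycle (assumed)

def insert_sine(band, l_low, l_s, e, q):
    """Rescale the HFR channels of one scalefactor band and add a synthetic sine.

    band: list of channels l_low, l_low + 1, ...; each channel is a list of
          complex subband samples over time, as output by the HFR stage.
    l_s:  absolute channel number receiving the sine.
    e:    target average energy per channel and sample (assumed convention).
    q:    noise-to-tonal energy ratio for the band (assumed convention).
    """
    num_ch, num_t = len(band), len(band[0])
    total_target = e * num_ch * num_t
    hfr_energy = sum(abs(x) ** 2 for ch in band for x in ch)
    # Noise share goes to the rescaled HFR content, tonal share to the sine.
    g_hfr = math.sqrt(total_target * q / (1.0 + q) / hfr_energy) if hfr_energy > 0.0 else 0.0
    g_sine = math.sqrt(total_target / (1.0 + q) / num_t)

    out = [[x * g_hfr for x in ch] for ch in band]
    for k in range(num_t):
        s = PHI[k % 4]
        if l_s % 2:  # every other channel is frequency inverted (parity is assumed)
            s = s.conjugate()
        out[l_s - l_low][k] += g_sine * s
    return out

# Two channels, four time slots, constant HFR samples (toy data).
band = [[1 + 0j] * 4, [0 + 2j] * 4]
result = insert_sine(band, l_low=10, l_s=11, e=3.0, q=0.5)
energy = sum(abs(y) ** 2 for ch in result for y in ch) / (2 * 4)
print(round(energy, 6))
```

The toy input is chosen so that the inserted sine is uncorrelated with the HFR samples over the four time slots; for general signals the band energy is retained only on average, because the cross term between sine and HFR content does not vanish exactly.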
Practical implementations
The present invention can be implemented in both hardware chips and DSPs, for various kinds of
systems, for storage or transmission of signals, analogue or digital, using arbitrary codecs.
In Fig. 7 a possible encoder implementation of the present invention is displayed. The analogue
input signal is converted to a digital counterpart 701 and fed to the core encoder 702 as well as to
the parameter extraction module for the HFR 704. An analysis is performed 703 to determine
which spectral lines will be missing after high-frequency reconstruction in the decoder. These
spectral lines are coded in a suitable manner and multiplexed into the bitstream along with the
rest of the encoded data 705. Fig. 8 displays a possible decoder implementation of the present
invention. The bitstream is de-multiplexed 801, the lowband is decoded by the core decoder
803, and the highband is reconstructed using a suitable HFR unit 804. The additional information
on the spectral lines missing after the HFR is decoded 805 and used to regenerate the missing
components 806. The spectral envelope of the highband is decoded 802 and used to adjust the
spectral envelope of the reconstructed highband 807. The lowband is delayed 808 in order to
ensure correct time synchronisation with the reconstructed highband, and the two are added
together. The digital wideband signal is converted to an analogue wideband signal 809.
Depending on implementation details, the inventive methods of encoding or decoding can be
implemented in hardware or in software. The implementation can take place on a digital storage
medium, in particular, a disc, a CD with electronically readable control signals, which can
cooperate with a programmable computer system so that the corresponding method is performed.
Generally, the present invention also relates to a computer program product with a program code
stored on a machine readable carrier for performing the inventive methods, when the computer
program product runs on a computer. In other words, the present invention therefore is a
computer program with a program code for performing the inventive method of encoding or
decoding, when the computer program runs on a computer.
It is to be noted that the above description relates to a complex system. The inventive decoder
implementation, however, also works in a real-valued system. In this case the equations
performed by the manipulator 1210 only include the equations for the real part.

WE CLAIM:
1. Encoder for encoding an audio signal to obtain an encoded signal, the encoded
signal being intended for decoding using a high frequency regeneration technique,
which is suited for generating frequency components above a predetermined
frequency based on frequency components below the predetermined frequency,
the encoder comprising:
means (702) for providing an encoded input signal, which is a coded
representation of an input signal, the input signal being coded using a coding
algorithm, and representing a frequency content of the audio signal below the
predetermined frequency;
a high frequency regenerator (703c) for performing the high frequency
regeneration technique on the input signal or a coded and decoded version thereof
to obtain a regenerated signal having frequency components above the
predetermined frequency;
a detector for detecting (703a) differences between the regenerated signal and the
audio signal, which are above a significance threshold;
a describer for describing (703b) detected differences to obtain additional
information; and
a combiner (705) for combining the encoded input signal and the additional
information to produce the encoded signal.
2. Encoder in accordance with claim 1, in which the detected differences are spectral
lines in the audio signal that are not present in the regenerated signal.
3. Encoder in accordance with claim 1 or 2, in which the predetermined frequency is
a cross-over frequency, which determines a frequency up to which the input
signal is coded by the coding algorithm.

4. Encoder in accordance with one of the preceding claims, in which the detector
(703a) is arranged for using a plurality of frequency bands for the regenerated
signal and the audio signal, wherein the differences are detected based on
frequency bands of the regenerated signal and the same frequency bands of the
audio signal.
5. Encoder in accordance with one of the preceding claims, in which the detector
(703a) and/or the high frequency regenerator comprises a time domain to
frequency domain converter.
6. Encoder in accordance with claim 5, in which the time domain to frequency
domain converter is a transform or a filter bank.
7. Encoder in accordance with one of the preceding claims, in which the detector
(703) comprises:
a predictor for performing predictions on the regenerated signal and the audio
signal; and
a detector for detecting a difference in prediction gains obtained by the predictor,
which is larger than a gain threshold forming the significance threshold.
8. Encoder in accordance with one of the preceding claims, in which the detector
(703a) is arranged for detecting a difference in the absolute spectra of the audio
signal and the regenerated signal, which is above a predetermined difference
threshold forming the significance threshold.
9. Encoder in accordance with one of the preceding claims, in which the detector
(703a) is arranged for determining a frequency dependent tonality
measure for the audio signal and the regenerated signal, wherein a frequency band
is detected, in which the tonality measures differ by more than a threshold difference
forming the significance threshold.

10. Encoder in accordance with claim 9, in which the tonality measure is a tonal-to-
noise ratio.
11. Encoder in accordance with one of the preceding claims,
in which the audio signal is a discrete audio signal sampled using a sampling
frequency;
in which the predetermined frequency is less than half the value of the sampling
frequency;
in which the detector (703a) is arranged for determining a difference for a specific
frequency band above the predetermined frequency band, a center frequency of
the specific frequency band being less than half the value of the sampling
frequency, the encoder further comprising:
a controller (910) for controlling an encoder producing the encoded input signal to
additionally encode the audio signal with respect to the specific frequency band
according to the encoding algorithm in order to describe the determined
difference, wherein an output of the coder (702) for the specific frequency band
serves as the additional information.
12. Encoder in accordance with one of claims 1 to 10, in which the describer (703b)
comprises a band pass filter for band pass filtering the audio signal, the band pass
filter being set to a specific frequency band, which comprises a detected
difference, and
wherein the describer (703b) comprises an encoder for encoding an output of the
band pass filter to obtain the additional signal, the encoder using a coding
algorithm different from the coding algorithm by means of which the encoded
input signal is coded.

13. Encoder in accordance with one of claims 1 to 11, in which the detector for
detecting differences is arranged for detecting spectral lines, and
in which the describer is arranged for producing information on the frequency
location of the detected spectral line.
14. Encoder in accordance with claim 13, in which the information on the frequency
location comprises a vector indicating, for a scale factor band, whether a spectral
line has to be added in the specific scale factor band when decoding the encoded
signal.
15. Encoder in accordance with one of the preceding claims, in which the audio signal
is processed frame wise, and
in which the predetermined frequency is variable from frame to frame.
16. Encoder in accordance with claim 15, in which the difference detector (703a)
further comprises a cross-over frequency controller for varying the
predetermined frequency based on a detected difference.
17. Encoder in accordance with one of the preceding claims, in which the HFR
technique is arranged to produce spectral values above the predetermined
frequency from spectral values below the predetermined frequency.
18. Encoder in accordance with one of the preceding claims, in which the HFR
technique is arranged to transpose a group of spectral values or band pass signals
that relate to consecutive frequencies to a group of spectral values or band pass
signals above the predetermined frequency that correspond to consecutive
frequencies.
19. Encoder in accordance with claim 17 or 18, further comprising a spectral
envelope estimator (704) for determining a spectral envelope of the audio signal,
the spectral envelope relating to a spectral part of the audio signal above the
predetermined frequency.

20. Encoder in accordance with claim 19, in which the spectral envelope data
comprises a number of envelope data points that is smaller than a number of
spectral values, wherein one data point is provided for a scale factor band.
21. Encoder in accordance with one of the preceding claims, in which the spectral
components are complex transform coefficients or complex band pass signals.
22. Decoder for decoding an encoded signal, the encoded signal comprising an
encoded input signal representing a frequency content of an original audio signal
below a predetermined frequency, the encoding being performed using a coding
algorithm, and additional information describing detected differences between a
regenerated signal and the original audio signal, the regenerated signal being
regenerated by a high frequency regenerating technique from the input signal or a
coded and decoded version thereof, the decoder comprising:
means (803) for obtaining a decoded input signal, which is produced by decoding
the encoded input signal in accordance with the coding algorithm;
a reconstructor (805) for reconstructing detected differences based on the
additional information;
a high frequency regenerator (804) for performing a high frequency regeneration
technique similar to the high frequency regeneration technique for obtaining the
detected differences to obtain the regenerated signal;
a producer (806,807) for producing a high frequency regenerated audio signal
based on the decoded input signal, the reconstructed differences and the
regenerated signal.

23. Decoder in accordance with claim 22, in which a detected difference comprises
spectral lines in a specified frequency region and the additional information relates
to the specified frequency region,
wherein the reconstructor (805) is arranged for generating a spectral line in the
specified region in response to the additional information.
24. Decoder in accordance with claim 22 or 23,
in which the additional information specifies a scale factor band, in which a
spectral line is to be reconstructed;
in which the encoded signal further comprises spectral envelope data for
describing a spectral portion of the audio signal above the predetermined
frequency,
in which the producer (806,807) is arranged for generating a spectral line in the
scale factor band, and
in which the producer (806,807) is further arranged for adjusting spectral lines in
the scale factor band so that a given energy for the scale factor band comprising
the generated spectral line is maintained.
25. Decoder in accordance with one of claims 22 to 24,
in which the high frequency regenerator (804) comprises a synthesis filter bank
(1203) having synthesis filter bank channels, wherein a scale factor band includes
more than one filter bank channel,
in which the encoded signal further comprises a spectral envelope vector and a
noise-floor level vector, and

wherein the reconstructor (805) is arranged for calculating a level of the
reconstructed spectral line based on the spectral envelope vector.
26. Decoder in accordance with claim 25, wherein the producer (806,807) is
arranged for determining band pass signals for filter bank channels, into which
no sine is to be inserted, in a scale factor band in accordance with the following
equation

wherein l is a filter bank channel number, wherein l_l is the lowest filter bank channel
number for the scale factor band, wherein l_u is the highest filter bank channel for the
scale factor band, wherein x_re is the real part of a band pass signal sample output by
the HFR block (804), wherein x_im is an imaginary part of the band pass signal sample
output by the HFR block (804), wherein y_re and y_im are the real part and the imaginary
part of an adjusted band pass signal for a filter bank channel, and wherein g_hfr is a gain
adjustment factor derived from the noise-floor level vector.
27. Decoder in accordance with claim 25 or 26, wherein the reconstructor (805) is
arranged for determining a certain scale factor band l_s into which a synthetic
sine is to be inserted, and
wherein a level of a synthetic sine to be inserted is defined as follows:

wherein n is a number of the given scale factor band, and e is the spectral envelope
vector, and

wherein the producer is arranged for determining a band pass signal for the channel in
which the synthetic sine is to be placed in accordance with the following equation:

wherein l_s is a filter bank channel number, into which a sine is to be inserted, wherein
l_l is the lowest filter bank channel number for the scale factor band, wherein l_u is the
highest filter bank channel for the scale factor band, wherein x_re is the real part of a
band pass signal sample output by the HFR block (804), and wherein y_re and y_im are
the real part and the imaginary part of an adjusted band pass signal for a filter bank
channel, and wherein g_hfr is a gain adjustment factor derived from the noise-floor level
vector,
wherein φ_re and φ_im form a complex modulation vector for placing a sine into a band
pass signal and wherein k is a modulation vector index ranging between 0 and 4.
28. Method for encoding an audio signal to obtain an encoded signal, the encoded
signal being intended for decoding using a high frequency regeneration
technique, which is suited for generating frequency components above a
predetermined frequency based on frequency components below the
predetermined frequency, the method comprising the following steps:
providing an encoded input signal, which is a coded representation of an input signal,
the input signal being coded using a coding algorithm, and representing a frequency
content of the audio signal below the predetermined frequency;
performing the high frequency regeneration technique on the input signal or a coded
and decoded version thereof to obtain a regenerated signal having frequency
components above the predetermined frequency;

detecting (703a) differences between the regenerated signal and the audio signal,
which are above a significance threshold;
describing (703b) detected differences to obtain additional information; and
combining the encoded input signal and the additional information to produce the
encoded signal.
29. Method for decoding an encoded signal, the encoded signal comprising an
encoded input signal representing a frequency content of an original audio signal
below a predetermined frequency, the encoding being performed using a coding
algorithm, and additional information describing detected differences between a
regenerated signal and the original audio signal, the regenerated signal being
generated by a high frequency regenerating technique from the input signal or a
coded and decoded version thereof, the method comprising the following steps:
obtaining a decoded input signal, which is produced by decoding the encoded input
signal in accordance with the coding algorithm;
reconstructing detected differences based on the additional information;
performing a high frequency regeneration technique similar to the high frequency
regeneration technique for obtaining the detected differences to obtain the regenerated
signal;
producing a high frequency regenerated audio signal based on the decoded input
signal, the reconstructed differences and the regenerated signal.


Patent Number 230121
Indian Patent Application Number 721/KOLNP/2004
PG Journal Number 09/2009
Publication Date 27-Feb-2009
Grant Date 25-Feb-2009
Date of Filing 28-May-2008
Name of Patentee CODING TECHNOLOGIES AB
Applicant Address DOBELNSGATAN 64, S-113 52 STOCKHOLM
Inventors:
# Inventor's Name Inventor's Address
1 EKSTRAND, PER SODERMANNAGATAN 45, S 116 50 STOCKHOLM
2 HORICH, HOLGER WIELANDSTRASSE 9K, D-90419 NURNBERG
3 KJORLING, KRISTOFER LOSTIGEN 10, S-170 75 SOLNA
PCT International Classification Number G10L 21/02
PCT International Application Number PCT/EP2002/013462
PCT International Filing date 2002-11-28
PCT Conventions:
# PCT Application Number Date of Convention Priority Country
1 0104004-7 2001-11-29 Sweden