Title of Invention

METHOD AND APPARATUS FOR PERFORMING HARMONIC NOISE WEIGHTING IN DIGITAL SPEECH CODERS

Abstract

To address the need for choosing values of the harmonic noise weighting (HNW) coefficient (εp) so that the amount of harmonic noise weighting can be optimized, a method and apparatus for performing harmonic noise weighting in digital speech coders is provided herein. During operation, received speech is analyzed (503) to determine a pitch period. HNW coefficients are then chosen (505) based on the pitch period, and a perceptual noise weighting filter (C(z)) is determined (507) based on the harmonic-noise weighting (HNW) coefficients.
Full Text

HARMONIC NOISE WEIGHTING IN
DIGITAL SPEECH CODERS
Cross-reference to Related Application
This application claims priority from provisional application serial no.
60/515,581, entitled "METHOD AND APPARATUS FOR PERFORMING
HARMONIC NOISE WEIGHTING IN DIGITAL SPEECH CODERS," filed
October 30, 2003, which is commonly owned and incorporated herein by
reference in its entirety.
Field of the Invention
The present invention relates, in general, to signal compression systems
and, more particularly, to Code Excited Linear Prediction (CELP)-type speech
coding systems.
Background of the Invention
Compression of digital speech and audio signals is well known.
Compression is generally required to efficiently transmit signals over a
communications channel, or to store compressed signals on a digital media
device, such as a solid-state memory device or computer hard disk. Although
there exist many compression (or "coding") techniques, one method that has
remained very popular for digital speech coding is known as Code Excited
Linear Prediction (CELP), which is one of a family of "analysis-by-synthesis"
coding algorithms. Analysis-by-synthesis generally refers to a coding process
by which parameters of a digital model are used to synthesize a set of candidate
signals that are compared to an input signal and analyzed for distortion. The set
of parameters that yields the lowest distortion, or error component, is then either
transmitted or stored. The set of parameters is eventually used to reconstruct
an estimate of the original input signal. CELP is a particular analysis-by-synthesis
method that uses one or more excitation codebooks that essentially
comprise sets of code-vectors that are retrieved from the codebook in response
to a codebook index. These code-vectors are used as stimuli to the speech
synthesizer in a "trial and error" process in which an error criterion is evaluated
for each of the candidate code-vectors, and the candidates resulting in the
lowest error are selected.
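The "trial and error" search described above can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the encoder of FIG. 1: the "synthesizer" is reduced to a simple gain, and the names synthesize, squared_error and search_codebook, as well as the codebook contents, are hypothetical.

```python
def synthesize(code_vector, gain):
    """Toy synthesizer (assumption): scale the candidate code-vector by a gain."""
    return [gain * x for x in code_vector]


def squared_error(target, candidate):
    """Sum of squared differences between two equal-length signals."""
    return sum((t - c) ** 2 for t, c in zip(target, candidate))


def search_codebook(target, codebook, gains):
    """Try every (index, gain) pair and keep the one with the lowest error."""
    best = None
    for index, code_vector in enumerate(codebook):
        for gain in gains:
            err = squared_error(target, synthesize(code_vector, gain))
            if best is None or err < best[2]:
                best = (index, gain, err)
    return best


# Hypothetical three-entry codebook and gain table.
codebook = [[1.0, 0.0, -1.0, 0.0], [1.0, 1.0, 1.0, 1.0], [0.0, 1.0, 0.0, -1.0]]
gains = [0.5, 1.0, 2.0]
target = [0.0, 2.0, 0.0, -2.0]
index, gain, err = search_codebook(target, codebook, gains)
```

Only the winning index and gain would be transmitted or stored; the decoder repeats the synthesis step with those parameters.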
For example, FIG. 1 is a block diagram of prior-art CELP encoder 100.
In CELP encoder 100, an input signal comprising speech sample n (s(n)) is
applied to a Linear Predictive Coding (LPC) analysis block 101, where linear
predictive coding is used to estimate a short-term spectral envelope. The
resulting spectral parameters (or LP parameters) are denoted by the transfer
function A(z). The spectral parameters are applied to LPC Quantization block
102 that quantizes the spectral parameters to produce quantized spectral
parameters Aq that are suitable for use in a multiplexer 108. The quantized
spectral parameters Aq are then conveyed to multiplexer 108, and the
multiplexer produces a coded bit stream based on the quantized spectral
parameters and a set of parameters, τ, β, k, and γ, that are determined by a
squared error minimization/parameter quantization block 107. As one of
ordinary skill in the art will recognize, τ, β, k, and γ are defined as the
closed-loop pitch delay, adaptive codebook gain, fixed codebook vector index, and
fixed codebook gain, respectively.
The quantized spectral, or LP, parameters are also conveyed locally to
LPC synthesis filter 105 that has a corresponding transfer function 1/Aq(z). LPC
synthesis filter 105 also receives combined excitation signal u(n) from first
combiner 110 and produces an estimate ŝ(n) of the input signal s(n) based on the
quantized spectral parameters Aq and the combined excitation signal u(n).
Combined excitation signal u(n) is produced as follows. An adaptive codebook
code-vector cτ is selected from adaptive codebook (ACB) 103 based on the
index parameter τ. The adaptive codebook code-vector cτ is then weighted based
on the gain parameter β and the weighted adaptive codebook code-vector is
conveyed to first combiner 110. A fixed codebook code-vector ck is selected
from fixed codebook (FCB) 104 based on the index parameter k. The fixed
codebook code-vector ck is then weighted based on the gain parameter γ and is
also conveyed to first combiner 110. First combiner 110 then produces
combined excitation signal u(n) by combining the weighted version of adaptive
codebook code-vector cτ with the weighted version of fixed codebook code-vector
ck. (For the convenience of the reader, the variables are also given in
terms of their z-transforms. The z-transform of a variable is represented by a
corresponding capital letter; for example, the z-transform of e(n) is represented as
E(z).)
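The excitation and synthesis steps described above can be sketched in Python as follows. This is a minimal illustration, assuming the sign convention Aq(z) = 1 + Σ a_i z^-i (the text does not state the convention); the function names and the numeric values are hypothetical.

```python
def combined_excitation(acb_vector, beta, fcb_vector, gamma):
    """u(n) = beta * c_tau(n) + gamma * c_k(n)."""
    return [beta * a + gamma * f for a, f in zip(acb_vector, fcb_vector)]


def lpc_synthesis(u, a):
    """All-pole filter 1/Aq(z), assuming Aq(z) = 1 + sum_i a[i-1] * z**-i.

    Samples before the start of the frame are taken as zero (assumption).
    """
    s_hat = []
    for n in range(len(u)):
        acc = u[n]
        for i, a_i in enumerate(a, start=1):
            if n - i >= 0:
                acc -= a_i * s_hat[n - i]
        s_hat.append(acc)
    return s_hat


# Hypothetical code-vectors, gains, and a first-order LPC coefficient.
u = combined_excitation([1.0, 2.0, 0.0], 0.5, [0.0, 4.0, 0.0], 0.25)
s_hat = lpc_synthesis(u, [-0.5])
```

A real coder would carry the filter memory across sub-frames rather than resetting it to zero.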
LPC synthesis filter 105 conveys the input signal estimate ŝ(n) to
second combiner 112. Second combiner 112 also receives input signal s(n) and
subtracts the estimate of the input signal ŝ(n) from the input signal s(n). The
difference between input signal s(n) and input signal estimate ŝ(n) is applied to
a perceptual error weighting filter 106, which produces a perceptually weighted
error signal e(n) based on the difference between s(n) and ŝ(n) and a weighting
function w(n), such that

$$E(z) = W(z)\,\bigl(S(z) - \hat{S}(z)\bigr).$$

Perceptually weighted error signal e(n) is then conveyed to squared error
minimization/parameter quantization block 107. Squared error
minimization/parameter quantization block 107 uses the error signal e(n) to
determine an optimal set of parameters τ, β, k, and γ that produce the best
estimate ŝ(n) of the input signal s(n).
FIG. 2 is a block diagram of prior-art decoder 200 that receives
transmissions from encoder 100. As one of ordinary skill in the art realizes,
the coded bit stream produced by encoder 100 is used by a de-multiplexer in
decoder 200 to decode the optimal set of parameters, that is, τ, β, k, and γ, in a
process that is identical to the synthesis process performed by encoder 100.
Thus, if the coded bit stream produced by encoder 100 is received by decoder
200 without errors, the speech ŝ(n) output by decoder 200 can be reconstructed
as an exact duplicate of the input speech estimate ŝ(n) produced by encoder
100.
Returning to FIG. 1, weighting filter W(z) utilizes the frequency
masking property of the human ear, such that simultaneously occurring noise is
masked by the stronger signal provided the frequencies of the signal and the
noise are close. As described in Salami R., Laflamme C., Adoul J-P., Massaloux
D., "A toll quality 8 Kb/s speech coder for personal communications system,"
IEEE Trans. on Vehicular Technology, pp. 808-816, Aug. 1994, W(z) is derived
from the LPC coefficients ai, and is given by

and p is the order of the LPC. Since the weighting filter is derived from LPC
spectrum, it is also referred to as "spectral weighting".
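As a hedged illustration of spectral weighting, the sketch below assumes one widely used CELP form, W(z) = A(z/g1)/A(z/g2), in which each LPC polynomial coefficient of order i is scaled by g^i. This form, the factors g1 and g2, and the function names are assumptions made for illustration; they are not taken from the text above.

```python
def bandwidth_expand(poly, g):
    """Return the coefficients of A(z/g) for A(z) = sum_i poly[i] * z**-i."""
    return [c * g ** i for i, c in enumerate(poly)]


def spectral_weighting_filter(poly, g1, g2):
    """Numerator and denominator coefficient lists of the assumed W(z)."""
    return bandwidth_expand(poly, g1), bandwidth_expand(poly, g2)


# Hypothetical second-order LPC polynomial (poly[0] = 1) and weighting factors.
numerator, denominator = spectral_weighting_filter([1.0, -1.0, 0.5], 1.0, 0.5)
```

Because the scaling is applied to the polynomial coefficients directly, the sketch does not depend on whether A(z) is written with a plus or a minus sign in front of the summation.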
The above-described procedure does not take into account the fact that
the signal periodicity also contributes to the spectral peaks at the fundamental
frequencies and at the multiples of the fundamental frequencies. Various
techniques have been proposed to utilize noise masking of these fundamental
frequency harmonics. For example, in "Digital speech coder and method
utilizing harmonic noise weighting," Patent No. 5,528,723, Gerson and Jasiuk,
and in Gerson I. A., Jasiuk M. A., "Techniques for improving the performance
of CELP type speech coders," Proc. IEEE ICASSP, pp. 205-208, 1993, a
method was proposed which includes harmonic noise masking in the weighting
filter. As the above references show, harmonic noise weighting is incorporated
by modifying the spectral weighting filter by a harmonic noise weighting filter
C(z), which is given by:

$$C(z) = 1 - \varepsilon_p \sum_{i} b_i\, z^{-(D+i)},$$

where D corresponds to the pitch period or the pitch lag or delay, bi are the filter
coefficients, and 0 ≤ εp < 1 is the harmonic noise weighting coefficient. The
weighting filter incorporating harmonic noise weighting is given by:

$$W_H(z) = W(z)\,C(z).$$
The amount of harmonic noise weighting is typically dependent on the
product εp bi. Since bi is dependent on the delay, the amount of harmonic noise
weighting is a function of the delay. Prior-art references noted above have
suggested that different values of the harmonic noise weighting coefficient (εp)
can be used at different times, i.e., εp may be a time-varying parameter (for
example, be allowed to change from sub-frame to sub-frame). However, the prior
art does not provide a method for choosing or varying εp, or suggest when or
how such a method may be beneficial. Therefore, a need exists for a method and
apparatus for performing harmonic noise weighting in digital speech coders that
optimally and dynamically determines appropriate values of εp so that the
amount of harmonic noise weighting can be optimized and the overall perceptual
weighting can be improved.
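To make the role of the product εp bi concrete, the following Python sketch applies a harmonic noise weighting filter to a signal in the time domain. It assumes the form C(z) = 1 - εp Σ_i b_i z^-(D+i), that is, y(n) = x(n) - εp Σ_i b_i x(n - D - i), with samples outside the frame treated as zero. The function name, the dictionary representation of the taps, and the numeric values are hypothetical.

```python
def harmonic_noise_weight(x, pitch_lag, eps_p, taps):
    """Apply y(n) = x(n) - eps_p * sum_i b_i * x(n - D - i).

    taps maps each offset i to its coefficient b_i; samples outside the
    frame are treated as zero (assumption).
    """
    y = []
    for n in range(len(x)):
        acc = 0.0
        for i, b_i in taps.items():
            m = n - pitch_lag - i
            if 0 <= m < len(x):
                acc += b_i * x[m]
        y.append(x[n] - eps_p * acc)
    return y


# Unit impulse through a single-tap filter with D = 3 and eps_p = 0.5.
impulse = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
response = harmonic_noise_weight(impulse, 3, 0.5, {0: 1.0})
```

With eps_p set to zero the filter reduces to the identity, which corresponds to no harmonic noise weighting at all.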
Brief Description of the Drawings
FIG. 1 is a block diagram of a prior-art Code Excited Linear Prediction
(CELP) encoder.
FIG. 2 is a block diagram of a prior-art CELP decoder.
FIG. 3 is a block diagram of a CELP encoder in accordance with the
preferred embodiment of the present invention.
FIG. 4 is a graphical representation of εp versus pitch lag (D).
FIG. 5 is a flow chart showing steps executed by a CELP encoder to
include the Harmonic Noise Weighting method of the current invention.
FIG. 6 is a block diagram of a CELP encoder in accordance with an
alternate embodiment of the present invention.
Description of the Invention
To address the need for choosing values of the harmonic noise weighting
(HNW) coefficient (εp) so that the amount of harmonic noise weighting can be
optimized, a method and apparatus for performing harmonic noise weighting in
digital speech coders is provided herein. During operation, received speech is
analyzed to determine a pitch period. HNW coefficients are then chosen based
on the pitch period, and a perceptual noise weighting filter (C(z)) is determined
based on the harmonic-noise weighting (HNW) coefficients (εp). For large pitch
periods (D), the peaks of the fundamental frequency harmonics are very close
and hence the valleys between the adjacent harmonics may lie in the masking
region of the adjoining peaks. Thus, there may be no need to have a strong
harmonic noise weighting coefficient for larger values of D.
Because HNW coefficients are a function of pitch period, a better noise
weighting can be performed and hence the speech distortions are less noticeable
to the listeners.
The present invention encompasses a method for performing harmonic
noise weighting in a digital speech coder. The method comprises the steps of
receiving a speech input s(n), determining a pitch period (D) from the speech
input, and determining a harmonic noise weighting coefficient εp based on the
pitch period. A perceptual noise weighting function WH(z) is then determined
based on the harmonic noise weighting coefficient.
The present invention additionally encompasses a method for
performing harmonic noise weighting in a digital speech coder. The method
comprises the steps of receiving a speech input s(n), determining a closed-loop
pitch delay (τ) from the speech input, and determining a harmonic noise
weighting coefficient εp based on the closed-loop pitch delay. A perceptual
noise weighting function WH(z) is then determined based on the harmonic noise
weighting coefficient.
The present invention additionally encompasses an apparatus
comprising pitch analysis circuitry having speech (s(n)) as an input and
outputting a pitch period (D) based on the speech, a harmonic noise coefficient
generator having D as an input and outputting a harmonic noise weighting
coefficient (εp) based on D, and a perceptual error weighting filter having εp
as an input and utilizing εp to generate a weighted error signal e(n), wherein
e(n) is based on a difference between s(n) and an estimate of s(n).
The present invention finally encompasses an apparatus comprising a
harmonic noise coefficient generator having a closed-loop pitch delay (τ) as an
input and outputting a harmonic noise weighting coefficient (εp) based on τ, and a
perceptual error weighting filter having εp as an input and utilizing εp to
generate a weighted error signal e(n), wherein e(n) is based on a difference
between s(n) and an estimate of s(n).
Turning now to the drawings, wherein like numerals designate like
components, FIG. 3 is a block diagram of CELP encoder 300 in accordance with
the preferred embodiment of the present invention. As shown, CELP encoder
300 is similar to those shown in the prior art, except for the addition of pitch
analysis circuitry 311 and HNW coefficient generator 309. Additionally,
Perceptual Error Weighting Filter 306 is adapted to receive HNW coefficients
from HNW coefficient generator 309. Operation of encoder 300 occurs as
follows:
Input speech s(n) is directed towards pitch analysis circuitry 311, where
s(n) is analyzed to determine a pitch period (D). As one of ordinary skill in the
art will recognize, pitch period (additionally referred to as pitch lag, delay, or
pitch delay) is typically the time lag at which the past input speech has the
maximum correlation with current input speech.
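The definition of pitch period given above, the lag at which past speech correlates most strongly with current speech, can be sketched as a simple open-loop search. This is a minimal illustration under assumptions: the normalized-correlation criterion, the lag range, and the name estimate_pitch_lag are hypothetical and are not stated to be what pitch analysis circuitry 311 implements.

```python
import math


def estimate_pitch_lag(s, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the highest normalized
    correlation between s(n) and s(n - lag)."""
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        idx = range(lag, len(s))
        num = sum(s[n] * s[n - lag] for n in idx)
        energy_now = sum(s[n] ** 2 for n in idx)
        energy_past = sum(s[n - lag] ** 2 for n in idx)
        den = math.sqrt(energy_now * energy_past)
        score = num / den if den > 0.0 else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Synthetic pulse train with a period of 40 samples (illustrative input only).
pulses = [1.0 if n % 40 == 0 else 0.0 for n in range(200)]
lag = estimate_pitch_lag(pulses, 20, 60)  # -> 40
```

Restricting the search range matters in practice, since a periodic signal also correlates strongly at integer multiples of its true period.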
Once the pitch period (D) is determined, D is directed towards HNW
coefficient generator 309 where an HNW coefficient (εp) for the particular
speech is determined. As discussed above, the harmonic noise weighting
coefficient is allowed to dynamically vary as a function of the pitch period D.
The harmonic noise-weighting filter is given by:

$$C(z) = 1 - \varepsilon_p(D) \sum_{i} b_i\, z^{-(D+i)}. \qquad (6)$$
As mentioned above, it is desirable to have less harmonic noise
weighting (C(z)) for larger values of D. Choosing εp as a decreasing function of
D (see Eq. 7) ensures a lower amount of harmonic noise weighting for larger
values of pitch delay. Although many functions εp(D) exist, in the preferred
embodiment of the present invention εp(D) is given by equation (7) and shown
graphically in FIG. 4.

where,
εmax is the maximum allowable value of the harmonic noise weighting
coefficient;
εmin is the minimum allowable value of the harmonic noise weighting
coefficient;
Dmax is the maximum pitch period above which the harmonic noise weighting
coefficient is set to εmin;
A is the slope for the harmonic noise weighting coefficient.
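The exact expression of equation (7) is not reproduced in this text. As a minimal sketch, the Python function below assumes one piecewise-linear form that is consistent with the definitions above: the coefficient is capped at the maximum value for short pitch periods, falls linearly with the given slope, and equals the minimum value at and above Dmax. This form, the function name, and all numeric values are assumptions for illustration, not necessarily the formula of the preferred embodiment.

```python
def hnw_coefficient(pitch_lag, eps_max, eps_min, d_max, slope):
    """Assumed decreasing, piecewise-linear mapping from pitch lag D to eps_p.

    - D >= d_max: eps_min
    - otherwise:  eps_min + slope * (d_max - D), capped at eps_max
    """
    if pitch_lag >= d_max:
        return eps_min
    return min(eps_max, eps_min + slope * (d_max - pitch_lag))


# Hypothetical parameter values chosen only to exercise each branch.
short_lag = hnw_coefficient(20, 0.4, 0.1, 100, 0.01)   # capped at eps_max
middle_lag = hnw_coefficient(90, 0.4, 0.1, 100, 0.01)  # on the linear segment
long_lag = hnw_coefficient(120, 0.4, 0.1, 100, 0.01)   # floor at eps_min
```

Any other non-increasing function of D bounded by the minimum and maximum values would serve the same purpose of applying less harmonic noise weighting at large pitch periods.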
Once εp(D) is determined by generator 309, εp(D) is supplied to filter
306 to generate the weighting filter WH(z). As described above, WH(z) is the
product of W(z) and C(z). The error s(n) − ŝ(n) is supplied to weighting filter
306 to generate the weighted error signal e(n). As in prior-art encoders, error
weighting filter 306 produces the weighted error signal e(n) based on a
difference between the input signal and the estimated input signal, that is:

$$E(z) = W_H(z)\,\bigl(S(z) - \hat{S}(z)\bigr).$$
Weighting filter WH(z) utilizes the frequency masking property of the
human ear, such that simultaneously occurring noise is masked by the stronger
signal provided the frequencies of the signal and the noise are close. Based on
the value of e(n), Squared Error Minimization/Parameter Quantization circuitry
307 produces values of τ, k, γ, and β, which are transmitted on the channel or stored
on a digital media device.
As discussed above, because HNW coefficients are a function of pitch
period, a better noise weighting can be performed and hence the speech
distortions are less noticeable to the listener.
FIG. 5 is a flow chart showing operation of encoder 300. The logic flow
begins at step 501 where a speech input (s(n)) is received by pitch analysis
circuitry 311. At step 503, pitch analysis circuitry 311 determines a pitch period
(D) and outputs D to HNW coefficient generator 309. HNW coefficient
generator 309 utilizes D to determine a harmonic noise weighting coefficient
(εp) and outputs εp to perceptual error weighting filter 306 (step
505). Finally, at step 507 filter 306 utilizes εp to produce a perceptual noise
weighting function WH(z).
While the invention has been particularly shown and described with
reference to a particular embodiment, it will be understood by those skilled in
the art that various changes in form and details may be made therein without
departing from the spirit and scope of the invention. For example, although a
specific formula was given for the production of WH(z) from εp, it is intended
that other means for producing WH(z) from εp may be utilized. For example,
the summation term in the definition of C(z) in equation (6) can be further
modified before multiplying with εp. Additionally, in an alternate embodiment
εp can be based on τ, with τ (see FIG. 6) replacing D in equation (7). As
discussed above, τ is defined as the closed-loop pitch delay, with εp being a
decreasing function of τ. Thus, equation (7) becomes:

where,
εmax is the maximum allowable value of the harmonic noise weighting
coefficient;
εmin is the minimum allowable value of the harmonic noise weighting
coefficient;
τmax is the maximum closed-loop pitch delay above which the harmonic noise
weighting coefficient is set to εmin;
A is the slope for the harmonic noise weighting coefficient.
Claims
1. A method for performing harmonic noise weighting in a digital speech coder,
the method comprising the steps of:
receiving a speech input s(n);
determining a pitch period (D) from the speech input;
determining a harmonic noise weighting coefficient εp based on the
pitch period; and
determining a perceptual noise weighting function WH(z) based on the
harmonic noise weighting coefficient.
2. The method of claim 1 wherein εp is a decreasing function of D.
3. The method of claim 2 wherein:

εmax is a maximum allowable value of the harmonic noise weighting coefficient;
εmin is a minimum allowable value of the harmonic noise weighting coefficient;
Dmax is a maximum pitch period above which the harmonic noise weighting
coefficient is set to εmin; and
A is the slope for the harmonic noise weighting coefficient.
4. A method for performing harmonic noise weighting in a digital speech coder,
the method comprising the steps of:
receiving a speech input s(n);
determining a closed-loop pitch delay (τ) from the speech input;
determining a harmonic noise weighting coefficient εp based on the
closed-loop pitch delay; and
determining a perceptual noise weighting function WH(z) based on the
harmonic noise weighting coefficient.
5. The method of claim 4 wherein εp is a decreasing function of τ.
6. The method of claim 5 wherein:

where,
εmax is a maximum allowable value of the harmonic noise weighting coefficient;
εmin is a minimum allowable value of the harmonic noise weighting coefficient;
τmax is a maximum closed-loop pitch delay above which the harmonic noise
weighting coefficient is set to εmin; and
A is the slope for the harmonic noise weighting coefficient.
7. An apparatus comprising:
pitch analysis circuitry having speech (s(n)) as an input and outputting a
pitch period (D) based on the speech;
a harmonic noise coefficient generator having D as an input and
outputting a harmonic noise weighting coefficient (εp) based on D; and
a perceptual error weighting filter having εp as an input and utilizing
εp to generate a weighted error signal e(n), wherein e(n) is based on a
difference between s(n) and an estimate of s(n).
8. An apparatus comprising:
a harmonic noise coefficient generator having a closed-loop pitch delay
(τ) as an input and outputting a harmonic noise weighting coefficient (εp)
based on τ; and
a perceptual error weighting filter having εp as an input and utilizing
εp to generate a weighted error signal e(n), wherein e(n) is based on a
difference between s(n) and an estimate of s(n).

Patent Number 233834
Indian Patent Application Number 753/KOLNP/2006
PG Journal Number 16/2009
Publication Date 17-Apr-2009
Grant Date 16-Apr-2009
Date of Filing 29-Mar-2006
Name of Patentee MOTOROLA, INC.
Applicant Address 1303 EAST ALGONQUIN ROAD, SCHAUMBURG, ILLINOIS 60196
Inventors:
# Inventor's Name Inventor's Address
1 MITTAL, UDAR 964 C ATLANTIC AVENUE, HOFFMAN ESTATES, IL 60194
2 ASHLEY, JAMES, P. 1816 ARABIAN AVENUE, NAPERVILLE, IL 60565
PCT International Classification Number G10L 19/00, 19/02
PCT International Application Number PCT/US2004/035757
PCT International Filing date 2004-10-26
PCT Conventions:
# PCT Application Number Date of Convention Priority Country
1 60/515,581 2003-10-30 U.S.A.
2 10/965,462 2004-10-14 U.S.A.