Title of Invention

"EFFICIENT CODING OF DIGITAL MEDIA SPECTRAL DATA USING WIDE-SENSE PERCEPTUAL SIMILARITY"

Abstract

Traditional audio encoders may conserve coding bit-rate by encoding fewer than all spectral coefficients, which can produce a blurry low-pass sound in the reconstruction. An audio encoder using wide-sense perceptual similarity improves the quality by encoding a perceptually similar version of the omitted spectral coefficients, represented as a scaled version of already coded spectrum. The omitted spectral coefficients are divided into a number of sub-bands. The sub-bands are encoded as two parameters: a scale factor, which may represent the energy in the band; and a shape parameter, which may represent a shape of the band. The shape parameter may be in the form of a motion vector pointing to a portion of the already coded spectrum, an index to a spectral shape in a fixed code-book, or a random noise vector. The encoding thus efficiently represents a scaled version of a similarly shaped portion of spectrum to be copied at decoding.
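The two-parameter encoding described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patentee's implementation: the function names (`rms`, `normalize`, `encode_subband`), the unit-RMS normalization, and the exhaustive mean-square search over every offset of the already coded spectrum are all hypothetical choices made for clarity. The "motion vector" is modeled simply as an integer offset into the coded baseband.

```python
import math

def rms(band):
    """Root-mean-square value of a list of coefficients."""
    return math.sqrt(sum(c * c for c in band) / len(band))

def normalize(band):
    """Scale a band to unit RMS; an all-zero band is returned unchanged."""
    level = rms(band)
    return [c / level for c in band] if level > 0 else list(band)

def encode_subband(baseband, subband):
    """Return (scale_factor, motion_vector, error) for one omitted sub-band.

    scale_factor  : RMS of the sub-band (assumed energy measure).
    motion_vector : offset into `baseband` of the best-matching portion.
    error         : mean-square error between the normalized shapes.
    """
    n = len(subband)
    scale = rms(subband)
    target = normalize(subband)
    best_offset, best_err = 0, float("inf")
    # Exhaustive search over every portion of the coded spectrum (assumption).
    for offset in range(len(baseband) - n + 1):
        candidate = normalize(baseband[offset:offset + n])
        err = sum((t - c) ** 2 for t, c in zip(target, candidate)) / n
        if err < best_err:
            best_offset, best_err = offset, err
    return scale, best_offset, best_err

# Illustrative data: the sub-band has the shape of baseband[3:6], scaled by 0.1.
baseband = [4.0, -2.0, 1.0, 3.0, -1.0, 0.5, 2.0, -4.0]
subband = [0.3, -0.1, 0.05]
scale, motion_vector, error = encode_subband(baseband, subband)
```

In this sketch an encoder would compare `error` against a similarity threshold of its choosing and, if the match is too poor, fall back to a fixed code-book shape or a random noise vector, as the abstract describes.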

Full Text

DESCRIPTION (COMPLETE): OCR NOT PREPARED DUE TO PRINT PROBLEM

We claim:
1. An audio encoding method, comprising:
transforming an input audio signal block into a set of spectral coefficients;
dividing the spectral coefficients into plural sub-bands;
coding values of the spectral coefficients of at least one of the sub-bands in an output bit-stream; and
for at least one of the other sub-bands, coding said other sub-band in the output bit-stream as a scaled version of a shape of a portion of the at least one of the sub-bands coded as spectral coefficient values.
2. The audio encoding method of claim 1, wherein said coding said other sub-band comprises coding said other sub-band using a scale parameter and a shape parameter, wherein the shape parameter indicates the portion and the scale parameter is a scaling factor to scale the portion.
3. The audio encoding method of claim 2, wherein said scaling factor represents total energy of said other sub-band.
4. The audio encoding method of claim 3, wherein said scaling factor is a root-mean-square value of coefficients within said other sub-band.
5. The audio encoding method of claim 2, wherein said shape parameter is a motion vector.
6. The audio encoding method of claim 1, further comprising, for each of plural other sub-bands:
performing a search to determine which of a plurality of portions of the at least one of the sub-bands coded as spectral coefficients is more similar in shape to the respective other sub-band;
determining whether the determined portion is sufficiently similar in shape to the respective other sub-band;
if so, coding the respective other sub-band as a scaled version of the shape of the determined portion; and
otherwise, coding the respective other sub-band as a scaled version of a shape in a fixed codebook or of a random noise vector.
7. The audio encoding method of claim 6, wherein said performing the search comprises performing a least-mean-square comparison to a normalized version of each of the plurality of portions.
8. The audio encoding method of claim 6, wherein said otherwise coding the respective other sub-band comprises:
performing a search among shapes represented in a fixed codebook for a shape that is more similar in shape to the respective other sub-band;
if such similar shape is found in the fixed codebook, coding the respective other sub-band as a scaled version of such similar shape in the fixed codebook; and
otherwise, coding the respective other sub-band as a scaled version of a random noise vector.
9. An audio encoder, comprising:
a transform for transforming an input audio signal block into a set of spectral coefficients;
a base coder for coding values of the spectral coefficients of a baseband portion of the spectral coefficients of the set in an output bit-stream; and
a wide-sense perceptual similarity coder for coding at least one other sub-band of other spectral coefficients of the set as a scaled shape of a sub-portion of the baseband portion.
10. The audio encoder of claim 9, wherein the wide-sense perceptual similarity coder produces an encoding of the other sub-band that represents the scaled shape of the sub-portion using a scaling factor parameter and a motion vector parameter.
11. The audio encoder of claim 10, wherein said scaling factor parameter represents total energy of said other sub-band.
12. The audio encoder of claim 11, wherein said scaling factor is a root-mean-square value of coefficients within said other sub-band.
13. The audio encoder of claim 10, wherein the wide-sense perceptual similarity coder further comprises:
means for performing a search, for each of plural other sub-bands, to determine which of a plurality of portions of the at least one of the sub-bands coded as spectral coefficients is more similar in shape to the respective other sub-band;
means for determining whether the determined portion is sufficiently similar in shape to the respective other sub-band; and
means for coding the respective other sub-band as a scaled version of the shape of the determined portion, if determined to be sufficiently similar in shape.
14. The audio encoder of claim 10, wherein the wide-sense perceptual similarity coder further comprises:
means for performing a search, for each of plural other sub-bands, among shapes represented in a fixed codebook for a shape that is sufficiently similar in shape to the respective other sub-band; and
means for coding those sub-bands determined to be sufficiently similar in shape to a shape in the fixed codebook as a scaling factor parameter and a motion vector indicating the shape in the fixed codebook.
15. An audio decoder for the encoder of claim 9, comprising:
a base decoder for decoding the encoded values of the spectral coefficients of the baseband portion;
a wide-sense perceptual similarity decoder for decoding the encoded other sub-band by copying and scaling the sub-portion of the baseband portion to reproduce a semblance of the spectral coefficients of the other sub-band; and
an inverse transform for transforming the decoded spectral coefficients into a reproduction of the input audio signal block.
16. A digital media encoding method, comprising:
transforming an input signal block into a set of spectral coefficients;
dividing the spectral coefficients into plural disjoint or overlapping sub-bands;
coding each sub-band via a selected coding process that best represents the sub-band in a wide-sense perceptual sense given a set of bit-rate, buffer size, and encoder complexity constraints, where the coding process is selected from the following coding processes:
coding the sub-band using a baseband codec;
representing the sub-band as an appropriately scaled version of a portion of already coded spectrum;
representing the sub-band as an appropriately scaled version of a vector from a fixed codebook; and
representing the sub-band as an appropriately scaled version of random noise.
17. A method for decoding a coded digital media stream encoded by the method of claim 16, the method for decoding comprising:
decoding those of the sub-bands encoded using the baseband codec;
for each sub-band not encoded using the baseband codec,
decoding a scale factor parameter and motion vector, where the motion vector represents a spectral shape of the portion of already coded spectrum, the vector from a fixed codebook, or random noise; and
scaling the spectral shape indicated by the motion vector according to the scale factor to reconstruct an approximation of the respective sub-band.
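
The decoding step recited in claims 15 and 17 can be illustrated with a short sketch. It is an assumption-laden example rather than the claimed decoder: the function name `reconstruct_subband`, the representation of the motion vector as a (`source`, `index`) pair, the unit-RMS normalization of the shape before scaling, and the seeded Gaussian noise generator are all hypothetical. It shows only that the same scale-and-copy operation serves all three shape sources: a portion of the already coded spectrum, a fixed codebook vector, or random noise.

```python
import math
import random

def rms(band):
    """Root-mean-square value of a list of coefficients."""
    return math.sqrt(sum(c * c for c in band) / len(band))

def reconstruct_subband(scale, source, index, width, baseband, codebook, seed=0):
    """Rebuild an approximation of one sub-band from its two parameters.

    scale  : decoded scale factor (assumed to be the target RMS).
    source : "baseband", "codebook", or "noise" (assumed signalling).
    index  : offset into the baseband, or entry number in the codebook.
    width  : number of coefficients in the sub-band.
    """
    if source == "baseband":
        shape = baseband[index:index + width]      # copy decoded spectrum
    elif source == "codebook":
        shape = codebook[index]                    # fixed, pre-agreed shape
    elif source == "noise":
        rng = random.Random(seed)                  # seeded for repeatability
        shape = [rng.gauss(0.0, 1.0) for _ in range(width)]
    else:
        raise ValueError(f"unknown shape source: {source!r}")
    level = rms(shape)
    if level == 0.0:
        return [0.0] * len(shape)                  # nothing to scale
    # Normalize the shape to unit RMS, then apply the decoded scale factor.
    return [scale * c / level for c in shape]

# Illustrative data only.
baseband = [4.0, -2.0, 1.0, 3.0, -1.0, 0.5, 2.0, -4.0]
codebook = [[1.0, 0.0, -1.0]]
copied = reconstruct_subband(0.5, "baseband", 3, 3, baseband, codebook)
from_codebook = reconstruct_subband(2.0, "codebook", 0, 3, baseband, codebook)
noise = reconstruct_subband(0.25, "noise", 0, 3, baseband, codebook)
```

Whatever the shape source, the reconstructed band in this sketch has an RMS equal to the decoded scale factor, which matches the reading of the scale factor as an energy measure in claims 3, 4, 11, and 12.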

Documents:

2740-delnp-2005-Abstract-(14-02-2014).pdf

2740-delnp-2005-abstract.pdf

2740-delnp-2005-Assignment-(14-02-2014).pdf

2740-delnp-2005-Claims-(14-02-2014).pdf

2740-delnp-2005-Claims-(26-09-2014).pdf

2740-delnp-2005-claims.pdf

2740-delnp-2005-Correspondence Others-(07-03-2013).pdf

2740-delnp-2005-Correspondence Others-(14-02-2014).pdf

2740-DELNP-2005-Correspondence-101114.pdf

2740-DELNP-2005-Correspondence-Others-(03-06-2010).pdf

2740-DELNP-2005-Correspondence-Others-(15-12-2010).pdf

2740-delnp-2005-Correspondence-Others-(26-09-2014).pdf

2740-delnp-2005-correspondence-others.pdf

2740-delnp-2005-description (complete).pdf

2740-DELNP-2005-Drawing-101114.pdf

2740-delnp-2005-drawings.pdf

2740-DELNP-2005-Form 1-101114.pdf

2740-DELNP-2005-Form 5-101114.pdf

2740-delnp-2005-Form--5(26-09-2014).pdf

2740-DELNP-2005-Form-1-(15-12-2010).pdf

2740-delnp-2005-Form-1-(26-09-2014).pdf

2740-delnp-2005-form-1.pdf

2740-DELNP-2005-Form-13-(03-06-2010).pdf

2740-delnp-2005-form-18.pdf

2740-delnp-2005-Form-2-(26-09-2014).pdf

2740-delnp-2005-form-2.pdf

2740-delnp-2005-Form-3-(07-03-2013).pdf

2740-delnp-2005-Form-3-(26-09-2014).pdf

2740-delnp-2005-form-3.pdf

2740-delnp-2005-form-5.pdf

2740-DELNP-2005-GPA-(03-06-2010).pdf

2740-delnp-2005-GPA-(26-09-2014).pdf

2740-delnp-2005-gpa.pdf

2740-DELNP-2005-OTHERS-101114.pdf

2740-delnp-2005-pct-101.pdf

2740-delnp-2005-pct-105.pdf

2740-delnp-2005-pct-210.pdf

2740-delnp-2005-pct-220.pdf

2740-delnp-2005-pct-237.pdf

2740-delnp-2005-pct-301.pdf

2740-delnp-2005-pct-304.pdf

2740-delnp-2005-Petition-137-(07-03-2013).pdf

2740-delnp-2005-Petition-137-(26-09-2014).pdf

Patent Number

264244

Indian Patent Application Number

2740/DELNP/2005

PG Journal Number

51/2014

Publication Date

19-Dec-2014

Grant Date

16-Dec-2014

Date of Filing

21-Jun-2005

Name of Patentee

MICROSOFT CORPORATION

Applicant Address

ONE MICROSOFT WAY, REDMOND, WASHINGTON 98052, U.S.A.

Inventors:

#	Inventor's Name	Inventor's Address
1	SANJEEV MEHROTRA	10201 127th Avenue NE, Kirkland, Washington 98033, United States of America
2	WEI-GE CHEN	24635 SE 37th Street, Issaquah, Washington 98029, United States of America

PCT International Classification Number

G06F

PCT International Application Number

PCT/US2004/24935

PCT International Filing date

2004-07-29

PCT Conventions:

#	PCT Application Number	Date of Convention	Priority Country
1	10/882,801	2004-06-29	U.S.A.
2	60/539,046	2004-01-23	U.S.A.