Title of Invention | "EFFICIENT CODING OF DIGITAL MEDIA SPECTRAL DATA USING WIDE-SENSE PERCEPTUAL SIMILARITY" |
---|---|
Abstract | Traditional audio encoders may conserve coding bit-rate by encoding fewer than all spectral coefficients, which can produce a blurry low-pass sound in the reconstruction. An audio encoder using wide-sense perceptual similarity improves the quality by encoding a perceptually similar version of the omitted spectral coefficients, represented as a scaled version of the already coded spectrum. The omitted spectral coefficients are divided into a number of sub-bands. Each sub-band is encoded as two parameters: a scale factor, which may represent the energy in the band; and a shape parameter, which may represent a shape of the band. The shape parameter may be in the form of a motion vector pointing to a portion of the already coded spectrum, an index to a spectral shape in a fixed code-book, or a random noise vector. The encoding thus efficiently represents a scaled version of a similarly shaped portion of the spectrum to be copied at decoding. |
Full Text | DESCRIPTION (COMPLETE): OCR NOT PREPARED DUE TO PRINT PROBLEM.
We claim:
1. An audio encoding method, comprising: transforming an input audio signal block into a set of spectral coefficients; dividing the spectral coefficients into plural sub-bands; coding values of the spectral coefficients of at least one of the sub-bands in an output bit-stream; and for at least one of the other sub-bands, coding said other sub-band in the output bit-stream as a scaled version of a shape of a portion of the at least one of the sub-bands coded as spectral coefficient values.
2. The audio encoding method of claim 1, wherein said coding said other sub-band comprises coding said other sub-band using a scale parameter and a shape parameter, wherein the shape parameter indicates the portion and the scale parameter is a scaling factor to scale the portion.
3. The audio encoding method of claim 2, wherein said scaling factor represents total energy of said other sub-band.
4. The audio encoding method of claim 3, wherein said scaling factor is a root-mean-square value of coefficients within said other sub-band.
5. The audio encoding method of claim 2, wherein said shape parameter is a motion vector.
6. The audio encoding method of claim 1, further comprising, for each of plural other sub-bands: performing a search to determine which of a plurality of portions of the at least one sub-bands coded as spectral coefficients is more similar in shape to the respective other sub-band; determining whether the determined portion is sufficiently similar in shape to the respective other sub-band; if so, coding the respective other sub-band as a scaled version of the shape of the determined portion; and otherwise, coding the respective other sub-band as a scaled version of a shape in a fixed codebook or of a random noise vector.
7. The audio encoding method of claim 6, wherein said performing the search comprises performing a least-means-square comparison to a normalized version of each of the plurality of portions.
8. The audio encoding method of claim 6, wherein said otherwise coding the respective other sub-band comprises: performing a search among shapes represented in a fixed codebook for a shape that is more similar in shape to the respective other sub-band; if such similar shape is found in the fixed codebook, coding the respective other sub-band as a scaled version of such similar shape in the fixed codebook; and otherwise, coding the respective other sub-band as a scaled version of a random noise vector.
9. An audio encoder, comprising: a transform for transforming an input audio signal block into a set of spectral coefficients; a base coder for coding values of the spectral coefficients of a baseband portion of the spectral coefficients of the set in an output bit-stream; and a wide-sense perceptual similarity coder for coding at least one other sub-band of other spectral coefficients of the set as a scaled shape of a sub-portion of the baseband portion.
10. The audio encoder of claim 9, wherein the wide-sense perceptual similarity coder produces an encoding of the other sub-band that represents the scaled shape of the sub-portion using a scaling factor parameter and a motion vector parameter.
11. The audio encoder of claim 10, wherein said scaling factor parameter represents total energy of said other sub-band.
12. The audio encoder of claim 11, wherein said scaling factor is a root-mean-square value of coefficients within said other sub-band.
13. The audio encoder of claim 10, wherein the wide-sense perceptual similarity coder further comprises: means for performing a search, for each of plural other sub-bands, to determine which of a plurality of portions of the at least one sub-bands coded as spectral coefficients is more similar in shape to the respective other sub-band; means for determining whether the determined portion is sufficiently similar in shape to the respective other sub-band; and means for coding the respective other sub-band as a scaled version of the shape of the determined portion, if determined to be sufficiently similar in shape.
14. The audio encoder of claim 10, wherein the wide-sense perceptual similarity coder further comprises: means for performing a search, for each of plural other sub-bands, among shapes represented in a fixed codebook for a shape that is sufficiently similar in shape to the respective other sub-band; means for coding those sub-bands determined to be sufficiently similar in shape to a shape in the fixed codebook as a scaling factor parameter and a motion vector indicating the shape in the fixed codebook.
15. An audio decoder for the encoder of claim 9, comprising: a base decoder for decoding the encoded values of the spectral coefficient of the baseband portion; and a wide-sense perceptual similarity decoder for decoding the encoded other sub-band by copying and scaling the sub-portion of the baseband portion to reproduce a semblance of the spectral coefficients of the other sub-band; and an inverse transform for transforming the decoded spectral coefficients into a reproduction of the input audio signal block.
16. A digital media encoding method, comprising: transforming an input signal block into a set of spectral coefficients; dividing the spectral coefficients into plural disjoint or overlapping sub-bands; coding each sub-band via a selected coding process that best represents the sub-band in a wide-sense perceptual sense given a set of bit-rate, buffer size, and encoder complexity constraints, where the coding process is selected from the following coding processes: coding the sub-band using a baseband codec; representing the sub-band as an appropriately scaled version of a portion of already coded spectrum; representing the sub-band as an appropriately scaled version of a vector from a fixed codebook; and representing the sub-band as an appropriately scaled version of random noise.
17. A method for decoding a coded digital media stream encoded by the method of claim 16, the method for decoding comprising: decoding those of sub-bands encoded using the baseband codec; for each sub-band not encoded using the baseband codec, decoding a scale factor parameter and motion vector, where the motion vector represents a spectral shape of the portion of already coded spectrum, the vector from a fixed codebook, or random noise; and scaling the spectral shape indicated by the motion vector according to the scale factor to reconstruct an approximation of the respective sub-band. |
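
The scale-and-shape coding described in claims 2 to 8 and the matching decoding in claims 15 and 17 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the function names, the `(scale, mode, index)` tuple, the `threshold` value used as the "sufficiently similar" test, and the use of a seed to reproduce the noise vector are all hypothetical choices. The claims only specify an RMS scale factor, a least-squares comparison against normalized portions, and the fallback order of baseband portion, fixed codebook, then random noise.

```python
import math
import random


def rms(band):
    """Root-mean-square of the coefficients (the scale factor of claim 4)."""
    return math.sqrt(sum(x * x for x in band) / len(band))


def normalize(band):
    """Scale a band to unit RMS so that only its shape remains."""
    r = rms(band)
    return [x / r for x in band] if r > 0 else [0.0] * len(band)


def mse(a, b):
    """Mean squared error between two equal-length shapes."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def encode_band(band, baseband, codebook, threshold=0.5, noise_seed=0):
    """Return (scale, mode, index) for one sub-band.

    threshold and noise_seed are assumed parameters, not from the source.
    """
    scale = rms(band)
    target = normalize(band)
    n = len(band)

    # Search every window of the already coded baseband for the closest shape.
    best_off, best_err = None, float("inf")
    for off in range(len(baseband) - n + 1):
        err = mse(target, normalize(baseband[off:off + n]))
        if err < best_err:
            best_off, best_err = off, err
    if best_err <= threshold:
        return (scale, "baseband", best_off)

    # Otherwise search the fixed codebook for a sufficiently similar shape.
    best_idx, best_err = None, float("inf")
    for i, shape in enumerate(codebook):
        err = mse(target, normalize(shape))
        if err < best_err:
            best_idx, best_err = i, err
    if best_err <= threshold:
        return (scale, "codebook", best_idx)

    # Last resort: signal a random noise vector, identified here by a seed.
    return (scale, "noise", noise_seed)


def decode_band(params, baseband, codebook, n):
    """Rebuild an approximation of the sub-band by copying and scaling a shape."""
    scale, mode, index = params
    if mode == "baseband":
        shape = baseband[index:index + n]
    elif mode == "codebook":
        shape = codebook[index]
    else:
        rng = random.Random(index)
        shape = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [scale * x for x in normalize(shape)]
```

For example, a sub-band that is exactly half of `baseband[1:5]` would be coded as its RMS value plus the offset 1, and `decode_band` would reproduce it by copying that window and rescaling it. Normalizing both the target and each candidate before comparison keeps the search sensitive to shape only, so energy is carried entirely by the scale factor.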
---|
2740-delnp-2005-Abstract-(14-02-2014).pdf
2740-delnp-2005-Assignment-(14-02-2014).pdf
2740-delnp-2005-Claims-(14-02-2014).pdf
2740-delnp-2005-Claims-(26-09-2014).pdf
2740-delnp-2005-Correspondence Others-(07-03-2013).pdf
2740-delnp-2005-Correspondence Others-(14-02-2014).pdf
2740-DELNP-2005-Correspondence-101114.pdf
2740-DELNP-2005-Correspondence-Others-(03-06-2010).pdf
2740-DELNP-2005-Correspondence-Others-(15-12-2010).pdf
2740-delnp-2005-Correspondence-Others-(26-09-2014).pdf
2740-delnp-2005-correspondence-others.pdf
2740-delnp-2005-description (complete).pdf
2740-DELNP-2005-Drawing-101114.pdf
2740-DELNP-2005-Form 1-101114.pdf
2740-DELNP-2005-Form 5-101114.pdf
2740-delnp-2005-Form--5(26-09-2014).pdf
2740-DELNP-2005-Form-1-(15-12-2010).pdf
2740-delnp-2005-Form-1-(26-09-2014).pdf
2740-DELNP-2005-Form-13-(03-06-2010).pdf
2740-delnp-2005-Form-2-(26-09-2014).pdf
2740-delnp-2005-Form-3-(07-03-2013).pdf
2740-delnp-2005-Form-3-(26-09-2014).pdf
2740-DELNP-2005-GPA-(03-06-2010).pdf
2740-delnp-2005-GPA-(26-09-2014).pdf
2740-DELNP-2005-OTHERS-101114.pdf
2740-delnp-2005-Petition-137-(07-03-2013).pdf
2740-delnp-2005-Petition-137-(26-09-2014).pdf
Patent Number | 264244 |
---|---|
Indian Patent Application Number | 2740/DELNP/2005 |
PG Journal Number | 51/2014 |
Publication Date | 19-Dec-2014 |
Grant Date | 16-Dec-2014 |
Date of Filing | 21-Jun-2005 |
Name of Patentee | MICROSOFT CORPORATION |
Applicant Address | ONE MICROSOFT WAY, REDMOND, WASHINGTON 98052, U.S.A. |
Inventors | |
PCT International Classification Number | G06F |
PCT International Application Number | PCT/US2004/24935 |
PCT International Filing date | 2004-07-29 |
PCT Conventions | |