Title of Invention | METHODS AND APPARATUSES FOR CODING OF CHANNELS IN PARAMETRIC MULTI-CHANNEL CODING SYSTEM |
---|---|
Abstract | The invention relates to a method in a parametric multi-channel coding system for encoding a multi-channel audio signal having a plurality of audio input channels, characterized by comprising: applying a parametric audio encoding technique to generate parametric audio codes for a first subset of the audio input channels for a first frequency region; and applying the parametric audio encoding technique to generate parametric audio codes for a second subset of the audio input channels for a second frequency region, wherein: the second frequency region is different from the first frequency region; and the second subset is different from the first subset. |
Full Text | FIELD OF THE INVENTION The present invention relates to the encoding of audio signals and the subsequent synthesis of auditory scenes from the encoded audio data. BACKGROUND OF THE INVENTION Multi-channel surround audio systems have been standard in movie theaters for years. As technology has advanced, it has become affordable to produce multi-channel surround systems for home use. Today, such systems are mostly sold as "home theater systems." Conforming to an ITU-R recommendation, the vast majority of these systems provide five regular audio channels and one low-frequency sub-woofer channel (denoted the low-frequency effects or LFE channel). Such a multi-channel system is denoted a 5.1 surround system. There are other surround systems, such as 7.1 (seven regular channels and one LFE channel) and 10.2 (ten regular channels and two LFE channels). C. Faller and F. Baumgarte, "Efficient representation of spatial audio coding using perceptual parametrization," IEEE Workshop on Application of Sig. Proc. to Audio and Acoust., October 2001, and C. Faller and F. Baumgarte, "Binaural Cue Coding Applied to Stereo and Multi-Channel Audio Compression," Preprint 112th Conv. Aud. Eng. Soc., May 2002 (collectively, "the BCC papers"), the teachings of both of which are incorporated herein by reference, describe a parametric multi-channel audio coding technique (referred to as BCC coding). Figure 1 shows a block diagram of an audio processing system 100 that performs binaural cue coding (BCC) according to the BCC papers. BCC system 100 has a BCC encoder 102 that receives C. invention have (1) reduced processing loads at both the encoder and decoder and (2) smaller BCC code bitstreams than corresponding BCC-based systems that process all six channels at all frequencies. 
More generally, the present invention involves the application of parametric audio coding techniques, such as BCC coding, but not necessarily limited to BCC coding, where two or more different subsets of input channels are processed for two or more different frequency ranges. As used in this specification, the term "subset" may refer to the set containing all of the input channels as well as to those proper subsets that include fewer than all of the input channels. The application of the present invention to BCC coding of 5.1 and other surround signals is just one particular example of the present invention. Accordingly, there is provided a method in a parametric multi-channel coding system for encoding a multi-channel audio signal having a plurality of audio input channels, characterized by comprising applying a parametric audio encoding technique to generate parametric audio codes for a first subset of the audio input channels for a first frequency region; and applying the parametric audio encoding technique to generate parametric audio codes for a second subset of the audio input channels for a second frequency region, wherein the second frequency region is different from the first frequency region; and the second subset is different from the first subset. BRIEF DESCRIPTION OF THE ACCOMPANYING DRAWINGS Other aspects, features, and advantages of the present invention will become more fully apparent from the following detailed description, the appended claims, and the accompanying drawings, in which: Figure 1 shows a block diagram of an audio processing system that performs binaural cue coding (BCC); and Figure 2 shows a block diagram of an audio processing system that performs BCC coding according to one embodiment of the present invention. DETAILED DESCRIPTION Figure 2 shows a block diagram of an audio processing system 200 that performs binaural cue coding (BCC) for 5.1 surround audio, according to one embodiment of the present invention. 
BCC system 200 has a BCC encoder 202, which receives six audio input channels 208 (i.e., five regular channels and one LFE channel). BCC encoder 202 has a downmixer 210, which converts (e.g., averages) the audio input channels (including the LFE channel) into one or more, but fewer than six, combined channels 212. In addition, BCC encoder 202 has a BCC analyzer 214, which generates BCC cue code data stream 216 for the input channels. As indicated in Figure 2, for frequency sub-bands at or below a specified cut-off frequency fc, BCC analyzer 214 uses all six 5.1 surround sound input channels (including the LFE channel) when generating the BCC cue code data. For all other (i.e., high-frequency) sub-bands, BCC analyzer 214 uses only the five regular channels (and not the LFE channel) to generate the BCC cue code data. As a result, the LFE channel contributes BCC codes for only BCC sub-bands at or below the cut-off frequency rather than for the full BCC frequency range, thereby reducing the overall size of the side-information bitstream. The cut-off frequency is preferably chosen such that the effective audio bandwidth of the LFE channel is smaller than or equal to fc (that is, the LFE channel has substantially zero energy or insubstantial audio content beyond the cut-off frequency). Unless the frequency sub-bands are aligned with the cut-off frequency, the cut-off frequency falls within a particular frequency sub-band. In that case, part of that sub-band will exceed the cut-off frequency. For purposes of this specification, such a sub-band is referred to as being "at" the cut-off frequency. In preferred embodiments, that entire sub-band of the LFE channel is BCC coded, and the next higher frequency sub-band is the first high-frequency sub-band that is not BCC coded. In one possible implementation, the BCC cue codes include inter-channel level difference (ICLD), inter-channel time difference (ICTD), and inter-channel correlation (ICC) data for the input channels. 
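The per-sub-band channel selection described above can be illustrated with a minimal Python sketch. It is not the analyzer of Figure 2: the channel labels, the sub-band edges, and the 150 Hz cut-off are hypothetical values chosen only for illustration. The one rule taken from the text is that a sub-band lying below the cut-off frequency, or containing it (a sub-band "at" the cut-off), is analyzed with all six channels.

```python
from typing import List, Tuple

# Hypothetical labels for the five regular channels and the LFE channel.
REGULAR = ["L", "R", "C", "Ls", "Rs"]
LFE = "LFE"


def channels_for_band(band: Tuple[float, float], cutoff_hz: float) -> List[str]:
    """Return the channels analyzed for one sub-band given as (low_hz, high_hz)."""
    low_hz, _high_hz = band
    # A band entirely below the cut-off, or one that contains it ("at" the
    # cut-off), always starts below the cut-off, so only the lower edge matters.
    if low_hz < cutoff_hz:
        return REGULAR + [LFE]
    return list(REGULAR)


def coded_pairs(bands: List[Tuple[float, float]], cutoff_hz: float) -> int:
    """Count (sub-band, channel) pairs for which cue codes would be generated."""
    return sum(len(channels_for_band(band, cutoff_hz)) for band in bands)


# Assumed sub-band edges in Hz; a real system would use its own partition.
example_bands = [(0, 100), (100, 200), (200, 400), (400, 800), (800, 1600)]
example_cutoff_hz = 150.0  # assumed value; it falls inside the (100, 200) band

pairs_with_cutoff = coded_pairs(example_bands, example_cutoff_hz)
pairs_full_band = len(example_bands) * (len(REGULAR) + 1)
```

With these assumed values, the first two sub-bands carry six channels and the remaining three carry five, so fewer (sub-band, channel) pairs need cue codes than when all six channels are analyzed in every sub-band.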
BCC analyzer 214 preferably performs band-based processing analogous to that described in the '877 and '458 applications to generate ICLD and ICTD data for different frequency sub-bands of the audio input channels. In addition, BCC analyzer 214 preferably generates coherence measures as the ICC data for the different frequency sub-bands. These coherence measures are described in greater detail in the '437 and '591 applications. BCC encoder 202 transmits the one or more combined channels 212 and the BCC cue code data stream 216 (e.g., as either in-band or out-of-band side information with respect to the combined channels) to a BCC decoder 204 of BCC system 200. BCC decoder 204 has a side-information processor 218, which processes data stream 216 to recover the BCC cue codes 220 (e.g., ICLD, ICTD, and ICC data). BCC decoder 204 also has a BCC synthesizer 222, which uses the recovered BCC cue codes 220 to synthesize six audio output channels 224 from the one or more combined channels 212 for rendering by six surround-sound loudspeakers 226, respectively. As indicated in Fig. 2, BCC synthesizer 222 performs six-channel BCC synthesis for sub-bands at or below the cut-off frequency fc to generate frequency content for all six 5.1 surround channels (i.e., including the LFE channel), while performing five-channel BCC synthesis for sub-bands above the cut-off frequency to generate frequency content for only the five regular channels of 5.1 surround sound. In particular, BCC synthesizer 222 decomposes the received combined channel(s) 212 into a number of frequency sub-bands (e.g., critical bands). In these sub-bands, different processing is applied to obtain the corresponding sub-bands of the output audio channels. The result is that, for the LFE channel, only sub-bands with frequencies at or below the cut-off frequency are obtained. In other words, the LFE channel has frequency content only for sub-bands at or below the cut-off frequency. 
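The decoder-side split described above can be sketched in Python as follows. This is a simplified illustration rather than the synthesizer 222 itself: it assumes a single combined channel already decomposed into sub-bands, and it replaces the full cue-code processing with one hypothetical linear gain per channel and sub-band. The LFE channel receives content only in sub-bands flagged as at or below the cut-off frequency; in this sketch its other sub-bands are set to zeros.

```python
from typing import Dict, List


def synthesize(
    combined_bands: List[List[float]],
    band_gains: List[Dict[str, float]],
    at_or_below_cutoff: List[bool],
    lfe: str = "LFE",
) -> List[Dict[str, List[float]]]:
    """Scale each sub-band of the combined channel into per-channel sub-bands.

    combined_bands[b]     samples of sub-band b of the combined channel
    band_gains[b]         hypothetical per-channel gains for sub-band b
    at_or_below_cutoff[b] True if sub-band b is at or below the cut-off
    """
    outputs = []
    for samples, gains, is_low in zip(combined_bands, band_gains, at_or_below_cutoff):
        band_out = {}
        for channel, gain in gains.items():
            if channel == lfe and not is_low:
                # Above the cut-off the LFE channel is never synthesized.
                continue
            band_out[channel] = [gain * x for x in samples]
        if not is_low:
            # Fill the LFE channel with a zero signal in high sub-bands.
            band_out[lfe] = [0.0] * len(samples)
        outputs.append(band_out)
    return outputs


# Two sub-bands of an assumed mono combined channel, with made-up gains.
example_out = synthesize(
    combined_bands=[[1.0, 2.0], [3.0, 4.0]],
    band_gains=[{"L": 0.5, "LFE": 2.0}, {"L": 0.5}],
    at_or_below_cutoff=[True, False],
)
```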
The upper sub-bands of the LFE channel (i.e., those above the cut-off frequency) may be filled with zero signals (if necessary). Depending on the particular implementation, a BCC encoder could be designed to generate BCC cue codes for all frequencies and simply not transmit those codes for particular sub-bands (e.g., sub-bands above the cut-off frequency and/or sub-bands having substantially zero energy). Similarly, the corresponding BCC decoder could be designed to perform conventional BCC synthesis for all frequencies, where the BCC decoder applies appropriate BCC cue code values for those sub-bands having no explicitly transmitted codes. Although the present invention has been described in the context of BCC decoders that apply the techniques of the '877 and '458 applications to synthesize auditory scenes, the present invention can also be implemented in the context of BCC decoders that apply other techniques for synthesizing auditory scenes that do not necessarily rely on the techniques of the '877 and '458 applications. For example, the BCC processing of the present invention can be implemented without ICTD, ICLD, and/or ICC data, with or without other suitable cue codes, such as, for example, those associated with head-related transfer functions. In the embodiment of Fig. 2, 5.1 surround sound is encoded by applying six-channel BCC analysis to sub-bands at or below the cut-off frequency and five-channel BCC analysis to sub-bands above the cut-off frequency. In another embodiment, the present invention can be applied to 7.1 surround sound in which eight-channel BCC analysis is applied to sub-bands at or below a specified cut-off frequency and seven-channel BCC analysis (excluding the single LFE channel) is applied to sub-bands above the cut-off frequency. The present invention can also be applied to surround audio having more than one LFE channel. 
For example, for 10.2 surround sound, twelve-channel BCC analysis could be applied to sub-bands at or below a specified cut-off frequency, while ten-channel BCC analysis (excluding the two LFE channels) could be applied to sub-bands above the cut-off frequency. Alternatively, there could be two different cut-off frequencies specified: a first cut-off frequency for a first LFE channel of the 10.2 surround sound and a second cut-off frequency for the second LFE channel. In this case, and assuming that the first cut-off frequency is lower than the second cut-off frequency, twelve-channel BCC analysis could be applied to sub-bands at or below the first cut-off frequency, eleven-channel BCC analysis (excluding the first LFE channel) could be applied to sub-bands that are (1) above the first cut-off frequency and (2) at or below the second cut-off frequency, and ten-channel BCC analysis (excluding both LFE channels) could be applied to sub-bands above the second cut-off frequency. Similarly, some consumer multi-channel equipment is purposely designed with different output channels having different frequency ranges. For example, some 5.1 surround sound equipment has two rear channels that are designed to reproduce only frequencies below 7 kHz. The present invention could be applied to such systems by specifying two cut-off frequencies: one for the LFE channel and a higher one for the rear channels. In this case, six-channel BCC analysis could be applied to sub-bands at or below the LFE cut-off frequency, five-channel BCC analysis (excluding the LFE channel) could be applied to sub-bands that are (1) above the LFE cut-off frequency and (2) at or below the rear-channel cut-off frequency, and three-channel BCC analysis (excluding the LFE channel and the two rear channels) could be applied to sub-bands above the rear-channel cut-off frequency. 
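The multi-cut-off cases above follow one pattern: each channel has its own cut-off frequency (or none), and a sub-band is analyzed with every channel whose cut-off it is at or below. The Python sketch below illustrates that pattern under stated assumptions. The channel labels and the LFE cut-off values are hypothetical; only the 7 kHz rear-channel limit comes from the text. A channel without a cut-off is given an infinite one.

```python
import math
from typing import Dict, List


def subset_for_band(band_low_hz: float, cutoffs_hz: Dict[str, float]) -> List[str]:
    """Return the channels analyzed in a sub-band that starts at band_low_hz.

    A sub-band is at or below a channel's cut-off when it starts below it,
    which also covers the sub-band that contains the cut-off frequency.
    """
    return [ch for ch, cutoff in cutoffs_hz.items() if band_low_hz < cutoff]


# 10.2 surround: ten regular channels plus two LFE channels.
# The 120 Hz and 200 Hz cut-offs are assumed values for illustration.
cutoffs_10_2 = {f"R{i}": math.inf for i in range(1, 11)}
cutoffs_10_2["LFE1"] = 120.0
cutoffs_10_2["LFE2"] = 200.0

# 5.1 surround with band-limited rear channels (7 kHz, as in the text)
# and an assumed 120 Hz LFE cut-off.
cutoffs_5_1_rear = {
    "L": math.inf,
    "R": math.inf,
    "C": math.inf,
    "Ls": 7000.0,
    "Rs": 7000.0,
    "LFE": 120.0,
}

low_10_2 = subset_for_band(0.0, cutoffs_10_2)      # all twelve channels
mid_10_2 = subset_for_band(150.0, cutoffs_10_2)    # first LFE channel excluded
high_10_2 = subset_for_band(300.0, cutoffs_10_2)   # both LFE channels excluded
high_5_1 = subset_for_band(8000.0, cutoffs_5_1_rear)
```

Expressing the rule per channel avoids enumerating the regions by hand: the twelve-, eleven-, and ten-channel regions of the 10.2 example fall out of the two LFE cut-offs.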
The present invention can be generalized further to apply parametric audio coding to two or more different subsets of input channels for two or more different frequency regions, where the parametric audio coding could be other than BCC coding and the different frequency regions are chosen such that the frequency content of the different input channels is reflected in these regions. Depending on the particular application, different channels could be excluded from different frequency regions in any suitable combinations. For example, low-frequency channels could be excluded from high-frequency regions and/or high-frequency channels could be excluded from low-frequency regions. It may even be the case that no single frequency region involves all of the input channels. As described previously, although the input channels 208 can be downmixed to form a single combined (e.g., mono) channel 212, in alternative implementations, the multiple input channels can be downmixed to form two or more different "combined" channels, depending on the particular audio processing application. More information on such techniques can be found in U.S. patent application no. 10/762,100, filed on 01/20/04, the teachings of which are incorporated herein by reference. In some implementations, when downmixing generates multiple combined channels, the combined channel data can be transmitted using conventional audio transmission techniques. For example, when two combined channels are generated, conventional stereo transmission techniques may be employed. In this case, a BCC decoder can extract and use the BCC codes to synthesize a multi-channel signal (e.g., 5.1 surround sound) from the two combined channels. Moreover, this can provide backwards compatibility, where the two BCC combined channels are played back using conventional (i.e., non-BCC-based) stereo decoders that ignore the BCC codes. 
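The downmixing options above (one mono combined channel, or two or more combined channels) can be illustrated with a short Python sketch. It uses plain averaging, which the description names as one example of downmixing. The channel labels and the grouping of input channels into each combined channel are hypothetical, not taken from the specification or from the referenced application.

```python
from typing import Dict, List


def downmix(channels: Dict[str, List[float]], groups: List[List[str]]) -> List[List[float]]:
    """Average each group of equally long input channels into one combined channel."""
    combined = []
    for group in groups:
        length = len(channels[group[0]])
        combined.append(
            [sum(channels[name][i] for name in group) / len(group) for i in range(length)]
        )
    return combined


# Three short made-up input channels.
example_channels = {
    "L": [1.0, 2.0],
    "R": [3.0, 4.0],
    "C": [5.0, 6.0],
}

# One group yields a single (mono) combined channel.
mono = downmix(example_channels, [["L", "R", "C"]])

# Two groups yield two combined channels, e.g. for stereo transmission;
# sharing "C" between both groups is an assumption of this sketch.
stereo = downmix(example_channels, [["L", "C"], ["R", "C"]])
```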
Analogously, backwards compatibility can be achieved for a conventional mono decoder when a single BCC combined channel is generated. Note that, in theory, when there are multiple "combined" channels, one or more of the combined channels may actually be based on individual input channels. Although BCC system 200 can have the same number of audio input channels as audio output channels, in alternative embodiments, the number of input channels could be either greater than or less than the number of output channels, depending on the particular application. For example, the input audio could correspond to 7.1 surround sound and the synthesized output audio could correspond to 5.1 surround sound, or vice versa. In general, BCC encoders of the present invention may be implemented in the context of converting M input audio channels into N combined audio channels and one or more corresponding sets of BCC codes, where M>N≥1. Similarly, BCC decoders of the present invention may be implemented in the context of generating P output audio channels from the N combined audio channels and the corresponding sets of BCC codes, where P>N, and P may be the same as or different from M. Depending on the particular implementation, the various signals received and generated by both BCC encoder 202 and BCC decoder 204 of Fig. 2 may be any suitable combination of analog and/or digital signals, including all analog or all digital. Although not shown in Fig. 2, those skilled in the art will appreciate that the one or more combined channels 212 and the BCC cue code data stream 216 may be further encoded by BCC encoder 202 and correspondingly decoded by BCC decoder 204, for example, based on some appropriate compression scheme (e.g., ADPCM) to further reduce the size of the transmitted data. The definition of transmission of data from BCC encoder 202 to BCC decoder 204 will depend on the particular application of audio processing system 200. 
For example, in some applications, such as live broadcasts of music concerts, transmission may involve real-time transmission of the data for immediate playback at a remote location. In other applications, "transmission" may involve storage of the data onto CDs or other suitable storage media for subsequent (i.e., non-real-time) playback. Of course, other applications may also be possible. Depending on the particular implementation, the transmission channels may be wired or wireless and can use customized or standardized protocols (e.g., IP). Media like CD, DVD, digital tape recorders, and solid-state memories can be used for storage. In addition, transmission and/or storage may, but need not, include channel coding. Similarly, although the present invention has been described in the context of digital audio systems, those skilled in the art will understand that the present invention can also be implemented in the context of analog audio systems, such as AM radio, FM radio, and the audio portion of analog television broadcasting, each of which supports the inclusion of an additional in-band low-bitrate transmission channel. The present invention can be implemented for many different applications, such as music reproduction, broadcasting, and telephony. For example, the present invention can be implemented for digital radio/TV/internet (e.g., Webcast) broadcasting such as Sirius Satellite Radio or XM. Other applications include voice over IP, PSTN or other voice networks, analog radio broadcasting, and Internet radio. Depending on the particular application, different techniques can be employed to embed the sets of BCC codes into a combined channel to achieve a BCC signal of the present invention. The availability of any particular technique may depend, at least in part, on the particular transmission/storage medium(s) used for the BCC signal. 
For example, the protocols for digital radio broadcasting usually support inclusion of additional enhancement bits (e.g., in the header portion of data packets) that are ignored by conventional receivers. These additional bits can be used to represent the sets of auditory scene parameters to provide a BCC signal. In general, the present invention can be implemented using any suitable technique for watermarking of audio signals in which data corresponding to the sets of auditory scene parameters are embedded into the audio signal to form a BCC signal. For example, these techniques can involve data hiding under perceptual masking curves or data hiding in pseudo-random noise. The pseudo-random noise can be perceived as comfort noise. Data embedding can also be implemented using methods similar to bit robbing used in TDM (time division multiplexing) transmission for in-band signaling. Another possible technique is mu-law LSB bit flipping, where the least significant bits are used to transmit data. The present invention may be implemented as circuit-based processes, including possible implementation on a single integrated circuit. As would be apparent to one skilled in the art, various functions of circuit elements may also be implemented as processing steps in a software program. Such software may be employed in, for example, a digital signal processor, micro-controller, or general-purpose computer. The present invention can be embodied in the form of methods and apparatuses for practicing those methods. The present invention can also be embodied in the form of program code embodied in tangible media, such as floppy diskettes, CD-ROMs, hard drives, or any other machine-readable storage medium, wherein, when the program code is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the invention. 
The present invention can also be embodied in the form of program code, for example, whether stored in a storage medium, loaded into and/or executed by a machine, or transmitted over some transmission medium or carrier, such as over electrical wiring or cabling, through fiber optics, or via electromagnetic radiation, wherein, when the program code is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the invention. When implemented on a general-purpose processor, the program code segments combine with the processor to provide a unique device that operates analogously to specific logic circuits. It will be further understood that various changes in the details, materials, and arrangements of the parts which have been described and illustrated in order to explain the nature of this invention may be made by those skilled in the art without departing from the scope of the invention as expressed in the following claims. WE CLAIM: 1. A method in a parametric multi-channel coding system for encoding a multi-channel audio signal having a plurality of audio input channels, characterized by comprising: applying a parametric audio encoding technique to generate parametric audio codes for a first subset of the audio input channels for a first frequency region; and applying the parametric audio encoding technique to generate parametric audio codes for a second subset of the audio input channels for a second frequency region, wherein: the second frequency region is different from the first frequency region; and the second subset is different from the first subset. 2. The method as claimed in claim 1, wherein the parametric audio encoding technique is binaural cue coding (BCC) encoding. 3. 
The method as claimed in claim 1, wherein: the multi-channel audio signal is a surround sound signal having a plurality of regular channels and at least one low-frequency effects (LFE) channel; the first subset comprises all of the audio input channels; the first frequency region corresponds to sub-bands at or below a specified cut-off frequency; the second frequency region corresponds to sub-bands above the cut-off frequency. 4. The method as claimed in claim 3, wherein the parametric audio encoding technique is BCC encoding. 5. The method as claimed in claim 3, wherein the cut-off frequency is at least the effective audio bandwidth of the LFE channel. 6. The method as claimed in claim 3, wherein the multi-channel audio signal is a 5.1 surround sound signal. 7. The method as claimed in claim 1, comprising transmitting the parametric audio codes for the first and second subsets of audio input channels. 8. An apparatus for encoding a multi-channel audio signal having a plurality of audio input channels, characterized by comprising: means for (214) applying a parametric audio encoding technique to generate parametric audio codes for a first subset of the audio input channels for a first frequency region; and means for (214) applying a parametric audio encoding technique to generate parametric audio codes for a second subset of the audio input channels for a second frequency region; wherein: the second frequency region is different from the first frequency region; and the second subset is different from the first subset. 9. 
A parametric audio encoder (202), comprising: a downmixer (210) adapted to generate one or more combined channels (212) from a plurality of audio input channels (208) of a multi-channel audio signal; characterized by comprising: an analyzer (214) adapted to generate: (1) parametric audio codes (216) for a first subset (M) of the audio output channels in a first frequency region; and (2) parametric audio codes (220) for a second subset (N) of the audio output channels in a second frequency region, wherein: the second frequency region is different from the first frequency region; and the second subset (N) is different from the first subset (M). 10. The encoder as claimed in claim 9, wherein the parametric audio codes are BCC codes. 11. The encoder as claimed in claim 9, wherein: the multi-channel audio signal is a surround sound signal having a plurality of regular channels and at least one LFE channel; the first subset comprises all of the audio output channels; the first frequency region corresponds to sub-bands at or below a specified cut-off frequency (fc); the second subset excludes the LFE channel; and the second frequency region corresponds to sub-bands above the cut-off frequency. 12. The encoder as claimed in claim 9, wherein the encoder is adapted to transmit the parametric audio codes for the first and second subsets of audio input channels. 13. A method in a parametric multi-channel coding system for synthesizing a multi-channel audio signal having a plurality of audio output channels, characterized by comprising: applying a parametric audio decoding technique to generate a first subset of the audio output channels for a first frequency region; and applying the parametric audio decoding technique to generate a second subset of the audio output channels for a second frequency region, wherein: the second frequency region is different from the first frequency region; and the second subset is different from the first subset. 14. 
The method as claimed in claim 13, wherein the parametric audio decoding technique is BCC decoding. 15. The method as claimed in claim 13, wherein: the multi-channel audio signal is a surround sound signal having a plurality of regular channels and at least one LFE channel; the first subset comprises all of the audio output channels; the first frequency region corresponds to sub-bands at or below a specified cut-off frequency; the second subset excludes the LFE channel; and the second frequency region corresponds to sub-bands above the cut-off frequency. 16. The method as claimed in claim 15, wherein the parametric audio decoding technique is BCC decoding. 17. The method as claimed in claim 15, wherein the cut-off frequency (fc) is at least the effective audio bandwidth of the LFE channel. 18. The method as claimed in claim 15, wherein the multi-channel audio signal is a 5.1 surround sound signal. 19. A parametric audio decoder (204), comprising: a parametric code processor (218) adapted to generate parametric codes (220); characterized by comprising: a synthesizer (222) adapted to apply the parametric codes (220) to one or more combined channels (224) to generate: (1) a first subset (P) of audio output channels of a multi-channel audio signal in a first frequency region; and (2) a second subset (N) of audio output channels of the multi-channel audio signal in a second frequency region, wherein: the second frequency region is different from the first frequency region; and the second subset (N) is different from the first subset (P). 20. The decoder as claimed in claim 19, wherein the parametric codes are BCC codes. 21. 
The decoder as claimed in claim 19, wherein: the multi-channel audio signal is a surround sound signal having a plurality of regular channels and at least one LFE channel; the first subset comprises all of the audio output channels; the first frequency region corresponds to sub-bands at or below a specified cut-off frequency; the second subset excludes the LFE channel; and the second frequency region corresponds to sub-bands above the cut-off frequency. |
---|
02531-kolnp-2006 correspondence others.pdf
02531-kolnp-2006 description(complete).pdf
02531-kolnp-2006 intermational publication.pdf
02531-kolnp-2006 intermational search authority report.pdf
02531-kolnp-2006-correspondence-1.1.pdf
2531-KOLNP-2006-(19-10-2011)-CORRESPONDENCE.pdf
2531-KOLNP-2006-(19-10-2011)-PA.pdf
2531-KOLNP-2006-(20-01-2012)-CORRESPONDENCE.pdf
2531-KOLNP-2006-AMANDED CLAIMS.pdf
2531-KOLNP-2006-AMANDED PAGES OF SPECIFICATION.pdf
2531-KOLNP-2006-ASSIGNMENT.pdf
2531-KOLNP-2006-CORRESPONDENCE 1.1.pdf
2531-KOLNP-2006-CORRESPONDENCE 1.2.pdf
2531-KOLNP-2006-CORRESPONDENCE 1.3.pdf
2531-KOLNP-2006-DESCRIPTION (COMPLETE).pdf
2531-KOLNP-2006-EXAMINATION REPORT REPLY RECIEVED.pdf
2531-KOLNP-2006-EXAMINATION REPORT.pdf
2531-KOLNP-2006-FORM 3 1.1.pdf
2531-KOLNP-2006-FORM 5 1.1.pdf
2531-KOLNP-2006-GRANTED-ABSTRACT.pdf
2531-KOLNP-2006-GRANTED-CLAIMS.pdf
2531-KOLNP-2006-GRANTED-DESCRIPTION (COMPLETE).pdf
2531-KOLNP-2006-GRANTED-DRAWINGS.pdf
2531-KOLNP-2006-GRANTED-FORM 1.pdf
2531-KOLNP-2006-GRANTED-FORM 2.pdf
2531-KOLNP-2006-GRANTED-SPECIFICATION.pdf
2531-KOLNP-2006-OTHERS 1.1.pdf
2531-KOLNP-2006-PETITION UNDER RULE 137-1.1.pdf
2531-KOLNP-2006-PETITION UNDER RULE 137.pdf
2531-KOLNP-2006-REPLY TO EXAMINATION REPORT 1.1.pdf
Patent Number | 253157 |
---|---|
Indian Patent Application Number | 2531/KOLNP/2006 |
PG Journal Number | 27/2012 |
Publication Date | 06-Jul-2012 |
Grant Date | 28-Jun-2012 |
Date of Filing | 04-Sep-2006 |
Name of Patentee | FRAUNHOFER-GESELLSCHAFT ZUR FOERDERUNG DER ANGEWANDTEN FORSCHUNG E.V. |
Applicant Address | HANSASTRASSE 27C, 80686, MUNICH, GERMANY |
PCT International Classification Number | G10L19/00; H04S3/00 |
PCT International Application Number | PCT/US2005/005605 |
PCT International Filing date | 2005-02-23 |