|
For humans to hear steady changes in loudness, sound intensity must increase in equal ratios rather than equal increments; loudness perception is, in other words, roughly logarithmic.[i] This logarithmic sensitivity to sound amplitude allows our species to discriminate between subtle changes in the level of very soft sounds while accommodating a wide range of very loud sounds. Given this property of human hearing, would it not make sense for digital audio systems to encode amplitude as a logarithmic value rather than a linear one?
This question occurred to the author during a seminar in Digital Audio Processing, taken as part of a graduate degree program in Music Technology at New York University. Although the idea may not have been inspired directly by the group discussion, it is not surprising that it arose in a seminar that examined all aspects of digital audio conversion, processing and error correction, including data compression and perceptual encoding methods.
The majority of digital audio systems use linear amplitude encoding, in which a numeric value is assigned in proportion to the input signal’s voltage.[ii] Although linear encoding has adequately served the needs of digital audio system designers so far, it is inefficient: fully half of the available numeric values are assigned to the highest 6 dB of level. To take the Compact Disc (CD) system as an example, of its 65,536 possible numeric values, 32,768 are used for the loudest 6 dB of levels, while the other 32,768 must cover the roughly 90 dB that remain.[iii] In contrast, a system using logarithmic amplitude encoding would assign a numeric value in proportion to the input signal’s decibel level. In such a system, the numeric values cover the dynamic range evenly, and a numeric increase of a given size produces the same increase in decibel level regardless of whether the signal is soft or loud.
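The contrast between the two mappings can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names, the 96 dB range and the treatment of amplitude as an unsigned magnitude are hypothetical choices made here, not a description of any existing converter.

```python
import math


def linear_fraction_in_top(db: float) -> float:
    """Fraction of linearly spaced code values that lie within `db` dB of full scale."""
    # A level `db` below full scale corresponds to the amplitude ratio 10^(-db/20);
    # every linear code above that ratio falls inside the top `db` decibels.
    return 1.0 - 10.0 ** (-db / 20.0)


def log_encode(level_db: float, bits: int = 16, range_db: float = 96.0) -> int:
    """Map a level in dB relative to full scale (0 = full scale) to an integer code.

    Codes are spaced evenly in decibels across `range_db` (an assumed range).
    """
    top_code = 2 ** bits - 1
    clamped = max(-range_db, min(0.0, level_db))
    return round((clamped + range_db) / range_db * top_code)


def log_decode(code: int, bits: int = 16, range_db: float = 96.0) -> float:
    """Inverse of log_encode: return the level in dB represented by `code`."""
    top_code = 2 ** bits - 1
    return code / top_code * range_db - range_db


# About half of all linear codes sit in the top ~6.02 dB (a factor of 2 in amplitude).
print(linear_fraction_in_top(20 * math.log10(2)))

# In the logarithmic mapping, one code step is the same number of dB everywhere.
print(log_decode(1000) - log_decode(999))
print(log_decode(60000) - log_decode(59999))
```

In this sketch, a step between adjacent logarithmic codes is always `range_db / (2**bits - 1)` decibels, whereas the decibel size of a linear step depends on where in the range it falls.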
Because its numeric values are more efficiently distributed, logarithmic encoding may provide an improved method of data compression, so long as the process itself cannot be perceived.
We are at an interesting juncture in the development of digital audio. New storage technologies make it possible to store larger audio files. New transmission protocols allow data to be sent at faster speeds. Such developments present a tantalizing opportunity to improve upon the limitations of “CD-quality,” 16-bit audio by increasing the stored size of digital audio files to the point at which human perception cannot detect differences between a source and its recordings.
Current proposals for improving the amplitude resolution (or bit-depth) of consumer linear digital audio systems suggest increasing the size of the digital audio sample from 16 bits to 20 or 24 bits. An increase in bit-depth would certainly improve sound quality, as the number of bits determines both the noise floor of a linear digital audio system and the level of distortion introduced by the sampling process. In a linear digital audio system, each of these flaws has its greatest impact on low-level signals. As an audio signal (such as the end of a note) drops in level, at some point it will be 72 dB below full scale. In a 16-bit system, this waveform level is only 24 dB above the noise floor and is expressed over a numeric range of only 16 steps. Increasing the sample size to 20 bits improves the signal-to-noise ratio by 24 dB (to 48 dB) and increases the number of steps by a factor of 16 (to 256). If the sample size is instead increased to 24 bits, the signal-to-noise ratio improves by 48 dB (to 72 dB) and the number of steps increases by a factor of 256 (to 4096).[iv] These are useful improvements, but they come at the high cost of additional data overhead. Relative to a 16-bit digital audio file, a 20-bit file is 25% larger and a 24-bit file is 50% larger.
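The figures quoted above follow from the rule of thumb that each bit of a linear sample contributes about 6 dB of dynamic range. The following Python sketch reproduces the arithmetic; the function names are hypothetical, and the rounded value of 6 dB per bit (more precisely about 6.02 dB) is the approximation the text itself uses.

```python
DB_PER_BIT = 6  # rounded rule of thumb; 20*log10(2) is about 6.02 dB


def margin_db(bits: int, level_dbfs: int = -72) -> int:
    """Distance in dB between a signal at `level_dbfs` and the linear noise floor."""
    return bits * DB_PER_BIT + level_dbfs


def step_count(bits: int, level_dbfs: int = -72) -> int:
    """Number of quantization steps spanned by a signal at `level_dbfs`."""
    # Every 6 dB of margin above the noise floor is one usable bit.
    return 2 ** (margin_db(bits, level_dbfs) // DB_PER_BIT)


def size_increase_percent(bits: int, base_bits: int = 16) -> float:
    """Growth in file size relative to a `base_bits` file, in percent."""
    return 100 * (bits - base_bits) / base_bits


for bits in (16, 20, 24):
    print(bits, margin_db(bits), step_count(bits), size_increase_percent(bits))
```

For a signal 72 dB below full scale, the loop yields margins of 24, 48 and 72 dB, step counts of 16, 256 and 4096, and size increases of 0%, 25% and 50% for 16, 20 and 24 bits respectively.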
But while low-level signals would benefit from the increase in resolution, there is no perceptible improvement in the quality of high-level signals.[v]
Logarithmic encoding, by contrast, has a noise floor that remains in fixed proportion to the input signal. That is, as the input level decreases, so does the system’s noise. As bit-depth is increased, the noise floor is reduced relative to the input signal. Therefore, at some bit-depth, the input program would always effectively mask the noise floor. The distortion level of a logarithmic encoder is also proportional to the input level and is greatest for high-level signals. In this regard, its behavior is more like that of analog recorders than that of linear encoders, in which distortion is greatest at the lowest levels. Again, at some bit-depth, the distortion would also be effectively masked by the signal. An interesting property of a logarithmic encoding system is that all signals benefit from an increase in bit-depth, not just those at the lowest levels.
Logarithmic encoding thus surpasses linear encoding in data efficiency, making it better suited to the storage or transmission of high-resolution digital audio data. By using logarithmic encoding during transmission, for example, it might be possible to preserve all of the perceptual detail of a high-resolution, linearly recorded signal, yet decrease transmission time by 30% or more.
The author believes that the merits of logarithmic encoding may have been briefly explored during the development of early digital audio systems, but were dismissed because of the limitations of the technology available at that time.[vi] Specifically, these limitations include slow data access rates, slow data processing speeds, limited storage capacity and the difficulty of manufacturing converters that encode logarithmically. However, recent developments in computer processing and storage devices make the implementation of logarithmic encoding more feasible.
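The claim that logarithmic quantization error stays in fixed proportion to the signal, while linear quantization error does not, can be checked numerically. The Python sketch below is illustrative only: the 96 dB range, the reservation of one bit for sign and all function names are assumptions made here, and it models ideal quantizers rather than real converters.

```python
import math


def quantize_linear(x: float, bits: int = 16) -> float:
    """Round x (nominally in [-1, 1]) to the nearest uniformly spaced signed code."""
    scale = 2 ** (bits - 1)
    code = max(-scale, min(scale - 1, round(x * scale)))
    return code / scale


def quantize_log(x: float, bits: int = 16, range_db: float = 96.0) -> float:
    """Quantize the magnitude of x on a uniform decibel grid, preserving its sign."""
    if x == 0.0:
        return 0.0
    steps = 2 ** (bits - 1) - 1  # one bit reserved for the sign (assumption)
    step_db = range_db / steps
    level_db = 20.0 * math.log10(abs(x))
    level_db = max(-range_db, min(0.0, level_db))  # clamp to the assumed range
    code = round(level_db / step_db)  # integer code in [-steps, 0]
    return math.copysign(10.0 ** (code * step_db / 20.0), x)


def worst_relative_error(quantize, level_db: float, bits: int = 16, n: int = 200) -> float:
    """Largest |q(x) - x| / x over n amplitudes spread across 1 dB below level_db."""
    worst = 0.0
    for k in range(n):
        x = 10.0 ** ((level_db - k / n) / 20.0)
        worst = max(worst, abs(quantize(x, bits) - x) / x)
    return worst


for level in (0.0, -72.0):
    print(level,
          worst_relative_error(quantize_linear, level),
          worst_relative_error(quantize_log, level))
```

Under these assumptions, the relative error of the logarithmic quantizer is of the same order at full scale and at 72 dB below it, whereas the relative error of the linear quantizer is very small near full scale and grows to several percent at the lower level.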
Thus, the subject deserves a fresh examination.
There are bona fide sonic benefits to improving aspects of CD-quality audio other than bit-depth, such as increasing the sampling rate or expanding the number of audio channels from two to four or more. Although these topics are beyond the scope of this paper, it can be assumed that such improvements would benefit linear and logarithmic encoding methods equally.
This paper attempts to demonstrate how digital audio systems that use logarithmic encoding can be developed, tested and implemented:
Two Appendices follow the summary:
[i] Stanley R. Alten, Audio in Media, Fifth Edition (Belmont, CA: Wadsworth Publishing Company, 1999), 16.
[ii] Ken C. Pohlmann, Principles of Digital Audio, Fourth Edition (New York: McGraw-Hill, 2000), 38.
[iii] Pohlmann, 36–38.
[iv] Pohlmann, 36.
[v] Pohlmann, 37.
[vi] Pohlmann, 38. |