Music and Misc. Audiophilia
Yours truly keeps a modestly sized zoo of music recordings, mostly in the CD format as that is the most practical when you've got a PC-based playback setup and associated mobile player. Not into multichannel as I'm more of a headphone guy anyway. It was about time I wrote something on the topic… let's call it a blogalike. (I know, I know, I should get a real blog. Then again, everyone and their dog already has one, so… Whatever.)
Index of topics
- 32. Why over-compressed recordings sound as bad as they do
- 31. Yay! Podcasts for audio geeks!
- 30. Rockbox – the audio enthusiast's (almost) ideal MP3 player firmware
- 29. What am I listening to anyway?
- 28. How-To: Getting the best out of your MP3s
- 27. Are SHM-CDs a ripoff?
- 26. THD is dead, my friends – Audibility of distortion
- 25. Engineer's Corner: Fun with Chip Amps
- 24. Engineer's Corner: Pole Splitting in a Nutshell
- 23. What's so great about "Class A"?
- 22. X-Fi "machine gun" / "buzzing" noises: A fix?
- 21. My dear audio equipment tweakers…
- 20. About development of musical taste
- 19. Separate songwriters and performers or all original material – what's best?
- 18. Youtube memes
- 17. Remasters that are safe to buy
- 16. Gender relations in popular music
- 15. Engineer's Corner: Op-Amp Gain Error at High Frequencies
- 14. Stroboscopic speed indicators
- 13. How audio equipment specifications can help you (2)
- 12. A few places related to audio equipment measurements on the web
- 11. Testing for processing headroom in digital audio players
- 10. Apples, Oranges and Amplifier Specifications (1): Noise
- 9. How (not) to build an audiophile hi-fi component
- 8. Tweaking the Rotel RA-980BX Integrated Amplifier for Less Noise
- 7. Application Note: Using CD-Direct Functionality on Yamaha Hi-Fi Amplifiers
- 6. Remastering time!
- 5. Replaygain levels demystified
- 4. A few useful spreadsheets
- 3. Y knot...? (2)
- 2. Y knot...? (1)
- 1. On overly loud CDs, quiet MP3 players and things related
Why over-compressed recordings sound as bad as they do
Controlling dynamic range is important in recorded music. Average, peak and minimum levels have to be within certain ranges for our perception to work well. If even in a quiet listening environment you feel the need to ride the volume control all the time, chances are dynamics are too great. Conversely, something that came through well in a noisy environment may sound flat and lifeless in a quiet one. (Personally I've found that heavily compressed material can give me a headache or even cause nausea – I'm a bit sensitive to these things, I guess.) Using dynamics for effect can give music another dimension, but sadly this is not done very often.
It is quite common for "raw" recordings to contain excessive dynamics in the wrong places. Have you ever watched how good studio singers tend to move away from the microphone when they start singing more loudly? That's a basic form of dynamic range control. It gets a little harder when you have, say, a drum kit. In the olden days of tape recording, the inherent soft-limiting properties of tape would catch rogue peaks without it getting objectionable, but when digital recording came around, peak amplitudes became a real problem. Conventional analog dynamic range compressors (with VCAs and all) had been in use going back to the late '70s at least, when they were used to give pop recordings more "punch", but they weren't that effective for peak limiting. With the advent of DSPs and memory, however, look-ahead limiting became possible.
Unfortunately tools can be both used and abused, and by about 1997, popular music was firmly in "abused" territory, with the loudness war in full swing. This then got us into a decade of over-compressed music. Brickwall limiters weren't the only tool involved, but definitely the most well-known one. And it's not only that: studio recording culture in general has apparently gone from enhancing performances to taking away from them. One of the most common complaints of music listeners nowadays is that the studio versions of songs pale when compared to fairly basic live recordings. That's where I say something must have gone terribly wrong, unless the artist happened to evolve greatly in the meantime.
Let us, however, get back to digital look-ahead / brickwall limiting. There
is one central problem with digital limiters: Nonlinear operations in
the digital domain are not inherently band-limited.
Let me illustrate this with a simple example, a 10 kHz sine whose amplitude
was increased 3 dB beyond the point of clipping. Clipping most definitely is a
nonlinear operation.
Spectrum of clipped 10 kHz sine, fs = 44.1 kHz.
Oops. With clipping in the analog domain you would expect the original sine
along with a long line of its harmonics – and nothing else. In the band shown
here (up to half the sample rate), that would mean nothing but 10 and 20 kHz
components. Instead, we're getting a whole zoo of anharmonics here. As it turns
out, clipping extended the signal spectrum beyond half the sample rate, and that
in turn means the excess components reappear as aliasing. You can think of the
process as analog clipping followed by sampling without any kind of anti-alias
filtering.
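To put some numbers on this, here is a minimal sketch that folds the harmonics of the 10 kHz tone back into the 0..fs/2 range. The function name `alias_frequency` is made up for illustration, and the restriction to odd harmonics assumes ideal symmetric hard clipping (which produces odd harmonics only); a real-world clipper may behave differently.

```python
def alias_frequency(f_hz, fs_hz):
    """Frequency (0..fs/2) at which a component at f_hz appears after sampling at fs_hz."""
    f = f_hz % fs_hz          # fold into 0..fs
    return min(f, fs_hz - f)  # mirror the upper half back down


FS = 44100.0  # CD sample rate
F1 = 10000.0  # fundamental of the test tone

# Assumption: ideal symmetric clipping, hence odd harmonics only.
for k in range(1, 12, 2):
    f = k * F1
    print(f"harmonic {k:2d}: {f / 1000:6.1f} kHz -> {alias_frequency(f, FS) / 1000:5.1f} kHz")
```

The 3rd harmonic (30 kHz) lands at 14.1 kHz, the 5th at 5.9 kHz, the 7th at 18.2 kHz and the 9th at 1.8 kHz, none of which has any harmonic relation to 10 kHz.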
Let me borrow my spectrum for a "well-behaved"
signal from Frequency domain considerations,
reconstruction and aliasing:
(Sketch: power density over frequency f. The baseband spectrum extends from -f_max to f_max, with f_max below f_S/2, and repeats around f_S, from f_S - f_max to f_S + f_max, and so on.)
The unclipped sine was such a signal:
(Sketch: power density over frequency f. Single spectral lines at -f_1 and f_1, with f_1 below f_S/2, repeated at f_S - f_1 and f_S + f_1, and so on.)
The clipped sine isn't. Its harmonics "leak" into multiple repeat spectra.
(Sketch: power density over frequency f. In addition to the lines at -f_1, f_1, f_S - f_1 and f_S + f_1, the harmonics 2f_1 through 8f_1 are marked, reaching past f_S/2 and f_S into the repeat spectra.)
The repeat spectra obviously grow the same harmonics (not shown), and it all becomes a big mess.
Similar things happen in a digital limiter. Smart people have therefore invented oversampling limiters. If you work at a higher sample rate, the chance of any high-frequency components generating aliasing is greatly reduced. Here's what you get if you clip our sine at 192 kHz (RMAA wouldn't accept 176.4 for some reason):
Spectrum of clipped 10 kHz sine, fs = 192 kHz.
Now let's reduce levels a bit (at least 1.3 dB proved necessary in this case) and resample to 44.1 kHz again:
Spectrum of clipped 10 kHz sine, fs = 192 kHz, downsampled to 44.1 kHz.
Here's what you get if you start out at 176.4 kHz instead:
Spectrum of clipped 10 kHz sine, fs = 176.4 kHz, downsampled to 44.1 kHz.
Now doesn't that look a whole lot more well-behaved than the mess we started out with?
Spectrum of clipped 10 kHz sine, fs = 44.1 kHz.
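A rough way to see why the higher working rate helps: the sketch below looks for the lowest odd harmonic of the 10 kHz tone that exceeds half the sample rate and still folds back into the audio band (taken here as 0..22.05 kHz). The helper names are hypothetical, and the odd-harmonics-only assumption (ideal symmetric clipping) applies again.

```python
def alias_frequency(f_hz, fs_hz):
    """Frequency (0..fs/2) at which a component at f_hz appears after sampling at fs_hz."""
    f = f_hz % fs_hz
    return min(f, fs_hz - f)


def first_inband_alias(f1_hz, fs_hz, band_hz, max_harmonic=99):
    """Lowest odd harmonic above fs/2 whose alias falls inside 0..band_hz."""
    for k in range(3, max_harmonic + 1, 2):
        f = k * f1_hz
        if f > fs_hz / 2 and alias_frequency(f, fs_hz) <= band_hz:
            return k, alias_frequency(f, fs_hz)
    return None


for fs in (44100.0, 176400.0, 192000.0):
    print(fs, first_inband_alias(10000.0, fs, 22050.0))
```

At 44.1 kHz the 3rd harmonic already aliases into the audio band, while at 176.4 or 192 kHz nothing below the 17th does. Higher-order harmonics generally are a lot weaker, and whatever ends up between 22.05 kHz and half the working sample rate can be filtered out when resampling to 44.1 kHz.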
Another problem in non-oversampling limiters is that they generate intersample overs – peaks beyond full-scale amplitude that only show up in the signal's continuous-time representation (or when upsampling), with individual sample values being within the permitted range. On my "hottest" CDs, I have seen peaks up to +2.08 dB when resampling to 176.4k using SSRC (which is not too far off from the +3dBFS fs/4 test tone). Digital filtering and other processing, DAC and following analog stages must have enough headroom in order to avoid even further deterioration of signal quality.
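The fs/4 test tone mentioned above makes for a compact worked example. A sine at a quarter of the sample rate with a 45° phase offset is sampled exactly between its crests, so every sample sits at the same magnitude while the underlying waveform peaks about 3 dB higher. This is a sketch of the arithmetic only; the function names are made up.

```python
import math


def fs4_tone(n_samples, sample_peak=1.0):
    """Sine at fs/4, phase-shifted by 45 degrees so all samples share one magnitude."""
    amplitude = sample_peak * math.sqrt(2.0)  # peak of the continuous waveform
    return [amplitude * math.sin(2 * math.pi * n / 4 + math.pi / 4)
            for n in range(n_samples)]


def db_to_ratio(db):
    """Convert a level in dB to a linear amplitude ratio."""
    return 10 ** (db / 20)


samples = fs4_tone(8)
sample_peak = max(abs(s) for s in samples)                   # full scale, i.e. 0 dBFS
true_peak_db = 20 * math.log10(math.sqrt(2.0) / sample_peak)

print(f"sample peak: {sample_peak:.4f}, true peak: +{true_peak_db:.2f} dB")
print(f"+2.08 dB corresponds to {db_to_ratio(2.08):.3f} times full scale")
```

So a peak of +2.08 dB means the reconstructed waveform reaches roughly 1.27 times full scale, which is the kind of headroom the stages after the limiter need to provide.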
Entry last modified: 2012-01-20 – Entry created: 2012-01-18
Yay! Podcasts for audio geeks!
So far, I haven't run across too many podcasts that would be of interest to audio enthusiasts. Well, to be honest, I found the first one last week – Home Theater Geeks with Scott Wilkinson. In each episode of typically an hour, Scott (video)chats with a guest from the ranks of industry veterans. With names like Floyd Toole, Bob Carver, Nelson Pass, John Atkinson and Tyll Hertsens listed, you bet it'll get interesting. Two video qualities and one audio-only version (64 kbit/s 44.1 kHz mono MP3, OK sounding and Rockbox-friendly) are offered for download.
The episodes I have enjoyed so far include: #1, #9, #10, #14, #16, #17, #20, #22, #24, #29, #42, #51, #55, #69, #84, #90, #91, and #92 will probably be next.
It is obviously helpful to have some background so you don't have to take everything as gospel. Even veterans can be plain wrong once in a while (audiophiles in particular). It happens to (almost) all of us. Anyway, plenty of food for thought for sure.
Entry last modified: 2011-12-22 – Entry created: 2011-12-22
Rockbox – the audio enthusiast's (almost) ideal MP3 player firmware
Rockbox is an open-source firmware for MP3 players (more correctly called digital audio players) that has more recently also been ported to Android to be used as kind of a player application. In its fully-featured but functional and no-BS approach, it seems like the natural companion to the "Swiss army knife of audio players", Foobar2000. I'm a total fan, and let me tell you why:
- Rockbox comes with the proverbial kitchen sink. Where most player firmwares are rather basic, it offers a lot of genuinely useful settings and functions, and there's a context menu nearly everywhere. Playlists? Check. Bookmarks? Check. Themes? Check. Configurable quick screen for frequently used settings? Check. Parametric EQ? Check. last.fm logging without the need to use MTP? Check. Adjustable compressor for noisy surroundings? Check. Balance control? Check. Stereo width control? Check. Speed and pitch control? Check. Crossfading? Check. Crossfeed? Check. Custom music database view? Check. Plugins to provide both useful little applications and games (even a Gameboy emulator and, on sufficiently well-equipped players, a Doom clone)? Check. Voice output for navigation? Check. You're well advised to RTFM.
- Compared to stock firmwares, support for file formats usually is both wider and better. (Only DRM-infested tracks are not playable at all.)
- It has a nice music database system that updates in the background and, again, usually is faster and more robust when compared to stock firmwares. The popular Sandisk AMS players, for example, often fail to pick up part of the contents on big µSD cards when stock. Speaking of which, Rockbox would definitely support 64 gig µSDXC cards on these (but with FAT32 rather than exFAT, for which there apparently isn't even an open-source implementation yet). 32 giggers are quite popular right now.
- Startup (on flash-based players at least) is very quick. My Clip+ is ready to go in under 3 seconds. The original firmware needs almost twice as long, and that's by no means considered particularly slow.
- Rockbox does a lot right in the signal processing department.
Replaygain support, for example, is not just tacked on, but in fact the processing chain does provide enough headroom to allow for more than nominally full-scale output from the decoders. Assuming there is enough negative gain following, such material will be played back without clipping. I have seen peak amplitudes of more than ±1.5 on some MP3s with very "hot" material (MP3 is the most critical format here). "Dumb" decoder implementations like on my Clip+'s stock firmware just convert the +1..-1 range to digital full-scale (e.g. +32767..-32768) and clip everything that goes beyond that, so there is no chance of recovering the peaks even with Replaygain enabled.
Rockbox also offers noise-shaped dither to be used at the end of the processing chain, as the DACs in DAPs usually aren't more than 16-bit affairs.
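The difference between the two decoder approaches comes down to the order of clipping and gain. The following is a minimal sketch with made-up sample values and function names; it is not how Rockbox or any stock firmware is actually implemented.

```python
def db_to_gain(db):
    """Convert a gain in dB to a linear factor."""
    return 10 ** (db / 20)


def clip(x, limit=1.0):
    """Hard-limit a sample to the range -limit..+limit."""
    return max(-limit, min(limit, x))


def dumb_chain(samples, gain_db):
    """Clip the decoder output to full scale first, apply gain afterwards."""
    g = db_to_gain(gain_db)
    return [clip(s) * g for s in samples]


def headroom_chain(samples, gain_db):
    """Keep the decoder output as float, apply gain, clip only at the very end."""
    g = db_to_gain(gain_db)
    return [clip(s * g) for s in samples]


decoded = [0.5, 1.5, -1.2, 0.9]  # hypothetical decoder output with overshoots
gain_db = -6.0                   # hypothetical ReplayGain adjustment

print(dumb_chain(decoded, gain_db))      # the 1.5 and -1.2 peaks are flattened to 1.0 first
print(headroom_chain(decoded, gain_db))  # peaks survive, scaled to about 0.75 and -0.60
```

With enough negative gain following, the second chain never clips at all, whereas the first one has thrown away the peaks before the gain stage even gets to see them.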
So, are there any downsides? Well, of course.
- Rockbox does not run equally well on all the players it supports. On some models the power management has not been figured out entirely so battery life suffers, and even on the very common Sandisk Sansa Clip / Fuze family players which for the most part work like a charm, USB support only was enabled in very recent builds (post-3.10) after being troublesome and hence disabled for a long time.
- The internal DSP engine runs at a fixed 44.1 kHz sample rate, so that's all Rockbox can output. Material at other sample rates has to be resampled, which is done via linear interpolation, resulting in very little extra processor load but also about the lousiest quality possible. If you were wondering why your audiobooks at 24 kHz or less are sounding like a crappy metallic mess, this is it. While work on a better-quality resampler has begun, there still is quite a long way to go. Until then, better stick with CD-quality material. Quite a shame, since otherwise audiobook functionality is very nice.
- As mentioned, no DRM support. Us technical folks have generally been avoiding protected content like the plague anyway.
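For the curious, linear-interpolation resampling as mentioned in the list above boils down to very little code, which explains the low processor load. This is a generic sketch and not Rockbox's actual implementation; the function name is made up.

```python
def resample_linear(samples, fs_in, fs_out):
    """Resample by drawing straight lines between neighbouring input samples."""
    if len(samples) < 2:
        return list(samples)
    step = fs_in / fs_out                         # input samples per output sample
    n_out = int((len(samples) - 1) / step) + 1
    out = []
    for i in range(n_out):
        pos = i * step                            # position in the input stream
        idx = int(pos)
        frac = pos - idx
        if idx + 1 < len(samples):
            out.append(samples[idx] * (1 - frac) + samples[idx + 1] * frac)
        else:
            out.append(samples[idx])              # last sample, nothing to interpolate towards
    return out


print(resample_linear([0.0, 1.0, 2.0], 1, 2))     # doubling the rate of a short ramp
audiobook = resample_linear([0.0, 0.5, 1.0, 0.5, 0.0], 24000, 44100)
print(len(audiobook))
```

There is no proper low-pass filtering involved, so the spectral images created by the rate change are only weakly attenuated, which fits the metallic sound described above.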
Entry last modified: 2011-12-22 – Entry created: 2011-12-15
What am I listening to anyway?
For a music-related blog there's been precious little discussion of music here, don't you think? Then again, how interesting could the musical taste of a slightly weird egocentric technophile (and yes, headphone guy, no surprises there) ever be to the general public? Well, find out.
My parents had a modest record collection: Quite a bit of classical, some Elvis here, CCR, Beatles and 'Stones there, '70s pop like ABBA and Fleetwood Mac's inescapable "Rumours", prog-pop like Supertramp, ELO and Alan Parsons Project, electronic/instrumental from Jean-Michel Jarre and Mike Oldfield, and a bit of '80s stuff (like Springsteen's "Born in the USA" or some Chris Rea, and a Juliane Werding album with lyricist Michael Kunze – one of the few newer non-classical and non-folk things my mom likes). Nothing all too fancy there, getting some decent records in the GDR wasn't all that easy. The first CD would eventually be Roxette's "The Look" (again, not exactly music nerd territory but still something I might listen to out of nostalgia).
I could never get much into the Beatles, interestingly enough, and preferred classical, instrumental and prog-pop, with some CCR and ABBA sprinkled in (I never became a big ABBA fan but did like their "Dancing Queen" album, mostly identical to "Arrival"). My first CDs accordingly were classical, Mike Oldfield and Jean-Michel Jarre and the odd Roxette album.
Fast-forward to the present. In the meantime I became much more interested in music, fueled by the right kind of radio stations and ye olde interweb.
In terms of classical, I haven't progressed that much. The odd symphony here and there, and stuff like that. But hey, I still have a few decades till retirement. Coolest recording award: The 4th movement from Bruckner's 8th (Karajan / Preußische Staatskapelle), captured on early stereophonic tape in 1944. While obviously restored, resulting sound quality is a lot better than you'd expect in a recording that old.
The non-classical department is mostly populated by sophisticated / prog-influenced (indie-)pop and folk acts with a female twist. The usual suspects include:
- Kate Bush, well-known UK pop singer / songwriter. Big,
big, BIG inspiration. With some songs from her first two albums, you really
wonder how she knew about some things at such a young age, 30+ years ago no
less. Her debut "The Kick Inside" still holds up well today. Later work like
"The Dreaming" and "Hounds of Love" is nothing less than brilliant. Well, as we
all know, she didn't stay up there, but her recent output still tends to edge
out whatever colleague Peter Gabriel comes up with these days
(whose 4th eponymous is right up there in my all-time top albums for both
content and top-notch studio work).
Ms. Bush once mentioned Minnie Riperton in one of her songs ("All we ever look for"), and so it should not come as a big surprise that the US soul singer with an impressive vocal range (who, sadly, passed away at only 31) was an influence of hers.
- Alice (real name Carla Bissi), Italian pop singer who released some really nice albums in the '80s, like the wonderful moody "Il sole nella pioggia". Great singer, great album. My Italian still sucks, but that doesn't matter much here.
- Concrete Blonde, LA-based alternative rock band (with influences ranging from Leonard Cohen to goth) that never quite managed full-out mainstream success, but nonetheless terrific. Music for those who've been through some sh!t. (Yours truly is approaching 30, which is showing here and there, I guess.) Singer Johnette Napolitano could pass as the definition of "real"… and she displays some fairly impressive pipes here and there, too. Following her tweets, it appears that some new material is in the works, which would be exciting since the last regular album dates from 1993.
- Laurie Anderson, genius American performance-artist-gone-pop who turned out some great late-night headphone listening material (typically with very good sound quality). Very smart lady.
- Folk singers, old guard: jazzy era Joni Mitchell is great (if a bit too verbose for my tastes), and while we're in Canada, Gordon Lightfoot also is a great musical storyteller ("Canadian Railroad Trilogy", anyone?). Special mention to UK folk's golden voice, Sandy Denny (another of those taken too early). Going into the '80s and onwards, there's New Yorker Suzanne Vega, who has got to be one of the best lyricists of her generation.
- Folk singers, new guard: This is firmly a UK affair so far, and more on the poppy side than the above. First we have rising star Laura Marling, who has put out three high-quality albums so far (not to forget the fantastic folk crossover "Mumford & Sons, Laura Marling and Dharohar Project EP"). And the young lady is only 21. Hong Kong-born Emma-Lee Moss a.k.a. Emmy the Great put out what I considered a very strong second album this year.
- Cocteau Twins, legendary Scottish indie band with vocalist Elizabeth Fraser and guitarist Robin Guthrie. File under "one-offs". Strongest output around the mid-'80s. Modern bands that are similarly innovative and ethereal, hmm, try School of Seven Bells (loved their first album, though production on the second one seemed like a big step back to me).
- Siouxsie and the Banshees, UK band commonly credited with founding the "gothic" genre. If so, they're definitely from the artsy corner of goth, and quite accessible to the regular listener.
- Marina and the Diamonds (Marina Diamandis), Welsh-Greek
pop goddess with a knack for visuals who could make any chameleon go green with
envy. (If they typically weren't already, that is.) While it is not hard to see
why yours truly would appreciate a smart if egocentric young lady with guts and
determination (not exactly my forte) in spite of questionable vocal training,
the feeling of being reduced to an excited 13-year-old was a new one to me.
(Better 15 years late than never, right?) Her debut "The Family Jewels" was the
first (and so far, only) album I ever preordered, and it subsequently got her
to the #2 spot in my scrobble stats as I played it to death. Sound quality
isn't much to write home about, but there's lots of great tunes and
thought-provoking lyrics. I guess synaesthesia helps with the former.
Apparently she's been anything but immune to the modern world's ubiquitous
corrupting influences (which sadly applies to many of us younger folks), though
I do believe that she can still get herself out of the resulting internal mess.
Trivia: When the Family Jewels first came out, I started writing a fan letter, but when at the second attempt I was at 6K characters (~1000 words) and still hadn't gotten to the point I'd wanted to make, I figured she'd never bother to read all that crap and scrapped it. Oh well.
- Danielle Dax, formerly in Lemon Kittens, '80s UK independent post-punk / experimental / pop artist. Not only did she set up a recording studio of her own (pretty much unheard of among female artists in the early '80s), she also designed all of her stage outfits herself and as such can be seen as a blueprint for a number of modern-day UK indie acts (e.g. Bat For Lashes). With smart but usually rather cryptic lyrics and a keen ear on international music trends, she'd gravitate from the weird to the pop side of things towards the late '80s, but success remained limited. Reportedly Ms. Dax is now working as an interior designer.
- Throwing Muses, female-fronted country-esque / alternative rock / post-punk band with a penchant for tempo changes and bizarre lyrics. If you can get the "In A Doghouse" dual CD compilation with material from their early days, this is easily as crazy as it's good.
- St. Vincent, real name Annie Clark, quirky US indie rocker / singer-songwriter who likes to balance on the edge between experimentation and pop, or jump back and forth between the two for that matter. Also see: Sufjan Stevens (would deserve official legend status if he doesn't already have it), and similarly quirky singer-songwriter and violinist Andrew Bird.
- Fever Ray, formerly in electro-pop duo The Knife, standing for the lively Scandinavian electronic and electro-pop music scene. Also see Lykke Li (who released a great pop album earlier this year), Robyn or Annie.
- Björk, well-known Icelandic dance-pop / avantgarde legend, reportedly busy searching the globe for the latest trends nowadays. I am mainly a fan of her first two albums right now, especially "Post" which is quite varied.
- The Gathering, moody female-fronted Dutch prog/metal band who themselves have called their style "trip rock" (analogous to the trip-hop of the day). Their strongest output arguably dates from the mid-late '90s. They undoubtedly influenced many of the "beauty and the beast" type metal bands that would become very popular a few years later (Nightwish etc.). While we're in the Netherlands, a country with what seems like a very active rock and metal scene, check out the proggy "metal operas" that Arjen Lucassen a.k.a. Ayreon comes up with, with many contributors from the scene. Pretty cool stuff I thought.
- UK electro-pop duo Orchestral Manoeuvres in the Dark (OMD), who have been known to regret their pompous choice of name, turned out some really good albums in the early '80s before going very (cheesy) pop. I haven't heard their latest one, which seems to mark a return to their roots. For some reason, they make me think of US power-pop group the Cars, another good one. (I don't have enough Blondie either...)
- Given my musical past, Pink Floyd would seem inescapable, and in fact I like "Wish You Were Here" a lot, "Meddle" also is quite good. Interestingly enough, the million-seller DSOTM does not resonate with me much at all.
- The gentlemen Oldfield and Jarre have been joined by the inescapable Vangelis.
- In the one album wonders department, Sinéad O'Connor's 1987 debut deserves special mention, which is a "once in a lifetime" type affair. "Face Value" (more specifically, tracks 1, 4 and 5) is about all one needs from Phil Collins, too. Soul-pop songwriter Laura Nyro took her genre to an experimental extreme on "New York Tendaberry", which is definitely worth lending an ear to. As far as "one album and change" wonders go, singer-songwriter Tanita Tikaram would seem to be a good candidate – after a promising first album with the '80s hymn "Twist in my sobriety", the next ones turned out to be rather tiring affairs, but she did regain some ground with "Lovers in the City", which isn't bad at all.
This list is not nearly complete, but I hope I haven't forgotten anyone important.
Soul, funk and stuff like that plus jazzy things tend to be covered by corresponding radio programmes, so there's not as much of these in my personal library.
Entry last modified: 2011-12-15 – Entry created: 2011-12-07
How-To: Getting the best out of your MP3s
While music playback on "grown-up" computers is likely to – and should – involve losslessly compressed material e.g. in FLAC or Apple Lossless nowadays, there still are plenty of places in which lossy data reduction formats (MP3, AAC, Vorbis) are a lot more practical. Contrary to popular belief, those actually do a pretty good job, provided your hearing is reasonably healthy. You should obey a few simple rules though:
- Never transcode from one lossy format into another if you can help it. The result is always going to be of lesser quality than it could be. That's why it's useful to have stuff around in lossless formats. In addition, it has been found that commercially sold MP3 and AAC files are not always audibly identical to a correspondingly encoded CD rip. (Which may mean that either they dropped levels a bit to keep overshoots in check – good – or that the master was stored as MUSICAM / MP2 which is known to have trouble with 0dBFS+ levels – not good.)
- Make sure there is enough headroom in the playback chain.
This is an issue for MP3 in particular, which tends to show significant peak
amplitude enlargement (overshoots) at lower bitrates. This is most pronounced
on heavily limited, "hot" material as commonly found
on modern-day CDs. With my typical "MP3-player quality" setting (LAME 3.98.4
-V 6 -q 0), I've seen peak amplitudes of more than 1.5 times full scale. If such peaks are clipped during playback, audio quality may deteriorate audibly even when the codec itself does a fine job. There are two basic ways of dealing with this problem:
  - Use ReplayGain during playback. This requires an MP3 decoder that outputs float samples or at least provides some headroom so that its output can safely be downscaled later. Foobar2000 and Rockbox (on a typical software codec target like a Sansa Clip+) do a fine job here, while the Clip+'s stock firmware fails. (See here for a test file.) Since I own a few unusually dynamic recordings, I've combined this with a pre-gain of -3.6..-4.0 dB.
  - Use MP3Gain. This application computes ReplayGain values and applies gain (in 1.5 dB granularity) to the files themselves. While effective with anything that can play MP3s, this procedure takes significantly longer than a simple ReplayGain scan.
- Don't use overly crummy sound transducers. Psychoacoustic models do not account for headphone / loudspeaker frequency response. Things like too little low-frequency content or weird treble peaks may counteract masking and make compression artifacts audible.
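The headroom figures in the list above can be checked with a bit of arithmetic. The sketch below only looks at peak levels; it is not how MP3Gain or ReplayGain actually pick their gain (they work from measured loudness), and the function names are made up.

```python
import math


def db_to_gain(db):
    """Convert a gain in dB to a linear factor."""
    return 10 ** (db / 20)


def peak_db(peak):
    """Peak level in dB relative to full scale (1.0)."""
    return 20 * math.log10(peak)


def steps_to_avoid_clipping(peak, step_db=1.5):
    """Number of 1.5 dB attenuation steps needed to bring a peak down to full scale."""
    return max(0, math.ceil(peak_db(peak) / step_db))


peak = 1.5  # decoded peak amplitude, relative to full scale
print(f"peak level: +{peak_db(peak):.2f} dB")
print(f"1.5 dB steps needed: {steps_to_avoid_clipping(peak)}")
print(f"peak after -3.6 dB pre-gain: {peak * db_to_gain(-3.6):.3f}")
```

A peak of 1.5 times full scale is about +3.5 dB, so three 1.5 dB steps (-4.5 dB) are required when only whole steps are available, while a pre-gain of -3.6 dB is just enough to bring it back below full scale.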
Entry last modified: 2011-12-01 – Entry created: 2011-11-29
Are SHM-CDs a ripoff?
The SHM-CD format (Super High Material CD) makes use of some kind of special polycarbonate plastic with better optical properties than regular polycarbonate, which (of course) is supposed to give improved audio quality. Now ordinary CDs have been perfectly fine in terms of readability in the vast majority of cases, thank you very much, so it would probably take some very special conditions for this to matter. That, however, is not the part that angers me.
No, it's the audio material they put on there (which is what matters a great deal, as you'll hopefully agree). A quick search in the Dynamic Range Database shows that it's all over the place as far as dynamics go, with both a few original CD masters and a few real turkeys included. Improved audio quality, yeah right.
To add another example, today I was able to compare a 1987 (1984?) CD edition of Fleetwood Mac's million-seller "Rumours" to the SHM-CD edition. A look with Audacity quickly revealed that both must have come from the same 16-bit master, but the latter had apparently seen remastering and was approximately 3..4 dB louder, with peaks limited accordingly. No dithering. While this remaster is by no means a botch-job and would seem to be less heavy-handed than the 2004 one, I would have expected more considering the audio quality claims for SHM-CD. Now of course it would sound "better" to the uninitiated when compared to the CD (since it's simply louder), and it would probably be objectively better than the 2004 remaster, but that's some weird logic if you ask me.
Interestingly, after getting ahold of the 1990 German CD reissue of said record, it would seem this was mastered from an analog tape copy (of a digital master, judging from steep lowpass filter action at 20 kHz). Complete with DC offset and inverted absolute phase. This is lower in level and thus gives about 1 or 2 dB higher peak amplitudes than even the 1987 CD, but does sound rather muffled and even has a small dropout. I'll have to drag out the vinyl copy in order to determine which tonal balance is right, but it really shouldn't be as muffled as this. Funnily enough, the 1988 Greatest Hits CD is different yet again, with tonality being the same as the 1987 CD's but levels being lower, so peaks are somewhat better preserved still. What a mess.
Entry last modified: 2011-11-20 – Entry created: 2011-12-01
THD is dead, my friends – Audibility of distortion
THD (Total Harmonic Distortion) has been a standard way of condensing nonlinear distortion performance into a single number. It sums the power of all the harmonics a nonlinear device adds to a pure sine wave and relates it to the fundamental. However, even many decades ago people noticed that what they were hearing correlated badly with this measure. Weighting schemes that put more emphasis on higher-order harmonics were proposed as early as the mid-1930s, to be refined later on. Geddes and Lee's "GedLee metric" from last decade is one of the most recent ones.
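As a quick illustration of why a single THD number says so little, here is a minimal sketch that computes THD from harmonic levels given in dB relative to the fundamental. The function names and example levels are made up.

```python
import math


def db_to_ratio(db):
    """Convert a level in dB to a linear amplitude ratio."""
    return 10 ** (db / 20)


def thd(harmonic_levels_db):
    """THD as an amplitude ratio: root of the summed harmonic powers over the fundamental."""
    power = sum(db_to_ratio(level) ** 2 for level in harmonic_levels_db)
    return math.sqrt(power)


mostly_2nd = [-40.0, -60.0]                 # 2nd at -40 dB, 3rd at -60 dB
mostly_7th = [-60.0] + [-200.0] * 4 + [-40.0]  # 2nd at -60 dB, 7th at -40 dB

print(f"{100 * thd(mostly_2nd):.3f} %")
print(f"{100 * thd(mostly_7th):.3f} %")
```

Both spectra come out at about 1 % THD, since the measure does not care which harmonic carries the power.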
The root of the disparity lies in human hearing, which is a nonlinear apparatus in itself. First of all it works in the frequency domain, like a spectrum analyzer – however, it treats frequencies semi-logarithmically, in what is known as a "mel" or Bark scale (linear to about 500 Hz and logarithmic beyond). This also causes a major nonlinear effect, masking of frequencies. As sound waves travel down the basilar membrane, they pass the sensory cells for high frequencies first. To reach the cells for low frequencies they have to pass all the others, which also have some out-of-band sensitivity, the more so the closer they are to the target cells (after which the sound is quickly attenuated). Hence, a tone at one frequency will cause a fake response in the region above it. This is not directly audible as processing has adapted to the problem, but it does increase hearing threshold in the affected region. See ISO 532B and DIN 45631.
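To get a feel for the semi-logarithmic frequency mapping, here is a small sketch using one common approximation of the mel scale, m = 2595 · log10(1 + f/700). This particular formula is an assumption brought in for illustration; it is not taken from the standards cited above, and other approximations (including those for the Bark scale) place the bend at somewhat different frequencies.

```python
import math


def hz_to_mel(f_hz):
    """One common mel-scale approximation (assumed here, not from the text above)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


# The same 100 Hz step covers far more perceptual distance at low frequencies.
for low in (100.0, 1000.0, 5000.0):
    step = hz_to_mel(low + 100.0) - hz_to_mel(low)
    print(f"{low:6.0f} Hz -> {low + 100:6.0f} Hz: {step:6.1f} mel")
```

A 100 Hz step near 100 Hz spans several times as many mel as the same step near 5 kHz, which is the near-linear-then-logarithmic behavior described above.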
Now it would be highly surprising if the auditory system didn't also have its share of conventional nonlinearity (distortion). There is nothing inherently symmetric here, so the expected distortion profile would be much like a single-ended amplifier's – dominant second harmonic, then dropping off quickly at low sound levels and less quickly at higher ones.
The Bryan / Parbrook studies from 1960 allow a glimpse at masking behavior, as they determined the level of detectability for various harmonics of a 360 Hz fundamental (table reproduced from Human Hearing - Distortion Audibility Part 3):
| Fundamental (dB SPL) | 2nd (dB) | 3rd (dB) | 4th (dB) |
|---|---|---|---|
| 52.5 | -44 | -52 | -52 |
| 60 | -52 | -57 | -61 |
| 70 | -47 | -62 | -67 |
| 76 | n/a | -54 | -59 |
From these numbers it is obvious that audibility of harmonics is limited by two major factors:
- Frequency masking, as discussed, and
- Hearing threshold. For obvious reasons, if you can't hear the harmonic even by itself, you aren't going to pick it up when combined with its fundamental. (It's not as obvious as you may think though – in the funny world of hearing, a combination of two tones can be a lot more audible than either by themselves.)
It also seems like odd-order distortion is a few dB more audible.
Now what about higher-order harmonics? I conducted the simplest
experiment possible, using Audacity for tone generation and level adjustment
and my trusty Sennheiser HD580 headphones connected to a Terratec Aureon Sky
soundcard for listening. As expected, a 440 Hz tone's 2nd harmonic at
-40 dB was detectable but by no means strong, only giving the signal some
"grit". It was definitely gone 20 dB below this level. 3rd, easily
detected at -40 dB, had also gone by -60 dB. So far, it seems I'm a
slightly worse "distortion listener" than the people in those studies, which
does not surprise me.
Now the higher ones: 5th was just barely audible at -60 dB, as was
6th. 7th and 8th were still audible at -70 dB, requiring fairly high
listening volume already (fundamental amplitude 0.8, Prodigy 7.1 driver main
volume 43%, headphone channel gain -6 dB).
I guess you can see the trend. That's what makes slowly-decaying or near-flat distortion spectra (typical for output stage distortion) problematic. These tend to be dominant odd-order, too. And if the spectrum looks bad at low frequencies, high-frequency intermod isn't likely to be pretty either. Counting on sound transducer distortion to mask high-order amplifier distortion is pretty much futile, as that tends to be dominant low order (mostly 2nd/3rd), too.
As an audio equipment designer, you can only make sure that distortion stays below the worst-case envelope by at least 10 dB or so. You never know what kind of frequency response deviation the sound transducers may have.
Entry last modified: 2011-11-20 – Entry created: 2011-11-03
Engineer's Corner: Fun with Chip Amps
"Chip amps", power amplifier ICs delivering less than a watt to several dozen watts into typical speakers, are popular in DIY since they are compact and known-working circuits (which is something you'll appreciate if you don't feel like designing / debugging a discrete design). There must be dozens, if not hundreds of different types, but few of them are ever seen driving headphones. Let's examine a few that may find use as such:
LM386
The venerable LM386 is a small mono amplifier IC, usually in an 8-pin DIP package, that's been around since the mid-1970s. It offers gains in the 26 to 46 dB range, externally settable, and runs on a single supply voltage with a capacitor-coupled output. Originally from National Semiconductor, it or similar chips have been manufactured by everyone and their dog. Even NSC themselves offer multiple versions (I'd prefer the beefier N-3), and then there's the JRC/NJR NJM386 which is pin-compatible but is quite different internally (and, again, has a better-performing "big brother" called NJM386B). What a mess. Result, you're never going to know what you get if you find a random '386 chip in your junkbox (and that doesn't even include Intel's).
'386s have a reputation for being first-rate hiss and distortion generators. I guess the noise is rooted in the level shifting transistors at the inputs which run on very little current. Output noise level was experimentally found to be around 150 µVrms across the audio bandwidth at 26 dB gain, which translates to an equivalent input noise of 7.5 µVrms or input noise density of approximately 50 nV/√(Hz), definitely on the high side of things. (No wonder it's not specified.) It takes rather insensitive headphones to push that kind of a noise level into inaudible territory.
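The arithmetic behind those figures is easy to retrace. The sketch below assumes a 20 kHz measurement bandwidth and white (flat) noise, neither of which is stated explicitly above; the function names are made up for illustration.

```python
import math

def input_referred_noise(output_noise_v, gain_db):
    """Divide output noise by the voltage gain to refer it to the input."""
    return output_noise_v / 10.0 ** (gain_db / 20.0)

def noise_density(noise_v, bandwidth_hz):
    """Average noise density in V/sqrt(Hz), assuming a flat spectrum."""
    return noise_v / math.sqrt(bandwidth_hz)

output_noise = 150e-6   # Vrms, measured at the output
gain_db = 26.0          # minimum gain setting
bandwidth = 20e3        # Hz, assumed audio bandwidth

e_in = input_referred_noise(output_noise, gain_db)
e_n = noise_density(e_in, bandwidth)

print(f"input-referred noise: {e_in * 1e6:.1f} uVrms")
print(f"noise density:        {e_n * 1e9:.0f} nV/sqrt(Hz)")
```

With these assumptions the result lands at about 7.5 µVrms and a bit over 50 nV/√(Hz), in line with the numbers quoted.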
Nonetheless, LM386 equipped headphone amplifiers have long been a staple in band practising rooms and similar non-demanding applications. Noise levels become more acceptable when dropping the gain to 20 dB as the Headbanger amp circuit does (which according to simulation still is extremely stable, in agreement with the datasheet), and the bypass capacitor really helps keeping noisy power supplies in check (even though PSRR still is relatively pedestrian at lower frequencies when compared to typical opamp circuits).
As far as linearity is concerned, this chip is an interesting case. The less muscular versions like the LM386N-1 definitely struggle when trying to drive speakers, which is also reflected in the SPICE model that you'll find in the LTspice group. Linearity with 32 ohm loads (headphones) is better but still nothing much to write home about. There are ways of tweaking it though.
In its usual application, the NSC LM386 pretty much employs a single-ended input stage. While it does essentially have a differential amplifier, one half of it is merely used as a ground level shifter and for setting output stage offset.
LM386 equivalent circuit as found in datasheet from National
Semiconductor.
Have you ever wondered how the output stage "knows" that it has to operate
around half supply? In this case, it's a simple trick:
By using a current mirror, they ensured that both non-inverting and
inverting input transistors would run at the same current. Statically, both
inputs will also be at ground level, and so both ends of the 1.35k + 150R
gain-setting resistors will end up two B-E voltage drops higher, i.e. there is
essentially no DC current running through these. From this point, it's two 15k
resistors up to supply on the inverting input side (left), and one 15k resistor
to the output on the non-inverting input side (right). The non-inverting input
transistor can only draw current through that one 15k resistor. Hence the
potential there will end up being pretty much exactly the same as on "bypass"
pin 7, halfway between two B-E voltage drops up and supply! Simulation
says that it should be about half supply plus 0.5 V.
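The argument above can be put into numbers. This is a simplified sketch that assumes ideal current mirroring, negligible base currents and a B-E drop of 0.5 V (an assumed value for the small currents involved); the function name and the 9 V supply are purely illustrative.

```python
def lm386_output_bias(v_supply, v_be=0.5):
    """Estimate LM386 bypass pin and output DC levels from the resistor values alone."""
    r = 15e3                      # each of the three 15k resistors
    v_node = 2.0 * v_be           # both ends of the gain-setting resistors
    # Inverting side: two 15k in series from supply down to v_node.
    i_ref = (v_supply - v_node) / (2.0 * r)
    v_bypass = v_node + i_ref * r  # pin 7 sits between the two 15k resistors
    # Non-inverting side: the mirror forces the same current through one 15k
    # from the output, so the output must sit one 15k drop above v_node.
    v_out = v_node + i_ref * r
    return v_bypass, v_out

v_bypass, v_out = lm386_output_bias(9.0)
print(f"bypass pin: {v_bypass:.2f} V, output: {v_out:.2f} V")
```

Algebraically this is Vs/2 + Vbe, so with the assumed 0.5 V drop the output ends up at half supply plus 0.5 V.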
Back in the day, many a single-ended pnp input power amplifier used input transistor collector current to set output stage offset, so the above technique should have been a lot less exotic than it might seem today. Not being able to choose input transistor Ic and feedback resistor independently can be annoying though.
I was looking for ways of improving device performance beyond the levels displayed by the "Headbanger" circuit. So I thought, "Why not use the thing like an ordinary op-amp?", cranked up the loop gain and added an external feedback loop. (I probably wasn't the first one to come up with that idea.) So I modified the example circuit accordingly and simulated it:
LM386 used as op-amp, 20 dB gain.
Resulting simulated distortion improved by at least 10 dB over the "Headbanger" circuit, not only in quantity but also in quality (distribution of harmonics). Since there is no free lunch, I had to tame a high-frequency response peak, and getting the affair stable for capacitive loads greater than 22 nF (which already is about 10 times as much as I'd expect for any headphone cable) involved a 1 ohm series resistor. Dropping gain to less than 20 dB expectedly makes things a lot more critical still, so I wouldn't recommend that.
TBA820M
The '70s brought about quite a number of small chip amps. There were many radios of all kinds that appreciated them. One of them was ST Micro's TBA820M, a mono effort in a DIP-8 package, also sold as Samsung KA2201. In spite of having no more pins than the LM386, this part squeezed in the ability to restrict bandwidth via an external compensation capacitor spanning VAS and output stage, and would accommodate an external bootstrap capacitor for increased output voltage swing near supply – all this while keeping external bypass and gain setting and having reduced input noise (3 µV across the audio bandwidth) and more output power. With less noise, minimum recommended gain was increased to 34 dB, but given that the circuit uses external compensation, it shouldn't be too hard to adapt to lower gains.
TBA820M internals, taken from datasheet.
Now that's a bit fancier looking than ye olde LM386, isn't it?
The input stage is nothing too extraordinary by today's standards, a differential amp (Q2/5) with current source (Q6) and current mirror (Q3/4), with a level shifter transistor (Q1) on the input side so it'll accept input down to -0.3 V and change. When simulating circuits like these, I had big trouble with input stage distortion. Eventually I found out that both the level shifter and differential amp transistors had to be types with low saturation current (Is), which in hindsight makes a whole lot of sense – a low Is means maximum B-E voltage drop at a given current, hence maximum Vce for the differential amp transistors and thus maximum accepted negative input voltage swing before saturation.
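In SPICE transistor models, Is is the saturation current, and its link to the B-E drop follows from the ideal diode (Shockley) relation Vbe ≈ Vt · ln(Ic / Is). The sketch below uses made-up Is values and a made-up collector current just to show the trend; it ignores emission coefficient, Early effect and series resistances.

```python
import math

V_T = 0.02585  # thermal voltage in volts at roughly 300 K

def v_be(i_c, i_s, v_t=V_T):
    """Ideal B-E voltage for collector current i_c and saturation current i_s."""
    return v_t * math.log(i_c / i_s)

i_c = 100e-6                      # hypothetical collector current, 100 uA
for i_s in (1e-14, 1e-15, 1e-16):  # hypothetical model values
    print(f"Is = {i_s:.0e} A  ->  Vbe = {v_be(i_c, i_s) * 1e3:.0f} mV")
```

Each factor-of-ten reduction in Is buys roughly 60 mV of additional B-E drop at the same current, which is where the extra headroom comes from.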
Setting output offset is a current mirror affair not too dissimilar from what the LM386 does. A dual-transistor current source (Q8/9, R4) establishes a reference current which is then mirrored elsewhere via a diode (Q7) to be used for input stage, VAS and output stage bias. Now Q9's current, which comes from supply via 5k9 (R2) and 6k (R3), is mirrored by Q10. Q10 in turn can only draw current from the output via another 6k (R5), hence the output ends up at pin 8 potential, as in the LM386. (Yes, I neglected Q5 base current, but that would contribute no more than 1% here.)
Transistor Q11 does voltage amplification stage (VAS) duties, with Q12 being its current source. Miller compensation, requiring an external capacitor, stretches across both VAS and output stage. This bears some risks WRT stability (capacitive loading in particular), but also allows bandwidth limitation without degrading linearity a whole lot.
The output stage is a quasi-complementary affair, as in most any IC power amp to this day. Seems like npn transistors still are more efficient when it comes to die area per power handling. Let's look at both halves separately.
The lower half (current sink) essentially is a CFP arrangement (Q13, output driver Q15), with an additional level shifter thrown in (Q14), plus a two-diode level shifter (D3/4) and bias current source (Q16). The extra parts are needed for two reasons:
- Q14 essentially replaces a Baxandall diode, which reduces nonlinear distortion by ironing out an ugly kink near crossover in the quasicomp stage's transfer characteristic (CFP then more or less mimics EF).
- With the level shifting, output level can swing down all the way to Q15's saturation voltage, only about 0.3 V from ground. In amplifier ICs powered from small supply voltages, that's important.
The upper half of the output stage (current source) is a relatively standard 2-stage emitter follower / Darlington affair (Q17, Q18, R6, bias via D1/2, bias current source Q12) – with a little twist that, again, is intended to increase voltage swing. You've probably noticed that there are two distinct positive supplies – main supply is on pin 6, and connects to the secondary supply on pin 7 via an external 56 ohm resistor. Pin 7 also has an external 100 µF capacitor going to the output, a prime example of the technique called "bootstrapping". While statically the secondary supply will be a little below main supply (no real heavy loads there so the drop across 56 ohms won't be too big), dynamically the output signal will be superimposed on it. This means that secondary supply may swing up to almost 150% of main supply, always providing enough "headroom" for Q18 to be biased properly. Thus the output can come as close as Q18's saturation voltage to main supply, like we saw on the current sink side.
Bootstrapped supplies are neat, but the obvious downside is that a relatively big electrolytic capacitor is needed unless you are willing to run your speaker load between output and main supply. The technique also shows its limits once you get to very low supply voltages (less than 3 V); at that point the current source topology needs to be changed for a lower minimum voltage drop.
TDA2822M
You can, however, even squeeze a stereo power amp into a DIP-8 package, which became interesting in the '80s. NJR's NJM2073 was one example, ST Micro's TDA2822M is another. (A TDA2822 sans M is a bigger 16-pin affair that is better suited to getting rid of heat.) Just the minimum of connections here, and an internally-set gain. The NJR part's datasheet shows a way of reducing gain externally, but minimum stable gain is officially limited to about 26 dB.
Gain reduction for NJM2073 as shown in datasheet.
The same way of gain reduction applies with the TDA2822M, and it seems that this part can take significant amounts of resistance added to its input ground pin – up to 5k6, which should translate to little more than 6 dB of gain. (At this point, a compensation capacitor of maybe 33 pF from output to inverting input seems to be advisable.) Input noise is given as 3 µV with a 10k source resistance, translating to about 16 nV/√(Hz) – still not exciting, but about 10 dB better than the LM386 at least. Plus, the thing is dirt cheap nowadays, so you can afford to fry one. Sounds like the right chip for a cheap and cheerful headphone amp, huh?
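How the noise density figure comes about is not spelled out above. One plausible route, shown here purely as an assumption, is to spread the 3 µV over the measurement bandwidth and then subtract the thermal noise of the 10k source resistor. The bandwidth (20 kHz or 22 kHz), the 300 K temperature and the function names are all assumptions of this sketch.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K

def thermal_noise_density(resistance, temperature=300.0):
    """Thermal (Johnson) noise density of a resistor in V/sqrt(Hz)."""
    return math.sqrt(4.0 * K_B * temperature * resistance)

def amp_noise_density(total_noise_v, bandwidth_hz, source_resistance):
    """Amplifier-only noise density after removing the source resistor's share."""
    total = total_noise_v / math.sqrt(bandwidth_hz)
    source = thermal_noise_density(source_resistance)
    return math.sqrt(total ** 2 - source ** 2)

for bw in (20e3, 22e3):  # assumed measurement bandwidths
    e_n = amp_noise_density(3e-6, bw, 10e3)
    advantage_db = 20.0 * math.log10(50e-9 / e_n)  # vs. the LM386 estimate
    print(f"B = {bw / 1e3:.0f} kHz: {e_n * 1e9:.1f} nV/sqrt(Hz), "
          f"{advantage_db:.1f} dB below 50 nV/sqrt(Hz)")
```

Either bandwidth gives a result in the region of 16 to 17 nV/√(Hz) and an advantage of roughly 10 dB over the LM386 estimate.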
You can actually buy one as a kit with the Samsung equivalent KA2209 if you're in the UK, with the options of using either stock (40 dB) or reduced gain (20 dB). The measurements given look pretty good, and most importantly noise should be more than 6 dB lower than for an LM386-based amp. At 30 µV output noise, only reasonably sensitive headphones from about 110 dB SPL / 1 Vrms up should show any audible hiss. (At stock gain, the chip's noise can only be tamed by my least sensitive 600 ohm cans.)
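To see where a given noise voltage ends up acoustically, scale the headphone's voltage sensitivity by the noise level in dB relative to 1 Vrms. A minimal sketch follows; the helper name is made up, and the 100 dB and 120 dB entries are just extra illustrative sensitivities.

```python
import math

def noise_spl(noise_vrms, sensitivity_db_spl_per_vrms):
    """Sound pressure level produced by a noise voltage on a given headphone."""
    return sensitivity_db_spl_per_vrms + 20.0 * math.log10(noise_vrms)

for sensitivity in (100.0, 110.0, 120.0):  # dB SPL at 1 Vrms
    spl = noise_spl(30e-6, sensitivity)
    print(f"{sensitivity:.0f} dB SPL/Vrms -> noise at about {spl:.1f} dB SPL")
```

For 30 µV on a 110 dB SPL / 1 Vrms headphone this works out to roughly 19.5 dB SPL of broadband noise. Whether that is audible depends on the listening environment and the noise spectrum.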
The TDA2822M's inner workings aren't too hard to grasp once you've studied the TBA820M's above. Output offset setting still is sort of a current mirror arrangement (R3, D6, Q12/13, R1/2, R4/5), though Q/D device characteristics would now come into play. For lack of a bootstrapped supply, the output stage's upper half (current source) had to be changed; output swing extends to about 1 V from positive supply, which is not grand but saves one B-E voltage drop (0.6 V) at least. Some parts of the schematic have been boxed up to reduce visual complexity, which is a bit of a pity as I'd certainly like to know how the quiescent current control works.
LM1876
An LM1876 is a seriously "grown-up" two-channel chip amp of hi-fi quality that can deliver 20 W into 8 ohm speakers and is operated from a split power supply. Its input noise level of about 1.7 µVrms(A) (calculated from SNR) is a little lower than the TDA2822's, about 3 dB or so. Power supply rejection is on opamp level.
This is an unusual part to use in a headphone amplifier, but various Lake People amps use the LM1876 as an output buffer with 8 dB of gain. I guess they had to use an external compensation capacitor or "bleed off" some open-loop gain with a small cap across the inputs for this to work, since ICs like that usually have a minimum stable gain of 20 dB. Then, however, you end up with an output buffer that can really take a beating while giving entirely non-critical output noise levels and low output impedance. Smart move.
Interestingly enough, all the chip amps presented so far have one thing in common: An oldschool quasi-complementary output stage! Those have not been used in commercial discrete amplifiers since the mid to late 1970s, as "real" complementary topologies tended to give lower distortion. I guess going all NPN for the power transistors still is the best option when die area is in short supply. You can usually tell the two topologies apart by looking at the distortion spectrum under heavy output loading – very symmetric amplifiers (up to all-complementary push-pull concepts) will have dominating odd-order harmonics, while for very asymmetric ones the even harmonics will dominate. Our hearing seems to mask even-order distortion a bit better.
Entry last modified: 2011-11-21 – Entry created: 2011-10-29
Engineer's Corner: Pole Splitting in a Nutshell
I'm sure those dabbling in discrete amplifier circuits have seen the term "pole splitting" thrown around. You move one pole down in frequency which in turn shifts a second one up so you don't have to shift the first one quite as far down to achieve stability (Miller compensation). Sounds kinda complicated, doesn't it? So how does that work, and why?
Circuit in which one might apply pole splitting.
Here's a typical circuit configuration in which pole splitting might be employed. You may recognize it as the VAS and half the output stage in a typical audio amplifier.
First, let's look where the second pole comes from. That's easy. Your typical output stage transistor is a little bigger and thus may have input capacitance in the hundreds of pF, and the output resistance of a common emitter stage would typically be tens or hundreds of ohms.
Origin of second pole.
Now for the first pole. It originates in feedback capacitance (transistor Ccb), typically low pF range for an average small-signal transistor. Input capacitance is effectively increased via Miller effect, as Ccb is seen multiplied by voltage gain.
Origin of first pole.
Now for the "trick" employed in pole splitting:
If we artificially increase first-stage feedback capacitance, not only will
its pole be shifted down in frequency, the increase in feedback at higher
frequencies also decreases its gain and output impedance. Now the
output impedance (resistance) together with the second stage's input
capacitance was responsible for the second pole. Hence that one promptly moves
up in frequency! Neat, huh?
Besides, you typically need a pole for dominant-pole (Miller) compensation
anyway, so our first stage is an ideal place to apply it. Two birds, one
stone.
This is useful because the output stage is usually the slowest part in an amplifier, i.e. its pole is lowest in frequency. If you get that one moved up, less compensation is required overall in order to achieve closed-loop stability, i.e. you can afford a higher gain-bandwidth product. That in turn reduces nonlinear distortion.
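The effect can be watched in numbers with the textbook two-node small-signal model of a common-emitter stage: source resistance R1 and capacitance C1 at the input node, transconductance gm, load resistance R2 and capacitance C2 at the output node, and a feedback capacitor Cc from output to input. Its transfer function has the denominator 1 + a·s + b·s², with a = R1(C1 + Cc) + R2(C2 + Cc) + gm·R1·R2·Cc and b = R1·R2·(C1·C2 + Cc·(C1 + C2)). All component values below are made up for illustration and are not taken from any particular amplifier.

```python
import math

def two_pole_frequencies(r1, c1, gm, r2, c2, cc):
    """Return (lower, upper) pole frequency in Hz of the two-node Miller model."""
    a = r1 * (c1 + cc) + r2 * (c2 + cc) + gm * r1 * r2 * cc
    b = r1 * r2 * (c1 * c2 + cc * (c1 + c2))
    root = math.sqrt(a * a - 4.0 * b)
    w_low = 2.0 / (a + root)         # smaller root, written to avoid cancellation
    w_high = (a + root) / (2.0 * b)  # larger root
    return w_low / (2.0 * math.pi), w_high / (2.0 * math.pi)

# Hypothetical values: 10k source, 20 pF input capacitance, gm = 40 mS,
# 50k load resistance, 300 pF output stage input capacitance.
params = dict(r1=10e3, c1=20e-12, gm=40e-3, r2=50e3, c2=300e-12)

f1_raw, f2_raw = two_pole_frequencies(cc=3e-12, **params)      # transistor Ccb only
f1_comp, f2_comp = two_pole_frequencies(cc=103e-12, **params)  # plus 100 pF added

print(f"Ccb only:    {f1_raw:10.0f} Hz and {f2_raw / 1e6:6.2f} MHz")
print(f"with 100 pF: {f1_comp:10.0f} Hz and {f2_comp / 1e6:6.2f} MHz")
```

With the added capacitor the lower pole drops by more than a decade while the upper pole moves up several times, which is the "splitting" in question.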
Now you should be able to understand the usual formal explanations of the topic a whole lot better.
Entry last modified: 2011-10-27 – Entry created: 2011-10-27
What's so great about "Class A"?
Any audiophile is likely to have encountered the term "Class A" in the context of amplifier circuitry. Now Wikipedia is likely to tell you most anything you (n)ever wanted to know about class A, B, AB and all the rest, but today I want to look at the topic in the greater scheme of things.
Do you know these "pick two out of three" scenarios that seem to pop up in
the most impossible places? Here's another:
Low nonlinear distortion, low circuit complexity, low operating current
– pick two.
That's pretty much what it boils down to.
Class A picks the first two at the expense of the last one. Getting any noteworthy amount of power out of circuitry with a Class A output stage involves quite a bit of input power and things getting nice and toasty. In return, a handful of active components may give very decent results already, like this John Linsley Hood 5-transistor headphone amplifier. Back in the olden days, semiconductors were expensive – as were the vacuum tubes preceding them.
At the opposite end, there's the kind of circuitry that does a pretty good job driving headphones to several hundred mVrms in portable MP3 players, with a quiescent current somewhere in the single-digit mA. Low currents make semiconductors slower and reduce amplification, thus you can't apply as much feedback without things getting unstable. To make matters worse, running a typical output stage at a low quiescent current increases its distortion. There is only one way of keeping distortion down under adverse conditions like these: Use the high level of integration that's possible today and increase circuit complexity and pull all kinds of topological tricks.
Any practical solution is likely to be somewhere in between these extremes. If you wanted to build a headphone amplifier that performs well without being too complex or overly power hungry, for example, you might go with something involving an op-amp IC and a discrete AB output stage.
Entry last modified: 2011-10-22 – Entry created: 2011-10-22
X-Fi "machine gun" / "buzzing" noises: A fix?
A number of X-Fi soundcards (whether from Creative or Auzentech) seem to develop an annoying problem after a while, especially those not yet shipped with a heatsink for the EMU20Kx chip:
Instead of audio, they only output a loud, machine gun like noise. This may only occur sporadically at first, until eventually it does all the time.
So far, it appears nobody has come up with a good explanation, except that the problem must have a thermal background of some kind, as the cards with heatsinks are affected much less frequently.
Here are my two theories:
- The X-Fi chip gets too warm, deteriorates and eventually becomes flaky above a certain temperature, as half-dead semiconductors like to do. That's the boring one.
- I noticed that the EMU20K2 chip on the Auzentech X-Fi Forte comes in a BGA (ball grid array) package. Now the combination of BGA and thermal stress rang a bell, bigtime. Ever since manufacturing went lead-free, this combination has been causing a lot of trouble. Remember flaky XBox 360s? Or Power Mac G5s acting up? Or a bunch of Thinkpads with "flexing" issues? Essentially thermal cycles eventually lead to cracks in the solder, with all the problems that may bring.
If theory #2 is correct, there is a chance of fixing affected cards. This fix would involve two steps:
- Resolder card in a reflow oven. Enthusiasts have been known to construct ones at home, but keeping a well-defined temperature profile is another matter. Soundcards also have plenty of electrolytic capacitors, which shouldn't be exposed to soldering temperatures for very long at all.
- Once the card is confirmed working again, install heatsink on EMU20Kx with thermal glue.
Entry last modified: 2011-10-22 – Entry created: 2011-10-22
My dear audio equipment tweakers…
Modifying audio equipment for (supposedly) better performance has become kind of a sport in the audiophile community, at least among those able to tell the two ends of a soldering iron apart. Even the enthusiast with no soldering abilities is commonly provided with tweaking opportunities right from the factory, like socketed opamps to allow "opamp rolling" (analogous to "tube rolling" among the fans of hollow state devices). Now while the latter usually doesn't do much harm (as the manufacturer will probably have taken care of the usual pitfalls), more advanced hacks come with their share of problems.
Let's be honest, most of the time people don't really know what they're doing. They'll swap out parts for what they consider to be "better" based on hearsay, and do away with some that they consider potentially detrimental. Usually they have no idea of what kind of improvements to expect (or the wrong one), and few ever bother with verifying performance even if they'd be able to. (Obviously, some things like EMI testing are way out of reach for the average hobbyist.) They'll usually judge things by ear, usually hearing massive improvements – yeah right. Ever heard of placebo effect?
As an engineer, I cannot regard most of these "mods" as more than mindless tinkering. So you've got a soundcard that looks like a porcupine tree – great, but you probably have no idea whether it performs any better or worse than the original, in which aspects and why. You only know that the whole affair cost you money in parts plus some of your time. Not terribly satisfying if you ask me.
The usual procedure for modifying a piece of equipment would be:
- Quantify performance.
- Study (reverse-engineer if necessary) circuit. Identify potential bottlenecks and possible causes for them.
- Work out a modification that eliminates one bottleneck at a time and apply it.
- Verify performance. Identify changes, if any. Discard and undo worthless or detrimental mods.
Any mods that do not follow this scheme in some way are probably worthless and should not be published as an example for others to follow. I can think of very few mods with immediately audible effects, mostly things like changing output resistance or previously overly small coupling capacitors, so usually measurements will be necessary.
When can you expect to achieve any noteworthy improvements anyway? Well, pretty much under the following conditions only:
- The original designer was under some constraint that you are not. For example, a tight budget may have required skimping on parts quality. Your average $20 Creative budget soundcard isn't going to have any boutique opamps or oversized electrolytics, and it may not have as many PCB layers as it could have used. Still, the original parts are likely to have been chosen with care, and the same will have to apply for their replacements.
- The original designer messed up. It happens to all of us once in a while. For example, whoever decided to use an AD712 opamp in the Rotel RA-980BX integrated amplifier apparently overlooked the effect the fairly high voltage noise of this part would have on amplifier noise floor when used right after the volume control, possibly blinded by the low distortion specs.
Do not, in general, expect to be smarter than the original designer. Usually the folks who construct basically well-performing electronics aren't dumb. Too many modders seem to think that they are – and if that isn't foolish I don't know what is.
Entry last modified: 2011-10-22 – Entry created: 2011-10-20
About development of musical taste
Ed. note: This topic is scientifically covered by sociomusicology.
Teaching an old dog new tricks is, as you'll probably know, a non-trivial affair. Given how much we pick up in our youth, it shouldn't be surprising that musical taste is shaped along the way and evolves in a similar manner.
Up to the onset of puberty, kids absorb influences better than many a sponge. Thus if you (more or less) "get" a style of music by age 12 or 13, you won't have a problem with it later in life. If, however, you'd never even heard of it at this point, there is a non-zero chance that you'll never like it.
About age 12 or 13 seems to be quite typical for really getting interested in popular music of some kind. Listening habits develop and change as the kids themselves do. There still is lots of opportunity to pick up new musical influences here, but they may already be concentrated in a smaller range.
The next phase typically starts in the early 20s. By then, raging hormones have typically calmed down a bit and world view is no longer as gloomy as it once was. With a more open mind, looking beyond "angry young men's music" (or whatever else dominates adolescence) becomes more enticing again. Typically having some spare cash doesn't hurt either.
It would be very unusual to see any kind of dramatic changes beyond about the age of 30. At this point, musical tastes seem to be settled down fairly well – unsurprisingly so, since people tend to be quite busy in other areas (work and family and such). You may still encounter gradual changes in the following decades, depending on how much interest in music there was to begin with and how open-minded people remain.
Entry last modified: 2011-10-20 – Entry created: 2011-09-09
Separate songwriters and performers or all original material – what's best?
And the answer is: 42.
As you may have guessed by now, it very much depends on the circumstances and whatever the goal is. Both approaches have their merits.
Lately I was looking at German folk music and found that work was typically split about as far as possible: Lyricist, composer, performer. Back in the 18th and 19th century, it was not at all unusual to reuse older melodies (sometimes hundreds of years old), or compose a new tune for a text that could be several decades old.
Now folk music being what it is, ordinary people had to be able to sing it, which highlights an important aspect: Interoperability. The further all the people involved are apart, the more things have to stick to some sort of standards. In return, you potentially get the typical advantages associated with division of labor (a rather successful concept among humans and other lifeforms alike). You are much more likely to find a good lyricist, a good composer and a good performer compared to a single person equally talented and with equal experience in all three fields. That also is why it takes a genius of a bedroom producer to rival a big studio's output. Since nowadays most any self-respecting artist tries to come up with original material, there's a good bit of mediocre output floating around.
On the flipside, a singer-songwriter's output can be a lot more personal (and still be as good as anything). There's an arbitrary number of songs out there which are arbitrarily hard to cover for various reasons (for example, try singing something like Kate Bush's "Kite" if you don't have the vocal range it requires – good luck). With the advent of music recording, the requirement for songwriter / performer interoperability more or less evaporated. It will, however, not disappear entirely as some demand for sing-along songs does remain.
In this context, it is interesting to note the different approaches in
building up a pop star. In the US, you commonly find the classic "top-down"
approach: Record company finds person with performing talent, sticks them in
some drawer (people sure love their genres over there, and I guess they can
afford to because there's so much personnel to begin with) and has songs
written to them. Sometimes it works (don't think people like Michael Jackson
wrote all of their stuff), but frequently the result is yawntastic.
In the UK, things are more commonly handled "bottom-up" – the artist
carves out a little niche, and it's very much a self-made (wo)man affair.
Entry last modified: 2011-09-09 – Entry created: 2011-09-09
Youtube memes
Youtube commenters, widely known to be about the smartest people you'll find on the interwebs, are good for some amusing social phenomena. Those who like to watch music videos on there (it's a unique tool for exploring the world of music, isn't it?) are likely to be familiar with the following…
- "What a shame kids today don't know this any more…" – is how it started out. Some time later you started seeing comments like…
- "Hey, I'm 13/14/15 and I listen to <this and similar artists>!" – which nowadays seem to be countered by
- "Oh, just shut up, OK?"
In accordance with step number three, mention of $infamous_pop_star_of_the_day (Justin Bieber, Lady Gaga) has also been decreasing – thankfully so.
Entry last modified: 2011-09-09 – Entry created: 2011-09-09
Remasters that are safe to buy
As a general rule, I do not touch remastered CDs with the proverbial 10-foot pole if I can help it. Usually the loudness levels of those produced in the late 1990s through the 2000s were adapted to contemporary tastes, which more often than not means that dynamics have suffered. Now for any rule, there are exceptions, of course.
- Audio Fidelity and MFSL releases tend to be hard to fault, but aren't particularly cheap either.
- Anything that came out up to about 1994 seems to be OK, too.
- Johnny Hates Jazz – Turn Back The Clock (2008 extended reissue; no level increase at all in spite of being released at the height of the loudness war)
- Emerson, Lake & Palmer's 2011 remasters on Sony Music (inexpensive, with "classic" levels and reported to be very good-sounding)
It looks like the era of "pushed" remasters may slowly be coming to a close. If so, it was about time.
Entry last modified: 2011-09-04 – Entry created: 2011-09-04
Gender relations in popular music
Most populations on this planet have about a 50:50 gender ratio, give or take a few percent. Now from my experience, females are more likely to be drawn to arts of all kinds than males. Hence you would expect at least parity between genders in music, wouldn't you? Interestingly enough, this doesn't seem to be the case – for example, Kate Nash in her Rock 'n' Roll for Girls After School Music Club video, cites a number of 14% (that's about 1 in 7) of performance royalties collected by PRS going to women.
However, changes already are well underway. Look at female drummers, for example. They used to be quite exotic even 25 years ago – not any more these days. In the future, I would expect male domination in music to fade away even more than it already has. Get used to it, folks.
Now speaking of gender relations… apparently the worst thing you can do as a straight male is enjoying music that's leaning more towards the female side. (There's a limit to everything, of course – it tends to get pretty boring when approaching classic "girly" stuff. You know, relationships, relationships and, err, more relationships. If they're packaged in an interesting way, I'm fine with that though.) Nobody has a problem with girls being into punk rock, heavy metal or other traditionally "male" genres these days. The opposite tends to get you looks like you've got three heads or something, from both genders alike no less. (And you know what happened to the Hydra in Greek mythology...) Hey, I'm a science kid – I don't need testosterone dripping out of my music left, right and center, 'k?
What little music I had in my youth was mostly classical, instrumental (e.g. Mike Oldfield) and some both-genders-alike pop (like Roxette – the stuff you like as a kid…). With that kind of background, I was easy prey to Kate Bush many years later (I didn't actually become overly interested in music until my early 20s). The rest, as they say, is history…
Let's see what my music library has to say on the subject. I made a list of artists and grouped these as "female solo artists", "female-fronted bands", "male-fronted bands", "male solo artists" (including several classical composers), plus the rest without a distinct gender or with both being present about equally. Rinse and repeat for a number of artists still waiting to be added.
| Artist type | Music library | To be added | Total |
|---|---|---|---|
| female solo | 51 | 12 | 63 |
| female-fronted | 27 | 3 | 30 |
| male-fronted | 41 | 0 | 41 |
| male solo | 38 | 4 | 42 |
| indifferent | 8 | 0 | 8 |
| Total | 165 | 19 | 184 |
Interestingly enough, in spite of my music collection leaning quite heavily towards female artists, the disparity isn't all that large – yet. It would seem to be growing in the future.
These numbers do not show the respective typical target audiences, of course. That would be interesting, but not terribly easy to determine.
Entry last modified: 2011-08-20 – Entry created: 2011-08-20
Engineer's Corner: Op-Amp Gain Error at High Frequencies
Introduction
Recently I stumbled across an old edition of EDN Europe magazine that had an article on op-amp gain error. While technically correct, it still left me with a big question mark floating above my head – why was it like that? So I proceeded to turn the interwebs upside down, collect the information needed and grab a good ol' spreadsheet. Let's turn that question mark into a lightbulb, shall we?
The gist of the above article is: If we need a small-signal bandwidth of X in our op-amp based amplifier circuit, and then choose an op-amp that has a gain bandwidth product (GBW) of only a wee bit more than the rule-of-thumb minimum of X times non-inverting gain (noise gain), we need not be surprised if actual closed-loop gain is up to 6 dB less than intended when approaching frequency X. Let's examine why.
Part uno: A look at closed-loop gain
Here's an expression for closed-loop
gain of a circuit using a differential amplifier of finite and
frequency-dependent open-loop gain A(f) in a non-inverting
configuration:
$$G = \frac{A(f)}{1 + A(f)\,\beta}$$
…where β, our feedback factor, is simply the inverse of nominal non-inverting gain or noise gain:
$$\beta = \frac{1}{G_N} = \frac{R_g}{R_f + R_g}$$
…with Rf and Rg being
the feedback and ground resistors you'll see in the non-inverting op-amp
circuit. In other words,
$$G = \frac{A(f)}{1 + \dfrac{A(f)}{G_N}} = G_N \times \frac{A(f)}{G_N + A(f)}$$
In terms of asymptotic behavior, we find
$$G \to G_N\,; \quad A(f) \gg G_N$$
(phew!), and
$$G \to A(f)\,; \quad A(f) \ll G_N\,.$$
Finally, let's examine two specific values of A(f):
Firstly, A(f) = 1, which happens to be our definition of
unity-gain bandwidth:
$$G = \frac{G_N}{1 + G_N}\,; \quad A(f) = 1$$
Next, A(f) = GN, i.e. the point where open-loop gain
is down to desired closed-loop gain:
$$G = \frac{G_N}{2}\,; \quad A(f) = G_N$$
Here closed-loop gain is already down by not 3, but a whopping 6 dB!
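If you'd rather see numbers than asymptotes, here is a minimal Python sketch of the closed-loop expression from this part. It treats A(f) as a plain real magnitude, exactly as the formulas here do (open-loop phase is not modelled), and the function names and the example gain of 10 are my own picks for illustration.

```python
import math


def closed_loop_gain(a_open: float, g_n: float) -> float:
    """G = A / (1 + A / G_N), with A taken as a plain (real) magnitude."""
    return a_open / (1.0 + a_open / g_n)


def gain_error_db(a_open: float, g_n: float) -> float:
    """Deviation of closed-loop gain from nominal gain G_N, in dB."""
    return 20.0 * math.log10(closed_loop_gain(a_open, g_n) / g_n)


if __name__ == "__main__":
    g_n = 10.0  # hypothetical nominal (noise) gain, i.e. 20 dB
    for a_open in (1e5, 1e3, 1e2, g_n, 1.0):
        print(f"A = {a_open:>8.0f}  G = {closed_loop_gain(a_open, g_n):7.4f}  "
              f"error = {gain_error_db(a_open, g_n):7.2f} dB")
```

The two special cases from the text drop out directly: A(f) = 1 gives GN / (1 + GN), and A(f) = GN gives exactly half the nominal gain.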
Part due: Examining Open-loop gain
In most ordinary voltage-feedback op-amps, open-loop gain follows a
first-order lowpass response with a high initial value AOL
(commonly around 1E5) and low corner frequency fc (in
the 10s or 100s of Hz), due to the compensation applied. The response is
determined by a constant gain-bandwidth product (GBW) over a large
frequency range, not uncommonly from 100s of Hz well into the MHz range.
A(f) × f = GBW = const
In other words, we get this handy expression for A(f):
$$A(f) = \frac{\mathrm{GBW}}{f}$$
If we go all out with the math and derive a proper first-order lowpass response, we obtain
$$A(f) = \frac{A_{OL}}{\sqrt{1 + \left(\dfrac{A_{OL}}{\mathrm{GBW}} \times f\right)^2}}$$
…which thankfully reduces to the above for f >> GBW /
AOL.
Now I don't know about you, but I'd much rather stick with the simple formula for the time being.
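To get a feel for how little we lose by sticking with the simple formula, here is a small Python sketch comparing both expressions. The op-amp values (AOL = 1E5, GBW = 10 MHz) are made up for illustration, as are the function names.

```python
import math


def open_loop_gain_simple(f_hz: float, gbw_hz: float) -> float:
    """A(f) = GBW / f, valid well above the open-loop corner frequency."""
    return gbw_hz / f_hz


def open_loop_gain_first_order(f_hz: float, a_ol: float, gbw_hz: float) -> float:
    """Magnitude of a first-order lowpass with DC gain A_OL and the given GBW."""
    return a_ol / math.sqrt(1.0 + (a_ol * f_hz / gbw_hz) ** 2)


if __name__ == "__main__":
    a_ol, gbw = 1e5, 10e6  # hypothetical op-amp: 100 dB open-loop gain, 10 MHz GBW
    corner = gbw / a_ol    # open-loop corner frequency, 100 Hz for these values
    for f in (10.0, corner, 1e3, 1e4, 1e5):
        print(f"{f:>9.0f} Hz  simple = {open_loop_gain_simple(f, gbw):>10.1f}  "
              f"first-order = {open_loop_gain_first_order(f, a_ol, gbw):>10.1f}")
```

Around the corner frequency the two differ noticeably, but a decade or two above it they are practically indistinguishable, which is the range we care about here.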
Part tre: Putting it all together
Now we have everything we need. At this point we can already whip out the
spreadsheet and have our open loop gains computed for a range of frequencies
and a given set of op-amp AOL, GBW and
desired GN, which can then be used to compute
closed-loop gain, plus fun stuff like its deviation from nominal in dB and all
that jazz.
Now where's that -6 dB point that we examined earlier? That's easy, here are the relevant formulas again:
$$G = \frac{G_N}{2}\,; \quad A(f) = G_N$$
$$A(f) = \frac{\mathrm{GBW}}{f}$$
From those it follows pretty clearly that
$$f_6 = \frac{\mathrm{GBW}}{G_N}$$
…which is exactly our gain-bandwidth rule of thumb. Thus our estimated bandwidth is at -6 dB! If you need less than 3 dB of deviation, you're well advised to stay a factor of two lower, or a factor of 4 for -1 dB.
Therefore an audio amplifier with a gain of 10 that's supposed to be flat to -1 dB to 20 kHz would have to have a rule-of-thumb GBW of
$$\mathrm{GBW} = 4 \times f_1 \times G_N = 4 \times 20\,\mathrm{kHz} \times 10 = 800\,\mathrm{kHz}$$
(In this case the dominant consideration in real life would be nonlinearity, as suppression of nonlinear distortion cannot be higher than whatever the "spare" open-loop gain is. Ideally you want at least 40 dB at 20 kHz, which would necessitate a GBW of 20 MHz here. This is also about the maximum you'll find in typical audio op-amps, in the interest of stability. More advanced designs employ additional internal feedback for reduced distortion.)
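The two GBW figures quoted in this part are plain arithmetic, so here they are as a short Python sketch. The helper names and the `margin` parameter are hypothetical, and the second function relies on the simple A(f) = GBW / f model from above.

```python
def gbw_rule_of_thumb(f_max_hz: float, g_n: float, margin: float = 1.0) -> float:
    """Rule-of-thumb GBW: bandwidth times noise gain, times an optional margin."""
    return margin * f_max_hz * g_n


def gbw_for_loop_gain(f_hz: float, g_n: float, loop_gain_db: float) -> float:
    """GBW needed so that 'spare' gain A(f) / G_N reaches loop_gain_db at f_hz."""
    return f_hz * g_n * 10.0 ** (loop_gain_db / 20.0)


if __name__ == "__main__":
    # Audio amplifier from the text: gain of 10, 20 kHz, margin factor of 4.
    print(f"rule of thumb: {gbw_rule_of_thumb(20e3, 10.0, margin=4.0) / 1e3:.0f} kHz")
    # Same amplifier with 40 dB of spare open-loop gain at 20 kHz.
    print(f"40 dB spare:   {gbw_for_loop_gain(20e3, 10.0, 40.0) / 1e6:.0f} MHz")
```

This reproduces the 800 kHz and 20 MHz values given in the text.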
Entry last modified: 2011-08-03 – Entry created: 2011-07-29
Stroboscopic speed indicators
Intro
You'll see them on most any halfway "serious" turntable / record player: Stroboscopic speed indicators with a little neon or glow lamp and a pattern on the side of the disc platter that you have to make "stand still" for speed to be accurate. But how does that work? And how close to the real deal can you get?
Strobe pattern on turntable platter.
(Audio Technica AT-LP120-USB image reproduced courtesy of CNET.)
Basic operation
The basic principle is simple:
The lamp (strobe light) flashes periodically, usually once per mains
frequency cycle. It then briefly lights up the periodic pattern. (Human vision
has the peculiar property of retaining images that flash up shortly but
brightly for some time, which helps here.) If this is supposed to appear
standing still, it has to move by exactly one division per cycle – or
two, or any other integer number of divisions.
Does that remind you of something? Well, of course, that's
sampling! With aliasing!
Looking at it in frequency domain, a sine at sampling frequency (or its
multiples) is aliased down to one at zero, i.e. one that is standing still. And
that's exactly what we get.
Dimensioning
So how many divisions of platter circumference do we need? Apparently just as many as we have mains cycles in the time the platter takes to complete one rotation, or an integer multiple:
$$N_{div} = n \times f_{mains} \times T_r\,; \quad n \in \{1, 2, 3, \dots\}$$
With rotational speed typically given in RPM, that means
$$N_{div} = n \times \frac{f_{mains}}{\mathrm{RPM}} \times 60\,\mathrm{s}\,; \quad n \in \{1, 2, 3, \dots\}$$
With this formula, calculating the minimum number of divisions required for common RPMs and mains frequencies is hardly rocket science:
| | 33⅓ rpm | 45 rpm | 78 rpm |
|---|---|---|---|
| 50 Hz | 90 | 200 * | ≅77 ** |
| 60 Hz | 108 | 80 | ≅46 |
*) n = 3.
**) n = 2.
But hey, what's up with 78 rpm? Simple, you can't match it exactly within a
reasonable number of divisions. You have to be content with
77.922 rpm or 78.261 rpm,
respectively. (For an exact match, you'd have to go up to
n = 13, and that would result in 500 and 600 divisions, or
<=2 mm per division on the outside of the platter. Not exactly a joy to
manufacture, I imagine. Besides, the adjacent "stand-still" points would be at
72 and 84 rpm, so the speed would be very hard to ballpark.) Now given the
variations in RPM among 78s, that's probably small potatoes. Let's hope you've
got someone with perfect pitch at hand if you don't have it yourself.
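If you want to play with other speeds or mains frequencies, the division formula is easily scripted. Here is a minimal Python sketch using exact fractions, so that 33⅓ rpm does not get mangled by rounding. The function names are my own, and one flash per mains cycle is assumed, as in the text.

```python
from fractions import Fraction


def divisions(rpm, f_mains_hz: int, n: int = 1) -> Fraction:
    """N_div = n * f_mains / RPM * 60 s, kept exact as a fraction."""
    return Fraction(n * f_mains_hz * 60) / Fraction(rpm)


def smallest_exact_n(rpm, f_mains_hz: int, n_max: int = 20):
    """Smallest n giving a whole number of divisions, or None if none up to n_max."""
    for n in range(1, n_max + 1):
        if divisions(rpm, f_mains_hz, n).denominator == 1:
            return n
    return None


def standstill_rpm(n_div: int, f_mains_hz: int, n: int = 1) -> Fraction:
    """Speed at which a pattern with n_div divisions actually stands still."""
    return Fraction(n * f_mains_hz * 60, n_div)


if __name__ == "__main__":
    for f_mains in (50, 60):
        for rpm in (Fraction(100, 3), Fraction(45), Fraction(78)):
            n = smallest_exact_n(rpm, f_mains)
            print(f"{f_mains} Hz, {float(rpm):6.2f} rpm: n = {n}, "
                  f"N_div = {divisions(rpm, f_mains, n)}")
    # The practical 78 rpm compromises: 77 divisions (n = 2) and 46 divisions (n = 1).
    print(float(standstill_rpm(77, 50, 2)), float(standstill_rpm(46, 60)))
```

For 78 rpm the search only succeeds at n = 13, which is where the 500 and 600 divisions mentioned above come from.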
Now if you were looking at the image of the real-life strobo pattern
earlier, you may have noticed that it does not correspond to the numbers given
above. Instead, it appears to be n = 3 throughout, which would
give the following division numbers:
| | 33⅓ rpm | 45 rpm |
|---|---|---|
| 50 Hz | 270 | 200 |
| 60 Hz | 324 | 240 |
Interpreting pattern movement and adjustment accuracy
So what about speed deviations? As it turns out, any observed slow pattern movement can be treated as being linearly superposed upon the correct rotational speed. Or in other words, real platter edge velocity is nominal platter edge velocity plus strobo pattern movement velocity (positive if clockwise).
$$v_{edge} = v_{edge,n} + v_{drift}$$
Therefore, if you know the platter diameter D and observe a
certain amount of pattern movement (drift) vdrift, it
is easy to calculate the speed difference:
$$\Delta\mathrm{RPM} = \frac{v_{drift}}{\pi D} \times 60\,\mathrm{s}$$
For example, a 1 mm/s drift to the right (counterclockwise) on a
platter 310 mm in diameter indicates that speed is slow by
0.0616 rpm, or -0.18% at 33 rpm.
Since a drift of that order still is quite easily visible, it should be
possible to get it down by another factor of 3 at least, so we're looking at
a minimum deviation of 0.06% (600 ppm) or less. For reference, the
audibility threshold is considered to be about 0.3%, so we're comfortably below
that.
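Here is the drift calculation as a few lines of Python, in case you want to plug in your own platter diameter. The function names are hypothetical, and the sign convention follows the text (clockwise drift positive, so a counterclockwise drift comes out negative, meaning "slow").

```python
import math


def delta_rpm(v_drift_mm_s: float, platter_diameter_mm: float) -> float:
    """Speed error in rpm from pattern drift velocity at the platter edge."""
    return v_drift_mm_s / (math.pi * platter_diameter_mm) * 60.0


def deviation_percent(v_drift_mm_s: float, platter_diameter_mm: float,
                      rpm_nominal: float) -> float:
    """Speed error relative to nominal speed, in percent."""
    return 100.0 * delta_rpm(v_drift_mm_s, platter_diameter_mm) / rpm_nominal


if __name__ == "__main__":
    # Example from the text: 1 mm/s counterclockwise on a 310 mm platter at 33 1/3 rpm.
    print(f"{delta_rpm(-1.0, 310.0):+.4f} rpm")
    print(f"{deviation_percent(-1.0, 310.0, 100.0 / 3.0):+.3f} %")
```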
Of course there still is one variable that we have no control over: Mains frequency accuracy. This may vary depending on where you are. I have measured 50.00±0.02 Hz here (within ±400 ppm), but you cannot take that kind of accuracy for granted. Usually only the total number of cycles over a 24-hour period will be very tightly controlled (i.e. ∫f rather than f), for mains-sync'd clocks to be accurate. In extreme cases, deviations of up to 0.8% may occur. So if you want to be absolutely sure, you'll need a crystal-controlled strobe.
Valid speed range
Now for any formula, you typically want to know where it holds. Earlier I mentioned that there would be other stand-still speeds which are not the same as target speed. So where do those come from? Let's recap how we picked the number of divisions:
$$N_{div} = n \times \frac{f_{mains}}{\mathrm{RPM}} \times 60\,\mathrm{s}\,; \quad n \in \{1, 2, 3, \dots\}$$
This gives us a set of fixed values
{Ndiv, n, RPMn}.
RPMn is nominal RPM, n the value chosen
for this speed, and Ndiv is the resulting number of
divisions, of course.
With a given number of divisions, that's easily solved for RPM:
$$\mathrm{RPM} = m \times \frac{f_{mains}}{N_{div}} \times 60\,\mathrm{s}\,; \quad m \in \{1, 2, 3, \dots\}$$
As n is now considered fixed, we have substituted it by the
variable m.
We will now
express Ndiv as a function of nominal RPM
RPMn and the value of n that belongs to
the selected value of Ndiv:
$$N_{div} = n \times \frac{f_{mains}}{\mathrm{RPM}_n} \times 60\,\mathrm{s}\,; \quad n \in \{1, 2, 3, \dots\}\ \text{fixed}$$
Now let's put that into the previous equation:
$$\mathrm{RPM} = \frac{m}{n} \times \mathrm{RPM}_n\,; \quad m \in \{1, 2, 3, \dots\},\ n\ \text{fixed}$$
Or to make things even more clear:
$$\mathrm{RPM} = m \times \frac{\mathrm{RPM}_n}{n}\,; \quad m \in \{1, 2, 3, \dots\},\ n\ \text{fixed}$$
When we picked n > 1, the lowest stand-still RPM dropped by
the same factor. Now remember that the whole affair is periodic in frequency,
and lowest stand-still RPM determines periodicity. Therefore, the pattern drift
speed and direction will only indicate the correct deviation from nominal speed
in a band of width RPMn / n around
RPMn.
$$\mathrm{RPM} \overset{!}{\in} \left( \mathrm{RPM}_n \times \left(1 - \frac{1}{2n}\right)\,;\ \mathrm{RPM}_n \times \left(1 + \frac{1}{2n}\right) \right)$$
[Figure: pattern drift velocity vdrift (vertical axis, from -vmax to +vmax) plotted against RPM (horizontal axis). The curve is a sawtooth with period RPMn / n, crossing zero at every stand-still speed, e.g. (n-1)/n × RPMn, RPMn and (n+1)/n × RPMn.]
| n | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1/2n | 50% | 25% | 16.67% | 12.5% | 10% | 8.33% | 7.14% | 6.25% | 5.56% | 5% |
Real-life record players tend to have pitch adjustment ranges of maybe
±10% to ±12%. As you can see, anything much beyond n =
4 is problematic. At n = 13 (as would be necessary for 78
rpm) we're down to a valid indication range of a mere ±3.85%.
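The stand-still speeds and the valid indication band are easy to tabulate yourself. A minimal Python sketch (function names are my own) that reproduces the 78 rpm, n = 13 case:

```python
def standstill_speeds(rpm_nominal: float, n: int, m_max: int) -> list:
    """All speeds m * RPM_n / n (m = 1..m_max) at which the pattern stands still."""
    return [m * rpm_nominal / n for m in range(1, m_max + 1)]


def valid_band(rpm_nominal: float, n: int) -> tuple:
    """Speed range in which drift correctly indicates deviation from RPM_n."""
    half_width = rpm_nominal / (2 * n)
    return rpm_nominal - half_width, rpm_nominal + half_width


if __name__ == "__main__":
    speeds = standstill_speeds(78.0, 13, 14)
    print("neighbours of 78 rpm:", speeds[11], speeds[13])
    low, high = valid_band(78.0, 13)
    print(f"valid band: {low} to {high} rpm (±{100.0 / (2 * 13):.2f} %)")
```

The neighbouring stand-still points land at 72 and 84 rpm, and the band in between that actually belongs to 78 rpm is only 75 to 81 rpm wide.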
Entry last modified: 2011-06-19 – Entry created: 2011-06-17
How audio equipment specifications can help you (2)
If the first part of this series went right over your head, no worries – here's your chance of catching up with the help of a simple real-life example.
Let's say we want to assemble a headphone listening system with a signal source S, an amplifier A and a headphone model H. Will this be sufficiently loud?
First let's look at a system-level model of the whole affair:
  +--------+    +-------+    +--------+
  | Source |    |  Amp  |    | Trans- |
  |        |V   |       |    | ducer  |
  |        | out|  |\   |    |   _/|  |
  |  /\/   |--->|  |/   |--->|  |_)|  |---> Sound
  |        |    |       | Z  |    \|  |
  |   G    |    |   A   |   T|   G    |
  |    S   |    |    V  |    |    T   |
  +--------+    +-------+    +--------+
Transducer is a fancy name for something that converts electric power into sonic power or vibration (and vice versa if need be – many of them are reciprocal, which is the fancy way of saying that it works in both directions).
So what do we need to know?
- Levels within source material. With digital material, RMS levels are usually given as dB vs. fullscale output voltage, which may refer to either maximum undistorted sine output or square wave output, with a difference of 3 dB (as the crest factor = peak-to-average ratio of a sine is 1.414 or 3 dB, vs. 1 for a square wave). The former seems to be more common, but be aware that some sound editors like Audacity give square-wave-referred RMS levels. This always keeps indications at <=0 dBFS (and allows giving peak-to-average ratio right away), whereas on a sine-referred RMS level meter, a fullscale square wave would show up as +3 dBFS.
  On classical or old pop CDs prior to Loudness War, it is not uncommon to see RMS averages between -17 and -25 dBFS (sine-referred), with loud passages between -3 and -9 dBFS. Modern-day pop material may average at around -7 dBFS already. If Replaygain or similar is being used, its effect has to be considered here.
- Source gain GS. Usually this is given as maximum RMS output level for a sine signal (sine-referred 0 dBFS level). Typical mains-operated sources output 2.0 Vrms or a little more, while owners of DAPs may have to be content with about 600 mV, or in case of volume-limited units, 100 mV or even less, down to 35 mV.
  Sources usually have an output impedance much lower than the next stage's input impedance, so we don't need to worry about that here.
- Maximum amplifier voltage gain AV. In a headphone amplifier, this could be anything from -4 dB to about +20 dB, not uncommonly adjustable in multiple steps. Typical values for speaker amplifiers range from about 30 to 46 dB; if it is not given, it has to be calculated using specified output power and nominal input sensitivity.
  Dedicated headphone and speaker amplifiers usually have an output impedance Zout (usually dominated by Rout) much lower than load impedance ZT, but when this is not the case, output impedance has to be taken into account.
- Upper bounds (1): Maximum amplifier RMS output voltage or power into desired load impedance ZT. When looking at peak levels, corresponding maximum peak voltage (which is 3 dB or √(2) higher) may also be interesting.
- Transducer gain GT. For headphones, this is usually given as sensitivity in dB SPL (dB ref. 20 µPa, SPL = sound pressure level) for a specified input level. This input level can be either 1 mW of input power at 1 kHz (efficiency spec as per DIN 45500, no longer current but still not uncommonly seen), or 1 Vrms of input voltage (as per IEC 60268-7). The latter gives higher numbers and relates better to "how loud will it go" type questions on low-output-impedance sources, so if no specifics are given, assume an IEC spec. Conversion between the two requires nominal impedance.
  For speakers, you usually find sensitivity specs of dB SPL at 1 m distance in free space, referred to either 1 W of input power or 2.83 Vrms of input voltage (which gives 1 W into 8 ohms).
- Upper bounds (2): Maximum RMS power handling of transducer, and even before that, maximum levels with acceptably low nonlinear distortion. The latter usually are frequency-dependent.
- Desired average and maximum listening volume. An average value between 60 and 70 dB SPL would be considered normal. The occasional loud passage should be allowed to reach 90 dB SPL, and if there's a few dB of power reserve, that won't hurt.
Phew. Still with me?
Now for the numeric example I promised:
- Signal source S has a maximum (sine) output level of 2.06 Vrms.
- Amplifier A has selectable voltage gains of -4 and +10 dB, a low output impedance of no more than 1 Ohm, a maximum output amplitude of 11 Vrms and maximum peak output current of 250 mA.
- Headphone H has a nominal impedance of 600 ohms and requires 0.5 Vrms for 90 dB SPL. Its maximum power rating is 100 mW. Distortion plots indicate that the drivers are breaking into sweat at 100 dB towards the lower frequencies, with harmonic distortion crossing 1% at the 100 Hz mark.
Will this be loud enough?
First let's see what we get out of the amplifier at maximum gain for a 0 dBFS signal (sine-referred). That's approximately 6.5 Vrms. This is below maximum output, and into 600 ohms only requires about 11 mA of RMS output current, so the amplifier would be expected to drive that kind of load with ease.
At this point we are getting about 70 mW of power per channel into the headphones, which is close to but still short of their maximum rating.
Computed output volume is 112 dB SPL. Even with loud passages being at -10 dBFS, we thus get over 100 dB SPL if need be, which is really loud.
Now let's try these same headphones on an MP3 player that gives us 600 mVrms maximum for a 0 dBFS sine. In this case we're hardly getting to 92 dB SPL, and chances are we won't get very much beyond 85 dB SPL on real-life signals. That's a bit tight.
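For those who want to run their own combination through the mill, here is the whole calculation as a small Python sketch. The function name and the dictionary keys are hypothetical; it assumes a low amplifier output impedance (so the full output voltage appears across the headphones) and a voltage-referred sensitivity spec.

```python
import math


def headphone_drive(v_source_rms: float, amp_gain_db: float, z_load_ohm: float,
                    sens_db_spl: float, sens_ref_v_rms: float) -> dict:
    """Voltage, current, power and SPL for a fullscale (0 dBFS) sine."""
    v_out = v_source_rms * 10.0 ** (amp_gain_db / 20.0)
    return {
        "v_out_rms": v_out,
        "i_out_rms": v_out / z_load_ohm,
        "power_w": v_out ** 2 / z_load_ohm,
        "spl_db": sens_db_spl + 20.0 * math.log10(v_out / sens_ref_v_rms),
    }


if __name__ == "__main__":
    # Source S into amplifier A at +10 dB, headphone H: 600 ohms, 0.5 Vrms for 90 dB SPL.
    desktop = headphone_drive(2.06, 10.0, 600.0, 90.0, 0.5)
    # Same headphone straight from an MP3 player with 600 mVrms maximum output.
    portable = headphone_drive(0.6, 0.0, 600.0, 90.0, 0.5)
    for name, result in (("desktop", desktop), ("portable", portable)):
        print(f"{name}: {result['v_out_rms']:.2f} Vrms, "
              f"{result['i_out_rms'] * 1e3:.1f} mA, "
              f"{result['power_w'] * 1e3:.1f} mW, {result['spl_db']:.1f} dB SPL")
```

Remember that these are levels for a fullscale sine; subtract whatever your source material's RMS level is in dBFS to get real-life volume.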
Entry last modified: 2011-05-31 – Entry created: 2011-05-31
A few places related to audio equipment measurements on the web
Here are a few new(ish) ones that I've been enjoying lately.
- NwAvGuy blog: Objective Reviews & Commentary - An Engineer's Perspective. What it says, basically. Applying a healthy dose of skepticism and electrical engineering knowledge to the subjectivist-dominated field of audio is not new per se, but this guy has some fairly nice test gear to back things up, too. This has already resulted in a number of surprises.
- The headphone geeks among you will certainly be aware of Headroom's "Build a graph" feature that allows you to look at and compare various headphone measurements. Well guess what, their now ex-CEO Tyll Hertsens decided that measuring stuff was more fun to him than selling it, and he set up InnerFidelity, a headphone measurement and review site. There also is a nice Youtube channel to go with it, which includes some videos on the measurement setup.
- You can find even more headphone measurements at goldenears.net – Korean audio geeks at work here, and every bit as thorough as you'd expect.
And here are some other resources that I've known for a while longer:
Entry last modified: 2012-01-12 – Entry created: 2011-05-31
Testing for processing headroom in digital audio players
It should be fairly well-known by now that FFT-based audio data reduction formats like to increase peak levels. The classic MP3 format is particularly notorious for this. With some loudness war victims, I have seen peak levels of up to 1.5 times fullscale (+3.6 dBFS) in LAME -V 6 quality. Needless to say, those better be reproduced correctly, or sound may suffer even more than it had already.
So how does one test for sufficient processing headroom then? That turns out to be quite easy with the help of the classic MP3Gain tool. All you need to do is generate, say, a sine test tone at a few hundred Hz or lower (Audacity will do that fine) and encode that as MP3. Then you can apply gain in 1.5 dB steps to push decoded audio peaks above fullscale.
If a 1.5 dB granularity seems too coarse to you, adjust amplitude during test tone generation.
In my experience even a fairly small amount of clipping is plainly audible on a low-frequency sine.
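Juggling peak factors, dBFS and gain steps in your head gets old quickly, so here is a minimal Python sketch for the conversions. The function names are hypothetical, and the step size is taken as the nominal 1.5 dB mentioned above.

```python
import math


def ratio_to_dbfs(peak_ratio: float) -> float:
    """Peak level in dBFS for a peak amplitude given as a multiple of fullscale."""
    return 20.0 * math.log10(peak_ratio)


def tone_amplitude_for_target(target_dbfs: float, gain_steps: int,
                              step_db: float = 1.5) -> float:
    """Amplitude (fullscale = 1.0) to generate so that the tone peaks at
    target_dbfs after applying gain_steps steps of step_db each."""
    return 10.0 ** ((target_dbfs - gain_steps * step_db) / 20.0)


if __name__ == "__main__":
    print(f"1.5 x fullscale = {ratio_to_dbfs(1.5):+.2f} dBFS")
    # Tone that should end up at +1.4 dBFS after a single gain step:
    print(f"generate at amplitude {tone_amplitude_for_target(1.4, 1):.4f}")
```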
For another twist, you can also Replaygain-scan the result (e.g. with the classic "Swiss army knife of audio players", Foobar2000) and see whether that gets the playback chain out of clipping.
Here's the little test MP3 I made. This contains a 440 Hz sine with a peak amplitude of 1.17845 times fullscale (+1.4 dBFS), Replaygain -16.23 dB. I guess I should have used some lower-frequency tone instead, where hearing is less sensitive, but anyway.
The results obtained with this one were quite interesting already. On my Sansa Clip+ with original firmware 01.02.15, there was no way I could play it without clipping. Replaygain made things quieter, but it still clipped. By contrast, the only way I could make things clip in Rockbox (r29855) on the very same player is turning the volume waaay up, even with Replaygain off. That's even better than expected.
If you want to test a DAC's or resampler's 0dBFS+ handling, here's a little WAV file with the infamous fs/4 +3dBFS sine. This is best used in conjunction with a scope or audio analyzer, since most of us won't be able to hear any harmonics of 11.025 kHz and I'm not sure how many bats are into audio stuff. ;)
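If you'd like to roll your own version of such a file, here is a rough Python sketch using only the standard library. It is not how the file offered here was made; sample rate, bit depth, duration and function names are my assumptions. The trick is a sine at fs/4 with a 45° phase offset: every sample touches fullscale, while the underlying waveform peaks √2 times (about 3 dB) higher in between.

```python
import io
import math
import struct
import wave


def fs4_samples(n_frames: int, full_scale: int = 32767) -> list:
    """16-bit samples of a sine at fs/4, phase 45 degrees, true peak sqrt(2) x fullscale."""
    amplitude = math.sqrt(2.0)
    samples = []
    for n in range(n_frames):
        x = amplitude * math.sin(2.0 * math.pi * n / 4.0 + math.pi / 4.0)
        value = round(x * full_scale)
        samples.append(max(-full_scale, min(full_scale, value)))  # guard against rounding
    return samples


def write_wav(target, samples: list, sample_rate: int = 44100) -> None:
    """Write mono 16-bit PCM to a filename or a binary file-like object."""
    with wave.open(target, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(struct.pack(f"<{len(samples)}h", *samples))


if __name__ == "__main__":
    buffer = io.BytesIO()          # pass a filename instead to get a real file
    write_wav(buffer, fs4_samples(44100))
    print(len(buffer.getvalue()), "bytes written")
```

At a 44.1 kHz sample rate this gives the 11.025 kHz tone mentioned above.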
Entry last modified: 2011-05-14 – Entry created: 2011-05-13
Apples, Oranges and Amplifier Specifications (1): Noise
Intro
To most people, the specifications of electronic devices are plenty obscure numbers. However, they can be quite useful. (That is, provided they aren't rigged, like the sensitivity specs for some loudspeakers. An overstatement of 6 dB is too much, period.) For example, if you have ever been bothered by amplifier hiss, you may want to know whether this is going to be a problem in your new amp. So let's get noisy, shall we?
A few real-life examples
Here are the output power into 8 ohms, high-level input sensitivity and noise specs of several affordable mass-market hi-fi integrated amplifiers:
- Denon PMA-710AE: 50 W; 105 mV; 107 dB (input shorted, A-weighted)
- Marantz PM5003: 36 W; 200 mV; 87 dB (500 mV input, IHF-A, 1 W / 8 ohms)
- NAD C-326BEE: 50 W; 159 mV (?); via CD-In: 94 dB at 1 W (A-weighted), or 110 dB at 50 W (A-weighted, 2 V input); power amplifier section alone: 100 dB at 1 W, 117 dB at 50 W
- Pioneer A-307R: 45 W; 200 mV; 106 dB, or per DIN, continuous / 50 mW: 91 dB / 71 dB (Direct mode, A-weighted)
- Yamaha AX-497: 85 W; 195 mV; 110 dB (CD-Direct, input shorted, 195 mV, A-weighted); Residual noise (A-weighted): 35 µV (CD-Direct), 90 µV (Pure Direct)
Now you can certainly tell me which one is the noisiest and which one is the least noisy, and whether it matters for you. You can't? Well, then let's inspect this in some more detail.
Basics on amplifiers and noise
First of all, let's have a look at the components inside a typical, rather conventional hi-fi integrated amplifier:
+-------+ +---------+ +-------+
| Input | | Volume | | Amp |
| o o | | --+ | | |
| | | | | | | |\ |
Source --->| o---o |--->| |R| |--->| |/ |---> Speaker
| | | | | |_|<-- | | |
| o v o | | | | | 45 dB |
| | | --- | | |
+-------+ +---------+ +-------+
There is an input selector – which can be a straightforward mechanical switch – followed by a potentiometer (a resistor with a variable tap) for volume control and finally the power amplifier.
You're missing the tone controls (including balance and stuff)? Well yes, we left them out here, as would be the case when using the "Source Direct" or "Pure Direct" functionality present on many amps – keep it simple. If you do want them to be included by all means, here's a typical arrangement:
+-------+ +------+ +---------+ +-------+
| Input | | Tone | | Volume | | Amp |
| o o | | _ B | | --+ | | |
| | | | _\__ | | | | | |\ |
Source --->| o---o |--->| _/ _ |--->| |R| |--->| |/ |---> Speaker
| | | | | __/_ | | |_|<-- | | |
| o v o | | \_ | | | | | 45 dB |
| | | T | | --- | | |
+-------+ +------+ +---------+ +-------+
Let's go back to our simplified amplifier though…
+-------+ +---------+ +-------+
| Input | | Volume | | Amp |
| o o | | --+ | | |
| | | | | | | |\ |
Source --->| o---o |--->| |R| |--->| |/ |---> Speaker
| | | | | |_|<-- | | |
| o v o | | | | | 45 dB |
| | | --- | | |
+-------+ +---------+ +-------+
…and inspect the noise sources. That's quite straightforward, the only active component in here is the power amplifier, which usually employs heavy feedback so that its noise characteristics are input-dominated. Any noise at the input will be amplified by the power amplifier's voltage gain and fed to the speaker, where it may become audible. We have 3 contributions to input noise here:
- Amplifier voltage noise with spectral noise density vn
- Amplifier current noise with spectral noise density in, giving a voltage noise density contribution of Rs × in
- Source impedance (thermal) noise with spectral noise density contribution vn,s = √(4 kB T Rs); there may be further, so-called "excess" noise
Once you have a given voltage noise density, obtaining the RMS noise voltage within a certain bandwidth is easy:
Vn = vn × √(BW)
For the typical 20 kHz, the factor becomes 100×√(2) √(Hz), which is about 141 √(Hz).
Determining the total amplitude of the noise contributions isn't too hard either, they are statistically independent and thus their powers add up:
$$P_{n,tot} = \sum_{i=1}^{3} P_{n,i}$$
$$V_{n,tot} = \sqrt{\sum_{i=1}^{3} V_{n,i}^2} = \sqrt{V_{n,1}^2 + V_{n,2}^2 + V_{n,3}^2}$$
Amplifier voltage noise is constant, while the other two contributions depend upon the source impedance presented by the volume pot, which is
$$R_s = R_{tap} \parallel (R_{pot} + R_{src} - R_{tap}) = \frac{R_{tap}\,(R_{pot} + R_{src} - R_{tap})}{R_{pot} + R_{src}},$$
with Rsrc being the signal source's source impedance.
This value can range between about zero and a quarter of the total potentiometer resistance (the maximum being reached with the tap at half the total resistance), but in the important lower range it is approximately
$$R_s \approx R_{tap} \approx R_{pot} \times G_{v,tap}\,; \quad R_{tap} \ll R_{pot},$$
with Gv,tap being the voltage "gain" at the tap (which, of course, actually is a loss, always being less than 1 = 0 dB).
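To see how good the approximation is at low volume settings, here is a minimal Python sketch of the source impedance formula. Function names are my own, and the 50 kOhm pot with a zero-ohm source is just an example.

```python
def source_resistance(r_tap: float, r_pot: float, r_src: float = 0.0) -> float:
    """R_s = R_tap || (R_pot + R_src - R_tap), as seen by the amplifier input."""
    upper = r_pot + r_src - r_tap
    return r_tap * upper / (r_pot + r_src)


def tap_gain(r_tap: float, r_pot: float, r_src: float = 0.0) -> float:
    """Voltage 'gain' at the tap (always below 1)."""
    return r_tap / (r_pot + r_src)


if __name__ == "__main__":
    r_pot = 50e3  # example pot value
    for r_tap in (50.0, 419.0, 5e3, 25e3):
        print(f"R_tap = {r_tap:>7.0f} ohms: R_s = {source_resistance(r_tap, r_pot):>8.1f} ohms, "
              f"G_v,tap = {tap_gain(r_tap, r_pot):.4f}")
```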
…and the consequences
Thus if you turn down the volume (so Gv,tap approaches zero), only the amplifier's voltage noise will remain at some point, with negligible contributions from the other noise sources. This is what they call residual noise.
As the volume is turned up, typically source impedance noise will appear first, rising by 3 dB for any 6 dB volume increase (note the √(Rs) and thus approximate √(Gv,tap) dependency). If the amplifier's input stage exhibits enough current noise (bipolar ones commonly do for typical 50 kOhm volume pots), this will also make an appearance eventually, rising in an approximately linear fashion with volume.
Here's a quick numeric example. Assume the amplifier outlined above is
equipped with a 50 kOhm volume pot and a perfect, noiseless power
amplifier. It is then fed with a 300 mVrms line level to give an output of
50 mW into 4 ohms. What is the resulting noise level and SNR?
The desired output level of 447 mVrms requires a modest 3.46 dB
total voltage gain, which means we need to set the volume pot for 41.5 dB
of attenuation or 419 ohms. This resistor generates a voltage noise
density of 2.6 nV/√(Hz), which amounts to 370 nVrms over a
20 kHz bandwidth. Amplified by 45 dB in the power amp, this becomes
66 µVrms at the output. Our SNR thus is 76.7 dB. This is the
theoretical maximum. Most any real-life power amplifier will have at
least equally much voltage noise (commonly 50% to 200% more), so the real-life
result is likely to be at least 3 dB worse.
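The example above can be retraced step by step in Python. This is only a sketch under stated assumptions: the function names are hypothetical, Rs is approximated as Rpot × Gv,tap, and I picked room temperature (293.15 K) for the thermal noise, which the text does not specify. The results land close to the figures given above.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def db_to_ratio(db: float) -> float:
    """Convert a voltage gain in dB to a linear ratio."""
    return 10.0 ** (db / 20.0)


def thermal_noise_density(r_ohm: float, temp_k: float = 293.15) -> float:
    """Thermal voltage noise density sqrt(4 k_B T R) in V/sqrt(Hz)."""
    return math.sqrt(4.0 * K_B * temp_k * r_ohm)


def pot_noise_example(p_out_w: float = 0.05, r_load: float = 4.0,
                      v_in_rms: float = 0.3, amp_gain_db: float = 45.0,
                      r_pot: float = 50e3, bandwidth_hz: float = 20e3,
                      temp_k: float = 293.15) -> tuple:
    """Return (R_s in ohms, output noise in Vrms, SNR in dB) for a noiseless power amp."""
    v_out = math.sqrt(p_out_w * r_load)                      # desired output voltage
    tap_gain = v_out / (v_in_rms * db_to_ratio(amp_gain_db))  # required pot "gain"
    r_s = r_pot * tap_gain                                    # R_s ~ R_pot * G_v,tap
    v_noise_in = thermal_noise_density(r_s, temp_k) * math.sqrt(bandwidth_hz)
    v_noise_out = v_noise_in * db_to_ratio(amp_gain_db)
    snr_db = 20.0 * math.log10(v_out / v_noise_out)
    return r_s, v_noise_out, snr_db


if __name__ == "__main__":
    r_s, v_noise_out, snr_db = pot_noise_example()
    print(f"R_s = {r_s:.0f} ohms, output noise = {v_noise_out * 1e6:.1f} uVrms, "
          f"SNR = {snr_db:.1f} dB")
```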
Let's put this into perspective. A 4 ohm hi-fi loudspeaker of moderate
sensitivity, 85 dB SPL / 1 W / 1 m, produces 72 dB SPL of
output at 1 m for a 50 mW input. Our real-life amplifier noise is
thus likely to end up somewhere around 0 dB SPL @ 1 m, a level that
should be on the limit of audibility and therefore uncritical.
Now let's replace the speaker with a "cute little" Klipschorn, 8 ohms and
(supposedly) 104 dB SPL / 1 W / 1 m. The same amplifier output
now produces 88 dB SPL of output at 1 m, and noise level ends up at
about 15 dB SPL @ 1 m, which is plainly audible. (And this amplifier
isn't even particularly noisy. Some known offenders don't have 100, but
500 µVrms of output noise.)
If you are not satisfied with the resulting noise level, here are the options that you have:
- If residual noise is too high, amplifier voltage noise needs to be reduced.
- If source impedance noise or current noise turns out too high, maximum volume pot impedance needs to be decreased. (Think of it this way: For a given signal amplitude, thermal noise power in the pot will remain the same, but signal power increases.) This will, however, eventually clash with input impedance requirements, as in an amplifier as discussed here, the volume pot dominates the whole amplifier's input impedance. You typically don't want less than 10 kOhm here.
- Redistribute voltage gain. If you move part of the voltage gain in front of the volume pot, noise at the power amplifier's input won't be amplified as much any more. Of course the preamplifier sees full line level, so it may run into trouble with linearity (which does happen). We'll inspect this a little more closely now.
Amplifiers with redistributed gain
We have seen that apparently the amplifier stage right behind the volume pot is critical. But why is that? Simple, here the signal is commonly attenuated to very low levels, and the more gain follows, the lower the input signal amplitude will be for a given output level. Since some voltage noise is quite unavoidable, high gains are begging for trouble.
What, then, can you do to maximize SNR? Simple, increase power levels in the volume pot. Specifically, an increase in signal amplitude delivered to it will make for better signal-to-noise ratios at the power amplifier input. Power amplifier gain can then be reduced by the same amount. Here's an example:
+-------+ +-------+ +---------+ +-------+
| Input | | Amp | | Volume | | Amp |
| o o | | | | --+ | | |
| | | | |\ | | | | | |\ |
Source --->| o---o |--->| |/ |--->| |R| |--->| |/ |---> Speaker
| | | | | | | |_|<-- | | |
| o v o | | 10 dB | | | | | 35 dB |
| | | | | --- | | |
+-------+ +-------+ +---------+ +-------+
And that already is pretty much what the various Yamaha amplifiers with CD-Direct look like:
+-------+ +-------+ +---------+ +-------+
| Amp | | Input | | Volume | | Amp |
| | | o o | | --+ | | |
| |\ | | | | | | | | |\ |
Source --->| |/ |--->| o---o |--->| |R| |--->| |/ |---> Speaker
| | | | | | | |_|<-- | | |
| 10 dB | | o v o | | | | | 33.5 |
| | | | | --- | | dB |
+-------+ +-------+ +---------+ +-------+
What are the limitations of this approach? First, the input amplifier needs to be able to handle the maximum input signal levels without running into clipping at the output. For typical +/-15 V supplies, an opamp will usually handle output amplitudes of about 9 Vrms (25.5 Vpp). (At levels like these, you will also have to consider the nonlinear distortion.) By contrast, some CD players have fullscale output amplitudes of 2.2 to 2.3 Vrms, with worst-case intersample overs up to 3 dB higher, or about 3.25 Vrms. Thus maximum input amplifier voltage gain will be limited to about 3 (approximately 10 dB). At least your input amplifier will commonly drive lower-impedance loads than the minimum required for a high-level input (e.g. down to 2 kOhm while itself presenting 47 kOhm to the signal source), so you can reduce the value of the volume pot and thus achieve some current gain as well.
Then, of course, you don't want to fry the poor volume pot. These commonly handle only about 50 mW, and you don't want to go anywhere near that to avoid longterm problems – so maybe 10 mW at nominal maximum input level, but not any more. For our 9 Vrms level above, we can use about 8 kOhms or higher, so a 10k part should be fine.
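Both limits are simple arithmetic, so here is a short Python sketch for checking your own numbers. The function names are hypothetical; the values in the example are the ones from the text.

```python
import math


def max_voltage_gain(v_out_max_rms: float, v_in_max_rms: float) -> tuple:
    """Largest preamp gain (ratio, dB) before the output clips at v_out_max_rms."""
    ratio = v_out_max_rms / v_in_max_rms
    return ratio, 20.0 * math.log10(ratio)


def min_pot_resistance(v_rms: float, p_max_w: float) -> float:
    """Smallest pot resistance that keeps dissipation at or below p_max_w."""
    return v_rms ** 2 / p_max_w


if __name__ == "__main__":
    v_in_worst = 2.3 * math.sqrt(2.0)  # 2.3 Vrms fullscale plus 3 dB of intersample overs
    ratio, gain_db = max_voltage_gain(9.0, v_in_worst)
    print(f"worst-case input {v_in_worst:.2f} Vrms, max gain {ratio:.2f} ({gain_db:.1f} dB)")
    print(f"minimum pot value for 10 mW at 9 Vrms: {min_pot_resistance(9.0, 0.010):.0f} ohms")
```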
So is there any even smarter solution? Yes there is – use a two-stage volume control, employing a 4-gang pot for a stereo amplifier. Here's a typical example:
.............................
. .
+-------+ +---------+. +-------+ +---------+. +-------+
| Input | | Volume1 | | Amp | | Volume2 | | Amp |
| o o | | --+ .| | | | --+ .| | |
| | | | | . | | |\ | | | . | | |\ |
Source --->| o---o |--->| |R| . |--->| |/ |--->| |R| . |--->| |/ |---> Speaker
| | | | | |_|<-- | | | | |_|<-- | | |
| o v o | | | | | 16 dB | | | | | 29 dB |
| | | --- | | | | --- | | |
+-------+ +---------+ +-------+ +---------+ +-------+
This amplifier will be even less noisy at low volumes, having only 29 dB after the second volume pot. It will also maintain a significant SNR advantage over its single-pot colleagues throughout much of the volume range, as in order to achieve, say, 40 dB of attenuation, you only need 20 dB at both pots now, and the minimum signal level is -24 dB (at pot 2). Nonetheless, it offers a lot of total gain if needed, and the preamplifier would never even get close to clipping (in a 100 W/8 ohm amplifier, it would have to drive about 1 Vrms at most – small potatoes).
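To see where the -24 dB and the "about 1 Vrms" come from, a small level budget helps. This is only a sketch assuming the 16 dB / 29 dB gain split from the diagram; the helper names are hypothetical.

```python
import math

def stage_levels(stages_db):
    """Running signal level in dB (relative to the input) after each stage."""
    levels, level = [], 0.0
    for gain_db in stages_db:
        level += gain_db
        levels.append(level)
    return levels

def preamp_drive_vrms(p_out_w, r_load_ohm, power_amp_gain_db):
    """Voltage the preamp must deliver for a given output power."""
    v_out = math.sqrt(p_out_w * r_load_ohm)
    return v_out / 10 ** (power_amp_gain_db / 20)

# 40 dB of total attenuation: pot 1, 16 dB preamp, pot 2
two_pot = stage_levels([-20.0, 16.0, -20.0])
# The same attenuation with a single pot
single_pot = stage_levels([-40.0])

print("two-pot levels:", two_pot, "lowest:", min(two_pot), "dB")
print("single-pot lowest:", min(single_pot), "dB")
print(f"preamp drive at 100 W / 8 ohm: {preamp_drive_vrms(100, 8, 29):.2f} Vrms")
```

The lowest level in the two-pot chain sits 16 dB above that of the single-pot version, which is where the SNR advantage comes from.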
If you want to keep the values of both volume pot sections at minimum value, an input buffer may be necessary:
.............................
. .
+-------+ +-------+ +---------+. +-------+ +---------+. +-------+
| Buf | | Input | | Volume1 | | Amp | | Volume2 | | Amp |
| | | o o | | --+ .| | | | --+ .| | |
| |\ | | | | | | . | | |\ | | | . | | |\ |
+-->| |/ |--->| o---o |--->| |R| . |--->| |/ |--->| |R| . |--->| |/ |--+
| | | | | | | | |_|<-- | | | | |_|<-- | | | |
Src | 0 dB | | o v o | | | | | 16 dB | | | | | 29 dB | v
| | | | | --- | | | | --- | | | Spk
+-------+ +-------+ +---------+ +-------+ +---------+ +-------+
The buffer will limit maximum input level, but since it has no voltage gain, the limit will usually be significantly above real-life signal levels. As a nice side-effect, it tends to reduce impedance on the potentially long lines to the input selector, reducing crosstalk of all kinds along the way.
So... any downsides? Well, first of all 4-gang volume pots don't exactly grow on trees. If you have an old but good preamp or integrated amp that uses one, treasure it. Secondly, the classic problem of channel tracking may become even more severe. (Because of this, Yamaha used a series resistor on the first section that restricted maximum attenuation to 20 dB. Clever, that.)
How could one achieve something similar without using one of these elusive 4-gang pots then? Well, maybe like this:
+-------+ +----------+ +-------+ +---------+ +-------+
| Input | | StpAtten | | Amp | | Volume | | Amp |
| o o | | --+ | | | | --+ | | |
| | | | | - | | |\ | | | | | |\ |
Source --->| o---o |--->| |R|-<-- |--->| |/ |--->| |R| |--->| |/ |---> Speaker
| | | | | |_|- | | | | |_|<-- | | |
| o v o | | | | | 16 dB | | | | | 29 dB |
| | | --- | | | | --- | | |
+-------+ +----------+ +-------+ +---------+ +-------+
Here the first section of the volume pot has been replaced by a stepped attenuator. A relatively coarse affair would be quite sufficient in practice, down to a simple 20 dB "Mute" switch. Switches and resistors with low tolerances are quite abundant.
If the stepped attenuator reduces input impedance too much, there's always the trusty old buffer:
+-------+ +-------+ +----------+ +-------+ +---------+ +-------+
| Buf | | Input | | StpAtten | | Amp | | Volume | | Amp |
| | | o o | | --+ | | | | --+ | | |
| |\ | | | | | | - | | |\ | | | | | |\ |
+-->| |/ |--->| o---o |--->| |R|-<-- |--->| |/ |--->| |R| |--->| |/ |--+
| | | | | | | | |_|- | | | | |_|<-- | | | |
Src | 0 dB | | o v o | | | | | 16 dB | | | | | 29 dB | v
| | | | | --- | | | | --- | | | Spk
+-------+ +-------+ +----------+ +-------+ +---------+ +-------+
For another idea, why would all inputs need to have the same sensitivity? Basically you're a lot more flexible with an A/D, DSP and D/A setup here, but even in the classic approach you could equip the CD input with 10 dB of attenuation while another, intended for notoriously weak digital audio players, might have as much gain.
If you intend to use PGAs (PGA2320, CS3318 etc.) instead of potentiometers, expect to get quite a different optimum gain distribution. Sometimes as little as 10 dB of effective voltage gain may be needed in the power amplifier for a typical hi-fi amp, with a headphone amplifier requiring a healthy amount of attenuation.
Noise specifications discussed
With all of this background, let's go back to our practical examples now. These showed about 4 different kinds of noise specs:
- Full power SNR
- 1 W / 8 ohm SNR
- 50 mW / 4 ohm SNR
- Integral residual noise level
Full power SNR is the signal-to-noise ratio on the minimum signal needed to achieve full output power when the volume is cranked up all the way, i.e. the same level as when specifying nominal input sensitivity. In practice, it is sufficient to measure noise levels at this setting and compute the SNR value. Noise levels are commonly A-weighted to (crudely) approximate subjective perception, which also tends to do away with mains hum components that may be present. Shorting the input has a simple reason: it means that input source impedance cannot contribute any noise.
Full power SNR gives big numbers (which people like), but only shows the whole picture if there are no other amplifying components in front of the volume control (see our most basic amplifier as discussed earlier).
Normally we don't crank up an amplifier all the way. Therefore it makes sense to specify SNR at more realistic power levels, where noise would be more likely to be actually audible. For the 50 mW at 4 ohm referred measurement according to trusty old DIN 45500, volume is turned down until a signal at nominal input sensitivity results in a 447 mVrms output (which, as you might have guessed, delivers 50 mW into a 4 ohm load). At this point, most amplifiers will be down to a constant noise level independent of volume setting, as we discussed earlier. That's the kind of constant noise level that might prove disturbing when too high. As we have seen, something around 72 dB of 50 mW SNR is desirable when using hi-fi speakers of average sensitivity. Noise tends to become rather audible as we enter the mid-60 dB range, and objectionable at about 60 dB.
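For those who prefer numbers to prose, here is a short sketch that converts a 50 mW SNR figure into an absolute noise voltage. It only uses the reference level and the SNR values mentioned above; the function names are illustrative.

```python
import math

def vrms_for_power(p_w, r_ohm):
    """RMS voltage that delivers p_w into r_ohm."""
    return math.sqrt(p_w * r_ohm)

def noise_vrms(v_ref_rms, snr_db):
    """Noise voltage that sits snr_db below the reference voltage."""
    return v_ref_rms / 10 ** (snr_db / 20)

v_ref = vrms_for_power(0.050, 4.0)  # DIN 45500 reference: 50 mW into 4 ohms
print(f"reference level: {v_ref * 1e3:.0f} mVrms")
for snr_db in (72, 65, 60):
    print(f"{snr_db} dB SNR -> {noise_vrms(v_ref, snr_db) * 1e6:.0f} uVrms of noise")
```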
A 1 W / 8 ohm (2.83 Vrms) referred measurement is basically similar. However, note a bit of "cheating" in the Marantz amplifier, where the measurement is carried out at a higher level to keep volume pot noise down.
Finally, residual noise is measured at the output with the volume control all the way down. It tends to be about constant well into normal volume control settings. The Yamaha amplifier has two different specs because the gain distribution varies.
Now how do we convert the results from one kind of measurement to another? In general this is not easy, but there are some cases which do allow it:
The noise level of the power amplifier part driven from an approximate short (as specified for the NAD model) is obviously constant regardless of input signal level. Therefore it should come as no surprise that the full power and 1 W SNR specs will differ by exactly 10*log10(Pmax/W). For the NAD unit this gives an output noise level of 28 µVrms(A), or an effective input noise level of 1.0 µVrms(A), so our input voltage noise density should be in the order of 7 nV/√(Hz).
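The two conversions used here can be sketched as follows. Note the assumption: the step from 1.0 µVrms(A) to a noise density takes an effective bandwidth of 20 kHz, which is what the 7 nV/√(Hz) figure implies; the real equivalent bandwidth of an A-weighting filter differs somewhat. Function names are hypothetical.

```python
import math

def snr_offset_db(p_max_w, p_ref_w=1.0):
    """Difference between full power SNR and SNR at p_ref_w for constant noise."""
    return 10 * math.log10(p_max_w / p_ref_w)

def noise_density_nv(v_noise_rms, bandwidth_hz=20_000.0):
    """White noise density in nV/sqrt(Hz); the bandwidth is an assumed value."""
    return v_noise_rms / math.sqrt(bandwidth_hz) * 1e9

# Example: a hypothetical 100 W amplifier
print(f"full power vs. 1 W SNR: {snr_offset_db(100.0):.1f} dB apart")
# Input-referred noise quoted for the NAD unit
print(f"1.0 uVrms -> {noise_density_nv(1.0e-6):.1f} nV/sqrt(Hz)")
```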
A basic amplifier as discussed earlier will reveal its residual noise level via the full power SNR spec. This is because the maxed-out volume pot essentially gives a direct connection to the source, so it's a low-impedance affair similar to what you get at the low end of the volume range. Among our contenders, the Denon and Pioneer units belong into this group when their respective "Direct" functions are enabled, and presumably so does the Marantz.
For the Pioneer model, the power amplifier's voltage gain is about 40.6 dB. The full power SNR of 106 dB(A) for a 35 Wpc unit hints at a residual noise level of about 84 µVrms(A), while the 50 mW spec of 71 dB(A) shows that at this position of the volume control, noise has already increased to 126 µVrms(A), about 3.5 dB. Input wise, that's some 1.2 µVrms(A), or roughly 8 nV/√(Hz). Let's check… we need a 33.6 dB attenuation here, on a 50k pot that's about 1.0 kOhm, plus 2.2 kOhm of series resistance, so we could expect about 7 nV/√(Hz) from there. Bingo. (The µPC4570 opamp in there only has about 3.5 nV/√(Hz) of voltage noise, and the other terminal sees a very low 24 ohms.) With this high-valued series resistor, it seems like full power SNR is a touch overstated, like 1 or 2 dB or so – nothing to write home about.
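The Pioneer numbers can be retraced step by step. The sketch below makes two assumptions that are not stated outright in the text: an 8 ohm load for the 35 W rating (this reproduces the quoted 84 µVrms) and a temperature of 300 K for the thermal noise formula. Variable names are for illustration only.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K
TEMP_K = 300.0              # assumed room temperature

def db_to_ratio(db):
    """Convert a level in dB to a voltage ratio."""
    return 10 ** (db / 20)

def thermal_density_nv(r_ohm):
    """Thermal noise density of a resistance in nV/sqrt(Hz)."""
    return math.sqrt(4 * K_BOLTZMANN * TEMP_K * r_ohm) * 1e9

v_full = math.sqrt(35 * 8)                        # 35 W into an assumed 8 ohm load
residual_uv = v_full / db_to_ratio(106) * 1e6     # from the full power SNR
noise_50mw_uv = math.sqrt(0.050 * 4) / db_to_ratio(71) * 1e6
increase_db = 20 * math.log10(noise_50mw_uv / residual_uv)
input_uv = noise_50mw_uv / db_to_ratio(40.6)      # referred to the power amp input

pot_r = 50e3 / db_to_ratio(33.6)                  # lower leg of the 50k pot
source_density = thermal_density_nv(pot_r + 2200) # pot plus 2.2k series resistor
total_density = math.hypot(source_density, 3.5)   # add opamp voltage noise

print(f"residual noise: {residual_uv:.0f} uVrms, at 50 mW: {noise_50mw_uv:.0f} uVrms")
print(f"increase: {increase_db:.1f} dB, input-referred: {input_uv:.2f} uVrms")
print(f"resistor noise: {source_density:.1f}, total: {total_density:.1f} nV/sqrt(Hz)")
```

Uncorrelated noise sources add as the root of the sum of squares, hence `math.hypot` for the total.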
The Marantz unit seems quite similar in concept. Translating the 87 dB(A) (1 W / 8 ohm) spec in terms of input level and load impedance, it should give about 79 dB(A) at 320 mW into 4 ohms for a 200 mVrms input, or 127 µVrms(A).
So how do we translate this into a proper 50 mW spec? Well, we have to estimate. Depending on which noise source is dominant, SNR could scale with anything between half a dB per dB of signal level (if it's volume pot thermal noise) and signal level itself (if it's constant voltage noise). In this case the difference could be anything between 4 and 8 dB, for 71 to 75 dB(A). This unit would thus be expected to exhibit noise levels similar to the Pioneer's or a bit lower. Not a big surprise there.
The Denon unit, giving about 1.5 dB more power and 5 dB more voltage gain than the previous two, is specified to give about 1 dB more than the Pioneer unit in terms of full power SNR, so the residual noise level would be expected to be about the same. We'd thus need a voltage noise density of about 3.5 to 4 nV/√(Hz), which does seem doable.
- To be continued, please check back shortly -
How (not) to build an audiophile hi-fi component
So you think an audiophile component (like an amplifier) would require state-of-the-art technology and every effort to make it as good as possible? It would be subjectively and objectively impeccable, right?
WRONG.
The decisive buzzword is "product differentiation". In a crowded market, you have to make your product stand out somehow. It may be that not everyone can build an amplifier that works well, but still there are enough of them that marketing one more isn't easy. If you want people's attention, it has to be exotic, exciting. Everyone is looking for "hidden treasures".
You have to leave trodden conventional paths. Designs with ordinary semiconductors and lots of negative feedback? Pah, everyone can do that. You need tubes (or valves for you UK folks), preferably OTL. Or at least your own very special circuit topology that is nothing less than the holy grail itself (of course). Let's do away with that pesky negative feedback, it's got a bad reputation anyway. So what if we end up with a power hog with overly high parts count – if anything, the customer will only think that we spared no effort, and marvel at the internals. Good measured performance? The subjectivists won't give a damn anyway. So what if it's noisy, has modest distortion performance and a lousy damping factor and will make any cellphone within a 5 m radius heard – it's an audiophile component, that'll excuse anything. (If in doubt, most of the self-ascribed "golden ears" won't pick up the distortion levels anyway.) Hey, it has gold-plated fuses, it must be good!
What to file this under? Simple, design by agenda – the classic sure-fire way of ending up at a local minimum rather than a global one. Normally you'd start with a performance specification (treating the device to be constructed as a black box that has to do certain things with some kind of performance and within a certain budget) and then work out what to put in so that the final device gets the job done (usually from a choice of existing solutions and in the most economic or feasible way). Starting with some kind of circuitry and then working out how the result performs is a clear violation of this workflow. This may be common in tinkering, but shouldn't be the approach of a pro.
So to sum it up – what does that mean for me as a consumer? Easy, the best product is not necessarily the most expensive or most fancy one. Usually the average performance vs. cost curve only increases at the very bottom, while beyond a level that allows flawless performance, it can get highly erratic. Better look out for indications that those who constructed the device put in some thought about how to meet the user's needs.
What can one do as an honest engineer then? Simple, think about how to best meet people's needs. Gather experience from practical use. Listen to users who know what they're doing – and even those who don't, as long as you interpret things correctly. Don't be afraid of rethinking old or "standard" design practices in light of changed usage patterns or technology, and do think outside the box if necessary. Nobody said headphone outputs in integrated amplifiers had to be dropping resistor jobs until the end of time, or that input sensitivity adjustments were of no use. At the same time, if you stumble across an interesting construction detail or feature in an old design that you think made sense at the time and still does, don't be afraid of implementing it (even if possibly it never really caught on). People had plenty of good ideas in the past, too.
Tweaking the Rotel RA-980BX Integrated Amplifier for Less Noise
The RA-980BX is an upper middle class integrated amplifier from the late '80s that was and still is rather powerful and sports a good phono section. The only major flaw is a fairly high level of noise present at all volume settings. There is something that can be done about this, however.
As we learned in the last entry, the noise (hiss) floor at low volume levels is dominated by voltage noise of the amplifier stage directly after the volume potentiometer. Inspecting the schematic reveals a circuit using an AD712JN opamp here, a FET input device cited to have high slew rate and low distortion – and a voltage noise density of a whopping 18 nV/√(Hz) (where very low-noise parts can have less than one tenth). Adding the noise from the gain setting resistors, we're at a nominal 18.7 nV/√(Hz).
Total voltage gain after the volume pot comes out as a fairly standard 45 dB (the first 16 of which are achieved by the opamp circuit), which gives a computed noise level of almost 500 µVrms at the output, about in line with the spec. This is not particularly grand. Even a hi-fi speaker of moderate sensitivity (85 dB SPL / 1 W / 1 m, 4 ohm) would give off about 13 dB SPL of hiss. From personal experience I know that even 6 dB less still is easily audible.
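Here is the same estimate as a small Python sketch. It assumes 300 K, a flat 20 kHz noise bandwidth (no weighting) and that the gain setting resistors act as their parallel combination at the opamp input; with these assumptions the results land close to the figures above. Names are made up for illustration.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K
TEMP_K = 300.0              # assumed room temperature
BANDWIDTH_HZ = 20_000.0     # assumed flat noise bandwidth

def thermal_density(r_ohm):
    """Thermal noise density of a resistance in V/sqrt(Hz)."""
    return math.sqrt(4 * K_BOLTZMANN * TEMP_K * r_ohm)

def parallel(r1, r2):
    return r1 * r2 / (r1 + r2)

opamp_density = 18e-9                                   # AD712JN voltage noise
resistor_density = thermal_density(parallel(10e3, 1.8e3))
total_density = math.hypot(opamp_density, resistor_density)

gain = 10 ** (45 / 20)                                  # total gain after the pot
output_noise = total_density * math.sqrt(BANDWIDTH_HZ) * gain

# 85 dB SPL / 1 W / 1 m into 4 ohms means 85 dB SPL at 2 Vrms
hiss_spl = 85 + 20 * math.log10(output_noise / 2.0)

print(f"input noise density: {total_density * 1e9:.1f} nV/sqrt(Hz)")
print(f"output noise: {output_noise * 1e6:.0f} uVrms")
print(f"hiss at 1 m: {hiss_spl:.1f} dB SPL")
```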
Now let's improve the noise level. This basically takes two steps:
- Select an opamp with low noise that fits the working environment.
- Reduce noise from gain setting resistors.
So let's pick an opamp:
- The new opamp would need to run on nominal +/-18 V supplies, measured to be about +17.6 and -17.3 V.
- It has to be in DIP form factor.
- Source impedance for the non-inverting circuit can be anything between about 150 ohms and 25 kOhm, but typically would not be expected to be more than a few kOhm at normal volume.
- DC offset is not an issue, plenty of bipolar coupling caps here.
- There are resistors to provide input bias current, too.
- Maximum expected output amplitude is only about 1 Vrms, with a load that's typically well in the kOhm range even with the tone controls active.
- Speaking of which, the dual opamps are used in a slightly unusual fashion here, with one half after the volume control and the other as a gain stage after the tone control. The latter sees source impedances in the two-digit kOhm range, unfortunately, thus possibly prompting use of a FET input part in the first place. However, you can always bypass the whole affair, so I'd see this as less critical.
The first guy I discussed this with bravely went with a LM4562 for IC501 and IC502; hopefully the slightly over-spec supplies won't give any trouble in the long run. The closely related LME49860 would be a better choice, as it is rated for +/-22 rather than only +/-17 V. Otherwise the fairly low voltage noise (2.7 nV/√(Hz) nominal), good output drive capability, excellent transfer linearity and reasonably low common-mode distortion make this part a good choice here.
The original gain setting resistors are 10k and 1.8k parts (R505/507, R506/508). We can cut these values by a factor of 10 to reduce noise even further, as their contribution would be dominant at low volumes otherwise. New values are 1k and 180 ohms, respectively. Expected total voltage noise would be 2.8 nV/√(Hz) with the volume turned down, for a noise reduction of up to 16.5 dB (possibly limited by the power amplifier and very likely to be less when the tone controls are active).
The proud owner of the modified amplifier stated that the noise was cut significantly. Where it had been plainly audible before, you now had to listen with your ear close to the speaker to pick up any at all. I consider that a success. Subjectively, sound after the mod was perceived to be "sharper, more accurate". (Audible noise is reputed to make sound "softer", so that would fit right in.)
Application Note: Using CD-Direct Functionality on Yamaha Hi-Fi Amplifiers
The Yamaha AX-397 and AX-497 integrated amplifiers and the related RX-797 receiver (as well as the older AX-396/496 and AX-596 models and newer A-S700) offer a feature called CD-Direct, which promises even better performance when using the CD input. We'll now look at what it does and when it may be useful.
In CD-Direct mode, the following two things happen:
- Tone control circuitry is bypassed for lower distortion and noise, as in Source-Direct mode.
- Some voltage gain (about 9.2 or 10.7 dB, respectively) is shifted from directly after the volume control to the input, which employs a preamplifier whose effect is negated immediately after in normal operation.
Now you have to know that in a regular hi-fi amplifier, the noise (hiss) floor at low volume levels is dominated by voltage noise of the amplifier stage directly after the volume potentiometer. This noise is amplified by whatever voltage gain follows and then made more or less audible by attached speakers, depending on their sensitivity and listening distance.
Under these circumstances, taking away some voltage gain after the volume pot reduces background noise, which may be quite welcome when using the amplifier to drive speakers at low listening distances. In case of the AX-497, it is about 8 dB less noise for 9 dB less gain, which is not a bad deal. According to specs, noise floor drops to 35 µVrms, A-weighted. Audible noise already is at around 0 dB SPL / 1 m with ordinary hi-fi speakers (or monitors) of average sensitivity in Pure-Direct mode, in CD-Direct mode it drops noticeably below the hearing threshold at this distance. Those with very low listening distances or very sensitive speakers (e.g. horn types) should appreciate that.
The input then makes up for the missing total gain. So far, so good.
Unfortunately, the older models (AX-396/496/596) as well as the AX-397 exhibit one potential problem here: When used with a modern-day CD player that delivers up to 2.2 or 2.3 Vrms for fullscale output and hotly mastered CD material that could contain worst-case intersample overs of +3 dB (see Overly loud CDs), we are potentially looking at 3.25 Vrms of maximum input level. The amplifier at the CD input would have to drive up to 11.5 Vrms or 32.6 Vpp, which however is outright impossible for something that only has 28.5 Vpp available in terms of supply, with a minimum distance of about 1.5 V to either side no less – so the maximum output level is 25.5 Vpp or 9 Vrms. The maximum permissible input level thus is about 2.5 Vrms, potentially giving less than a dB of headroom above fullscale – rather tight. Maybe this is part of why some people consider these models "bright". I'd preferably use the CD input for something with a less hot output, like a typical PC soundcard (where Replaygain may find use, too).
The AX-497 employs somewhat less gain in the input amplifier (2.96x rather than 3.55x), which means it can handle about 3.0 Vrms of input. Not truly worst-case, but probably adequate in practice.
If you intend to upgrade the more problematic models for bulletproof CD input handling (about 3.4 Vrms), the gain for the CD input amp needs to be lowered to about 2.65x. I'd suggest swapping the 1.2k and 470 ohm gain setting resistors for 1k and 620 ohms, respectively. The CD input will then be a touch quieter (obviously), but if anything this would only buy you a more comfortable volume control setting.
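The headroom figures for the different models follow from the usable output swing and the gain of the CD input amp. The sketch below assumes a plain non-inverting stage with gain 1 + Rf/Rg, which at least reproduces the 3.55x of the stock 1.2k / 470 ohm pair; the function names are hypothetical.

```python
import math

def noninverting_gain(r_feedback, r_ground):
    """Gain of a non-inverting opamp stage (assumed topology)."""
    return 1 + r_feedback / r_ground

def max_output_vrms(supply_vpp, margin_v=1.5):
    """Usable sine output, keeping margin_v away from either rail."""
    return (supply_vpp - 2 * margin_v) / (2 * math.sqrt(2))

def max_input_vrms(gain, supply_vpp=28.5):
    return max_output_vrms(supply_vpp) / gain

def headroom_db(v_max_in, v_fullscale=2.3):
    """Margin above the fullscale level of a hot CD player."""
    return 20 * math.log10(v_max_in / v_fullscale)

for label, gain in (("stock 1.2k / 470", noninverting_gain(1200, 470)),
                    ("AX-497", 2.96),
                    ("modded 1k / 620", noninverting_gain(1000, 620))):
    v_in = max_input_vrms(gain)
    print(f"{label}: gain {gain:.2f}x, max input {v_in:.2f} Vrms, "
          f"headroom {headroom_db(v_in):+.1f} dB")
```

A worst-case intersample over needs 3 dB of headroom, so only the modded resistor values clear that bar.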
The A-S700 contains a special goof in the CD preamp – in an attempt to reduce noise and distortion even further, gain setting resistors of 390 and 220 ohms were chosen, but this means that the OP275 used has to drive a nominal 710 ohm load. That's not too different from 600 ohms, which was shown to noticeably decrease linearity in Samuel Groner's opamp measurements, starting at levels well below 1 Vrms. (If at least the supply rails were higher, the chip would be more well-behaved, with dominant and steadily decreasing even-order harmonics.) If you feel like swapping the opamp, an SOIC version of LME49720 (LM4562) or LME49860 should be a suitable replacement.
It looks like the A-S700 circuit was taken over from the RX-797 receiver, where the standard issue NJM2068 opamp had been used instead. With this low-noise part, the low resistor values do make some kind of sense from a noise perspective (even if the practical benefit would be nonexistent), but a 710 ohm load means that the opamp has already lost almost 3 Vpp of its maximum voltage swing. It should still accept a little over 3.0 Vrms of input like this, but I doubt the opamp is a current driving king, so distortion is likely to be higher than necessary. If you don't feel like replacing the opamp, you can try resistor values of 1.5k / 820 or 1.2k / 680 ohms. The guys at Yamaha would probably have trusted it to drive a 1k / 560 ohm combo.
Remastering time!
Yours truly has been trying to avoid buying albums which have been mastered overly loud, knowing that they would eventually gather dust or at the very least not get as much play time as they'd deserve. However, a few have crept in that I considered just too good to pass up. Thus, here's a short list of albums from the last decade that I'd like to see remastered at a more reasonable level (typically between 3 and 6 dB lower, and with dithering please), along with the current editions' replaygain level as per Foobar2000.
- Wir sind Helden - Die Reklamation (2003, -10.02 dB)
- The Robocop Kraus - They Think They Are The Robocop Kraus (2005, -9.55 dB)
- Ladyhawke - Ladyhawke (2008, -10.65 dB)
Other examples of "too loud to be really listenable in the long term" include The Ting Tings - We Started Nothing (2008, -10.72 dB) and Ladytron - Velocifero (2008, -9.80 dB). I'm sure there are lots and lots more.
Isn't it ironic that the decade which saw an awful lot of remasters of older material would now need some remastering itself?
Replaygain levels demystified
Update 2011-05-31: This article needs a revision since it erroneously makes use of square-wave-referenced rather than sine-referenced dBFS (as I used Audacity to determine reference noise RMS level). Most of the levels thus are 3 dB low. Meanwhile, a major overhaul of the Replaygain documentation has brought about much-needed clarification and a section on RMS level calculation.
Have you been wondering what the connection is between Replaygain levels and the ominous 89 dB SPL, and how all this relates to levels in the digital domain? Then read on.
The basic aim of Replaygain is clear: Providing a roughly equal listening level regardless of how loud the original source material is. For this to work, you obviously need to find out how loud it would appear to us and compare with something of known volume. Here this is done by performing some frequency weighting according to a somewhat crude hearing sensitivity curve plus low-frequency roll-off, then computing the RMS values of the result and comparing with the result for a reference signal.
The reference signal, to be obtained on this page, contains one channel of pink noise at about -23 dBFS. When played back at a movie mixing desk with calibrated speaker system, this would give 80 dB SPL (C-weighted) over one speaker or 83 dB SPL when blowing it up to stereo. (Replaygain automatically assumes that mono files will be played back on two channels at once, giving twice the sound power a.k.a. 3 dB more.) The replay gain for this file should be +0.0 dB according to the proposal.
Practical implementations usually aim for a 89 dB SPL level, i.e. 6 dB louder. Therefore it's not surprising that Foobar2000 assigns the reference file a replay gain of +5.99 dB. This gives us a pink noise level of -17 dBFS, with a maximum amplitude of about 0.56. So basically we have about 17 dB of headroom to work with. That correlates pretty well with the results in the dynamic range database, where contemporary loud pop albums with replay gains of ca. -9 dB achieve about a 6 or 7 dB rating (which is not too far off from the expected 8 dB – music is only approximately pink noise after all).
In my experience, most recordings get along just fine with the aforementioned ca. 17 dB, even though there is the occasional '80s CD which needs 2, 3 or even 4 dB more (among them the infamous Dire Straits – Brothers in Arms original edition with a maximum track gain of +3.78 dB at peak amplitude 1 [another track with +6.82 dB and peak 0.974 doesn't count because the peak belongs to a nice cracking noise resulting from a digital transfer error], Laurie Anderson's Mister Heartbreak, Peter Gabriel IV or Vangelis' Blade Runner soundtrack).
A few useful spreadsheets
These are mainly geared towards the headphone user. OpenDocument format.
- Headphone / earphone sensitivity table – I took that over from Head-Fi user j-curve a long time ago. Sensitivity per mW or per Vrms are calculated from each other as needed, and you can specify an amplifier output impedance to see how loud things will be there. I have added the sensitivity per mA of current, which is a measure of the BL product. The list is pretty big but not comprehensive, in particular the last major update was in like 2008 or so, so the very newest models won't be in it.
- Output Impedance Influence Calculator – This calculates how much the frequency response will be bent by the frequency-dependent (headphone/speaker) driver impedance when operating on an output with a given output resistance. While dedicated headphone amplifiers usually have a low output resistance, not uncommonly below 10 ohms, integrated amplifiers and receivers can have hundreds of ohms, potentially leading to a considerably skewed frequency response. Headphone impedance plots can be obtained from various places.
- Headphone Level Calculator – This allows you to estimate headphone sound levels in three different scenarios, depending on what you can or cannot measure. If you do not have a suitable multimeter, a rockboxed Sansa Clip+ / ClipV2 / FuzeV2 is the next best thing, for I have determined its maximum output level. Make sure your headphone sensitivity spec is halfway accurate. The other information can be determined by examining settings and using software.
Entry last modified: 2012-01-18 – Entry created: Sometime
Y knot...? (2)
These days, portable MP3 players with associated earphones or headphones are very common. Paradoxically, there is very little source material that would specifically target headgear (not necessarily exclusively, but also). There's the odd binaural recording, either in album format, as radio play or on demo CDs (usually from headphone manufacturers), but rather little from the normally playful pop music world. So Radiohead used binaural recording, and there's Pearl Jam's album "Binaural" (which seemingly is better suited for speakers anyway), but otherwise there doesn't seem to be a whole lot in the "goodies for headphone users" department. It's not like binaural recording would be required, since you can also do cool things in a more conventional way. But very few people seem to bother at all.
To me, that very much is surprising.
Y knot...? (1)
Did you know that a regular DVD can hold a 96-kHz 24-bit stereo audio stream, in addition to (for example) another 48-kHz 16-bit stereo one? In fact, this has been possible ever since the very first DVD spec in 1996. The DVD-Audio format added true multichannel 24/96 audio (data rate limitations would allow for no more than 3 channels on a plain vanilla DVD), nonetheless the standard DVD would seem to be an attractive choice for music distribution. Production technology is mature (with demand probably shrinking due to the impact of Blu-ray discs), compatible playback devices are very common, and extracting the audio data isn't significantly more difficult than for an audio CD, allowing for easy integration into harddrive-based libraries.
Nonetheless, you very rarely encounter such discs. Maybe on the odd audiophile label, but that's about it. There are a few paid 24/96 downloads available, like Peter Gabriel's covers album "Scratch My Back", but those still are rare.
I wonder why?
(Recording quality of pop CDs may be pretty sucktastic on average these days, making the choice of format quite secondary, but still there should be the occasional candidate for a 24/96 release.)
Speaking of hi-res audio formats, did you know that DSD as used on SACD is actually a rather poor choice with lousy storage efficiency? I suspected something like that all along – it was more about having a format that's as obscure as possible in order to make copying even more unattractive. And to my knowledge, there still isn't a way of digitally copying the contents of a SACD other than tapping off the data stream inside a SACD player, while you can apparently very much rip DVD-Audios (most definitely non-encrypted ones). In other words, SACDs are just about entirely useless in the age of harddrive-based music libraries and streaming clients. Well, thank you very much for a format that's obsolete by design.
On overly loud CDs, quiet MP3 players and things related
Overly loud CDs
Buying new music can be frustrating these days – provided you've got ears and some half-decent playback equipment, that is. The reason: As a result of the loudness war that started raging in popular music from about the early/mid-1990s on, many CDs are just too darn loud and not uncommonly sound like crap as a result.
How do you mean, too loud, you might say. Let me explain:
As you may know from the digital audio basics on this page of mine, digital audio is a series of numbers in regular intervals. For CD audio, that would be 44100 times per second, with the numbers being 16-bit integers. For 16-bit integer numbers in the common 2's complement representation, the possible values range from -32768 to +32767. This range is considered full scale (sometimes also with the very smallest value left out for symmetry considerations, but that is largely academic). Most importantly, it is quite obviously limited.
Therefore, you cannot make a digital waveform arbitrarily loud without introducing clipping – the signal peaks are chopped off, and eventually this becomes audible as distortion.
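If you'd like to see this in numbers, here is a tiny illustration in Python: a sine wave that fits into 16 bits, and the same sine made 6 dB louder. The amplitude of 30000 and the helper names are arbitrary choices for this sketch.

```python
import math

INT16_MIN, INT16_MAX = -32768, 32767

def to_int16_clipped(x):
    """Round a sample to an integer and chop it off at the 16-bit limits."""
    return max(INT16_MIN, min(INT16_MAX, round(x)))

def sine(amplitude, n=64):
    """One cycle of a sine wave, n samples long."""
    return [amplitude * math.sin(2 * math.pi * k / n) for k in range(n)]

clean = [to_int16_clipped(s) for s in sine(30000)]
hot = [to_int16_clipped(s) for s in sine(30000 * 2)]  # same sine, 6 dB louder

clipped = sum(1 for s in hot if s in (INT16_MIN, INT16_MAX))
print("clean peak:", max(clean))
print("hot peak:", max(hot), "with", clipped, "of", len(hot), "samples clipped")
```

The louder version cannot exceed the 16-bit limits, so its peaks are flattened rather than reproduced.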
Now with advanced DSP-based lookahead compressor / limiter technology, it is possible to raise the average level relative to the peaks (i.e. reduce the peak-to-average ratio) quite a bit without too much impact on sound quality – or even further, right into audibly distorted territory if so desired. Too much of the current CD output is in the latter camp.
This has a number of downsides:
- Sound quality can be degraded quite significantly for the discerning listener, up to the point of perceived "flatness" and listening fatigue. Accordingly, the willingness to buy new records (yes, buy) decreases. Now one would think that losing loyal paying (!) customers is about the last thing the music industry could afford these days.
- Devaluation of music in the long run. Subpar sound quality always hurts long-term appreciation in some way. A throwaway society might not care but in general music shouldn't be some kind of throwaway goods.
- Mixing recordings of varying vintage is no fun, especially on portable players. Imagine you just got done listening to Enya's Watermark (1987, album gain +2.9 dB, still a headphone junkie's delight BTW) and then chose a modern-day pop recording (say, about -9.5 dB). If your player doesn't support ReplayGain or anything similar, you better turn the volume waaay down before your ears are blown off.
- Many integrated amplifiers (most past ones and a number of current ones) have an input sensitivity of 150 mV, forcing operation of the volume pot in an uncomfortably low range that may aggravate channel tracking issues and make precise setting of volume (especially by remote control) tricky. Levels on CD were originally set up so that they'd match other devices, which meant plenty of headroom (like 17 dB for the typical CD fullscale amplitude of 2 Vrms) with an easily sufficient amount of dynamic range left. With today's technology (most importantly 24/32 bit processing and dithering), that would definitely still be sufficient given typical amplifier noise floors. Instead, stuff is crammed into the very top of the dynamic range, not uncommonly necessitating no more than 8 (eight) bits per sample.
- Even more distortion may be added by playback devices through overflows in digital filters (intersample overs; a few examples of how consumer electronics devices behave).
- The conversion to lossy data reduction formats, most prominently MP3, opens up another can of worms. Since those typically work in the frequency domain (MP3 uses a hybrid filter bank with an MDCT) and discard or coarsely quantize part of the spectrum, peak amplitudes tend to rise due to the Gibbs phenomenon. In the LAME 3.98.2 -V 4 -q 0 files that I use for my portable player, I have found peak amplitudes as high as +3.6 dBFS after decoding (±1.5something in the floating point representation à la Foobar2000, where the usual 16-bit value range maps out to -1 to +0.999969…). If the MP3 decoder library and following processing chain don't provide some headroom, there will be even more distortion. (The players from a certain fruity brand are particularly notorious for this.) Encoded data rates also go up slightly, as the clipping already present in the input material means additional high-frequency content.
- Passing the whole shebang through the kind of fancy multiband compression commonly used in radio stations doesn't make things any better. In fact, heavy brickwalling can actually be counterproductive as excessive high-frequency content makes the compressor turn down the volume. Oops.
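A few of the numbers in the list above are easy to check. The following is a rough sketch, with helper names of my own choosing, that converts between dBFS and the floating point representation and works out the level jump between the two ReplayGain album gain values mentioned.

```python
import math


def dbfs_to_linear(level_dbfs):
    """Convert a level in dBFS to a linear amplitude, where 1.0 is full scale."""
    return 10 ** (level_dbfs / 20)


def linear_to_dbfs(amplitude):
    """Convert a linear amplitude (1.0 = full scale) to dBFS."""
    return 20 * math.log10(abs(amplitude))


# Largest positive 16-bit value in the -1..+1 floating point representation.
int16_positive_peak = 32767 / 32768

# A decoded MP3 peak of +3.6 dBFS as a floating point sample value.
mp3_peak = dbfs_to_linear(3.6)

# Going from an album with +2.9 dB album gain to one with -9.5 dB album gain
# without any level correction: the second one plays this much louder.
level_jump_db = 2.9 - (-9.5)
level_jump_ratio = dbfs_to_linear(level_jump_db)

print(f"16-bit positive peak: {int16_positive_peak:.6f}")
print(f"+3.6 dBFS peak:       {mp3_peak:.3f}")
print(f"level jump:           {level_jump_db:.1f} dB (x{level_jump_ratio:.2f} in amplitude)")
```

So a decoder would need a bit under 4 dB of headroom to pass such a peak unclipped, and the unnormalized jump from the quiet album to the loud one is more than a factor of four in amplitude.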
By contrast, I cannot see any real upsides. How dynamic a record is mostly depends on production, so even in critical environments like cars the difference between a heavily brickwalled version and another at more reasonable level (with the volume adjusted accordingly – contrary to popular belief, people do know how to operate a volume control) wouldn't be all that dramatic. Besides, environments like this would be better served by DSP-based dynamic compression and ambient-noise-level-based EQing in the playback device (car radio) anyway, both of which have been technically feasible for a while. It cannot be the job of the source material to cater to the lowest common denominator while ignoring everything else. Lowest common denominator standards tend to be pretty low these days.
So… can we have CD levels back where they were in the early '90s please? I'm sick and tired of having to scrutinize promising new releases for mastering levels and sound quality in order to make sure that I get something that I actually want to listen to after buying. (And don't even get me started about "remastered" titles – or should I say "butchered", which in most cases is equally fitting?)
This whole affair also bugs me on a more abstract level because it turns the whole idea of progress upside down. More advanced technology being used to make worse-sounding records doesn't sit right with me as an engineer. (I tend to marvel at what ingenious minds – mostly smarter than yours truly, it would seem – have pulled off with much less advanced technology in the past.)
It's interesting how musicians, being the creative bunch that they are, have found ways to make things sound pretty good in spite of the whole loudness craze. This, however, still assumes the latter is necessary. I'd rather get the basic assumptions right. Definitely helps in engineering (provided you follow a result-driven rather than agenda-driven approach), can't be that far off elsewhere.
I have yet to see any factual evidence that levels being as high as they
typically are nowadays is actually beneficial – which means it comes down
to believing rather than knowing, an approach that has
thoroughly messed up the classic Hi-Fi hobby (consider the case of the "CD
demagnetizers" that people quite obviously bought – too bad that there
aren't any ferromagnetic substances in CDs at all). The main reason they still
are, I guess, is that people are lemmings and will do whatever everyone else
does without giving it much thought, and paranoia never goes out of fashion
anyway. In addition, it has become much easier to record a CD, and in theory
one single musician can do all the work by himself in a smallish home studio
– in practice, however, it takes a genius to equal the expertise of all
the people found in a classic recording studio, so the smaller the scale of the
whole operation the more can potentially go wrong. Hard to beat plain ol'
manpower sometimes.
You know that something is wrong when you hear of the band whose CD was
found to be much louder than the Myspace samples (i.e. too loud), and who
didn't even know of it when asked – apparently the CD pressing plant had
"spiced up" the material in much the same way that some photo labs do (except
that I wouldn't call material with normal loudness levels the equivalent
of underexposed). It gets really scary when people actually complain
when they encounter a record with halfway-normal dynamic range (seen among the
first reviews of Mumford & Sons' "Sigh No More") – in a way, they are
used to overspiced "music convenience food".
Here are some initiatives that aim to bring back more dynamic range to music:
Honorable mention (pun intended) also goes to the Honor Roll of Dynamic Recordings and other articles by mastering engineer Bob Katz, as well as the mastering articles by Nielsen / Lund at TC Electronics.
Quiet MP3 players
If you are a European citizen, you may have noticed the result of misguided
well-meaning politics: Portable MP3 players with rather limited output volume
indeed (usually with the same models going noticeably louder elsewhere,
sometimes even when set up for another region). Those are usually loud enough
with the supplied earduds, err, -buds and common in-ears, but try connecting
some "grown-up" headphones, and they'll run out of steam very quickly. The same
applies when used as high-level sources for some other equipment, although it
may be possible to compensate by turning up the volume externally.
Some players have additional output series resistors to drop volume on
common low-impedance earphones, which opens up another can of worms (see
output impedance calculator spreadsheet).
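To get a feel for what such a series resistor does, here is a small sketch of the resulting voltage divider. The 100 ohm resistor and the headphone impedances are hypothetical example values, not measurements of any particular player, and the headphone is treated as a purely resistive load.

```python
import math


def series_resistor_loss_db(series_ohm, headphone_ohm):
    """Level change in dB caused by a series resistor ahead of a resistive load.

    The resistor and the headphone form a voltage divider, so the headphone
    only sees headphone_ohm / (series_ohm + headphone_ohm) of the voltage.
    """
    return 20 * math.log10(headphone_ohm / (series_ohm + headphone_ohm))


SERIES_OHM = 100.0  # hypothetical value

for impedance in (16.0, 32.0, 300.0):  # hypothetical headphone impedances
    loss = series_resistor_loss_db(SERIES_OHM, impedance)
    print(f"{impedance:5.0f} ohm headphone: {loss:6.1f} dB")
```

Low-impedance earphones lose a lot of level while high-impedance headphones are barely affected. Since real headphone impedance usually varies with frequency, the attenuation does too, which is one way the frequency response can get bent out of shape.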
I will try explaining why a fixed hard limit to the player's volume setting is rather pointless.
First of all, the intent of this legislation was keeping youths from blowing their ears out while walking around with their MP3 players. (That is, if they haven't "upgraded" to a tinny cellphone speaker that lets the environment take part in their well-developed taste in music along the way. I guess those still sound better than the early transistor radios back in the day, but it's not a long way off for sure…) Now why would they do that? Usually it would be to drown out external noise. Unfortunately, many environments are too noisy on their own – at like 75 dB SPL, there's not too much room for music on top of that before reaching dangerous levels. Therefore the only answer can be attenuating external noise by using suitable headphones or more likely earphones with good isolation properties. With like 20 or 30 dB of attenuation, the world looks quite different. In fact, levels with music at reasonable volume may now be lower than with nothing at all!
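The claim that isolation can lower the total level at the ear is easy to check with a bit of arithmetic. This sketch assumes that noise and music add up as uncorrelated sources (power summation). The 75 dB SPL ambient level is from the paragraph above, while the 25 dB of isolation and both music levels are example values I picked.

```python
import math


def combine_spl(*levels_db):
    """Combine uncorrelated sound sources given in dB SPL by summing their powers."""
    return 10 * math.log10(sum(10 ** (level / 10) for level in levels_db))


ambient_db = 75.0    # noisy environment, as in the text
isolation_db = 25.0  # example value between the 20 and 30 dB mentioned

# Without isolation: music has to be well above the noise to drown it out.
open_music_db = 85.0  # example: 10 dB above the ambient noise
open_total = combine_spl(ambient_db, open_music_db)

# With isolating earphones: the noise is attenuated, so the music can be quieter.
isolated_music_db = 70.0  # example: a moderate listening level
isolated_total = combine_spl(ambient_db - isolation_db, isolated_music_db)

print(f"no isolation:   {open_total:.1f} dB SPL at the ear")
print(f"with isolation: {isolated_total:.1f} dB SPL at the ear")
```

Under these assumptions the isolated case ends up about 5 dB below the bare ambient noise, while the non-isolated one is roughly 10 dB above it.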
With that out of the way, let's have a look at the factors influencing playback volume for a given volume setting:
- Headphone sensitivity.
There are two ways of rating this, both specify sonic output in dB SPL at 1 kHz but one per 1 mW of input power and the other per 1 Vrms of input signal amplitude (it should be noted that especially the latter may be a calculated value using a measurement at much lower volume, as 1 Vrms may be quite a bit beyond what the drivers can handle). The conversion from one to the other is possible if you know the nominal headphone impedance.
For a low-impedance source like your average MP3 player, the 1 Vrms referred spec is the most practical. According to this list that I originally took over from Head-Fi member j-curve, the sensitivity of regular headphones and IEMs ranges from about 90 to 139 dB SPL (that's almost 50 dB of difference, folks!), or most definitely from under 100 to over 130 dB.
- Recording levels.
As discussed further up, modern-day pop / rock recordings may average at around -7 dBFS rms, while a number of my '80s CDs are at around -17 dBFS or even lower. That means more than 10 dB of possible extra variation.
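The conversion between the two sensitivity ratings mentioned above can be sketched as follows. It assumes the headphone behaves like a resistor of its nominal impedance, so that 1 Vrms corresponds to 1000 / Z milliwatts. The function names and the example headphones are made up.

```python
import math


def mw_to_volt_sensitivity(spl_per_mw, impedance_ohm):
    """Convert sensitivity in dB SPL per 1 mW to dB SPL per 1 Vrms."""
    power_mw_at_1v = 1000.0 / impedance_ohm  # P = U^2 / R with U = 1 Vrms
    return spl_per_mw + 10 * math.log10(power_mw_at_1v)


def volt_to_mw_sensitivity(spl_per_volt, impedance_ohm):
    """Convert sensitivity in dB SPL per 1 Vrms to dB SPL per 1 mW."""
    power_mw_at_1v = 1000.0 / impedance_ohm
    return spl_per_volt - 10 * math.log10(power_mw_at_1v)


# Two hypothetical headphones with the same rating per milliwatt.
low_z = mw_to_volt_sensitivity(100.0, 32.0)    # 32 ohm portable headphone
high_z = mw_to_volt_sensitivity(100.0, 300.0)  # 300 ohm home headphone

print(f" 32 ohm: {low_z:.1f} dB SPL / 1 Vrms")
print(f"300 ohm: {high_z:.1f} dB SPL / 1 Vrms")
```

Same efficiency on paper, but the 32 ohm model plays almost 10 dB louder from the same output voltage, which is why the per-volt figure is the more useful one for portable players.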
So overall, depending on the type of headphone and material, there could be anything from about 40 dB to 60 dB (!) of variation. That's a lot. A signal to noise ratio of 60 dB is commonly perceived as entirely noise-free. 40 dB still makes the difference from a fairly normal home listening volume (65 dB SPL) to ear-damaging disco levels (105 dB SPL).
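Here is that sum as a quick sketch. The sensitivity and recording level figures are the ones quoted above. The player's full-scale output of 0.5 Vrms is a hypothetical value, and I am assuming that 0 dBFS rms corresponds to that full-scale output voltage.

```python
import math


def playback_spl(sens_db_per_vrms, full_scale_vrms, recording_dbfs_rms):
    """Rough average SPL at maximum volume setting.

    sens_db_per_vrms:   headphone sensitivity in dB SPL per 1 Vrms
    full_scale_vrms:    player output voltage at 0 dBFS (assumed reference)
    recording_dbfs_rms: average recording level in dBFS rms
    """
    return sens_db_per_vrms + 20 * math.log10(full_scale_vrms) + recording_dbfs_rms


PLAYER_VRMS = 0.5  # hypothetical full-scale output of a portable player

# Extreme case: least sensitive headphone with a quiet '80s CD versus
# most sensitive IEM with a modern loud recording.
quietest = playback_spl(90.0, PLAYER_VRMS, -17.0)
loudest = playback_spl(139.0, PLAYER_VRMS, -7.0)
wide_spread = loudest - quietest

# More conservative sensitivity range of 100 to 130 dB SPL per 1 Vrms.
narrow_spread = (playback_spl(130.0, PLAYER_VRMS, -7.0)
                 - playback_spl(100.0, PLAYER_VRMS, -17.0))

print(f"quietest combination: {quietest:.1f} dB SPL")
print(f"loudest combination:  {loudest:.1f} dB SPL")
print(f"spread: {narrow_spread:.0f} to {wide_spread:.0f} dB")
```

Note that the spread does not depend on the player's output voltage at all, so no single cap on that voltage can be right for all combinations.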
So, now please pick a suitable setting for a volume cap. What, you don't see how? Nor do I.
To sum it up: Education (!), isolating earphones and earplugs (for concerts and such) help. Artificial non-adjustable volume limits, by contrast, are more of an annoyance than anything else.