In yesterday’s column, I expressed my deep concerns about elements of Consumer Reports’ testing process. That column was based on an article from AppleInsider, and I eagerly awaited part two, hoping that there would be at least some commentary about the clear shortcomings in the way the magazine evaluates tech gear.
I also mentioned two apparent editorial glitches I noticed, in which product descriptions and recommendations contained incorrect information. These mistakes were obvious with just casual reading, not careful review. Clearly CR needs to beef up its editorial review process. A publication with its pretensions needs to demonstrate a higher level of accuracy.
Unfortunately, AppleInsider clearly didn’t catch the poor methodology used to evaluate speaker systems. As you recall, CR uses a small room and crowds the tested units together without consideration of placement, or the impact of vibrations and reflections. The speakers should be separated, perhaps by a few feet, and the tests should be blind, so that the listeners aren’t prejudiced by the look of, or their expectations for, a particular model.
CR’s editors claim not to be influenced by appearance, but they are not immune to the effects of human psychology, and the factors that might cause them to give one product a better review than another. Consider, for example, the second part of a blind test, which is level matching. All things being equal, a system a tiny bit louder (a fraction of a dB) might seem to sound better.
I don’t need to explain why.
Also, I was shocked that CR’s speaker test panel usually consists of just two people with some sort of unspecified training so they “know” what loudspeakers should sound like. A third person is only brought in if there’s a tie. Indeed, calling this a test panel, rather than a couple of testers or a test duo or trio, is downright misleading.
Besides, such a small sampling doesn’t consider the subjective nature of evaluating loudspeakers. People hear things differently, and they have different expectations and preferences. All things being equal, even with blind tests and level matching, a sampling of two or three is still not large enough to get a consensus. A listening panel with enough participants to reveal a trend might be. Either way, the lack of scientific controls from a magazine that touts accuracy and reliability is very troubling.
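To put a rough number on the panel-size problem, here is a minimal sketch in Python. It rests on a deliberately simplified assumption that is mine, not CR's: each listener is asked which of two speakers sounds better, and each one is effectively guessing, with a 50/50 chance of picking either. The function name chance_unanimous is hypothetical, and this is not a model of how CR actually scores speakers. It only shows how often a panel of a given size would agree unanimously by luck alone.

```python
def chance_unanimous(listeners: int) -> float:
    """Probability that every listener picks the same one of two speakers
    when each listener is guessing 50/50 (an illustrative assumption)."""
    if listeners < 1:
        raise ValueError("need at least one listener")
    # Either everyone picks speaker A or everyone picks speaker B,
    # and each of those outcomes has probability 0.5 ** listeners.
    return 2 * 0.5 ** listeners


if __name__ == "__main__":
    for n in (2, 3, 10, 20):
        print(f"{n:2d} listeners: {chance_unanimous(n):.6f}")
```

Under that assumption, two listeners agree by pure chance half the time, and three agree a quarter of the time. Only with a much larger panel does unanimous, or even lopsided, agreement become hard to explain as luck.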
I realize AppleInsider’s reporters, though clearly concerned about the notebook tests, were probably untutored about the way the loudspeakers were evaluated, and the serious flaws that make the results essentially useless.
Sure, it’s very possible that the smart speakers from Google and Sonos are, in the end, superior to the HomePod. Maybe a proper test with a large enough listener panel and proper setup would reveal such a result. So far as I’m concerned, however, CR’s test process is essentially useless on any system other than those with extreme audio defects, such as excessive bass or treble.
I also wonder just how large and well equipped the other testing departments are. Remember that magazine editorial departments are usually quite small. The consumer publications I wrote for had a handful of people on staff, and mostly relied on freelancers. Having a full-time staff is expensive. Remember, too, that CR carries no ads. Income is mostly from magazine sales, plus the sale of extra publications and services, such as a car pricing service, and reader donations. In addition, CR requires a multimillion-dollar budget to buy thousands of products at retail every year.
Sure, cars will be sold off after use, but even then there is a huge loss due to depreciation. Do they sell their used tech gear and appliances via eBay? Or donate to Goodwill?
Past the pathetic loudspeaker test process, we have their lame notebook battery tests. The excuse for why they turn off browser caching doesn’t wash. To provide an accurate picture of what sort of battery life consumers should expect under normal use, they should perform tests that don’t require activating obscure menus and/or features that only web developers might use.
After all, people who buy personal computers will very likely wonder why they aren’t getting the battery life CR achieved. They can’t! At the end of the day, Apple’s tests of MacBook and MacBook Pro battery life, as explained in the fine print at its site, are more representative of what you might achieve. No, not for everyone, but certainly if you follow the steps listed, which do represent reasonable, if not complete, use cases.
It’s unfortunate that CR has no competition. It’s the only consumer testing magazine in the U.S. that carries no ads, is run by a non-profit corporation, and buys all of the products it tests anonymously via regular retail channels. Its setup conveys the veneer of being incorruptible, and thus more accurate than the tests from other publications.
It does seem, from the AppleInsider story, that the magazine is sincere about its work, though perhaps somewhat full of itself. If it is truly honest about perfecting its testing processes, however, perhaps it should reach out to professionals in the industries that it covers and refine its methodology. How CR evaluates notebooks and speaker systems gives plenty of cause for concern.