Brilliant article by Art Dudley

I suspect the number of test subjects (listeners) and listening sessions would have to be very large for the small differences (between blind and sighted) in the Harman test to have statistical significance.
 

This was sighted-listening & there were differences between the DACs
That makes sense.


We all knew the reputation of all the DACs & expected the dCS would shine, or at least be the equal of the others, but not so. So our bias towards it did not influence the result: bottom of the class
You clearly don't understand the nature of subconscious bias. It is not something you are aware of, or would expect, etc.


Afterwards two things emerged from the guy running the audition - his Jadis amp, he learned, throws a blanket over everything, suppressing most of the differences between DACs.
Differences you know exist how? Using what method of differentiation?

cheers,

AJ
 
I suspect the number of test subjects (listeners) and listening sessions would have to be very large for the small differences (between blind and sighted) in the Harman test to have statistical significance.
Well I think 40 employees were used in the test - so do you think the graph is really showing nothing of any real statistical significance & the whole premise of the article is undermined?
 
That makes sense.
Correct, it does make sense, given the different approaches each of the DACs takes to its D-to-A conversion, the different filters used & the different analogue output stages - it would be a feat of great luck & coincidence if they all sounded the same (or some nocebo bias at play)

You clearly don't understand the nature of subconscious bias. It is not something you are aware of, or would expect, etc.
Yes, I know that the psychological environment of a group listening session (sighted or blind) is prone to all sorts of influences which fail to be recognised & addressed. There are very strict standards applied in the broadcasting industry (ITU & MUSHRA) which set out to eliminate (as much as possible) biases which may affect the results - sightedness is only one of them. One recommendation is the use of positive & negative controls in any such test - to check for false positives & false negatives. Until someone uses these as part of their blind tests, I simply treat all blind tests as equivalent to sighted tests, as we have no way of knowing just how the hidden biases you mention have affected the results & delivered false negatives.


Differences you know exist how? Using what method of differentiation?
For certain there are measured differences between the DACs. So the only question is - are there audible differences? I think the answer was already given to you in Mike's post of blind testing above, where the Meitner was consistently chosen as no. 1 (see my note about blind testing controls, however). I add this result to my many other sighted listening results & form an opinion. Only a more rigorous, statistically significant, ITU or MUSHRA test will give you more assuredness. Until then we both have to live with the insecurity of not knowing for sure - I'm happy to do that - I do it in all other areas of my life, don't see why this audio hobby should be any different.

PS: I & others have listened to the Meitner, Lampizator & Chord in other systems with other amplifiers, speakers etc. & the differences in sound between these DACs were more pronounced than in this system using a Jadis DA88S tube amplifier
 
I should have included a pic of me typing that post at the keyboard with my aluminium foil covered head but I forgot :)
It's just my analysis of the shortcomings of that piece. I believe critical evaluation should be used in reading these things & not blind :) acceptance. Just because the word blind or double-blind is used doesn't make results valid - there's too much of that going on, in my opinion.

Especially since it's put forth as scientific research. That said, the methodology is of utmost importance.
 
Correct, it does make sense, given the different approaches each of the DACs takes to its D-to-A conversion, the different filters used & the different analogue output stages - it would be a feat of great luck & coincidence if they all sounded the same (or some nocebo bias at play)
No it wouldn't, unless there is evidence to suggest otherwise. Measurable difference does not equate to automatic audible difference, nor do construction methods or parts used, etc.
Unless you have some ITU/MUSHRA evidence to support that contention?

Until someone uses these as part of their blind tests, I simply treat all blind tests as equivalent to sighted tests, as we have no way of knowing just how the hidden biases you mention have affected the results & delivered false negatives.

Great, so you do believe in blind tests to reduce/remove confounders, just rigorous ones like MUSHRA, etc. Excellent.
But that puts you in conflict with both Dudley/Stereophile and your previous statement:
In my experience sighted tests repeated over time on different days, with different music, different moods, maybe different parts of the system, allow us to get to know the characteristic sound of the device by triangulation.
How does longer term fully sighted/biased "triangulation", remove biases/confounders the way MUSHRA et al does?
Isn't that just pure sighted/biased preference? The one we all use in the end?

I add this result to my many other sighted listening results & form an opinion. Only a more rigorous, statistically significant, ITU or MUSHRA test will give you more assuredness.
Right. So why are exactly zero of the high-end manufacturers, making the claims about their products (and the subsequent high price tags as justification), performing these ITU or MUSHRA tests? Logic dictates the onus falls squarely on them, not the naysayers. Logic also dictates that one cannot prove a negative. So it's not up to anyone else to "disprove" these claims. It's up to the manufacturers of DACs (etc) to do so. Or not be taken too seriously.

Until then we both have to live with the insecurity of not knowing for sure - I'm happy to do that - I do it in all other areas of my life, don't see why this audio hobby should be any different.
Yes, the manufacturers know those insecurities well, don't they?:)
Which is why we will never see those ITU or MUSHRA tests performed by them....and if they are performed by curious naysayers, well, the goalpost can always be shifted to, hey I prefer it sighted anyway. IOW, right back where we started.:lol:

cheers,

AJ
 
No it wouldn't, unless there is evidence to suggest otherwise. Measurable difference does not equate to automatic audible difference, nor do construction methods or parts used, etc.
Unless you have some ITU/MUSHRA evidence to support that contention?
Ok, so let's analyse this from the angle I was taking - the idea that different approaches to D-to-A conversion will necessarily produce different results. You agree that they will measure differently (so we are half way there) but you want it proven that they are audible? I have "proven" it to myself sufficiently, so I feel no obligation (nor have I the time or money) to engage in ITU/MUSHRA tests simply to "prove" it to someone else. I guess it boils down to what evidence you pay attention to Vs what I pay attention to (both of these are prone to a selectivity bias) - I cite all my listening experience, both sighted & blind (none of which is rigorous), & you cite what - blind tests (which aren't rigorous either) which return null results? Neither body of evidence is 100% valid - so it's just opinions we are sharing.

Great, so you do believe in blind tests to reduce/remove confounders, just rigorous ones like MUSHRA, etc. Excellent.
But that puts you in conflict with both Dudley/Stereophile and your previous statement:

How does longer term fully sighted/biased "triangulation", remove biases/confounders the way MUSHRA et al does?
Isn't that just pure sighted/biased preference? The one we all use in the end?
You have to take the practicality & pragmatism of the hobby into account here. Almost nobody in audio has the time or financial resources to conduct such rigorous tests, so what we are left with are non-rigorous tests, both blind & sighted. I prefer to stick with a natural listening style for my listening as I have proven to myself that I can differentiate devices to the level that interests me without a bias which unduly interferes with the order of my preference. Every now & then I use a personal blind test as a cross check but I don't consider it any more valid than my sighted listening. Yes, these listening sessions help me to triangulate onto the sonic nature of the device, something that could be done blind, but I think fewer of them would be done if they were blind, the risk being a smaller pool of results from which one would be drawing conclusions. The fact that these sighted listening sessions are done on different days, on different systems, with different music, with different moods, with different ear wax, etc, all goes towards it being a more all-encompassing test than the usual once-off blind test. Yes, I have changed my mind during sighted listening sessions, thinking I heard something but, on a second playing, not. So, as long as you remain open to self-correction, I believe sighted listening, over time, with different systems can be more accurate & informative than the usual blind, once-off tests that one usually sees being cited.

Right. So why are exactly zero of the high-end manufacturers, making the claims about their products (and the subsequent high price tags as justification), performing these ITU or MUSHRA tests? Logic dictates the onus falls squarely on them, not the naysayers. Logic also dictates that one cannot prove a negative. So it's not up to anyone else to "disprove" these claims. It's up to the manufacturers of DACs (etc) to do so. Or not be taken too seriously.
Given the time & resources available, it's really not surprising that most audio manufacturers don't run ITU/MUSHRA tests.

Yes, the manufacturers know those insecurities well, don't they?:)
Which is why we will never see those ITU or MUSHRA tests performed by them
Oops, I fear the same poster might level "conspiracy theorist" at you after this remark.
....and if they are performed by curious naysayers
Can you cite one of these, please - I would be interested in reading it?
, well, the goalpost can always be shifted to, hey I prefer it sighted anyway. IOW, right back where we started.:lol:
I'm not sure if you are a frequenter of the WB forum, but Amir's thread showed his positive ABX results for ArnyK's hi-res vs Redbook files & for Winer's A/D-D/A loopback tests (both of which had stood for 15 years or so as always providing negative results). This caused the reaction you predict, but this time from those who swear by blind tests - suggestions of cheating, of IMD, of bad resampling software, etc. None of which was ever proven to be the reason for the positive results - so I guess nobody is immune to what you say?

Point of all this is - there is a difference between a valid blind test & an invalid one - they are not all made equal. I have no problem with valid blind tests but few actually are (I suspect this is what Dudley is saying). If you want the usual blind listening sessions to approach anything like validity then use some controls in them - this is simple enough & shows that the test is not returning false negatives, which is what I suspect underlies most such tests.
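For illustration, here is a minimal sketch (hypothetical function names & trial counts, not any published protocol) of how such controls could be scored: gate the device-under-test result on a hidden positive control (a known-audible difference, e.g. a 1 dB level offset) & a hidden negative control (a device compared against itself), each judged with an exact one-sided binomial test against guessing:

```python
from math import comb

def p_value(correct, trials):
    """One-sided exact binomial test against guessing (p = 0.5):
    probability of `correct` or more successes by chance alone."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2**trials

def validate(positive_ctrl, negative_ctrl, dut, alpha=0.05):
    """Each argument is a (correct, trials) pair from a blind session.
    positive_ctrl: trials on a known-audible difference (e.g. 1 dB offset)
    negative_ctrl: trials comparing a device against itself
    dut:           trials on the devices actually under test"""
    if p_value(*positive_ctrl) > alpha:
        return "invalid: known-audible difference missed (false negatives likely)"
    if p_value(*negative_ctrl) <= alpha:
        return "invalid: 'difference' heard between identical signals (false positives likely)"
    return "audible difference" if p_value(*dut) <= alpha else "no audible difference shown"

# Hypothetical session, 16 trials per comparison:
# controls behave as expected, DUT scores 12/16 (p ~ 0.038)
print(validate((14, 16), (9, 16), (12, 16)))   # prints "audible difference"
```

The point of the sketch: a 12/16 score on the devices only means something once the same session shows the panel caught the known difference & did not "hear" one between identical signals.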
 
You agree that they will measure differently (so we are half way there) but you want it proven that they are audible?
Could, not will...and yes, if you claim for example, your 0.005% THD DAC is "10x better", or better "performing" than the 0.05% THD DAC, I want you to prove it audibly. Good luck.:)
Measuring different does not equate to audibly different. It can, when determined by....drum roll...JND threshold blind tests. Like for frequency response. Loudness...and the list goes on.

I have "proven" it to myself sufficiently, so I feel no obligation (nor have I the time or money) to engage in ITU/MUSHRA tests simply to "prove" it to someone else.
You posited ITU/MUSHRA as the gold standard, but if pure anecdote is fine for you, then fine. You certainly couldn't blame someone for being skeptical if you then made audibility claims about DACs etc., could you?

I prefer to stick with a natural listening style for my listening as I have proven to myself that I can differentiate devices to the level that interests me without a bias which unduly interferes with the order of my preference.
You have proven to yourself that you are free of unduly interfering bias? That's circular.

Every now & then I use a personal blind test as a cross check but I don't consider it any more valid than my sighted listening.
Then why bother? Especially if it's not ITU/MUSHRA.


So, as long as you remain open to self-correction, I believe sighted listening, over time, with different systems can be more accurate & informative than the usual blind, once-off tests that one usually sees being cited.
For bias-filled preference, yes; for sound>ears, no.

Given the time & resources available, it's really not surprising that most audio manufacturers don't run ITU/MUSHRA tests.
That or the house of cards collapses, along with the money.:)

I'm not sure if you are a frequenter of the WB forum, but Amir's thread showed his positive ABX results for ArnyK's hi-res vs Redbook files & for Winer's A/D-D/A loopback tests (both of which had stood for 15 years or so as always providing negative results). This caused the reaction you predict, but this time from those who swear by blind tests - suggestions of cheating, of IMD, of bad resampling software, etc. None of which was ever proven to be the reason for the positive results - so I guess nobody is immune to what you say?
I thought you said the gold standard was ITU/MUSHRA? An amateur online test with corrupt files that can be gamed (as shown on AVS & HA) is neither. But you believe the results, when they fit?

If you want the usual blind listening sessions to approach anything like validity then use some controls in them - this is simple enough & shows that the test is not returning false negatives, which is what I suspect underlies most such tests.
And this suspicion that blind test are "hiding" the purported differences, is based on what positives for the purported differences?
What exact method is being used to generate these positives, so that you are suspicious of, or know, that the non-ITU/MUSHRA blind tests are hiding them with these negative outcomes? Triangulation, or....?

cheers,

AJ
 
Well I think 40 employees were used in the test - so do you think the graph is really showing nothing of any real statistical significance & the whole premise of the article is undermined?
I don't know, but a difference of less than 1 out of 8 (the preference scale) with only 40 subjects? It might depend on the methods used, but off-the-cuff that doesn't sound like a statistically significant difference.
 
I don't know, but a difference of less than 1 out of 8 (the preference scale) with only 40 subjects? It might depend on the methods used, but off-the-cuff that doesn't sound like a statistically significant difference.

Remember, the number required for the test is based upon 100% of the population being able to discriminate. The number of test subjects required if only 30% of the population is able to identify the DUT goes up tremendously.
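That sample-size effect can be sketched with an exact binomial power calculation. All parameters here are assumptions for illustration, not figures from the Harman test: an ABX-style forced choice, discriminating listeners score 70% correct, one-sided alpha = 0.05, target power = 0.80.

```python
from math import comb

def pmf(n, p):
    """Exact Binomial(n, p) probability mass function as a list."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def required_trials(p_true, alpha=0.05, power=0.80, n_max=700):
    """Smallest number of forced-choice trials for which a one-sided
    exact binomial test at `alpha` rejects guessing (p = 0.5) with
    probability `power` when the true per-trial accuracy is `p_true`."""
    for n in range(5, n_max + 1):
        null = pmf(n, 0.5)
        tail, k_crit = 0.0, n + 1
        for k in range(n, -1, -1):   # smallest k with P(X >= k | H0) <= alpha
            tail += null[k]
            if tail > alpha:
                break
            k_crit = k
        if sum(pmf(n, p_true)[k_crit:]) >= power:
            return n
    return None

# Everyone discriminates at 70% accuracy:
n_all = required_trials(0.70)
# Only 30% of listeners discriminate; the rest guess, so the pooled
# per-trial accuracy drops to 0.3 * 0.7 + 0.7 * 0.5 = 0.56:
n_some = required_trials(0.56)
print(n_all, n_some)
```

Under these assumptions, shrinking the discriminating fraction from 100% to 30% pulls the pooled accuracy from 0.70 to 0.56 and inflates the required trial count by roughly an order of magnitude - which is the "goes up tremendously" point above.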
 
World Boxing Federation?:)

Blind tests work fine for audio, are the de facto standard of science....and with stereo equipment, like watches, wine and women....we end up with whatever we prefer.

cheers,

AJ
 
World Boxing Federation?:)

Blind tests work fine for audio, are the de facto standard of science....and with stereo equipment, like watches, wine and women....we end up with whatever we prefer.

cheers,

AJ

No they're not. Just because blind tests are the standard in one field does not AUTOMATICALLY make them the standard in every field. The same holds true with ANY research methodology. You have to carefully apply the methodology to each situation. I've already listed the many reasons blind tests are not applicable to audio testing.

BTW do you know what an internal control is? If so, where, what and how is it used?
 
I've already listed the many reasons blind tests are not applicable to audio testing.
That would be news to anyone with a cell phone, TV, hearing aid, etc.
Myles, not only does blind testing work for every field of science including audio, it is/was used for nearly everything except "hi end".
I consider the selection of orchestra players to be "audio" also. Yep, those have been selected blind now since the 70s.

BTW do you know what an internal control is? If so, where, what and how is it used?
Controls are used for all forms of testing, including blind, the de facto standard of audio perceptual science. As JG Holt pointed out long ago "As far as the real world is concerned, high-end audio lost its credibility during the 1980s, when it flatly refused to submit to the kind of basic honesty controls (double-blind testing, for example) that had legitimized every other serious scientific endeavor since Pascal. [This refusal] is a source of endless derisive amusement among rational people and of perpetual embarrassment for me."
Couldn't have said it better.:)
Obviously it's not for everyone, but the world turns regardless. Cell phones, TVs, JBLs, Genelecs and hearing aids all work, orchestras become more gender diverse, etc, etc.
High end guys remain high end guys.

cheers,

AJ
 
That would be news to anyone with a cell phone, TV, hearing aid, etc.
Myles, not only does blind testing work for every field of science including audio, it is/was used for nearly everything except "hi end".
I consider the selection of orchestra players to be "audio" also. Yep, those have been selected blind now since the 70s.


Controls are used for all forms of testing, including blind, the de facto standard of audio perceptual science. As JG Holt pointed out long ago "As far as the real world is concerned, high-end audio lost its credibility during the 1980s, when it flatly refused to submit to the kind of basic honesty controls (double-blind testing, for example) that had legitimized every other serious scientific endeavor since Pascal. [This refusal] is a source of endless derisive amusement among rational people and of perpetual embarrassment for me."
Couldn't have said it better.:)
Obviously it's not for everyone, but the world turns regardless. Cell phones, TVs, JBLs, Genelecs and hearing aids all work, orchestras become more gender diverse, etc, etc.
High end guys remain high end guys.

cheers,

AJ

As much as I respect JGH, can you say CDP-101? One of the most god-awful pieces of audio equipment ever foisted on the public and audiophiles. Gordon creamed over it. Wonder how it would have fared in a blind test? And if Gordon truly believed that blind tests were the cat's meow, then why didn't he put his money where his mouth was? Certainly he was a leader in the field and had ample opportunity. Guess we'll never know.

Sorry AJ but you are wrong. (And I spent 20+ years doing research and just might know a thing or two about trials.) Blind trials are used for drug testing for a specific reason. But blind trials simply don't work in all situations. BTW who officially crowned blind trials the gold standard in audio testing?

But more to the point, why are you dodging the topic of internal controls (not to mention why blind tests don't work for audio and simply tell us what we already know)? We are not talking about just any old controls but specifically internal controls that show whether or not the test works and what the test's limits are. Where have any been used in these audio tests?
 
AJ, we are not talking about just any controls but specifically internal controls for any scientific research. Where have any been used in these audio tests?
Well, I can't go back to 1933 and ask Fletcher-Munson, but there have been an awful lot of blind audio tests since then, through ISO 226:2003, so perhaps you could ask ISO what they use today. Or maybe ask someone like the aforementioned JJ what "internal controls" are used to make audio over cell phones, TVs, VOIP, etc, work today.

cheers,

AJ
 
The argument that ITU/MUSHRA testing is the only true criterion for the evaluation of audio devices seems to lead to the logical conclusion that any audio device not backed up by such tests is dubious & any manufacturer not willing to invest the resources into such tests is again questionable. I don't really follow the argument that singles out high-end manufacturers in all this. I suggest 99.9% of ALL audio device manufacturers do not use ITU/MUSHRA testing in their evaluation, not just high-end ones.
 
...
BTW do you know what an internal control is? If so, where, what and how is it used?
I believe this went sailing over the heads of those who strongly advocate blind testing. It's not the first time I've witnessed this oversight - it would seem that yet again, the use of the word "blind" is enough for these advocates & "double blind" really nails it down - nothing else is of much concern (Oh, except level matching)
 
Well, I can't go back to 1933 and ask Fletcher-Munson, but there have been an awful lot of blind audio tests since then, through ISO 226:2003, so perhaps you could ask ISO what they use today. Or maybe ask someone like the aforementioned JJ what "internal controls" are used to make audio over cell phones, TVs, VOIP, etc, work today.

cheers,

AJ

AJ, I'm speaking to you, not JJ. Nor are we talking about cell phones, etc. You are the one who says you carry out blind testing. Specifically, what standards are being used in your tests to establish their validity and sensitivity?

The real problem is that these tests are set up by engineers who are simply clueless when it comes to biology. And reductionism does not work when it comes to biological systems, unlike what the engineering books tell them.
 