Kaiser Fung, author of "Numbers Rule Your World" and the Junk Charts blog, has an interesting post on the ethics of "correcting" data so that it provides a better fit for a theoretical model. His post refers to a deeper (and longer) post by Phil Price explaining why climate scientists correcting their data is not as fishy as it has been made to seem.
He outlines three circumstances in which we might alter data to improve its fit to a model, which I would paraphrase as:
- "If it ain't broke...": We tend to look more deeply into data that don't fit the model, and investigate "corrections" that would bring them closer to it, than we do into data that do fit.
- Cherry picking: We tend to discard questionable data that don't fit the model, while applying weaker standards to data that support our theory.
- The model is right: If a model is very well established, and very good, then the chances are that data which don't fit are wrong.
It's an excellent article, and well worth reading. I think the take-home point is that analysts in general have a tendency to bias data to fit their theory, but that it CAN be right to tinker with data. That said, we have to be very careful about whether it's justified in each particular case.
In customer research there is often pressure to cherry pick when scores are reported for individual cases or customers (or even for specific times): "Don't interview Bob, he always gives low scores."
Next time you feel like removing a low score you don't deserve, ask yourself if you would be willing to take out one of the scores that's higher than you deserve.
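The asymmetry behind that question can be made concrete with a toy sketch. The scores and the cutoff below are invented purely for illustration:

```python
import statistics

# Made-up survey scores on a 1-10 scale.
scores = [9, 8, 3, 10, 7, 2, 9, 8]

# The honest average uses every response.
honest = statistics.mean(scores)

# Cherry picking: quietly drop the low scores ("Bob always gives low scores").
cherry_picked = statistics.mean([s for s in scores if s >= 5])

print(f"honest mean:   {honest:.2f}")        # 7.00
print(f"lows removed:  {cherry_picked:.2f}")  # 8.50
```

Dropping only the scores below the cutoff moves the mean from 7.0 to 8.5. Notice that the symmetric operation, removing scores that are "too high", would pull the mean back down, which is exactly why it never gets proposed.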