In PDH Pareto Analysis, Chandoo [Congratulations mate ;-)))] shares with us a chart showing some pareto charting he was doing. This interested me because I didn’t get the chart. I’d like to open this up for debate, and hopefully learn something.
Here’s Chandoo’s chart:
Here are my thoughts on Chandoo’s chart – they’re probably wrong!
- The chart title doesn’t provide us with much information about the chart. I thought you were supposed to use titles that gave some insight into what the chart is telling us?
- I don’t get what the bars show. For pareto I don’t care about the count of each X category; I just want to know the % of X to Y, right?
- Overall I don’t get what the chart is telling me. The 80% line is crossed at “Dashboards” or “100 Excel Tips”, which get about 5k each – but what’s that got to do with pareto?
Here’s how I show pareto relationships, which addresses the issues above:
I often use pareto for analysing stock profiles etc., where there are 1000s of items, and these are the 3 things I really want to know:
1. Does this profile conform to 80/20 rule?
2. If not, what % of X makes up 80% of Y?
3. How long is the 20% tail?
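The three questions above can be answered straight from the numbers before any chart is drawn. Here is a minimal Python sketch, assuming the data is just a list of Y values (one per X item). The function name `pareto_summary`, its parameters and the sample figures are all made up for illustration; none of them come from either chart.

```python
def pareto_summary(values, x_share=0.20, y_share=0.80):
    """Summarise a profile against the 80/20 rule (illustrative helper)."""
    ranked = sorted(values, reverse=True)      # biggest contributors first
    n, total = len(ranked), sum(ranked)

    # Running total of Y as we walk down the ranked items.
    cumulative, running = [], 0
    for v in ranked:
        running += v
        cumulative.append(running)

    # Q1: what share of Y do the top 20% of items account for?
    top_n = max(1, round(x_share * n))
    top_share = cumulative[top_n - 1] / total

    # Q2: how many items does it take to reach 80% of Y?
    items_needed = next(
        i for i, c in enumerate(cumulative, start=1) if c >= y_share * total
    )

    # Q3: everything after that point is the tail.
    return {
        "top_share": top_share,
        "items_needed": items_needed,
        "items_needed_share": items_needed / n,
        "tail_items": n - items_needed,
    }


# Invented sample data: ten items with a total of 1000.
sample = [400, 250, 120, 80, 50, 40, 30, 15, 10, 5]
summary = pareto_summary(sample)
print(summary)
```

With this sample the top 20% of items give 65% of Y, and it takes 4 of the 10 items to reach 80%, so the profile is closer to 80/40 than 80/20 and the tail is 6 items long.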
In practice I often add the lines but don’t normally add the text; I think it’s clear what the lines are showing. I’m not so worried about the data in Chandoo’s chart, because it’s a bit misleading anyway (only 10 pages?), but what I hope my chart shows better is the 2 important data points, 20% of X and 80% of Y, and also how the tail looks.
Now that’s just my take on it and I’m almost certainly wrong! What’s your view?
Hi Ross -
What I would change about Chandoo’s chart is that I would remove the secondary axis. It is clearer to have the line and bars on the same scale. I would also tweak the horizontal alignment so that the markers are aligned with the right edges of their respective bars, not centered above them.
The bars themselves are to me more important than the cumulative line, because they can show you which two or so items you need to give your attention to first. In my days as an engineer, we used Paretos mostly to show problems, like scrap or defect rates. Work on the first two bars, then come back next time period and revisit the data.
The Pareto Principle originally didn’t really say 80% and 20%. It was more like “Most of something can be attributed to a small proportion of something else”. The obsession with the exact 80% and 20% limits is counterproductive. It might be 80-20, or 80-40, or 80-10. It’s immaterial; ranking the bars in decreasing order shows where attention is needed.
Very good analysis of the pareto chart Ross.
A few thoughts:
(1) As Jon already suggested, the purpose of a Pareto chart is not to confirm the 80-20 principle but to show the relationship in a given scenario; in my case it is more like 80-60.
(2) A pareto chart, by definition, shows both individual and cumulative contributions. We can debate whether it is the right chart. But my intention is to demonstrate how such a chart can be created in Excel, rather than think about its merits and de-merits.
(3) Here is a corrected version of the chart based on Jon’s recommendation – http://chandoo.org/img/p/pareto-chart-with-diff-axis-option.png
I am going to tweet about this article. I am interested in knowing what others think too. :)
Must read:
Lee Wilkinson, Revising the Pareto chart. The American Statistician, Volume 60, Number 4, November 2006, pp. 332-334.
With reference to the updated Chandoo chart:
The cumulative line must go to 100%, otherwise we are missing data (bars).
If only a few bars are missing, include the data.
If there are 20+ bars missing, put in an Other category, but tell us that it includes 20+ bars of various categories.
If you have more than about a dozen bars, can they be re-categorised to group common items?
You are generally looking for step changes in the distribution, not a % here or there.
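A rough Python sketch of the Other-category idea, assuming the data is a simple category-to-count mapping. The helper name `with_other`, the `keep` cut-off and the sample counts are invented for illustration. The small categories are folded into one labelled bar so the cumulative line still ends at 100%.

```python
def with_other(counts, keep=12, label="Other"):
    """Rank categories, fold everything past `keep` into one bar,
    and return (name, value, cumulative share) rows."""
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    head, tail = ranked[:keep], ranked[keep:]

    if tail:
        # Say how many categories the combined bar hides.
        name = f"{label} ({len(tail)} categories)"
        head.append((name, sum(v for _, v in tail)))

    total = sum(v for _, v in head)
    rows, running = [], 0
    for name, value in head:
        running += value
        rows.append((name, value, running / total))
    return rows


# Invented sample counts, keeping only the top two bars.
sample = {"a": 50, "b": 30, "c": 10, "d": 6, "e": 4}
rows = with_other(sample, keep=2)
for row in rows:
    print(row)
```

The combined bar goes last rather than being re-ranked by size, so the named bars stay in decreasing order.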
Thanks for the comments everyone.
Naomi, alas I don’t have a subscription to The American Statistician, so I’ll have to do without, but thanks for the suggestion.
Jon, Chandoo,
Yes, the 80/20 not being absolute is a good point. I think the 80% bit is the most important aspect though. When I use pareto I want to know if I should invest my efforts equally across all the aspects of X or focus on just a few, and the pareto helps show this.
I guess one difference between my use and yours, Jon, is that I could never really use bars. Typically my top 20% would be made up of 100-200 items; I’ll still need to look those up in a table, but I use the chart to see the whole population.
I’m in a bit of a rush here, I’ll try and do some more research and dig out some of my examples and do another post.
Thanks for your comments again, much appreciated!
Ross
Why do we need a cumulative curve in a Pareto?
Hi Ross,
I just reviewed my implementation of Pareto in SFE.
By stacking the segments, the cumulative curve is not required anymore: http://sparklines-excel.blogspot.com/2009/12/pareto-v2.html
Cheers