We’re shifting from a Gaussian world to a Paretian world, with profound implications for business. Carl Friedrich Gauss was a famous mathematician whose career spanned the late 18th century and the first half of the 19th, and Vilfredo Pareto was a great economist who lived across the cusp of the 19th and 20th centuries. So, what possible relevance do these dead white men have for business today?
Gauss versus Pareto
Gauss contributed the Gaussian distribution, also known as the normal distribution, as a way to characterize the probability of events – most of us know it as the familiar bell curve with a significant hump in the middle and two relatively modest tails on either side of the hump. Pareto, on the other hand, inspired the Pareto, or power law, probability distribution. Chris Anderson’s The Long Tail offers a great contemporary example of the Pareto probability distribution – a few extreme events or “blockbusters” on the left hand side of the curve and a very long tail of much less popular events on the right hand side of the curve. The Pareto distribution has also been popularized as the “80/20” rule.
(Image courtesy of Albert-Laszlo Barabasi, "Linked: The New Science of Networks")
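The “80/20” shorthand can be tied to a specific Pareto shape. For a Pareto distribution with shape parameter alpha greater than 1, a standard result is that the top fraction p of the population holds a share p^(1 - 1/alpha) of the total. The Python sketch below simply evaluates that formula; the function names are illustrative assumptions and do not come from Anderson or Barabasi.

```python
import math

def top_share(p, alpha):
    """Share of the total held by the top fraction p of a Pareto population.

    Uses the closed form p ** (1 - 1/alpha), which is valid only for
    alpha > 1 (at alpha <= 1 the mean is infinite and the share is undefined).
    """
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    if alpha <= 1:
        raise ValueError("alpha must exceed 1 for a finite mean")
    return p ** (1 - 1 / alpha)

def alpha_for_rule(p, share):
    """Shape parameter at which the top fraction p holds the given share."""
    return 1 / (1 - math.log(share) / math.log(p))

# The 80/20 rule corresponds to alpha = log(5) / log(4), about 1.16.
alpha_80_20 = alpha_for_rule(0.2, 0.8)
print(round(alpha_80_20, 3))
print(round(top_share(0.2, alpha_80_20), 3))   # recovers the 0.8 share
print(round(top_share(0.01, alpha_80_20), 3))  # share held by the top 1 percent
```

Under this assumed shape the concentration repeats at every scale: the top 1 percent alone holds more than half of the total.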
These are two very different ways of viewing the world, with some events following a Gaussian distribution (classic example: the heights of individual human beings) and other events following a Pareto distribution (classic examples: frequency of word use, size of human settlements, distribution of Internet traffic and intensity of earthquakes).
Bill McKelvey, a professor at UCLA’s business school, has written a series of excellent papers exploring the significance and implications of these two world views for business (warning to the casual business reader – these are dense works that cannot be skimmed and certainly should not be perused on a Blackberry lest you experience Berrybite blowback).
In a recent journal article (purchase required) written with Pierpaolo Andriani, McKelvey highlights a crucial distinction between the Gaussian and Paretian worlds:
Gaussian and Paretian distributions differ radically. The main feature of the Gaussian distribution . . . can be entirely characterized by its mean and variance . . . A Paretian distribution does not show a well-behaved mean or variance. A power law, therefore, has no average that can be assumed to represent the typical features of the distribution and no finite standard deviations upon which to base confidence intervals . . .
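A small simulation can make the point about averages tangible. The sketch below is only an illustration: the Gaussian parameters (mean 170, standard deviation 10) and the Pareto shape of 1.0 are assumptions chosen for the example, not values taken from Andriani and McKelvey. It compares the sample mean with the median and measures how much of the total comes from the single largest observation.

```python
import random
import statistics

def largest_share(samples):
    """Fraction of the sample total contributed by the single largest value."""
    return max(samples) / sum(samples)

def summarize(samples):
    """Collect the three summary numbers compared in this sketch."""
    return {
        "mean": statistics.fmean(samples),
        "median": statistics.median(samples),
        "largest_share": largest_share(samples),
    }

def compare_worlds(n=100_000, seed=42):
    rng = random.Random(seed)
    # Gaussian draws: mean 170, standard deviation 10 (illustrative, height-like).
    gaussian = [rng.gauss(170, 10) for _ in range(n)]
    # Pareto draws with shape 1.0 and minimum 1: the theoretical mean is infinite.
    pareto = [rng.paretovariate(1.0) for _ in range(n)]
    return summarize(gaussian), summarize(pareto)

gaussian_stats, pareto_stats = compare_worlds()
print("Gaussian:", gaussian_stats)
print("Pareto:  ", pareto_stats)
```

In the Gaussian sample the mean and median should sit almost on top of each other, and no single draw matters. In the Pareto sample the median stays near its theoretical value of 2, while the mean is dragged far above it by a handful of huge draws and shifts noticeably when the seed changes, which is what "no well-behaved mean" looks like in practice.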
Andriani and McKelvey focus on the desperate efforts of social “scientists” to fit social phenomena into Gaussian distributions. There is a sad humor in their discussion of the creative approaches used by econometricians to add “robustness” improvements to standard linear multiple regression models, in a vain effort to account for extreme events. As they observe in another paper, “robustness tests bury the most important variance.”
But it is not just social scientists who fall prey to this temptation to adopt a Gaussian view of the world. Business executives also are drawn to a Gaussian world. At one level it is much simpler – there is a meaningful “average consumer” that can be used to scale products and operations around – and it is a much more predictable world. In many respects, the history of Western business in the twentieth century represents an effort to build scalable operations through standardization designed to serve “average consumers”.
As McKelvey observes in another paper ("Extreme Events, Power Laws, and Adaptation" - unfortunately not yet available online) co-authored with Max Boisot:
Organizations can be shaped or forced into a Gaussian form. The large hierarchies that managers work in, and the procedures that they impose on their organizational members – the division of labor, single-point accountability, cost accounting, etc. – aim to achieve control by isolating and objectifying. These managers inherit from the industrial economy a belief that, even where the world is not yet Gaussian, it can be made so through design.
The growth of the Paretian world
Here’s the problem (or opportunity). Gaussian distributions tend to prevail when events are completely independent of each other. As soon as you introduce the assumption of interdependence across events, Paretian distributions tend to surface because positive feedback loops tend to amplify small initial events. For example, the fact that a website has a lot of links increases the likelihood that others will also link to this website.
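The website example can be turned into a toy growth model. The sketch below is a simplified preferential-attachment simulation, offered as an assumption-laden illustration rather than the model used in the papers cited here: sites arrive one at a time, and each newcomer links to one existing site. With feedback switched on, the target is chosen in proportion to its inbound links plus one; with feedback switched off, every existing site is equally likely.

```python
import random

def grow_web(num_sites=20_000, preferential=True, seed=1):
    """Return the inbound-link count of each site after the web has grown."""
    rng = random.Random(seed)
    inlinks = [0]   # start with one site that has no inbound links
    tickets = [0]   # site i appears here (inlinks[i] + 1) times
    for new_site in range(1, num_sites):
        if preferential:
            # A random ticket picks a site in proportion to inlinks + 1.
            target = rng.choice(tickets)
        else:
            # No feedback: every existing site is equally likely.
            target = rng.randrange(new_site)
        inlinks[target] += 1
        inlinks.append(0)
        tickets.append(target)    # one more ticket for the linked-to site
        tickets.append(new_site)  # the newcomer's single starting ticket
    return inlinks

def top_share(inlinks, fraction=0.01):
    """Share of all links held by the most-linked fraction of sites."""
    k = max(1, int(len(inlinks) * fraction))
    ranked = sorted(inlinks, reverse=True)
    return sum(ranked[:k]) / sum(ranked)

with_feedback = grow_web(preferential=True)
without_feedback = grow_web(preferential=False)
print("most-linked site:", max(with_feedback), "vs", max(without_feedback))
print("top 1% share:", top_share(with_feedback), "vs", top_share(without_feedback))
```

The only difference between the two runs is whether past links influence future ones, yet the run with feedback should produce a far larger "blockbuster" site and a much heavier concentration of links among the top 1 percent.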
McKelvey and Andriani suggest that Gaussian distributions can morph into Paretian distributions under two conditions – when tension increases and when the cost of connections decreases. In our globalizing economy, tension rises as competitive intensity increases and as business landscapes evolve faster than the capacity of most organizations to adapt. At the same time, costs of connections are rapidly decreasing as public policy shifts towards freer movement of goods, money and ideas and rapid improvements in the price-performance of IT infrastructures dramatically reduce the cost of information transmission. Bottom line: Paretian distributions become even more prevalent.
(Just as an aside, I wonder what bio-engineering will do to even the most traditional Gaussian distribution – the height of individuals. As we acquire the ability to genetically engineer the height of our children, will we see height “fads” emerge in the same way we see naming fads evolve from generation to generation today? Perhaps in some generations, tall will be “in” while in other generations short will come back – leading to the crumbling of yet another Gaussian bastion.)
Extreme events
So, why does this matter? In a world of power law or Pareto distributions, extreme events become much more prominent. Extreme events can take many forms. They can be sudden and severe disturbances like a magnitude 9 earthquake or a financial meltdown like the one that occurred in US stock markets in 1987. As McKelvey and Andriani observe, “the lesson that we can draw . . . is that extreme events, which in a Gaussian world could be safely ignored, are not only more common than expected but also of vastly larger magnitude and far more consequential.”
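To give a rough sense of how differently the two worlds treat extremes, the sketch below compares the probability of exceeding a multiple of the median under each distribution. The parameters are assumptions picked for illustration (a Gaussian with mean 170 and standard deviation 10, a Pareto with minimum 1 and shape 1.5), so the numbers show the contrast in tail behavior rather than describing any real data set.

```python
import math

def gaussian_tail(x, mean, sd):
    """P(X > x) for a normal distribution, via the complementary error function."""
    return 0.5 * math.erfc((x - mean) / (sd * math.sqrt(2)))

def pareto_tail(x, x_min, alpha):
    """P(X > x) for a Pareto distribution with minimum x_min and shape alpha."""
    return 1.0 if x <= x_min else (x_min / x) ** alpha

def pareto_median(x_min, alpha):
    """Median of a Pareto distribution: x_min * 2 ** (1 / alpha)."""
    return x_min * 2 ** (1 / alpha)

# Illustrative parameters (assumptions, not taken from the cited papers).
MEAN, SD = 170.0, 10.0       # Gaussian: the median equals the mean
X_MIN, ALPHA = 1.0, 1.5      # Pareto: finite mean, infinite variance

median = pareto_median(X_MIN, ALPHA)
for k in (1.1, 1.5, 2.0):
    g = gaussian_tail(k * MEAN, MEAN, SD)
    p = pareto_tail(k * median, X_MIN, ALPHA)
    print(f"exceeds {k} x median: Gaussian {g:.3g}, Pareto {p:.3g}")
```

Under these assumptions, an observation at twice the median is essentially impossible in the Gaussian case but occurs in a sizable fraction of Pareto draws, which is why ignoring the tail is safe in one world and dangerous in the other.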
Our institutions (not just businesses, but also educational and governmental) are largely designed for a Gaussian world where averages and forecasts are meaningful. As a result, we have evolved a sophisticated set of push programs that have delivered significant efficiency. In a world of sudden, severe and difficult to anticipate shifts, push programs become much less viable and we need to become a lot more creative in terms of designing pull platforms – something that JSB and I have written extensively about in the past. Bottom line: our institutional architectures, not to mention our technology architectures, will need to be redesigned to cope with a Paretian world.
Using examples like earthquakes and financial meltdowns obscures a related form of extreme event that has a more positive outcome (at least for direct participants) and generally takes longer to play out than the hours or days characteristic of sudden events. As McKelvey and Andriani point out, companies like Google and Microsoft have achieved enormous concentration of economic value creation that defies the averages of the Gaussian world. These extreme events have an interesting property – they emerge first in the “fat tail”, on the edge of conventional business activity, driven by a different view of business opportunity, and then gather momentum until they eventually break into the head of the distribution and change the game for everyone else. The challenge for business managers is to sort out the signal from the noise in the fat tail and spot early on the emergent extreme events that could reshape the business landscape. The Gaussian focus on averages obscures these events, treating them as meaningless “outliers” until it is too late.
There’s another form of extreme event that also becomes more prominent in a Paretian world – this is the tendency for extreme forms of clustering in social networks, whether it takes the form of clustering in mega-cities in physical space or clustering of links and traffic on web sites in virtual space. Economic value inexorably follows these social clusters. This also has powerful implications for business, ranging from where to locate operations in physical space to how to redesign institutional architectures to accommodate thousands of business partners. There’s also a public policy implication – in many domains we are likely to see degrees of concentration and consolidation of economic power that are unprecedented (although Pareto just over 100 years ago observed that 20% of the population in Italy owned 80% of the land).
Now, of course, all three of these extreme events are related – the clustering events generate and amplify the positive feedback loops that lead to both sudden and severe negative events as well as the more gradual, but no less significant, positive events.
Searching for simplicity
Besides extreme events, there’s another implication of the shift to a Paretian world. While on the surface Paretian worlds appear much more complex and unpredictable than the seductive simplicity of the Gaussian world, deep structural forces are at work shaping Paretian worlds. These structural forces play out at multiple levels of the Paretian world - for example, the local work group, the enterprise, broader process networks, cities, regions or the world.
The problem is that most of our analytical tools are designed to understand Gaussian worlds. These same tools seriously miss, or even distort, the dynamics of Paretian worlds. We need an entirely new analytical tool kit for the Paretian world. McKelvey and Andriani, in the journal article mentioned earlier, urge business researchers to learn from
. . . earthquake science where the study of extremes is routine, and complexity science, where focus is on emergent self-organization stemming from agent interdependence and positive feedback, consequent extremes and underlying scale-free theory. . . We see very little in existing social science disciplines that offer anything constructive here. Only by facing up to this redirection of strategic organization research can it actually become a practitioner-relevant science like the natural sciences.
In the paper co-authored with Max Boisot, McKelvey points to an alternative view of simplicity:
Underlying most power laws is a causal dynamic explained via a scale-free theory. Such a theory points to a single generative cause to explain the dynamics at each of however many levels are being studied. Scale-free theories yield what [Murray] Gell-Mann . . . refers to as “deep simplicity”. . . . Scale-free theories point to the same causes operating at multiple levels – simplicity here consists of one theory explaining dynamics at multiple levels.
Later in the paper, McKelvey and Boisot offer a suggestion about different strategies for achieving understanding between the Gaussian and Paretian worlds:
Processing dots is appropriate to what we label the routinizing strategy. Processing patterns, on the other hand, better serves what we call the Pareto-adaptive strategy. Processing dots means processing data, a low-level cognitive activity. By contrast, processing patterns – pattern recognition – is a high-level cognitive activity, one that involves selecting relevant patterns from among myriad possibilities. . .
In a Paretian world, surface events can become a distraction, diverting attention from the deep structures molding these surface events. Surfaces are extraordinarily complex and rapidly evolving while the deep structures display more simplicity and stability. These deep structures are profoundly historical in nature – they evolve through positive feedback loops and path dependence. Snapshots become misleading and understanding requires a dynamic view of the landscape.
The payoff
This is not simply an academic exercise. The rewards for achieving a better understanding of the Paretian world are enormous. Small moves, smartly made, can lead to exponential improvements in wealth creation provided they leverage the deep structures that define Paretian distributions. In contrast to the scaling strategies described earlier in the Gaussian world, different and even more powerful scaling strategies become feasible in the Paretian world, converting instability from a liability into an advantage.
Shifting mindsets
But, as with most things in business (and in life), mindsets become a key stumbling block. McKelvey and Boisot describe the “Gaussian perspective of the world” as one built on atomism, privileging “stability over instability, structure over process, objects over fields, and being over becoming.” Not a bad summary of the way most Western executives view the business landscape. There is a natural and very human tendency to seek out the typical or the average and to search for more predictability. By implication, a Paretian world requires a much more dynamic view of the world, one that looks for patterns in evolving relationships, rooted deeply in context, and that understands how these changing patterns reshape who we are as well as our opportunities for growth. McKelvey’s provocative work will help to challenge and shift our mindsets.
Having recently read "The Power of Pull", I can see how the ideas in this 2007 post (which I discovered via a tweet today) influenced many of the ideas presented there.
This is a very interesting analysis - and labeling - of the Paretian World (though I'm wondering how that might best be pronounced ... more like "Parisian" or "paration"). The terms "fat tails" and "extreme clusters" are also good takeaways.
The emphasis on increasing interdependence and interconnection is well taken, and helps explain why it seems increasingly challenging to make sense of [Gaussian-based] statistics reported on a variety of phenomena.
I particularly like your insights about surface events, distraction, diversion and complexity in a Paretian world, which helps explain why, at a time when we've never had so much direct access to original sources of information, more and more people seem to simply skim the surface (e.g., retweeting links to articles with provocative titles without taking the time to actually read them).
For what it's worth, I reviewed some related ideas in a blog post of my own in 2010 on power laws and pyramids: participation, gratification and distraction in social media, synthesizing some insights from Ross Mayfield, Josh Bernoff, Christopher Allen and Charles Cooley.
Posted by: Joe McCarthy | April 12, 2011 at 10:21 AM
John,
Thanks very much for the thought-provoking post. It makes me even more excited to read your book, which I coincidentally ordered last night!
Practitioners in social media, and people interested in social media metrics, would do well to read your post and ponder its implications. In particular, the quest for a simple (single) metric for "influence" seems like a Gaussian way of thinking (and probably a fool's errand).
Posted by: Ryan McCormack | August 23, 2010 at 07:07 AM
ERP/MRP II needs a fundamental rethink. SAP, Oracle, Infor, ERP et al are designed to push a hierarchical plan. In a world of extremes, systems now need to pull in demand and produce results without generating reams of redundant accounting and lead time exceptions.
Posted by: IC | August 22, 2010 at 10:47 AM
John, I *really* enjoyed this post, and I will equally enjoy re-reading it along with the amazing conversation /posts you generated.
I am in the midst of a strategic and intellectual exercise that is Paretian/Gaussian in nature, around what we used to call the "Black Swan Event."
We are looking for big changes that while unexpected, impact an entire ecosystem and change the rules forever-even generating revisionist history or assumptions. We (perhaps arrogantly) want to be a trigger of one of those Black Swan events, or at least be among the first to see one happening and thus maneuver ourselves to take optimum advantage of it. You know, jump into the dry river bed before the storm, and all that. But Black Swan events no longer occur (as in the past) in isolation or outside of the system. So they're not so easy to see, or even interpret. There is an argument that they only occur in the concert of multiple (and non-collaborating) actors in a system. In fact, we expect that they will occur in-sync and as a part of the *Omniscient* or organic changes of a system... Meaning that we can no longer "try" to make one happen. Kind of like the folly of "purposefully designing" a viral video or a new genre of popular music.
So while dramatic and perhaps cataclysmic or creatively destructive, they will still likely come from within or adjacent to one of our ecosystem spheres... because *everything* is now "connected" as never before.
Yet we don't have a way to map those connections because they form and are broken and form back so easily and dynamically.
The idea is that these *connections* are less expensive, and that as a result, things occur *and* morph/extend in near-real-time across the "real-time Web" (which has become a buzzword around here).
So in a business sense, and in a sociological context we see these "ripples" of an event repeated, and then creating additional concentric ripple-waves of their own, it often becomes one big chaotic hair-ball to interpret cause-and-effect. No matter how good your math is.
Borrowing another metaphor, it is also seeming to become more like the Schrödinger's Cat/Copenhagen shift of perspective. In an advertising condition, discrete outcomes are measured but not *known* until later, as there are secondary-dependent actions and outcomes that define the "success" of the ad transaction. Is the cat dead or alive? Is the ad successful or not? Depends on when you look, and if you look. And how you look.
We see connections that are accidental, incidental and unintentional actually impacting and changing the nature of the ecosystem. Gmail goes down because someone forgot to update a server. Twitter fails because of the usage shift, realtime communications shift to this discussion. People create alternative communications conduits (like a living organism creates new blood vessels to the heart when others get clogged up), and thus the system is permanently altered. Or is it? Sorry about the trivial and banal example with Gmail and Twitter.
But the point I'm struggling with, and which excites me about your post, and the subsequent brilliant discussion (others' not mine) is that we have apparently underestimated the evolutionary or quantum shifts happening faster due to connections and realtime feedback cycles being shortened.
We are looking at ordinary consumer and advertiser behavior around ads and the social graph. Simply by *introducing* a set of ad control feedback (passive and active) mechanisms we see that *other* incidental and accidental connections and feedback results begin to change the entire nature of the interaction, especially scattered in these infinite, concentric wavelets or droplets. So we see wonderful chaos that we now get to sort out in a Paretian/Gaussian shift to figure out if there is any cause-and-effect (or meaning) or if it is munging up (another technical term) because the very nature of the mass activity is shifting organically... as in Copenhagen.
In the end, we won't know until we know, and then if it is a Black Swan, we'll likely back-into a justification of the shift (as writers of history) and use the luxury of hindsight as our "proof." I think we'd like to figure out how to get 5% better than that and see if we can trend-spot something this big and get in front of it. Gaussian/Paretian, Schrödinger's cat/Copenhagen
Ok enough. This is fun, but it's hurting my brain on a sunny California Autumn afternoon to be inside and thinking so hard.
Keep up the great conversation! I'll try to keep up.
Cheers,
Matt Weeks
Posted by: mattweeks | September 06, 2009 at 04:13 PM
One piece of nitpicking: We have never been in a Gaussian world, where anything involving human social systems was concerned.
Posted by: twitter.com/AFG85 | September 06, 2009 at 02:14 PM
Great post! One issue to be careful with, however, is distinguishing between ranked data and probability density functions (PDFs). Ranked data is by definition always decreasing, and so of course can never be a bell curve or Gaussian. In some cases, ranked data can match a power law, in which case the corresponding PDF is also a power law. However, if the ranked data has a "long tail" but doesn't really fit a power law, the corresponding PDF can in fact be an exact Gaussian! More on this here and here.
I mention this because the distributions you mention as "Pareto", e.g. frequency of word use, are typically thought of as ranked data: the graph is that of words (x-axis) vs. frequency of use (y-axis), ordered from highest to lowest frequency. This of course cannot possibly be a Gaussian, since it is by definition decreasing!
This graph has a corresponding Cumulative Distribution Function (CDF): frequency of use (x-axis) vs. number of words with at least this frequency (y-axis). If this graph follows an inverse power law, then it's called a "Pareto distribution" (a Pareto distribution originally described the percentage of people owning more than x amount of wealth). However, this isn't the graph most people first associate with "frequency of word use"!
Finally, the corresponding PDF would be frequency of use (x-axis) vs. percent of words with that frequency (y-axis). Here's where you could potentially look for a Gaussian.
Posted by: Adam Marsh | May 20, 2007 at 08:28 PM
John,
Please have a look at this:
http://www.econtalk.org/archives/2007/04/taleb_on_black.html
It's a link to an interview with Nassim Taleb about his new book Black Swan.
The interview deals mainly with Gaussian truths and extreme events and the unpredictability of these events. The interview (MP3) is excellent.
Posted by: Rutger van Waveren | May 15, 2007 at 06:09 AM
As someone who prefers jumping from the saddle-point, your wonderfully written viewpoint brought me joy. But, then your perspectives usually do.
So, I can only assume that you've noticed this isn't confined to business trends. Perhaps human and universal tendencies are shifting. In the words of Gregory Bateson, "[It is] difference that makes a difference" and "Embedded and interacting systems have a capacity to select pattern from random elements."
I look forward to reading more on this topic and related communities of thought. Here's to patterns that connect!
Posted by: KM | May 10, 2007 at 08:27 PM
First, the mathematics of the two are substantively different. Gaussian law is based on the law of large numbers. Pareto made no such claim in any of his writings. In fact, Pareto's work as it pertains to distribution is sui generis. Next, Pareto elements are based on sharing properties, while the Gaussian distribution is based on common elements intrinsic (maybe naturally so) in a system.
Posted by: George Albert | May 09, 2007 at 04:29 PM
Love this blog. I am creating a course that has as its goal (in fact my whole company has this as its goal) to shift the mindset of systems engineers to more knowledge of the "Paretian" world from the Gaussian, but I didn't have those words to describe it until I read this. Gotta get my hands on the journal paper. Thanks.
Posted by: Sarah | May 07, 2007 at 10:40 AM
John,
I submit that we are already living in a "Paretian World". Consider the scale-free nature of the things in our daily lives (e.g., transportation hubs; web search engines; social network topologies). Most of us simply choose to ignore it because Gaussian methods support our dogmas so nicely.
Our education system (particularly in the west) thrives on reductionism that is inconsistent with a true "system of systems" view. Hence our fascination with single metrics in the most complex of situations (e.g., "enemy combatants killed"), and our preference for "silver bullet" solutions.
We also favor uniqueness in our answers, because that too favors our propensity for simple solutions. And we have a really tough time conceiving of very large numbers (q.v. Huxley's "six monkeys" strumming on typewriters that would, in a mere matter of years, randomly write all the works of Shakespeare -- when the real answer is more like 1E30 years to randomly get Hamlet).
My point: I absolutely agree with your charge to "change mindsets" and think more dynamically. As a parent of a kindergartener and a 4th grader, I try to expose my kids to as wide an array of disciplines and opinions as possible. This is perhaps the most important challenge of our time: how to raise dynamically-thinking, socially aware citizens of the world who will thrive in the connectedness that is upon us.
vr/ shane
Posted by: deichmans | May 05, 2007 at 12:49 PM
Thanks for this concise treatment of a very intriguing property of the real world. Your readers might also enjoy the references to power-law distributions as an emerging property of dynamic systems in Eric Beinhocker's "The Origin of Wealth" (see my review at http://livepaola.wordpress.com/2007/04/21/eric-d-beinhocker-the-origin-of-wealth-a-must-read/).
Posted by: LivePaola | May 04, 2007 at 08:54 AM
Great post!! One of the things that I have seen, especially with my friends doing research, is that most of them love the Gaussian distribution because the mathematical tools available to handle it are quite well established, while comparably good tools for the Pareto distribution are not.
For example, I have seen so many of them gleeful over the fact that the Fourier transform of a Gaussian is again a Gaussian, which helps them build nice illustrations and papers explained neatly in mathematical equations.
IMHO, the problem starts with the scientists and is prevalent all the way to economists and business people.
Leandro/John
I will try to check out the two books mentioned in the comments, but I think the only book I have seen that has given the Pareto distribution its due is Fooled by Randomness by N. N. Taleb.
Rajan
Posted by: Rajan | May 04, 2007 at 08:00 AM
Extreme events are just that... regardless of the distribution used to fit them. Averages are not applicable to phenomena described by normal distributions... frankly, this argument could use a bit of Occam's Razor
Posted by: Adelino de Almeida | May 03, 2007 at 03:03 PM
Excellent! Power law is at the core of our methodology of ‘management of change’, described in my book ‘Viral Change: the alternative to slow, painful and unsuccessful management of change in organizations’ (meetingminds, 2006). I apply that network-based conceptual framework to the management of organizational change. I can also say that it works! In many organizations I am working with, cultural and other changes become more sustainable and appear faster, precisely because we are using the power law distribution of people within the organization in terms of their influence. The power of a small set of non-negotiable behaviours, spread by the small number of people in the head of the power law, generates social copying and tipping points of new routines (cultural change, process change…). It is a contrarian view to the traditional linear, sequential, painful, expensive and mostly unsuccessful ‘change management process’. I create internal epidemics of success. Thanks for all the above thoughts which will help further building… - Leandro Herrero (www.thechalfontproject,com)
Posted by: Leandro Herrero | May 03, 2007 at 02:54 PM
John,
Excellent explanation that brings clarity to an important topic.
I use some of the same type of analysis in my work on warfare. Long tail war. Dynamism pattern recognition through never ending analysis/synthesis loops (Boyd's snowmobile) vs. static snapshots to support dogma.
If you get a chance, check out my book Brave New War (it's on Amazon). I'm also going to write this up on my weblog, Global Guerrillas, in some fashion.
Thanks much for such great work.
Sincerely,
John Robb
Posted by: John Robb | May 03, 2007 at 06:24 AM