Archive for the 'SystemDynamics' Category


A Geoff Coyle reading list

The System Dynamics Society reports that SD pioneer Geoff Coyle has passed away.

We report the sad news that longtime system dynamicist R. Geoffrey Coyle died on November 19, 2012. Geoff was 74. He started his career as a mining engineer. Having completed a PhD in Operations Research, he came to Cambridge, Massachusetts from the UK in the late 1960’s, and studied with Jay Forrester to learn system dynamics. Upon his return to the UK, he started to develop system dynamics in England. He was the founder of the first system dynamics group in the UK, at the University of Bradford in 1970. This group grew terrifically and produced some of the most important people in our field. Geoff and his students have made enormously important contributions to the field and the next generation of their students have as well, all following in Geoff’s footsteps and under his tutelage.

Geoff and the Bradford group also founded the first system dynamics journal, Dynamica. They created DYSMAP, the first system dynamics software that had built-in optimization and built-in dimensional consistency technique.

Geoff authored a number of very important books in the field including: Management System Dynamics (1977), System Dynamics Modelling: A Practical Approach (1996) and Practical Strategy: Tools and Techniques (2004). In 1998, he was the first recipient of the Lifetime Achievement Award of the System Dynamics Society. More recently he returned to his first academic love and wrote a highly acclaimed history of mining in the UK: The Riches Beneath Our Feet (2010). This is a wonderful legacy in the field of system dynamics and beyond.

I realized that, while I’ve always enjoyed his irascibly interesting presentations, I’ve only read a few of his works. So, I’ve collected a Coyle reading list.

Not even wrong: a school board’s discussion of systems thinking

Socialism. Communism. “Nazism.” American Exceptionalism. Indoctrination. Buddhism. Meditation. “Americanism.” These are not words or terms one would typically expect to hear in a Winston-Salem/Forsyth County School Board meeting. But in the Board’s last meeting on October 9th, they peppered the statements of public commenters and Board Members alike.

The object of this invective? Systems thinking. You really have to read part 1 and part 2 of Camel City Dispatch’s article to get an appreciation for the school board’s discussion of the matter.

I know that, as a systems thinker, I should look for the unstated assumptions that led board members to their critiques, and establish a constructive dialog. But I just can’t do it – I have to call out the fools. While there are some voices of reason, several of the board members and commenters apparently have no understanding of the terms they bandy about, and have no business being involved in the education of anyone, particularly children.

The low point of the exchange:

Jeannie Metcalf said she “will never support anything that has to do with Peter Senge… I don’t care what [the teachers currently trained in Systems Thinking] are teaching. I don’t care what lessons they are doing. He’s trying to sell a product. Once it insidiously makes its way into our school system, who knows what he’s going to do. Who knows what he’s going to do to carry out his Buddhist way of thinking and his hatred of Capitalism. I know y’all are gonna be thinkin’ I’m a crazy person, but I’ve been around a long time.”

Yep, you’re crazy all right. In your imaginary parallel universe, “hatred of capitalism” must be a synonym for writing one of the most acclaimed business books ever, sitting at one of the best business schools in the world, and consulting at the highest levels of many Fortune 50 companies.

The common thread among the ST critics appears to be a total failure to actually observe classrooms combined with shoot-the-messenger reasoning from consequences. They see, or imagine, a conclusion that they don’t like, something that appears vaguely environmental or socialist, and assume that it must be part of the hidden agenda of the curriculum. In fact, as supporters pointed out, ST is a method, which could as easily be applied to illustrate the benefits of individualism, markets, or whatnot, as long as they are logically consistent. Of course, if one’s pet virtue has limits or nuances, ST may also reveal those – particularly when simulation is used to formalize arguments. That is what the critics are really afraid of.

Finding SD conference papers

Some change at Google or the SD Society site has broken the old approach for filtering searches to SD conference papers, which was

site:www.systemdynamics.org/conferences/* bathtub dynamics

Now it appears that

site:www.systemdynamics.org bathtub dynamics

still works, though with a bit of noise from general SD society pages.

For some reason, the old way still works in a Google custom search.

Update: the new preferred syntax appears to be:

site:systemdynamics.org inurl:conferences bathtub dynamics

Kon-Tiki & the STEM workforce

I don’t know if Thor Heyerdahl had Polynesian origins or Rapa Nui right, but he did nail the stovepiping of thinking in organizations:

“And there’s another thing,” I went on.
“Yes,” said he. “Your way of approaching the problem. They’re specialists, the whole lot of them, and they don’t believe in a method of work which cuts into every field of science from botany to archaeology. They limit their own scope in order to be able to dig in the depths with more concentration for details. Modern research demands that every special branch shall dig in its own hole. It’s not usual for anyone to sort out what comes up out of the holes and try to put it all together.”

Carl was right. But to solve the problems of the Pacific without throwing light on them from all sides was, it seemed to me, like doing a puzzle and only using the pieces of one color.

Thor Heyerdahl, Kon-Tiki

This reminds me of a few of my consulting experiences, in which large firms’ departments jealously guarded their data, making global understanding or optimization impossible.

This is also common in public policy domains. There’s typically an abundance of micro research that doesn’t add up to much, because no one has bothered to build the corresponding macro theory, or to target the micro work at the questions you need to answer to build an integrative model.

An example: I’ve been working on STEM workforce issues – for DOE five years ago, and lately for another agency. There are a few integrated models of workforce dynamics – we built several, the BHEF has one, and I’ve heard of efforts at several aerospace firms and agencies like NIH and NASA. But the vast majority of education research we’ve been able to find is either macro correlation studies (not much causal theory, hard to operationalize for decision making) or micro examination of a zillion factors, some of which must really matter, but in a piecemeal approach that makes them impossible to integrate.

An integrated model needs three things: what, how, and why. The “what” is the state of the system – stocks of students, workers, teachers, etc. in each part of the system. Typically this is readily available – Census, NSF and AAAS do a good job of curating such data. The “how” is the flows that change the state. There’s not as much data on this, but at least there’s good tracking of graduation rates in various fields, and the flows actually integrate to the stocks. Outside the educational system, it’s tough to understand the matrix of flows among fields and economic sectors, and surprisingly difficult even to get decent measurements of attrition from a single organization’s personnel records. The glaring omission is the “why” – the decision points that govern the aggregate flows. Why do kids drop out of science? What attracts engineers to government service, or the finance sector, or leads them to retire at a given age? I’m sure there are lots of researchers who know a lot about these questions in small spheres, but there’s almost nothing about the “why” questions that’s usable in an integrated model.
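The what/how/why split can be made concrete with a minimal stock-flow sketch. This is not one of the workforce models mentioned above: the two stocks, the parameter values, and the `constant_dropout` rule are all hypothetical, chosen only to show where each kind of information plugs in. The "why" is the one piece written as a replaceable function, because that is the part the literature rarely supplies in usable form.

```python
def constant_dropout(students, workers):
    """Placeholder for the 'why': the fraction of students who leave without a degree.
    A real model would make this depend on the state of the system."""
    return 0.3  # hypothetical value


def simulate_workforce(years, dt=0.25, enrollment=1000.0, degree_time=4.0,
                       career_length=30.0, dropout_rule=constant_dropout):
    """Euler-integrate two stocks (the 'what') using flows (the 'how')."""
    students, workers = 0.0, 0.0
    for _ in range(int(years / dt)):
        leaving_school = students / degree_time           # people/year exiting school
        dropout_fraction = dropout_rule(students, workers)
        graduation = leaving_school * (1.0 - dropout_fraction)
        dropout = leaving_school * dropout_fraction
        attrition = workers / career_length               # people/year leaving the workforce
        # The flows integrate to the stocks.
        students += dt * (enrollment - graduation - dropout)
        workers += dt * (graduation - attrition)
    return students, workers


students, workers = simulate_workforce(years=600)
print(f"students: {students:.0f}, workers: {workers:.0f}")
```

Swapping `constant_dropout` for a rule that responds to the stocks is where research on the "why" questions would enter.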

I think the current situation is a result of practicality rather than a fundamental philosophical preference for analysis over synthesis. It’s just easier to create, fund and execute standalone micro research than it is to build integrated models.

The bad news is that vast amounts of detailed knowledge go to waste because they can’t be put into a framework that supports better decisions. The good news is that, for people who are inclined to tackle big problems with integrated models, there’s lots of material to work with and a high return to answering the key questions in a way that informs policy.

In search of SD conference excellence

I was pleasantly surprised by the quality of presentations I attended at the SD conference in St. Gallen. Many of the posters were also very good – the society seems to have been successful in overcoming the booby-prize stigma, making it a pleasure to graze on the often-excellent work in a compact format (if only the hors d’oeuvre line had had brevity to match its tastiness…).

In anticipation of an even better array of papers next year, here’s my quasi-annual reminder about resources for producing good work in SD:

I suppose I should add posts on good presentation technique and poster development (thoughts welcome).

Thanks to the organizers for a well-run enterprise in a pleasant venue.

The Capen Quiz at the System Dynamics Conference

I ran my updated Capen quiz at the beginning of my Vensim mini-course on optimization and uncertainty at the System Dynamics conference. The results were pretty typical – people expressed confidence bounds that were too narrow compared to their actual knowledge of the questions. Thus their effective confidence was at the 40% level rather than the 80% level desired. Here’s the distribution of actual scores from about 30 people, compared to a Binomial(10, 0.8) distribution:

(I’m going from memory here on the actual distribution, because I forgot to grab the flipchart of results. Did anyone take a picture? I won’t trouble you with my confidence bounds on the confidence bounds.)
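For reference, the Binomial(10, 0.8) benchmark is easy to tabulate. The sketch below prints it next to a Binomial(10, 0.4) distribution; the second curve is an assumption on my part (every respondent hitting each question with 40% probability), standing in for the flipchart data rather than reproducing it.

```python
from math import comb


def binomial_pmf(n, p, k):
    """Probability of exactly k hits in n independent questions, each hit with probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


N_QUESTIONS = 10
for label, p in [("well calibrated (80%)", 0.8), ("assumed effective rate (40%)", 0.4)]:
    pmf = [binomial_pmf(N_QUESTIONS, p, k) for k in range(N_QUESTIONS + 1)]
    mean_score = sum(k * q for k, q in enumerate(pmf))
    print(f"{label}: expected score {mean_score:.1f} out of {N_QUESTIONS}")
    for k, q in enumerate(pmf):
        # Crude text histogram: one '#' per two percentage points.
        print(f"  {k:2d} {'#' * round(q * 50):<16} {q:.3f}")
```

A well-calibrated group should almost never score 4 or below, which is why a cluster of scores around 4 signals overconfidence rather than bad luck.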

My take on this is that it’s simply very hard to be well-calibrated intuitively, unless you dedicate time for explicit contemplation of uncertainty. But it is a learnable skill – my kids, who had taken the original Capen quiz, managed to score 7 out of 10.

Even if you can get calibrated on a set of independent questions, real-world problems where dimensions covary are really tough to handle intuitively. This is yet another example of why you need a model.

Spot the health care smokescreen

A Tea Party presentation on health care making the rounds in Montana claims that life expectancy is a smoke screen, and it’s death rates we should be looking at. The implication is that we shouldn’t envy Japan’s longer life expectancy, because the US has lower death rates, indicating superior performance of our health care system.

Which metric really makes the most sense from a systems perspective?

Here’s a simple, second-order model of life and death:

From the structure, you can immediately observe something important: life expectancy is a function only of parameters, while the death rate also includes the system states. In other words, life expectancy reflects the expected life trajectory of a person, given structure and parameters, while the aggregate death rate weights parameters (cohort death rates) by the system state (the distribution of population between old and young).

In the long run, the two metrics tell you the same thing, because the system comes into equilibrium such that the death rate is the inverse of the life expectancy. But people live a long time, so it might take decades or even centuries to achieve that equilibrium. In the meantime, the death rate can take on any value between the death rates of the young and old cohorts, which is not really helpful for understanding what a new person can expect out of life.

So, to the extent that health care performance is visible in the system trajectory at all, and not confounded by lifestyle choices, life expectancy is the metric that tells you about performance, and the aggregate death rate is the smokescreen.

Here’s the model: LifeExpectancyDeathRate.mdl or LifeExpectancyDeathRate.vpm

It’s initialized in equilibrium. You can explore disequilibrium situations by varying the initial population distribution (Init Young People & Init Old People), or testing step changes in the death rates.
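For readers without Vensim, here is a rough Python sketch of the same idea. It is not a translation of the .mdl file: the two-stock structure (young people maturing into old people, with constant births) and every parameter value are assumptions made for illustration. It shows life expectancy computed from parameters alone, while the aggregate death rate starts wherever the initial population mix puts it and only slowly approaches the inverse of life expectancy.

```python
def life_expectancy(maturation_time, young_death_rate, old_death_rate):
    """Expected lifetime at birth; a function of parameters only, not of the stocks."""
    exit_rate_young = 1.0 / maturation_time + young_death_rate
    time_young = 1.0 / exit_rate_young
    p_reach_old = (1.0 / maturation_time) / exit_rate_young
    return time_young + p_reach_old / old_death_rate


def simulate(young, old, births, maturation_time, young_death_rate,
             old_death_rate, years, dt=0.125):
    """Euler-integrate the two stocks; return a list of (time, aggregate death rate)."""
    path = []
    for step in range(int(years / dt) + 1):
        deaths = young_death_rate * young + old_death_rate * old
        path.append((step * dt, deaths / (young + old)))  # state-weighted average
        maturing = young / maturation_time
        young += dt * (births - maturing - young_death_rate * young)
        old += dt * (maturing - old_death_rate * old)
    return path


# Hypothetical parameters (years and fractions per year).
params = dict(maturation_time=50.0, young_death_rate=0.002, old_death_rate=0.04)
expectancy = life_expectancy(**params)
# Start far from equilibrium: everyone is young.
path = simulate(young=1000.0, old=0.0, births=10.0, years=500, **params)

print(f"life expectancy: {expectancy:.1f} years, inverse: {1 / expectancy:.5f} per year")
for t, rate in path[::800]:
    print(f"year {t:5.0f}: aggregate death rate {rate:.5f} per year")
```

With an all-young starting population, the aggregate death rate begins at the young cohort's rate and takes centuries of simulated time to settle, even though life expectancy never changes.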

False positives, publication bias and systems models

A PLOS Medicine paper asserts that most published results are false.

It can be proven that most claimed research findings are false

Corollary 1: The smaller the studies conducted in a scientific field, the less likely the research findings are to be true.

Corollary 2: The smaller the effect sizes in a scientific field, the less likely the research findings are to be true.

Corollary 3: The greater the number and the lesser the selection of tested relationships in a scientific field, the less likely the research findings are to be true.

Corollary 4: The greater the flexibility in designs, definitions, outcomes, and analytical modes in a scientific field, the less likely the research findings are to be true.

Corollary 5: The greater the financial and other interests and prejudices in a scientific field, the less likely the research findings are to be true.

Corollary 6: The hotter a scientific field (with more scientific teams involved), the less likely the research findings are to be true.
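The arithmetic behind the first two corollaries is short enough to sketch. The function below computes the positive predictive value of a "significant" finding from the pre-study odds R that a tested relationship is true, the significance level, and the power, ignoring bias. The two scenarios are hypothetical illustrations, not figures from the paper.

```python
def positive_predictive_value(prior_odds, alpha=0.05, power=0.8):
    """Share of claimed findings that are true, with no bias.

    prior_odds: ratio R of true to false relationships among those tested.
    alpha: probability of a false positive when the relationship is false.
    power: probability of detecting the relationship when it is true (1 - beta).
    """
    true_positives = power * prior_odds  # per false relationship tested
    false_positives = alpha
    return true_positives / (true_positives + false_positives)


# Hypothetical scenarios for illustration only.
scenarios = {
    "well-powered test of a plausible hypothesis": dict(prior_odds=1.0, power=0.8),
    "small exploratory study, many candidates": dict(prior_odds=0.01, power=0.2),
}
for name, kwargs in scenarios.items():
    print(f"{name}: PPV = {positive_predictive_value(**kwargs):.3f}")
```

A finding is more likely true than false only when power times R exceeds alpha, so low power (small studies, small effects) and long-shot hypotheses both push the PPV below one half.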

This somewhat alarming result arises from fairly simple statistics of false positives, publication selection bias, and causation vs. correlation problems. While the math is incontrovertible, some of the assumptions have been challenged:

… calculating the unreliability of the medical research literature, in whole or in part, requires more empirical evidence and different inferential models than were used. The claim that “most research findings are false for most research designs and for most fields” must be considered as yet unproven.

Still, the argument seems to be a matter of how much rather than whether publication bias influences findings:

We agree with the paper’s conclusions and recommendations that many medical research findings are less definitive than readers suspect, that P-values are widely misinterpreted, that bias of various forms is widespread, that multiple approaches are needed to prevent the literature from being systematically biased and the need for more data on the prevalence of false claims.

(Others propose similar challenges. There’s conflicting literature about whether (weak) observational studies hold up with (strong) randomized follow-up trials.)

This is obviously a big problem from a control perspective, because the kind of information provided by the studies in question is key to managing many systems, as in Nancy Leveson’s pharma safety example.

It also leads me to a rather pointed self-question. To what extent is typical system dynamics modeling practice subject to the same kinds of biases? Can we say not only that all models are wrong, but that most are useless?

First the good news.

  • SD doesn’t usually operate in the data mining space, where large observational studies seek effects absent any a priori causal theory. That means we’re not operating where false positives are most likely to arise.
  • Often, SD practitioners are not testing our own pet theories, but those of some decision makers – perhaps even theories of competing interests in an organization.
  • SD models play a “knowledge integration” role that’s somewhat analogous to meta-analysis. A meta-analysis pools the statistics from a number of replications of some observation, which improves the signal to noise ratio, making it easier to see whether there’s any baby in the bathwater. An SD model instead pools the effect sizes of inputs (studies or anecdotes) and puts them to a functional test: do the individual components, assembled into a system, yield the observed behavior of the macro system?
  • Similarly, good SD modelers tend to supplement purely statistical inputs with Reality Checks that effectively provide additional data verification by testing extreme conditions where outcomes are known (though this is not helpful if you don’t know anything about relationships to begin with).
  • Including physics (using the term loosely to include things like conservation of people) in models also greatly constrains the space of plausible hypotheses a priori.

Now the bad news.

  • Models are often used in one-off, non-replicable strategic decision making situations, so we’ll never know. Refereed forecasting helps, but success can still be due to luck rather than skill.
  • We often have to formalize soft variable concepts for which definitions are uncertain and measurements are lacking.
  • SD models are often reliant on thin literature bases, small studies, or subject matter expertise to establish relationships. Studies with randomized control are a rarity.
  • Available data for model verification is often of low quality and short duration.
  • Data can provide a weak check on the model – if a system exhibits exponential growth, for example, one positive feedback loop in the dynamic hypothesis is as good as another (though of course good a priori explanations of the structure of the system help).

My suspicion is that savvy modelers are already well aware of just how messy and uncertain their problem domains are. Decisions will be taken, with or without a model, so the real objective is to use the model to add value by rejecting ideas that don’t work. The problem then is not that wrong models make decisions worse, but that we could probably do a lot better if we could be smarter about the possible biases in models and thinking in general.

Alex Tabarrok at Marginal Revolution has a nice take on remedies:

What can be done about these problems? (Some cribbed straight from Ioannidis and some my own suggestions.)

1) In evaluating any study try to take into account the amount of background noise. That is, remember that the more hypotheses which are tested and the less selection which goes into choosing hypotheses the more likely it is that you are looking at noise.

2) Bigger samples are better. (But note that even big samples won’t help to solve the problems of observational studies which is a whole other problem).

3) Small effects are to be distrusted.

4) Multiple sources and types of evidence are desirable.

5) Evaluate literatures not individual papers.

6) Trust empirical papers which test other people’s theories more than empirical papers which test the author’s theory.

7) As an editor or referee, don’t reject papers that fail to reject the null.

For SD modeling, I’d add a few more:

8) Reserve time for exploration of uncertainty (lots of Monte Carlo simulation).

9) Calibrate your confidence bounds.

10) Help clients to appreciate the extent and implications of uncertainty.

11) Pay attention to the language used to describe statistical concepts. Words like “expectation” and “significance” that have specific mathematical interpretations don’t mean the same thing to managers.

12) Look for robust policies that work irrespective of uncertain relationships.

13) Explicitly seek out and test alternative hypotheses (This sounds like it’s at odds with Corollary 3 above, but I think it’s the right thing to do. Testing multiple hypotheses in the context of the model is not the same thing as mining data for multiple relationships.).

14) If you can’t estimate something directly from data, or back it up with literature (more than a single paper), at least articulate some bounds on the effect, perhaps through experiments with a submodel.

What do you think? When are modeling and statistical analysis helpful, and when are they risky business?

Thinking systemically about safety

Accidents involve much more than the reliability of parts. Safety emerges from the systemic interactions of devices, people and organizations. Nancy Leveson’s Engineering a Safer World (free pdf currently at the MIT Press link, lower left) picks up many of the threads in Perrow’s classic Normal Accidents, plus much more, and weaves them into a formal theory of systems safety. It comes to life with many interesting examples and prescriptions for best practice.

So far, I’ve only had time to read this the way I read the New Yorker (cartoons first), but a few pictures give a sense of the richness of systems perspectives that are brought to bear on the problems of safety:

[Figure: Leveson - Pharma safety]

[Figure: Leveson - Safety as control]

[Figure: Leveson - Aviation information flow]

The contrast between the figure above and the one that follows in the book, showing links that were actually in place, is striking. (I won’t spoil the surprise – you’ll have to go look for yourself.)

[Figure: Leveson - Columbia disaster]

Facebook reloaded

Facebook trading opened with its IPO and closed at a $105 billion market capitalization.

I wondered how my model tracked reality over the last six months.

Facebook stats put users at 901 million at the end of March. My maximum likelihood run was rather lower than that – it corresponds with the K950 run in my last post (saturation users of 950 million), and predicted 840M users for end of Q1 2012. The latest data point corresponds with my K1250 run. I’m not sure if it’s interesting or not, but the new data point is a bit of an outlier. For one thing, it’s reported to the nearest million at a precise time, not with aggressive rounding as in earlier numbers I’d found. Re-estimating the model with the new, precise data point, it’s necessary to pass on the high side of most of the data from 2008-2011. That seems a bit fishy – perhaps a change in reporting methods has occurred.

In any case, it hardly matters whether the user carrying capacity is a bit over or under a billion. Either way, the valuation with current revenue per user is on the order of $20 billion. I had picked $5/user/year based on past performance, which turned out to be very close to the 2011 actuals. It would take a 10-year ramp to 7x current revenue/user to justify current pricing, or very low interest rates and risk premiums.
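The shape of that argument can be sketched with a back-of-envelope discounted cash flow. This is not the valuation model from the earlier post: the 40% profit margin, 10% discount rate, constant one billion users, and linear ramp are all hypothetical, picked so that the flat case lands on the order of $20 billion. Only the $5/user/year, the 7x multiple, and the 10-year ramp come from the text.

```python
def present_value(users, revenue_per_user, revenue_multiple=1.0, ramp_years=10,
                  margin=0.4, discount_rate=0.10, horizon_years=300):
    """Discounted profit stream, in dollars.

    Revenue per user ramps linearly to revenue_multiple times its starting value
    over ramp_years, then stays flat. margin and discount_rate are assumptions.
    The long horizon approximates a perpetuity.
    """
    value = 0.0
    for year in range(1, horizon_years + 1):
        ramp = min(year / ramp_years, 1.0)
        rpu = revenue_per_user * (1.0 + (revenue_multiple - 1.0) * ramp)
        profit = users * rpu * margin
        value += profit / (1.0 + discount_rate) ** year
    return value


USERS = 1e9  # roughly the carrying capacity discussed above
flat = present_value(USERS, revenue_per_user=5.0)
ramped = present_value(USERS, revenue_per_user=5.0, revenue_multiple=7.0, ramp_years=10)
print(f"flat revenue per user:  ${flat / 1e9:.0f} billion")
print(f"10-year ramp to 7x:     ${ramped / 1e9:.0f} billion")
```

Lowering `discount_rate` raises both numbers sharply, which is the "very low interest rates and risk premiums" route to the same price.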

So the real question is, can Facebook increase its revenue per user dramatically?

Another short sell opportunity?

“I have no interest in shorting a cultural phenomenon,” hedge fund manager Jeffrey Matthews of Ram Partners in Greenwich, Connecticut, told Reuters in an email interview.

Asked if this was because such stocks trade without regard to normal market valuation, he wrote back, “Bingo.”