Posts Tagged ‘Rasch’

Psychology and the social sciences: An atheoretical, scattered, and disconnected body of research

February 16, 2019

A new article in Nature Human Behaviour (NHB) points toward the need for better theory and more rigorous mathematical models in psychology and the social sciences (Muthukrishna & Henrich, 2019). The authors rightly say that the lack of an overarching cumulative theoretical framework makes it very difficult to see whether new results fit well with previous work, or if something surprising has come to light. Mathematical models are especially emphasized as being of value in specifying clear and precise expectations.

The point that the social sciences and psychology need better theories and models is painfully obvious. But there are in fact thousands of published studies and practical real-world applications that not only provide, but indeed often surpass, the kinds of predictive theories and mathematical models called for in the NHB article. Not only does the article make no mention of any of this work, but its argument is also framed entirely in a statistical context instead of the more appropriate context of measurement science.

The concept of reliability provides an excellent point of entry. Most behavioral scientists think of reliability statistically, as a coefficient with a positive numeric value usually between 0.00 and 1.00. The tangible sense of reliability as indicating exactly how predictable an outcome is does not usually figure in most researchers’ thinking. But that sense of the specific predictability of results has been the focus of attention in social and psychological measurement science for decades.

For instance, the measurement of time is reliable in the sense that the position of the sun relative to the earth can be precisely predicted from geographic location, the time of day, and the day of the year. The numbers and words assigned to noon are closely associated with the sun being at its high point in the sky (though there are political variations by season and location across time zones).

That kind of a reproducible association is rarely sought in psychology and the social sciences, but it is far from nonexistent. One can discern different degrees to which that kind of association is included in models of measured constructs. Though most behavioral research doesn’t mention the connection between linear amounts of a measured phenomenon and a reproducible numeric representation of it (level 0), quite a significant body of work focuses on that connection (level 1). The disappointing thing about that level 1 work is that the relentless obsession with statistical methods prevents most researchers from connecting a reproducible quantity with a single expression of it in a standard unit, and with an associated uncertainty term (level 2). That is, level 1 researchers conceive measurement in statistical terms, as a product of data analysis. Even when results across data sets are highly correlated and could be equated to a common metric, level 1 researchers do not leverage that source of potential value for simplified communication and accumulated comparability.

And then, for their part, level 2 researchers usually do not articulate theories of the measured constructs that augment the mathematical data model with an explanatory model predicting variation (level 3). Level 2 researchers are empirically grounded in data, and can expand their network of measures only by gathering more data and analyzing it in ways that bring it into their standard unit’s frame of reference.

Level 3 researchers, however, have come to see what makes their measures tick. They understand the mechanisms that make their questions vary. They can write new questions to their theoretical specifications, test those questions by asking them of a relevant sample, and produce the predicted calibrations. For instance, reading comprehension is well established to be a function of the difference between a person’s reading ability and the complexity of the text they encounter (see articles by Stenner in the list below). We have built our entire educational system around this idea, as we deliberately introduce children first to the alphabet, then to the most common words, then to short sentences, and then to ever longer and more complicated text. But stating the construct model, testing it against data, calibrating a unit to which all tests and measures can be traced, and connecting together all the books, articles, tests, curricula, and students is a process that began (in English and Spanish) only in the 1980s. The process still is far from finished, and most reading research still does not use the common metric.

In this kind of theory-informed context, new items can be automatically generated on the fly at the point of measurement. Those items and inferences made from them are validated by the consistency of the responses and the associated expression of the expected probability of success, agreement, etc. The expense of constant data gathering and analysis can be cut to a very small fraction of what it is at levels 0-2.

Level 3 research methods are not widely known or used, but they are not new. They are gaining traction as their use by national metrology institutes globally grows. As high profile critiques of social and psychological research practices continue to emerge, perhaps more attention will be paid to this important body of work. A few key references are provided below, and virtually every post in this blog pertains to these issues.

References

Baghaei, P. (2008). The Rasch model as a construct validation tool. Rasch Measurement Transactions, 22(1), 1145-1146 [http://www.rasch.org/rmt/rmt221a.htm].

Bergstrom, B. A., & Lunz, M. E. (1994). The equivalence of Rasch item calibrations and ability estimates across modes of administration. In M. Wilson (Ed.), Objective measurement: Theory into practice, Vol. 2 (pp. 122-128). Norwood, New Jersey: Ablex.

Cano, S., Pendrill, L., Barbic, S., & Fisher, W. P., Jr. (2018). Patient-centred outcome metrology for healthcare decision-making. Journal of Physics: Conference Series, 1044, 012057.

Dimitrov, D. M. (2010). Testing for factorial invariance in the context of construct validation. Measurement & Evaluation in Counseling & Development, 43(2), 121-149.

Embretson, S. E. (2010). Measuring psychological constructs: Advances in model-based approaches. Washington, DC: American Psychological Association.

Fischer, G. H. (1973). The linear logistic test model as an instrument in educational research. Acta Psychologica, 37, 359-374.

Fischer, G. H. (1983). Logistic latent trait models with linear constraints. Psychometrika, 48(1), 3-26.

Fisher, W. P., Jr. (1992). Reliability statistics. Rasch Measurement Transactions, 6(3), 238 [http://www.rasch.org/rmt/rmt63i.htm].

Fisher, W. P., Jr. (2008). The cash value of reliability. Rasch Measurement Transactions, 22(1), 1160-1163 [http://www.rasch.org/rmt/rmt221.pdf].

Fisher, W. P., Jr., & Stenner, A. J. (2016). Theory-based metrological traceability in education: A reading measurement network. Measurement, 92, 489-496.

Green, S. B., Lissitz, R. W., & Mulaik, S. A. (1977). Limitations of coefficient alpha as an index of test unidimensionality. Educational and Psychological Measurement, 37(4), 827-833.

Hattie, J. (1985). Methodology review: Assessing unidimensionality of tests and items. Applied Psychological Measurement, 9(2), 139-164.

Hobart, J. C., Cano, S. J., Zajicek, J. P., & Thompson, A. J. (2007). Rating scales as outcome measures for clinical trials in neurology: Problems, solutions, and recommendations. Lancet Neurology, 6, 1094-1105.

Irvine, S. H., Dunn, P. L., & Anderson, J. D. (1990). Towards a theory of algorithm-determined cognitive test construction. British Journal of Psychology, 81, 173-195.

Kline, T. L., Schmidt, K. M., & Bowles, R. P. (2006). Using LinLog and FACETS to model item components in the LLTM. Journal of Applied Measurement, 7(1), 74-91.

Lunz, M. E., & Linacre, J. M. (2010). Reliability of performance examinations: Revisited. In M. Garner, G. Engelhard, Jr., W. P. Fisher, Jr. & M. Wilson (Eds.), Advances in Rasch Measurement, Vol. 1 (pp. 328-341). Maple Grove, MN: JAM Press.

Mari, L., & Wilson, M. (2014). An introduction to the Rasch measurement approach for metrologists. Measurement, 51, 315-327.

Markward, N. J., & Fisher, W. P., Jr. (2004). Calibrating the genome. Journal of Applied Measurement, 5(2), 129-141.

Maul, A., Mari, L., Torres Irribarra, D., & Wilson, M. (2018). The quality of measurement results in terms of the structural features of the measurement process. Measurement, 116, 611-620.

Muthukrishna, M., & Henrich, J. (2019). A problem in theory. Nature Human Behaviour, 1-9.

Obiekwe, J. C. (1999, August 1). Application and validation of the linear logistic test model for item difficulty prediction in the context of mathematics problems. Dissertation Abstracts International: Section B: The Sciences & Engineering, 60(2-B), 0851.

Pendrill, L. (2014). Man as a measurement instrument [Special Feature]. NCSLi Measure: The Journal of Measurement Science, 9(4), 22-33.

Pendrill, L., & Fisher, W. P., Jr. (2015). Counting and quantification: Comparing psychometric and metrological perspectives on visual perceptions of number. Measurement, 71, 46-55.

Pendrill, L., & Petersson, N. (2016). Metrology of human-based and other qualitative measurements. Measurement Science and Technology, 27(9), 094003.

Sijtsma, K. (2009). Correcting fallacies in validity, reliability, and classification. International Journal of Testing, 8(3), 167-194.

Sijtsma, K. (2009). On the use, the misuse, and the very limited usefulness of Cronbach’s alpha. Psychometrika, 74(1), 107-120.

Stenner, A. J. (2001). The necessity of construct theory. Rasch Measurement Transactions, 15(1), 804-805 [http://www.rasch.org/rmt/rmt151q.htm].

Stenner, A. J., Fisher, W. P., Jr., Stone, M. H., & Burdick, D. S. (2013). Causal Rasch models. Frontiers in Psychology: Quantitative Psychology and Measurement, 4(536), 1-14.

Stenner, A. J., & Horabin, I. (1992). Three stages of construct definition. Rasch Measurement Transactions, 6(3), 229 [http://www.rasch.org/rmt/rmt63b.htm].

Stenner, A. J., Stone, M. H., & Fisher, W. P., Jr. (2018). The unreasonable effectiveness of theory based instrument calibration in the natural sciences: What can the social sciences learn? Journal of Physics: Conference Series, 1044, 012070.

Stone, M. H. (2003). Substantive scale construction. Journal of Applied Measurement, 4(3), 282-297.

Wilson, M. (2005). Constructing measures: An item response modeling approach. Mahwah, New Jersey: Lawrence Erlbaum Associates.

Wilson, M. R. (2013). Using the concept of a measurement system to characterize measurement models used in psychometrics. Measurement, 46, 3766-3774.

Wright, B. D., & Stone, M. H. (1979). Chapter 5: Constructing a variable. In Best test design: Rasch measurement (pp. 83-128). Chicago, Illinois: MESA Press.

Wright, B. D., & Stone, M. H. (1999). Measurement essentials. Wilmington, DE: Wide Range, Inc. [http://www.rasch.org/measess/me-all.pdf].

Wright, B. D., Stone, M., & Enos, M. (2000). The evolution of meaning in practice. Rasch Measurement Transactions, 14(1), 736 [http://www.rasch.org/rmt/rmt141g.htm].


New Ideas on How to Realize the Purpose of Capital

September 20, 2018

I’d like to offer the following in reply to James Militzer, at https://nextbillion.net/deciphering-emersons-tears-time-impact-investing-lower-expectations/.

Rapid advances toward impact investing’s highest goals of social transformation are underway in quiet technical work being done in places no one is looking. That work shares Jed Emerson’s sentiments expressed at the 2017 Social Capital Markets conference, as he is quoted in Militzer’s NextBillion.net posting, that “The purpose of capital is to advance a more progressively free and just experience of life for all.” And he is correct in what Militzer reported he said the year before, that we need a “real, profound critique of current practices within financial capitalism,” one that would “require real change in our own behavior aside from adding a few funds to our portfolios here or augmenting a reporting process there.”

But the efforts he and others are making toward fulfilling that purpose and articulating that critique are incomplete, insufficient, and inadequate. Why? How? Language is the crux of the matter, and the issues involved are complex and technical. The challenge, which may initially seem simplistic or naive, is how to bring human, social, and environmental values into words. Not just any words, but meaningful words in a common language. What is most challenging is that this language, like any everyday language, has to span the range from abstract theoretical ideals to concrete local improvisations.

That means it cannot be like our current languages for expressing human, social, and environmental value. If we are going to succeed in aligning those forms of value with financial value, we have a lot of work to do.

Though there is endless talk of metrics for managing sustainable impacts, and though the importance of these metrics for making sustainability manageable is also a topic of infinite discussion, almost no one takes the trouble to seek out and implement the state of the art in measurement science. This is a crucial way, perhaps the most essential way, in which we need to criticize current practices within financial capitalism and change our behaviors. Oddly, almost no one seems to have thought of that.

That is, one of the most universally unexamined assumptions of our culture is that numbers automatically stand for quantities. People who analyze numeric data are called quants, and all numeric data analysis is referred to as quantitative. That is the case, but almost none of these quants and quantitative methods involve actually defining, modeling, identifying, evaluating, or applying a substantive unit of something real in the world that can be meaningfully represented by numbers.

There is, of course, an extensive and longstanding literature on exactly this science of measurement. It has been a topic of research, philosophy, and practical applications for at least 90 years, going back to the work of Thurstone at the University of Chicago in the 1920s. That work continued at the University of Chicago with Rasch’s visit there in 1960, with Wright’s adoption and expansion of Rasch’s theory and methods, and with the further work done by Wright’s students and colleagues in the years since.

Most importantly, over the last ten years, metrologists, the physicists and engineers who maintain and improve the SI units, the metric system, have taken note of what’s been going on in research and practice involving the approaches to measurement developed by Rasch, Wright, and their students and colleagues (for just two of many articles in this area, see here and here). The most recent developments in this new metrology include

(a) initiatives at national metrology institutes globally (Sweden and the UK, Portugal, Ukraine, among others) to investigate potentials for a new class of unit standards;

(b) a special session on this topic at the International Measurement Confederation (IMEKO) World Congress in Belfast on 5 September 2018;

(c) the Journal of Physics: Conference Series proceedings of the 2016 IMEKO Joint Symposium hosted by Mark Wilson and me at UC Berkeley;

(d) the publication of a 2017 book on Ben Wright edited by Mark Wilson and me in Springer’s Series on Measurement Science and Technology; and

(e) the forthcoming October 2018 special issue of Elsevier’s Measurement journal edited by Wilson and me, and a second one currently in development.

There are profound differences between today’s assumptions about measurement and how a meaningful art and science of precision measurement proceeds. What passes for measurement in today’s sustainability economics and accounting are counts, percentages, and ratings. These merely numeric metrics do not stand for anything that adds up the way they do. In fact, it’s been repeatedly demonstrated over many years that these kinds of metrics measure in a unit that changes size depending on who or what is measured, who is measuring, and what tool is used to measure. What makes matters even worse is that the numbers are usually taken to be perfectly precise, as uncertainty ranges, error terms, and confidence intervals are only sporadically provided and are usually omitted.

Measurement is not primarily a matter of data analysis. Measurement requires calibrated instruments that can be read as standing for a given amount of something that stays the same, within the uncertainty range, no matter who is measuring, no matter what or who is measured, and no matter what tool is used. This is, of course, quite an accomplishment when it can be achieved, but it is not impossible and has been put to use in large scale practical ways for several decades (for instance, see here, here, and here). Universally accessible instruments calibrated to common unit standards are what make society in general, and markets in particular, efficient in the way of projecting distributed network effects, turning communities into massively parallel stochastic computers (as W. Brian Arthur put it on p. 6 of his 2014 book, Complexity Economics).

These are not unexamined assumptions or overly ideal theoretical demands. They are pragmatic ways of adapting to emergent patterns in various kinds of data that have repeatedly been showing themselves around the world for decades. Our task is to literally capitalize on these nonhuman forms of life by creating multilevel, complex ecosystems of relationships with them, letting them be what they are in ways that also let us represent ourselves to each other. (Emerson quotes Bruno Latour to this effect on page 136 in his new book, The Purpose of Capital; those familiar with my work will know I’ve been reading and citing Latour since the early 1980s).

So it seems to me that, however well-intentioned those promoting impact investing may be, there is little awareness of just how profound and sweeping the critique of current practices needs to be, or of just how much our own behaviors are going to have to change. There are, however, truly significant reasons to be optimistic and hopeful. The technical work being done in measurement and metrology points toward possibilities for extending everyday language into a pragmatic idealism that does not require caving in to either varying local circumstances or to authoritarian dictates.

The upside of the situation is that, as so often happens in the course of human history, this critique and the associated changes are likely to have that peculiar quality captured in the French expression, “plus ça change, plus c’est la même chose” (the more things change, the more they stay the same). The changes in process are transformative, but will also be recognizable repetitions of human scale patterns.

In sum, what we are doing is tuning the instruments of the human, social, and environmental sciences to better harmonize relationships. Just as jazz, folk, and world music show that creative improvisation is not constrained by–but is facilitated by–tuning standards and high tech solutions, so, too, can we make that the case in other areas.

For instance, in my presentation at the IMEKO World Congress in Belfast on 5 September, I showed that the integration of beauty and meaning we have within our grasp reiterates principles that date back to Plato. The aesthetics complement the mathematics, with variations on the same equations being traceable from the Pythagorean theorem to Newton’s laws to Rasch’s models for measurement (see, for instance, Fisher & Stenner, 2013). In many ways, the history of science and philosophy continues to be a footnote to Plato.


Evaluating Questionnaires as Measuring Instruments

June 23, 2018

An email came in today asking whether three different short (4- and 5-item) questionnaires could be expected to provide reasonable quality measurement. Here’s my response.

—–

Thanks for raising this question. The questionnaires plainly were not designed to provide data suitable for measurement. Though much can be learned about making constructs measurable from data produced by this kind of questionnaire, “Rasch analysis” cannot magically create a silk purse from a sow’s ear (as the old expression goes). Use Linacre’s (1993) generalizability theory nomograph to see what reliabilities are expected for each subscale, given the numbers of items and rating categories, and applying a conservative estimate of the adjusted standard deviations (1.0 logit, for instance). Convert the reliability coefficients into strata (Fisher, 1992, 2008; Wright & Masters, 1982, pp. 92, 105-106) to make the practical meaning of the precision obtained obvious.
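As a minimal sketch of that conversion in Python (the function names are illustrative; the separation and strata formulas are those given in the sources just cited):

    import math

    def separation(reliability):
        # G: the ratio of the true (adjusted) standard deviation of the
        # measures to their root mean square error
        return math.sqrt(reliability / (1.0 - reliability))

    def strata(reliability):
        # H = (4G + 1) / 3: the number of statistically distinct levels
        # of measures separated by at least three errors of measurement
        return (4.0 * separation(reliability) + 1.0) / 3.0

    print(strata(0.80))  # G = 2.0, so H = 3.0: about three distinct strata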

So if you have data, analyze it and compare the expected and observed reliabilities. If the uncertainties are quite different, is that because of targeting issues? But before you do that, ask experts in the area to rank order:

  • the courses by relevance to the job;
  • the evaluation criteria from easy to hard; and
  • the skills/competencies in order of importance to job performance.

Then study the correspondence between the rankings and the calibration results. Where do they converge and diverge? Why? What’s unexpected? What can be learned?
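One simple way to quantify that correspondence, sketched here in Python with purely hypothetical rankings and calibrations, is a rank-order correlation:

    from scipy.stats import spearmanr

    expert_ranks = [1, 2, 3, 4, 5]                   # hypothetical expert ordering, easiest to hardest
    item_calibrations = [-1.2, -0.5, 0.1, 0.4, 1.3]  # hypothetical item calibrations in logits

    rho, p_value = spearmanr(expert_ranks, item_calibrations)
    print(rho)  # values near 1.0 mean the experts and the data order the items alike

Convergence supports the construct interpretation; divergence flags the items worth the closer study suggested above.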

Analyze all of the items in each area (student, employer, instructor) together in Winsteps and study each of the three tables 23.x, setting PRCOMP=S. Remember that the total variance explained is not interpreted simply in terms of “more is better” and that the total variance explained is not as important as the ratio of that variance to the variance in the first contrast (see Linacre, 2006, 2008). If the ratio is greater than 3, the scale is essentially unidimensional (though significant problems may remain to be diagnosed and corrected).

Common practice holds that unexplained-variance eigenvalues should be less than 1.5, but this overly simplistic rule of thumb (Chou & Wang, 2010; Raîche, 2005) has been contradicted in practice many times. Even if one or more eigenvalues exceed 1.5, theory may say the items belong to the same construct, and the disattenuated correlations of the measures implied by the separate groups of items (provided in tables 23.x) may still approach 1.00, indicating that the same measures are produced across subscales. See Green (1996) and Smith (1996), among others, for more on this.
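A minimal sketch of those two checks in Python, with illustrative numbers standing in for values read off the Winsteps tables:

    import math

    def unidimensionality_ratio(variance_explained, variance_first_contrast):
        # Per the heuristic above: a ratio greater than 3 suggests the
        # scale is essentially unidimensional
        return variance_explained / variance_first_contrast

    def disattenuated_correlation(observed_r, reliability_a, reliability_b):
        # Classical correction for attenuation; values approaching 1.00
        # indicate the item clusters imply the same measures
        return observed_r / math.sqrt(reliability_a * reliability_b)

    print(unidimensionality_ratio(41.0, 8.2))           # 5.0
    print(disattenuated_correlation(0.81, 0.85, 0.90))  # about 0.93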

If subscales within each of the three groups of items are markedly different in the measures they produce, then separate them in different analyses. If these further analyses reveal still more multidimensionalities, it’s time to go back to the drawing board, given how short these scales are. If you define a plausible scale, study the item difficulty orders closely with one or more experts in the area. If there is serious interest in precision measurement and its application to improved management, and not just a bureaucratic need for data to satisfy empty demands for a mere appearance of quality assessment, then trace the evolution of the construct as it changes from less to more across the items.

What, for instance, is the common theme addressed across the courses that makes them all relevant to job performance? The courses were each created with an intention and they were brought together into a curriculum for a purpose. These intentions and purposes are the raw material of a construct theory. Spell out the details of how the courses build competency in translation.

Furthermore, I imagine that this curriculum, by definition, was set up to be effective in training students no matter who is in the courses (within the constraints of the admission criteria), and no matter which particular challenges relevant to job performance are sampled from the universe of all possible challenges. You will recognize these unexamined and unarticulated assumptions as what need to be explicitly stated as hypotheses informing a model of the educational enterprise. This model transforms implicit assumptions into requirements that are never fully satisfied but can be very usefully approximated.

As I’ve been saying for a long time (Fisher, 1989), please do not accept the shorthand language of references to “the Rasch model”, “Rasch scaling”, “Rasch analysis”, etc. Rasch did not invent the form of these models, which are at least as old as Plato. And measurement is not a function of data analysis. Data provide experimental evidence testing model-based hypotheses concerning construct theories. When explanatory theory corroborates and validates data in calibrated instrumentation, the instrument can be applied at the point of use with no need for data analysis, to produce measures, uncertainty (error) estimates, and graphical fit assessments (Connolly, Nachtman, & Pritchett, 1971; Davis et al., 2008; Fisher, 2006; Fisher, Harvey, & Kilgore, 1995; Linacre, 1997; many others).

So instead of using those common shorthand phrases, please speak directly to the problem of modeling the situation in order to produce a practical tool for managing it.

Further information is available in the references below.

Aryadoust, S. V. (2009). Mapping Rasch-based measurement onto the argument-based validity framework. Rasch Measurement Transactions, 23(1), 1192-1193 [http://www.rasch.org/rmt/rmt231.pdf].

Chang, C.-H. (1996). Finding two dimensions in MMPI-2 depression. Structural Equation Modeling, 3(1), 41-49.

Chou, Y. T., & Wang, W. C. (2010). Checking dimensionality in item response models with principal component analysis on standardized residuals. Educational and Psychological Measurement, 70, 717-731.

Connolly, A. J., Nachtman, W., & Pritchett, E. M. (1971). Keymath: Diagnostic Arithmetic Test. Circle Pines, Minnesota: American Guidance Service. Retrieved 23 June 2018 from https://images.pearsonclinical.com/images/pa/products/keymath3_da/km3-da-pub-summary.pdf

Davis, A. M., Perruccio, A. V., Canizares, M., Tennant, A., Hawker, G. A., Conaghan, P. G. et al. (2008, May). The development of a short measure of physical function for hip OA HOOS-Physical Function Shortform (HOOS-PS): An OARSI/OMERACT initiative. Osteoarthritis Cartilage, 16(5), 551-559.

Fisher, W. P., Jr. (1989). What we have to offer. Rasch Measurement Transactions, 3(3), 72 [http://www.rasch.org/rmt/rmt33d.htm].

Fisher, W. P., Jr. (1992). Reliability statistics. Rasch Measurement Transactions, 6(3), 238 [http://www.rasch.org/rmt/rmt63i.htm].

Fisher, W. P., Jr. (2006). Survey design recommendations [expanded from Fisher, W. P. Jr. (2000) Popular Measurement, 3(1), pp. 58-59]. Rasch Measurement Transactions, 20(3), 1072-1074 [http://www.rasch.org/rmt/rmt203.pdf].

Fisher, W. P., Jr. (2008). The cash value of reliability. Rasch Measurement Transactions, 22(1), 1160-1163 [http://www.rasch.org/rmt/rmt221.pdf].

Fisher, W. P., Jr., Harvey, R. F., & Kilgore, K. M. (1995). New developments in functional assessment: Probabilistic models for gold standards. NeuroRehabilitation, 5(1), 3-25.

Green, K. E. (1996). Dimensional analyses of complex data. Structural Equation Modeling, 3(1), 50-61.

Linacre, J. M. (1993). Rasch-based generalizability theory. Rasch Measurement Transactions, 7(1), 283-284 [http://www.rasch.org/rmt/rmt71h.htm].

Linacre, J. M. (1997). Instantaneous measurement and diagnosis. Physical Medicine and Rehabilitation State of the Art Reviews, 11(2), 315-324 [http://www.rasch.org/memo60.htm].

Linacre, J. M. (1998). Detecting multidimensionality: Which residual data-type works best? Journal of Outcome Measurement, 2(3), 266-283.

Linacre, J. M. (1998). Structure in Rasch residuals: Why principal components analysis? Rasch Measurement Transactions, 12(2), 636 [http://www.rasch.org/rmt/rmt122m.htm].

Linacre, J. M. (2003). PCA: Data variance: Explained, modeled and empirical. Rasch Measurement Transactions, 17(3), 942-943 [http://www.rasch.org/rmt/rmt173g.htm].

Linacre, J. M. (2006). Data variance explained by Rasch measures. Rasch Measurement Transactions, 20(1), 1045 [http://www.rasch.org/rmt/rmt201a.htm].

Linacre, J. M. (2008). PCA: Variance in data explained by Rasch measures. Rasch Measurement Transactions, 22(1), 1164 [http://www.rasch.org/rmt/rmt221j.htm].

Raîche, G. (2005). Critical eigenvalue sizes in standardized residual Principal Components Analysis. Rasch Measurement Transactions, 19(1), 1012 [http://www.rasch.org/rmt/rmt191h.htm].

Schumacker, R. E., & Linacre, J. M. (1996). Factor analysis and Rasch. Rasch Measurement Transactions, 9(4), 470 [http://www.rasch.org/rmt/rmt94k.htm].

Smith, E. V., Jr. (2002). Detecting and evaluating the impact of multidimensionality using item fit statistics and principal component analysis of residuals. Journal of Applied Measurement, 3(2), 205-231.

Smith, R. M. (1996). A comparison of methods for determining dimensionality in Rasch measurement. Structural Equation Modeling, 3(1), 25-40.

Wright, B. D. (1996). Comparing Rasch measurement and factor analysis. Structural Equation Modeling, 3(1), 3-24.

Wright, B. D., & Masters, G. N. (1982). Rating scale analysis: Rasch measurement. Chicago, Illinois: MESA Press.

Revisiting Hayek’s Relevance to Measurement

May 31, 2018

As so often happens, I’m finding new opportunities for restating what seems obvious to me but does not impact others in the way it ought to. The work of the Austrian economist Friedrich Hayek has always seemed to me to self-evidently express ideas of fundamental value and interest. Reviewing his work again lately has opened it up to a new level of detail that is worth sharing here.

Hayek (1948, p. 54) is onto a key point about measurement and its role in economics when he says:

…the spontaneous actions of individuals will, under conditions which we can define, bring about a distribution of resources which can be understood as if it were made according to a single plan, although nobody has planned it…

Decades of measurement research show that individuals’ spontaneous responses to assessment and survey questions conform to one another in ways that might appear to have been centrally organized according to a single plan. But over and over again, the same patterns are produced with no effort made to guide or coerce the responses into conforming in that way.

The results of testing and assessment produced in educational measurement can be expressed in economic terms fitting quite well with Hayek’s observation. Student abilities, economically speaking, are human capital resources. Each student has some amount of ability that can be considered a supply of resources available for application to the demands of the challenges posed by the assessment questions. When assessment data fit a Rasch model, the supply of student abilities has spontaneously organized itself in relation to the challenging demands for that supply posed by the test questions. The invariant consistency of the data and the resulting model fit have not been produced by coercing or guiding the students to respond in a particular way. Although questions can be written to vary in difficulty according to a construct theory, and though educational curricula traditionally vary in difficulty across grade levels, the patterns of growth and change that are observed are plainly not taking place as a result of anyone’s intentions or plans.

This kind of complex adaptive, self-organizing process (Fisher, 2017) describes not just the relations of student abilities and task difficulties, but also the relations of customer preferences to product features, patient health and functionality relative to disease and disability, etc. It also, of course, applies to supply and demand relative to a price (Fisher, 2015). For students, the price to be paid follows from the probability of a supply of ability meeting the demand for it posed by the challenges encountered in assessment items.
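In the dichotomous case, the Rasch model referred to here makes that relation explicit: the probability of success is a logistic function of the difference between a person’s ability $B_n$ and an item’s difficulty $D_i$,

    P(X_{ni} = 1) = \frac{e^{B_n - D_i}}{1 + e^{B_n - D_i}}

so that when ability exactly matches difficulty, the probability of success is 0.50, the point at which supply exactly meets demand in the economic terms used above.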

Getting back to Hayek (1948, p. 54), here we meet the relevance of the

…central question of all social sciences: How can the combination of fragments of knowledge existing in different minds bring about results which, if they were to be brought about deliberately, would require a knowledge on the part of the directing mind which no single person can possess?

Per Hayek’s point, no one student will know the answers to all of the questions posed in a test, and yet all of the students’ fragments of knowledge combine in a way that brings about results seemingly defined by a single intelligence. It is this bottom-up and self-organized emergence of knowledge structures that we capture in measurement and bring into our culture, our sciences, and our economies by bringing things into words and the common languages of standardized metrics.

This spontaneous emergence of structure does not lead directly of its own accord to the creation of markets. Rather, it is vitally important to recognize, along with Miller and O’Leary (2007, p. 710), that:

Markets are not spontaneously generated by the exchange activity of buyers and sellers. Rather, skilled actors produce institutional arrangements, the rules, roles and relationships that make market exchange possible. The institutions define the market, rather than the reverse.

The institutional arrangements we need to make to create efficient markets for human, social, and natural capital will be staggeringly difficult to realize. But a point in time will come when the costs of remaining in our current cultural, political, and economic ruts will be greater, and the benefits will be lower, than the costs and benefits of investing in a new future. That time may be sooner than anyone thinks it will be.

References

Fisher, W. P., Jr. (2015). A probabilistic model of the law of supply and demand. Rasch Measurement Transactions, 29(1), 1508-1511 [http://www.rasch.org/rmt/rmt291.pdf].

Fisher, W. P., Jr. (2017). A practical approach to modeling complex adaptive flows in psychology and social science. Procedia Computer Science, 114, 165-174. Retrieved from https://doi.org/10.1016/j.procs.2017.09.027

Hayek, F. A. (1948). Individualism and economic order. Chicago: University of Chicago Press.

Miller, P., & O’Leary, T. (2007, October/November). Mediating instruments and making markets: Capital budgeting, science and the economy. Accounting, Organizations, and Society, 32(7-8), 701-734.


Excerpts and Notes from Goldberg’s “Billions of Drops…”

December 23, 2015

Goldberg, S. H. (2009). Billions of drops in millions of buckets: Why philanthropy doesn’t advance social progress. New York: Wiley.

p. 8:
Transaction costs: “…nonprofit financial markets are highly disorganized, with considerable duplication of effort, resource diversion, and processes that ‘take a fair amount of time to review grant applications and to make funding decisions’ [citing Harvard Business School Case No. 9-391-096, p. 7, Note on Starting a Nonprofit Venture, 11 Sept 1992]. It would be a major understatement to describe the resulting capital market as inefficient.”

A McKinsey study found that nonprofits spend 2.5 to 12 times more raising capital than for-profits do. When administrative costs are factored in, nonprofits spend 5.5 to 21.5 times more.

For-profit and nonprofit funding efforts contrasted on pages 8 and 9.

p. 10:
Balanced scorecard rating criteria

p. 11:
“Even at double-digit annual growth rates, it will take many years for social entrepreneurs and their funders to address even 10% of the populations in need.”

p. 12:
Exhibit 1.5 shows that the percentages of various needs served by leading social enterprises are barely drops in the respective buckets; they range from 0.07% to 3.30%.

pp. 14-16:
Nonprofit funding is not tied to performance. Even when a nonprofit makes the effort to show measured improvement in impact, it does little or nothing to change its funding picture. It appears that there is some kind of funding ceiling implicitly imposed by funders, since nonprofit growth and success seem to persuade capital sources that their work there is done. Mediocre and low-performing nonprofits seem to be able to continue drawing funds indefinitely from sympathetic donors who don’t require evidence of effective use of their money.

p. 34:
“…meaningful reductions in poverty, illiteracy, violence, and hopelessness will require a fundamental restructuring of nonprofit capital markets. Such a restructuring would need to make it much easier for philanthropists of all stripes–large and small, public and private, institutional and individual–to fund nonprofit organizations that maximize social impact.”

p. 54:
Exhibit 2.3 is a chart showing that fewer people rose from poverty, and more remained in it or fell deeper into it, in the period 1988-1998 compared with 1969-1979.

pp. 70-71:
Kotter’s (1996) change cycle.

p. 75:
McKinsey’s seven elements of nonprofit capacity and capacity assessment grid.

pp. 94-95:
Exhibits 3.1 and 3.2 contrast the way financial markets reward for-profit performance with the way nonprofit markets reward fund raising efforts.

Financial markets
1. Market aggregates and disseminates standardized data
2. Analysts publish rigorous research reports
3. Investors proactively search for strong performers
4. Investors penalize weak performers
5. Market promotes performance
6. Strong performers grow

Nonprofit markets
1. Social performance is difficult to measure
2. NPOs don’t have resources or expertise to report results
3. Investors can’t get reliable or standardized results data
4. Strong and weak NPOs spend 40 to 60% of time fundraising
5. Market promotes fundraising
6. Investors can’t fund performance; NPOs can’t scale

p. 95:
“…nonprofits can’t possibly raise enough money to achieve transformative social impact within the constraints of the existing fundraising system. I submit that significant social progress cannot be achieved without what I’m going to call ‘third-stage funding,’ that is, funding that doesn’t suffer from disabling fragmentation. The existing nonprofit capital market is not capable of [p. 97] providing third-stage funding. Such funding can arise only when investors are sufficiently well informed to make big bets at understandable and manageable levels of risk. Existing nonprofit capital markets neither provide investors with the kinds of information needed–actionable information about nonprofit performance–nor provide the kinds of intermediation–active oversight by knowledgeable professionals–needed to mitigate risk. Absent third-stage funding, nonprofit capital will remain irreducibly fragmented, preventing the marshaling of resources that nonprofit organizations need to make meaningful and enduring progress against $100 million problems.”

pp. 99-114:
Text and diagrams on innovation, market adoption, transformative impact.

p. 140:
Exhibit 4.2: Capital distribution of nonprofits, highlighting mid-caps

Pages 192-193 make the case for the difference between a regular market and the current state of philanthropic, social capital markets.

p. 192:
“So financial markets provide information investors can use to compare alternative investment opportunities based on their performance, and they provide a dynamic mechanism for moving money away from weak performers and toward strong performers. Just as water seeks its own level, markets continuously recalibrate prices until they achieve a roughly optimal equilibrium at which most companies receive the ‘right’ amount of investment. In this way, good companies thrive and bad ones improve or die.
“The social sector should work the same way… But philanthropic capital doesn’t flow toward effective nonprofits and away from ineffective nonprofits for a simple reason: contributors can’t tell the difference between the two. That is, philanthropists just don’t [p. 193] know what various nonprofits actually accomplish. Instead, they only know what nonprofits are trying to accomplish, and they only know that based on what the nonprofits themselves tell them.”

p. 193:
“The signs that the lack of social progress is linked to capital market dysfunctions are unmistakable: fundraising remains the number-one [p. 194] challenge of the sector despite the fact that nonprofit leaders divert some 40 to 60% of their time from productive work to chasing after money; donations raised are almost always too small, too short, and too restricted to enhance productive capacity; most mid-caps are ensnared in the ‘social entrepreneur’s trap’ of focusing on today and neglecting tomorrow; and so on. So any meaningful progress we could make in the direction of helping the nonprofit capital market allocate funds as effectively as the private capital market does could translate into tremendous advances in extending social and economic opportunity.
“Indeed, enhancing nonprofit capital allocation is likely to improve people’s lives much more than, say, further increasing the total amount of donations. Why? Because capital allocation has a multiplier effect.”

“If we want to materially improve the performance and increase the impact of the nonprofit sector, we need to understand what’s preventing [p. 195] it from doing a better job of allocating philanthropic capital. And figuring out why nonprofit capital markets don’t work very well requires us to understand why the financial markets do such a better job.”

p. 197:
“When all is said and done, securities prices are nothing more than convenient approximations that market participants accept as a way of simplifying their economic interactions, with a full understanding that market prices are useful even when they are way off the mark, as they so often are. In fact, that’s the whole point of markets: to aggregate the imperfect and incomplete knowledge held by vast numbers of traders about how much various securities are worth and still make allocation choices that are better than we could without markets.
“Philanthropists face precisely the same problem: how to make better use of limited information to maximize output, in this case, social impact. Considering the dearth of useful tools available to donors today, the solution doesn’t have to be perfect or even all that good, at least at first. It just needs to improve the status quo and get better over time.
“Much of the solution, I believe, lies in finding useful adaptations of market mechanisms that will mitigate the effects of the same lack of reliable and comprehensive information about social sector performance. I would even go so far as to say that social enterprises can’t hope to realize their ‘one day, all children’ visions without a funding allocation system that acts more like a market.
“We can, and indeed do, make incremental improvements in nonprofit funding without market mechanisms. But without markets, I don’t see how we can fix the fragmentation problem or produce transformative social impact, such as ensuring that every child in America has a good education. The problems we face are too big and have too many moving parts to ignore the self-organizing dynamics of market economics. As Thomas Friedman said about the need to impose a carbon tax at a time of falling oil prices, ‘I’ve wracked my brain trying to think of ways to retool America around clean-power technologies without a price signal–i.e., a tax–and there are no effective ones.’”

p. 199:
“Prices enable financial markets to work the way nonprofit capital markets should–by sending informative signals about the most effective organizations so that money will flow to them naturally…”

p. 200:
[Quotes Kurtzman citing De Soto on the mystery of capital. Also see p. 209, below.]
“‘Solve the mystery of capital and you solve many seemingly intractable problems along with it.'”
[That’s from page 69 in Kurtzman, 2002.]

p. 201:
[Goldberg says he’s quoting Daniel Yankelovich here, but the footnote does not appear to have anything to do with this quote:]
“‘The first step is to measure what can easily be measured. The second is to disregard what can’t be measured, or give it an arbitrary quantitative value. This is artificial and misleading. The third step is to presume that what can’t be measured easily isn’t very important. This is blindness. The fourth step is to say that what can’t be easily measured really doesn’t exist. This is suicide.'”

Goldberg gives an example here of $10,000 invested with a 10% increase in value, compared with $10,000 put into a nonprofit. “But if the nonprofit makes good use of the money and, let’s say, brings the reading scores of 10 elementary school students up from below grade level to grade level, we can’t say how much my initial investment is ‘worth’ now. I could make the argument that the value has increased because the students have received a demonstrated educational benefit that is valuable to them. Since that’s the reason I made the donation, the achievement of higher scores must have value to me, as well.”

p. 202:
Goldberg wonders whether donations to nonprofits would be better conceived as purchases than investments.

p. 207:
Goldberg quotes Jon Gertner from the March 9, 2008, issue of the New York Times Magazine devoted to philanthropy:

“‘Why shouldn’t the world’s smartest capitalists be able to figure out more effective ways to give out money now? And why shouldn’t they want to make sure their philanthropy has significant social impact? If they can measure impact, couldn’t they get past the resistance that [Warren] Buffet highlighted and finally separate what works from what doesn’t?'”

p. 208:
“Once we abandon the false notions that financial markets are precision instruments for measuring unambiguous phenomena, and that the business and nonprofit sectors are based in mutually exclusive principles of value, we can deconstruct the true nature of the problems we need to address and adapt market-like mechanisms that are suited to the particulars of the social sector.
“All of this is a long way (okay, a very long way) of saying that even ordinal rankings of nonprofit investments can have tremendous value in choosing among competing donation opportunities, especially when the choices are so numerous and varied. If I’m a social investor, I’d really like to know which nonprofits are likely to produce ‘more’ impact and which ones are likely to produce ‘less.'”

“It isn’t necessary to replicate the complex working of the modern stock markets to fashion an intelligent and useful nonprofit capital allocation mechanism. All we’re looking for is some kind of functional indication that would (1) isolate promising nonprofit investments from among the confusing swarm of too many seemingly worthy social-purpose organizations and (2) roughly differentiate among them based on the likelihood of ‘more’ or ‘less’ impact. This is what I meant earlier by increasing [p. 209] signals and decreasing noise.”

p. 209:
Goldberg apparently didn’t read De Soto, as he says that the mystery of capital is posed by Kurtzman and says it is solved via the collective intelligence and wisdom of crowds. This completely misses the point of the crucial value that transparent representations of structural invariance hold in market functionality. Goldberg is apparently offering a loose kind of market for which there is an aggregate index of stocks for nonprofits that are built up from their various ordinal performance measures. I think I find a better way in my work, building more closely from De Soto (Fisher, 2002, 2003, 2005, 2007, 2009a, 2009b).

p. 231:
Goldberg quotes Harvard’s Allen Grossman (1999) on the cost-benefit boundaries of more effective nonprofit capital allocation:

“‘Is there a significant downside risk in restructuring some portion of the philanthropic capital markets to test the effectiveness of performance driven philanthropy? The short answer is, ‘No.’ The current reality is that most broad-based solutions to social problems have eluded the conventional and fragmented approaches to philanthropy. It is hard to imagine that experiments to change the system to a more performance driven and rational market would negatively impact the effectiveness of the current funding flows–and could have dramatic upside potential.'”

p. 232:
Quotes Douglas Hubbard’s How to Measure Anything book that Stenner endorsed, and Linacre and I didn’t.

p. 233:
Cites Stevens on the four levels of measurement and uses it to justify his position concerning ordinal rankings, recognizing that “we can’t add or subtract ordinals.”

pp. 233-5:
Justifies ordinal measures via example of Google’s PageRank algorithm. [I could connect from here using Mary Garner’s (2009) comparison of PageRank with Rasch.]

p. 236:
Goldberg tries to justify the use of ordinal measures by citing their widespread use in social science and health care. He conveniently ignores the fact that virtually all of the same problems and criticisms that apply to philanthropic capital markets also apply in these areas. In not grasping the fundamental value of De Soto’s concept of transferable and transparent representations, and in knowing nothing of Rasch measurement, he was unable to properly evaluate the potential of ordinal data’s role in the formation of philanthropic capital markets. Ordinal measures aren’t just not good enough; they represent a dangerous diversion of resources that will be put into systems that take on lives of their own, creating a new layer of dysfunctional relationships that will be hard to overcome.

p. 261 [Goldberg shows here his complete ignorance about measurement. He is apparently totally unaware of the work that is in fact most relevant to his cause, going back to Thurstone in the 1920s, Rasch in the 1950s-1970s, and Wright in the 1960s to 2000. Both of the problems he identifies have long since been solved in theory and in practice in a wide range of domains in education, psychology, health care, etc.]:
“Having first studied performance evaluation some 30 years ago, I feel confident in saying that all the foundational work has been done. There won’t be a ‘eureka!’ breakthrough where someone finally figures out the one true way to gauge nonprofit effectiveness.
“Indeed, I would venture to say that we know virtually everything there is to know about measuring the performance of nonprofit organizations with only two exceptions: (1) How can we compare nonprofits with different missions or approaches, and (2) how can we make actionable performance assessments common practice for growth-ready mid-caps and readily available to all prospective donors?”

p. 263:
“Why would a social entrepreneur divert limited resources to impact assessment if there were no prospects it would increase funding? How could an investor who wanted to maximize the impact of her giving possibly put more golden eggs in fewer impact-producing baskets if she had no way to distinguish one basket from another? The result: there’s no performance data to attract growth capital, and there’s no growth capital to induce performance measurement. Until we fix that Catch-22, performance evaluation will not become an integral part of social enterprise.”

pp. 264-5:
Long quotation from Ken Berger at Charity Navigator on their ongoing efforts at developing an outcome measurement system. [wpf, 8 Nov 2009: I read the passage quoted by Goldberg in Berger’s blog when it came out and have been watching and waiting ever since for the new system. wpf, 8 Feb 2012: The new system has been online for some time but still does not include anything on impacts or outcomes. It has expanded from a sole focus on financials to also include accountability and transparency. But it does not yet address Goldberg’s concerns as there still is no way to tell what works from what doesn’t.]

p. 265:
“The failure of the social sector to coordinate independent assets and create a whole that exceeds the sum of its parts results from an absence of ‘platform leadership’: ‘the ability of a company to drive innovation around a particular platform technology at the broad industry level.’ The object is to multiply value by working together: ‘the more people who use the platform products, the more incentives there are for complement producers to introduce more complementary products, causing a virtuous cycle.’” [Quotes here from Cusumano & Gawer (2002). The concept of platform leadership speaks directly to the system of issues raised by Miller & O’Leary (2007) that must be addressed to form effective HSN capital markets.]

p. 266:
“…the nonprofit sector has a great deal of both money and innovation, but too little available information about too many organizations. The result is capital fragmentation that squelches growth. None of the stakeholders has enough horsepower on its own to impose order on this chaos, but some kind of realignment could release all of that pent-up potential energy. While command-and-control authority is neither feasible nor desirable, the conditions are ripe for platform leadership.”

“It is doubtful that the IMPEX could amass all of the resources internally needed to build and grow a virtual nonprofit stock market that could connect large numbers of growth-capital investors with large numbers of [p. 267] growth-ready mid-caps. But it might be able to convene a powerful coalition of complementary actors that could achieve a critical mass of support for performance-based philanthropy. The challenge would be to develop an organization focused on filling the gaps rather than encroaching on the turf of established firms whose participation and innovation would be required to build a platform for nurturing growth of social enterprise…”

p. 268-9:
Intermediated nonprofit capital market shifts fundraising burden from grantees to intermediaries.

p. 271:
“The surging growth of national donor-advised funds, which simplify and reduce the transaction costs of methodical giving, exemplifies the kind of financial innovation that is poised to leverage market-based investment guidance.” [President of Schwab Charitable quoted as wanting to make charitable giving information- and results-driven.]

p. 272:
Rating agencies and organizations: Charity Navigator, Guidestar, Wise Giving Alliance.
Online donor rankings: GlobalGiving, GreatNonprofits, SocialMarkets
Evaluation consultants: Mathematica

Google’s mission statement: “to organize the world’s information and make it universally accessible and useful.”

p. 273:
Exhibit 9.4 Impact Index Whole Product
Image of stakeholders circling IMPEX:
Trading engine
Listed nonprofits
Data producers and aggregators
Trading community
Researchers and analysts
Investors and advisors
Government and business supporters

p. 275:
“That’s the starting point for replication [of social innovations that work]: finding and funding; matching money with performance.”

[WPF bottom line: Because Goldberg misses De Soto’s point about transparent representations resolving the mystery of capital, he is unable to see his way toward making the nonprofit capital markets function more like financial capital markets, with the difference being the focus on the growth of human, social, and natural capital. Though Goldberg intuits good points about the wisdom of crowds, he doesn’t know enough about the flaws of ordinal measurement relative to interval measurement, or about the relatively easy access to interval measures that can be had, to do the job.]

References

Cusumano, M. A., & Gawer, A. (2002, Spring). The elements of platform leadership. MIT Sloan Management Review, 43(3), 58.

De Soto, H. (2000). The mystery of capital: Why capitalism triumphs in the West and fails everywhere else. New York: Basic Books.

Fisher, W. P., Jr. (2002, Spring). “The Mystery of Capital” and the human sciences. Rasch Measurement Transactions, 15(4), 854 [http://www.rasch.org/rmt/rmt154j.htm].

Fisher, W. P., Jr. (2003). Measurement and communities of inquiry. Rasch Measurement Transactions, 17(3), 936-8 [http://www.rasch.org/rmt/rmt173.pdf].

Fisher, W. P., Jr. (2005). Daredevil barnstorming to the tipping point: New aspirations for the human sciences. Journal of Applied Measurement, 6(3), 173-9 [http://www.livingcapitalmetrics.com/images/FisherJAM05.pdf].

Fisher, W. P., Jr. (2007, Summer). Living capital metrics. Rasch Measurement Transactions, 21(1), 1092-3 [http://www.rasch.org/rmt/rmt211.pdf].

Fisher, W. P., Jr. (2009a). Bringing human, social, and natural capital to life: Practical consequences and opportunities. In M. Wilson, K. Draney, N. Brown & B. Duckor (Eds.), Advances in Rasch Measurement, Vol. Two (in press) [http://www.livingcapitalmetrics.com/images/BringingHSN_FisherARMII.pdf]. Maple Grove, MN: JAM Press.

Fisher, W. P., Jr. (2009b, November). Invariance and traceability for measures of human, social, and natural capital: Theory and application. Measurement (Elsevier), 42(9), 1278-1287.

Garner, M. (2009, Autumn). Google’s PageRank algorithm and the Rasch measurement model. Rasch Measurement Transactions, 23(2), 1201-2 [http://www.rasch.org/rmt/rmt232.pdf].

Grossman, A. (1999). Philanthropic social capital markets: Performance driven philanthropy (Social Enterprise Series 12 No. 00-002). Harvard Business School Working Paper.

Kotter, J. (1996). Leading change. Cambridge, Massachusetts: Harvard Business School Press.

Kurtzman, J. (2002). How the markets really work. New York: Crown Business.

Miller, P., & O’Leary, T. (2007, October/November). Mediating instruments and making markets: Capital budgeting, science and the economy. Accounting, Organizations, and Society, 32(7-8), 701-34.

The Counterproductive Consequences of Common Study Designs and Statistical Methods

May 21, 2015

Because of the ways studies are designed and the ways data are analyzed, research results in psychology and the social sciences often appear to be nonlinear, sample- and instrument-dependent, and incommensurable, even when they need not be. In contrast with what are common assumptions about the nature of the constructs involved, invariant relations may be more obscured than clarified by typically employed research designs and statistical methods.

To take a particularly salient example, the number of small factors with eigenvalues greater than 1.0 identified via factor analysis increases as the number of modes in a multimodal distribution increases, and the interpretation of results is further complicated by the fact that the number of factors identified decreases as sample size increases (Smith, 1996).
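The sample-size half of this artifact is easy to reproduce. Here is a minimal simulation sketch (illustrative only, not Smith’s procedure; the sample sizes, item difficulties, and seed are arbitrary choices): the data are strictly unidimensional Rasch data, yet the number of “factors” flagged by the eigenvalue-greater-than-1.0 rule changes with nothing but the number of respondents.

```python
import numpy as np

rng = np.random.default_rng(7)

def kaiser_count(n_persons, n_items=20):
    # Strictly unidimensional Rasch data: one latent dimension only.
    theta = rng.normal(0.0, 1.0, n_persons)  # person abilities (logits)
    b = np.linspace(-2, 2, n_items)          # item difficulties (logits)
    p = 1 / (1 + np.exp(-(theta[:, None] - b[None, :])))
    x = rng.binomial(1, p)                   # dichotomous responses
    # Kaiser criterion: count eigenvalues of the inter-item
    # correlation matrix that exceed 1.0.
    eigenvalues = np.linalg.eigvalsh(np.corrcoef(x, rowvar=False))
    return int(np.sum(eigenvalues > 1.0))

for n in (100, 500, 5000):
    print(f"n = {n:5d}: {kaiser_count(n)} eigenvalues > 1.0")
```

The count drifts downward as the sample grows, even though the construct and the items never change.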

Similarly, variation in employment test validity across settings was established as a basic assumption by the 1970s, after 50 years of studies observing the situational specificity of results. But then Schmidt and Hunter (1977) identified sampling error, measurement error, and range restriction as major sources of what was only the appearance of incommensurable variation in employment test validity. In other words, for most of the 20th century, the identification of constructs and comparisons of results across studies were pointlessly confused by mixed populations, uncontrolled variation in reliability, and unnoted floor and/or ceiling effects. Though they do nothing to establish information systems deploying common languages structured by standard units of measurement (Feinstein, 1995), meta-analysis techniques are a step forward in equating effect sizes (Hunter & Schmidt, 2004).
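Range restriction, in particular, can be demonstrated in a few lines. In this sketch (the 0.50 “true” validity and the one-standard-deviation hiring cutoff are arbitrary, illustrative values), a single invariant correlation appears much weaker wherever only the upper tail of the applicant distribution is observed:

```python
import numpy as np

rng = np.random.default_rng(3)

# One population, one invariant relation: test scores and job
# performance correlate 0.50 across the full applicant pool.
n = 100_000
true_r = 0.50
test = rng.normal(size=n)
performance = true_r * test + np.sqrt(1 - true_r**2) * rng.normal(size=n)

full_range_r = np.corrcoef(test, performance)[0, 1]

# A site that hires only applicants scoring a standard deviation
# above the mean observes a restricted range, and the apparent
# validity shrinks even though nothing real has changed.
hired = test > 1.0
restricted_r = np.corrcoef(test[hired], performance[hired])[0, 1]

print(f"full range:       r = {full_range_r:.2f}")
print(f"restricted range: r = {restricted_r:.2f}")
```

Dozens of sites applying different cutoffs to the same test would report dozens of different “validities,” which is the situational specificity story in miniature.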

Wright and Stone’s (1979) Best Test Design, in contrast, takes up each of these problems in an explicit way. Sampling error is addressed in that both the sample’s and the items’ representations of the same populations of persons and expressions of a construct are evaluated. The evaluation of reliability is foregrounded and clarified by taking advantage of the availability of individualized measurement uncertainty (error) estimates (following Andrich, 1982, first presented at AERA in 1977). And range restriction becomes manageable in terms of equating and linking instruments measuring in different ranges of the same construct. As was demonstrated by Duncan (1985), for instance, and in related studies (Allerup, Bech, Loldrup, et al., 1994; Andrich & Styles, 1998), the restricted ranges of various studies assessing relationships between measures of attitudes and behaviors led to the mistaken conclusion that these were separate constructs. When the entire range of variation was explicitly modeled and studied, a consistent relationship was found.
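The individualized uncertainty estimates mentioned here follow directly from the model. In this minimal sketch (the evenly spaced item difficulties are invented for illustration), a Rasch person measure’s standard error is the reciprocal square root of the test information at that measure, so uncertainty is smallest where the items are well targeted and grows toward the floor and ceiling of the instrument’s operating range:

```python
import numpy as np

def person_se(theta, item_difficulties):
    """Standard error of a Rasch person measure: the reciprocal
    square root of the test information at theta (in logits)."""
    d = np.asarray(item_difficulties)
    p = 1 / (1 + np.exp(-(theta - d)))  # expected scores on each item
    information = np.sum(p * (1 - p))   # summed binomial variances
    return 1 / np.sqrt(information)

items = np.linspace(-2, 2, 20)  # a 20-item test targeting -2 to +2 logits
for theta in (-3.0, 0.0, 3.0):
    print(f"theta = {theta:+.1f}  SE = {person_se(theta, items):.2f}")
```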

Statistical and correlational methods have long histories of preventing the discovery, assessment, and practical application of invariant relations because they fail to test for invariant units of measurement, do not define standard metrics, never calibrate all instruments measuring the same thing in common units, and have no concept of formal measurement systems of interconnected instruments. Wider appreciation of the distinction between statistics and measurement (Duncan & Stenbeck, 1988; Fisher, 2010; Wilson, 2013a), and of the potential for metrological traceability we have within our reach (Fisher, 2009, 2012; Fisher & Stenner, 2013; Mari & Wilson, 2013; Pendrill, 2014; Pendrill & Fisher, 2015; Wilson, 2013b; Wilson, Mari, Maul, & Torres Irribarra, 2015), are demonstrably fundamental to the advancement of a wide range of fields.

References

Allerup, P., Bech, P., Loldrup, D., Alvarez, P., Banegil, T., Styles, I., & Tenenbaum, G. (1994). Psychiatric, business, and psychological applications of fundamental measurement models. International Journal of Educational Research, 21(6), 611-622.

Andrich, D. (1982). An index of person separation in Latent Trait Theory, the traditional KR-20 index, and the Guttman scale response pattern. Education Research and Perspectives, 9(1), 95-104 [http://www.rasch.org/erp7.htm].

Andrich, D., & Styles, I. M. (1998). The structural relationship between attitude and behavior statements from the unfolding perspective. Psychological Methods, 3(4), 454-469.

Duncan, O. D. (1985). Probability, disposition and the inconsistency of attitudes and behaviour. Synthese, 42, 21-34.

Duncan, O. D., & Stenbeck, M. (1988). Panels and cohorts: Design and model in the study of voting turnout. In C. C. Clogg (Ed.), Sociological Methodology 1988 (pp. 1-35). Washington, DC: American Sociological Association.

Feinstein, A. R. (1995). Meta-analysis: Statistical alchemy for the 21st century. Journal of Clinical Epidemiology, 48(1), 71-79.

Fisher, W. P., Jr. (2009). Invariance and traceability for measures of human, social, and natural capital: Theory and application. Measurement, 42(9), 1278-1287.

Fisher, W. P., Jr. (2010). Statistics and measurement: Clarifying the differences. Rasch Measurement Transactions, 23(4), 1229-1230.

Fisher, W. P., Jr. (2012, May/June). What the world needs now: A bold plan for new standards [Third place, 2011 NIST/SES World Standards Day paper competition]. Standards Engineering, 64(3), 1 & 3-5.

Fisher, W. P., Jr., & Stenner, A. J. (2013). Overcoming the invisibility of metrology: A reading measurement network for education and the social sciences. Journal of Physics: Conference Series, 459(012024), http://iopscience.iop.org/1742-6596/459/1/012024.

Hunter, J. E., & Schmidt, F. L. (Eds.). (2004). Methods of meta-analysis: Correcting error and bias in research findings. Thousand Oaks, CA: Sage.

Mari, L., & Wilson, M. (2013). A gentle introduction to Rasch measurement models for metrologists. Journal of Physics Conference Series, 459(1), http://iopscience.iop.org/1742-6596/459/1/012002/pdf/1742-6596_459_1_012002.pdf.

Pendrill, L. (2014). Man as a measurement instrument [Special Feature]. NCSLi Measure: The Journal of Measurement Science, 9(4), 22-33.

Pendrill, L., & Fisher, W. P., Jr. (2015). Counting and quantification: Comparing psychometric and metrological perspectives on visual perceptions of number. Measurement, 71, 46-55. doi: http://dx.doi.org/10.1016/j.measurement.2015.04.010

Schmidt, F. L., & Hunter, J. E. (1977). Development of a general solution to the problem of validity generalization. Journal of Applied Psychology, 62(5), 529-540.

Smith, R. M. (1996). A comparison of methods for determining dimensionality in Rasch measurement. Structural Equation Modeling, 3(1), 25-40.

Wilson, M. R. (2013a). Seeking a balance between the statistical and scientific elements in psychometrics. Psychometrika, 78(2), 211-236.

Wilson, M. R. (2013b). Using the concept of a measurement system to characterize measurement models used in psychometrics. Measurement, 46, 3766-3774.

Wilson, M., Mari, L., Maul, A., & Torres Irribarra, D. (2015). A comparison of measurement concepts across physical science and social science domains: Instrument design, calibration, and measurement. Journal of Physics: Conference Series, 588(012034), http://iopscience.iop.org/1742-6596/588/1/012034.

Wright, B. D., & Stone, M. H. (1979). Best test design: Rasch measurement. Chicago, Illinois: MESA Press.

Externalities are to markets as anomalies are to scientific laws

October 28, 2011

Economic externalities are to efficient markets as any consistent anomaly is relative to a lawful regularity. Government intervention in markets is akin to fudging the laws of physics to explain the wobble in Uranus’ orbit, or to explain why magnetized masses would not behave like wooden or stone masses in a metal catapult (Rasch’s example). Further, government intervention in markets is necessary only as long as efficient markets for externalized forms of capital are not created. The anomalous exceptions to the general rule of market efficiency have long since been shown themselves to be internally consistent lawful regularities in their own right, amenable to configuration as markets for human, social, and natural forms of capital.

There is an opportunity here for the concise and elegant statement of the efficient markets hypothesis, the observation of certain anomalies, the formulation of new theories concerning these forms of capital, the framing of efficient markets hypotheses concerning the behavior of these anomalies, tests of these hypotheses in terms of the inverse proportionality of two of the parameters relative to the third, proposals as to the uniform metrics by which the scientific laws will be made commercially viable expressions of capital value, etc.
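One way to make those parameter relations concrete is Rasch’s own multiplicative statement of his model for dichotomous responses, in which the odds of success equal the ratio of a person parameter to an item parameter (the subscripted notation below is the usual convention, not a quotation from any of the sources cited here):

$$ \frac{P_{ni}}{1 - P_{ni}} = \frac{\xi_n}{\delta_i} $$

Holding the odds constant, person ability and item difficulty are directly proportional; holding either parameter constant, the odds vary in direct or inverse proportion to the other, just as acceleration is inversely proportional to mass for a given force. This three-way proportionality is the structure referred to above as the object of hypothesis tests and uniform metrics.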

We suffer from the illusion that trading activity somehow spontaneously emerges from social interactions. It’s as though comparable equivalent value is some kind of irrefutable, incontestable feature of the world to which humanity adapts its institutions. But this order of things plainly puts the cart before the horse when the emergence of markets is viewed historically. The idea of fair trade, how it is arranged, how it is recognized, when it is appropriate, etc. varies markedly across cultures and over time.

Yes, “’the price of things is in inverse ratio to the quantity offered and in direct ratio to the quantity demanded’ (Walras 1965, I, 216-17)” (Mirowski, 1988, p. 20). Yes, Pareto made “a direct extrapolation of the path-independence of equilibrium energy states in rational mechanics and thermodynamics” to “the path-independence of the realization of utility” (Mirowski, 1988, p. 21). Yes, as Ehrenfest showed, “an analogy between thermodynamics and economics” can be made, and economic concepts can be formulated “as parallels of thermodynamic concepts, with the concept of equilibrium occupying the central position in both theories” (Boumans, 2005, p. 31).  But markets are built up around these lawful regularities by skilled actors who articulate the rules, embody the roles, and initiate the relationships comprising economic, legal, and scientific institutions. “The institutions define the market, rather than the reverse” (Miller & O’Leary, 2007, p. 710). What we need are new institutions built up around the lawful regularities revealed by Rasch models. The problem is how to articulate the rules, embody the roles, and initiate the relationships.

Noyes (1936, pp. 2, 13; quoted in De Soto 2000, p. 158) provides some useful pointers:

“The chips in the economic game today are not so much the physical goods and actual services that are almost exclusively considered in economic text books, as they are that elaboration of legal relations which we call property…. One is led, by studying its development, to conceive the social reality as a web of intangible bonds–a cobweb of invisible filaments–which surround and engage the individual and which thereby organize society…. And the process of coming to grips with the actual world we live in is the process of objectivizing these relations.”

Noyes (1936, p. 20, quoted in De Soto 2000, p. 163) continues:

“Human nature demands regularity and certainty and this demand requires that these primitive judgments be consistent and thus be permitted to crystallize into certain rules–into ‘this body of dogma or systematized prediction which we call law.’ … The practical convenience of the public … leads to the recurrent efforts to systematize the body of laws. The demand for codification is a demand of the people to be released from the mystery and uncertainty of unwritten or even of case law.” [This is quite an apt statement of the largely unstated demands of the Occupy Wall Street movement.]

De Soto (2000, p. 158) explains:

“Lifting the bell jar [integrating legal and extralegal property rights], then, is principally a legal challenge. The official legal order must interact with extralegal arrangements outside the bell jar to create a social contract on property and capital. To achieve this integration, many other disciplines are of course necessary … [economists, urban planners, agronomists, mappers, surveyors, IT specialists, etc.]. But ultimately, an integrated national social contract will be concretized only in laws.”

“Implementing major legal change is a political responsibility. There are various reasons for this. First, law is generally concerned with protecting property rights. However, the real task in developing and former communist countries is not so much to perfect existing rights as to give everyone a right to property rights–‘meta-rights,’ if you will. [Paraphrasing, the real task in the undeveloped domains of human, social, and natural capital is not so much the perfection of existing rights as it is to harness scientific measurement in the name of economic justice and grant everyone legal title to their shares of their ownmost personal properties, their abilities, health, motivations, and trustworthiness, along with their shares of the common stock of social and natural resources.] Bestowing such meta-rights, emancipating people from bad law, is a political job. Second, very small but powerful vested interests–mostly represented [p. 159] by the countries’ best commercial lawyers–are likely to oppose change unless they are convinced otherwise. Bringing well-connected and moneyed people onto the bandwagon requires not consultants committed to serving their clients but talented politicians committed to serving their people. Third, creating an integrated system is not about drafting laws and regulations that look good on paper but rather about designing norms that are rooted in people’s beliefs and are thus more likely to be obeyed and enforced. Being in touch with real people is a politician’s task. Fourth, prodding underground economies to become legal is a major political sales job.”

De Soto continues (p. 159), intending to refer only to real estate but actually speaking of the need for formal legal title to personal property of all kinds, which ought to include human, social, and natural capital:

“Without succeeding on these legal and political fronts, no nation can overcome the legal apartheid between those who can create capital and those who cannot. Without formal property, no matter how many assets they accumulate or how hard they work, most people will not be able to prosper in a capitalist society. They will continue to remain beyond the radar of policymakers, out of the reach of official records, and thus economically invisible.”

References

Boumans, M. (2005). How economists model the world into numbers. New York: Routledge.

De Soto, H. (2000). The mystery of capital: Why capitalism triumphs in the West and fails everywhere else. New York: Basic Books.

Miller, P., & O’Leary, T. (2007, October/November). Mediating instruments and making markets: Capital budgeting, science and the economy. Accounting, Organizations, and Society, 32(7-8), 701-34.

Mirowski, P. (1988). Against mechanism: Protecting economics from science. Lanham, MD: Rowman & Littlefield.

Noyes, C. R. (1936). The institution of property. New York: Longmans, Green.

Reimagining Capitalism Again, Part III: Reflections on Greider’s “Bold Ideas” in The Nation

September 10, 2011

And so, The Nation’s “Bold Ideas for a New Economy” is disappointing for not doing more to start from the beginning identified by its own writer, William Greider. The soul of capitalism needs to be celebrated and nourished, if we are to make our economy “less destructive and domineering,” and “more focused on what people really need for fulfilling lives.” The only real alternative to celebrating and nourishing the soul of capitalism is to kill it, in the manner of the Soviet Union’s failed experiments in socialism and communism.

The article speaks the truth, though, when it says there is no point in trying to persuade the powers that be to make the needed changes. Republicans see the market as it exists as a one-size-fits-all economic panacea, when all it can accomplish in its current incomplete state is the continuing externalization of anything and everything important about human, social, and environmental decency. For their part, Democrats do indeed “insist that regulation will somehow fix whatever is broken,” in an ever-expanding socialistic micromanagement of every possible exception to the rules that emerges.

To date, the president’s efforts at a nonpartisan third way amount only to vacillations between these opposing poles. The leadership that is needed, however, is something else altogether. Yes, as The Nation article says, capitalism needs to be made to serve the interests of society, and this will require deep structural change, not just new policies. But none of the contributors of the “bold ideas” presented propose deep structural changes of a kind that actually gets at the soul of capitalism. All of the suggestions are ultimately just new policies tweaking superficial aspects of the economy in mechanical, static, and very limited ways.

The article calls for “Democratizing reforms that will compel business and finance to share decision-making and distribute rewards more fairly.” It says the vision has different names but “the essence is a fundamental redistribution of power and money.” But corporate distortions of liability law, the introduction of boardroom watchdogs, and a tax on financial speculation do not by any stretch of the imagination address the root causes of social and environmental irresponsibility in business. They “sound like obscure technical fixes” because that’s what they are. The same thing goes for low-cost lending from public banks, the double or triple bottom lines of Benefit Corporations, new anti-trust laws, calls for “open information” policies, added personal stakes for big-time CEOs, employee ownership plans, the elimination of tax subsidies, new standards for sound investing, new measures of GDP, and government guarantees of full employment.

All of these proposals sound like what ought to be the effects and outcomes of efforts addressing the root causes of capitalism’s shortcomings. Instead, they are band-aids applied to scratched fingers and arms when multiple-bypass surgery is called for. That is, what we need is to understand how to bring the spirit of capitalism to life in the new domains of human, social, and environmental interests, but what we’re getting is nothing but more of the same piecemeal ways of moving around the deck chairs on the Titanic.

There is some truth in the assertion that what really needs reinventing is our moral and spiritual imagination. As someone (Einstein or Edison?) is supposed to have put it, originality is simply a matter of having a source for an analogy no one else has considered. Ironically, the best model is often the one most taken for granted and nearest to hand. Such is the case with the two-sided scientific and economic effects of standardized units of measurement. The fundamental moral aspect here is nothing other than the Golden Rule, independently derived and offered in cultures throughout history, globally. Individualized social measurement is nothing if not a matter of determining whether others are being treated in the way you yourself would want to be treated.

And so, yes, to stress the major point of agreement with The Nation, “the new politics does not start in Washington.” Historically, at their best, governments work to keep pace with the social and technical innovations introduced by their peoples. Margaret Mead said it well a long time ago when she asserted that small groups of committed citizens are the only sources of real social change.

Not to be just one of many “advocates with bold imaginations” who wind up marginalized by the constraints of status quo politics, I claim my personal role in imagining a new economic future by tapping as deeply as I can into the positive, pre-existing structures needed for a transition into a new democratic capitalism. We learn through what we already know. Standards are well established as essential to commerce and innovation, but 90% of the capital under management in our economy—the human, social, and natural capital—lacks the standards needed for optimal market efficiency and effectiveness. An intangible assets metric system will be a vitally important way in which we extend what is right and good in the world today into new domains.

To conclude, what sets this proposal apart from those offered by The Nation and its readers hinges on our common agreement that “the most threatening challenge to capitalism is arguably the finite carrying capacity of the natural world.” The bold ideas proposed by The Nation’s readers respond to this challenge in ways that share an important feature in common: people have to understand the message and act on it. That fact dooms all of these ideas from the start. If we have to articulate and communicate a message that people then have to act on, we remain a part of the problem and not part of the solution.

As I argue in my “The Problem is the Problem” blog post of some months ago, this way of defining problems is itself the problem. That is, we can no longer think of ourselves as separate from the challenges we face. If we think we are not all implicated through and through as participants in the construction and maintenance of the problem, then we have not understood it. The bold ideas offered to date are all responses to the state of a broken system that seek to reform one or another element in the system when what we need is a whole new system.

What we need is a system that so fully embodies nature’s own ecological wisdom that the medium becomes the message. When the ground rules for economic success are put in place such that it is impossible to earn a profit without increasing stocks of human, social, and natural capital, there will be no need to spell out the details of a controlling microregulatory structure of new anti-trust laws, “open information” policies, personal stakes for big-time CEOs, employee ownership plans, the elimination of tax subsidies, etc. What we need is precisely what Greider reported from Innovest in his book: reliable, high quality information that makes human, social, and environmental issues matter financially. Situated in a context like that described by Bernstein in his 2004 The Birth of Plenty, with the relevant property rights, rule of law, scientific rationality, capital markets, and communications networks in place, it will be impossible to stop a new economic expansion of historic proportions.

Reimagining Capitalism Again, Part II: Scientific Credibility in Improving Information Quality

September 10, 2011

The previous posting here concluded with two questions provoked by a close consideration of a key passage in William Greider’s 2003 book, The Soul of Capitalism. First, how do we create the high quality, solid information markets need to punish and reward relative to ethical and sustainable human, social, and environmental values? Second, what can we learn from the way we created that kind of information for property and manufactured capital? There are good answers to these questions, answers that point in productive directions in need of wide exploration and analysis.

The short answer to both questions is that better, more scientifically rigorous measurement at the local level needs to be implemented in a context of traceability to universally uniform standards. To think global and act local simultaneously, we need an efficient and transparent way of seeing where we stand in the world relative to everyone else. Having measures expressed in comparable and meaningful units is an important part of how we think global while acting local.

So, for markets to punish and reward businesses in ways able to build human, social, and environmental value, we need to be able to price that value, to track returns on investments in it, and to own shares of it. To do that, we need a new intangible assets metric system that functions in a manner analogous to the existing metric system and other weights and measures standards. In the same way these standards guarantee high quality information on volume, weight, thermal units, and volts in grocery stores and construction sites, we need a new set of standards for human abilities, performances, and health; for social trust, commitment, and loyalty; and for the environment’s air and water processing services, fisheries, gene pools, etc.

Each industry needs an instrumentarium of tools and metrics that mediate relationships universally within its entire sphere of production and/or service. The obvious and immediate reaction to this proposal will likely be that this is impossible, that it would have been done by now if it were possible, and that anyone who proposes something like this is simply unrealistic, perhaps dangerously so. So, here we have another reason to add to those given in the June 8, 2011 issue of The Nation (http://www.thenation.com/article/161267/reimagining-capitalism-bold-ideas-new-economy) as to why bold ideas for a new economy cannot gain any traction in today’s political discourse.

So what basis in scientific authority might be found for this audacious goal of an intangible assets metric system? This blog’s postings offer multiple varieties of evidence and argument in this regard, so I’ll stick to more recent developments, namely, last week’s meeting of the International Measurement Confederation (IMEKO) in Jena, Germany. Membership in IMEKO is dominated by physicists, engineers, chemists, and clinical laboratorians who work in private industry, academia, and government weights and measures standards institutes.

Several IMEKO members past and present are involved with one or more of the seven or eight major international standards organizations responsible for maintaining and improving the metric system (the Système International d’Unités). Two initiatives undertaken by IMEKO and these standards organizations take up the matter at issue here concerning the audacious goal of standard units for human, social, and natural capital.

First, the recently released third edition of the International Vocabulary of Metrology (VIM, 2008) expands the range of the concepts and terms included to encompass measurement in the human and social sciences. This first effort was not well informed about the state-of-the-art developments widely realized in measurement in education, health care, and the social sciences. What is important is that an invitation to further dialogue has been extended from the natural to the social sciences.

That invitation was unintentionally accepted and a second initiative advanced just as the new edition of the VIM was being released, in 2008. Members of three IMEKO technical committees (TC 1-7-13; those on Measurement Science, Metrology Education, and Health Care) cultivate a special interest in ideas on the human and social value of measurement. At their 2008 meeting in Annecy, France, I presented a paper (later published in revised form as Fisher, 2009) illustrating how, over the previous 50 years and more, the theory and practice of measurement in the social sciences had developed in ways capable of supporting convenient and useful universally uniform units for human, social, and natural capital.

The same argument was then advanced by my fellow University of Chicago alum, Nikolaus Bezruczko, at the 2009 IMEKO World Congress in Lisbon. Bezruczko and I both spoke at the 2010 TC 1-7-13 meeting in London, and last week our papers were joined by presentations from six of our colleagues at the 2011 IMEKO TC 1-7-13 meeting in Jena, Germany. Another fellow U Chicagoan, Mark Wilson, a longtime professor in the Graduate School of Education at the University of California, Berkeley, gave an invited address contrasting four basic approaches to measurement in psychometrics, and emphasizing the value of methods that integrate substantive meaning with mathematical rigor.

Examples from education, health care, and business were then elucidated at this year’s meeting in Jena by myself, Bezruczko, Stefan Cano (University of Plymouth, England), Carl Granger (SUNY, Buffalo; paper presented by Bezruczko, a co-author), Thomas Salzberger (University of Vienna, Austria), Jack Stenner (MetaMetrics, Inc., Durham, NC, USA), and Gordon Cooper (University of Western Australia, Crawley, WA, Australia; paper presented by Fisher, a co-author).

The contrast between these presentations and those made by the existing IMEKO membership hinges on two primary differences in focus. The physicists and engineers take it for granted that all instrument calibration involves traceability to metrological reference standards. Because they deal with existing standards and with physical or chemical materials that usually possess deterministically structured properties, issues of how to construct linear measures from ordinal observations never come up.

Conversely, the social scientists and psychometricians take it for granted that all instrument calibration involves evaluating the capacity of ordinal observations to support the construction of linear measures. Because they deal with data from tests, surveys, and rating scale assessments, issues of how to relate a given instrument’s unit to a reference standard never come up.
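The simplest concrete instance of this ordinal-to-linear construction is the PROX normal-approximation algorithm from Wright and Stone’s (1979) Best Test Design. Here is a minimal sketch (simplified: item difficulties are assumed centered on zero with a known spread, and the usual iteration and targeting corrections are omitted) that turns ordinal counts correct into linear measures in logits:

```python
import numpy as np

def prox_person_measures(raw_scores, n_items, item_sd=1.0):
    """Approximate Rasch person measures (in logits) from ordinal raw
    scores via the PROX normal approximation (Wright & Stone, 1979).
    Requires 0 < raw score < n_items; item difficulties are assumed
    centered on zero with standard deviation item_sd."""
    r = np.asarray(raw_scores, dtype=float)
    log_odds = np.log(r / (n_items - r))        # ordinal counts -> log-odds
    expansion = np.sqrt(1 + item_sd**2 / 2.89)  # widen for item spread (2.89 = 1.7**2)
    return expansion * log_odds                 # linear person measures in logits

print(prox_person_measures([5, 10, 15, 19], n_items=20))
```

Nothing in the routine, however, says anything about how its logit relates to the logit of any other instrument; that is the metrological gap.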

Thus there is significant potential for mutually instructive dialogue between natural and social scientists in this context. Many areas of investigation in the natural sciences have benefited from the introduction of probabilistic concepts in recent decades, but there are perhaps important unexplored opportunities for the application of probabilistic measurement models, as opposed to statistical ones. By taking advantage of probabilistic models’ special features, measurement in education and health care has begun to realize the benefit of broad generalizations of comparable units across grades, schools, tests, and curricula.
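The comparable-units claim can likewise be illustrated in miniature (the difficulty values below are invented for the purpose): when two separately calibrated test forms share a set of common items, a single additive constant translates one form’s logit scale into the other’s frame of reference, which is what lets measures from different tests be reported in one unit.

```python
import numpy as np

# Difficulties of five items shared by two separately calibrated forms.
common_on_a = np.array([-1.2, -0.5, 0.0, 0.6, 1.3])  # Form A calibration
common_on_b = np.array([-0.9, -0.2, 0.3, 0.9, 1.6])  # Form B calibration

# Mean/mean common-item linking: one constant aligns the two scales.
link = np.mean(common_on_a - common_on_b)

def to_form_a_scale(measure_on_b):
    return measure_on_b + link

print(f"linking constant = {link:.2f} logits")
print(f"a Form B measure of 1.00 = {to_form_a_scale(1.00):.2f} on Form A's scale")
```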

Though the focus of my interest here is in the capacity of better measurement to improve the efficiency of human, social, and natural capital markets, it may turn out that as many or more benefits will accrue in the natural sciences’ side of the conversation as in the social sciences’ side. The important thing for the time being is that the dialogue is started. New and irreversible mutual understandings between natural and social scientists have already been put on the record. It may happen that the introduction of a new supply of improved human, social, and natural capital metrics will help articulate the largely, as yet, unstated but nonetheless urgent demand for them.

References

Fisher, W. P., Jr. (2009, November). Invariance and traceability for measures of human, social, and natural capital: Theory and application. Measurement, 42(9), 1278-1287.

Debt, Revenue, and Changing the Way Washington Works: The Greatest Entrepreneurial Opportunity of Our Time

July 30, 2011

“Holding the line” on spending and taxes does not make for a fundamental transformation of the way Washington works. Simply doing less of one thing is just a small quantitative change that does nothing to build positive results or set a new direction. What we need is a qualitative metamorphosis akin to a caterpillar becoming a butterfly. In contrast with this beautiful image of natural processes, the arguments and so-called principles being invoked in the sham debate that’s going on are nothing more than fights over where to put deck chairs on the Titanic.

What sort of transformation is possible? What kind of a metamorphosis will start from who and where we are, but redefine us sustainably and responsibly? As I have repeatedly explained in this blog, my conference presentations, and my publications, with numerous citations of authoritative references, we already possess all of the elements of the transformation. We have only to organize and deploy them. Of course, discerning what the resources are and how to put them together is not obvious. And though I believe we will do what needs to be done when we are ready, it never hurts to prepare for that moment. So here’s another take on the situation.

Infrastructure that supports lean thinking is the name of the game. Lean thinking focuses on identifying and removing waste. Anything that consumes resources but does not contribute to the quality of the end product is waste. We have enormous amounts of wasteful inefficiency in many areas of our economy. These inefficiencies are concentrated in areas in which management is hobbled by low quality information, where we lack the infrastructure we need.

Providing and capitalizing on this infrastructure is The Greatest Entrepreneurial Opportunity of Our Time. Changing the way Washington (ha! I just typed “Wastington”!) works is the same thing as mitigating the sources of risk that caused the current economic situation. Making government behave more like a business requires making the human, social, and natural capital markets more efficient. Making those markets more efficient requires reducing the costs of transactions. Those costs are determined in large part by information quality, which is a function of measurement.

It is often said that the best way to reduce the size of government is to move the functions of government into the marketplace. But this proposal has never been associated with any sense of the infrastructural components needed to really make the idea work. Simply reducing government without an alternative way of performing its functions is irresponsible and destructive.

And many of those who rail on and on about how bad or inefficient government is fail to recognize that the government is us. We get the government we deserve. The government we get follows directly from the kind of people we are. Government embodies our image of ourselves as a people. In the US, this is what having a representative form of government means. “We the people” participate in our society’s self-governance not just by voting, writing letters to congress, or demonstrating, but in the way we spend our money, where we choose to live, work, and go to school, and in every decision we make. No one can take a breath of air, a drink of water, or a bite of food without trusting everyone else to not carelessly or maliciously poison them. No one can buy anything or drive down the street without expecting others to behave in predictable ways that ensure order and safety.

But we don’t just trust blindly. We have systems in place to guard against those who would ruthlessly seek to gain at everyone else’s expense. And systems are the point. No individual person or firm, no matter how rich, could afford to set up and maintain the systems needed for checking and enforcing air, water, food, and workplace safety measures. Society as a whole invests in the infrastructure of measures created, maintained, and regulated by the government’s Department of Commerce and the National Institute of Standards and Technology (NIST). The moral importance and the economic value of measurement standards have been stressed historically over many millennia, from the Bible and the Quran to the Magna Carta and the French Revolution to the US Constitution. Uniform weights and measures are universally recognized and accepted as essential to fair trade.

So how is it that we nonetheless apparently expect individuals and local organizations like schools, businesses, and hospitals to measure and monitor students’ abilities; employees’ skills and engagement; patients’ health status, functioning, and quality of care; etc.? Why do we not demand common currencies for the exchange of value in human, social, and natural capital markets? Why don’t we as a society compel our representatives in government to institute the will of the people and create new standards for fair trade in education, health care, social services, and environmental management?

Measuring better is not just a local issue! It is a systemic issue! When measurement is objective and when we all think together in the common language of a shared metric (like hours, volts, inches or centimeters, ounces or grams, degrees Fahrenheit or Celsius, etc.), then and only then do we have the means we need to implement lean strategies and create new efficiencies systematically. We need an Intangible Assets Metric System.

The current recession in large part was caused by failures in measuring and managing trust, responsibility, loyalty, and commitment. Similar problems in measuring and managing human, social, and natural capital have led to endlessly spiraling costs in education, health care, social services, and environmental management. The problems we’re experiencing in these areas are intimately tied up with the way we formulate and implement group-level decision-making processes and policies based in statistics, when what we need is to empower individuals with the tools and information they need to make their own decisions and policies. We will not and cannot metamorphose from caterpillar to butterfly until we create the infrastructure through which we each can take full ownership and control of our individual shares of the human, social, and natural capital stock that is rightfully ours.

We well know that we manage what we measure. What counts gets counted. Attention tends to be focused on what we’re accountable for. But–and this is vitally important–many of the numbers called measures do not provide the information we need for management. And not only are lots of numbers giving us low quality information, there are far too many of them! We could have better and more information from far fewer numbers.

Previous postings in this blog document the fact that we have the intellectual, political, scientific, and economic resources we need to measure and manage human, social, and natural capital for authentic wealth. And the issue is not a matter of marshaling the will. It is hard to imagine how there could be more demand for better management of intangible assets than there is right now. The problem in meeting that demand is a matter of imagining how to start the ball rolling. What configuration of investments and resources will start the process of bursting open the chrysalis? How will the demand for meaningful mediating instruments be met in a way that leads to the spreading of the butterfly’s wings? It is an exciting time to be alive.

Creative Commons License
LivingCapitalMetrics Blog by William P. Fisher, Jr., Ph.D. is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Based on a work at livingcapitalmetrics.wordpress.com.
Permissions beyond the scope of this license may be available at http://www.livingcapitalmetrics.com.