Archive for the ‘Traceability’ Category

Comments on the New ANSI Human Capital Investor Metrics Standard

April 16, 2012

The full text of the proposed standard is available here.

It’s good to see a document emerge in this area, especially one with such a broad base of support from a diverse range of stakeholders. As is stated in the standard, the metrics defined in it are a good place to start and in many instances will likely improve the quality and quantity of the information made available to investors.

There are several issues to keep in mind as the value of standards for human capital metrics becomes more widely appreciated. First, in the context of a comprehensively defined investment framework, human capital is just one of the four major forms of capital, the other three being social, natural, and manufactured (Ekins, 1992; Ekins, Dresner, and Dahlstrom, 2008). To ensure as far as possible the long-term stability and sustainability of their profits, and of the economic system as a whole, investors will certainly want to expand the range of the available standards to include social and natural capital along with human capital.

Second, though we manage what we measure, investment management is seriously compromised by having high quality scientific measurement standards only for manufactured capital (length, weight, volume, temperature, energy, time, kilowatts, etc.). Over 80 years of research on ability tests, surveys, rating scales, and assessments has reached a place from which it is prepared to revolutionize the management of intangible forms of capital (Fisher, 2007, 2009a, 2009b, 2010, 2011a, 2011b; Fisher & Stenner, 2011a, 2011b; Wilson, 2011; Wright, 1999). The very large reductions in transaction costs effected by standardized metrics in the economy at large (Barzel, 1982; Benham and Benham, 2000) are likely to have a similarly profound effect on the economics of human, social, and natural capital (Fisher, 2011a, 2012a, 2012b).

The potential for dramatic change in the conceptualization of metrics is most evident in the proposed standard in the sections on leadership quality and employee engagement. For instance, in the section on leadership quality, it is stated that “Investors will be able to directly compare all organizations that are using the same vendor’s methodology.” This kind of dependency should not be allowed to stand as a significant factor in a measurement standard. Properly constructed and validated scientific measures, such as those that have been in wide use in education, psychology and health care for several decades (Andrich, 2010; Bezruczko, 2005; Bond and Fox, 2007; Fisher and Wright, 1994; Rasch, 1960; Salzberger, 2009; Wright, 1999), are equated to a common unit. Comparability should never depend on which vendor is used. Rather, any instrument that actually measures the construct of interest (leadership quality or employee engagement) should do so in a common unit and within an acceptable range of error. “Normalizing” measures for comparability, as is suggested in the standard, means employing psychometric methods that are 50 years out of date and that are far less rigorous and practical than need be. Transparency in measurement means looking through the instrument to the thing itself. If particular instruments color or reshape what is measured, or merely change the meaning of the numbers reported, then the integrity of the standard as a standard should be re-examined.
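
To make the idea of a common unit concrete, here is a minimal sketch in Python of how two different instruments can report in the same unit under the dichotomous Rasch model (Rasch, 1960). The item calibrations, response strings, and function names are all hypothetical and chosen only for illustration; the estimation shown is an ordinary Newton-Raphson maximum likelihood routine, not any vendor's or the standard's method.

```python
import math


def rasch_probability(measure, difficulty):
    """Probability of a positive response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(difficulty - measure))


def estimate_measure(responses, difficulties, tolerance=1e-8, max_iterations=100):
    """Maximum likelihood person measure (in logits) from scored 0/1 responses.

    `difficulties` are assumed to be item calibrations already expressed
    on the shared scale; the values used below are made up, not real data.
    Returns the measure and its standard error.
    """
    raw_score = sum(responses)
    count = len(responses)
    if raw_score == 0 or raw_score == count:
        raise ValueError("extreme score: no finite maximum likelihood estimate")
    measure = math.log(raw_score / (count - raw_score))  # starting value
    for _ in range(max_iterations):
        probabilities = [rasch_probability(measure, d) for d in difficulties]
        information = sum(p * (1.0 - p) for p in probabilities)
        step = (raw_score - sum(probabilities)) / information
        measure += step
        if abs(step) < tolerance:
            break
    # Recompute the information at the final estimate for the standard error.
    probabilities = [rasch_probability(measure, d) for d in difficulties]
    information = sum(p * (1.0 - p) for p in probabilities)
    return measure, 1.0 / math.sqrt(information)


# Two hypothetical instruments with different items, calibrated to one scale.
instrument_a = [-1.5, -0.5, 0.5, 1.5]
instrument_b = [-0.5, 0.5, 1.5, 2.5]  # harder items, same unit

measure_a, error_a = estimate_measure([1, 1, 0, 0], instrument_a)
measure_b, error_b = estimate_measure([1, 0, 0, 0], instrument_b)
print(f"Instrument A: {measure_a:.2f} logits (SE {error_a:.2f})")
print(f"Instrument B: {measure_b:.2f} logits (SE {error_b:.2f})")
```

The raw scores differ (2 of 4 versus 1 of 4) because the second instrument's items are harder, but both estimates are expressed in the same logit unit and can be compared directly in light of their standard errors. With only four items the errors are large; the sketch is meant to show the logic, not a usable instrument.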

Third, for investments in human capital to be effectively managed, each distinct aspect of it (motivations, skills and abilities, health) needs to be measured separately, just as height, weight, and temperature are. New technologies have already transformed measurement practices in ways that make the necessary processes precise and inexpensive. Of special interest are adaptively administered, precalibrated instruments supporting mass customized (but globally comparable) measures. For instance, see the examples at http://blog.lexile.com/tag/oasis/ and those presented at the recent Pearson Global Research Conference in Fremantle, Australia (http://www.pearson.com.au/marketing/corporate/pearson_global/default.html); also see Wright and Bell, 1984; Lunz, Bergstrom, and Gershon, 1994; Bejar et al., 2003.
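
As a rough illustration of what adaptive administration from a precalibrated item bank involves, the following Python sketch picks the next item by matching its calibration to the current estimate of the person's measure. Under the Rasch model, an item is most informative when its difficulty is closest to the person's measure. The bank contents, item names, and the `next_item` helper are hypothetical; operational systems add content balancing, exposure control, and stopping rules that are omitted here.

```python
def next_item(item_bank, current_measure, administered):
    """Return the unused item whose calibration is closest to the current estimate.

    `item_bank` maps hypothetical item names to difficulties in logits.
    Returns None once every item has been administered.
    """
    remaining = [name for name in item_bank if name not in administered]
    if not remaining:
        return None
    return min(remaining, key=lambda name: abs(item_bank[name] - current_measure))


# A made-up three-item bank, for illustration only.
item_bank = {"item_1": -1.0, "item_2": 0.2, "item_3": 1.5}

first = next_item(item_bank, 0.0, set())      # start from a provisional estimate
second = next_item(item_bank, 0.6, {first})   # estimate revised after a response
```

Because every item in the bank is calibrated on the same scale, two people can take entirely different items and still receive measures in the same unit, which is the sense in which such instruments are customized yet comparable.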

Fourth, the ownership of human capital needs clarification and legal status. If we consider each individual to own their abilities, health, and motivations, and to be solely responsible for decisions made concerning the disposition of those properties, then, in accord with their proven measured amounts of each type of human capital, everyone ought to have legal title to a specific number of shares or credits of each type. This may transform employment away from wage-based job classification compensation to an individualized investment-based continuous quality improvement platform. The same kind of legal titling system will, of course, need to be worked out for social and natural capital, as well.

Fifth, given scientific standards for each major form of capital, practical measurement technologies, and legal title to our shares of capital, we will need expanded financial accounting standards and tools for managing our individual and collective investments. Ongoing research and debates concerning these standards and tools (Siegel and Borgia, 2006; Young and Williams, 2010) have yet to connect with the larger scientific, economic, and legal issues raised here, but developments in this direction should be emerging in due course.

Sixth, a number of lingering moral, ethical and political questions are cast in a new light in this context. The significance of individual behaviors and decisions is informed and largely determined by the context of the culture and institutions in which those behaviors and decisions are executed. Many of the morally despicable but not illegal investment decisions leading to the recent economic downturn put individuals in the position of either setting themselves apart and threatening their careers or doing what was best for their portfolios within the limits of the law. Current efforts intended to devise new regulatory constraints are misguided in focusing on ever more microscopically defined particulars. What is needed is instead a system in which profits are contingent on the growth of human, social, and natural capital. In that framework, legal but ultimately unfair practices would drive down social capital stock values, counterbalancing ill-gotten gains and making them unprofitable.

Seventh, the International Vocabulary of Measurement, now in its third edition (VIM3), is a standard recognized by all eight international standards accrediting bodies (BIPM, etc.). The VIM3 (http://www.bipm.org/en/publications/guides/vim.html) and forthcoming VIM4 are intended to provide a uniform set of concepts and terms for all fields that employ measures across the natural and social sciences. A new dialogue on these issues has commenced in the context of the International Measurement Confederation (IMEKO), whose member organizations are the weights and standards measurement institutes from countries around the world (Conference note, 2011). The 2012 President of the Psychometric Society, Mark Wilson, gave an invited address at the September 2011 IMEKO meeting (Wilson, 2011), and a member of the VIM3 editorial board, Luca Mari, is invited to speak at the July, 2012 International Meeting of the Psychometric Society. I encourage all interested parties to become involved in efforts of these kinds in their own fields.

References

Andrich, D. (2010). Sufficiency and conditional estimation of person parameters in the polytomous Rasch model. Psychometrika, 75(2), 292-308.

Barzel, Y. (1982). Measurement costs and the organization of markets. Journal of Law and Economics, 25, 27-48.

Bejar, I., Lawless, R. R., Morley, M. E., Wagner, M. E., Bennett, R. E., & Revuelta, J. (2003, November). A feasibility study of on-the-fly item generation in adaptive testing. The Journal of Technology, Learning, and Assessment, 2(3), 1-29; http://ejournals.bc.edu/ojs/index.php/jtla/article/view/1663.

Benham, A., & Benham, L. (2000). Measuring the costs of exchange. In C. Ménard (Ed.), Institutions, contracts and organizations: Perspectives from new institutional economics (pp. 367-375). Cheltenham, UK: Edward Elgar.

Bezruczko, N. (Ed.). (2005). Rasch measurement in health sciences. Maple Grove, MN: JAM Press.

Bond, T., & Fox, C. (2007). Applying the Rasch model: Fundamental measurement in the human sciences, 2nd edition. Mahwah, New Jersey: Lawrence Erlbaum Associates.

Conference note. (2011). IMEKO Symposium: August 31-September 2, 2011, Jena, Germany. Rasch Measurement Transactions, 25(1), 1318.

Ekins, P. (1992). A four-capital model of wealth creation. In P. Ekins & M. Max-Neef (Eds.), Real-life economics: Understanding wealth creation (pp. 147-155). London: Routledge.

Ekins, P., Dresner, S., & Dahlstrom, K. (2008). The four-capital method of sustainable development evaluation. European Environment, 18(2), 63-80.

Fisher, W. P., Jr. (2007). Living capital metrics. Rasch Measurement Transactions, 21(1), 1092-3 [http://www.rasch.org/rmt/rmt211.pdf].

Fisher, W. P., Jr. (2009a). Invariance and traceability for measures of human, social, and natural capital: Theory and application. Measurement, 42(9), 1278-1287.

Fisher, W. P., Jr. (2009b). NIST Critical national need idea White Paper: metrological infrastructure for human, social, and natural capital (http://www.nist.gov/tip/wp/pswp/upload/202_metrological_infrastructure_for_human_social_natural.pdf). Washington, DC: National Institute for Standards and Technology.

Fisher, W. P., Jr. (2010). Rasch, Maxwell’s method of analogy, and the Chicago tradition. In G. Cooper (Chair), Probabilistic models for measurement in education, psychology, social science and health: Celebrating 50 years since the publication of Rasch’s Probabilistic Models, University of Copenhagen School of Business, FUHU Conference Centre, Copenhagen, Denmark [https://conference.cbs.dk/index.php/rasch/Rasch2010/paper/view/824].

Fisher, W. P., Jr. (2011a). Bringing human, social, and natural capital to life: Practical consequences and opportunities. In N. Brown, B. Duckor, K. Draney & M. Wilson (Eds.), Advances in Rasch Measurement, Vol. 2 (pp. 1-27). Maple Grove, MN: JAM Press.

Fisher, W. P., Jr. (2011b). Measurement, metrology and the coordination of sociotechnical networks. In S. Bercea (Chair), New Education and Training Methods. International Measurement Confederation (IMEKO), http://www.db-thueringen.de/servlets/DerivateServlet/Derivate-24491/ilm1-2011imeko-017.pdf, Jena, Germany.

Fisher, W. P., Jr. (2012a). Measure local, manage global: Intangible assets metric standards for sustainability. In J. Marques, S. Dhiman & S. Holt (Eds.), Business administration education: Changes in management and leadership strategies (in press). New York: Palgrave Macmillan.

Fisher, W. P., Jr. (2012b). What the world needs now: A bold plan for new standards. Standards Engineering, 64, in press.

Fisher, W. P., Jr., & Stenner, A. J. (2011a). Metrology for the social, behavioral, and economic sciences (Social, Behavioral, and Economic Sciences White Paper Series). Retrieved 25 October 2011, from National Science Foundation: http://www.nsf.gov/sbe/sbe_2020/submission_detail.cfm?upld_id=36.

Fisher, W. P., Jr., & Stenner, A. J. (2011b). A technology roadmap for intangible assets metrology. In Fundamentals of measurement science. International Measurement Confederation (IMEKO) TC1-TC7-TC13 Joint Symposium, http://www.db-thueringen.de/servlets/DerivateServlet/Derivate-24493/ilm1-2011imeko-018.pdf, Jena, Germany.

Fisher, W. P., Jr., & Wright, B. D. (Eds.). (1994). Applications of probabilistic conjoint measurement. International Journal of Educational Research, 21(6), 557-664.

Lunz, M. E., Bergstrom, B. A., & Gershon, R. C. (1994). Computer adaptive testing. International Journal of Educational Research, 21(6), 623-634.

Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests (Reprint, with Foreword and Afterword by B. D. Wright, Chicago: University of Chicago Press, 1980). Copenhagen, Denmark: Danmarks Paedagogiske Institut.

Salzberger, T. (2009). Measurement in marketing research: An alternative framework. Northampton, MA: Edward Elgar.

Siegel, P., & Borgia, C. (2006). The measurement and recognition of intangible assets. Journal of Business and Public Affairs, 1(1).

Wilson, M. (2011). The role of mathematical models in measurement: A perspective from psychometrics. In L. Mari (Chair), Plenary lecture. International Measurement Confederation (IMEKO), http://www.db-thueringen.de/servlets/DerivateServlet/Derivate-24178/ilm1-2011imeko-005.pdf, Jena, Germany.

Wright, B. D. (1999). Fundamental measurement for psychology. In S. E. Embretson & S. L. Hershberger (Eds.), The new rules of measurement: What every educator and psychologist should know (pp. 65-104 [http://www.rasch.org/memo64.htm]). Hillsdale, New Jersey: Lawrence Erlbaum Associates.

Wright, B. D., & Bell, S. R. (1984, Winter). Item banks: What, why, how. Journal of Educational Measurement, 21(4), 331-345 [http://www.rasch.org/memo43.htm].

Young, J. J., & Williams, P. F. (2010, August). Sorting and comparing: Standard-setting and “ethical” categories. Critical Perspectives on Accounting, 21(6), 509-521.

Creative Commons License
LivingCapitalMetrics Blog by William P. Fisher, Jr., Ph.D. is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Based on a work at livingcapitalmetrics.wordpress.com.
Permissions beyond the scope of this license may be available at http://www.livingcapitalmetrics.com.

2011 IMEKO Conference Papers Published Online

January 13, 2012

Papers from the Joint International IMEKO TC1+TC7+TC13 Symposium held August 31st to September 2nd, 2011, in Jena, Germany are now available online at http://www.db-thueringen.de/servlets/DerivateServlet/Derivate-24575/IMEKO2011_TOC.pdf. The following will be of particular interest to those interested in measurement applications in the social sciences, education, health care, and psychology:

Nikolaus Bezruczko
Foundational Imperatives for Measurement with Mathematical Models
http://www.db-thueringen.de/servlets/DerivateServlet/Derivate-24419/ilm1-2011imeko-030.pdf

Nikolaus Bezruczko, Shu-Pi C. Chen, Connie Hill, Joyce M. Chesniak
A Clinical Scale for Measuring Functional Caregiving of Children Assisted with Medical Technologies
http://www.db-thueringen.de/servlets/DerivateServlet/Derivate-24507/ilm1-2011imeko-032.pdf

Stefan Cano, Anne F. Klassen, Andrea L. Pusic, Andrea
From Breast-Q © to Q-Score ©: Using Rasch Measurement to Better Capture Breast Surgery Outcomes
http://www.db-thueringen.de/servlets/DerivateServlet/Derivate-24429/ilm1-2011imeko-039.pdf

Gordon A. Cooper, William P. Fisher, Jr.
Continuous Quantity and Unit; Their Centrality to Measurement
http://www.db-thueringen.de/servlets/DerivateServlet/Derivate-24494/ilm1-2011imeko-019.pdf

William P. Fisher, Jr.
Measurement, Metrology and the Coordination of Sociotechnical Networks
http://www.db-thueringen.de/servlets/DerivateServlet/Derivate-24491/ilm1-2011imeko-017.pdf

William P. Fisher, Jr., A. Jackson Stenner
A Technology Roadmap for Intangible Assets Metrology
http://www.db-thueringen.de/servlets/DerivateServlet/Derivate-24493/ilm1-2011imeko-018.pdf

Carl V. Granger, Nikolaus Bezruczko
Body, Mind, and Spirit are Instrumental to Functional Health: A Case Study
http://www.db-thueringen.de/servlets/DerivateServlet/Derivate-24494/ilm1-2011imeko-019.pdf

Thomas Salzberger
The Quantification of Latent Variables in the Social Sciences: Requirements for Scientific Measurement and Shortcomings of Current Procedures
http://www.db-thueringen.de/servlets/DerivateServlet/Derivate-24417/ilm1-2011imeko-029.pdf

A. Jackson Stenner, Mark Stone, Donald Burdick
How to Model and Test for the Mechanisms that Make Measurement Systems Tick
http://www.db-thueringen.de/servlets/DerivateServlet/Derivate-24416/ilm1-2011imeko-027.pdf

Mark Wilson
The Role of Mathematical Models in Measurement: A Perspective from Psychometrics
http://www.db-thueringen.de/servlets/DerivateServlet/Derivate-24178/ilm1-2011imeko-005.pdf

Also of interest will be Karl Ruhm’s plenary lecture and papers from the Fundamentals of Measurement Science session and the Special Session on the Role of Mathematical Models in Measurement:

Karl H. Ruhm
From Verbal Models to Mathematical Models – A Didactical Concept not just in Metrology
http://www.db-thueringen.de/servlets/DerivateServlet/Derivate-24167/ilm1-2011imeko-002.pdf

Alessandro Giordani, Luca Mari
Quantity and Quantity Value
http://www.db-thueringen.de/servlets/DerivateServlet/Derivate-24414/ilm1-2011imeko-025.pdf

Eric Benoit
Uncertainty in Fuzzy Scales Based Measurements
http://www.db-thueringen.de/servlets/DerivateServlet/Derivate-24415/ilm1-2011imeko-020.pdf

Susanne C.N. Töpfer
Application of Mathematical Models in Optical Coordinate Metrology
http://www.db-thueringen.de/servlets/DerivateServlet/Derivate-24445/ilm1-2011imeko-008.pdf

Giovanni Battista Rossi
Measurement Modelling: Foundations and Probabilistic Approach
http://www.db-thueringen.de/servlets/DerivateServlet/Derivate-24446/ilm1-2011imeko-009.pdf

Sanowar H. Khan, Ludwik Finkelstein
The Role of Mathematical Modelling in the Analysis and Design of Measurement Systems
http://www.db-thueringen.de/servlets/DerivateServlet/Derivate-24448/ilm1-2011imeko-010.pdf

Roman Z. Morawski
Application-Oriented Approach to Mathematical Modelling of Measurement Processes
http://www.db-thueringen.de/servlets/DerivateServlet/Derivate-24449/ilm1-2011imeko-011.pdf

Reimagining Capitalism Again, Part II: Scientific Credibility in Improving Information Quality

September 10, 2011

The previous posting here concluded with two questions provoked by a close consideration of a key passage in William Greider’s 2003 book, The Soul of Capitalism. First, how do we create the high quality, solid information markets need to punish and reward relative to ethical and sustainable human, social, and environmental values? Second, what can we learn from the way we created that kind of information for property and manufactured capital? There are good answers to these questions, answers that point in productive directions in need of wide exploration and analysis.

The short answer to both questions is that better, more scientifically rigorous measurement at the local level needs to be implemented in a context of traceability to universally uniform standards. To think global and act local simultaneously, we need an efficient and transparent way of seeing where we stand in the world relative to everyone else. Having measures expressed in comparable and meaningful units is an important part of how we think global while acting local.

So, for markets to punish and reward businesses in ways able to build human, social, and environmental value, we need to be able to price that value, to track returns on investments in it, and to own shares of it. To do that, we need a new intangible assets metric system that functions in a manner analogous to the existing metric system and other weights and measures standards. In the same way these standards guarantee high quality information on volume, weight, thermal units, and volts in grocery stores and construction sites, we need a new set of standards for human abilities, performances, and health; for social trust, commitment, and loyalty; and for the environment’s air and water processing services, fisheries, gene pools, etc.

Each industry needs an instrumentarium of tools and metrics that mediate relationships universally within its entire sphere of production and/or service. The obvious and immediate reaction to this proposal will likely be that this is impossible, that it would have been done by now if it were possible, and that anyone who proposes something like this is simply unrealistic, perhaps dangerously so. So, here we have another reason to add to those given in the June 8, 2011 issue of The Nation (http://www.thenation.com/article/161267/reimagining-capitalism-bold-ideas-new-economy) as to why bold ideas for a new economy cannot gain any traction in today’s political discourse.

So what basis in scientific authority might be found for this audacious goal of an intangible assets metric system? This blog’s postings offer multiple varieties of evidence and argument in this regard, so I’ll stick to more recent developments, namely, last week’s meeting of the International Measurement Confederation (IMEKO) in Jena, Germany. Membership in IMEKO is dominated by physicists, engineers, chemists, and clinical laboratorians who work in private industry, academia, and government weights and measures standards institutes.

Several IMEKO members past and present are involved with one or more of the seven or eight major international standards organizations responsible for maintaining and improving the metric system (the Système International d’Unités). Two initiatives undertaken by IMEKO and these standards organizations take up the matter at issue here concerning the audacious goal of standard units for human, social, and natural capital.

First, the recently released third edition of the International Vocabulary of Measurement (VIM, 2008) expands the range of the concepts and terms included to encompass measurement in the human and social sciences. This first effort was not well informed as to the nature of widely realized state of the art developments in measurement in education, health care, and the social sciences. What is important is that an invitation to further dialogue has been extended from the natural to the social sciences.

That invitation was unintentionally accepted and a second initiative advanced just as the new edition of the VIM was being released, in 2008. Members of three IMEKO technical committees (TC 1-7-13; those on Measurement Science, Metrology Education, and Health Care) cultivate a special interest in ideas on the human and social value of measurement. At their 2008 meeting in Annecy, France, I presented a paper (later published in revised form as Fisher, 2009) illustrating how, over the previous 50 years and more, the theory and practice of measurement in the social sciences had developed in ways capable of supporting convenient and useful universally uniform units for human, social, and natural capital.

The same argument was then advanced by my fellow University of Chicago alum, Nikolaus Bezruczko, at the 2009 IMEKO World Congress in Lisbon. Bezruczko and I both spoke at the 2010 TC 1-7-13 meeting in London, and last week our papers were joined by presentations from six of our colleagues at the 2011 IMEKO TC 1-7-13 meeting in Jena, Germany. Another fellow U Chicagoan, Mark Wilson, a longtime professor in the Graduate School of Education at the University of California, Berkeley, gave an invited address contrasting four basic approaches to measurement in psychometrics, and emphasizing the value of methods that integrate substantive meaning with mathematical rigor.

Examples from education, health care, and business were then elucidated at this year’s meeting in Jena by myself, Bezruczko, Stefan Cano (University of Plymouth, England), Carl Granger (SUNY, Buffalo; paper presented by Bezruczko, a co-author), Thomas Salzberger (University of Vienna, Austria), Jack Stenner (MetaMetrics, Inc., Durham, NC, USA), and Gordon Cooper (University of Western Australia, Crawley, WA, Australia; paper presented by Fisher, a co-author).

The contrast between these presentations and those made by the existing IMEKO membership hinges on two primary differences in focus. The physicists and engineers take it for granted that all instrument calibration involves traceability to metrological reference standards. Dealing as they are with existing standards and physical or chemical materials that usually possess deterministically structured properties, issues of how to construct linear measures from ordinal observations never come up.

Conversely, the social scientists and psychometricians take it for granted that all instrument calibration involves evaluations of the capacity of ordinal observations to support the construction of linear measures. Dealing as they are with data from tests, surveys, and rating scale assessments, issues of how to relate a given instrument’s unit to a reference standard never come up.

Thus there is significant potential for mutually instructive dialogue between natural and social scientists in this context. Many areas of investigation in the natural sciences have benefited from the introduction of probabilistic concepts in recent decades, but there are perhaps important unexplored opportunities for the application of probabilistic measurement, as opposed to statistical, models. By taking advantage of probabilistic models’ special features, measurement in education and health care has begun to realize the benefit of broad generalizations of comparable units across grades, schools, tests, and curricula.
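
For readers unfamiliar with the distinction, one probabilistic measurement model widely used in this literature is Rasch’s model for dichotomous observations. A brief sketch follows, with symbols chosen here only for illustration: B_n is the measure of person n, D_i is the calibration of item i, and P_ni is the probability that person n succeeds on item i.

```latex
\ln\frac{P_{ni}}{1-P_{ni}} = B_n - D_i
\quad\Longrightarrow\quad
\ln\frac{P_{ni}}{1-P_{ni}} - \ln\frac{P_{mi}}{1-P_{mi}} = B_n - B_m
```

The item term cancels from the comparison of persons n and m, so under the model the difference between them does not depend on which item was used. When data fit the model, this separation is what allows different tests to report in one comparable unit.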

Though the focus of my interest here is in the capacity of better measurement to improve the efficiency of human, social, and natural capital markets, it may turn out that as many or more benefits will accrue in the natural sciences’ side of the conversation as in the social sciences’ side. The important thing for the time being is that the dialogue is started. New and irreversible mutual understandings between natural and social scientists have already been put on the record. It may happen that the introduction of a new supply of improved human, social, and natural capital metrics will help articulate the largely, as yet, unstated but nonetheless urgent demand for them.

Fisher, W. P., Jr. (2009, November). Invariance and traceability for measures of human, social, and natural capital: Theory and application. Measurement, 42(9), 1278-1287.

Reimagining Capitalism Again, Part I: Reflections on Greider’s Soul of Capitalism

September 10, 2011

In his 2003 book, The Soul of Capitalism, William Greider wrote, “If capitalism were someday found to have a soul, it would probably be located in the mystic qualities of capital itself” (p. 94). The recurring theme in the book is that the resolution of capitalism’s deep conflicts must grow out as organic changes from the roots of capitalism itself.

In the book, Greider quotes Innovest’s Michael Kiernan as suggesting that the goal has to be re-engineering the DNA of Wall Street (p. 119). He says the key to doing this is good reliable information that has heretofore been unavailable but which will make social and environmental issues matter financially. The underlying problems of exactly what solid, high quality information looks like, where it comes from, and how it is created are not stated or examined, but the point, as Kiernan says, is that “the markets are pretty good at punishing and rewarding.” The objective is to use “the financial markets as an engine of reform and positive change rather than destruction.”

This objective is, of course, the focus of multiple postings in this blog (see especially this one and this one). From my point of view, capitalism indeed does have a soul and it is actually located in the qualities of capital itself. Think about it: if a soul is a spirit of something that exists independent of its physical manifestation, then the soul of capitalism is the fungibility of capital. Now, this fungibility is complex and ambiguous. It takes its strength and practical value from the way market exchanges are represented in terms of currencies, monetary units that, within some limits, provide an objective basis of comparison useful for rewarding those capable of matching supply with demand.

But the fungibility of capital can also be dangerously misconceived when the rich complexity and diversity of human capital is unjustifiably reduced to labor, when the irreplaceable value of natural capital is unjustifiably reduced to land, and when the trust, loyalty, and commitment of social capital is completely ignored in financial accounting and economic models. As I’ve previously said in this blog, the concept of human capital is inherently immoral insofar as it reduces real human beings to interchangeable parts in an economic machine.

So how could it ever be possible to justify any reduction of human, social, and natural value to a mere number? Isn’t this the ultimate in the despicable inhumanity of economic logic, corporate decision making, and, ultimately, the justification of greed? Many among us who profess liberal and progressive perspectives seem to have an automatic and reactionary prejudice of this kind. This makes these well-intentioned souls as much a part of the problem as those among us with sometimes just as well-intentioned perspectives that accept such reductionism as the price of entry into the game.

There is another way. Human, social, and natural value can be measured and made manageable in ways that do not necessitate totalizing reduction to a mere number. The problem is not reduction itself, but unjustified, totalizing reduction. Referring to all people as “man” or “men” is an unjustified reduction dangerous in the way it focuses attention only on males. The tendency to think and act in ways privileging males over females that is fostered by this sense of “man” shortchanges us all, and has happily been largely eliminated from discourse.

Making language more inclusive does not, however, mean that words lose the singular specificity they need to be able to refer to things in the world. Any given word represents an infinite population of possible members of a class of things, actions, and forms of life. Any simple sentence combining words into a coherent utterance then multiplies infinities upon infinities. Discourse inherently reduces multiplicities into texts of limited lengths.

Like any tool, reduction has its uses. Also like any tool, problems arise when the tool is allowed to occupy some hidden and unexamined blind spot from which it can dominate and control the way we think about everything. Critical thinking is most difficult in those instances in which the tools of thinking themselves need to be critically evaluated. To reject reduction uncritically as inherently unjustified is to throw the baby out with the bathwater. Indeed, it is impossible to formulate a statement of the rejection without simultaneously enacting exactly what is supposed to be rejected.

We have numerous ready-to-hand examples of how all reduction has been unjustifiably reduced to one homogenized evil. But one of the results of experiments in communal living in the 1960s and 1970s, as well as of the fall of the Soviet Union, was the realization that the centralized command and control of collectively owned community property cannot compete with the creativity engendered when individuals hold legal title to the fruits of their labors. If individuals cannot own the results of the investments they make, no one makes any investments.

In other words, if everything is owned collectively and is never reduced to individually possessed shares that can be creatively invested for profitable returns, then the system is structured so as to punish innovation and reward doing as little as possible. But there’s another way of thinking about the relation of the collective to the individual. The living soul of capitalism shows itself in the way high quality information makes it possible for markets to efficiently coordinate and align individual producers’ and consumers’ collective behaviors and decisions. What would happen if we could do that for human, social, and natural capital markets? What if “social capitalism” is more than an empty metaphor? What if capital institutions can be configured so that individual profit really does become the driver of socially responsible, sustainable economics?

And here we arrive at the crux of the problem. How do we create the high quality, solid information markets need to punish and reward relative to ethical and sustainable human, social, and environmental values? Well, what can we learn from the way we created that kind of information for property and manufactured capital? These are the questions taken up and explored in the postings in this blog, and in my scientific research publications and meeting presentations. In the near future, I’ll push my reflection on these questions further, and will explore some other possible answers to the questions offered by Greider and his readers in a recent issue of The Nation.

Creative Commons License
LivingCapitalMetrics Blog by William P. Fisher, Jr., Ph.D. is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Based on a work at livingcapitalmetrics.wordpress.com.
Permissions beyond the scope of this license may be available at http://www.livingcapitalmetrics.com.

New Opportunities for Job Creation and Prosperity

August 17, 2011

What can be done to create jobs and revive the economy? There is no simple, easy answer to this question. Creating busywork is nonsense. We need fulfilling occupations that meet the world’s demand for products and services. It is not easy to see how meaningful work can be systematically created on a broad scale. New energy efficiencies may lead to the cultivation of significant job growth, but it may be unwise to put all of our eggs in this one basket.

So how are we to solve this puzzle? What other areas in the economy might be ripe for the introduction of a new technology capable of supporting a wave of new productivity, like computers did in the 1980s, or the Internet in the 1990s? In trying to answer this question, simplicity and elegance are key factors in keeping things at a practical level.

For instance, we know we accomplish more working together as a team than as disconnected individuals. New jobs, especially new kinds of jobs, will have to be created via innovation. Innovation in science and industry is a team sport. So the first order of business in teaming up for job creation is to know the rules of the game. The economic game is played according to the rules of law embodied in property rights, scientific rationality, capital markets, and transportation/communications networks (see William Bernstein’s 2004 book, The Birth of Plenty). When these conditions are met, as they were in Europe and North America at the beginning of the nineteenth century, the stage is set for long term innovation and growth on a broad scale.

The second order of business is to identify areas in the economy that lack one or more of these four conditions, and that could reasonably be expected to benefit from their introduction. Education, health care, social services, and environmental management come immediately to mind. These industries are plagued with seemingly interminable inflationary spirals, which, no doubt, are at least in part caused by the inability of investors to distinguish between high and low performers. Money cannot flow to and reward programs producing superior results in these industries because they lack common product definitions and comparable measures of their results.

The problems these industries are experiencing are not specific to each of them in particular. Rather, the problem is a general one applicable across all industries, not just these. Traditionally, economic thinking focuses on three main forms of capital: land, labor, and manufactured products (including everything from machines, roads, and buildings to food, clothing, and appliances). Cash and credit are often thought of as liquid capital, but their economic value stems entirely from the access they provide to land, labor, and manufactured products.

Economic activity is not really, however, restricted to these three forms of capital. Land is far more than a piece of ground. What are actually at stake are the earth’s regenerative ecosystems, with the resources and services they provide. And labor is far more than a pair of skilled hands; people bring a complex mix of abilities, motivations, and health to bear in their work. Finally, this scheme lacks an essential element: the trust, loyalty, and commitment required for even the smallest economic exchange to take place. Without social capital, all the other forms of capital (human, natural, and manufactured, including property) are worthless. Consistent, sustainable, and socially responsible economic growth requires that all four forms of capital be made accountable in financial spreadsheets and economic models.

The third order of business, then, is to ask if the four conditions laying out the rules for the economic game are met in each of the four capital domains. The table below suggests that all four conditions are fully met only for manufactured products. They are partially met for natural resources, such as minerals, timber, fisheries, etc., but not at all for nature’s air and water purification systems or broader genetic ecosystem services.

Table. Existing Conditions Relevant to Conceiving a New Birth of Plenty, by Capital Domains

Condition                                  Human     Social    Natural   Manufactured
Property rights                            No        No        Partial   Yes
Scientific rationality                     Partial   Partial   Partial   Yes
Capital markets                            Partial   Partial   Partial   Yes
Transportation & communication networks    Partial   Partial   Partial   Yes

That is, no provisions exist for individual ownership of shares in the total available stock of air and water, or of forest, watershed, estuary, and other ecosystem service outcomes. Nor do any individuals have free and clear title to their most personal properties, the intangible abilities, motivations, health, and trust most essential to their economic productivity. Aggregate statistics are indeed commonly used to provide a basis for policy and research in human, social, and natural capital markets, but falsifiable models of individually applicable unit quantities are not widely applied. Scientifically rational measures of our individual stocks of intangible asset value will require extensive use of these falsifiable models in calibrating the relevant instrumentation.

Without such measures, we cannot know how many shares of stock in these forms of capital we own, or what they are worth in dollar terms. We lack these measures, even though decades have passed since researchers first established firm theoretical and practical foundations for them. And more importantly, even when scientifically rational individual measures can be obtained, they are never expressed in terms of a unit standardized for use within a given market’s communications network.

So what are the consequences for teams playing the economic game? The individual decisions and behaviors of high performance teams can be harmonized only when unit amounts, prices, and costs are universally comparable and publicly available; that kind of coordination cannot be achieved in any other way. This is why standard currencies and exchange rates are so important.

And right here we have an insight into what we can do to create jobs. New jobs are likely going to have to be new kinds of jobs resulting from innovations. As has been detailed at length in recent works such as Surowiecki’s 2004 book, The Wisdom of Crowds, innovation in science and industry depends on standards. Standards are common languages that enable us to multiply our individual cognitive powers into new levels of collective productivity. Weights and measures standards are like monetary currencies; they coordinate the exchange of value in laboratories and businesses in the same way that dollars do in the US economy.

Applying Bernstein’s four conditions for economic growth to intangible assets, we see that a long term program for job creation then requires

  1. legislation establishing human, social, and natural capital property rights, and an Intangible Assets Metrology System;
  2. scientific research into consensus standards for measuring human, social, and natural capital;
  3. venture capital educational and marketing programs; and
  4. distributed information networks and computer applications through which investments in human, social, and natural capital can be tracked and traded in accord with the rule of law governing property rights and in accord with established consensus standards.

Of these four conditions, Bernstein (p. 383) points to property rights as being the most difficult to establish, and the most important for prosperity. Scientific results are widely available in online libraries. Capital can be obtained from investors anywhere. Transportation and communications services are available commercially.

But valid and verifiable means of representing legal title to privately owned property is a problem often not yet solved even for real estate in many Third World and former communist countries (see De Soto’s 2000 book, The Mystery of Capital). Creating systems for knowing the quality and quantity of educational, health care, social, and environmental service outcomes is going to be a very difficult process. It will not be impossible, however, and having the problem identified advances us significantly towards new economic possibilities.

We need leaders able and willing to formulate audacious goals for new economic growth from ideas such as these. We need enlightened visionaries able to see our potentials from a new perspective, and who can reflect our new self-image back at us. When these leaders emerge—and they will, somewhere, somehow—the imaginations of millions of entrepreneurial thinkers and actors will be fired, and new possibilities will unfold.


Science, Public Goods, and the Monetization of Commodities

August 13, 2011

Though I haven’t read Philip Mirowski’s new book yet (Science-Mart: Privatizing American Science. Cambridge, MA: Harvard University Press, 2011), a statement in the cover blurb given at Amazon.com got me thinking. I can’t help but wonder if there is another way of interpreting neoliberal ideology’s “radically different view of knowledge and discovery: [that] the fruits of scientific investigation are not a public good that should be freely available to all, but are commodities that could be monetized”?

Corporations and governments are not the only ones investing in research and new product development, and they are not the only ones who could benefit from the monetization of the fruits of scientific investigation. Individuals make these investments as well, and despite ostensible rights to private ownership, no individuals anywhere have access to universally comparable, uniformly expressed, and scientifically valid information on the quantity or quality of the literacy, health, community, or natural capital that is rightfully theirs. Accordingly, they also lack any form of demonstrable legal title to these properties. In the same way that corporations have successfully advanced their economic interests by seeing that patent and intellectual property laws were greatly strengthened, so, too, ought individuals and communities to advance their economic interests, first, by expanding the scope of weights and measures standards to include intangible assets, and second, by strengthening laws related to the ownership of privately held stocks of living capital.

The nationalist and corporatist socialization of research will continue only as long as social capital, human capital, and natural capital are not represented in the universally uniform common currencies and transparent media that could be provided by an intangible assets metric system. When these forms of capital are brought to economic life in fungible measures akin to barrels, bushels, or kilowatts, then they will be monetized commodities in the full capitalist sense of the term, ownable and purchasable products with recognizable standard definitions, uniform quantitative volumes, and discernable variations in quality. Then, and only then, will individuals gain economic control over their most important assets. Then, and only then, will we obtain the information we need to transform education, health care, social services, and human and natural resource management into industries in which quality is appropriately rewarded. Then, and only then, will we have the means for measuring genuine progress and authentic wealth in ways that correct the insufficiencies of the GNP/GDP indexes.

The creation of efficiently functioning markets for all forms of capital is an economic, political, and moral necessity (see Ekins, 1992 and others). We say we manage what we measure, but very little effort has been put into measuring (with scientific validity and precision in universally uniform and accessible aggregate terms) 90% of the capital resources under management: human abilities, motivations, and health; social commitment, loyalty, and trust; and nature’s air and water purification and ecosystem services (see Hawken, Lovins, & Lovins, 1999, among others). All human suffering, sociopolitical discontent, and environmental degradation are rooted in the same common cause: waste (see Hawken, et al., 1999). To apply lean thinking to removing the wasteful destruction of our most valuable resources, we must measure these resources in ways that allow us to coordinate and align our decisions and behaviors virtually, at a distance, with no need for communicating and negotiating the local particulars of the hows and whys of our individual situations. For more information on these ideas, search “living capital metrics” and see works like the following:

Ekins, P. (1992). A four-capital model of wealth creation. In P. Ekins & M. Max-Neef (Eds.), Real-life economics: Understanding wealth creation (pp. 147-15). London: Routledge.

Fisher, W. P., Jr. (2009). Invariance and traceability for measures of human, social, and natural capital: Theory and application. Measurement, 42(9), 1278-1287.

Hawken, P., Lovins, A., & Lovins, H. L. (1999). Natural capitalism: Creating the next industrial revolution. New York: Little, Brown, and Co.

Latour, B. (1987). Science in action: How to follow scientists and engineers through society. New York: Cambridge University Press.

Latour, B. (2005). Reassembling the social: An introduction to Actor-Network-Theory. (Clarendon Lectures in Management Studies). Oxford, England: Oxford University Press.

Miller, P., & O’Leary, T. (2007). Mediating instruments and making markets: Capital budgeting, science and the economy. Accounting, Organizations, and Society, 32(7-8), 701-34.


Debt, Revenue, and Changing the Way Washington Works: The Greatest Entrepreneurial Opportunity of Our Time

July 30, 2011

“Holding the line” on spending and taxes does not make for a fundamental transformation of the way Washington works. Simply doing less of one thing is just a small quantitative change that does nothing to build positive results or set a new direction. What we need is a qualitative metamorphosis akin to a caterpillar becoming a butterfly. In contrast with this beautiful image of natural processes, the arguments and so-called principles being invoked in the sham debate that’s going on are nothing more than fights over where to put deck chairs on the Titanic.

What sort of transformation is possible? What kind of a metamorphosis will start from who and where we are, but redefine us sustainably and responsibly? As I have repeatedly explained in this blog, my conference presentations, and my publications, with numerous citations of authoritative references, we already possess all of the elements of the transformation. We have only to organize and deploy them. Of course, discerning what the resources are and how to put them together is not obvious. And though I believe we will do what needs to be done when we are ready, it never hurts to prepare for that moment. So here’s another take on the situation.

Infrastructure that supports lean thinking is the name of the game. Lean thinking focuses on identifying and removing waste. Anything that consumes resources but does not contribute to the quality of the end product is waste. We have enormous amounts of wasteful inefficiency in many areas of our economy. These inefficiencies are concentrated in areas in which management is hobbled by low quality information, where we lack the infrastructure we need.

Providing and capitalizing on this infrastructure is The Greatest Entrepreneurial Opportunity of Our Time. Changing the way Washington (ha! I just typed “Wastington”!) works is the same thing as mitigating the sources of risk that caused the current economic situation. Making government behave more like a business requires making the human, social, and natural capital markets more efficient. Making those markets more efficient requires reducing the costs of transactions. Those costs are determined in large part by information quality, which is a function of measurement.

It is often said that the best way to reduce the size of government is to move the functions of government into the marketplace. But this proposal has never been associated with any sense of the infrastructural components needed to really make the idea work. Simply reducing government without an alternative way of performing its functions is irresponsible and destructive. And many of those who rail on and on about how bad or inefficient government is fail to recognize that the government is us. We get the government we deserve. The government we get follows directly from the kind of people we are. Government embodies our image of ourselves as a people. In the US, this is what having a representative form of government means. “We the people” participate in our society’s self-governance not just by voting, writing letters to Congress, or demonstrating, but in the way we spend our money, where we choose to live, work, and go to school, and in every decision we make. No one can take a breath of air, a drink of water, or a bite of food without trusting everyone else to not carelessly or maliciously poison them. No one can buy anything or drive down the street without expecting others to behave in predictable ways that ensure order and safety.

But we don’t just trust blindly. We have systems in place to guard against those who would ruthlessly seek to gain at everyone else’s expense. And systems are the point. No individual person or firm, no matter how rich, could afford to set up and maintain the systems needed for checking and enforcing air, water, food, and workplace safety measures. Society as a whole invests in the infrastructure of measures created, maintained, and regulated by the government’s Department of Commerce and the National Institute of Standards and Technology (NIST). The moral importance and the economic value of measurement standards have been stressed historically over many millennia, from the Bible and the Quran to the Magna Carta and the French Revolution to the US Constitution. Uniform weights and measures are universally recognized and accepted as essential to fair trade.

So how is it that we nonetheless apparently expect individuals and local organizations like schools, businesses, and hospitals to measure and monitor students’ abilities; employees’ skills and engagement; patients’ health status, functioning, and quality of care; etc.? Why do we not demand common currencies for the exchange of value in human, social, and natural capital markets? Why don’t we as a society compel our representatives in government to institute the will of the people and create new standards for fair trade in education, health care, social services, and environmental management?

Measuring better is not just a local issue! It is a systemic issue! When measurement is objective and when we all think together in the common language of a shared metric (like hours, volts, inches or centimeters, ounces or grams, degrees Fahrenheit or Celsius, etc.), then and only then do we have the means we need to implement lean strategies and create new efficiencies systematically. We need an Intangible Assets Metric System.

The current recession in large part was caused by failures in measuring and managing trust, responsibility, loyalty, and commitment. Similar problems in measuring and managing human, social, and natural capital have led to endlessly spiraling costs in education, health care, social services, and environmental management. The problems we’re experiencing in these areas are intimately tied up with the way we formulate and implement group level decision making processes and policies based in statistics when what we need is to empower individuals with the tools and information they need to make their own decisions and policies. We will not and cannot metamorphose from caterpillar to butterfly until we create the infrastructure through which we each can take full ownership and control of our individual shares of the human, social, and natural capital stock that is rightfully ours.

We well know that we manage what we measure. What counts gets counted. Attention tends to be focused on what we’re accountable for. But–and this is vitally important–many of the numbers called measures do not provide the information we need for management. And not only are lots of numbers giving us low quality information, there are far too many of them! We could have better and more information from far fewer numbers.

Previous postings in this blog document the fact that we have the intellectual, political, scientific, and economic resources we need to measure and manage human, social, and natural capital for authentic wealth. And the issue is not a matter of marshaling the will. It is hard to imagine how there could be more demand for better management of intangible assets than there is right now. The problem in meeting that demand is a matter of imagining how to start the ball rolling. What configuration of investments and resources will start the process of bursting open the chrysalis? How will the demand for meaningful mediating instruments be met in a way that leads to the spreading of the butterfly’s wings? It is an exciting time to be alive.


Translating Gingrich’s Astute Observations on Health Care

June 30, 2011

“At the very heart of transforming health and healthcare is one simple fact: it will require a commitment by the federal government to invest in science and discovery. The period between investment and profit for basic research is too long for most companies to ever consider making the investment. Furthermore, truly basic research often produces new knowledge that everyone can use, so there is no advantage to a particular company to make the investment. The result is that truly fundamental research is almost always a function of government and foundations because the marketplace discourages focusing research in that direction” (p. 169 in Gingrich, 2003).

Gingrich says this while recognizing (p. 185) that:

“Money needs to be available for highly innovative ‘out of the box’ science. Peer review is ultimately a culturally conservative and risk-averse model. Each institution’s director should have a small amount of discretionary money, possibly 3% to 5% of their budget, to spend on outliers.”

He continues (p. 170), with some important elaborations on the theme:

“America’s economic future is a direct function of our ability to take new scientific research and translate it into entrepreneurial development.”

“The [Hart/Rudman] Commission’s second conclusion was that the failure to invest in scientific research and the failure to reform math and science education was the second largest threat to American security [behind terrorism].”

“Our goal [in the Hart/Rudman Commission] was to communicate the centrality of the scientific endeavor to American life and the depth of crisis we believe threatens the math and science education system. The United States’ ability to lead today is a function of past investments in scientific research and math and science education. There is no reason today to believe we will automatically maintain that lead especially given our current investments in scientific research and the staggering levels of our failures in math and science education.”

“Our ability to lead in 2025 will be a function of current decisions. Increasing our investment in science and discovery is a sound and responsible national security policy. No other federal expenditure will do more to create jobs, grow wealth, strengthen our world leadership, protect our environment, promote better education, or ensure better health for the country. We must make this increase now.”

On p. 171, this essential point is made:

“In health and healthcare, it is particularly important to increase our investment in research.”

This is all good. I agree completely. What Gingrich (NG in what follows) says is probably more true than he realizes, in four ways.

First, the scientific capital created via metrology, controlled via theory, and embodied in technological instruments is the fundamental driver of any economy. The returns on investments in metrological improvements range from 40% to over 400% (NIST, 1996). We usually think of technology and technical standards in terms of computers, telecommunications, and electronics, but there actually is not anything at all in our lives untouched by metrology, since the air, water, food, clothing, roads, buildings, cars, appliances, etc. are all monitored, maintained, and/or manufactured relative to various kinds of universally uniform standards. NG is, as most people are, completely unaware that such standards are feasible and already under development for health, functionality, quality of life, quality of care, math and science education, etc. Given the huge ROIs associated with metrological improvements, there ought to be proportionately huge investments being made in metrology for human, social, and natural capital.

Second, NG’s point concerning national security is right on the mark, though for reasons that go beyond the ones he gives. There are very good reasons for thinking investments in, and meaningful returns from, the basic science for human, social, and natural capital metrology could be expected to undercut the motivations for terrorism and the retreats into fundamentalisms of various kinds that emerge in the face of the failures of liberal democracy (Marty, 2001). Making all forms of capital measured, managed, and accountable within a common framework accessible to everyone everywhere could be an important contributing factor, emulating the property titling rationale of De Soto (1989, 2000) and the support for distributed cognition at the social level provided by metrological networks (Latour, 1987, 2005; Magnus, 2007). The costs of measurement can be so high as to stifle whole economies (Barzel, 1982), which is, broadly speaking, the primary problem with the economies of education, health care, social services, philanthropy, and environmental management (see, for instance, regarding philanthropy, Goldberg, 2009). Building the legal and financial infrastructure for low-friction titling and property exchange has become a basic feature of World Bank and IMF projects. My point, ever since I read De Soto, has been that we ought to be doing the same thing for human, social, and natural capital, facilitating explicit ownership of the skills, motivations, health, trust, and environmental resources that are rightfully the property of each of us, and that similar effects on national security ought to follow.

Third, NG makes an excellent point when he stresses the need for health and healthcare to be individual-centered, saying that, in contrast with the 20th-century healthcare system, “In the 21st Century System of Health and Healthcare, you will own your medical record, control your healthcare dollars, and be able to make informed choices about healthcare providers.” This is basically equivalent to saying that health capital needs to be fungible, and it can’t be fungible, of course, without a metrological infrastructure that makes every measure of outcomes, quality of life, etc. traceable to a reference standard. Individual-centeredness is also, of course, what distinguishes proper measurement from statistics. Measurement supports inductive inference, from the individual to the population, where statistics are deductive, going from the population to the individual (Fisher & Burton, 2010; Fisher, 2010). Individual-centered healthcare will never go anywhere without properly calibrated instrumentation and the traceability to reference standards that makes measures meaningful.

Fourth, NG repeatedly indicates how appalled he is at the slow pace of change in healthcare, citing research showing that it can take up to 17 years for doctors to adopt new procedures. I contend that this is an effect of our micromanagement of dead, concrete forms of capital. In a fluid living capital market, not only will consumers be able to reward quality in their purchasing decisions by having the information they need when they need it and in a form they can understand, but the quality improvements will be driven from the provider side in much the same way. As Brent James has shown, readily available, meaningful, and comparable information on natural variation in outcomes makes it much easier for providers to improve results and reduce the variation in them. Despite its central importance and the many years that have passed, however, the state of measurement in health care remains in dire need of dramatic improvement. Fryback (1993, p. 271; also see Kindig, 1999) succinctly put the point, observing that the U.S.

“health care industry is a $900 + billion [over $2.5 trillion in 2009 (CMS, 2011)] endeavor that does not know how to measure its main product: health. Without a good measure of output we cannot truly optimize efficiency across the many different demands on resources.”

Quantification in health care is almost universally approached using methods inadequate to the task, resulting in ordinal and scale-dependent scores that cannot take advantage of the objective comparisons provided by invariant, individual-level measures (Andrich, 2004). Though data-based statistical studies informing policy have their place, virtually no effort or resources have been invested in developing individual-level instruments traceable to universally uniform metrics that define the outcome products of health care. These metrics are key to efficiently harmonizing quality improvement, diagnostic, and purchasing decisions and behaviors in the manner described by Berwick, James, and Coye (2003) without having to cumbersomely communicate the concrete particulars of locally-dependent scores (Heinemann, Fisher, & Gershon, 2006). Metrologically-based common product definitions will finally make it possible for quality improvement experts to implement analogues of the Toyota Production System in healthcare, long presented as a model but never approached in practice (Coye, 2001).
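To make the contrast between scale-dependent scores and invariant measures concrete, here is a minimal sketch in Python. It assumes the dichotomous Rasch model discussed in Andrich (2004), and it assumes that the items on two hypothetical instruments have already been calibrated to one common logit scale. The item difficulties, the person measure, and the function names are all invented for illustration; none of them comes from a real instrument.

```python
import math


def p_success(theta, b):
    """Rasch probability that a person at measure theta succeeds on an
    item of difficulty b (both expressed in logits)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def expected_score(theta, difficulties):
    """Expected raw score (count of successes) on a set of items."""
    return sum(p_success(theta, b) for b in difficulties)


def estimate_measure(raw_score, difficulties, lo=-10.0, hi=10.0, tol=1e-9):
    """Find by bisection the measure whose expected score equals raw_score.
    Works because the expected score rises monotonically with theta."""
    if not 0 < raw_score < len(difficulties):
        raise ValueError("extreme scores have no finite estimate")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if expected_score(mid, difficulties) < raw_score:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Two hypothetical instruments on the same calibrated scale (made-up values).
easy_items = [-2.0, -1.5, -1.0, -0.5, 0.0]
hard_items = [0.5, 1.0, 1.5, 2.0, 2.5]

true_measure = 1.0  # hypothetical person measure in logits

# The raw scores depend on which instrument happened to be used...
score_easy = expected_score(true_measure, easy_items)
score_hard = expected_score(true_measure, hard_items)

# ...but each score converts back to the same measure in the common unit.
measure_easy = estimate_measure(score_easy, easy_items)
measure_hard = estimate_measure(score_hard, hard_items)

print(f"raw scores: easy={score_easy:.2f}, hard={score_hard:.2f}")
print(f"measures:   easy={measure_easy:.3f}, hard={measure_hard:.3f}")
```

The sketch uses expected scores rather than simulated responses so that the result is deterministic. With real data, each estimate would carry a standard error, and the two instruments would agree within that error rather than exactly.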

So, what does all of this add up to? A new division for human, social, and natural capital in NIST is in order, with extensive involvement from NIH, CMS, AHRQ, and other relevant agencies. Innovative measurement methods and standards are the “out of the box” science NG refers to. Providing these tools is the definitive embodiment of an appropriate role for government. These are the kinds of things that we could have a productive conversation with NG about, it seems to me….

References

Andrich, D. (2004, January). Controversy and the Rasch model: A characteristic of incompatible paradigms? Medical Care, 42(1), I-7–I-16.

Barzel, Y. (1982). Measurement costs and the organization of markets. Journal of Law and Economics, 25, 27-48.

Berwick, D. M., James, B., & Coye, M. J. (2003, January). Connections between quality measurement and improvement. Medical Care, 41(1 (Suppl)), I30-38.

Centers for Medicare and Medicaid Services. (2011). National health expenditure data: NHE fact sheet. Retrieved 30 June 2011, from https://www.cms.gov/NationalHealthExpendData/25_NHE_Fact_Sheet.asp.

Coye, M. J. (2001, November/December). No Toyotas in health care: Why medical care has not evolved to meet patients’ needs. Health Affairs, 20(6), 44-56.

De Soto, H. (1989). The other path: The economic answer to terrorism. New York: Basic Books.

De Soto, H. (2000). The mystery of capital: Why capitalism triumphs in the West and fails everywhere else. New York: Basic Books.

Fisher, W. P., Jr. (2010). Statistics and measurement: Clarifying the differences. Rasch Measurement Transactions, 23(4), 1229-1230 [http://www.rasch.org/rmt/rmt234.pdf].

Fisher, W. P., Jr., & Burton, E. (2010). Embedding measurement within existing computerized data systems: Scaling clinical laboratory and medical records heart failure data to predict ICU admission. Journal of Applied Measurement, 11(2), 271-287.

Fryback, D. (1993). QALYs, HYEs, and the loss of innocence. Medical Decision Making, 13(4), 271-2.

Gingrich, N. (2008). Real change: From the world that fails to the world that works. Washington, DC: Regnery Publishing.

Goldberg, S. H. (2009). Billions of drops in millions of buckets: Why philanthropy doesn’t advance social progress. New York: Wiley.

Heinemann, A. W., Fisher, W. P., Jr., & Gershon, R. (2006). Improving health care quality with outcomes management. Journal of Prosthetics and Orthotics, 18(1), 46-50 [http://www.oandp.org/jpo/library/2006_01S_046.asp].

Kindig, D. A. (1997). Purchasing population health. Ann Arbor, Michigan: University of Michigan Press.

Kindig, D. A. (1999). Purchasing population health: Aligning financial incentives to improve health outcomes. Nursing Outlook, 47, 15-22.

Latour, B. (1987). Science in action: How to follow scientists and engineers through society. New York: Cambridge University Press.

Latour, B. (2005). Reassembling the social: An introduction to Actor-Network-Theory. (Clarendon Lectures in Management Studies). Oxford, England: Oxford University Press.

Magnus, P. D. (2007). Distributed cognition and the task of science. Social Studies of Science, 37(2), 297-310.

Marty, M. (2001). Why the talk of spirituality today? Some partial answers. Second Opinion, 6, 53-64.

Marty, M., & Appleby, R. S. (Eds.). (1993). Fundamentalisms and society: Reclaiming the sciences, the family, and education. The fundamentalisms project, vol. 2. Chicago: University of Chicago Press.

National Institute of Standards and Technology. (1996). Appendix C: Assessment examples. Economic impacts of research in metrology. In Committee on Fundamental Science, Subcommittee on Research (Ed.), Assessing fundamental science: A report from the Subcommittee on Research, Committee on Fundamental Science. Washington, DC: National Science and Technology Council [http://www.nsf.gov/statistics/ostp/assess/nstcafsk.htm#Topic%207; last accessed 30 June 2011].

Creative Commons License
LivingCapitalMetrics Blog by William P. Fisher, Jr., Ph.D. is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Based on a work at livingcapitalmetrics.wordpress.com.
Permissions beyond the scope of this license may be available at http://www.livingcapitalmetrics.com.

A New Agenda for Measurement Theory and Practice in Education and Health Care

April 15, 2011

Two key issues on my agenda offer different answers to the question “Why do you do things the way you do in measurement theory and practice?”

First, we can take up the “Because of…” answer to this question. We need to articulate an historical account of measurement that does three things:

  1. builds on Rasch’s use of Maxwell’s method of analogy by employing it and expanding on it in new applications;
  2. unifies the vocabulary and concepts of measurement across the sciences into a single framework, so far as possible, by situating probabilistic models of invariant individual-level within-variable phenomena in the context of measurement’s GIGO principle and data-to-model fit, as distinct from the interactions of group-level between-variable phenomena in the context of statistics’ model-to-data fit; and
  3. stresses the social, collective cognition facilitated by networks of individuals whose point-of-use measurement-informed decisions and behaviors are coordinated and harmonized virtually, at a distance, with no need for communication or negotiation.

We need multiple publications in leading journals on these issues, as well as one or more books that people can cite, to make this real and true history of measurement, properly speaking, credible and accepted in the mainstream. A draft article of my own in this vein is available at http://ssrn.com/abstract=1698919, and I offer it for critique; other material is available on request. Anyone who works on this paper with me and makes a substantial contribution to its publication will be added as co-author.

Second, we can take up the “In order that…” answer to the question “Why do you do things the way you do?” From this point of view, we need to broaden the scope of the measurement research agenda beyond data analysis, estimation, models, and fit assessment in three ways:

  1. by emphasizing predictive construct theories that exhibit the fullest possible understanding of what is measured and so enable the routine reproduction of desired proportionate effects efficiently, with no need to analyze data to obtain an estimate;
  2. by defining the standard units to which all calibrated instruments measuring given constructs are traceable; and
  3. by disseminating to front line users on mass scales instruments measuring in publicly available standard units and giving immediate feedback at the point of use.

These two sets of issues define a series of talking points that together constitute a new narrative for measurement in education, psychology, health care, and many other fields. We and others may see our way to organizing new professional societies, new journals, new university-based programs of study, etc. around these principles.


A Simple Example of How Better Measurement Creates New Market Efficiencies, Reduces Transaction Costs, and Enables the Pricing of Intangible Assets

March 4, 2011

One of the ironies of life is that we often overlook the obvious in favor of the obscure. And so one hears of huge resources poured into finding and capitalizing on opportunities that provide infinitesimally small returns, while other opportunities—with equally certain odds of success but far more profitable returns—are completely neglected.

The National Institute of Standards and Technology (NIST) reports returns on investment ranging from 32% to over 400% in 32 metrological improvements made in semiconductors, construction, automation, computers, materials, manufacturing, chemicals, photonics, communications and pharmaceuticals (NIST, 2009). Previous posts in this blog offer more information on the economic value of metrology. The point is that the returns obtained from improvements in the measurement of tangible assets will likely also be achieved in the measurement of intangible assets.

How? With a little bit of imagination, each stage in the development of increasingly meaningful, efficient, and useful measures described in this previous post can be seen as implying a significant return on investment. As those returns are sought, investors will coordinate and align different technologies and resources relative to a roadmap of how these stages are likely to unfold in the future, as described in this previous post. The basic concepts of how efficient and meaningful measurement reduces transaction costs and market frictions, and how it brings capital to life, are explained and documented in my publications (Fisher, 2002-2011), but what would a concrete example of the new value created look like?

The examples I have in mind hinge on the difference between counting and measuring. Counting is a natural and obvious thing to do when we need some indication of how much of something there is. But counting is not measuring (Cooper & Humphry, 2010; Wright, 1989, 1992, 1993, 1999). This is not some minor academic distinction of no practical use or consequence. It is rather the source of the vast majority of the problems we have in comparing outcome and performance measures.

Imagine how things would be if we couldn’t weigh fruit in a grocery store, and all we could do was count pieces. We can tell when eight small oranges possess less overall mass of fruit than four large ones by weighing them; the eight small oranges might weigh 0.75 kilograms (about 1.65 pounds) while the four large ones come in at 1.0 kilo (2.2 pounds). If oranges were sold by count instead of weight, perceptive traders would buy small oranges and make more money selling them than they could if they bought large ones.

But we can’t currently arrive so easily at the comparisons we need when we’re buying and selling intangible assets, like those produced as the outcomes of educational, health care, or other services. So I want to walk through a couple of very down-to-earth examples to bring the point home. Today we’ll focus on the simplest version of the story, and tomorrow we’ll take up a little more complicated version, dealing with the counts, percentages, and scores used in balanced scorecard and dashboard metrics of various kinds.

What if you score eight on one reading test and I score four on a different reading test? Who has more reading ability? In the same way that we might be able to tell just by looking that eight small oranges are likely to have less actual orange fruit than four big ones, we might also be able to tell just by looking that eight easy (short, common) words can likely be read correctly with less reading ability than four difficult (long, rare) words can be.

So let’s analyze the difference between buying oranges and buying reading ability. We’ll set up three scenarios for buying reading ability. In all three, we’ll imagine we’re comparing how we buy oranges with the way we would have to go about buying reading ability today if teachers were paid for the gains made on the tests they administer at the beginning and end of the school year.

In the first scenario, the teachers make up their own tests. In the second, the teachers each use a different standardized test. In the third, each teacher uses a computer program that draws questions from the same online bank of precalibrated items to construct a unique test custom tailored to each student. Reading ability scenario one is likely the most commonly found in real life. Scenario three is the rarest, but nonetheless describes a situation that has been available to millions of students in the U.S., Australia, and elsewhere for several years. Scenarios one, two and three correspond with developmental levels one, three, and five described in a previous blog entry.

Buying Oranges

When you go into one grocery store and I go into another, we don’t have any oranges with us. When we leave, I have eight and you have four. I have twice as many oranges as you, but yours weigh a kilo, about a third more than mine (.75 kilos).

When we paid for the oranges, the transaction was finished in a few seconds. Neither one of us experienced any confusion, annoyance, or inconvenience in relation to the quality of information we had on the amount of orange fruit we were buying. I did not, however, pay twice as much as you did. In fact, you paid more for yours than I did for mine, in direct proportion to the difference in the measured amounts.

No negotiations were necessary to consummate the transactions, and there was no need for special inquiries about how much orange we were buying. We knew from experience in this and other stores that the prices we paid were comparable with those offered in other times and places. Our information was cheap, as it was printed on the bag of oranges or could be read off a scale, and it was very high quality, as the measures were directly comparable with measures from any other scale in any other store. So, in buying oranges, the contribution of information quality to the overall cost of the transaction was so small as to be negligible.
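
The arithmetic of the orange purchase can be sketched in a few lines. The weights and counts come from the example above; the price per kilo is a hypothetical value chosen only for illustration, since no prices are given here.

```python
# Hypothetical price; the example gives weights and counts but no prices.
PRICE_PER_KILO = 2.00

def cost_by_weight(kilos, price_per_kilo=PRICE_PER_KILO):
    """Cost of a purchase priced by measured weight, not by piece count."""
    return kilos * price_per_kilo

my_oranges = {"count": 8, "kilos": 0.75}
your_oranges = {"count": 4, "kilos": 1.0}

my_cost = cost_by_weight(my_oranges["kilos"])
your_cost = cost_by_weight(your_oranges["kilos"])

# I hold twice as many pieces, but you hold about a third more fruit,
# and the prices follow the measured amounts rather than the counts.
count_ratio = my_oranges["count"] / your_oranges["count"]
weight_ratio = your_oranges["kilos"] / my_oranges["kilos"]
cost_ratio = your_cost / my_cost

print(f"count ratio (mine/yours):  {count_ratio:.2f}")
print(f"weight ratio (yours/mine): {weight_ratio:.2f}")
print(f"cost ratio (yours/mine):   {cost_ratio:.2f}")
```

Whatever price per kilo is chosen, the cost ratio equals the weight ratio, which is what lets the transaction close without negotiation.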

Buying Reading Ability (Scenario 1)

So now you and I go through third grade as eight year olds. You’re in one school and I’m in another. We have different teachers. Each teacher makes up his or her own reading tests. When we started the school year, we each took a reading test (different ones), and we took another (again, different ones) as we ended the school year.

For each test, your teacher counted up your correct answers and divided by the total number of questions; so did mine. You got 72% correct on the first one, and 94% correct on the last one. I got 83% correct on the first one, and 86% correct on the last one. Your score went up 22 percentage points, much more than the 3 points mine went up. But did you learn more? It is impossible to tell. What if both of your tests were easier, not just for you or for me but for everyone, than both of mine? What if my second test was a lot harder than my first one? On the other hand, what if your tests were harder than mine? Perhaps you did even better than your scores seem to indicate.
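
Why percent correct cannot settle the question can be shown with a minimal sketch. It assumes a Rasch model for the probability of a correct answer; the ability value and the item difficulties are hypothetical numbers in logits, not data from any real test.

```python
import math

def p_correct(ability, difficulty):
    """Rasch model probability of a correct answer (both values in logits)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def expected_percent(ability, difficulties):
    """Expected percent correct for one reader on one set of items."""
    total = sum(p_correct(ability, d) for d in difficulties)
    return 100.0 * total / len(difficulties)

# One reader whose ability does not change between the two tests (hypothetical).
ability = 1.0

# Two teacher-made tests with hypothetical item difficulties.
easy_test = [-2.0, -1.5, -1.0, -0.5, 0.0]
hard_test = [0.5, 1.0, 1.5, 2.0, 2.5]

easy_pct = expected_percent(ability, easy_test)
hard_pct = expected_percent(ability, hard_test)

print(f"expected percent correct on the easy test: {easy_pct:.0f}%")
print(f"expected percent correct on the hard test: {hard_pct:.0f}%")
```

The same reader earns very different percentages on the two tests, so a change in percent correct between uncalibrated tests mixes together the change in ability and the change in test difficulty.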

We’ll just exclude from consideration other factors that might come to bear, such as whether your tests were significantly longer or shorter than mine, or if one of us ran out of time and did not answer a lot of questions.

If our parents had to pay the reading teacher at the end of the school year for the gains that were made, how would they tell what they were getting for their money? What if your teacher gave a hard test at the start of the year and an easy one at the end of the year so that you’d have a big gain and your parents would have to pay more? What if my teacher gave an easy test at the start of the year and a hard one at the end, so that a really high price could be put on very small gains? If our parents were to compare their experiences in buying our improved reading ability, they would have a lot of questions about how much improvement was actually obtained. They would be confused and annoyed at how inconvenient the scores are, because they are difficult, if not impossible, to compare. A lot of time and effort might be invested in examining the words and sentences in each of the four reading tests to try to determine how easy or hard they are in relation to each other. Or, more likely, everyone would throw their hands up and pay as little as they possibly can for outcomes they don’t understand.

Buying Reading Ability (Scenario 2)

In this scenario, we are third graders again, in different schools with different reading teachers. Now, instead of our teachers making up their own tests, our reading abilities are measured at the beginning and the end of the school year using two different standardized tests sold by competing testing companies. You’re in a private suburban school that’s part of an independent schools association. I’m in a public school along with dozens of others in an urban school district.

For each test, our parents received a report in the mail showing our scores. As before, we know how many questions we each answered correctly, but, unlike before, we don’t know which particular questions we got right or wrong. Finally, we don’t know how easy or hard your tests were relative to mine, but we know that the two tests you took were equated, and so were the two I took. That means your tests will show how much reading ability you gained, and so will mine.

We have one new bit of information we didn’t have before, and that’s a percentile score. Now we know that at the beginning of the year, with a percentile ranking of 72, you performed better than 72% of the other private school third graders taking this test, and at the end of the year you performed better than 76% of them. In contrast, I had percentiles of 84 and 89.

The question we have to ask now is if our parents are going to pay for the percentile gain, or for the actual gain in reading ability. You and I each learned more than our peers did on average, since our percentile scores went up, but this would not work out as a satisfactory way to pay teachers. Averages being averages, if you and I learned more and faster, someone else learned less and slower, so that, in the end, it all balances out. Are we to have teachers paying parents when their children learn less, simply redistributing money in a zero sum game?

And so, additional individualized reports are sent to our parents by the testing companies. Your tests are equated with each other, and they measure in a comparable unit that ranges from 120 to 480. You had a starting score of 235 and finished the year with a score of 420, for a gain of 185.

The tests I took are comparable and measure in the same unit, too, but not the same unit as your tests measure in. Scores on my tests range from 400 to 1200. I started the year with a score of 790, and finished at 1080, for a gain of 290.

Now the confusion in the first scenario is overcome, in part. Our parents can see that we each made real gains in reading ability. The difficulty levels of the two tests you took are the same, as are the difficulties of the two tests I took. But our parents still don’t know what to pay the teacher because they can’t tell if you or I learned more. You had lower percentiles and test scores than I did, but you are being compared with what is likely a higher scoring group of suburban and higher socioeconomic status students than the urban group of disadvantaged students I’m compared against. And your scores aren’t comparable with mine, so you might have started and finished with more reading ability than I did, or maybe I had more than you. There isn’t enough information here to tell.
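
A short sketch shows how two plausible-looking comparisons of these gains contradict each other. The score ranges and scores are the ones given above; dividing a gain by the width of its score range is only a naive rescaling used here for illustration, not an equating method.

```python
# Score ranges and scores from the two unrelated test series described above.
your_scores = {"low": 120, "high": 480, "start": 235, "end": 420}
my_scores = {"low": 400, "high": 1200, "start": 790, "end": 1080}

def raw_gain(scores):
    """Gain in the test's own unit."""
    return scores["end"] - scores["start"]

def gain_as_share_of_range(scores):
    """Naive rescaling: gain divided by the width of the score range."""
    return raw_gain(scores) / (scores["high"] - scores["low"])

for name, scores in (("you", your_scores), ("me", my_scores)):
    print(f"{name}: raw gain {raw_gain(scores)}, "
          f"share of range {gain_as_share_of_range(scores):.2f}")

# Comparing raw gains says I gained more; comparing shares of range says
# you did. Neither comparison is justified, because the two units have
# not been linked to a common standard.
raw_says_i_gained_more = raw_gain(my_scores) > raw_gain(your_scores)
share_says_you_gained_more = (
    gain_as_share_of_range(your_scores) > gain_as_share_of_range(my_scores)
)
```

The score ranges are conventions chosen by each testing company, so any conclusion that depends on them says nothing reliable about who learned more.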

So, again, the information that is provided is insufficient to the task of settling on a reasonable price for the outcomes obtained. Our parents will again be annoyed and confused by the low quality information that makes it impossible to know what to pay the teacher.

Buying Reading Ability (Scenario 3)

In the third scenario, we are still third graders in different schools with different reading teachers. This time our reading abilities are measured by tests that are completely unique. Every student has a test custom tailored to their particular ability. Unlike the tests in the first and second scenarios, however, now all of the tests have been constructed carefully on the basis of extensive data analysis and experimental tests. Different testing companies are providing the service, but they have gone to the trouble to work together to create consensus standards defining the unit of measurement for any and all reading test items.

For each test, our parents received a report in the mail showing our measures. As before, we know how many questions we each answered correctly. Now, though we don’t know which particular questions we got right or wrong, we can see typical items ordered by difficulty lined up in a way that shows us what kind of items we got wrong, and which kind we got right. And now we also know your tests were equated relative to mine, so we can compare how much reading ability you gained relative to how much I gained. Now our parents can confidently determine how much they should pay the teacher, at least in proportion to their children’s relative measures. If our measured gains are equal, the same payment can be made. If one of us obtained more value, then proportionately more should be paid.

In this third scenario, we have a situation directly analogous to buying oranges. You have a measured amount of increased reading ability that is expressed in the same unit as my gain in reading ability, just as the weights of the oranges are comparable. Further, your test items were not identical with mine, and so the difficulties of the items we took surely differed, just as the sizes of the oranges we bought did.
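
The parallel with the oranges can be made concrete with a minimal sketch, assuming a Rasch model and items whose difficulties have already been calibrated in a shared logit unit. The item difficulties and response patterns are hypothetical, and the estimator is a plain maximum likelihood routine with a capped Newton-Raphson step, not the procedure of any particular testing company.

```python
import math

def p_correct(ability, difficulty):
    """Rasch model probability of a correct answer (both values in logits)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def estimate_ability(responses, difficulties, iterations=50):
    """Maximum likelihood ability estimate in logits.

    responses: list of 0/1 answers; difficulties: calibrated item difficulties.
    Zero and perfect scores have no finite estimate, so they are rejected.
    """
    raw = sum(responses)
    if raw == 0 or raw == len(responses):
        raise ValueError("no finite estimate for zero or perfect scores")
    theta = 0.0
    for _ in range(iterations):
        probs = [p_correct(theta, d) for d in difficulties]
        residual = raw - sum(probs)                      # observed minus expected
        information = sum(p * (1.0 - p) for p in probs)  # test information
        step = max(-1.0, min(1.0, residual / information))  # cap the step size
        theta += step
        if abs(step) < 1e-8:
            break
    return theta

# Hypothetical tailored tests drawn from one calibrated item bank.
your_items = [1.0, 1.5, 2.0, 2.5, 3.0]    # harder items
your_answers = [1, 1, 1, 0, 0]            # three correct
my_items = [-1.0, -0.5, 0.0, 0.5, 1.0]    # easier items
my_answers = [1, 1, 1, 1, 0]              # four correct

your_measure = estimate_ability(your_answers, your_items)
my_measure = estimate_ability(my_answers, my_items)

print(f"you: {sum(your_answers)} correct, measure {your_measure:.2f} logits")
print(f"me:  {sum(my_answers)} correct, measure {my_measure:.2f} logits")
```

As with four large oranges outweighing eight small ones, the reader with fewer correct answers on harder items ends up with the higher measure, and both measures are expressed in the same unit even though no item was shared.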

This third scenario could be made yet more efficient by removing the need for creating and maintaining a calibrated item bank, as described by Stenner and Stone (2003) and in the sixth developmental level in a prior blog post here. Also, additional efficiencies could be gained by unifying the interpretation of the reading ability measures, so that progress through high school can be tracked with respect to the reading demands of adult life (Williamson, 2008).

Comparison of the Purchasing Experiences

In contrast with the grocery store experience, paying for increased reading ability in the first scenario is fraught with low quality information that greatly increases the cost of the transactions. The information is of such low quality that, of course, hardly anyone bothers to go to the trouble to try to decipher it. Too much cost is associated with the effort to make it worthwhile. So, no one knows how much gain in reading ability is obtained, or what a unit gain might cost.

When a school district or educational researchers mount studies to try to find out what it costs to improve reading ability in third graders in some standardized unit, they find so much unexplained variation in the costs that they, too, raise more questions than answers.

In grocery stores and other markets, we don’t place the cost of making the value comparison on the consumer or the merchant. Instead, society as a whole picks up the cost by funding the creation and maintenance of consensus standard metrics. Until we take up the task of doing the same thing for intangible assets, we cannot expect human, social, and natural capital markets to obtain the efficiencies we take for granted in markets for tangible assets and property.

References

Cooper, G., & Humphry, S. M. (2010). The ontological distinction between units and entities. Synthese. DOI 10.1007/s11229-010-9832-1.

Fisher, W. P., Jr. (2002, Spring). “The Mystery of Capital” and the human sciences. Rasch Measurement Transactions, 15(4), 854 [http://www.rasch.org/rmt/rmt154j.htm].

Fisher, W. P., Jr. (2003). Measurement and communities of inquiry. Rasch Measurement Transactions, 17(3), 936-8 [http://www.rasch.org/rmt/rmt173.pdf].

Fisher, W. P., Jr. (2004, October). Meaning and method in the social sciences. Human Studies: A Journal for Philosophy and the Social Sciences, 27(4), 429-54.

Fisher, W. P., Jr. (2005). Daredevil barnstorming to the tipping point: New aspirations for the human sciences. Journal of Applied Measurement, 6(3), 173-9 [http://www.livingcapitalmetrics.com/images/FisherJAM05.pdf].

Fisher, W. P., Jr. (2007, Summer). Living capital metrics. Rasch Measurement Transactions, 21(1), 1092-3 [http://www.rasch.org/rmt/rmt211.pdf].

Fisher, W. P., Jr. (2009a, November). Invariance and traceability for measures of human, social, and natural capital: Theory and application. Measurement, 42(9), 1278-1287.

Fisher, W. P., Jr. (2009b). NIST Critical national need idea White Paper: Metrological infrastructure for human, social, and natural capital (Tech. Rep., http://www.livingcapitalmetrics.com/images/FisherNISTWhitePaper2.pdf). New Orleans: LivingCapitalMetrics.com.

Fisher, W. P., Jr. (2011). Bringing human, social, and natural capital to life: Practical consequences and opportunities. Journal of Applied Measurement, 12(1), in press.

NIST. (2009, 20 July). Outputs and outcomes of NIST laboratory research. Available: http://www.nist.gov/director/planning/studies.cfm (Accessed 1 March 2011).

Stenner, A. J., & Stone, M. (2003). Item specification vs. item banking. Rasch Measurement Transactions, 17(3), 929-30 [http://www.rasch.org/rmt/rmt173a.htm].

Williamson, G. L. (2008). A text readability continuum for postsecondary readiness. Journal of Advanced Academics, 19(4), 602-632.

Wright, B. D. (1989). Rasch model from counting right answers: Raw scores as sufficient statistics. Rasch Measurement Transactions, 3(2), 62 [http://www.rasch.org/rmt/rmt32e.htm].

Wright, B. D. (1992, Summer). Scores are not measures. Rasch Measurement Transactions, 6(1), 208 [http://www.rasch.org/rmt/rmt61n.htm].

Wright, B. D. (1993). Thinking with raw scores. Rasch Measurement Transactions, 7(2), 299-300 [http://www.rasch.org/rmt/rmt72r.htm].

Wright, B. D. (1999). Common sense for measurement. Rasch Measurement Transactions, 13(3), 704-5 [http://www.rasch.org/rmt/rmt133h.htm].


 

One of the ironies of life is that we often overlook the obvious in favor of the obscure. And so one hears of huge resources poured into finding and capitalizing on opportunities that provide infinitesimally small returns, while other opportunities—with equally certain odds of success but far more profitable returns—are completely neglected.

The National Institute for Standards and Technology (NIST) reports returns on investment ranging from 32% to over 400% in 32 metrological improvements made in semiconductors, construction, automation, computers, materials, manufacturing, chemicals, photonics, communications and pharmaceuticals (NIST, 2009). Previous posts in this blog offer more information on the economic value of metrology. The point is that the returns obtained from improvements in the measurement of tangible assets will likely also be achieved in the measurement of intangible assets.

How? With a little bit of imagination, each stage in the development of increasingly meaningful, efficient, and useful measures described in this previous post can be seen as implying a significant return on investment. As those returns are sought, investors will coordinate and align different technologies and resources relative to a roadmap of how these stages are likely to unfold in the future, as described in this previous post. But what would a concrete example of the new value created look like?

The examples I have in mind hinge on the difference between counting and measuring. Counting is a natural and obvious thing to do when we need some indication of how much of something there is. But counting is not measuring (Cooper & Humphry, 2010; Wright, 1989, 1992, 1993, 1999). This is not some minor academic distinction of no practical use or consequence. It is rather the source of the vast majority of the problems we have in comparing outcome and performance measures.

Imagine how things would be if we couldn’t weigh fruit in a grocery store, and all we could do was count pieces. We can tell when eight small oranges possess less overall mass of fruit than four large ones by weighing them; the eight small oranges might weigh .75 kilograms (about 1.6 pounds) while the four large ones come in at 1.0 kilo (2.2 pounds). If oranges were sold by count instead of weight, perceptive traders would buy small oranges and make more money selling them than they could if they bought large ones.

But we can’t currently arrive so easily at the comparisons we need when we’re buying and selling intangible assets, like those produced as the outcomes of educational, health care, or other services. So I want to walk through a couple of very down-to-earth examples to bring the point home. Today we’ll focus on the simplest version of the story, and tomorrow we’ll take up a little more complicated version, dealing with the counts, percentages, and scores used in balanced scorecard and dashboard metrics of various kinds.

What if you score eight on one reading test and I score four on a different reading test? Who has more reading ability? In the same way that we might be able to tell just by looking that eight small oranges are likely to have less actual orange fruit than four big ones, we might also be able to tell just by looking that eight easy (short, common) words can likely be read correctly with less reading ability than four difficult (long, rare) words can be.

So let’s analyze the difference between buying oranges and buying reading ability. We’ll set up three scenarios for buying reading ability. In all three, we’ll imagine we’re comparing how we buy oranges with the way we would have to go about buying reading ability today if teachers were paid for the gains made on the tests they administer at the beginning and end of the school year.

In the first scenario, the teachers make up their own tests. In the second, the teachers each use a different standardized test. In the third, each teacher uses a computer program that draws questions from the same online bank of precalibrated items to construct a unique test custom tailored to each student. Reading ability scenario one is likely the most commonly found in real life. Scenario three is the rarest, but nonetheless describes a situation that has been available to millions of students in the U.S., Australia, and elsewhere for several years. Scenarios one, two and three correspond with developmental levels one, three, and five described in a previous blog entry.

Buying Oranges

When you go into one grocery store and I go into another, we don’t have any oranges with us. When we leave, I have eight and you have four. I have twice as many oranges as you, but yours weigh a kilo, about a third more than mine (.75 kilos).

When we paid for the oranges, the transaction was finished in a few seconds. Neither one of us experienced any confusion, annoyance, or inconvenience in relation to the quality of information we had on the amount of orange fruits we were buying. I did not, however, pay twice as much as you did. In fact, you paid more for yours than I did for mine, in direct proportion to the difference in the measured amounts.

No negotiations were necessary to consummate the transactions, and there was no need for special inquiries about how much orange we were buying. We knew from experience in this and other stores that the prices we paid were comparable with those offered in other times and places. Our information was cheap, as it was printed on the bag of oranges or could be read off a scale, and it was very high quality, as the measures were directly comparable with measures from any other scale in any other store. So, in buying oranges, the impact of information quality on the overall cost of the transaction was so inexpensive as to be negligible.

Buying Reading Ability (Scenario 1)

So now you and I go through third grade as eight year olds. You’re in one school and I’m in another. We have different teachers. Each teacher makes up his or her own reading tests. When we started the school year, we each took a reading test (different ones), and we took another (again, different ones) as we ended the school year.

For each test, your teacher counted up your correct answers and divided by the total number of questions; so did mine. You got 72% correct on the first one, and 94% correct on the last one. I got 83% correct on the first one, and 86% correct on the last one. Your score went up 22 percentage points, much more than the 3 points mine went up. But did you learn more? It is impossible to tell. What if both of your tests were easier (not just for you or for me but for everyone) than both of mine? What if my second test was a lot harder than my first one? On the other hand, what if your tests were harder than mine? Perhaps you did even better than your scores seem to indicate.

We’ll just exclude from consideration other factors that might come to bear, such as whether your tests were significantly longer or shorter than mine, or if one of us ran out of time and did not answer a lot of questions.
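
Why percent correct cannot be read as a measure of ability can be sketched with a small simulation. The sketch assumes a Rasch model of the kind discussed in the references, which is an assumption for illustration and not something the teachers in this scenario use. The ability value and the item difficulties are invented.

```python
import math

def p_correct(ability, difficulty):
    """Rasch model probability of a correct answer (both values in logits)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def expected_percent_correct(ability, item_difficulties):
    """Expected percent correct for one reader on one test."""
    probs = [p_correct(ability, d) for d in item_difficulties]
    return 100.0 * sum(probs) / len(probs)

# Hypothetical values: one reader whose ability does not change,
# and two made-up tests that differ only in how hard their items are.
ability = 1.0
easy_test = [-2.0, -1.5, -1.0, -0.5, 0.0]
hard_test = [0.5, 1.0, 1.5, 2.0, 2.5]

easy_score = expected_percent_correct(ability, easy_test)
hard_score = expected_percent_correct(ability, hard_test)
print(round(easy_score), round(hard_score))
```

The same reader, with exactly the same ability, is expected to score far higher on the easy test than on the hard one. A change in percent correct between two teacher-made tests therefore mixes any change in the reader with the difference between the tests.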

If our parents had to pay the reading teacher at the end of the school year for the gains that were made, how would they tell what they were getting for their money? What if your teacher gave a hard test at the start of the year and an easy one at the end of the year so that you’d have a big gain and your parents would have to pay more? What if my teacher gave an easy test at the start of the year and a hard one at the end, so that a really high price could be put on very small gains? If our parents were to compare their experiences in buying our improved reading ability, they would have a lot of questions about how much improvement was actually obtained. They would be confused and annoyed at how inconvenient the scores are, because they are difficult, if not impossible, to compare. A lot of time and effort might be invested in examining the words and sentences in each of the four reading tests to try to determine how easy or hard they are in relation to each other. Or, more likely, everyone would throw up their hands and pay as little as they possibly could for outcomes they don’t understand.

Buying Reading Ability (Scenario 2)

In this scenario, we are third graders again, in different schools with different reading teachers. Now, instead of our teachers making up their own tests, our reading abilities are measured at the beginning and the end of the school year using two different standardized tests sold by competing testing companies. You’re in a private suburban school that’s part of an independent schools association. I’m in a public school along with dozens of others in an urban school district.

For each test, our parents received a report in the mail showing our scores. As before, we know how many questions we each answered correctly, and, as before, we don’t know which particular questions we got right or wrong. Finally, we don’t know how easy or hard your tests were relative to mine, but we know that the two tests you took were equated, and so were the two I took. That means your tests will show how much reading ability you gained, and so will mine.

But we have one new bit of information we didn’t have before, and that’s a percentile score. Now we know that at the beginning of the year, with a percentile ranking of 72, you performed better than 72% of the other private school third graders taking this test, and at the end of the year you performed better than 76% of them. In contrast, I had percentiles of 84 and 89.

The question we have to ask now is whether our parents are going to pay for the percentile gain, or for the actual gain in reading ability. You and I each learned more than our peers did on average, since our percentile scores went up, but this would not work out as a satisfactory way to pay teachers. Averages being averages, if you and I learned more and faster, someone else learned less and slower, so that, in the end, it all balances out. Are we to have teachers paying parents when their children learn less, simply redistributing money in a zero-sum game?
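
The zero-sum character of percentiles can be sketched as follows. The scores are invented, and the percentile rank used here (the percent of the group scoring strictly lower) is a simplifying assumption. Testing companies apply their own norming procedures.

```python
def percentile_ranks(scores):
    """Percent of the group scoring strictly below each score."""
    n = len(scores)
    return [100.0 * sum(1 for other in scores if other < s) / n for s in scores]

def mean(values):
    return sum(values) / len(values)

# Hypothetical scale scores for a norm group of five students.
start = [200, 220, 240, 260, 280]

# Case 1: every student gains 50 units. Real learning, no percentile change.
everyone_gains = [s + 50 for s in start]

# Case 2: every student gains, but unevenly. Some percentiles rise,
# so others must fall even though those students also learned.
uneven_gains = [210, 300, 250, 400, 290]

print(percentile_ranks(start))
print(percentile_ranks(everyone_gains))
print(percentile_ranks(uneven_gains))
print(mean(percentile_ranks(start)), mean(percentile_ranks(uneven_gains)))
```

In both cases the average percentile rank of the group is unchanged, so payments tied to percentile gains could only move money from one family to another.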

And so, additional individualized reports are sent to our parents by the testing companies. Your tests are equated with each other, so they measure in a comparable unit that ranges from 120 to 480. You had a starting score of 235 and finished the year with a score of 420, for a gain of 185.

The tests I took are comparable and measure in the same unit, too, but not the same unit as your tests measure in. Scores on my tests range from 400 to 1200. I started the year with a score of 790, and finished at 1080, for a gain of 290.

Now the confusion in the first scenario is overcome, in part. Our parents can see that we each made real gains in reading ability. The difficulty levels of the two tests you took are the same, as are the difficulties of the two tests I took. But our parents still don’t know what to pay the teacher because they can’t tell if you or I learned more. You had lower percentiles and test scores than I did, but you are being compared with what is likely a higher scoring group of suburban and higher socioeconomic status students than the urban group of disadvantaged students I’m compared against. And your scores aren’t comparable with mine, so you might have started and finished with more reading ability than I did, or maybe I had more than you. There isn’t enough information here to tell.
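
A quick calculation shows how little the two gains can tell us when the units differ. The rescaling by score range below is an arbitrary convention introduced only for illustration. It is not a method used by the testing companies in this scenario, and it is no more justified than comparing the raw gains directly.

```python
# Score ranges and scores as reported in the two individualized reports.
your_test = {"low": 120, "high": 480, "start": 235, "end": 420}
my_test = {"low": 400, "high": 1200, "start": 790, "end": 1080}

def raw_gain(test):
    """Gain in the test's own unit."""
    return test["end"] - test["start"]

def gain_as_share_of_range(test):
    """Gain divided by the width of the score range (an arbitrary convention)."""
    return raw_gain(test) / (test["high"] - test["low"])

print(raw_gain(your_test), raw_gain(my_test))
print(round(gain_as_share_of_range(your_test), 3),
      round(gain_as_share_of_range(my_test), 3))
```

Compared as raw numbers, my gain looks larger. Compared as shares of each test's score range, yours does. Since neither comparison rests on a shared unit, the conflict between them cannot be resolved with the information at hand.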

So, again, the information that is provided is insufficient to the task of settling on a reasonable price for the outcomes obtained. Our parents will again be annoyed and confused by the low quality information that makes it impossible to know what to pay the teacher.

Buying Reading Ability (Scenario 3)

In the third scenario, we are still third graders in different schools with different reading teachers. This time our reading abilities are measured by tests that are completely unique. Every student has a test custom tailored to their particular ability. Unlike the tests in the first and second scenarios, however, now all of the tests have been constructed carefully on the basis of extensive data analysis and experimental tests. Different testing companies are providing the service, but they have gone to the trouble to work together to create consensus standards defining the unit of measurement for any and all reading test items.

For each test, our parents received a report in the mail showing our measures. As before, we know how many questions we each answered correctly. Now, though we don’t know which particular questions we got right or wrong, we can see typical items ordered by difficulty lined up in a way that shows us what kind of items we got wrong, and which kind we got right. And now we also know your tests were equated relative to mine, so we can compare how much reading ability you gained relative to how much I gained. Now our parents can confidently determine how much they should pay the teacher, at least in proportion to their children’s relative measures. If our measured gains are equal, the same payment can be made. If one of us obtained more value, then proportionately more should be paid.

In this third scenario, we have a situation directly analogous to buying oranges. You have a measured amount of increased reading ability that is expressed in the same unit as my gain in reading ability, just as the weights of the oranges are comparable. Further, your test items were not identical with mine, and so the difficulties of the items we took surely differed, just as the sizes of the oranges we bought did.
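
How different items can still yield measures in one unit can be sketched with a maximum likelihood ability estimate under a Rasch model, computed by Newton-Raphson iteration. This is a minimal sketch under stated assumptions. The item difficulties and responses are invented, the function name is hypothetical, and nothing here is claimed to be the procedure any particular testing company uses.

```python
import math

def estimate_ability(responses, difficulties, iterations=25):
    """Maximum likelihood ability estimate (in logits) under a Rasch model.

    responses: 1 for a correct answer, 0 for an incorrect one.
    difficulties: calibrated item difficulties in the same logit unit.
    Assumes a mix of right and wrong answers; a perfect or zero score
    has no finite estimate.
    """
    theta = 0.0
    raw_score = sum(responses)
    for _ in range(iterations):
        probs = [1.0 / (1.0 + math.exp(-(theta - d))) for d in difficulties]
        expected_score = sum(probs)
        information = sum(p * (1.0 - p) for p in probs)
        # Newton-Raphson step toward the ability whose expected score
        # matches the observed raw score.
        theta += (raw_score - expected_score) / information
    return theta

# Hypothetical item banks: your items are each 2 logits harder than mine.
my_items = [-2.0, -1.0, 0.0, 1.0]
your_items = [0.0, 1.0, 2.0, 3.0]

# We each answer three of four items correctly.
my_measure = estimate_ability([1, 1, 1, 0], my_items)
your_measure = estimate_ability([1, 1, 1, 0], your_items)
print(round(your_measure - my_measure, 3))
```

Both of us have the same count of correct answers, but because your items were harder, your measure comes out higher than mine, and the difference is expressed in the same unit the items were calibrated in.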

This third scenario could be made yet more efficient by removing the need for creating and maintaining a calibrated item bank, as described by Stenner and Stone (2003) and in the sixth developmental level in a prior blog post here. Also, additional efficiencies could be gained by unifying the interpretation of the reading ability measures, so that progress through high school can be tracked with respect to the reading demands of adult life (Williamson, 2008).

Comparison of the Purchasing Experiences

In contrast with the grocery store experience, paying for increased reading ability in the first scenario is fraught with low quality information that greatly increases the cost of the transactions. The information is of such low quality that, of course, hardly anyone bothers to go to the trouble to try to decipher it. Too much cost is associated with the effort to make it worthwhile. So, no one knows how much gain in reading ability is obtained, or what a unit gain might cost.

When a school district or educational researchers mount studies to try to find out what it costs to improve reading ability in third graders in some standardized unit, they find so much unexplained variation in the costs that their studies, too, raise more questions than they answer.

But we don’t place the cost of making the value comparison on the consumer or the merchant in the grocery store. Instead, society as a whole picks up the cost by funding the creation and maintenance of consensus standard metrics. Until we take up the task of doing the same thing for intangible assets, we cannot expect human, social, and natural capital markets to obtain the efficiencies we take for granted in markets for tangible assets and property.

References

Cooper, G., & Humphry, S. M. (2010). The ontological distinction between units and entities. Synthese. DOI 10.1007/s11229-010-9832-1.

NIST. (2009, 20 July). Outputs and outcomes of NIST laboratory research. Available: http://www.nist.gov/director/planning/studies.cfm (Accessed 1 March 2011).

Stenner, A. J., & Stone, M. (2003). Item specification vs. item banking. Rasch Measurement Transactions, 17(3), 929-30 [http://www.rasch.org/rmt/rmt173a.htm].

Williamson, G. L. (2008). A text readability continuum for postsecondary readiness. Journal of Advanced Academics, 19(4), 602-632.

Wright, B. D. (1989). Rasch model from counting right answers: Raw scores as sufficient statistics. Rasch Measurement Transactions, 3(2), 62 [http://www.rasch.org/rmt/rmt32e.htm].

Wright, B. D. (1992, Summer). Scores are not measures. Rasch Measurement Transactions, 6(1), 208 [http://www.rasch.org/rmt/rmt61n.htm].

Wright, B. D. (1993). Thinking with raw scores. Rasch Measurement Transactions, 7(2), 299-300 [http://www.rasch.org/rmt/rmt72r.htm].

Wright, B. D. (1999). Common sense for measurement. Rasch Measurement Transactions, 13(3), 704-5 [http://www.rasch.org/rmt/rmt133h.htm].