What is the point of sustainability impact investing?

June 10, 2018

What if the sustainability impact investing problem is not just a matter of judiciously supporting business policies and practices likely to enhance the long term viability of life on earth? What if the sustainability impact investing problem is better conceived in terms of how to create markets that function as self-sustaining ecosystems of diverse forms of economic life?

The crux of the sustainability problem from this living capital metrics point of view is how to create efficient markets for virtuous cycles of productive value creation in the domains of human, social, and natural capital. Mainstream economics deems this an impossible task because its definition of measurement makes trade in these forms of capital unethical and immoral forms of slavery.

But what if there is another approach to measurement? What if this alternative approach is scientific in ways unimagined in mainstream economics? What if this alternative approach has been developing in research and practice in education, psychology, health care, sociology, and other fields for over 90 years? What if there are thousands of peer-reviewed publications supporting its validity and reliability? What if a wide range of commercial firms have been successfully employing this alternative approach to measurement for decades? What if this alternative approach has been found legally and scientifically defensible in ways other approaches have not? What if this alternative approach enables us to be better stewards of our lives together than is otherwise possible?

Put another way, measuring and managing sustainability is fundamentally a problem of harmonizing relationships. What do we need to harmonize our relationships with each other, between our communities and nations, and with the earth? How can we achieve harmonization without forcing conformity to one particular scale? How can we tune the instruments of a sustainability art and science to support as wide a range of diverse ensembles and harmonies as exists in music?

Positive and hopeful answers to these questions follow from the fact that we have at our disposal a longstanding, proven, and advanced art and science of qualitatively rich measurement and instrument calibration. The crux of the message is that this art and science is poised to be the medium in which sustainability impact investing and management fulfills its potential and transforms humanity’s capacities to care for itself and the earth.

The differences between the quality of information that is available, and the quality of information currently in use in sustainability impact investing, are of such huge magnitudes that they can only be called transformative. Love and care are the power behind these transformative differences. Choosing discourse over violence, considerateness for the vulnerabilities we share with others, and care for the unity and sameness of meaning in dialogue are all essential to learning the lesson Diotima taught Socrates in Plato’s Symposium. These lessons can all be brought to bear in creating the information and communications systems we need for sustainable economies.

The so-called metrics of today’s sustainability impact investing lead to widespread complaints of increased administrative and technical burdens, and of resulting distractions that lead away from pursuit of the core social mission. The maxim, “you manage what you measure,” becomes a cynical commentary on red tape and bureaucracy instead of a commendable use of tools fit for purpose.

In contrast with the cumbersome and uninterpretable masses of data that pass for sustainability metrics today, the art and science of measurement establishes the viability and feasibility of efficient markets for human, social, and natural capital. Instead of counting paper clips in mindless accounting exercises, we can instead be learning what comes next in the formative development of a student, a patient, an employee, a firm, a community, or the ecosystem services of watersheds, forests, and fisheries.

And we can moreover support success in those developments by means of information flows that indicate where the biggest per-dollar human, social, and natural capital value returns accrue. Rigorous measurability of those returns will make it possible to price them, to own them, to make property rights legally enforceable, and to thereby align financial profits with the creation of social value. In fact, we could and should set things up so that it will be impossible to financially profit without creating social value. When that kind of system of incentives and rewards is instituted, then the self-sustaining virtuous cycle of a new ecological economy will come to life.

Though the value and originality of the innovations making this new medium possible are huge, in the end there’s really nothing new under the sun. As the French say, “plus ça change, plus c’est la même chose.” Or, as Whitehead put it, philosophically, the innovations in measurement taking hold in the world today are nothing more than additional footnotes to Plato. Contrary to both popular and most expert opinion, it turns out that not only is a moral and scientific art of human measurement possible, Plato’s lessons on how experiences of beauty teach us about meaning provide what may well turn out to be the only way today’s problems of human suffering, social discontent, and environmental degradation will be successfully addressed.

We are faced with a kind of Chinese finger-puzzle: the more we struggle, the more trapped we become. Relaxing into the problem and seeing the historical roots of scientific reasoning in everyday thinking opens our eyes to a new path. Originality is primarily a matter of finding a useful model no one else has considered. A long history of innovations comes together to point in a new direction plainly recognizable as a variation on an old theme.

Instead of a modern focus on data and evidence, then, and instead of the postmodern focus on the theory-dependence of data, we are free to take an unmodern focus on how things come into language. The chaotic complexity of that process becomes manageable as we learn to go with the flow of adaptive evolving processes stable enough to support meaningful communication. Information infrastructures in this linguistic context are conceived as ecosystems alive to changeable local situations at the same time they do not compromise continuity and navigability.

We all learn through what we already know, so it is essential that we begin from where we are at. Our first lessons will then be drawn from existing sustainability impact data, using the UN SDG 17 as a guide. These data were not designed from the principles of scientifically rigorous measurement, but instead assume that separately aggregated counts of events, percentages, and physical measures of volume, mass, or time will suffice as measures of sustainability. Things that are easy to count are not, however, likely to work as satisfactory measures. We need to learn from the available data to think again about what data are necessary and sufficient to the task.

The lessons we will learn from the data available today will lead to more meaningful and rigorous measures of sustainability. Connecting these instruments together by making them metrologically traceable to standard units, while also illuminating local unique data patterns, in widely accessible multilevel information infrastructures is the way in which we will together work the ground, plant the seeds, and cultivate new diverse natural settings for innovating sustainable relationships.



Measurement and markets

June 3, 2018

My response to a request for discussion topic suggestions from Alain Leplege and David Andrich to be taken up at the Rasch Expert Group meeting in Paris on 26 June:

The role of measurement in reducing economic transaction costs and in establishing legal property rights is well established. The value and importance of measurement is stressed everywhere, in all fields. But where measuring physical, chemical, and biological variables contributes to lower transaction costs and defensible property rights, measuring psychological, social, and environmental variables increases administrative and technical burdens with no impact at all on property rights. Why is this?

Furthermore, when physical, chemical, and biological variables are objectively measurable, no one develops their own instruments, units, internal measurement systems, or the things measured by those systems. Instead, they purchase those tools and products in open markets.

But when psychological, social, and environmental variables are objectively measured, as they have been for many decades, everyone still assumes they must develop their own instruments, units, internal measurement systems, and the things measured by those systems, instead of purchasing those tools and products in open markets. Why is this?

I propose that the answers to both these questions follow from two widely assumed misconceptions about markets and measurement.

The first misconception concerns how markets are formed. As explained by Miller and O’Leary (2007, p. 721):

“Markets are not spontaneously generated by the exchange activity of buyers and sellers. Rather, skilled actors produce institutional arrangements, the rules, roles and relationships that make market exchange possible. The institutions define the market, rather than the reverse.”

North (1981, pp. 18-19, 36), one of the founders of the new institutional economics, elaborates further:

“…without some form of measurement, property rights cannot be established nor exchange take place.”

“One must be able to measure the quantity of a good in order for it to be exclusive property and to have value in exchange. Where measurement costs are very high, the good will be a common property resource. The technology of measurement and the history of weights and measures is a crucial part of economic history since as measurement costs were reduced the cost of transacting was reduced.”

Benham and Benham (2000, p. 370) concur:

“Economic theory suggests that changes in transaction costs have a first-order impact on the production frontier. Lower transaction costs mean more trade, greater specialization, changes in production costs, and increased output.”

The second misconception, concerning measurement, stems from the assumption that the widely used but incomplete and insufficient methods based in True Score Theory are the state of the art, and that their associated reductionist and immoral commoditization of people is unavoidable. As is well known to the Rasch Expert Group attendees, the state of the art in measurement offers a wealth of advantages inaccessible to True Score Theory. One of these is the insufficiently elaborated opportunity for nonreductionist and moral commoditization of the constructs measured, not of people themselves.

It seems plain that many of today’s problems of human suffering, social discontent, and environmental degradation could be more effectively addressed by means of systematic and deliberate efforts aimed at using improved measurement methods to lower transaction costs, establish property rights, and create efficient markets supporting advanced innovations for improving the quality and quantity of intangible assets. Efforts in this direction have connected Rasch psychometrics with metrology (Mari & Wilson, 2014; Pendrill & Fisher, 2015; Fisher & Stenner, 2016), with the historical interweaving of science and the economy (Fisher, 2002, 2007, 2009, 2010, 2012, etc.), and are being applied to the development of a new class of social impact bonds.

What feedback, questions, and comments might the expert group attendees have in response to these efforts?

Additional references available on request.

Benham, A., & Benham, L. (2000). Measuring the costs of exchange. In C. Ménard (Ed.), Institutions, contracts and organizations: Perspectives from new institutional economics (pp. 367-375). Cheltenham, UK: Edward Elgar.

Miller, P., & O’Leary, T. (2007, October/November). Mediating instruments and making markets: Capital budgeting, science and the economy. Accounting, Organizations, and Society, 32(7-8), 701-734.

North, D. C. (1981). Structure and change in economic history. New York: W. W. Norton & Co.

On the practical value of rigorous modeling

June 1, 2018

What is the practical value of modeling real things in the world in terms requiring separable parameters?

If the parameters are separable, the stage is set for the validation of a model that is relevant to and applies to any individual person or challenge belonging to the infinite populations of those classes of all possible people and challenges.
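As a minimal illustration, and assuming the dichotomous Rasch model as the example (the post does not name a specific model at this point), separability can be written out explicitly. The symbols are notation introduced for this sketch: $B_n$ is the ability of person $n$, $D_i$ is the difficulty of challenge $i$, and $X_{ni}$ is the scored response.

```latex
\[
\Pr(X_{ni}=1) = \frac{e^{B_n - D_i}}{1 + e^{B_n - D_i}},
\qquad
\ln\frac{\Pr(X_{ni}=1)}{\Pr(X_{ni}=0)} = B_n - D_i .
\]
% Comparing two challenges i and j for the same person n,
% the person parameter cancels:
\[
\ln\frac{\Pr(X_{ni}=1)}{\Pr(X_{ni}=0)}
- \ln\frac{\Pr(X_{nj}=1)}{\Pr(X_{nj}=0)}
= (B_n - D_i) - (B_n - D_j) = D_j - D_i .
\]
```

Under this form, the comparison of two challenges does not depend on which person is involved, and by the same argument the comparison of two persons does not depend on which challenge is used.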

Parameter separation does not automatically translate into representations of that conjoint relationship, though. Meaningful and consistent variation in responses requires items written to provoke answers that cohere across respondents and across the questions asked.

In addition, enough questions have to be asked to drive down uncertainty relative to the variation. Response patterns can be reproduced meaningfully only when there is more variation than uncertainty. Precision and reliability are functions of that ratio.
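The ratio of variation to uncertainty can be sketched numerically. The following is a rough illustration, assuming the common decomposition of observed variance into "true" variance plus mean error variance; the function name, the measures, and the standard errors are hypothetical.

```python
import math
from statistics import pvariance


def separation_reliability(measures, standard_errors):
    """Return (separation ratio, reliability) for a set of measures.

    Assumes observed variance = true variance + mean squared standard error.
    """
    observed_var = pvariance(measures)
    error_var = sum(se ** 2 for se in standard_errors) / len(standard_errors)
    # Variation not attributable to uncertainty, floored at zero
    true_var = max(observed_var - error_var, 0.0)
    separation = math.sqrt(true_var / error_var)
    reliability = true_var / observed_var if observed_var > 0 else 0.0
    return separation, reliability


# Hypothetical person measures (in logits) and their standard errors
measures = [-2.0, -1.0, 0.0, 1.0, 2.0]
errors = [0.5, 0.5, 0.5, 0.5, 0.5]
g, r = separation_reliability(measures, errors)
print(round(g, 2), round(r, 3))
```

In this sketch the reliability equals g squared divided by one plus g squared, so both statistics rise only as the variation grows relative to the uncertainty. When the uncertainty exceeds the variation, both fall to zero.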

But when reliable and meaningful parameter separation is obtained, the laboratory model represents something real in the world. Moreover, the model stands for the object of interest in a way that persists no matter which particular people or challenges are involved.

This is where the practical value kicks in. This is what makes it possible to export the laboratory model into the real world. The specific microcosm of relationships isolated and found reproducible in research is useful and meaningful to the extent that those relationships take at least roughly the same form in the world. Instead of managing the specific indicators that are counted up in the concrete observations, it becomes possible to manage the generic object of interest, abstractly conceived. Adaptively selecting the relevant indicators according to their practical relevance on the ground in real world applications has the practical consequence of unifying measurement, management, and accountability in a single sphere of action.

Plainly there’s a lot more that needs to be said about this.

My responses to post-IOMW survey questions

May 7, 2018

My definition of objective measurement:

Reproducible invariant intervals embodied in instruments calibrated to shared unit standards explained by substantively meaningful theory. The word ‘objective’ is both redundant, like saying ‘wet rain,’ and unnecessarily exclusive of the shared subjectivity embodied in measuring instruments along with objectivity.

Distinguishing features of IOMW:

Clear focus on technical issues of measurement specifically defined in terms of models in the form of natural laws, interval units with known uncertainties, data quality assessments, explanatory theory, substantive interpretation, and metrological traceability of instruments distributed to end users throughout a sociocognitive ecosystem.

Future keynote suggestions:

Luca Mari on measurement philosophy

Leslie Pendrill on metrology

Robert Massof on LOVRNet

Stefan Cano on health metrology consensus standards

Jan Morrison on STEM Learning Ecosystems

Angelica Lips Da Cruz on impact investing

Alan Schwartz on how measurement is revolutionizing philanthropy

Future training session topic suggestions:

Traceability control systems

Electronic distributed ledger systems for tracking learning, health, etc. over time and across ecosystem niches

How to create information infrastructures capable of coherently integrating discontinuous levels of complexity, CSCW

How to access and put the wealth of available strictly longitudinal repeated measures of student learning growth to work (see Williamson’s 2016 Berkeley IMEKO paper)

How to integrate universally uniform measures of learning, health, etc. in economic models, accounting spreadsheets, TQM/CQI quality improvement methods, outcome product pricing models, and investment finance

How to approach measurement in terms of complex adaptive self organizing stochastic systems

Other comments:

I want to see a clear justification for any references to IRT. The vast majority of references to IRT at the NY meeting were actually references to measurement theory. If IRT is what is said, IRT ought to be what is meant. None of the major measurement theorists include IRT and they specifically disavow it as offering unidentifiable models, model choice based in p-values instead of principles and meaning, difficult if not impossible estimation problems, no proofs of conjoint additivity or of scores as sufficient statistics, and inconsistent assertions of both crossing ICCs and unidimensionality. IRT is not Measurement Theory. Why is it so widely featured at a measurement conference?
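The point about crossing ICCs can be made concrete with a small sketch. It assumes the standard two-parameter logistic form, in which each item has its own discrimination, and compares it with the equal-discrimination case. The two items and their values are invented for illustration.

```python
import math


def icc(theta, difficulty, discrimination=1.0):
    """Probability of success under a logistic item characteristic curve."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


# Two hypothetical items as (difficulty, discrimination)
item_a = (0.0, 0.5)
item_b = (0.5, 2.0)


def harder_item(theta, use_discrimination):
    """Name the item with the lower success probability at ability theta."""
    a_disc = item_a[1] if use_discrimination else 1.0
    b_disc = item_b[1] if use_discrimination else 1.0
    p_a = icc(theta, item_a[0], a_disc)
    p_b = icc(theta, item_b[0], b_disc)
    return "A" if p_a < p_b else "B"


for theta in (-2.0, 2.0):
    # Columns: ability, harder item with equal slopes, harder item with unequal slopes
    print(theta, harder_item(theta, False), harder_item(theta, True))
```

With equal discriminations, item B is the harder item at both ability levels. With unequal discriminations the curves cross, so which item counts as harder depends on who is answering, and the items no longer define a single shared order of difficulty.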

On social impact bonds and critical reflections

May 5, 2018

A new article (Roy, McHugh, & Sinclair, 2018) out this week in the Stanford Social Innovation Review echoes Gleeson-White (2015) in pointing out a disconnect between financial bottom lines and the social missions of companies whose primary objectives concern broader social and environmental impacts. The article also notes the expense of measurement, increased administrative burdens, high transaction costs, technical issues in achieving fair measures, the trend toward the negative implications of managing what is measured instead of advancing the mission, and the potential impacts of external policy environments and political climates.

The authors contend that social impact bonds are popular and proliferating for ideological reasons, not because of any evidence concerning their effectiveness in making the realization of social objectives profitable. Some of the several comments posted online in response to the article take issue with that claim, and point toward evidence of effectiveness. But the general point still stands: more must be done to systematically align investors’ financial interests with the citizens’ interest in advancing their financial, social, and environmental quality of life, and not just with the social service providers’ interest in funding and advancing their mission.

Roy et al. are correct to say that to do otherwise is to turn the people served into commodities. This happens because governance of, accountability for, and reporting of social impacts are shifted away from elected officials to the needs of private funders, with far less in the way of satisfactory recourse for citizens when programs go awry. The problem lies in the failure to create any capacity for individuals themselves to represent, invest in, manage, and profit from their skills, health, trust, and environmental service outcomes. Putting all the relevant information into the hands of service providers and investors, and making that information as low quality as it is, can only ever result in one-sided effects on people themselves. With no idea of the technologies, models, decades of results, and ready examples to draw from in the published research, the authors conclude with a recommendation to leave well enough alone and to pursue more traditional avenues of policy formation, instead of allowing the “cultural supremacy of market principles” to continue advancing into every area of life.

But as is so commonly the case when it comes to technical issues of quantification, the authors’ conclusions and criticisms skip over the essential role that high quality measurement plays in reducing transaction costs and supporting property rights. In general, measurement standards inform easily communicated and transferable information about the quantity and quality of products in markets, thereby lowering transaction costs and enabling rights to the ownership of specific amounts of things. The question that goes unasked in this article, and in virtually every other article in the area of ESG, social impact investing, etc., is this: What kind of measurement technologies and systems would we need to be able to replicate existing market efficiencies in new markets for human, social, and natural capital?

That question and other related ones are, of course, the theme of this blog and of many of my publications. Further exploration here and in the references to other posts (such as Fisher, 2011, 2012a, 2012b) may prove fruitful to others seriously interested in finding a way out of the unexamined assumptions stifling creativity in this area.

In short, instead of turning people into commodities, why should we not turn skills, health, trust, and environmental services into commodities? Why should not every person have legal title to scientifically and uniformly measured numbers of shares of each essential form of human, social, and natural capital? Why should individuals not be able to profit in both monetary and personal terms from their investments in education, health care, community, and the environment? Why should we allow corporations to continue externalizing the costs of social and environmental investments, at the expense of individual citizens and communities? Why is there so much disparity and inequality in the opportunities for skill development and healthy lives available across social sectors?

Might not our inability to obtain good information about processes and outcomes in the domains of educational, health care, social service, and environmental management have a lot to do with it? Why don’t we have the information infrastructure we need, when the technology for creating it has been in development for over 90 years? Why are there so many academics, researchers, philanthropic organizations, and government agencies that are content with the status quo when these longstanding technologies are available, and people, communities, and the environment are suffering from the lack of the information they ought to have?

During the French revolution, one of the primary motivations for devising the metric system was to extend the concept of universal rights to individual commercial exchanges. The confusing proliferation of metrics in Europe at the time made it possible for merchants and the nobility to sell in one unit and buy with another. Universal rights plainly implied universal measures. Alder (2002, p. 2) explains that:

“To do their job, standards must operate as a set of shared assumptions, the unexamined background against which we strike agreements and make distinctions. So it is not surprising that we take measurement for granted and consider it banal. Yet the use a society makes of its measures expresses its sense of fair dealing. That is why the balance scale is a widespread symbol of justice. … Our methods of measurement define who we are and what we value.”

Getting back to the article by Roy, McHugh, and Sinclair, yes, it is true that the measures in use in today’s social impact bonds are woefully inadequate. Far from living up to the kind of justice symbolized by the balance scale, today’s social impact measures define who we are in terms of units of measurement that differ and change in unknown ways across individuals, over time, and across instruments. This is the reason for many, if not all, of the problems Roy et al. find with social impact bonds: their measures are not up to the task.

But instead of taking that as an unchangeable given, should not we do more to ask what kinds of measures could do the job that needs to be done? Should not we look around and see if in fact there might be available technologies able to advance the cause?

Theory and evidence have, in fact, been brought to bear in formulating approaches to instrument calibration that reproduce the balance scale’s fair and just comparisons of weight from data like that from tests and surveys (Choi, 1998; Massof, 2011; Rasch, 1960, pp. 110-115). The same thing has been done in reproducing measures of length (Stephanou & Fisher, 2013), distance (Moulton, 1993), and density (Pelton & Bunderson, 2003).

These are not isolated and special results. The methods involved have been in use for decades and in dozens of fields (Wright, 1968, 1977, 1999; Wright & Masters, 1982; Wright & Stone, 1979, 1999; Andrich, 1978, 1988, 1989, 2010; Bond & Fox, 2015; Engelhard, 2012; Wilson, 2005; Wilson & Fisher, 2017). Metric system engineers and physicists are in accord with psychometricians as to the validity of these claims (Pendrill & Fisher, 2015) and are on the record with positive statements of support:

“Rasch models belong to the same class that metrologists consider paradigmatic of measurement” (Mari and Wilson, 2014, p. 326).

“The Rasch approach…is not simply a mathematical or statistical approach, but instead [is] a specifically metrological approach to human-based measurement” (Pendrill, 2014, p. 26).

These statements represent the attitude toward measurement possibilities being applied by at least one effort in the area of social impact investing. Hopefully, there will be many more projects of this kind emerging in the near future.

The challenges are huge, of course. This is especially the case when considering the discontinuous levels of complexity that have to be negotiated in making information flow across locally situated individual niches, group-level organizations and communities, and global accountability applications (Fisher, 2017; Fisher, Oon, & Benson, 2018; Fisher & Stenner, 2018). But taking on these challenges makes far more sense than remaining complicitly settled in a comfortable rut, throwing up our hands at how unfair life is.

There’s a basic question that needs to be asked. If what is presented as measurement raises transaction costs and does not support ownership rights to what is measured, is it really measurement? How can the measurement of kilowatts, liters, and grams lower transaction costs and support property rights at the same time that other so-called measurements raise transaction costs and do not support property rights? Does not this inconsistency suggest something might be amiss in the way measurement is conceived in some areas?



Alder, K. (2002). The measure of all things: The seven-year odyssey and hidden error that transformed the world. New York: The Free Press.

Andrich, D. (1978). A rating formulation for ordered response categories. Psychometrika, 43(4), 561-573.

Andrich, D. (1988). Sage University Paper Series on Quantitative Applications in the Social Sciences. Vol. series no. 07-068: Rasch models for measurement. Beverly Hills, California: Sage Publications.

Andrich, D. (1989). Constructing fundamental measurements in social psychology. In J. A. Keats, R. Taft, R. A. Heath & S. H. Lovibond (Eds.), Mathematical and theoretical systems. Proceedings of the 24th International Congress of Psychology of the International Union of Psychological Science, Vol. 4 (pp. 17-26). Amsterdam, Netherlands: North-Holland.

Andrich, D. (2010). Sufficiency and conditional estimation of person parameters in the polytomous Rasch model. Psychometrika, 75(2), 292-308.

Bond, T., & Fox, C. (2015). Applying the Rasch model: Fundamental measurement in the human sciences, 3d edition. New York: Routledge.

Choi, E. (1998). Rasch invents “ounces.” Popular Measurement, 1(1), 29.

Engelhard, G., Jr. (2012). Invariant measurement: Using Rasch models in the social, behavioral, and health sciences. New York: Routledge Academic.

Fisher, W. P., Jr. (2011). Bringing human, social, and natural capital to life: Practical consequences and opportunities. Journal of Applied Measurement, 12(1), 49-66.

Fisher, W. P., Jr. (2012a). Measure and manage: Intangible assets metric standards for sustainability. In J. Marques, S. Dhiman & S. Holt (Eds.), Business administration education: Changes in management and leadership strategies (pp. 43-63). New York: Palgrave Macmillan.

Fisher, W. P., Jr. (2012b, May/June). What the world needs now: A bold plan for new standards [Third place, 2011 NIST/SES World Standards Day paper competition]. Standards Engineering, 64(3), 1 & 3-5.

Fisher, W. P., Jr. (2017). A practical approach to modeling complex adaptive flows in psychology and social science. Procedia Computer Science, 114, 165-174.

Fisher, W. P., Jr., Oon, E. P.-T., & Benson, S. (2018). Applying Design Thinking to systemic problems in educational assessment information management. Journal of Physics Conference Series, in press.

Fisher, W. P., Jr., & Stenner, A. J. (2018). Ecologizing vs modernizing in measurement and metrology. Journal of Physics Conference Series, in press.

Gleeson-White, J. (2015). Six capitals, or can accountants save the planet? Rethinking capitalism for the 21st century. New York: Norton.

Mari, L., & Wilson, M. (2014, May). An introduction to the Rasch measurement approach for metrologists. Measurement, 51, 315-327.

Massof, R. W. (2011). Understanding Rasch and Item Response Theory models: Applications to the estimation and validation of interval latent trait measures from responses to rating scale questionnaires. Ophthalmic Epidemiology, 18(1), 1-19.

Moulton, M. (1993). Probabilistic mapping. Rasch Measurement Transactions, 7(1), 268.

Pelton, T., & Bunderson, V. (2003). The recovery of the density scale using a stochastic quasi-realization of additive conjoint measurement. Journal of Applied Measurement, 4(3), 269-281.

Pendrill, L. (2014, December). Man as a measurement instrument [Special Feature]. NCSLi Measure: The Journal of Measurement Science, 9(4), 22-33.

Pendrill, L., & Fisher, W. P., Jr. (2015). Counting and quantification: Comparing psychometric and metrological perspectives on visual perceptions of number. Measurement, 71, 46-55.

Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests (Reprint, with Foreword and Afterword by B. D. Wright, Chicago: University of Chicago Press, 1980). Copenhagen, Denmark: Danmarks Paedogogiske Institut.

Roy, M. J., McHugh, N., & Sinclair, S. (2018, 1 May). A critical reflection on social impact bonds. Stanford Social Innovation Review. Retrieved 5 May 2018.

Stephanou, A., & Fisher, W. P., Jr. (2013). From concrete to abstract in the measurement of length. Journal of Physics Conference Series, 459.

Wilson, M. (2005). Constructing measures: An item response modeling approach. Mahwah, New Jersey: Lawrence Erlbaum Associates.

Wilson, M., & Fisher, W. (2017). Psychological and social measurement: The career and contributions of Benjamin D. Wright. New York: Springer.

Wright, B. D. (1968). Sample-free test calibration and person measurement. In Proceedings of the 1967 invitational conference on testing problems (pp. 85-101). Princeton, New Jersey: Educational Testing Service.

Wright, B. D. (1977). Solving measurement problems with the Rasch model. Journal of Educational Measurement, 14(2), 97-116.

Wright, B. D. (1999). Fundamental measurement for psychology. In S. E. Embretson & S. L. Hershberger (Eds.), The new rules of measurement: What every educator and psychologist should know (pp. 65-104). Hillsdale, New Jersey: Lawrence Erlbaum Associates.

Wright, B. D., & Masters, G. N. (1982). Rating scale analysis: Rasch measurement. Chicago, Illinois: MESA Press.

Wright, B. D., & Stone, M. H. (1979). Best test design: Rasch measurement. Chicago, Illinois: MESA Press.

Wright, B. D., & Stone, M. H. (1999). Measurement essentials. Wilmington, DE: Wide Range, Inc.

Measurement quality checklist

April 21, 2018

We all make and use dozens of measurements every day, reading everything from clocks to speedometers to rulers to weight scales to thermometers. We also deal with numbers often called measures but which are not expressed in a meaningful unit of comparison, like test scores, ratings, survey percentages, counts of how many times something happened, etc. These numbers don’t stand for a constant amount that adds up in the way hours, distances, weight, and temperature do. These lower quality ordinal numbers from tests and ratings are in wide use and are commonly called measurements but are also generally understood as not obtaining the kind of rigor and precision associated with physical measurements.
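A small sketch shows why raw scores do not add up the way interval measures do. It assumes a hypothetical test of 20 equally difficult items and a Rasch-type log-odds transformation, neither of which is specified above. Under those assumptions, the measure for a raw score r out of L items is ln(r / (L - r)), relative to the common item difficulty.

```python
import math


def score_to_logit(raw_score, n_items):
    """Log-odds measure for a raw score on n_items equally difficult items."""
    if not 0 < raw_score < n_items:
        raise ValueError("extreme scores have no finite log-odds measure")
    return math.log(raw_score / (n_items - raw_score))


n_items = 20  # hypothetical test length

# The same one-point raw score gain, at the middle and near the top of the scale
mid_gain = score_to_logit(11, n_items) - score_to_logit(10, n_items)
end_gain = score_to_logit(19, n_items) - score_to_logit(18, n_items)
print(round(mid_gain, 2), round(end_gain, 2))
```

In this sketch, one more right answer is worth about 0.20 logits in the middle of the scale and about 0.75 logits near the top, so equal differences in scores do not stand for equal differences in the amount measured.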

The failure to produce meaningful measurements of abilities, attitudes, knowledge, and behaviors, however, does not mean the things we measure with tests, surveys, and rating scales cannot be scientifically quantified. On the contrary, meaningful, useful, reliable, and validated high quality measurement has been available and in use on mass scales for decades. Perhaps the public remains unaware of these developments because of the technical mathematics and laborious analytic details involved. There are also a great many unexamined cultural presuppositions that assume human attributes cannot be measured meaningfully, and should not be measured even if they can be.

The problem here, as is so often the case when uninformed opinions hold sway, is that the predominance of meaningless numbers masquerading as measures actually makes us much worse off than we would be if we invested the time and resources needed to create the Intangible Assets Metric System I’ve referred to elsewhere in this blog.

Many of the other posts here contrast meaningless and meaningful approaches to measurement, so I won’t repeat any of that here. What I’ll do instead is provide something that’s been suggested by many friends and colleagues over the years: a simple checklist of basic features that ought to be readily available in any measurement system worthy of the name. To find out more about any of these features, search the terms as they are listed.

An interval unit

Individual measures in that unit

Individual item locations in that unit

Rating scale transition thresholds in that unit

An uncertainty or error term for each individual measure

Data quality, internal consistency, and model fit statistics for each individual measure

Experimental evidence supporting the claim to an interval unit

A mathematical model of the interval unit

References to mathematical proofs that (a) the observed data are necessary and sufficient to the estimation of the model parameters, and (b) the model satisfies the requirements of conjoint additivity

Cronbach’s alpha, a KR-20, a separation index, or a separation reliability coefficient expressing the ratio of explained variance to uncertainty/error variance, for both persons and items

A map of the construct measured illustrating how the items are supposed to work together

Interpretive guidelines showing what measures mean as functions of item scale locations

A Wright map illustrating the conjoint relation of the measure and item distributions

A kidmap or other map of individual ordered responses useful for informing instruction or treatment

A theory explaining variation in item scale locations

Evidence of traceability to a unit standard, if available

Evidence that items are not biased for or against any identifiable groups

Information on the calibration sample and results (responses per item, etc)
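Several items in the checklist above (an interval unit, a mathematical model of it, and conjoint additivity) can be illustrated with a small sketch. It assumes the dichotomous Rasch model, which is one common choice and not the only one; the person measures and item locations are hypothetical values in logits, invented purely for illustration.

```python
import math

def rasch_probability(measure, difficulty):
    """Probability of a positive response under the dichotomous Rasch model.

    Both arguments are in logits, the model's interval unit.
    """
    return 1.0 / (1.0 + math.exp(-(measure - difficulty)))

def log_odds(p):
    """Convert a probability back to the logit scale."""
    return math.log(p / (1.0 - p))

# Hypothetical person measures and item locations (assumptions, not real data).
persons = {"A": 1.0, "B": 2.0}
items = {"easy": -1.0, "hard": 1.5}

# Additivity: the log-odds gap between two persons is the same on every item,
# which is what lets one logit stand for the same amount anywhere on the scale.
for item_name, difficulty in items.items():
    gap = (log_odds(rasch_probability(persons["B"], difficulty))
           - log_odds(rasch_probability(persons["A"], difficulty)))
    print(item_name, round(gap, 6))
```

In this sketch the gap between the two persons works out to one logit on both items, however easy or hard the item is. Real applications also have to test whether observed data actually fit such a model, which the sketch does not attempt.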

For a commercially successful scientific reading ability measurement framework, see Fisher & Stenner, 2016. For articles co-authored by metrology engineers and psychometricians, see Mari & Wilson, 2014, and Pendrill & Fisher, 2015. To see thousands of articles and books on related measurement topics dating back 50 years, search Rasch Measurement in Google Scholar. For consulting and advice on measurement, fill in a comment below. For overviews of the role of measurement quality in education and health care, see Cano and Hobart (2011), Hobart, et al. (2007), Massof (2008), Massof and Rubin (2001), Wilson (2013), and Wright (1984, 1999).


Cano, S. J., & Hobart, J. C. (2011). The problem with health measurement. Patient Preference and Adherence, 5, 279-290.

Fisher, William P. Jr., and Stenner, A. Jackson. 2016. Theory-based metrological traceability in education: A reading measurement network. Measurement, 92, 489-496.

Hobart, J. C., Cano, S. J., Zajicek, J. P., & Thompson, A. J. (2007, December). Rating scales as outcome measures for clinical trials in neurology: Problems, solutions, and recommendations. Lancet Neurology, 6, 1094-1105.

Mari, Luca, and Wilson, Mark. 2014. An introduction to the Rasch measurement approach for metrologists. Measurement, 51, 315–327.

Massof, R. W. (2008, July-August). Editorial: Moving toward scientific measurements of quality of life. Ophthalmic Epidemiology, 15, 209-211.

Massof, R. W., & Rubin, G. S. (2001, May-Jun). Visual function assessment questionnaires. Survey of Ophthalmology, 45(6), 531-48.

Pendrill, Leslie, and Fisher, William P. Jr. 2015.  Counting and quantification: Comparing psychometric and metrological perspectives on visual perceptions of number. Measurement, 71, 46-55.

Wilson, M. R. (2013, April). Seeking a balance between the statistical and scientific elements in psychometrics. Psychometrika, 78(2), 211-236.

Wright, B. D. (1984). Despair and hope for educational measurement. Contemporary Education Review, 3(1), 281-288.

Wright, B. D. (1999). Fundamental measurement for psychology. In S. E. Embretson & S. L. Hershberger (Eds.), The new rules of measurement: What every educator and psychologist should know (pp. 65-104). Hillsdale, New Jersey: Lawrence Erlbaum Associates.

Reductionist vs Nonreductionist Conceptualizations of Psychological and Social Measurement

April 19, 2018


Reductionist

  • Root metaphor: Mechanical clockwork universe
  • Paradigmatic case: Newtonian physics
  • Complete, consistent, deterministic structures
  • Whole is sum of parts
  • Sufficient statistics: Mean & std deviation
  • Uncertainty is variation across repeated measures
  • Test/survey items in use define totality of universe of possibility; changes in items change that universe
  • Descriptive, reactionary
  • Microlevel facts are supposed to additively combine into general laws
  • General laws are discovered by measuring
  • Top down data analytics influence policy
  • Externally imposed assembly processes
  • Subject/object dualism institutionalized in data analytics process
  • Data are hallmark criterion of objectivity
  • Subjectivity discounted, removed if possible
  • Counts are quantities
  • Ordinal scores treated as interval measures with no justification
  • Score variation relates solely to person characteristics
  • Score meaning tied to particular questions asked
  • Quantitative methods don’t define unit quantities or test for them
  • Qualitative data and methods are separated from quantitative data/methods
  • No model of construct stated or tested
  • Group level multivariate focus
  • P-values are primary model fit criterion
  • Population sampling motivates probabilistic approach
  • Equating based on statistical assumptions concerning score distribution


Nonreductionist

  • Root metaphor: Living organic universe
  • Paradigmatic case: Multilevel ecosystems
  • Incomplete, not perfectly consistent, stochastic structures
  • Whole is greater than sum of parts
  • Sufficient statistics: scores
  • Uncertainty is resonance of stochastic invariance within individual measures
  • Test/survey items in use sample from infinite population; changes in items used do not change that universe
  • Prescriptive, anticipatory
  • Microlevel facts self-organize into meso abstractions & macro formalisms
  • Measuring presumes general laws
  • Bottom up alignments and coordinations of decisions and behaviors move society
  • Internal processes of self-organization
  • Mutually implied subject-object entangled together in playful flow institutionalized via distributed instrumentation
  • Objectivity requires data explained by theory embodied in instruments
  • Subjectivity included as valid source of concerns and insights scrutinized for value
  • Counts might lead to quantity definition
  • Interval measures theoretically and empirically substantiated
  • Empirical & theoretical measure variation maps construct via items and persons
  • Measure meaning is independent of particular questions asked
  • Quantitative methods define unit quantities and test for them
  • Qualitative methods are integrated with quantitative methods
  • Mathematical, observation, and construct models stated and tested
  • Individual level univariate focus
  • Meaningful construct definition primary model fit criterion
  • Individual response process motivates probabilistic approach
  • Equating requires alignment of items along common dimension


Comments on SASB Standards

December 24, 2017

William P. Fisher, Jr., Ph.D., Sausalito, CA, and UC Berkeley BEAR Center

Submitted to SASB on 22 December 2017

These comments pertain to all SASB Industry Standards, all Disclosure Topics, all Accounting Metrics, and all Metric-Level proposed updates. The latter are of particular interest because unexamined assumptions about how the proposed metrics work compromise all of the Criteria for Accounting Metrics in the SASB Conceptual Framework, as listed, for instance, on page 7 of the Services-Basis for Conclusions.pdf document. The criteria of particular interest include fair representation, usefulness, applicability, comparability, and being complete. The goal here is to document these issues for future targeting as the metrics are improved.

In creating an array of sustainability accounting standards, SASB has worked to advance the practical value of the ancient interdependencies of measurement and commerce. The profound efforts that have been invested in creating standards for sustainability accounting demand continued focus in moving forward from intuitions about measurability to more rigorous, convenient, and scientific approaches to qualitatively meaningful quantification. The Proposed Changes to Provisional Standards that are currently open for public review and comment extend and amplify previous assumptions about measurement and approaches to it. These comments spell out some of those assumptions and offer alternatives more likely to function as transparent media embodying the values of sustainability.

Six issues in particular unnecessarily complicate and hobble the standards and their implementation, and should be addressed in future improvements. The scientific value and viability of these recommendations have been asserted in recent collaborations of metrologists (weights and measures standards engineers) and psychological measurement researchers. The practical value of these recommendations has been established in over 60 years of research and applications across a wide range of fields. A small sampling of the tens of thousands of peer-reviewed publications in this area is listed below, grouped by topic.

First, comparable metrics need not be based in common content. Content provides an initial clue as to common interests and potential comparability, but remaining fixated on content as the sole criterion for communicating variation creates more problems than it solves. Decades of research and practice in psychology and the social sciences show how different indicators can be calibrated to measure the same thing. This opens the door to flexible methods of adapting indicators to the needs of individual firms or industries without compromising comparability across firms within and across industries.

Second, the focus on individual sustainability indicators as the metrics of interest needlessly over-complicates the interpretation and application of the standards. The consensus choice of particular indicators as being of interest in evaluating sustainability in a given industry suggests an implicit theory of what all the indicators taken together likely measure. That theory needs to be articulated, a formal mathematical model of what is to be measured needs to be stated, the model needs to be experimentally tested, and the entire population of all relevant indicators needs to be calibrated in a common unit of comparison. Doing this will result in a capacity to summarize multiple indicators in a single number that can be interpreted meaningfully in terms of the established consistency of the pattern across all indicators measuring the same aspect of sustainability, whether or not they were administered.
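To make the idea of summarizing calibrated indicators in a single number more concrete, here is a minimal sketch. It assumes dichotomous (met / not met) indicators that have already been calibrated on a common logit scale under a Rasch model; the function name `estimate_measure`, the indicator locations, and the responses are all hypothetical, and SASB proposes nothing of the kind.

```python
import math

def estimate_measure(responses, difficulties, tol=1e-8, max_iter=100):
    """Maximum likelihood estimate of one measure (in logits) and its
    standard error, given 0/1 responses to indicators whose locations
    are already calibrated. Uses Newton-Raphson iteration."""
    raw = sum(responses)
    n = len(responses)
    if raw == 0 or raw == n:
        raise ValueError("extreme score: no finite estimate exists")
    theta = math.log(raw / (n - raw))  # starting value
    for _ in range(max_iter):
        probs = [1.0 / (1.0 + math.exp(-(theta - d))) for d in difficulties]
        info = sum(p * (1.0 - p) for p in probs)
        step = (raw - sum(probs)) / info
        theta += step
        if abs(step) < tol:
            break
    # Recompute the information at the final estimate for the standard error.
    probs = [1.0 / (1.0 + math.exp(-(theta - d))) for d in difficulties]
    info = sum(p * (1.0 - p) for p in probs)
    return theta, 1.0 / math.sqrt(info)

# Two hypothetical firms reporting on DIFFERENT indicators from the same
# calibrated pool; both estimates come out in the same unit.
firm_a = estimate_measure([1, 1, 0, 0], [-1.5, -0.5, 0.5, 1.5])
firm_b = estimate_measure([1, 1, 0, 0], [0.5, 1.5, 2.5, 3.5])
print(firm_a, firm_b)
```

Both firms meet two of four indicators, but the second firm's indicators are harder, so its estimated measure is higher. The raw counts alone would have hidden that difference, and the shared scale is what keeps the two firms comparable without common content.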

Third, and in the same vein, numeric counts and percentages are not measures. Contrary to the terms used in the SASB standards, counts and percentages are not quantitative in the sense of each additional unit counted standing for the same amount. I may have five rocks, and you may have two, but there is no way of telling from those counts who has more rock. Counts and percentages are at best ordinal, not interval. Commercial measurement standards for weight, volume, etc. all employ interval units established via theory and experiment as maintaining their size across counts of concrete individual instances of real things. Sustainability metrics require the same attention to technical detail as metrics in any other area of commerce.
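The rock example can be written out in a few lines. The masses are invented for illustration; the point is only that the ordering by count and the ordering by amount need not agree.

```python
# Hypothetical rock masses in kilograms (illustrative values only).
my_rocks = [0.2, 0.2, 0.2, 0.2, 0.2]   # five small rocks
your_rocks = [1.5, 2.0]                # two large rocks

my_count, your_count = len(my_rocks), len(your_rocks)
my_mass, your_mass = sum(my_rocks), sum(your_rocks)

# By count I have more rocks; by mass you have more rock.
print("count:", my_count, "vs", your_count)
print("mass (kg):", round(my_mass, 2), "vs", round(your_mass, 2))
```

Only the mass comparison rests on a unit that keeps its size from one rock to the next.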

Fourth, to interpret measures of mass, energy, volume, etc. as measures of sustainability, they need to be shown via theory and experiment to actually support these kinds of inferences. Physical measures (such as metric tons, joules, kilowatt hours, etc.) are scientifically calibrated to measure in standard unit amounts, but that does not mean those measures of physical variables automatically translate into measures of sustainability, which is what is assumed across many of the SASB standards. To measure sustainability, as distinct from mass, energy, or volume, it must be conceptualized in theory, experimentally modeled and tested, and embodied in a network of calibrated instruments traceable to a unit standard.

Fifth, uncertainty needs to be explicitly estimated and presented as an expected range of variation.
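A minimal sketch of what presenting uncertainty as an expected range could look like, assuming each measure comes with a standard error and that an approximately normal 95% range is acceptable. The function names and the numbers are hypothetical.

```python
import math

def expected_range(measure, standard_error, z=1.96):
    """Return the (low, high) range expected to contain the measure,
    using a normal approximation (z=1.96 for roughly 95%)."""
    half_width = z * standard_error
    return (measure - half_width, measure + half_width)

def distinguishable(m1, se1, m2, se2, z=1.96):
    """True if two measures differ by more than their combined uncertainty."""
    return abs(m1 - m2) > z * math.sqrt(se1 ** 2 + se2 ** 2)

# Hypothetical measures (in logits) for two firms.
print(expected_range(1.0, 0.5))
print(distinguishable(1.0, 0.5, 1.4, 0.5))
```

Reporting the range alongside the measure makes it plain when an apparent difference between two firms falls within the noise.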

Sixth, inconsistencies in a firm’s data across indicators need to be flagged for special attention. Longstanding report formats and methods can be put to good use here.
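One way such flagging might work, sketched under the assumption of a dichotomous Rasch model with known indicator locations: compare each observed response with its modeled probability and flag large standardized residuals. The cutoff of 2, the function name, and the data are illustrative assumptions, not an established SASB procedure.

```python
import math

def flag_inconsistencies(measure, responses, difficulties, cutoff=2.0):
    """Return indices of responses whose standardized residual exceeds
    the cutoff in absolute value, given a firm's measure in logits."""
    flagged = []
    for index, (x, d) in enumerate(zip(responses, difficulties)):
        p = 1.0 / (1.0 + math.exp(-(measure - d)))   # modeled probability
        z = (x - p) / math.sqrt(p * (1.0 - p))       # standardized residual
        if abs(z) > cutoff:
            flagged.append(index)
    return flagged

# Hypothetical firm at 0.0 logits that unexpectedly meets a very hard indicator.
difficulties = [-2.0, -1.0, 0.0, 1.0, 3.0]
responses = [1, 1, 0, 0, 1]
print(flag_inconsistencies(0.0, responses, difficulties))
```

Here only the last indicator is flagged: a firm at this level would rarely be expected to meet it, so that response deserves a closer look before it is taken at face value.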

Given the importance of sustainability standards for the future of life on earth, given SASB’s efforts at creating sustainability metric standards, and given the huge multipliers obtained when distributed network effects are put in play, the implicit goal of SASB is the establishment of a new intangible assets metric system. These comments are intended to provoke further deliberately conceived and implemented developments in that direction.

The end result of those developments will be the creation of common languages of sustainability research and practice, common languages that provide the media for collectively coordinating decisions and behaviors across local and global spheres of activity. As such, given the history of economics, it can reasonably be expected that the efficiencies gained from enhanced communications and information infrastructures will bring sustainability capital to life on previously unimagined mass scales. See the references listed below for more information in this area.

Following through on these recommendations will make it possible to harness the energy of the billions of people globally who for decades have been vocally expressing their desire for change. Providing a medium for channeling and focusing that energy on sustainability is the most urgent demand of our times. SASB is leading a vitally important array of efforts in this direction. The challenges are huge, but having defined the problem, humanity is likely able to come through for itself as the steward of life on earth.

References on bringing intangible assets to life

Fisher, W. P., Jr. (2002, Spring). “The Mystery of Capital” and the human sciences. Rasch Measurement Transactions, 15(4), 854.

Fisher, W. P., Jr. (2007, Summer). Living capital metrics. Rasch Measurement Transactions, 21(1), 1092-1093.

Fisher, W. P., Jr. (2009, November 19). Draft legislation on development and adoption of an intangible assets metric system. Retrieved 6 January 2011, from the Living Capital Metrics blog.

Fisher, W. P., Jr. (2009, November). Invariance and traceability for measures of human, social, and natural capital: Theory and application. Measurement, 42(9), 1278-1287.

Fisher, W. P., Jr. (2009). NIST Critical national need idea White Paper: Metrological infrastructure for human, social, and natural capital (Tech. Rep.). Washington, DC: National Institute of Standards and Technology.

Fisher, W. P., Jr. (2010). Measurement, reduced transaction costs, and the ethics of efficient markets for human, social, and natural capital, Bridge to Business Postdoctoral Certification, Freeman School of Business, Tulane University.

Fisher, W. P., Jr. (2011). Bringing human, social, and natural capital to life: Practical consequences and opportunities. Journal of Applied Measurement, 12(1), 49-66.

Fisher, W. P., Jr. (2012). Measure and manage: Intangible assets metric standards for sustainability. In J. Marques, S. Dhiman & S. Holt (Eds.), Business administration education: Changes in management and leadership strategies (pp. 43-63). New York: Palgrave Macmillan.

Fisher, W. P., Jr. (2012, May/June). What the world needs now: A bold plan for new standards [Third place, 2011 NIST/SES World Standards Day paper competition]. Standards Engineering, 64(3), 1 & 3-5.

Fisher, W. P., Jr., & Stenner, A. J. (2011, January). Metrology for the social, behavioral, and economic sciences (Social, Behavioral, and Economic Sciences White Paper Series). National Science Foundation.

Fisher, W. P., Jr., & Stenner, A. J. (2011, August 31 to September 2). A technology roadmap for intangible assets metrology. In Fundamentals of measurement science. International Measurement Confederation (IMEKO) TC1-TC7-TC13 Joint Symposium, Jena, Germany.

Miller, P., & O’Leary, T. (2007, October/November). Mediating instruments and making markets: Capital budgeting, science and the economy. Accounting, Organizations, and Society, 32(7-8), 701-734.

References on metrology and psychological measurement

Fisher, W. P., Jr., & Stenner, A. J. (2016). Theory-based metrological traceability in education: A reading measurement network. Measurement, 92, 489-496.

Mari, L., Maul, A., Irribarra, D. T., & Wilson, M. (2016, March). Quantities, quantification, and the necessary and sufficient conditions for measurement. Measurement, 100, 115-121.

Mari, L., & Wilson, M. (2014, May). An introduction to the Rasch measurement approach for metrologists. Measurement, 51, 315-327.

Mari, L., & Wilson, M. (2015, 11-14 May). A structural framework across strongly and weakly defined measurements. Instrumentation and Measurement Technology Conference (I2MTC), 2015 IEEE International, pp. 1522-1526.

Pendrill, L. (2014, December). Man as a measurement instrument [Special Feature]. NCSLi Measure: The Journal of Measurement Science, 9(4), 22-33.

Pendrill, L., & Fisher, W. P., Jr. (2013). Quantifying human response: Linking metrological and psychometric characterisations of man as a measurement instrument. Journal of Physics Conference Series, 459,

Pendrill, L., & Fisher, W. P., Jr. (2015). Counting and quantification: Comparing psychometric and metrological perspectives on visual perceptions of number. Measurement, 71, 46-55.

Wilson, M., & Fisher, W. (2016). Preface: 2016 IMEKO TC1-TC7-TC13 Joint Symposium: Metrology Across the Sciences: Wishful Thinking? Journal of Physics Conference Series, 772(1), 011001.

Wilson, M., Mari, L., Maul, A., & Torres Irribara, D. (2015). A comparison of measurement concepts across physical science and social science domains: Instrument design, calibration, and measurement. Journal of Physics Conference Series, 588(012034),

References on the state of the art in psychological measurement

Barney, M., & Fisher, W. P., Jr. (2016, April). Adaptive measurement and assessment. Annual Review of Organizational Psychology and Organizational Behavior, 3, 469-490.

Stenner, A. J., Fisher, W. P., Jr., Stone, M. H., & Burdick, D. S. (2013, August). Causal Rasch models. Frontiers in Psychology: Quantitative Psychology and Measurement, 4(536), 1-14 [doi: 10.3389/fpsyg.2013.00536].

Wilson, M. R. (2013, April). Seeking a balance between the statistical and scientific elements in psychometrics. Psychometrika, 78(2), 211-236.

Wilson, M. R. (2013). Using the concept of a measurement system to characterize measurement models used in psychometrics. Measurement, 46, 3766-3774.

William P. Fisher, Jr., Ph.D.

Research Associate

BEAR Center

Graduate School of Education

University of California, Berkeley

Skype: wfisher2

We are what we measure.

It’s time we measured what we want to be.

Private Costs and Public Goods

December 4, 2017

Concerning the relation of private costs to public goods, I see two issues here that need to be spelled out in greater detail.

First, as everyone is likely well aware, the concept of private property is unknown in many, most, or all traditional cultures. From this point of view, land in the sense of a bordered parcel carved out of the larger whole of interdependent relationships in an ecosystem is bizarrely dysfunctional and confused. The public goods of watershed, air purification, fishery, wildlife, etc. services cannot in any way be disentangled from the way private costs are thoroughly absorbed into the cyclical dynamics of symbiotic give and take, investment and return. Profit is defined entirely in terms of value for life. Overuse and misuse that imbalances relationships has tangible consequences that negatively impact quality of life.

Western culture broadened the scope of activity in ways that made it possible to expand the relationships in play beyond the constraints of local circumstances. Private costs incurred in cycles of investment and return could be sunk over longer time periods and across wider geographic ranges. Negative local consequences were balanced against positive returns accumulated elsewhere. Overall returns were negative as the resource base was destroyed, but the accounting methods in use and cultural values in play ignored this in favor of a narrower definition of personal profit that reinforced the extractive processes.

This system could continue only as long as new resources could be identified and converted to profit. As private costs increased and returns decreased, new accounting methods and cultural values emerged and focused on rebalancing ecosystem interdependencies. Unfortunately, culturally ingrained habits of thought and institutionalized patterns of incentives and rewards often result in counterproductive conceptualizations of the situation. Too often, the linear destruction of the resource base is assumed to be the defining characteristic of a functional economy, and it is further assumed that this system must somehow be juxtaposed alongside or made externally congruent with ecosystems’ paradigmatically different circular interdependencies.

This, to me, is the background to your question about private costs and public goods. As long as we conceive and enact our relationships with social value and environmental services in terms of an either-or dichotomy, we are lost. But is it really true that our only alternatives are to subject externalities to extractive processes or to constrain those processes so as to allow ecosystem dynamics a wider scope of free rein at the expense of humanity? Should we think only in terms of (a) letting the current form of capitalism run rampant, (b) scaling back human activity to some kind of utopian low-tech, low-impact integration with a nature religion, or (c) the currently dominant assumption of a tense compromise between these two options of destroying or preserving the resource base?

Why are so few people talking about reconceiving economies as ecosystems of interdependent relationships that encompass not just human institutions but nonhuman forms of life and ecologies as well? Why should not we strive for an economy in which value for life is conceived as authentic wealth and market institutions make monetized profits contingent on nurturing genuine productivity? Why should not negative impacts on public goods be translated into private costs borne by individuals, communities, and businesses? Why should not everyone have the means to impact via their own behaviors, and to track in their own accounts, the quality and quantity of the personal stocks of shares in public goods they own? Why cannot we create legal and regulatory supports for entrepreneurs to be rewarded for wide commercial sales of their innovations reducing human suffering, social discontent, and environmental degradation? Why should not individuals be able to invest in privately owned shares of public goods in ways that efficiently move capital resources to where they are most effectively employed in the name of creating authentic wealth? How else could we ever amass the magnitude of focused effort it is going to take to rebalance the climate, get plastic microparticles out of the oceans, eliminate human suffering, and remove the sources of social discontent?

I am sure it seems counterintuitive but this problem is akin to a Chinese finger puzzle. The harder we pull, the more tightly trapped we become. We have to relax into the problem to be released from it. In jujitsu fashion, we have to use the energy of what we oppose to advance toward our goals. The profit motive is destructive because we have not integrated the genuine wealth we say we value into our accounting, financial, and economic systems. If we really do value human, social, and environmental riches over mere monetary riches as much as we say we do, why have we invested so little effort in finding qualitatively meaningful and mathematically rigorous ways of communicating, sharing, storing, and growing that value? Why haven’t we codified the legal structures and enforcement bodies that would monitor adherence to new institutional norms? Why don’t we have standards for the common languages and currencies we need if we are to be able to efficiently exchange meaningful expressions of real value?

If we would measure and manage investments in individual stocks of intangible assets the way we do for tangible assets, we could orchestrate a different balance of power. Efficient markets for investments in human, social, and natural capital would quickly channel flows away from businesses destroying genuine wealth toward those growing it. So efficient markets are not themselves the primary problem. The problem is that three of the four forms of capital comprising the economy have not been brought to life in the form of transferable representations serving as common currencies for the exchange of value. Just as land was once universally regarded a public good but came to be a privately owned source of costs and profits, so, too, will today’s public goods also be transformed.

There is a huge difference in the contexts of these transformations that must not be overlooked. Land became a private commodity in isolation with no system of checks and balances to constrain it. The earth appeared to be a bottomless resource well available for the taking. Issues of intangible market externalities were relevant only to the extent they negatively affected profits. Compassion for collateral damage was regarded either as foolish or as marketing opportunities for ostentatious public displays of prepackaged concern.

The transformation of the broader array of public goods into some form of privately owned and managed property is taking place in a qualitatively different context. Taking the trouble to articulate our values in ways that make us accountable for them will lead to a decisive “put up or shut up” moment for humanity. If people are as innately greedy and selfish as some think, the availability of legally binding and monetarily profitable accounting, financial, and economic systems for investing in and producing enhanced social value will mean nothing, and monetary profits will continue to be extracted, illegally, and perhaps at higher rates, only from privately owned tangible assets in such high demand that no one could get along without them.

On the other hand, a global movement spanning well more than half a century has been consistently seeking to devise ways of addressing these problems. Hundreds of millions of individuals, thousands of organizations, and hundreds of countries have conferred, invested, discussed, agreed, planned, and created across billions of ideas and possibilities. The will to do what needs to be done seems to me to exist in abundance. Those who think otherwise seem reconciled to the destructive scaling back of human activity, which only puts on display their own inability to think past today’s cultural assumptions to new expansions of the ecologically sound aspects of institutions of the past.

I think that we can indeed imagine ecosystems of interdependent relationships in which people will do what needs to be done not because it is the right or good thing to do but because they can see how it works to enrich the value obtained in their lives, for their families and communities. The ecosystem has to do the work of multiplying value indirectly into the larger community as an effect of each individual’s decisions and behaviors. This is not and cannot be a matter of individual concerns or intentions. The institutions have to support what Hayek called the true individual: not the falsely isolated and selfish individual but the authentic person whose identity and roles are shaped via relations of trust in thousands of unknown others, mediated by every exchange of information or value.

What we need to do is amplify and multiply the number and nature of these relations of trust. These amplifications and multiplications have been in process for decades in education, health care, social services, environmental resource management, and other areas. Technical advances in the quality of the information created and managed in these fields are quietly accumulating into new relationships between teachers and students, clinicians and patients, researchers and practitioners, consumers and producers, the social and natural sciences, markets and externalities, etc. The information transforming these relationships builds trust by revealing the day-to-day consistency and reliability of what people say and do in classrooms, clinics, and offices over time and across situations. New information systems show where students, teachers, clinicians, patients, managers, etc. stand relative to where they were in the past, relative to their goals, and relative to everyone else. The information is actionable in previously unavailable ways, as it shows what comes next in one’s self-defined learning trajectory or improvement goal sequence.

Technical issues concerning information complexity are being addressed to prevent the kinds of failures plaguing efforts in the past. Most importantly, in the manner of the advances in resilience and lean thinking informing manufacturing in recent decades, one of the consequences of increased information quality and the enhanced levels of trust obtained in the caring arts and sciences is that those on the front lines are empowered to act creatively in new ways. There is no reason to think that the innovations and advances that can be expected in these contexts will be less impressive or valuable than we have witnessed in technological areas in recent years. It seems highly likely that the inflationary spirals characteristic of these fields will be reversed into deflationary economies akin to that of microelectronics, where lower costs and higher quality drive increased profits.


To express all this from another point of view, as I’ve said many times before, we keep assuming that our modern Cartesian methods are the only ways of defining problems and solutions, but the actual truth of the matter is that these methods are themselves the problem. As long as we keep assuming that solutions to our problems depend on finding the political or financial will to take them on, we will continue to fail. If we instead harness the energy of the profit motive to focus efforts productively in the direction of integrated solutions, we will successfully achieve our goals on a scale far exceeding what anyone so far has projected as possible or likely.


The aim is to level the playing field and set up equal opportunities for entry into a game played with the goal of keeping the game going.

Differences between today’s sustainability metrics and the ones needed for low cost social value transactions and efficient markets for intangible assets

November 16, 2017

Measurement is such a confusing topic! Everyone proclaims how important it is, but almost no one ever seeks out and implements the state of the art, despite the enormous advantages to be gained from doing so.

A key metric quality issue concerns the cumbersome and uninterpretable masses of data that well-intentioned people can hobble themselves with when they are interested in improving their business processes and outcomes. They focus on what they can easily count, and then, at great but unrecognized cost, they misinterpret the counts and percentages as measures.

For instance, today’s sustainability and social value indicators are each expressed in a different unit (dollars, hours, tons, joules, kilowatt hours, survey ratings, category percentages, etc.; see below for a sample list). Some of them may indeed be scientific measures of that individual aspect of the business. The problem is they are all being interpreted in an undefined and chaotic aggregate as a measure of something else (social value, sustainability, etc.). Technically speaking, if we want a scientific measure of that higher order construct, we need to model it, estimate it, calibrate it, and deploy it as a common language in a network of instruments all traceable to a common unit standard.

All of this is strictly parallel with what we do to make markets in bushels of corn, barrels of oil, and kilowatts of electricity. We don’t buy produce by count in the grocery store because unscrupulous merchants would charge the same amount for small fruits as for large. All of the scales in grocery store produce markets measure in the same unit, and all of the packages of food are similarly marked in standard units of weight and volume so we can compare prices and value.

There are a lot of advantages to taking the trouble to extend this system to social value. I suppose every one of these points could be a chapter in a book:

  • First, investing in scientific measurement reduces data volume to a tiny fraction of what we start with, not only with no loss of information but with the introduction of additional information telling us how confident we can be in the data and exactly what the data specifically mean (see below). That is, all the original information is recoverable from the calibrated measure, which is also qualified with an uncertainty range and a consistency statistic. Inconsistencies can be readily identified and acted on at individual levels.
  • Now the numbers stand for something that adds up the way the numbers themselves do, instead of standing for the unknown, differing, and uncontrolled units used in the original counts and percentages.
  • We can take missing data into account, which means we can adapt the indicators used in different situations to specific circumstances without compromising comparability.
  • We know how to gauge the dependability of the data better, meaning that we will not be over-confident about unreliable data, and we won’t waste our time and resources obtaining data of greater precision than we actually need.
  • Furthermore, the indicators themselves are now scaled into a hierarchy that maps the continuum from low to high performance. This map points the way to improvement. The order of things on the scale shows what comes first and how more complex and difficult goals build on simpler and easier ones. The position of a measure on the scale shows what’s been accomplished, what remains to be done, and what to do next.
  • Finally, we have a single metric we can use to price value across the local particulars of individual providers. This is where it becomes possible to see who gives the most bang for the buck, to reward them, to scale up an expanded market for the product, and to monetize returns on investment.
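
To make the points about missing data and uncertainty concrete, here is a minimal sketch. It assumes a dichotomous Rasch model, which is one way of doing this kind of calibration and is not named in the passage above. The function name, the indicator calibrations, and the yes/no observations are all hypothetical, invented only for illustration. The calibrations are treated as already known, in a common logit unit, and the sketch estimates one provider's measure and its standard error from whichever indicators happen to have been observed.

```python
import math


def estimate_measure(responses, calibrations, tol=1e-8, max_iter=100):
    """Estimate a measure (in logits) and its standard error.

    Assumes a dichotomous Rasch model with known indicator calibrations.
    responses: 1 (met), 0 (not met), or None (indicator not observed).
    calibrations: indicator difficulties in logits, same order as responses.
    """
    # Missing observations are simply dropped; the unit of measurement
    # is carried by the calibrations, not by the particular indicators used.
    observed = [(x, d) for x, d in zip(responses, calibrations) if x is not None]
    n = len(observed)
    raw = sum(x for x, _ in observed)
    if n == 0 or raw == 0 or raw == n:
        raise ValueError("empty or extreme response string: no finite estimate")

    def probabilities(theta):
        # Modeled probability of meeting each observed indicator.
        return [1.0 / (1.0 + math.exp(d - theta)) for _, d in observed]

    theta = math.log(raw / (n - raw))  # starting value from the raw score
    for _ in range(max_iter):
        probs = probabilities(theta)
        info = sum(p * (1.0 - p) for p in probs)
        # Newton-Raphson step on the likelihood, capped for stability.
        step = max(-1.0, min(1.0, (raw - sum(probs)) / info))
        theta += step
        if abs(step) < tol:
            break

    info = sum(p * (1.0 - p) for p in probabilities(theta))
    return theta, 1.0 / math.sqrt(info)


# Hypothetical calibrations for three indicators, easiest to hardest.
calibrations = [-1.0, 0.5, 1.0]
full, se_full = estimate_measure([1, 0, 0], calibrations)
partial, se_partial = estimate_measure([1, None, 0], calibrations)
print(f"all indicators: {full:.2f} logits, SE {se_full:.2f}")
print(f"one missing:    {partial:.2f} logits, SE {se_partial:.2f}")
```

Both results are expressed in the same unit even though the second uses fewer indicators, and each comes with its own standard error. A real application would also estimate the calibrations from data and check the consistency of the observations, which this sketch leaves out.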

The revolutionary network effects of efficient markets are produced by the common currencies for the exchange of value that emerge out of this context. Improvements rebalancing cost and quality foster deflationary economies that drive more profit from lower costs (think Moore’s law). We gain the efficiency of dramatic reductions in data volume, and the meaningfulness of numbers that stand for something substantively real in the world that we can act on. These combine to lower the cost of transactions, as it now becomes vastly less expensive to find out how much of the social good is available, and what quality it is. Instead of dozens or hundreds of indicators repeated for each company in an industry, and repeated for each division in each company, and all of these repeated for each year or quarter, we have access to all of that information properly contextualized in a succinct, meaningful, and interpretable format for different applications at individual, organizational, industry-wide, national, regional, or global levels of complexity.

That’s likely way too much to digest at once! But it seemed worth saying it all at once in one place, in case anyone might be motivated to get in touch or start efforts in this direction on their own.

Examples of the variety of units in a handy sustainability metrics spreadsheet can be found at the Hess web site: freshwater use in millions or thousands of cubic meters; solid waste and carbon emissions in thousands of tons; natural gas consumption in thousands of gigajoules; electricity consumption in thousands of kilowatt hours; employee union members, layoffs, and turnover as percentages; employee lost time incident rates in hundreds of thousands of hours worked; percentages of female or minority board members; and dollars for business performance.

These indicators are chosen with good reasons for use within each specific area of interest. They comprise an intuitive observation model that has face validity. But this is only the start of the work that needs to be done to create the metrics we need if we are to radically multiply the efficiency of social value markets. For an example of how to work from today’s diverse arrays of social value indicators (where each one is presented in its own spreadsheet) toward more meaningful, adaptable, and precise measures, see:

Fisher, W. P., Jr. (2011). Measuring genuine progress by scaling economic indicators to think global & act local: An example from the UN Millennium Development Goals project. Social Science Research Network: .