Archive for the ‘environmental management’ Category

A Second Simple Example of Measurement’s Role in Reducing Transaction Costs, Enhancing Market Efficiency, and Enabling the Pricing of Intangible Assets

March 9, 2011

The prior post here showed why we should not confuse counts of things with measures of amounts, though counts are the natural place to begin constructing measures. That first simple example focused on an analogy between counting oranges and measuring the weight of oranges, versus counting correct answers on tests and measuring amounts of ability. This second example extends the first by showing what happens when we want to aggregate value not just across different counts of some one thing but across different counts of different things. The point will be to show how the relative values of apples, oranges, grapes, and bananas can be put into a common frame of reference and compared in a practical and convenient way.

For instance, you may go into a grocery store to buy raspberries and blackberries, and I go in to buy cantaloupe and watermelon. Your cost per individual fruit will be very low, and mine will be very high, but neither of us will find this annoying, confusing, or inconvenient because your fruits are very small, and mine, very large. Conversely, your cost per kilogram will be much higher than mine, but this won’t cause either of us any distress because we both recognize the differences in the labor, handling, nutritional, and culinary value of our purchases.

But what happens when we try to purchase something as complex as a unit of socioeconomic development? The eight UN Millennium Development Goals (MDGs) represent a start at a systematic effort to bring human, social, and natural capital together into the same economic and accountability framework as liquid and manufactured capital, and property. But that effort is stymied by the inefficiency and cost of making and using measures of the goals achieved. The existing MDG databases (http://data.un.org/Browse.aspx?d=MDG) and summary reports present an overwhelming quantity of numbers. Individual indicators are presented for each year, each country, each region, and each program, goal by goal, target by target, indicator by indicator, and series by series, in an indigestible volume of data.

Though there are no doubt complex mathematical methods by which a philanthropic, governmental, or NGO investor might determine how much development is gained per million dollars invested, the cost of obtaining impact measures is so high that most funding decisions are made with little information concerning expected returns (Goldberg, 2009). Further, the percentages of various needs met by leading social enterprises typically range from 0.07% to 3.30%, and needs are growing, not diminishing. Progress at current rates means that it would take thousands of years to solve today’s problems of human suffering, social disparity, and environmental quality. The inefficiency of human, social, and natural capital markets is so overwhelming that there is little hope for significant improvements without the introduction of fundamental infrastructural supports, such as an Intangible Assets Metric System.

A basic question that needs to be asked of the MDG system is, how can anyone make any sense out of so much data? Most of the indicators are evaluated in terms of counts of the number of times something happens, the number of people affected, or the number of things observed to be present. These counts are usually then divided by the maximum possible (the count of the total population) and are expressed as percentages or rates.

As previously explained in various posts in this blog, counts and percentages are not measures in any meaningful sense. They are notoriously difficult to interpret, since the quantitative meaning of any given unit difference varies depending on the size of what is counted, or where the percentage falls in the 0-100 continuum. And because counts and percentages are interpreted one at a time, it is very difficult to know if and when any number included in the sheer mass of data is reasonable, all else considered, or if it is inconsistent with other available facts.
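The point about where a percentage falls in the 0-100 continuum can be illustrated with a small sketch. It assumes the familiar log-odds (logit) transformation as the linearizing device; the specific percentages are arbitrary illustrations, not MDG data.

```python
import math

def log_odds(percent):
    """Convert a percentage strictly between 0 and 100 to log-odds (logits)."""
    p = percent / 100.0
    return math.log(p / (1.0 - p))

# The same five-point difference in percentage terms...
mid_gain = log_odds(55) - log_odds(50)
high_gain = log_odds(95) - log_odds(90)

# ...is a much larger difference in log-odds near the end of the continuum.
print(f"50% -> 55%: {mid_gain:.2f} logits")
print(f"90% -> 95%: {high_gain:.2f} logits")
```

A five-point gain near the top of the range is worth several times as much in log-odds as the same gain in the middle, which is one reason raw percentage differences cannot be read as equal-interval amounts.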

A study of the MDG data must focus on these three potential areas of data quality improvement: consistency evaluation, volume reduction, and interpretability. Each builds on the others. With consistent data lending themselves to summarization in sufficient statistics, data volume can be drastically reduced with no loss of information (Andersen, 1977, 1999; Wright, 1977, 1997), data quality can be readily assessed in terms of sufficiency violations (Smith, 2000; Smith & Plackner, 2009), and quantitative measures can be made interpretable in terms of a calibrated ruler’s repeatedly reproducible hierarchy of indicators (Bond & Fox, 2007; Masters, Lokan, & Doig, 1994).

The primary data quality criteria are qualitative relevance and meaningfulness, on the one hand, and mathematical rigor, on the other. The point here is one of following through on the maxim that we manage what we measure, with the goal of measuring in such a way that management is better focused on the program mission and not distracted by accounting irrelevancies.

Method

As written and deployed, each of the MDG indicators has the face and content validity of providing information on each respective substantive area of interest. But, as has been the focus of repeated emphases in this blog, counting something is not the same thing as measuring it.

Counts or rates of literacy or unemployment are not, in and of themselves, measures of development. Their capacity to serve as contributing indications of developmental progress is an empirical question that must be evaluated experimentally against the observable evidence. The measurement of progress toward an overarching developmental goal requires inferences made from a conceptual order of magnitude above and beyond that provided in the individual indicators. The calibration of an instrument for assessing progress toward the realization of the Millennium Development Goals requires, first, a reorganization of the existing data, and then an analysis that tests explicitly the relevant hypotheses as to the potential for quantification, before inferences supporting the comparison of measures can be scientifically supported.

A subset of the MDG data was selected from the MDG database available at http://data.un.org/Browse.aspx?d=MDG, recoded, and analyzed using Winsteps (Linacre, 2011). At least one indicator was selected from each of the eight goals, with 22 in total. All available data from these 22 indicators were recorded for each of 64 countries.

The reorganization of the data is nothing but a way of making the interpretation of the percentages explicit. The meaning of any one country’s percentage or rate of youth unemployment, cell phone users, or literacy has to be kept in context relative to expectations formed from other countries’ experiences. It would be nonsense to interpret any single indicator as good or bad in isolation. Sometimes 30% represents an excellent state of affairs, other times, a terrible one.

Therefore, the distributions of each indicator’s percentages across the 64 countries were divided into ranges and converted to ratings. A lower rating uniformly indicates a status further away from the goal than a higher rating. The ratings were devised by dividing the frequency distribution of each indicator roughly into thirds.

For instance, the youth unemployment rate was found to vary such that the countries furthest from the desired goal had rates of 25% and more (rated 1), and those closest to or exceeding the goal had rates of 0-10% (rated 3), leaving the middle range (10-25%) rated 2. In contrast, percentages of the population that are undernourished were rated 1 for 35% or more, 2 for 15-35%, and 3 for less than 15%.
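The recoding just described can be sketched in a few lines. The cut points for youth unemployment and undernourishment are the ones given above; the function name, the country labels, the sample values, and the handling of values falling exactly on a boundary are assumptions made only for illustration.

```python
def rate(value, low_cut, high_cut, lower_is_better=True):
    """Map a percentage to a 1-3 rating, where 3 is closest to the goal."""
    if value < low_cut:
        band = 0
    elif value < high_cut:
        band = 1
    else:
        band = 2
    # For indicators where low percentages are desirable, the lowest band rates 3.
    return 3 - band if lower_is_better else 1 + band

# Hypothetical youth unemployment rates (cut points: 10% and 25%).
youth_unemployment = {"Country A": 8.0, "Country B": 17.5, "Country C": 31.0}
ratings = {c: rate(v, 10, 25) for c, v in youth_unemployment.items()}

# Hypothetical undernourishment percentage (cut points: 15% and 35%).
undernourished = rate(20.0, 15, 35)

print(ratings, undernourished)
```

The `lower_is_better` flag is there because some indicators, such as literacy parity, improve as the percentage rises, so that the same banding logic has to be reversed.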

Thirds of the distributions were decided upon only on the basis of the investigator’s prior experience with data of this kind. A more thorough approach to the data would begin from a finer-grained rating system, like that structuring the MDG table at http://mdgs.un.org/unsd/mdg/Resources/Static/Products/Progress2008/MDG_Report_2008_Progress_Chart_En.pdf. This greater detail would be sought in order to determine empirically just how many distinctions each indicator can support and contribute to the overall measurement system.

Sixty-four of the available 336 data points were selected for their representativeness, with no duplications of values and with a proportionate distribution along the entire continuum of observed values.

Data from the same 64 countries and the same years were then sought for the subsequent indicators. It turned out that the years in which data were available varied across data sets. Data within one or two years of the target year were sometimes substituted for missing data.

The data were analyzed twice, first with each indicator allowed its own rating scale, parameterizing each of the category difficulties separately for each item, and then with the full rating scale model, as the results of the first analysis showed all indicators shared strong consistency in the rating structure.

Results

Data were 65.2% complete. Countries were assessed on an average of 14.3 of the 22 indicators, and each indicator was applied on average to 41.7 of the 64 country cases. Measurement reliability was .89-.90, depending on how measurement error is estimated. Cronbach’s alpha for the by-country scores was .94. Calibration reliability was .93-.95. The rating scale worked well (see Linacre, 2002, for criteria). The data fit the measurement model reasonably well, with satisfactory data consistency, meaning that the hypothesis of a measurable developmental construct was not falsified.
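As a quick consistency check on the completeness figures, the reported percentage can be converted back into a count of observed ratings. The count itself is inferred here by rounding and is not reported in the post.

```python
n_countries, n_indicators = 64, 22

# Total cells in the country-by-indicator matrix.
cells = n_countries * n_indicators

# Observed ratings implied by 65.2% completeness (an inferred, rounded count).
observed = round(0.652 * cells)

per_country = observed / n_countries      # average indicators rated per country
per_indicator = observed / n_indicators   # average countries rated per indicator

print(cells, observed, round(per_country, 1), round(per_indicator, 1))
```

The two averages recovered this way agree with the 14.3 and 41.7 given above.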

The main result for our purposes here concerns how satisfactory data consistency makes it possible to dramatically reduce data volume and improve data interpretability. The figure below illustrates how. What does it mean for data volume to be drastically reduced with no loss of information? Let’s see exactly how much the data volume is reduced for the ten item data subset shown in the figure below.

The horizontal continuum from -100 to 1300 in the figure is the metric, the ruler or yardstick. The number of countries at various locations along that ruler is shown across the bottom of the figure. The mean (M), first standard deviation (S), and second standard deviation (T) are shown beneath the numbers of countries. There are ten countries with a measure of just below 400, just to the left of the mean (M).

The MDG indicators are listed on the right of the figure, with the indicator most often found being achieved relative to the goals at the bottom, and the indicator least often being achieved at the top. The ratings in the middle of the figure increase from 1 to 3 left to right as the probability of goal achievement increases as the measures go from low to high. The position of the ratings in the middle of the figure shifts from left to right as one reads up the list of indicators because the difficulty of achieving the goals is increasing.

Because the ratings of the 64 countries relative to these ten goals are internally consistent, nothing but the developmental level of the country and the developmental challenge of the indicator affects the probability that a given rating will be attained. It is this relation that defines fit to a measurement model, the sufficiency of the summed ratings, and the interpretability of the scores. Given sufficient fit and consistency, any country’s measure implies a given rating on each of the ten indicators.

For instance, imagine a vertical line drawn through the figure at a measure of 500, just above the mean (M). This measure is interpreted relative to the places at which the vertical line crosses the ratings in each row associated with each of the ten items. A measure of 500 is read as implying, within a given range of error, uncertainty, or confidence, a rating of

  • 3 on debt service and female-to-male parity in literacy,
  • 2 or 3 on how much of the population is undernourished and how many children under five years of age are moderately or severely underweight,
  • 2 on infant mortality, the percent of the population aged 15 to 49 with HIV, and the youth unemployment rate,
  • 1 or 2 on the poor’s share of the national income, and
  • 1 on CO2 emissions and the rate of personal computers per 100 inhabitants.
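How a single measure implies a rating on each indicator can be sketched with a rating scale model of the kind estimated above. Everything numeric below is hypothetical: the indicator difficulties and thresholds are made up, and they are expressed in plain logits rather than in the rescaled -100 to 1300 units of the figure, whose conversion is not given here.

```python
import math

def category_probabilities(measure, difficulty, thresholds):
    """Rating scale model: probability of each category, lowest category first."""
    logits = [0.0]
    for tau in thresholds:
        logits.append(logits[-1] + (measure - difficulty - tau))
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(v - peak) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]

def expected_rating(measure, difficulty, thresholds, lowest=1):
    """Model-expected rating for a country at `measure` on one indicator."""
    probs = category_probabilities(measure, difficulty, thresholds)
    return sum((lowest + k) * p for k, p in enumerate(probs))

# Hypothetical calibrations (logits) shared by all indicators and per indicator.
thresholds = [-1.0, 1.0]
items = {"easy indicator": -2.5, "middling indicator": 0.0, "hard indicator": 2.5}

country_measure = 0.0
implied = {name: expected_rating(country_measure, d, thresholds)
           for name, d in items.items()}

for name, value in implied.items():
    print(f"{name}: expected rating {value:.2f}")
```

Reading down the printed list is the numerical counterpart of drawing the vertical line through the figure: the same measure implies a rating near 3 on the easiest indicator, about 2 on the middling one, and near 1 on the hardest.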

For any one country with a measure of 500 on this scale, ten percentages or rates that appear completely incommensurable and incomparable are found to contribute consistently to a single valued function, developmental goal achievement. Instead of managing each separate indicator as a universe unto itself, this scale makes it possible to manage development itself at its own level of complexity. This ten-to-one ratio of reduced data volume is more than doubled when the total of 22 items included in the scale is taken into account.

This reduction is conceptually and practically important because it focuses attention on the actual object of management, development. When the individual indicators are the focus of attention, the forest is lost for the trees. Those who disparage the validity of the maxim, you manage what you measure, are often discouraged by the feeling of being pulled in too many directions at once. But a measure of the HIV infection rate is not in itself a measure of anything but the HIV infection rate. Interpreting it in terms of broader developmental goals requires evidence that it in fact takes a place in that larger context.

And once a connection with that larger context is established, the consistency of individual data points remains a matter of interest. As the world turns, the order of things may change, but, more likely, data entry errors, temporary data blips, and other factors will alter data quality. Such changes cannot be detected outside of the context defined by an explicit interpretive framework that requires consistent observations.

-100  100     300     500     700     900    1100    1300
|-------+-------+-------+-------+-------+-------+-------|  NUM   INDCTR
1                                 1  :    2    :  3     3    9  PcsPer100
1                         1   :   2    :   3            3    8  CO2Emissions
1                    1  :    2    :   3                 3   10  PoorShareNatInc
1                 1  :    2    :  3                     3   19  YouthUnempRatMF
1              1   :    2   :   3                       3    1  %HIV15-49
1            1   :   2    :   3                         3    7  InfantMortality
1          1  :    2    :  3                            3    4  ChildrenUnder5ModSevUndWgt
1         1   :    2    :  3                            3   12  PopUndernourished
1    1   :    2   :   3                                 3    6  F2MParityLit
1   :    2    :  3                                      3    5  DebtServExpInc
|-------+-------+-------+-------+-------+-------+-------|  NUM   INDCTR
-100  100     300     500     700     900    1100    1300
                   1
       1   1 13445403312323 41 221    2   1   1            COUNTRIES
       T      S       M      S       T

Discussion

A key element in the results obtained here concerns the fact that the data were about 35% missing. Whether or not any given indicator was actually rated for any given country, the measure can still be interpreted as implying the expected rating. This capacity to take missing data into account can be taken advantage of systematically by calibrating a large bank of indicators. With this in hand, it becomes possible to gather only the amount of data needed to make a specific determination, or to adaptively administer the indicators so as to obtain the lowest-error (most reliable) measure at the lowest cost (with the fewest indicators administered). Perhaps most importantly, different collections of indicators can then be equated to measure in the same unit, so that impacts may be compared more efficiently.
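The way missing data are accommodated can be sketched as follows. This is a minimal illustration, assuming already-calibrated indicators and a simple Newton-Raphson maximum-likelihood estimate of one country's measure; it is not a description of the estimation procedure used in the software, and all calibrations and ratings are hypothetical values in logits.

```python
import math

def score_moments(measure, difficulty, thresholds):
    """Model-expected score (0-based) and its variance for one indicator."""
    logits = [0.0]
    for tau in thresholds:
        logits.append(logits[-1] + (measure - difficulty - tau))
    peak = max(logits)
    weights = [math.exp(v - peak) for v in logits]
    total = sum(weights)
    probs = [w / total for w in weights]
    mean = sum(k * p for k, p in enumerate(probs))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
    return mean, var

def estimate_measure(ratings, difficulties, thresholds, max_iter=100):
    """Estimate a measure and standard error from whichever ratings exist.

    `ratings` maps indicator name to a rating of 1-3, or None when missing.
    """
    observed = {name: r - 1 for name, r in ratings.items() if r is not None}
    raw = sum(observed.values())
    top = len(thresholds) * len(observed)
    if raw == 0 or raw == top:
        raise ValueError("empty or extreme ratings have no finite estimate")
    measure = 0.0
    for _ in range(max_iter):
        moments = [score_moments(measure, difficulties[n], thresholds)
                   for n in observed]
        expected = sum(mean for mean, _var in moments)
        information = sum(var for _mean, var in moments)
        # Newton-Raphson step, limited to one logit to keep it stable.
        step = max(-1.0, min(1.0, (raw - expected) / information))
        measure += step
        if abs(step) < 1e-8:
            break
    information = sum(score_moments(measure, difficulties[n], thresholds)[1]
                      for n in observed)
    return measure, 1.0 / math.sqrt(information)

# Hypothetical calibrated indicators and one country's ratings.
difficulties = {"I1": -2.0, "I2": -1.0, "I3": 0.0, "I4": 1.0, "I5": 2.0}
thresholds = [-1.0, 1.0]
complete = {"I1": 3, "I2": 3, "I3": 3, "I4": 2, "I5": 1}
partial = {"I1": 3, "I2": None, "I3": 3, "I4": None, "I5": 1}

full_measure, full_se = estimate_measure(complete, difficulties, thresholds)
partial_measure, partial_se = estimate_measure(partial, difficulties, thresholds)

print(f"all five indicators: {full_measure:.2f} (SE {full_se:.2f})")
print(f"three indicators:    {partial_measure:.2f} (SE {partial_se:.2f})")
```

In this made-up case the two estimates fall within one standard error of each other. Dropping indicators widens the error rather than changing the unit, and that trade-off between cost and precision is what adaptive administration exploits.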

Instead of an international developmental aid market that is so inefficient as to preclude any expectation of measured returns on investment, setting up a calibrated bank of indicators to which all measures are traceable opens up numerous desirable possibilities. The cost of assessing and interpreting the data informing aid transactions could be reduced to negligible amounts, and the management of the processes and outcomes in which that aid is invested would be made much more efficient by reduced data volume and enhanced information content. Because capital would flow more efficiently to where supply is meeting demand, nonproducers would be cut out of the market, and the effectiveness of the aid provided would be multiplied many times over.

The capacity to harmonize counts of different but related events into a single measurement system presents the possibility that there may be a bright future for outcomes-based budgeting in education, health care, human resource management, environmental management, housing, corrections, social services, philanthropy, and international development. It may seem wildly unrealistic to imagine such a thing, but the return on the investment would be so monumental that not checking it out would be even crazier.

A full report on the MDG data, with the other references cited, is available on my SSRN page at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1739386.

Goldberg, S. H. (2009). Billions of drops in millions of buckets: Why philanthropy doesn’t advance social progress. New York: Wiley.

Creative Commons License
LivingCapitalMetrics Blog by William P. Fisher, Jr., Ph.D. is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Based on a work at livingcapitalmetrics.wordpress.com.
Permissions beyond the scope of this license may be available at http://www.livingcapitalmetrics.com.


Measurement, Metrology, and the Birth of Self-Organizing, Complex Adaptive Systems

February 28, 2011

On page 145 of his book, The Mathematics of Measurement: A Critical History, John Roche quotes Charles de La Condamine (1701-1774), who, in 1747, wrote:

‘It is quite evident that the diversity of weights and measures of different countries, and frequently in the same province, are a source of embarrassment in commerce, in the study of physics, in history, and even in politics itself; the unknown names of foreign measures, the laziness or difficulty in relating them to our own give rise to confusion in our ideas and leave us in ignorance of facts which could be useful to us.’

Roche (1998, p. 145) then explains what de La Condamine is driving at, saying:

“For reasons of international communication and of civic justice, for reasons of stability over time and for accuracy and reliability, the creation of exact, reproducible and well maintained international standards, especially of length and mass, became an increasing concern of the natural philosophers of the seventeenth and eighteenth centuries. This movement, cooperating with a corresponding impulse in governing circles for the reform of weights and measures for the benefit of society and trade, culminated in late eighteenth century France in the metric system. It established not only an exact, rational and international system of measuring length, area, volume and mass, but introduced a similar standard for temperature within the scientific community. It stimulated a wider concern within science to establish all scientific units with equal rigour, basing them wherever possible on the newly established metric units (and on the older exact units of time and angular measurement), because of their accuracy, stability and international availability. This process gradually brought about a profound change in the notation and interpretation of the mathematical formalism of physics: it brought about, for the first time in the history of the mathematical sciences, a true union of mathematics and measurement.”

As it was in the seventeenth and eighteenth centuries for physics, so it has also been in the twentieth and twenty-first for the psychosocial sciences. The creation of exact, reproducible and well maintained international standards is a matter of increasing concern today for the roles they will play in education, health care, the work place, business intelligence, and the economy at large.

As the economic crises persist and perhaps worsen, demand for common product definitions and for interpretable, meaningful measures of impacts and outcomes in education, health care, social services, environmental management, etc. will reach a crescendo. We need an exact, rational and international system of measuring literacy, numeracy, health, motivations, quality of life, community cohesion, and environmental quality, and we needed it fifty years ago. We need to reinvigorate and revive a wider concern across the sciences to establish all scientific units with equal rigor, and to have all measures used in research and practice based wherever possible on consensus standard metrics valued for their accuracy, stability and availability. We need to replicate in the psychosocial sciences the profound change in the notation and interpretation of the mathematical formalism of physics that occurred in the eighteenth and nineteenth centuries. We need to extend the true union of mathematics and measurement from physics to the psychosocial sciences.

Previous posts in this blog speak to the persistent invariance and objectivity exhibited by many of the constructs measured using ability tests, attitude surveys, performance assessments, etc. A question previously raised in this blog concerning the reproductive logic of living meaning deserves more attention, and can be productively explored in terms of complex adaptive functionality.

In a hierarchy of reasons why mathematically rigorous measurement is valuable, few are closer to the top of the list than facilitating the spontaneous self-organization of networks of agents and actors (Latour, 1987). The conception, gestation, birthing, and nurturing of complex adaptive systems constitute a reproductive logic for sociocultural traditions. Scientific traditions, in particular, form mature self-identities via a mutually implied subject-object relation absorbed into the flow of a dialectical give and take, just as economic systems do.

Complex adaptive systems establish the reproductive viability of their offspring and the coherence of an ecological web of meaningful relationships by means of this dialectic. Taylor (2003, pp. 166-8) describes the five moments in the formation and operation of complex adaptive systems, which must be able

  • to identify regularities and patterns in the flow of matter, energy, and information (MEI) in the environment (business, social, economic, natural, etc.);
  • to produce condensed schematic representations of these regularities so they can be identified as the same if they are repeated;
  • to form reproductively interchangeable variants of these representations;
  • to succeed reproductively by means of the accuracy and reliability of the representations’ predictions of regularities in the MEI data flow; and
  • to adaptively modify and reorganize representations by means of informational feedback from the environment.

All living systems, from bacteria and viruses to plants and animals to languages and cultures, are complex adaptive systems characterized by these five features.

In the history of science, technologically-embodied measurement facilitates complex adaptive systems of various kinds. That history can be used as a basis for a meta-theoretical perspective on what measurement must look like in the social and human sciences. Each of Taylor’s five moments in the formation and operation of complex adaptive systems describes a capacity of measurement systems, in that:

  • data flow regularities are captured in initial, provisional instrument calibrations;
  • condensed local schematic representations are formed when an instrument’s calibrations are anchored at repeatedly observed, invariant values;
  • interchangeable nonlocal versions of these invariances are created by means of instrument equating, item banking, metrological networks, and selective, tailored, adaptive instrument administration;
  • measures read off inaccurate and unreliable instruments will not support successful reproduction of the data flow regularity, but accurate and reliable instruments calibrated in a shared common unit provide a reference standard metric that enhances communication and reproduces the common voice and shared identity of the research community; and
  • consistently inconsistent anomalous observations provide feedback suggesting new possibilities for as yet unrecognized data flow regularities that might be captured in new calibrations.

Measurement in the social sciences is in the process of extending this functionality into practical applications in business, education, health care, government, and elsewhere. Over the course of the last 50 years, measurement research and practice have already iterated many times through these five moments. In the coming years, a new critical mass will be reached in this process, systematically bringing about order-of-magnitude improvements in the efficiency of intangible assets markets.

How? What does a “data flow regularity” look like? How is it condensed into a schematic and used to calibrate an instrument? How are local schematics combined in a pattern used to recognize new instances of themselves? More specifically, how might enterprise resource planning (ERP) software (such as SAP, Oracle, or PeopleSoft) simultaneously provide both the structure needed to support meaningful comparisons and the flexibility needed for good fit with the dynamic complexity of adaptive and generative self-organizing systems?

Prior work in this area proposes a dual-core, loosely coupled organization using ERP software to build social and intellectual capital, instead of using it as an IT solution addressing organizational inefficiencies (Lengnick-Hall, Lengnick-Hall, & Abdinnour-Helm, 2004). The adaptive and generative functionality (Stenner & Stone, 2003) provided by probabilistic measurement models (Rasch, 1960; Andrich, 2002, 2004; Bond & Fox, 2007; Wilson, 2005; Wright, 1977, 1999) makes it possible to model intra- and inter-organizational interoperability (Weichhart, Feiner, & Stary, 2010) at the same time that social and intellectual capital resources are augmented.

Actor/agent network theory has emerged from social and historical studies of the shared and competing moral, economic, political, and mathematical values disseminated by scientists and technicians in a variety of different successful and failed areas of research (Latour, 2005). The resulting sociohistorical descriptions ought to be translated into a practical program for reproducing successful research programs. A metasystem for complex adaptive systems of research is implied in what Roche (1998) calls a “true union of mathematics and measurement.”

Complex adaptive systems are effectively constituted of such a union, even if, in nature, the mathematical character of the data flows and calibrations remains virtual. Probabilistic conjoint models for fundamental measurement are poised to extend this functionality into the human sciences. Though few, if any, have framed the situation in these terms, these and other questions are being explored, explicitly and implicitly, by hundreds of researchers in dozens of fields as they employ unidimensional models for measurement in their investigations.

If so, might we then be on the verge of yet another new reading and writing of Galileo’s “book of nature,” this time restoring the “loss of meaning for life” suffered in Galileo’s “fateful omission” of the means by which nature came to be understood mathematically (Husserl, 1970)? The elements of a comprehensive, mathematical, and experimental design science of living systems appear on the verge of providing a saturated solution (or better, a nonequilibrium thermodynamic solution) to some of the infamous shortcomings of modern, Enlightenment science. The unity of science may yet be a reality, though not via the reductionist program envisioned by the positivists.

Some 50 years ago, Marshall McLuhan popularized the expression, “The medium is the message.” The special value of quantitative measurement in the history of science does not stem from the mere use of number. Instruments are media on which nature, human or other, inscribes legible messages. A renewal of the true union of mathematics and measurement in the context of intangible assets will lead to a new cultural, scientific, and economic renaissance. As Thomas Kuhn (1977, p. 221) wrote,

“The full and intimate quantification of any science is a consummation devoutly to be wished. Nevertheless, it is not a consummation that can effectively be sought by measuring. As in individual development, so in the scientific group, maturity comes most surely to those who know how to wait.”

Given that we have strong indications of how full and intimate quantification consummates a true union of mathematics and measurement, the time for waiting is now past, and the time to act has come. See prior blog posts here for suggestions on an Intangible Assets Metric System, for resources on methods and research, for other philosophical ruminations, and more. This post is based on work presented at Rasch meetings several years ago (Fisher, 2006a, 2006b).

References

Andrich, D. (2002). Understanding resistance to the data-model relationship in Rasch’s paradigm: A reflection for the next generation. Journal of Applied Measurement, 3(3), 325-59.

Andrich, D. (2004, January). Controversy and the Rasch model: A characteristic of incompatible paradigms? Medical Care, 42(1), I-7–I-16.

Bond, T., & Fox, C. (2007). Applying the Rasch model: Fundamental measurement in the human sciences, 2d edition. Mahwah, New Jersey: Lawrence Erlbaum Associates.

Fisher, W. P., Jr. (2006a, Friday, April 28). Complex adaptive functionality via measurement. Presented at the Midwest Objective Measurement Seminar, M. Lunz (Organizer), University of Illinois at Chicago.

Fisher, W. P., Jr. (2006b, June 27-9). Measurement and complex adaptive functionality. Presented at the Pacific Rim Objective Measurement Symposium, T. Bond & M. Wu (Organizers), The Hong Kong Institute of Education, Hong Kong.

Husserl, E. (1970). The crisis of European sciences and transcendental phenomenology: An introduction to phenomenological philosophy (D. Carr, Trans.). Evanston, Illinois: Northwestern University Press (Original work published 1954).

Kuhn, T. S. (1977). The function of measurement in modern physical science. In T. S. Kuhn, The essential tension: Selected studies in scientific tradition and change (pp. 178-224). Chicago: University of Chicago Press. [Reprinted from Kuhn, T. S. (1961). Isis, 52(168), 161-193.]

Latour, B. (1987). Science in action: How to follow scientists and engineers through society. New York: Cambridge University Press.

Latour, B. (2005). Reassembling the social: An introduction to actor-network-theory. (Clarendon Lectures in Management Studies). Oxford, England: Oxford University Press.

Lengnick-Hall, C. A., Lengnick-Hall, M. L., & Abdinnour-Helm, S. (2004). The role of social and intellectual capital in achieving competitive advantage through enterprise resource planning (ERP) systems. Journal of Engineering and Technology Management, 21, 307-330.

Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests (Reprint, with Foreword and Afterword by B. D. Wright, Chicago: University of Chicago Press, 1980). Copenhagen, Denmark: Danmarks Paedogogiske Institut.

Roche, J. (1998). The mathematics of measurement: A critical history. London: The Athlone Press.

Stenner, A. J., & Stone, M. (2003). Item specification vs. item banking. Rasch Measurement Transactions, 17(3), 929-30 [http://www.rasch.org/rmt/rmt173a.htm].

Taylor, M. C. (2003). The moment of complexity: Emerging network culture. Chicago: University of Chicago Press.

Weichhart, G., Feiner, T., & Stary, C. (2010). Implementing organisational interoperability–The SUddEN approach. Computers in Industry, 61, 152-160.

Wilson, M. (2005). Constructing measures: An item response modeling approach. Mahwah, New Jersey: Lawrence Erlbaum Associates.

Wright, B. D. (1977). Solving measurement problems with the Rasch model. Journal of Educational Measurement, 14(2), 97-116 [http://www.rasch.org/memo42.htm].

Wright, B. D. (1997, Winter). A history of social science measurement. Educational Measurement: Issues and Practice, 16(4), 33-45, 52 [http://www.rasch.org/memo62.htm].

Wright, B. D. (1999). Fundamental measurement for psychology. In S. E. Embretson & S. L. Hershberger (Eds.), The new rules of measurement: What every educator and psychologist should know (pp. 65-104 [http://www.rasch.org/memo64.htm]). Hillsdale, New Jersey: Lawrence Erlbaum Associates.

Creative Commons License
LivingCapitalMetrics Blog by William P. Fisher, Jr., Ph.D. is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Based on a work at livingcapitalmetrics.wordpress.com.
Permissions beyond the scope of this license may be available at http://www.livingcapitalmetrics.com.

Build it and they will come

February 8, 2011

“It” in the popular Kevin Costner movie, “Field of Dreams,” was a baseball diamond. He put it in a corn field. Not only did a ghost team conjure itself from the corn, so did a line of headlights on the road. There would seem to have been a stunning lack of preparation for crowds of fans, as parking, food, and toilet facilities were nowhere in sight.

Those things would be taken care of in due course, but that’s another story. The point has nothing to do with being realistic and everything to do with making dreams come true. Believing in yourself and your dreams is hard. Dreams are inherently unrealistic. As George Bernard Shaw said, reasonable people adapt to life and the world. It’s unreasonable people who think the world should adapt to them. And, accordingly, change comes about only because unreasonable and unrealistic people act to make things different.

I dream of a playing field, too. I can’t just go clear a few acres in a field to build it, though. The kind of clearing I’m dreaming of is more abstract. But the same idea applies. I, too, am certain that, if we build it, they will come.

What is it? Who are they? “It” is a better way for each of us to represent who we are to the world, and to see where we stand in it. It is a new language for speaking the truth of what we are each capable of. It is a way of tuning the instruments of a new science that will enable us to harmonize relationships of all kinds: personal, occupational, social, and economic.

Which brings us to who “they” are. They are us. Humanity. We are the players on this field that we will clear. We are the ones who care and who desire meaning. We are the ones who have been robbed of the trust, loyalty, and commitment we’ve invested in governments, corporations, and decades of failed institutions. We are the ones who know what has been lost, and what could still be gained. We are the ones who possess our individual skills, motivations, and health, yet have no easy, transparent way to represent how much of any one of them we have, what quality it is, or how much it can be traded for. We are the ones who all share in the bounty of the earth’s fecund capacity for self-renewal, but who among us can show exactly how much the work we do every day adds to or subtracts from the quality of the environment?

So why do I say, build it and they will come? Because this sort of thing is not something that can be created piecemeal. What if Costner’s character in the movie, instead of simply building the field, had first tried to find venture capital, recruit his dream team, set up a ticket sales vendor, hire management and staff, order uniforms and equipment, etc.? It never would have happened. It doesn’t work that way.

And so, finally, just what do we need to build? Just this: a new metric system. The task is to construct a system of measures for managing what’s most important in life: our relationships, our health, our capacity for productive and creative employment. We need a system that enables us to track our investments in intangible assets like education, health care, community, and quality of life. We need instruments tuned to the same scales, ones that take advantage of recently developed technical capacities for qualitatively meaningful quantification; for information synthesis across indicators/items/questions; for networked, collective thinking; for adaptive innovation support; and for creating fungible currencies in which human, social, and natural capital can be traded in efficient markets.

But this is not a system that can be built piecemeal. Infrastructure on this scale is too complex and too costly for any single individual, firm, or industry to create by itself. And building one part of it at a time will not work. We need to create the environment in which these new forms of life, these new species, these new markets for living capital, can take root and grow, organically. If we create that environment, with incentives and rewards capable of functioning like fertile soil, warm sun, and replenishing rain, it will be impossible to stop the growth.

You see, there are thousands of people around the world using new measurement methods to calibrate tests, surveys and assessments as valid and reliable instruments. But they are operating in an environment in which the fully viable seeds they have to plant are wasted. There’s no place for them to take root. There’s no sun, no water.

Why is the environment for the meaningful, uniform measurement of intangible assets so inhospitable? The primary answer to this question is cultural. We have ingrained and highly counterproductive attitudes toward what are often supposed to be the inherent properties of numbers. One especially important attitude of this kind is the common assumption that all numbers are quantitative. But lots of scoring systems and percentage reporting schemes involve numbers that do not stand for something that adds up. There is nothing automatic or simple about the way any given unit of calibrated measurement remains the same all up and down a scale. Arriving at a way to construct and maintain such a unit requires as much intensive research and imaginative investigation in the social sciences as it does in the natural sciences. But where the natural sciences and engineering have grown up around a focus on meaningful measurement, the social sciences have not.
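
To make the point about percentages concrete, here is a minimal sketch in Python. It assumes the log-odds (logit) transformation as a rough stand-in for a calibrated scale. The function name `logit` and the example percentages are illustrative assumptions, not results from any real instrument.

```python
import math

def logit(p):
    """Return the log-odds of a proportion p, where 0 < p < 1."""
    return math.log(p / (1 - p))

# The same five-point percentage gain at two places on the score range
# (hypothetical values chosen only for illustration).
gain_mid = logit(0.55) - logit(0.50)   # from 50% to 55% correct
gain_high = logit(0.95) - logit(0.90)  # from 90% to 95% correct

print(f"50% -> 55%: {gain_mid:.2f} logits")
print(f"90% -> 95%: {gain_high:.2f} logits")
```

Under these assumptions, the same five-point percentage gain is several times larger in logits near the top of the range than in the middle, which is one way of seeing that raw percentages do not add up in a constant unit. An actual calibration would go further, estimating separate person and item parameters and checking how well the data fit the model.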

One result of mistaken preconceptions about number is that even when tests, surveys, and assessments measure the same thing, they are disconnected from one another, tuned to different scales. There is no natural environment, no shared ecology, in which the growth of learning can take place in field-wide terms. There’s no common language in which to share what’s been learned. Even when research results are exactly the same, they look different.

But if there were a system of consensus-based reference standard metrics, one for each major construct (reading, writing, and math abilities; health status; physical and psychosocial functioning; quality of life; social and natural capital), there would be the expectation that instruments measuring the same thing should measure in the same unit. Researchers could be contributing to building larger systems when they calibrate new instruments and recalibrate old ones. They would more obviously be adding to the stock of human knowledge, understanding, and wisdom. Divergent results would demand explanations, and convergent ones would give us more confidence as we move forward.
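
One simple way to picture what measuring in the same unit could involve is common-item linking. The sketch below is only an illustration under stated assumptions: two instruments have already been calibrated in logits, each with its own arbitrary origin, and they share a few items. All item names, difficulty values, and function names here are hypothetical.

```python
from statistics import mean

# Hypothetical difficulties (in logits) of the same three items, as
# calibrated separately on instrument A and on instrument B.
common_items_a = {"item1": -1.0, "item2": 0.2, "item3": 1.1}
common_items_b = {"item1": -0.4, "item2": 0.7, "item3": 1.8}

def linking_constant(a, b):
    """Average difference in difficulty over the items both scales share."""
    shared = a.keys() & b.keys()
    return mean(a[k] - b[k] for k in shared)

shift = linking_constant(common_items_a, common_items_b)

def to_scale_a(measure_b):
    """Re-express a measure from instrument B in instrument A's frame."""
    return measure_b + shift

print(f"linking constant: {shift:.2f} logits")
print(f"1.00 on B is about {to_scale_a(1.0):.2f} on A")
```

With a shared reference standard, a shift of this kind would be established once against the standard rather than renegotiated for every pair of instruments. A real linking study would also check that the shared items behave consistently across the two calibrations before trusting the constant.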

Most importantly, quality improvement and consumer purchasing decisions and behaviors would be fluidly coordinated with no need for communicating and negotiating the details of each individual comparison. Education and health care lack common product definitions because their outcomes are measured in fragmented, incommensurable metrics. But if we had consensus-based reference standard metrics for every major form of capital employed in the economy, we could develop reasonable expectations expressed in a common language for how much change should typically be obtained in fifth-grade mathematics or from a hip replacement.

As is well-known in the business world, innovation is highly dependent on standards. We cannot empower the front line with the authority to make changes when decisions have to be based on information that is unavailable or impossible to interpret. Most of the previous entries in this blog take up various aspects of this situation.

All of this demands a very different way of thinking about what’s possible in the realm of measurement. The issues are complex. They are usually presented in difficult mathematical terms within specialized research reports. But the biggest problem has to do with thinking laterally, with moving ideas out of the vertical hierarchies of the silos where they are trapped and into a new field we can dream in. And the first seeds to be planted in such a field are the ones that say the dream is worth dreaming. When we hear that message, we are already on the way not just to building this dream, but to creating a world in which everyone can dream and envision more specific possibilities for their lives, their families, their creativity.



Al Gore: Marshalling the Collective Will is NOT the Problem–The Problem is the Problem!

November 22, 2009

In his new book, former vice-president Al Gore says we have in hand all the tools we need to solve the climate change crises, except the collective will to do anything about them. I respectfully beg to differ. Finding the will is not the problem. We already have it, and we have it in volumes sufficient to the task. Gore is also wrong in claiming we have the tools we need. There are entire classes of scientific and economic tools that we are missing. It is because we lack the right tools that we are unable to focus and channel our will for solutions.

The short version of my argument is that we don’t have scientific, universally uniform, and ubiquitously used metrics for measuring overall environmental quality. Because we don’t have the measures, we can’t and don’t effectively and efficiently manage our natural capital and environmental assets. Without metrics akin to barrels of oil or bushels of grain, we don’t have markets for matching environmental quality supply with demand for it.

Without tools as essential as metrics and markets, we can’t harness our existing will to improve our relationship with the earth. What will do we have, you might ask? Our collective will is expressed in the profit motive. What we need to do is set up metrics and markets to harness the energy of the profit motive. We need to create systems for trading natural capital (and human and social capital) so that we generate real wealth and drive happiness indexes north by realizing human potential, building thriving communities, and nurturing sustainable environments. The profit motive is not our enemy. It is the source of energy we need to deal with the multiple crises we face: human, social, and environmental.

Now for the long version of my argument. The problem is the problem. We restrict our options for solving problems by the way we frame the issue. Einstein supposedly pointed out that big problems, ones framed at a level where they define the entire paradigmatic orientation to a class of smaller, solvable problems, cannot be solved from within the paradigm they emerge from. We tend to define problems from the modern point of view, in a Cartesian fashion, from the point of view of a subject that is separate from, and in no way involved in the construction of, the objects it encounters. What I want to point out is that it is this Cartesian orientation to problem definition that is itself the problem!

Set aside your opinions on the basic issues concerning climate change, and think about what’s going on. It is undeniable that human activities are implicated in changes to the environment, and that we have to learn to manage our effects on the planet, or they will feed back on us in potentially harmful ways. This is the nature of life in the flux and flow of ecological relationships. It is one of many ways in which observers are inherently implicated in constructing what is observed, which is recognized as holding true as much in physics as in anthropology. These are uncontroversial facts, quite apart from any concern with climate change.

And what these feedback loops imply, as has indeed already been pointed out by generations of scholars and thinkers, is that there is no such thing as a pure Cartesian subject separate from its objects. We shape the things in our world, and those things, in turn, shape us. Subjects and objects are mutually implicated. All observers are participant observers. It is inevitable that what we do and think will change the world, and the new world will require us to think and act differently.

The plethora of environmental crises we face are therefore situated in a new non-Cartesian paradigm. It is a fundamental error of the first order to approach a non-Cartesian problem as though it were merely another variation on the usual kind of thing that can be addressed fairly well from the Cartesian dualist perspective. When we think, as Al Gore does, that we should be socialistically organizing resources for a centrally-organized 5-year plan of attack on environmental problems, we are missing the point.

This approach can be put to work only in terms of an authoritarian form of control directed by a dictatorial panel of experts, a military junta, or a self-appointed czar. Framed from a Cartesian point of view, no democratic process will ever compel voters to do what needs to be done. As was illustrated so dramatically by the fall of Communism, the socialistic manipulation of the concrete particulars of human, social, and environmental problems is unsustainable and socially irresponsible.

The fact is that non-Cartesian problems are only made worse when we try to solve them with Cartesian solutions. This is why non-Cartesian problems are often described by philosophers as “hermeneutic,” a word that derives from the name of the Greek god Hermes, known by the ancient Romans as Mercury. Like liquid mercury, non-Cartesian problems merely split and multiply when we grasp at them clumsily, ignoring our own involvement in the creation of the problem.

So we can go on trying to herd cats or nail jello to the wall, but to be part of the solution and not just another way of being part of the problem, we need to set up systems of thought and behavior that are not internally inconsistent and self-contradictory. No matter what we do, if we keep on marshalling resources to attack problems in deliberate and systematic ignorance of this cross-paradigmatic dissonance, we can only make matters worse.

What else can be done? Just what does it mean to go with the flow of the mutual implication of subject and object? How can we explicitly model the problem to include the participant observer?

“The medium is the message,” to quote Marshall McLuhan. As was pointed out so humorously by Woody Allen in his film, “Annie Hall,” this expression is often repeated and often misunderstood. Though all can see that the news and entertainment media are ubiquitous, the meaning of our captivation with the media of creative expression has not yet been clarified sufficiently well for generalized understanding.

Significant advances have occurred in recent years, however. The media we are captivated by define and limit not only how and what we communicate, but who and what we have been, are, and could be. Depending on the quality of their transparency and of the biases that color them, media convey moral, human, and economic values of various kinds. The media through which we express values include every conceivable technology, from alphabets and phonemes to buildings, clothing, and food preparation, to musical instruments, and the creations of art and science.

Media are at the crux of the lesson we have to learn if we are to frame the problems of environmental management so that we are living solutions, not exacerbating problems. Media of all kinds, from pen and paper to television to the Internet, are fundamentally technical. In fact, media are the original technologies. The words “text,” “textile,” and “technique” share a common ancestry: “text” and “textile” come from the Latin “texere,” to weave, “technique” comes from the Greek “techne,” art or craft, and all three trace back to the Proto-Indo-European root “teks-,” to weave or fabricate. Technology is our primary medium of shared meaning. Technology embodies the meanings we create and distributes their values across society and around the world.

What we need to do to effect non-Cartesian solutions then is to dwell deeply with our shared meanings and values, and find new ways of living them out, ways that embody the unity of subject and object, problem and solution. Nice rhetoric, you might say, but what does it mean? What is its practical consequence?

Put in academic terms, the pragmatic issue concerns the nature of technology and how it provides measures of reality serving as the media through which we experience the world in terms of shared universals. Primary sources here include the works of writers like Latour, Wise, Jasanoff, Knorr-Cetina, Schaffer, Ihde, Heidegger, and others cited in previous posts in this blog, and in my published work.

To do more to cut to the chase, we can start to think of language and technology as embodying problem-solution unities. Words and tools are situated within ecologies of relationships that define their meanings and functions. We need to be more sensitive to the way meanings and values become embodied in language and technologies, and then are distributed across far-flung networks to coordinate collectively harmonized thought and action.

To get right down to where this all is leading, though it is probably far from obvious, the appropriate non-Cartesian orientation to the problems of environmental management raised in Al Gore’s new book ultimately culminates in the creation of the technical networks through which we distribute measures of what we want to manage. These networks comprise the ecologies of meaning and values that we inhabit. Not coincidentally, they also create the markets in which human, social, and natural capital can be efficiently and effectively traded.

When these networks and markets are created, finding the collective will to deal with the environmental challenges we face will be the least of our problems. The profit motive is an exceptionally strong force. What we ought to be doing is figuring out how to harness it as the engine of social change. This contrasts diametrically with Al Gore’s perspective, which treats the profit motive as part of the problem.

Technical networks of instruments traceable to reference standards, and markets for the exchange of the values measured by those instruments, are what we ought to be focusing on. The previous post in this blog proposes an Intangible Assets Metric System, and is related to earlier posts on the role of common currencies for the exchange of meaningful quantitative values in creating functional markets for human, social, and natural capital. What we need are these infrastructural supports for creating the efficient markets in which demand for environmental solutions can be matched with the supply of those solutions. The failure of socialism is testimony to the futility of trying to man-handle our way forward by brute force.

Of course, I will continue living out my life’s mission and passion by continuing to elaborate variations, explanations, and demonstrations of how this could be so….
