Archive for the ‘Psychometrics’ Category

The Counterproductive Consequences of Common Study Designs and Statistical Methods

May 21, 2015

Because of the ways studies are designed and data are analyzed, research results in psychology and the social sciences often appear to be nonlinear, sample- and instrument-dependent, and incommensurable, even when they need not be. Contrary to common assumptions about the nature of the constructs involved, invariant relations may be more obscured than clarified by the research designs and statistical methods typically employed.

To take a particularly salient example, the number of small factors with eigenvalues greater than 1.0 identified via factor analysis increases with the number of modes in a multimodal distribution, and interpretation is further complicated by the fact that the number of factors identified decreases as sample size increases (Smith, 1996).
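
As a concrete illustration of the kind of artifact at issue, the following minimal simulation sketch (my own, not a reproduction of Smith's analyses) generates strictly unidimensional Rasch data, varies the shape of the person distribution and the sample size, and counts the eigenvalues of the inter-item correlation matrix that exceed 1.0. It is offered only as a way of checking how the factor count can shift with sampling particulars; the specific distributions and item calibrations are assumptions for illustration.

```python
# Minimal simulation sketch (my own illustration, not a reproduction of Smith's
# 1996 analyses): strictly unidimensional Rasch data are generated, and the
# number of eigenvalues of the inter-item correlation matrix exceeding 1.0 is
# counted under unimodal vs. bimodal person distributions and two sample sizes.
import numpy as np

rng = np.random.default_rng(42)

def simulate_responses(abilities, difficulties):
    """Dichotomous Rasch responses: P(x=1) = exp(b - d) / (1 + exp(b - d))."""
    logits = abilities[:, None] - difficulties[None, :]
    p = 1.0 / (1.0 + np.exp(-logits))
    return (rng.random(p.shape) < p).astype(float)

def count_eigenvalues_above_one(data):
    """Kaiser criterion: eigenvalues of the correlation matrix greater than 1.0."""
    eigenvalues = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))
    return int(np.sum(eigenvalues > 1.0))

difficulties = np.linspace(-2, 2, 25)  # one common construct, 25 items
for n in (150, 1500):
    persons = {
        "unimodal": rng.normal(0.0, 1.0, n),
        "bimodal": np.concatenate([rng.normal(-2.0, 0.5, n // 2),
                                   rng.normal(+2.0, 0.5, n - n // 2)]),
    }
    for label, abilities in persons.items():
        k = count_eigenvalues_above_one(simulate_responses(abilities, difficulties))
        print(f"n = {n:4d}  {label:8s}  eigenvalues > 1.0: {k}")
```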

Similarly, variation in employment test validity across settings was established as a basic assumption by the 1970s, after 50 years of studies observing the situational specificity of results. But then Schmidt and Hunter (1977) identified sampling error, measurement error, and range restriction as major sources of what was only the appearance of incommensurable variation in employment test validity. In other words, for most of the 20th century, the identification of constructs and comparisons of results across studies were pointlessly confused by mixed populations, uncontrolled variation in reliability, and unnoted floor and/or ceiling effects. Though they do nothing to establish information systems deploying common languages structured by standard units of measurement (Feinstein, 1995), meta-analysis techniques are a step forward in equating effect sizes (Hunter & Schmidt, 2004).
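
For readers unfamiliar with the mechanics of validity generalization, a minimal sketch of the kinds of artifact corrections involved is given below. The formulas shown (disattenuation for criterion unreliability and the standard correction for direct range restriction) are illustrative only, with hypothetical numbers; the published Schmidt and Hunter procedures include further refinements and a fuller treatment of sampling error across studies.

```python
# Illustrative sketch only: simplified artifact corrections of the kind used in
# validity generalization work. The published Schmidt-Hunter procedures include
# further refinements (e.g., treatment of sampling error across studies and of
# predictor unreliability), so this is not their full method.
import math

def corrected_validity(r_obs, criterion_reliability, u):
    """
    r_obs: observed predictor-criterion correlation in the restricted sample.
    criterion_reliability: reliability of the criterion measure.
    u: range-restriction ratio (restricted SD / unrestricted SD of the predictor).
    """
    # 1. Disattenuate for criterion measurement error.
    r = r_obs / math.sqrt(criterion_reliability)
    # 2. Correct for direct range restriction (Thorndike Case II form).
    return (r / u) / math.sqrt(1.0 + r**2 * (1.0 / u**2 - 1.0))

# Two quite different observed validities can imply similar underlying
# validities once the artifacts are removed (hypothetical numbers):
print(round(corrected_validity(0.20, criterion_reliability=0.60, u=0.5), 2))
print(round(corrected_validity(0.33, criterion_reliability=0.80, u=0.7), 2))
```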

Wright and Stone’s (1979) Best Test Design, in contrast, takes up each of these problems explicitly. Sampling error is addressed by evaluating how well both the sample of persons and the set of items represent their respective populations and express the intended construct. The evaluation of reliability is foregrounded and clarified by taking advantage of individualized measurement uncertainty (error) estimates (following Andrich, 1982, first presented at AERA in 1977). And range restriction becomes manageable in terms of equating and linking instruments measuring in different ranges of the same construct. As Duncan (1985) demonstrated, for instance (see also Allerup, Bech, Loldrup, et al., 1994; Andrich & Styles, 1998), the restricted ranges of various studies assessing relationships between measures of attitudes and behaviors led to the mistaken conclusion that these were separate constructs. When the entire range of variation was explicitly modeled and studied, a consistent relationship was found.
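
The role played by individualized error estimates can be made concrete with a small sketch of my own (not code from Wright and Stone or Andrich): given person measures and their standard errors in logits (the values below are hypothetical), separation reliability and the separation index follow directly.

```python
# Minimal sketch (my own illustration, not code from Wright & Stone or Andrich):
# when every person measure carries its own standard error, reliability can be
# evaluated directly from the measures instead of assumed from a single
# sample-level coefficient.
import math
import numpy as np

def separation_statistics(measures, standard_errors):
    """Rasch-style person separation from measures (logits) and their SEs."""
    measures = np.asarray(measures, dtype=float)
    se = np.asarray(standard_errors, dtype=float)
    observed_var = measures.var(ddof=1)   # observed variance includes error
    error_var = float(np.mean(se ** 2))   # average error variance
    true_var = max(observed_var - error_var, 0.0)
    reliability = true_var / observed_var if observed_var > 0 else 0.0
    separation = math.sqrt(true_var / error_var) if error_var > 0 else float("inf")
    return reliability, separation

# Hypothetical person measures and uncertainties:
rel, sep = separation_statistics(
    measures=[-1.2, -0.4, 0.1, 0.6, 1.5, 2.2],
    standard_errors=[0.45, 0.38, 0.36, 0.37, 0.41, 0.52])
print(f"separation reliability = {rel:.2f}, separation index = {sep:.2f}")
```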

Statistical and correlational methods have long histories of preventing the discovery, assessment, and practical application of invariant relations because they fail to test for invariant units of measurement, do not define standard metrics, never calibrate all instruments measuring the same thing in common units, and have no concept of formal measurement systems of interconnected instruments. Wider appreciation of the distinction between statistics and measurement (Duncan & Stenbeck, 1988; Fisher, 2010; Wilson, 2013a), and of the potential for metrological traceability we have within our reach (Fisher, 2009, 2012; Fisher & Stenner, 2013; Mari & Wilson, 2013; Pendrill, 2014; Pendrill & Fisher, 2015; Wilson, 2013b; Wilson, Mari, Maul, & Torres Irribarra, 2015), are demonstrably fundamental to the advancement of a wide range of fields.

References

Allerup, P., Bech, P., Loldrup, D., Alvarez, P., Banegil, T., Styles, I., & Tenenbaum, G. (1994). Psychiatric, business, and psychological applications of fundamental measurement models. International Journal of Educational Research, 21(6), 611-622.

Andrich, D. (1982). An index of person separation in Latent Trait Theory, the traditional KR-20 index, and the Guttman scale response pattern. Education Research and Perspectives, 9(1), 95-104 [http://www.rasch.org/erp7.htm].

Andrich, D., & Styles, I. M. (1998). The structural relationship between attitude and behavior statements from the unfolding perspective. Psychological Methods, 3(4), 454-469.

Duncan, O. D. (1985). Probability, disposition and the inconsistency of attitudes and behaviour. Synthese, 42, 21-34.

Duncan, O. D., & Stenbeck, M. (1988). Panels and cohorts: Design and model in the study of voting turnout. In C. C. Clogg (Ed.), Sociological Methodology 1988 (pp. 1-35). Washington, DC: American Sociological Association.

Feinstein, A. R. (1995). Meta-analysis: Statistical alchemy for the 21st century. Journal of Clinical Epidemiology, 48(1), 71-79.

Fisher, W. P., Jr. (2009). Invariance and traceability for measures of human, social, and natural capital: Theory and application. Measurement, 42(9), 1278-1287.

Fisher, W. P., Jr. (2010). Statistics and measurement: Clarifying the differences. Rasch Measurement Transactions, 23(4), 1229-1230.

Fisher, W. P., Jr. (2012, May/June). What the world needs now: A bold plan for new standards [Third place, 2011 NIST/SES World Standards Day paper competition]. Standards Engineering, 64(3), 1 & 3-5.

Fisher, W. P., Jr., & Stenner, A. J. (2013). Overcoming the invisibility of metrology: A reading measurement network for education and the social sciences. Journal of Physics: Conference Series, 459(012024), http://iopscience.iop.org/1742-6596/459/1/012024.

Hunter, J. E., & Schmidt, F. L. (Eds.). (2004). Methods of meta-analysis: Correcting error and bias in research findings. Thousand Oaks, CA: Sage.

Mari, L., & Wilson, M. (2013). A gentle introduction to Rasch measurement models for metrologists. Journal of Physics: Conference Series, 459(1), http://iopscience.iop.org/1742-6596/459/1/012002/pdf/1742-6596_459_1_012002.pdf.

Pendrill, L. (2014). Man as a measurement instrument [Special Feature]. NCSLi Measure: The Journal of Measurement Science, 9(4), 22-33.

Pendrill, L., & Fisher, W. P., Jr. (2015). Counting and quantification: Comparing psychometric and metrological perspectives on visual perceptions of number. Measurement, 71, 46-55. doi: http://dx.doi.org/10.1016/j.measurement.2015.04.010

Schmidt, F. L., & Hunter, J. E. (1977). Development of a general solution to the problem of validity generalization. Journal of Applied Psychology, 62(5), 529-540.

Smith, R. M. (1996). A comparison of methods for determining dimensionality in Rasch measurement. Structural Equation Modeling, 3(1), 25-40.

Wilson, M. R. (2013a). Seeking a balance between the statistical and scientific elements in psychometrics. Psychometrika, 78(2), 211-236.

Wilson, M. R. (2013b). Using the concept of a measurement system to characterize measurement models used in psychometrics. Measurement, 46, 3766-3774.

Wilson, M., Mari, L., Maul, A., & Torres Irribarra, D. (2015). A comparison of measurement concepts across physical science and social science domains: Instrument design, calibration, and measurement. Journal of Physics: Conference Series, 588(012034), http://iopscience.iop.org/1742-6596/588/1/012034.

Wright, B. D., & Stone, M. H. (1979). Best test design: Rasch measurement. Chicago, Illinois: MESA Press.

Six Classes of Results Supporting the Measurability of Human Functioning and Capability

April 12, 2014

Another example of high-level analysis that suffers from a lack of input from state-of-the-art measurement arises in Nussbaum (1997, p. 1205), where the author remarks that it is now a matter of course, in development economics, “to recognize distinct domains of human functioning and capability that are not commensurable along a single metric, and with regard to which choice and liberty of agency play a fundamental structuring role.” Though Nussbaum (2011, pp. 58-62) has lately given a more nuanced account of the challenges of measurement relative to human capabilities, appreciation of the power and flexibility of contemporary measurement models, methods, and instruments remains lacking. For a detailed example of the complexities and challenges that must be addressed in the context of global human development, which is Nussbaum’s area of interest, see Fisher (2011).

Though there are indeed domains of human functioning and capability that are not commensurable along a single metric, they are not the ones referred to by Nussbaum or the texts she cites. On the contrary, six different approaches to establishing the measurability of human functioning and capability have been explored and shown to provide, especially in the aggregate, a substantial basis for theory and practice (modified from Fisher, 2009, pp. 1279-1281). These six classes of results speak to the abstract, mathematical side of the paradox noted by Ricoeur (see the previous post here) concerning the need to accept roles simultaneously for abstract ideal global universals and for concrete local historical contexts in strategic planning and thinking. The six classes of results are:

  1. Mathematical proofs of the necessity and sufficiency of test and survey scores for invariant measurement in the context of Rasch’s probabilistic models (Andersen, 1977, 1999; Fischer, 1981; Newby, Conner, Grant, and Bunderson, 2009; van der Linden, 1992); a numerical sketch of this sufficiency property follows the list.
  2. Reproduction of physical units of measurement (centimeters, grams, etc.) from ordinal observations (Choi, 1997; Moulton, 1993; Pelton and Bunderson, 2003; Stephanou and Fisher, 2013).
  3. The common mathematical form of the laws of nature and Rasch models (Rasch, 1960, pp. 110-115; Fisher, 2010; Fisher and Stenner, 2013).
  4. Multiple independent studies of the same constructs, conducted on different (and common) samples with different (and the same) instruments intended to measure the same thing, converge on common units, define the same objects, substantiate theory, and support the viability of standardized metrics (Fisher, 1997a, 1997b, 1999, etc.).
  5. Thousands of peer-reviewed publications in hundreds of scientific journals provide a wide-ranging and diverse array of supporting evidence and theory.
  6. Analogous causal attributions and theoretical explanatory power can be created in both natural and social science contexts (Stenner, Fisher, Stone, and Burdick, 2013).
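
To make the sufficiency property in the first item concrete, here is a minimal numerical sketch of my own (not taken from the cited proofs): with item calibrations held fixed, a person’s maximum likelihood measure under the dichotomous Rasch model depends on the response pattern only through the raw score. The item calibrations and response patterns below are hypothetical.

```python
# Minimal numerical sketch (my illustration, not taken from the cited proofs) of
# raw score sufficiency in the dichotomous Rasch model: with item calibrations
# fixed, the maximum likelihood person measure depends on the responses only
# through the raw score, not on which particular items were answered correctly.
import numpy as np

def ml_person_measure(responses, difficulties, iterations=50):
    """Newton-Raphson ML estimate of one person's measure (in logits)."""
    theta, raw_score = 0.0, responses.sum()
    for _ in range(iterations):
        p = 1.0 / (1.0 + np.exp(-(theta - difficulties)))
        gradient = raw_score - p.sum()       # d log-likelihood / d theta
        information = np.sum(p * (1.0 - p))  # negative second derivative
        theta += gradient / information
    return theta

difficulties = np.array([-1.5, -0.5, 0.0, 0.5, 1.5])  # hypothetical calibrations
pattern_a = np.array([1, 1, 1, 0, 0])  # raw score 3
pattern_b = np.array([0, 1, 1, 0, 1])  # raw score 3, different items correct
print(ml_person_measure(pattern_a, difficulties))
print(ml_person_measure(pattern_b, difficulties))  # same estimate as pattern_a
```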

What we have here, in sum, is a combination of Greek axiomatic reasoning and Babylonian empirical algorithms, in accord with Toulmin’s (1961, pp. 28-33) sense of the contrasting principled bases for scientific advancement. Feynman (1965, p. 46) called for less of a focus on the Greek chain-of-reasoning approach, since a chain is only as strong as its weakest link, whereas Babylonian algorithms are akin to a platform with enough supporting legs that one or more can fail without compromising its overall stability. The variations in theory and evidence under these six headings provide ample support for the conceptual and practical viability of metrological systems of measurement in education, health care, human resource management, sociology, natural resource management, social services, and many other fields. The philosophical critique of any type of economics will inevitably be wide of the mark if it remains uninformed about these accomplishments in the theory and practice of measurement.

References

Andersen, E. B. (1977). Sufficient statistics and latent trait models. Psychometrika, 42(1), 69-81.

Andersen, E. B. (1999). Sufficient statistics in educational measurement. In G. N. Masters & J. P. Keeves (Eds.), Advances in measurement in educational research and assessment (pp. 122-125). New York: Pergamon.

Choi, S. E. (1997). Rasch invents “ounces.” Rasch Measurement Transactions, 11(2), 557 [http://www.rasch.org/rmt/rmt112.htm#Ounces].

Feynman, R. (1965). The character of physical law. Cambridge, Massachusetts: MIT Press.

Fischer, G. H. (1981). On the existence and uniqueness of maximum-likelihood estimates in the Rasch model. Psychometrika, 46(1), 59-77.

Fisher, W. P., Jr. (1997a). Physical disability construct convergence across instruments: Towards a universal metric. Journal of Outcome Measurement, 1(2), 87-113.

Fisher, W. P., Jr. (1997b). What scale-free measurement means to health outcomes research. Physical Medicine & Rehabilitation State of the Art Reviews, 11(2), 357-373.

Fisher, W. P., Jr. (1999). Foundations for health status metrology: The stability of MOS SF-36 PF-10 calibrations across samples. Journal of the Louisiana State Medical Society, 151(11), 566-578.

Fisher, W. P., Jr. (2009). Invariance and traceability for measures of human, social, and natural capital: Theory and application. Measurement, 42(9), 1278-1287.

Fisher, W. P., Jr. (2010). The standard model in the history of the natural sciences, econometrics, and the social sciences. Journal of Physics: Conference Series, 238(1), http://iopscience.iop.org/1742-6596/238/1/012016/pdf/1742-6596_238_1_012016.pdf.

Fisher, W. P., Jr. (2011). Measuring genuine progress by scaling economic indicators to think global & act local: An example from the UN Millennium Development Goals project. LivingCapitalMetrics.com. Retrieved 18 January 2011, from Social Science Research Network: http://ssrn.com/abstract=1739386.

Fisher, W. P., Jr., & Stenner, A. J. (2013). On the potential for improved measurement in the human and social sciences. In Q. Zhang & H. Yang (Eds.), Pacific Rim Objective Measurement Symposium 2012 Conference Proceedings (pp. 1-11). Berlin, Germany: Springer-Verlag.

Moulton, M. (1993). Probabilistic mapping. Rasch Measurement Transactions, 7(1), 268 [http://www.rasch.org/rmt/rmt71b.htm].

Newby, V. A., Conner, G. R., Grant, C. P., & Bunderson, C. V. (2009). The Rasch model and additive conjoint measurement. Journal of Applied Measurement, 10(4), 348-354.

Nussbaum, M. (1997). Flawed foundations: The philosophical critique of (a particular type of) economics. University of Chicago Law Review, 64, 1197-1214.

Nussbaum, M. (2011). Creating capabilities: The human development approach. Cambridge, MA: The Belknap Press.

Pelton, T., & Bunderson, V. (2003). The recovery of the density scale using a stochastic quasi-realization of additive conjoint measurement. Journal of Applied Measurement, 4(3), 269-281.

Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests (Reprint, with Foreword and Afterword by B. D. Wright, Chicago: University of Chicago Press, 1980). Copenhagen, Denmark: Danmarks Paedagogiske Institut.

Rasch, G. (1977). On specific objectivity: An attempt at formalizing the request for generality and validity of scientific statements. Danish Yearbook of Philosophy, 14, 58-94.

Stenner, A. J., Fisher, W. P., Jr., Stone, M. H., & Burdick, D. S. (2013). Causal Rasch models. Frontiers in Psychology: Quantitative Psychology and Measurement, 4(536), 1-14.

Stephanou, A., & Fisher, W. P., Jr. (2013). From concrete to abstract in the measurement of length. Journal of Physics: Conference Series, 459, http://iopscience.iop.org/1742-6596/459/1/012026.

Toulmin, S. E. (1961). Foresight and understanding: An enquiry into the aims of science. London, England: Hutchinson.

van der Linden, W. J. (1992). Sufficient and necessary statistics. Rasch Measurement Transactions, 6(3), 231 [http://www.rasch.org/rmt/rmt63d.htm].


Rasch Measurement as a Basis for a New Standards Framework

October 26, 2011

The 2011 U.S. celebration of World Standards Day took place on October 13 at the Fairmont Hotel in Washington, D.C., with the theme of “Advancing Safety and Sustainability Standards Worldwide.” The evening began with a reception in a hall of exhibits from the celebration’s sponsors, which included the National Institute of Standards and Technology (NIST), the Society for Standards Professionals (SES), the American National Standards Institute (ANSI), Microsoft, IEEE, Underwriters Laboratories, the Consumer Electronics Association, ASME, ASTM International, Qualcomm, Techstreet, and many others. Several speakers took the podium after dinner to welcome the 400 or so attendees and to present the World Standards Day Paper Competition Awards and the Ronald H. Brown Standards Leadership Award.

Dr. Patrick Gallagher, Under Secretary of Commerce for Standards and Technology, and Director of NIST, was the first speaker after dinner. He directed his remarks at the value of a decentralized, voluntary, and demand-driven system of standards in promoting innovation and economic prosperity. Gallagher emphasized that “standards provide the common language that keeps domestic and international trade flowing,” concluding that “it is difficult to overestimate their critical value to both the U.S. and global economy.”

James Shannon, President of the National Fire Protection Association (NFPA), accepted the R. H. Brown Standards Leadership Award in recognition for his work initiating or improving the National Electrical Code, the Life Safety Code, and the Fire Safe Cigarette and Residential Sprinkler Campaigns.

Ellen Emard, President of SES, introduced the paper competition award winners. As of this writing, the titles and authors of the first and second place awards are not yet available on the SES web site (http://www.ses-standards.org/displaycommon.cfm?an=1&subarticlenbr=56). I took third place for my paper, “What the World Needs Now: A Bold Plan for New Standards.” Where the other winning papers took up traditional engineering issues concerning the role of standards in advancing safety and sustainability, my paper spoke to the potential scientific and economic benefits that could be realized by standard metrics and common product definitions for outcomes in education, health care, social services, and environmental resource management. All three of the award-winning papers will appear in a forthcoming issue of Standards Engineering, the journal of SES.

I was coincidentally seated at the dinner alongside Gordon Gillerman, winner of third place in the 2004 paper competition (http://www.ses-standards.org/associations/3698/files/WSD%202004%20-%203%20-%20Gillerman.pdf) and currently Chief of the Standards Services Division at NIST. Gillerman has a broad range of experience in coordinating standards across multiple domains, including environmental protection, homeland security, safety, and health care. Having recently been involved in a workshop focused on measuring, evaluating, and improving the usability of electronic health records (http://www.nist.gov/healthcare/usability/upload/EHR-Usability-Workshop-2011-6-03-2011_final.pdf), Gillerman was quite interested in the potential Rasch measurement techniques hold for reducing data volume with no loss of information, and so for streamlining computer interfaces.

Robert Massof of Johns Hopkins University accompanied me to the dinner, and was seated at a nearby table. Also at Massof’s table were several representatives of the National Institute of Building Sciences, some of whom Massof had recently met at a workshop on adaptations for persons with low vision disabilities. Massof’s work equating the main instruments used for assessing visual function in low vision rehabilitation could lead to a standard metric useful in improving the safety and convenience of buildings.

As is stated in educational materials distributed at the World Standards Day celebration by ANSI, standards are a constant behind-the-scenes presence in nearly all areas of everyday life. Everything from air, water, and food to buildings, clothing, automobiles, roads, and electricity is produced in conformity with voluntary consensus standards of various kinds. In the U.S. alone, more than 100,000 standards specify product and system features and interconnections, making it possible for appliances to tap the electrical grid with the same results no matter where they are plugged in, and for products of all kinds to be purchased with confidence. Life is safer and more convenient, and science and industry are more innovative and profitable, because of standards.

The point of my third-place paper is that life could be even safer and more convenient, and science and industry could be yet more innovative and profitable, if standards and conformity assessment procedures for outcomes in education, health care, social services, and environmental resource management were developed and implemented. Rasch measurement demonstrates the consistent reproducibility of meaningful measures across samples and different collections of construct-relevant items. Within any specific area of interest, then, Rasch measures have the potential of serving as the kind of mediating instruments or objects recognized as essential to the process of linking science with the economy (Fisher & Stenner, 2011b; Hussenot & Missonier, 2010; Miller & O’Leary, 2007). Recent white papers published by NIST and NSF document the challenges and benefits likely to be encountered and produced by initiatives moving in this direction (Fisher, 2009; Fisher & Stenner, 2011a).

A diverse array of Rasch measurement presentations was made at the recent International Measurement Confederation (IMEKO) meeting of metrology engineers in Jena, Germany (see RMT 25(1), p. 1318). With that start on a new dialogue between the natural and social sciences, with the NIST and NSF white papers, and with the award in the World Standards Day paper competition, the U.S. and international standards development communities have shown their interest in exploring possibilities for a new array of standard units of measurement, standardized outcome product definitions, standard conformity assessment procedures, and outcome product quality standards. The increasing acceptance and recognition of the viability of such standards is a logical consequence of observations like these:

  • “Where this law [relating reading ability and text difficulty to comprehension rate] can be applied it provides a principle of measurement on a ratio scale of both stimulus parameters and object parameters, the conceptual status of which is comparable to that of measuring mass and force. Thus…the reading accuracy of a child…can be measured with the same kind of objectivity as we may tell its weight” (Rasch, 1960, p. 115).
  • “Today there is no methodological reason why social science cannot become as stable, as reproducible, and hence as useful as physics” (Wright, 1997, p. 44).
  • “…when the key features of a statistical model relevant to the analysis of social science data are the same as those of the laws of physics, then those features are difficult to ignore” (Andrich, 1988, p. 22).

Rasch’s work has been wrongly assimilated in social science research practice as just another example of the “standard model” of statistical analysis. Rasch measurement ought instead to be treated as a general articulation of the three-variable structure of natural law, useful in framing the context of scientific practice. That is, Rasch’s models ought to be employed primarily in calibrating instruments that are quantitatively interpretable at the point of use in a mathematical language shared by a community of research and practice. To be shared in this way as a universally uniform coin of the realm, that language must be embodied in a consensus standard defining universally uniform units of comparison.
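
In standard notation (a sketch of the analogy echoed in the Rasch quotation above, not a quotation itself), the structural parallel can be written out: the odds of person n succeeding on item i are governed by the ratio of an ability parameter to a difficulty parameter, just as an acceleration is governed by the ratio of a force to a mass, and taking logarithms gives the familiar additive three-variable form in both cases.

```latex
% A sketch of the structural analogy in standard Rasch notation (not a
% quotation from Rasch): the odds of person n succeeding on item i are the
% ratio of an ability parameter to a difficulty parameter, just as an
% acceleration is the ratio of a force to a mass; logarithms give the
% familiar additive three-variable form in both cases.
\[
  \frac{P_{ni}}{1 - P_{ni}} \;=\; \frac{\xi_n}{\epsilon_i}
  \qquad\Longleftrightarrow\qquad
  \ln\frac{P_{ni}}{1 - P_{ni}} \;=\; \beta_n - \delta_i ,
  \qquad \beta_n = \ln \xi_n,\;\; \delta_i = \ln \epsilon_i ,
\]
\[
  a \;=\; \frac{F}{m}
  \qquad\Longleftrightarrow\qquad
  \ln a \;=\; \ln F - \ln m .
\]
```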

Rasch measurement offers the potential of shifting the focus of quantitative psychosocial research away from data analysis to integrated qualitative and quantitative methods enabling the definition of standard units and the calibration of instruments measuring in those units. An intangible assets metric system will, in turn, support the emergence of new product- and performance-based standards, management system standards, and personnel certification standards. Reiterating Rasch’s (1960, p. xx) insight, we can acknowledge with him that “this is a huge challenge, but once the problem has been formulated it does seem possible to meet it.”

References

Andrich, D. (1988). Rasch models for measurement. (Vols. series no. 07-068). Sage University Paper Series on Quantitative Applications in the Social Sciences. Beverly Hills, California: Sage Publications.

Fisher, W. P., Jr. (2009). Metrological infrastructure for human, social, and natural capital (NIST Critical National Need Idea White Paper Series, Retrieved 25 October 2011 from http://www.nist.gov/tip/wp/pswp/upload/202_metrological_infrastructure_for_human_social_natural.pdf). Washington, DC: National Institute of Standards and Technology.

Fisher, W. P., Jr., & Stenner, A. J. (2011a, January). Metrology for the social, behavioral, and economic sciences (Social, Behavioral, and Economic Sciences White Paper Series). Retrieved 25 October 2011 from http://www.nsf.gov/sbe/sbe_2020/submission_detail.cfm?upld_id=36. Washington, DC: National Science Foundation.

Fisher, W. P., Jr., & Stenner, A. J. (2011b). A technology roadmap for intangible assets metrology. In Fundamentals of measurement science. International Measurement Confederation (IMEKO), Jena, Germany, August 31 to September 2.

Hussenot, A., & Missonier, S. (2010). A deeper understanding of evolution of the role of the object in organizational process. The concept of ‘mediation object.’ Journal of Organizational Change Management, 23(3), 269-286.

Miller, P., & O’Leary, T. (2007, October/November). Mediating instruments and making markets: Capital budgeting, science and the economy. Accounting, Organizations, and Society, 32(7-8), 701-734.

Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests (Reprint, with Foreword and Afterword by B. D. Wright, Chicago: University of Chicago Press, 1980). Copenhagen, Denmark: Danmarks Paedagogiske Institut.

Wright, B. D. (1997, Winter). A history of social science measurement. Educational Measurement: Issues and Practice, 16(4), 33-45, 52 [http://www.rasch.org/memo62.htm].

Creative Commons License
LivingCapitalMetrics Blog by William P. Fisher, Jr., Ph.D. is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Based on a work at livingcapitalmetrics.wordpress.com.
Permissions beyond the scope of this license may be available at http://www.livingcapitalmetrics.com.

Reimagining Capitalism Again, Part III: Reflections on Greider’s “Bold Ideas” in The Nation

September 10, 2011

And so, The Nation’s “Bold Ideas for a New Economy” is disappointing for not doing more to start from the beginning identified by its own writer, William Greider. The soul of capitalism needs to be celebrated and nourished, if we are to make our economy “less destructive and domineering,” and “more focused on what people really need for fulfilling lives.” The only real alternative to celebrating and nourishing the soul of capitalism is to kill it, in the manner of the Soviet Union’s failed experiments in socialism and communism.

The article speaks the truth, though, when it says there is no point in trying to persuade the powers that be to make the needed changes. Republicans see the market as it exists as a one-size-fits-all economic panacea, when all it can accomplish in its current incomplete state is the continuing externalization of anything and everything important about human, social, and environmental decency. For their part, Democrats do indeed “insist that regulation will somehow fix whatever is broken,” in an ever-expanding socialistic micromanagement of every possible exception to the rules that emerges.

To date, the president’s efforts at a nonpartisan third way amount only to vacillations between these opposing poles. The leadership that is needed, however, is something else altogether. Yes, as The Nation article says, capitalism needs to be made to serve the interests of society, and this will require deep structural change, not just new policies. But none of the contributors of the “bold ideas” presented propose deep structural changes of a kind that actually gets at the soul of capitalism. All of the suggestions are ultimately just new policies tweaking superficial aspects of the economy in mechanical, static, and very limited ways.

The article calls for “Democratizing reforms that will compel business and finance to share decision-making and distribute rewards more fairly.” It says the vision has different names but “the essence is a fundamental redistribution of power and money.” But corporate distortions of liability law, the introduction of boardroom watchdogs, and a tax on financial speculation do not by any stretch of the imagination address the root causes of social and environmental irresponsibility in business. They “sound like obscure technical fixes” because that’s what they are. The same thing goes for low-cost lending from public banks, the double or triple bottom lines of Benefit Corporations, new anti-trust laws, calls for “open information” policies, added personal stakes for big-time CEOs, employee ownership plans, the elimination of tax subsidies, new standards for sound investing, new measures of GDP, and government guarantees of full employment.

All of these proposals sound like what ought to be the effects and outcomes of efforts addressing the root causes of capitalism’s shortcomings. Instead, they are band-aids applied to scratched fingers and arms when multiple bypass surgery is called for. That is, what we need is to understand how to bring the spirit of capitalism to life in the new domains of human, social, and environmental interests, but what we’re getting is nothing but more of the same piecemeal rearranging of deck chairs on the Titanic.

There is some truth in the assertion that what really needs reinventing is our moral and spiritual imagination. As someone (Einstein or Edison?) is supposed to have put it, originality is simply a matter of having a source for an analogy no one else has considered. Ironically, the best model is often the one most taken for granted and nearest to hand. Such is the case with the two-sided scientific and economic effects of standardized units of measurement. The fundamental moral aspect here is nothing other than the Golden Rule, independently derived and offered in cultures throughout history, globally. Individualized social measurement is nothing if not a matter of determining whether others are being treated in the way you yourself would want to be treated.

And so, yes, to stress the major point of agreement with The Nation, “the new politics does not start in Washington.” Historically, at their best, governments work to keep pace with the social and technical innovations introduced by their peoples. Margaret Mead said it well a long time ago when she asserted that small groups of committed citizens are the only sources of real social change.

Not to be just one of many “advocates with bold imaginations” who wind up marginalized by the constraints of status quo politics, I claim my personal role in imagining a new economic future by tapping as deeply as I can into the positive, pre-existing structures needed for a transition into a new democratic capitalism. We learn through what we already know. Standards are well established as essential to commerce and innovation, but 90% of the capital under management in our economy—the human, social, and natural capital—lacks the standards needed for optimal market efficiency and effectiveness. An intangible assets metric system will be a vitally important way in which we extend what is right and good in the world today into new domains.

To conclude, what sets this proposal apart from those offered by The Nation and its readers hinges on our common agreement that “the most threatening challenge to capitalism is arguably the finite carrying capacity of the natural world.” The bold ideas proposed by The Nation’s readers respond to this challenge in ways that share an important feature in common: people have to understand the message and act on it. That fact dooms all of these ideas from the start. If we have to articulate and communicate a message that people then have to act on, we remain a part of the problem and not part of the solution.

As I argue in my “The Problem is the Problem” blog post of some months ago, this way of defining problems is itself the problem. That is, we can no longer think of ourselves as separate from the challenges we face. If we think we are not all implicated through and through as participants in the construction and maintenance of the problem, then we have not understood it. The bold ideas offered to date are all responses to the state of a broken system that seek to reform one or another element in the system when what we need is a whole new system.

What we need is a system that so fully embodies nature’s own ecological wisdom that the medium becomes the message. When the ground rules for economic success are put in place such that it is impossible to earn a profit without increasing stocks of human, social, and natural capital, there will be no need to spell out the details of a microregulatory structure of controls: new anti-trust laws, “open information” policies, personal stakes for big-time CEOs, employee ownership plans, the elimination of tax subsidies, etc. What we need is precisely what Greider reported from Innovest in his book: reliable, high quality information that makes human, social, and environmental issues matter financially. Situated in a context like that described by Bernstein in his 2004 The Birth of Plenty, with the relevant property rights, rule of law, scientific rationality, capital markets, and communications networks in place, it will be impossible to stop a new economic expansion of historic proportions.

Creative Commons License
LivingCapitalMetrics Blog by William P. Fisher, Jr., Ph.D. is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Based on a work at livingcapitalmetrics.wordpress.com.
Permissions beyond the scope of this license may be available at http://www.livingcapitalmetrics.com.

Reimagining Capitalism Again, Part I: Reflections on Greider’s Soul of Capitalism

September 10, 2011

In his 2003 book, The Soul of Capitalism, William Greider wrote, “If capitalism were someday found to have a soul, it would probably be located in the mystic qualities of capital itself” (p. 94). The recurring theme in the book is that the resolution of capitalism’s deep conflicts must grow out as organic changes from the roots of capitalism itself.

In the book, Greider quotes Innovest’s Michael Kiernan as suggesting that the goal has to be re-engineering the DNA of Wall Street (p. 119). He says the key to doing this is good reliable information that has heretofore been unavailable but which will make social and environmental issues matter financially. The underlying problems of exactly what solid, high quality information looks like, where it comes from, and how it is created are not stated or examined, but the point, as Kiernan says, is that “the markets are pretty good at punishing and rewarding.” The objective is to use “the financial markets as an engine of reform and positive change rather than destruction.”

This objective is, of course, the focus of multiple postings in this blog (see especially this one and this one). From my point of view, capitalism indeed does have a soul and it is actually located in the qualities of capital itself. Think about it: if a soul is a spirit of something that exists independent of its physical manifestation, then the soul of capitalism is the fungibility of capital. Now, this fungibility is complex and ambiguous. It takes its strength and practical value from the way market exchanges are represented in terms of currencies, monetary units that, within some limits, provide an objective basis of comparison useful for rewarding those capable of matching supply with demand.

But the fungibility of capital can also be dangerously misconceived when the rich complexity and diversity of human capital is unjustifiably reduced to labor, when the irreplaceable value of natural capital is unjustifiably reduced to land, and when the trust, loyalty, and commitment of social capital is completely ignored in financial accounting and economic models. As I’ve previously said in this blog, the concept of human capital is inherently immoral so far as it reduces real human beings to interchangeable parts in an economic machine.

So how could it ever be possible to justify any reduction of human, social, and natural value to a mere number? Isn’t this the ultimate in the despicable inhumanity of economic logic, corporate decision making, and, ultimately, the justification of greed? Many among us who profess liberal and progressive perspectives seem to have an automatic and reactionary prejudice of this kind. This makes these well-intentioned souls as much a part of the problem as those among us with sometimes just as well-intentioned perspectives that accept such reductionism as the price of entry into the game.

There is another way. Human, social, and natural value can be measured and made manageable in ways that do not necessitate totalizing reduction to a mere number. The problem is not reduction itself, but unjustified, totalizing reduction. Referring to all people as “man” or “men” is an unjustified reduction dangerous in the way it focuses attention only on males. The tendency to think and act in ways privileging males over females that is fostered by this sense of “man” shortchanges us all, and has happily been largely eliminated from discourse.

Making language more inclusive does not, however, mean that words lose the singular specificity they need to be able to refer to things in the world. Any given word represents an infinite population of possible members of a class of things, actions, and forms of life. Any simple sentence combining words into a coherent utterance then multiplies infinities upon infinities. Discourse inherently reduces multiplicities into texts of limited lengths.

Like any tool, reduction has its uses. Also like any tool, problems arise when the tool is allowed to occupy some hidden and unexamined blind spot from which it can dominate and control the way we think about everything. Critical thinking is most difficult in those instances in which the tools of thinking themselves need to be critically evaluated. To reject reduction uncritically as inherently unjustified is to throw the baby out with the bathwater. Indeed, it is impossible to formulate a statement of the rejection without simultaneously enacting exactly what is supposed to be rejected.

We have numerous ready-to-hand examples of how all reduction has been unjustifiably reduced to one homogenized evil. But one of the results of experiments in communal living in the 1960s and 1970s, as well as of the fall of the Soviet Union, was the realization that the centralized command and control of collectively owned community property cannot compete with the creativity engendered when individuals hold legal title to the fruits of their labors. If individuals cannot own the results of the investments they make, no one makes any investments.

In other words, if everything is owned collectively and is never reduced to individually possessed shares that can be creatively invested for profitable returns, then the system is structured so as to punish innovation and reward doing as little as possible. But there’s another way of thinking about the relation of the collective to the individual. The living soul of capitalism shows itself in the way high quality information makes it possible for markets to efficiently coordinate and align individual producers’ and consumers’ collective behaviors and decisions. What would happen if we could do that for human, social, and natural capital markets? What if “social capitalism” is more than an empty metaphor? What if capital institutions can be configured so that individual profit really does become the driver of socially responsible, sustainable economics?

And here we arrive at the crux of the problem. How do we create the high quality, solid information markets need to punish and reward relative to ethical and sustainable human, social, and environmental values? Well, what can we learn from the way we created that kind of information for property and manufactured capital? These are the questions taken up and explored in the postings in this blog, and in my scientific research publications and meeting presentations. In the near future, I’ll push my reflection on these questions further, and will explore some other possible answers to the questions offered by Greider and his readers in a recent issue of The Nation.

Creative Commons License
LivingCapitalMetrics Blog by William P. Fisher, Jr., Ph.D. is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Based on a work at livingcapitalmetrics.wordpress.com.
Permissions beyond the scope of this license may be available at http://www.livingcapitalmetrics.com.

Debt, Revenue, and Changing the Way Washington Works: The Greatest Entrepreneurial Opportunity of Our Time

July 30, 2011

“Holding the line” on spending and taxes does not make for a fundamental transformation of the way Washington works. Simply doing less of one thing is just a small quantitative change that does nothing to build positive results or set a new direction. What we need is a qualitative metamorphosis akin to a caterpillar becoming a butterfly. In contrast with this beautiful image of natural processes, the arguments and so-called principles being invoked in the sham debate that’s going on are nothing more than fights over where to put deck chairs on the Titanic.

What sort of transformation is possible? What kind of a metamorphosis will start from who and where we are, but redefine us sustainably and responsibly? As I have repeatedly explained in this blog, my conference presentations, and my publications, with numerous citations of authoritative references, we already possess all of the elements of the transformation. We have only to organize and deploy them. Of course, discerning what the resources are and how to put them together is not obvious. And though I believe we will do what needs to be done when we are ready, it never hurts to prepare for that moment. So here’s another take on the situation.

Infrastructure that supports lean thinking is the name of the game. Lean thinking focuses on identifying and removing waste. Anything that consumes resources but does not contribute to the quality of the end product is waste. We have enormous amounts of wasteful inefficiency in many areas of our economy. These inefficiencies are concentrated in areas in which management is hobbled by low quality information, where we lack the infrastructure we need.

Providing and capitalizing on this infrastructure is The Greatest Entrepreneurial Opportunity of Our Time. Changing the way Washington (ha! I just typed “Wastington”!) works is the same thing as mitigating the sources of risk that caused the current economic situation. Making government behave more like a business requires making the human, social, and natural capital markets more efficient. Making those markets more efficient requires reducing the costs of transactions. Those costs are determined in large part by information quality, which is a function of measurement.

It is often said that the best way to reduce the size of government is to move the functions of government into the marketplace. But this proposal has never been associated with any sense of the infrastructural components needed to really make the idea work. Simply reducing government without an alternative way of performing its functions is irresponsible and destructive. And many of those who rail on and on about how bad or inefficient government is fail to recognize that the government is us. We get the government we deserve. The government we get follows directly from the kind of people we are. Government embodies our image of ourselves as a people. In the US, this is what having a representative form of government means. “We the people” participate in our society’s self-governance not just by voting, writing letters to congress, or demonstrating, but in the way we spend our money, where we choose to live, work, and go to school, and in every decision we make. No one can take a breath of air, a drink of water, or a bite of food without trusting everyone else to not carelessly or maliciously poison them. No one can buy anything or drive down the street without expecting others to behave in predictable ways that ensure order and safety.

But we don’t just trust blindly. We have systems in place to guard against those who would ruthlessly seek to gain at everyone else’s expense. And systems are the point. No individual person or firm, no matter how rich, could afford to set up and maintain the systems needed for checking and enforcing air, water, food, and workplace safety measures. Society as a whole invests in the infrastructure of measures created, maintained, and regulated by the government’s Department of Commerce and the National Institute of Standards and Technology (NIST). The moral importance and the economic value of measurement standards have been stressed historically over many millennia, from the Bible and the Quran to the Magna Carta and the French Revolution to the US Constitution. Uniform weights and measures are universally recognized and accepted as essential to fair trade.

So how is it that we nonetheless apparently expect individuals and local organizations like schools, businesses, and hospitals to measure and monitor students’ abilities; employees’ skills and engagement; patients’ health status, functioning, and quality of care; etc.? Why do we not demand common currencies for the exchange of value in human, social, and natural capital markets? Why don’t we as a society compel our representatives in government to institute the will of the people and create new standards for fair trade in education, health care, social services, and environmental management?

Measuring better is not just a local issue! It is a systemic issue! When measurement is objective and when we all think together in the common language of a shared metric (like hours, volts, inches or centimeters, ounces or grams, degrees Fahrenheit or Celsius, etc.), then and only then do we have the means we need to implement lean strategies and create new efficiencies systematically. We need an Intangible Assets Metric System.

The current recession in large part was caused by failures in measuring and managing trust, responsibility, loyalty, and commitment. Similar problems in measuring and managing human, social, and natural capital have led to endlessly spiraling costs in education, health care, social services, and environmental management. The problems we’re experiencing in these areas are intimately tied up with the way we formulate and implement group level decision making processes and policies based in statistics when what we need is to empower individuals with the tools and information they need to make their own decisions and policies. We will not and cannot metamorphose from caterpillar to butterfly until we create the infrastructure through which we each can take full ownership and control of our individual shares of the human, social, and natural capital stock that is rightfully ours.

We well know that we manage what we measure. What counts gets counted. Attention tends to be focused on what we’re accountable for. But–and this is vitally important–many of the numbers called measures do not provide the information we need for management. And not only are lots of numbers giving us low quality information, there are far too many of them! We could have better and more information from far fewer numbers.

Previous postings in this blog document the fact that we have the intellectual, political, scientific, and economic resources we need to measure and manage human, social, and natural capital for authentic wealth. And the issue is not a matter of marshaling the will. It is hard to imagine how there could be more demand for better management of intangible assets than there is right now. The problem in meeting that demand is a matter of imagining how to start the ball rolling. What configuration of investments and resources will start the process of bursting open the chrysalis? How will the demand for meaningful mediating instruments be met in a way that leads to the spreading of the butterfly’s wings? It is an exciting time to be alive.

Creative Commons License
LivingCapitalMetrics Blog by William P. Fisher, Jr., Ph.D. is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Based on a work at livingcapitalmetrics.wordpress.com.
Permissions beyond the scope of this license may be available at http://www.livingcapitalmetrics.com.

Number lines, counting, and measuring in arithmetic education

July 29, 2011

Over the course of two days spent at a meeting on mathematics education, a question started to form in my mind, one I don’t know how to answer, and to which there may be no answer. I’d like to try to formulate what’s on my mind in writing, and see if it’s just nonsense, a curiosity, some old debate that’s been long since resolved, issues too complex to try to use in elementary education, or something we might actually want to try to do something about.

The question stems from my long experience in measurement. It is one of the basic principles of the field that counting and measuring are different things (see the list of publications on this, below). Counts don’t behave like measures unless the things being counted are units of measurement established as equal ratios or intervals that remain invariant independent of the local particulars of the sample and instrument.

Plainly, if you count two groups of small and large rocks or oranges, the two groups can have the same number of things and the group with the larger things will have more rock or orange than the group with the smaller things. But the association of counting numbers and arithmetic operations with number lines insinuates and reinforces to the point of automatic intuition the false idea that numbers always represent quantity. I know that number lines are supposed to represent an abstract continuum but I think it must be nearly impossible for children to not assume that the number line is basically a kind of ruler, a real physical thing that behaves much like a row of same size wooden blocks laid end to end.

This could be completely irrelevant if the distinction between “How many?” and “How much?” is intensively taught and drilled into kids. Somehow I think it isn’t, though. And here’s where I get to the first part of my real question. Might not the universal, early, and continuous reinforcement of this simplistic equating of number and quantity have a lot to do with the equally simplistic assumption that all numeric data and statistical analysis is somehow quantitative? We count rocks or fish or sticks and call the resulting numbers quantities, and so we do the same thing when we count correct answers or ratings of “Strongly Agree.”

Though counting is a natural and obvious point from which to begin studying whether something is quantitatively measurable, there are no defined units of measurement in the ordinal data gathered from tests and surveys. The amount of the variable represented by the difference between two adjacent scores varies depending on which two adjacent scores are compared. This has profound implications for the inferences we make and for our ability to think together as a field about our objects of investigation.
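
A small sketch of my own illustrates the point: for a hypothetical ten-item test with evenly spaced item calibrations, the logit measures corresponding to successive raw scores are farther apart near the extremes than in the middle, so a one-point difference in the count does not represent a constant amount of the variable.

```python
# Small sketch of my own (hypothetical ten-item test with evenly spaced item
# calibrations): the logit measures corresponding to successive raw scores show
# that one additional count is worth different amounts of the variable in
# different parts of the score range -- adjacent raw scores are not equal
# intervals.
import numpy as np

def ml_measure_for_score(raw_score, difficulties, iterations=50):
    """ML person measure (logits) implied by a raw score on calibrated items."""
    theta = 0.0
    for _ in range(iterations):
        p = 1.0 / (1.0 + np.exp(-(theta - difficulties)))
        theta += (raw_score - p.sum()) / np.sum(p * (1.0 - p))
    return theta

difficulties = np.linspace(-2, 2, 10)
measures = [ml_measure_for_score(s, difficulties) for s in range(1, 10)]
for s, (m1, m2) in enumerate(zip(measures, measures[1:]), start=1):
    print(f"raw score {s} -> {s + 1}: gap = {m2 - m1:.2f} logits")
```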

Over the last 30 years and more, we have become increasingly sensitized to the way our words prefigure our expectations and color our perceptions. This struggle to say what we mean and to not prejudicially exclude others from recognition as full human beings is admirable and good. But if that is so, why is it then that we nonetheless go on unjustifiably reducing the real characteristics of people’s abilities, health, performances, etc. to numbers that do not and cannot stand for quantitative amounts? Why do we keep on referring to counts as quantities? Why do we insist on referring to inconstant and locally dependent scores as measures? And why do we refuse to use the readily available methods we have at our disposal to create universally uniform measures that consistently represent the same unit amount always and everywhere?

It seems to me that the image of the number line as a kind of ruler is so indelibly impressed on us as a habit of thought that it is very difficult to relinquish it in favor of a more abstract model of number. Might it be important for us to begin to plant the seeds for more sophisticated understandings of number early in mathematics education? I’m going to wonder out loud about this to some of my math education colleagues…

Cooper, G., & Humphry, S. M. (2010). The ontological distinction between units and entities. Synthese, DOI: 10.1007/s11229-010-9832-1.

Wright, B. D. (1989). Rasch model from counting right answers: Raw scores as sufficient statistics. Rasch Measurement Transactions, 3(2), 62 [http://www.rasch.org/rmt/rmt32e.htm].

Wright, B. D. (1993). Thinking with raw scores. Rasch Measurement Transactions, 7(2), 299-300 [http://www.rasch.org/rmt/rmt72r.htm].

Wright, B. D. (1994, Autumn). Measuring and counting. Rasch Measurement Transactions, 8(3), 371 [http://www.rasch.org/rmt/rmt83c.htm].

Wright, B. D., & Linacre, J. M. (1989). Observations are always ordinal; measurements, however, must be interval. Archives of Physical Medicine and Rehabilitation, 70(12), 857-867 [http://www.rasch.org/memo44.htm].

Creative Commons License
LivingCapitalMetrics Blog by William P. Fisher, Jr., Ph.D. is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Based on a work at livingcapitalmetrics.wordpress.com.
Permissions beyond the scope of this license may be available at http://www.livingcapitalmetrics.com.

Translating Gingrich’s Astute Observations on Health Care

June 30, 2011

“At the very heart of transforming health and healthcare is one simple fact: it will require a commitment by the federal government to invest in science and discovery. The period between investment and profit for basic research is too long for most companies to ever consider making the investment. Furthermore, truly basic research often produces new knowledge that everyone can use, so there is no advantage to a particular company to make the investment. The result is that truly fundamental research is almost always a function of government and foundations because the marketplace discourages focusing research in that direction” (p. 169 in Gingrich, 2003).

Gingrich says this while recognizing (p. 185) that:

“Money needs to be available for highly innovative ‘out of the box’ science. Peer review is ultimately a culturally conservative and risk-averse model. Each institution’s director should have a small amount of discretionary money, possibly 3% to 5% of their budget, to spend on outliers.”

He continues (p. 170), with some important elaborations on the theme:

“America’s economic future is a direct function of our ability to take new scientific research and translate it into entrepreneurial development.”

“The [Hart/Rudman] Commission’s second conclusion was that the failure to invest in scientific research and the failure to reform math and science education was the second largest threat to American security [behind terrorism].”

“Our goal [in the Hart/Rudman Commission] was to communicate the centrality of the scientific endeavor to American life and the depth of crisis we believe threatens the math and science education system. The United States’ ability to lead today is a function of past investments in scientific research and math and science education. There is no reason today to believe we will automatically maintain that lead especially given our current investments in scientific research and the staggering levels of our failures in math and science education.”

“Our ability to lead in 2025 will be a function of current decisions. Increasing our investment in science and discovery is a sound and responsible national security policy. No other federal expenditure will do more to create jobs, grow wealth, strengthen our world leadership, protect our environment, promote better education, or ensure better health for the country. We must make this increase now.”

On p. 171, this essential point is made:

“In health and healthcare, it is particularly important to increase our investment in research.”

This is all good. I agree completely. What NG says is probably more true than he realizes, in four ways.

First, the scientific capital created via metrology, controlled via theory, and embodied in technological instruments is the fundamental driver of any economy. The returns on investments in metrological improvements range from 40% to over 400% (NIST, 1996). We usually think of technology and technical standards in terms of computers, telecommunications, and electronics, but virtually nothing in our lives is untouched by metrology: the air, water, food, clothing, roads, buildings, cars, appliances, etc. are all monitored, maintained, and/or manufactured relative to various kinds of universally uniform standards. NG, like most people, is completely unaware that such standards are feasible and already under development for health, functionality, quality of life, quality of care, math and science education, etc. Given the huge ROIs associated with metrological improvements, there ought to be proportionately huge investments being made in metrology for human, social, and natural capital.

Second, NG’s point concerning national security is right on the mark, though for reasons that go beyond the ones he gives. There are very good reasons for thinking that investments in, and meaningful returns from, the basic science for human, social, and natural capital metrology could be expected to undercut the motivations for terrorism and the retreats into fundamentalisms of various kinds that emerge in the face of the failures of liberal democracy (Marty, 2001). Making all forms of capital measured, managed, and accountable within a common framework accessible to everyone everywhere could be an important contributing factor, emulating the property titling rationale of De Soto (1989, 2000) and the support for distributed cognition at the social level provided by metrological networks (Latour, 1987, 2005; Magnus, 2007). The costs of measurement can be so high as to stifle whole economies (Barzel, 1982), which is, broadly speaking, the primary problem with the economies of education, health care, social services, philanthropy, and environmental management (see, for instance, regarding philanthropy, Goldberg, 2009). Building the legal and financial infrastructure for low-friction titling and property exchange has become a basic feature of World Bank and IMF projects. My point, ever since I read De Soto, has been that we ought to be doing the same thing for human, social, and natural capital, facilitating explicit ownership of the skills, motivations, health, trust, and environmental resources that are rightfully the property of each of us, and that similar effects on national security ought to follow.

Third, NG makes an excellent point when he stresses the need for health and healthcare to be individual-centered, saying that, in contrast with the 20th-century healthcare system, “In the 21st Century System of Health and Healthcare, you will own your medical record, control your healthcare dollars, and be able to make informed choices about healthcare providers.” This is basically equivalent to saying that health capital needs to be fungible, and it cannot be fungible, of course, without a metrological infrastructure that makes every measure of outcomes, quality of life, etc. traceable to a reference standard. Individual-centeredness is also, of course, what distinguishes proper measurement from statistics. Measurement supports inductive inference, from the individual to the population, whereas statistics are deductive, going from the population to the individual (Fisher & Burton, 2010; Fisher, 2010). Individual-centered healthcare will never go anywhere without properly calibrated instrumentation and the traceability to reference standards that makes measures meaningful.

Fourth, NG repeatedly indicates how appalled he is at the slow pace of change in healthcare, citing research showing that it can take up to 17 years for doctors to adopt new procedures. I contend that this is an effect of our micromanagement of dead, concrete forms of capital. In a fluid living capital market, not only will consumers be able to reward quality in their purchasing decisions by having the information they need when they need it and in a form they can understand, but the quality improvements will be driven from the provider side in much the same way. As Brent James has shown, readily available, meaningful, and comparable information on natural variation in outcomes makes it much easier for providers to improve results and reduce the variation in them. Despite its central importance and the many years that have passed, however, the state of measurement in health care remains in dire need of dramatic improvement. Fryback (1993, p. 271; also see Kindig, 1999) succinctly put the point, observing that the U.S.

“health care industry is a $900 + billion [over $2.5 trillion in 2009 (CMS, 2011)] endeavor that does not know how to measure its main product: health. Without a good measure of output we cannot truly optimize efficiency across the many different demands on resources.”

Quantification in health care is almost universally approached using methods inadequate to the task, resulting in ordinal and scale-dependent scores that cannot take advantage of the objective comparisons provided by invariant, individual-level measures (Andrich, 2004). Though data-based statistical studies informing policy have their place, virtually no effort or resources have been invested in developing individual-level instruments traceable to universally uniform metrics that define the outcome products of health care. These metrics are key to efficiently harmonizing quality improvement, diagnostic, and purchasing decisions and behaviors in the manner described by Berwick, James, and Coye (2003) without having to cumbersomely communicate the concrete particulars of locally-dependent scores (Heinemann, Fisher, & Gershon, 2006). Metrologically-based common product definitions will finally make it possible for quality improvement experts to implement analogues of the Toyota Production System in healthcare, long presented as a model but never approached in practice (Coye, 2001).
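To make the contrast concrete, here is a minimal Python sketch of the dichotomous Rasch model, the simplest case of the invariant, individual-level measurement referred to above. The item calibrations, the two item subsets, and the simple Newton-Raphson routine are illustrative assumptions rather than any particular published instrument or algorithm; the point is only that a raw score is interpretable solely in relation to the specific items administered, whereas measures estimated from different item subsets land on one common logit scale and can be compared along with their uncertainties.

```python
import math

def rasch_probability(ability, difficulty):
    """Probability of a correct or affirmed response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def measure_from_raw_score(raw_score, item_difficulties, tol=1e-6):
    """Estimate a person measure (in logits) and its standard error from a raw
    score on a set of already-calibrated items, by Newton-Raphson iteration on
    the maximum-likelihood equation. Extreme scores (zero or perfect) have no
    finite estimate and are rejected."""
    n = len(item_difficulties)
    if raw_score <= 0 or raw_score >= n:
        raise ValueError("Extreme scores have no finite measure.")
    theta = 0.0
    while True:
        probs = [rasch_probability(theta, d) for d in item_difficulties]
        expected = sum(probs)
        info = sum(p * (1.0 - p) for p in probs)
        step = (raw_score - expected) / info
        theta += step
        if abs(step) < tol:
            return theta, 1.0 / math.sqrt(info)

# Hypothetical calibrations (logits) for two overlapping subsets of one larger scale:
easier_items = [-1.5, -1.0, -0.5, 0.0, 0.5]
harder_items = [-0.5, 0.0, 0.5, 1.0, 1.5]

# The same raw score of 3 out of 5 is ordinal and subset-dependent, but the
# resulting measures are expressed in the same logit metric and are comparable.
print(measure_from_raw_score(3, easier_items))
print(measure_from_raw_score(3, harder_items))
```

In a metrologically traceable system, the item calibrations would be anchored to a reference standard rather than re-estimated locally, which is what makes measures from different instruments and settings directly comparable.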

So, what does all of this add up to? A new division for human, social, and natural capital in NIST is in order, with extensive involvement from NIH, CMS, AHRQ, and other relevant agencies. Innovative measurement methods and standards are the “out of the box” science NG refers to. Providing these tools is the definitive embodiment of an appropriate role for government. These are the kinds of things that we could have a productive conversation with NG about, it seems to me….

References

Andrich, D. (2004, January). Controversy and the Rasch model: A characteristic of incompatible paradigms? Medical Care, 42(1), I-7–I-16.

Barzel, Y. (1982). Measurement costs and the organization of markets. Journal of Law and Economics, 25, 27-48.

Berwick, D. M., James, B., & Coye, M. J. (2003, January). Connections between quality measurement and improvement. Medical Care, 41(1 (Suppl)), I30-38.

Centers for Medicare and Medicaid Services. (2011). National health expenditure data: NHE fact sheet. Retrieved 30 June 2011, from https://www.cms.gov/NationalHealthExpendData/25_NHE_Fact_Sheet.asp.

Coye, M. J. (2001, November/December). No Toyotas in health care: Why medical care has not evolved to meet patients’ needs. Health Affairs, 20(6), 44-56.

De Soto, H. (1989). The other path: The economic answer to terrorism. New York: Basic Books.

De Soto, H. (2000). The mystery of capital: Why capitalism triumphs in the West and fails everywhere else. New York: Basic Books.

Fisher, W. P., Jr. (2010). Statistics and measurement: Clarifying the differences. Rasch Measurement Transactions, 23(4), 1229-1230 [http://www.rasch.org/rmt/rmt234.pdf].

Fisher, W. P., Jr., & Burton, E. (2010). Embedding measurement within existing computerized data systems: Scaling clinical laboratory and medical records heart failure data to predict ICU admission. Journal of Applied Measurement, 11(2), 271-287.

Fryback, D. (1993). QALYs, HYEs, and the loss of innocence. Medical Decision Making, 13(4), 271-2.

Gingrich, N. (2008). Real change: From the world that fails to the world that works. Washington, DC: Regnery Publishing.

Goldberg, S. H. (2009). Billions of drops in millions of buckets: Why philanthropy doesn’t advance social progress. New York: Wiley.

Heinemann, A. W., Fisher, W. P., Jr., & Gershon, R. (2006). Improving health care quality with outcomes management. Journal of Prosthetics and Orthotics, 18(1), 46-50 [http://www.oandp.org/jpo/library/2006_01S_046.asp].

Kindig, D. A. (1997). Purchasing population health. Ann Arbor, Michigan: University of Michigan Press.

Kindig, D. A. (1999). Purchasing population health: Aligning financial incentives to improve health outcomes. Nursing Outlook, 47, 15-22.

Latour, B. (1987). Science in action: How to follow scientists and engineers through society. New York: Cambridge University Press.

Latour, B. (2005). Reassembling the social: An introduction to Actor-Network-Theory. (Clarendon Lectures in Management Studies). Oxford, England: Oxford University Press.

Magnus, P. D. (2007). Distributed cognition and the task of science. Social Studies of Science, 37(2), 297-310.

Marty, M. (2001). Why the talk of spirituality today? Some partial answers. Second Opinion, 6, 53-64.

Marty, M., & Appleby, R. S. (Eds.). (1993). Fundamentalisms and society: Reclaiming the sciences, the family, and education. The fundamentalisms project, vol. 2. Chicago: University of Chicago Press.

National Institute of Standards and Technology. (1996). Appendix C: Assessment examples. Economic impacts of research in metrology. In Committee on Fundamental Science, Subcommittee on Research (Ed.), Assessing fundamental science: A report from the Subcommittee on Research, Committee on Fundamental Science. Washington, DC: National Science and Technology Council [http://www.nsf.gov/statistics/ostp/assess/nstcafsk.htm#Topic%207; last accessed 30 June 2011].


Newton, Metaphysics, and Measurement

January 20, 2011

Though Newton claimed to deduce quantitative propositions from phenomena, the record shows that he brought a whole cartload of presuppositions to bear on his observations (White, 1997), such as his belief that Pythagoras was the discoverer of the inverse square law, his knowledge of Galileo’s freefall experiments, and his theological and astrological beliefs in occult actions at a distance. Without his immersion in this intellectual environment, he likely would not have been able to then contrive the appearance of deducing quantity from phenomena.

The second edition of the Principia, in which appears the phrase “hypotheses non fingo,” was brought out in part to respond to the charge that Newton had not offered any explanation of what gravity is. De Morgan, in particular, felt that Newton seemed to know more than he could prove (Keynes, 1946). But in his response to the critics, and in asserting that he feigns no hypotheses, Newton was making an important distinction between explaining the causes or composition of gravity and describing how it works. Newton was saying he did not rely on or make or test any hypotheses as to what gravity is; his only concern was with how it behaves. In due course, gravity came to be accepted as a fundamental feature of the universe in no need of explanation.

Heidegger (1977, p. 121) contends that Newton was, as is implied in the translation “I do not feign hypotheses,” saying in effect that the ground plan he was offering as a basis for experiment and practical application was not something he just made up. Despite Newton’s rejection of metaphysical explanations, the charge that he had not explained what gravity is was answered with a metaphysics of method: first, deriving from nature the foundation for a science of precise predictive control, and then resituating that foundation within nature as an experimental method incorporating a mathematical plan or model. This was, of course, quite astute of Newton, as far as he went, but he stopped far short of articulating the background assumptions informing his methods.

Newton’s desire for a logic of experimental science led him to reject anything “metaphysical or physical, or based on occult qualities, or mechanical” as a foundation for proceeding. Following in Descartes’ wake, Newton then was satisfied to solidify the subject-object duality and to move forward on the basis of objective results that seemed to make metaphysics a thing of the past. Unfortunately, as Burtt (1954/1932, pp. 225-230) observes in this context, the only thing that can possibly happen when you presume discourse to be devoid of metaphysical assumptions is that your metaphysics is more subtly insinuated and communicated to others because it is not overtly presented and defended. Thus we have the history of logical positivism as the dominant philosophy of science.

It is relevant to recall here that Newton was known for strong and accurate intuitions, and strong and unorthodox religious views (he held the Lucasian Chair at Cambridge only by royal dispensation, as he was not Anglican). It must be kept in mind that Newton’s combination of personal characteristics was situated in the social context of the emerging scientific culture’s increasing tendency to prioritize results that could be objectively detached from the particular people, equipment, samples, etc. involved in their production (Shapin, 1989). Newton then had insights that, while remarkably accurate, could not be entirely derived from the evidence he offered and that, moreover, could not acceptably be explained informally, psychologically, or theologically.

What is absolutely fascinating about this constellation of factors is that it became a model for the conduct of science. Of course, Newton’s laws of motion were adopted as the hallmark of successful scientific modeling in the form of the Standard Model applied throughout physics in the nineteenth century (Heilbron, 1993). But so was the metaphysical positivist logic of a pure objectivism detached from everything personal, intuitive, metaphorical, social, economic, or religious (Burtt, 1954/1932).

Kuhn (1970) made a major contribution to dismantling this logic when he contrasted textbook presentations of the methodical production of scientific effects with the actual processes of cobbled-together fits and starts that are lived out in the work of practicing scientists. But much earlier, James Clerk Maxwell (1879, pp. 162-163) had made exactly the same observation in a contrast of the work of Ampere with that of Faraday:

“The experimental investigation by which Ampere established the laws of the mechanical action between electric currents is one of the most brilliant achievements in science. The whole, theory and experiment, seems as if it had leaped, full grown and full armed, from the brain of the ‘Newton of electricity.’ It is perfect in form, and unassailable in accuracy, and it is summed up in a formula from which all the phenomena may be deduced, and which must always remain the cardinal formula of electro-dynamics.

“The method of Ampere, however, though cast into an inductive form, does not allow us to trace the formation of the ideas which guided it. We can scarcely believe that Ampere really discovered the law of action by means of the experiments which he describes. We are led to suspect, what, indeed, he tells us himself* [Ampere’s Theorie…, p. 9], that he discovered the law by some process which he has not shewn us, and that when he had afterwards built up a perfect demonstration he removed all traces of the scaffolding by which he had raised it.

“Faraday, on the other hand, shews us his unsuccessful as well as his successful experiments, and his crude ideas as well as his developed ones, and the reader, however inferior to him in inductive power, feels sympathy even more than admiration, and is tempted to believe that, if he had the opportunity, he too would be a discoverer. Every student therefore should read Ampere’s research as a splendid example of scientific style in the statement of a discovery, but he should also study Faraday for the cultivation of a scientific spirit, by means of the action and reaction which will take place between newly discovered facts and nascent ideas in his own mind.”

Where does this leave us? In sum, Rasch emulated Ampere in two ways. He did so first in wanting to become the “Newton of reading,” or even the “Newton of psychosocial constructs,” when he sought to show that data from reading test items and readers are structured with an invariance analogous to that of data from instruments applying a force to an object with mass (Rasch, 1960, pp. 110-115). He emulated Ampere again when, after building up a perfect demonstration of a reading law structured in the form of Newton’s second law, he did not report the means by which he had constructed test items capable of producing data fitting the model, effectively removing all traces of the scaffolding.
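Stated compactly (with notation chosen here for illustration rather than quoted from Rasch), the parallel is between two ratio laws governing a two-way frame of objects and agents:

\[
a_{vi} \;=\; \frac{F_v}{M_i}
\qquad\Longleftrightarrow\qquad
\zeta_{vi} \;=\; \frac{\xi_v}{\delta_i},
\qquad
P(x_{vi} = 1) \;=\; \frac{\zeta_{vi}}{1 + \zeta_{vi}},
\]

where \(\xi_v\) is a reader’s ability, \(\delta_i\) a text’s difficulty, and \(\zeta_{vi}\) the reader’s odds of success on that text. Taking logarithms turns both laws into additive ones, \(\log a_{vi} = \log F_v - \log M_i\) and \(\log \zeta_{vi} = \log \xi_v - \log \delta_i\): the familiar logit form, a person measure minus an item calibration.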

The scaffolding has been reconstructed for reading (Stenner, et al., 2006) and has also been left in plain view by others doing analogous work involving other constructs (cognitive and moral development, mathematics ability, short-term memory, etc.). Dawson (2002), for instance, compares developmental scoring systems of varying sophistication and predictive control. And the plethora of uncritically applied Rasch analyses may yet turn out to be a capital resource for researchers interested in focusing on possible universal laws, predictive theories, and uniform metrics.

That is, published reports of calibration, error, and fit estimates open up opportunities for “pseudo-equating” (Beltyukova, Stone, & Fox, 2004; Fisher, 1997, 1999) in their documentation of the invariance, or lack thereof, of constructs over samples and instruments. The evidence will point to a need for theoretical and metric unification directly analogous to what happened in the study and use of electricity in the nineteenth century:

“…’the existence of quantitative correlations between the various forms of energy, imposes upon men of science the duty of bringing all kinds of physical quantity to one common scale of comparison.’” [Schaffer, 1992, p. 26; quoting Everett 1881; see Smith & Wise 1989, pp. 684-4]

Qualitative and quantitative correlations in scaling results converged on a common construct in the domain of reading measurement through the 1960s and 1970s, culminating in the Anchor Test Study and the calibration of the National Reference Scale for Reading (Jaeger, 1973; Rentz & Bashaw, 1977). The lack of a predictive theory and the entirely empirical nature of the scale estimates prevented wide application of the scale, as the items in the equated tests were soon replaced with new ones.
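A minimal sketch may help make the pseudo-equating idea tangible. Assuming, purely for illustration, a handful of items (or closely matched item pairs) calibrated on two separately developed instruments, with published logit calibrations and standard errors in hand, the linking constant is just the mean difference in the shared calibrations, and invariance is checked by asking whether the residual differences stay within the joint calibration error. The item names and numbers below are made up; the logic is generic common-item linking rather than any specific published procedure.

```python
import math

# Hypothetical calibrations (logits) and standard errors for items judged to
# express the same construct on two separately developed instruments.
instrument_a = {"item1": (-0.8, 0.09), "item2": (-0.2, 0.08),
                "item3": (0.4, 0.08), "item4": (1.1, 0.10)}
instrument_b = {"item1": (-0.3, 0.11), "item2": (0.3, 0.10),
                "item3": (0.9, 0.10), "item4": (1.5, 0.12)}

shared = sorted(set(instrument_a) & set(instrument_b))

# Linking constant: mean difference in the calibrations of the shared items.
diffs = [instrument_b[i][0] - instrument_a[i][0] for i in shared]
link = sum(diffs) / len(diffs)

# Invariance check: after removing the constant, each residual difference
# should be small relative to its combined calibration error.
for i in shared:
    (da, sa), (db, sb) = instrument_a[i], instrument_b[i]
    residual = (db - da) - link
    combined_se = math.sqrt(sa ** 2 + sb ** 2)
    print(f"{i}: residual {residual:+.2f} logits vs. joint SE {combined_se:.2f}")

print(f"Instrument B measures about {link:.2f} logits above Instrument A's origin.")
```

When the residuals fall within the joint error bounds, measures from the two instruments can be reported on one scale; when they do not, the misfit itself is informative about where the construct is not yet invariant.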

But the broad scale of the invariance observed across tests and readers suggests that some mechanism must be at work (Stenner, Stone, & Burdick, 2009), or that some form of life must be at play (Fisher, 2003a, 2003b, 2004, 2010a), structuring the data. Eventually, some explanation accounting for the structure ought to become apparent, as it did for reading (Stenner, Smith, & Burdick, 1983; Stenner, et al., 2006). This emergence of self-organizing structures repeatedly asserting themselves as independently existing real things is the medium of the message we need to hear. That message is that instruments play a very large and widely unrecognized role in science. By facilitating the routine production of mutually consistent, regularly observable, and comparable results they set the stage for theorizing, the emergence of consensus on what’s what, and uniform metrics (Daston & Galison, 2007; Hankins & Silverman, 1999; Latour, 1987, 2005; Wise, 1988, 1995). The form of Rasch’s models as extensions of Maxwell’s method of analogy (Fisher, 2010b) makes them particularly productive as a means of providing self-organizing invariances with a medium for their self-inscription. But that’s a story for another day.

References

Beltyukova, S. A., Stone, G. E., & Fox, C. M. (2004). Equating student satisfaction measures. Journal of Applied Measurement, 5(1), 62-9.

Burtt, E. A. (1954/1932). The metaphysical foundations of modern physical science (Rev. ed.) [First edition published in 1924]. Garden City, New York: Doubleday Anchor.

Daston, L., & Galison, P. (2007). Objectivity. Cambridge, MA: MIT Press.

Dawson, T. L. (2002, Summer). A comparison of three developmental stage scoring systems. Journal of Applied Measurement, 3(2), 146-89.

Fisher, W. P., Jr. (1997). Physical disability construct convergence across instruments: Towards a universal metric. Journal of Outcome Measurement, 1(2), 87-113.

Fisher, W. P., Jr. (1999). Foundations for health status metrology: The stability of MOS SF-36 PF-10 calibrations across samples. Journal of the Louisiana State Medical Society, 151(11), 566-578.

Fisher, W. P., Jr. (2003a, December). Mathematics, measurement, metaphor, metaphysics: Part I. Implications for method in postmodern science. Theory & Psychology, 13(6), 753-90.

Fisher, W. P., Jr. (2003b, December). Mathematics, measurement, metaphor, metaphysics: Part II. Accounting for Galileo’s “fateful omission.” Theory & Psychology, 13(6), 791-828.

Fisher, W. P., Jr. (2004, October). Meaning and method in the social sciences. Human Studies: A Journal for Philosophy and the Social Sciences, 27(4), 429-54.

Fisher, W. P., Jr. (2010a). Reducible or irreducible? Mathematical reasoning and the ontological method. Journal of Applied Measurement, 11(1), 38-59.

Fisher, W. P., Jr. (2010b). The standard model in the history of the natural sciences, econometrics, and the social sciences. Journal of Physics: Conference Series, 238(1), http://iopscience.iop.org/1742-6596/238/1/012016/pdf/1742-6596_238_1_012016.pdf.

Hankins, T. L., & Silverman, R. J. (1999). Instruments and the imagination. Princeton, New Jersey: Princeton University Press.

Jaeger, R. M. (1973). The national test equating study in reading (The Anchor Test Study). Measurement in Education, 4, 1-8.

Keynes, J. M. (1946, July). Newton, the man. (Speech given at the celebration of the tercentenary of Newton’s birth in 1642.) In The Collected Writings of John Maynard Keynes, Volume X (pp. 363-364). London, England: Macmillan/St. Martin’s Press.

Kuhn, T. S. (1970). The structure of scientific revolutions. Chicago, Illinois: University of Chicago Press.

Latour, B. (1987). Science in action: How to follow scientists and engineers through society. New York: Cambridge University Press.

Latour, B. (2005). Reassembling the social: An introduction to Actor-Network-Theory. (Clarendon Lectures in Management Studies). Oxford, England: Oxford University Press.

Maxwell, J. C. (1879). Treatise on electricity and magnetism, Volumes I and II. London, England: Macmillan.

Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests (Reprint, with Foreword and Afterword by B. D. Wright, Chicago: University of Chicago Press, 1980). Copenhagen, Denmark: Danmarks Paedagogiske Institut.

Rentz, R. R., & Bashaw, W. L. (1977, Summer). The National Reference Scale for Reading: An application of the Rasch model. Journal of Educational Measurement, 14(2), 161-179.

Schaffer, S. (1992). Late Victorian metrology and its instrumentation: A manufactory of Ohms. In R. Bud & S. E. Cozzens (Eds.), Invisible connections: Instruments, institutions, and science (pp. 23-56). Bellingham, WA: SPIE Optical Engineering Press.

Shapin, S. (1989, November-December). The invisible technician. American Scientist, 77, 554-563.

Stenner, A. J., Burdick, H., Sanford, E. E., & Burdick, D. S. (2006). How accurate are Lexile text measures? Journal of Applied Measurement, 7(3), 307-22.

Stenner, A. J., Smith, M., III, & Burdick, D. S. (1983, Winter). Toward a theory of construct definition. Journal of Educational Measurement, 20(4), 305-316.

Stenner, A. J., Stone, M., & Burdick, D. (2009, Autumn). The concept of a measurement mechanism. Rasch Measurement Transactions, 23(2), 1204-1206.

White, M. (1997). Isaac Newton: The last sorcerer. New York: Basic Books.

Wise, M. N. (1988). Mediating machines. Science in Context, 2(1), 77-113.

Wise, M. N. (Ed.). (1995). The values of precision. Princeton, New Jersey: Princeton University Press.


Open Letter to the Impact Investment Community

May 4, 2010

It is very encouraging to discover your web sites (GIIN, IRIS, and GIIRS) and to see the work you’re doing in advancing the concept of impact investing. The defining issue of our time is figuring out how to harness the profit motive for socially responsible and environmentally sustainable prosperity. The economic, social, and environmental disasters of today might all have been prevented or significantly mitigated had social and environmental impacts been taken into account in all investing.

My contribution is to point out that, though the profit motive must be harnessed as the engine driving responsible and sustainable business practices, the force of that power is dissipated and negated by the lack of efficient human, social, and natural capital markets. If we cannot make these markets function more like financial markets, so that money naturally flows to those places where it produces the greatest returns, we will never succeed in the fundamental reorientation of the economy toward responsible sustainability. The goal has to be one of tying financial profits to growth in realized human potential, community, and environmental quality, but to do that we need measures of these intangible forms of capital that are as scientifically rigorous as they are eminently practical and convenient.

Better measurement is key to reducing the market frictions that inflate the cost of human, social, and natural capital transactions. A truly revolutionary paradigm shift has occurred in measurement theory and practice over the last fifty years and more. New methods make it possible

* to reduce data volume dramatically with no loss of information,
* to custom tailor measures by selectively adapting indicators to the entity rated, without compromising comparability,
* to remove rater leniency or severity effects from the measures,
* to design optimally efficient measurement systems that provide the level of precision needed to support decision making (see the sketch of adaptive item selection after this list),
* to establish reference standard metrics that remain universally uniform across variations in local impact assessment indicator configurations, and
* to calibrate instruments that measure in metrics intuitively meaningful to stakeholders and end users.
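
As a concrete illustration of the adaptive-design point above, here is a minimal Python sketch of how such a system can deliver a stated level of precision with the fewest possible indicators. Everything in it is a toy assumption made for brevity: the fifteen-item bank, the 0.3-logit error target, the single Newton-Raphson update per response, and the simulated respondent; a production system would draw on a securely calibrated bank and more careful estimation.

```python
import math
import random

def prob(ability, difficulty):
    """Dichotomous Rasch probability of an affirmative response."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def adaptive_session(respond, item_bank, target_sem=0.3, max_items=20):
    """Administer items adaptively until the standard error of measurement
    falls below target_sem, or the bank or maximum length is exhausted.
    `respond(difficulty)` stands in for a real person answering an item."""
    theta, sem = 0.0, float("inf")
    used_difficulties, responses = [], []
    remaining = dict(item_bank)
    while remaining and len(used_difficulties) < max_items and sem > target_sem:
        # Pick the unused item closest to the current estimate: that is where
        # each additional response carries the most statistical information.
        name = min(remaining, key=lambda n: abs(remaining[n] - theta))
        used_difficulties.append(remaining.pop(name))
        responses.append(respond(used_difficulties[-1]))
        # One Newton-Raphson step toward the maximum-likelihood measure,
        # skipped while all responses are still identical (no finite estimate yet).
        if 0 < sum(responses) < len(responses):
            probs = [prob(theta, d) for d in used_difficulties]
            info = sum(p * (1.0 - p) for p in probs)
            theta += (sum(responses) - sum(probs)) / info
            sem = 1.0 / math.sqrt(info)
    return theta, sem, len(used_difficulties)

# Toy usage: a bank of 15 indicators spread across the scale, and a simulated
# respondent whose "true" position is 0.7 logits.
bank = {"q%02d" % k: -2.0 + k * 0.3 for k in range(15)}
simulated = lambda d: 1 if random.random() < prob(0.7, d) else 0
print(adaptive_session(simulated, bank))
```

The design choice this sketch illustrates is that precision is a target set in advance, not an accident of however many questions happen to be on a form: data volume shrinks to whatever the decision at hand actually requires.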

Unfortunately, almost all the admirable energy and resources being poured into business intelligence measures skip over these “new” developments, defaulting to mistaken assumptions about numbers and the nature of measurement. Typical ratings, checklists, and scores provide units of measurement that

* change size depending on which question is asked, which rating category is assigned, and who or what is rated,
* increase data volume with every new question asked,
* push measures up and down in uncontrolled ways depending on who is judging the performance,
* are of unknown precision, and
* cannot be compared across different composite aggregations of ratings.

I have over 25 years of experience in the use of advanced measurement and instrument calibration methods, backed up with MA and PhD degrees from the University of Chicago. The methods in which I am trained have been standard practice in educational testing for decades, and in the last 20 years have become the methods of choice in health care outcomes assessment.

I am passionately committed to putting these methods to work in the domain of impact investing, business intelligence, and ecological economics. As is shown in my attached CV, I have dozens of peer-reviewed publications presenting technical and philosophical research in measurement theory and practice.

In the last few years, I have taken my work in the direction of documenting the ways in which measurement can and should reduce information overload and transaction costs; enhance human, social, and natural capital market efficiencies; provide the instruments embodying common currencies for the exchange of value; and inform a new kind of Genuine Progress Indicator or Happiness Index.

For more information, please see the attached 2009 article I published in Measurement on these topics, and the attached White Paper I produced last July in response to a call from NIST for critical national need ideas. Various entries in my blog (https://livingcapitalmetrics.wordpress.com) elaborate on measurement technicalities, history, and philosophy, as do my web site at http://www.livingcapitalmetrics.com and my profile at http://www.linkedin.com/in/livingcapitalmetrics.

For instance, the blog post at https://livingcapitalmetrics.wordpress.com/2009/11/22/al-gore-will-is-not-the-problem/ explores the idea with which I introduced myself to you here, that the profit motive embodies our collective will for responsible and sustainable business practices, but we hobble ourselves with self-defeating inattention to the ways in which capital is brought to life in efficient markets. We have the solutions to our problems at hand, though there are no panaceas, and the challenges are huge.

Please feel free to contact me at your convenience. Whether we are ultimately able to work together or not, I enthusiastically wish you all possible success in your endeavors.

Sincerely,

William P. Fisher, Jr., Ph.D.
LivingCapitalMetrics.com
919-599-7245

We are what we measure.
It’s time we measured what we want to be.

Creative Commons License
LivingCapitalMetrics Blog by William P. Fisher, Jr., Ph.D. is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Based on a work at livingcapitalmetrics.wordpress.com.
Permissions beyond the scope of this license may be available at http://www.livingcapitalmetrics.com.