Archive for May, 2015

Feminist Diffractions, Stochastic Resonance, and Education, Revisited

May 25, 2015

Lehrer (2015) offers an insightful commentary on Saxe et al.’s (2015) recent article in Human Development that prompts some observations.

Two areas for questions and comments come to mind. The first has to do with construing the development and revision of new ways of understanding as contested, which implicitly aligns with Latour’s (1987, pp. 89, 93) sense of the way new constructs are subjected to tests of strength. Haraway (1996) makes an important point in her critique of what she sees as the overly masculinist metaphors of heroic competition and (perhaps not so) sublimated violence in these contests. Her sense of “feminist diffractions” stops short of what I have in mind, but opens the door to an alternative approach to what Lehrer calls the “close coupling of definitions with the development and revision of new concepts and ways of understanding.”

Galison (1997, pp. 843-844), for instance, seeks a metaphor capable of expressing what happens in the conceptual, practical, and argumentative contests between different communities of scientists (instrumentalist technicians, theoreticians, and experimentalists). He wants a metaphor that does justice to the disunified chaos and disorder one finds in the relationships between these different groups, which paradoxically results in such productive and coherent innovations. He recalls Peirce’s and Wittgenstein’s metaphors of cables and threads that take their strength from being intertwined from smaller wires and bits of fiber but finds these images too mechanical for his purposes. He wants something more akin to amorphous semiconductors or laminated materials that can fail microscopically but hold macroscopically better than more structurally homogenous materials.

Berg and Timmermans (2000, pp. 55-56) make a similar observation in their study of the constitution of universalities in medical fields:

“In order for a statistical logistics to enhance precise decision making, it has to incorporate imprecision; in order to be universal, it has to carefully select its locales. … Paradoxically, then, the increased stability and reach of this network was not due to more (precise) instructions: the protocol’s logistics could thrive only by parasitically drawing upon its own disorder.”

The general problem is taken up by Ricoeur (1992, p. 289), who raises the notion of “universals in context or of potential or inchoate universals” that embody the paradox in which

“on the one hand, one must maintain the universal claim attached to a few values where the universal and the historical intersect, and on the other hand, one must submit this claim to discussion, not on a formal level, but on the level of the convictions incorporated in concrete forms of life.”

To repeat another theme that comes up again and again in this blog, this kind of noise-induced order sounds like the phenomenon of stochastic resonance (Fisher, 1992, 2011). The importance of stochastic resonance is that it opens up a way to connect the phenomena of emergent understanding with measurement, both at the local individual and general systemic levels.
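
For readers unfamiliar with the phenomenon, a toy simulation may help make noise-induced order concrete. The following is only a minimal sketch under stated assumptions, not a model taken from the works cited: a weak periodic signal never reaches a detector's threshold on its own, and the function names, amplitude, threshold, and noise levels are all hypothetical choices made for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def detector_correlation(noise_sd, n=20000, amplitude=0.8, threshold=1.0, seed=1):
    """Correlate a weak periodic signal with the output of a threshold detector.

    All parameter values are illustrative assumptions. The signal peaks at
    0.8, below the threshold of 1.0, so it is undetectable without noise.
    """
    rng = random.Random(seed)
    signal = [amplitude * math.sin(2 * math.pi * t / 200) for t in range(n)]
    # The detector fires (1) only when signal plus noise exceeds the threshold.
    fired = [1 if s + rng.gauss(0, noise_sd) > threshold else 0 for s in signal]
    return pearson(signal, fired)


# No noise, moderate noise, and overwhelming noise.
for sd in (0.0, 0.4, 5.0):
    print(f"noise sd = {sd}: correlation = {detector_correlation(sd):.3f}")
```

In this sketch the detector tracks the signal best at the intermediate noise level: with no noise it never fires, and with very large noise its firing is nearly unrelated to the signal.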

This is the crux of some very important issues in the philosophy of science and in philosophy generally. Haraway (1996, pp. 439-440), for instance, points out that “embedded relationality is the prophylaxis for both relativism and transcendence.” And Golinski (2012, p. 35) similarly says, “Practices of translation, replication, and metrology have taken the place of the universality that used to be assumed as an attribute of singular science.”

A start in the direction of embedded relationality, translation, replication, and metrology in education is apparent, for instance, in work that enables teachers to usefully relate individual student performances to general learning progressions, connecting instructional applications with accountability (Fisher & Wilson, 2015; Lehrer, 2013; Lehrer & Jones, 2014; Wilson, 2004). As Lehrer (2015, p. 49) says about the Saxe et al. work, “Recurrent forms of mathematical practice enabled the authors to create compelling trajectories of collective activity and learning over time while preserving the contributions of individual development.”

The second of the two topics I’d like to address comes up here in the closing paragraph of his short commentary, where Lehrer says a “hoped-for future innovation would make it possible to visualize individual and collective trajectories simultaneously.” Though future improvements can certainly be expected, visualizations of individual and collective trajectories for growth in reading are already being recognized in both educational and metrological contexts (Stenner, Swartz, Hanlon, & Emerson, 2012; Stenner & Fisher, 2013, p. 4) for their potential to serve as the media of an embedded relationality capable of undercutting both the relativism of uncontrolled local variation and the universalist pretensions often built into accountability programs.

With emerging recognition of the potential that Rasch’s stochastic approaches to construct mapping (Bond & Fox, 2007; Wilson, 2005) offer for metrological translation networks (Mari & Wilson, 2013; Pendrill, 2014; Pendrill & Fisher, 2015; Fisher & Wilson, 2015; Stenner & Fisher, 2013; Wilson, 2013; Wilson, Mari, Maul, & Torres Irribarra, 2015), there are good reasons to expect significant new kinds of progress in fields that rely on assessments and surveys for outcome measurement and management.

References

Berg, M., & Timmermans, S. (2000). Orders and their others: On the constitution of universalities in medical work. Configurations, 8(1), 31-61.

Bond, T., & Fox, C. (2007). Applying the Rasch model: Fundamental measurement in the human sciences, 2d edition. Mahwah, New Jersey: Lawrence Erlbaum Associates.

Fisher, W. P., Jr. (1992). Stochastic resonance and Rasch measurement. Rasch Measurement Transactions, 5(4), 186-187 [http://www.rasch.org/rmt/rmt54k.htm].

Fisher, W. P., Jr. (2011). Stochastic and historical resonances of the unit in physics and psychometrics. Measurement: Interdisciplinary Research & Perspectives, 9, 46-50.

Fisher, W. P., Jr., & Stenner, A. J. (2015). The role of metrology in mobilizing and mediating the language and culture of scientific facts. Journal of Physics: Conference Series, 588(012043).

Fisher, W. P., Jr., & Wilson, M. (2015). Building a productive trading zone in educational assessment research and practice. Pensamiento Educativo, in review.

Galison, P. (1997). Image and logic: A material culture of microphysics. Chicago: University of Chicago Press.

Golinski, J. (2012). Is it time to forget science? Reflections on singular science and its history. Osiris, 27(1), 19-36.

Haraway, D. J. (1996). Modest witness: Feminist diffractions in science studies. In P. Galison & D. J. Stump (Eds.), The disunity of science: Boundaries, contexts, and power (pp. 428-441). Stanford, California: Stanford University Press.

Latour, B. (1987). Science in action: How to follow scientists and engineers through society. Cambridge, Massachusetts: Harvard University Press.

Lehrer, R. (2013, April 29). (Chair). In A learning progression emerges in a trading zone of professional community and identity. American Educational Research Association, Division C on Learning and Instruction, Section 2b on Learning and Motivation in Social and Cultural Contexts, San Francisco, CA.

Lehrer, R., & Jones, S. (2014, 2 April). Construct maps as boundary objects in the trading zone. In W. P. Fisher Jr. (Chair), Session 3-A: Rating Scales and Partial Credit, Theory and Applied. International Objective Measurement Workshop, Philadelphia, PA.

Lehrer, R. (2015). Designing for development: Commentary on Saxe, de Kirby, Kang, Le and Schneider. Human Development, 58(1), 45-49.

Mari, L., & Wilson, M. (2013). A gentle introduction to Rasch measurement models for metrologists. Journal of Physics: Conference Series, 459(1), http://iopscience.iop.org/1742-6596/459/1/012002/pdf/1742-6596_459_1_012002.pdf.

Pendrill, L. (2014). Man as a measurement instrument [Special Feature]. NCSLi Measure: The Journal of Measurement Science, 9(4), 22-33.

Pendrill, L., & Fisher, W. P., Jr. (2015). Counting and quantification: Comparing psychometric and metrological perspectives on visual perceptions of number. Measurement, 71, 46-55.

Ricoeur, P. (1992). Oneself as another. Chicago, Illinois: University of Chicago Press.

Saxe, G. B., de Kirby, K., Kang, B., Le, M., & Schneider, A. (2015). Studying cognition through time in a classroom community: The interplay between “everyday” and “scientific” concepts. Human Development, 58(1), 5-44.

Stenner, A. J., & Fisher, W. P., Jr. (2013). Metrological traceability in the social sciences: A model from reading measurement. Journal of Physics: Conference Series, 459(012025), http://iopscience.iop.org/1742-6596/459/1/012025.

Stenner, A. J., Swartz, C., Hanlon, S., & Emerson, C. (2012, February). Personalized learning platforms. Presented at the Pearson Global Research Conference, Fremantle, Western Australia.

Wilson, M. (Ed.). (2004). National Society for the Study of Education Yearbooks. Vol. 103, Part II: Towards coherence between classroom assessment and accountability. Chicago, Illinois: University of Chicago Press.

Wilson, M. (2005). Constructing measures: An item response modeling approach. Mahwah, New Jersey: Lawrence Erlbaum Associates.

Wilson, M. R. (2013). Using the concept of a measurement system to characterize measurement models used in psychometrics. Measurement, 46, 3766-3774.

 

The Counterproductive Consequences of Common Study Designs and Statistical Methods

May 21, 2015

Because of the ways studies are designed and the ways data are analyzed, research results in psychology and the social sciences often appear to be nonlinear, sample- and instrument-dependent, and incommensurable, even when they need not be. In contrast with what are common assumptions about the nature of the constructs involved, invariant relations may be more obscured than clarified by typically employed research designs and statistical methods.

To take a particularly salient example, the number of small factors with eigenvalues greater than 1.0 identified via factor analysis increases as the number of modes in a multi-modal distribution increases, and the interpretation of results is further complicated by the fact that the number of factors identified decreases as sample size increases (Smith, 1996).

Similarly, variation in employment test validity across settings was established as a basic assumption by the 1970s, after 50 years of studies observing the situational specificity of results. But then Schmidt and Hunter (1977) identified sampling error, measurement error, and range restriction as major sources of what was only the appearance of incommensurable variation in employment test validity. In other words, for most of the 20th century, the identification of constructs and comparisons of results across studies were pointlessly confused by mixed populations, uncontrolled variation in reliability, and unnoted floor and/or ceiling effects. Though they do nothing to establish information systems deploying common languages structured by standard units of measurement (Feinstein, 1995), meta-analysis techniques are a step forward in equating effect sizes (Hunter & Schmidt, 2004).
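
The three artifacts named above each have a standard textbook formula, and a short sketch shows how much an observed validity coefficient can move once they are taken into account. The code below is an illustration only: it uses the classical correction for attenuation, the Thorndike Case II correction for direct range restriction, and a common approximation for the sampling error of a correlation. It is not presented as the exact procedure of Schmidt and Hunter (1977), and all the input values are hypothetical.

```python
import math


def correct_for_attenuation(r_xy, rel_x, rel_y):
    """Classical correction: estimated correlation between error-free scores."""
    return r_xy / math.sqrt(rel_x * rel_y)


def correct_for_range_restriction(r_restricted, sd_unrestricted, sd_restricted):
    """Thorndike Case II correction for direct range restriction on the predictor."""
    u = sd_unrestricted / sd_restricted
    return (u * r_restricted) / math.sqrt(1 + r_restricted ** 2 * (u ** 2 - 1))


def sampling_error_sd(r, n):
    """Approximate standard error of an observed correlation in a sample of size n."""
    return (1 - r ** 2) / math.sqrt(n - 1)


# Hypothetical study: observed validity of 0.25 in a sample of 68 people.
observed = 0.25
print("sampling error SD:", round(sampling_error_sd(observed, 68), 3))
print("corrected for unreliability:",
      round(correct_for_attenuation(observed, rel_x=0.80, rel_y=0.60), 3))
print("corrected for range restriction:",
      round(correct_for_range_restriction(observed, 1.5, 1.0), 3))
```

With a sampling error this large relative to the coefficient itself, two small studies of the same test can easily report quite different validities without any real difference between settings.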

Wright and Stone’s (1979) Best Test Design, in contrast, takes up each of these problems in an explicit way. Sampling error is addressed in that both the sample’s and the items’ representations of the same populations of persons and expressions of a construct are evaluated. The evaluation of reliability is foregrounded and clarified by taking advantage of the availability of individualized measurement uncertainty (error) estimates (following Andrich, 1982, presented at AERA in 1977). And range restriction becomes manageable in terms of equating and linking instruments measuring in different ranges of the same construct. As was demonstrated by Duncan (1985; Allerup, Bech, Loldrup, et al., 1994; Andrich & Styles, 1998), for instance, the restricted ranges of various studies assessing relationships between measures of attitudes and behaviors led to the mistaken conclusion that these were separate constructs. When the entire range of variation was explicitly modeled and studied, a consistent relationship was found.
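
The individualized uncertainty estimates mentioned above can be sketched for the dichotomous Rasch model, where the model standard error of a person measure is the inverse square root of the test information at that measure. This is a minimal illustration rather than code from Best Test Design; the function names and the 11-item test are hypothetical.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of success under the dichotomous Rasch model (in logits)."""
    return 1.0 / (1.0 + math.exp(difficulty - ability))


def standard_error(ability, difficulties):
    """Model standard error of a person measure: 1 / sqrt(test information)."""
    information = sum(
        p * (1.0 - p)
        for p in (rasch_probability(ability, d) for d in difficulties)
    )
    return 1.0 / math.sqrt(information)


# Hypothetical 11-item test with difficulties spread from -2.5 to +2.5 logits.
item_difficulties = [-2.5 + 0.5 * i for i in range(11)]

for ability in (-4.0, 0.0, 4.0):
    se = standard_error(ability, item_difficulties)
    print(f"ability {ability:+.1f} logits: SE = {se:.2f}")
```

Persons measured near the center of the test get smaller errors than persons near its floor or ceiling, which is why a single test-level reliability coefficient hides differences that person-level error estimates make visible.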

Statistical and correlational methods have long histories of preventing the discovery, assessment, and practical application of invariant relations because they fail to test for invariant units of measurement, do not define standard metrics, never calibrate all instruments measuring the same thing in common units, and have no concept of formal measurement systems of interconnected instruments. Wider appreciation of the distinction between statistics and measurement (Duncan & Stenbeck, 1988; Fisher, 2010; Wilson, 2013a), and of the potential for metrological traceability we have within our reach (Fisher, 2009, 2012; Fisher & Stenner, 2013; Mari & Wilson, 2013; Pendrill, 2014; Pendrill & Fisher, 2015; Wilson, 2013b; Wilson, Mari, Maul, & Torres Irribarra, 2015), is demonstrably fundamental to the advancement of a wide range of fields.

References

Allerup, P., Bech, P., Loldrup, D., Alvarez, P., Banegil, T., Styles, I., & Tenenbaum, G. (1994). Psychiatric, business, and psychological applications of fundamental measurement models. International Journal of Educational Research, 21(6), 611-622.

Andrich, D. (1982). An index of person separation in Latent Trait Theory, the traditional KR-20 index, and the Guttman scale response pattern. Education Research and Perspectives, 9(1), 95-104 [http://www.rasch.org/erp7.htm].

Andrich, D., & Styles, I. M. (1998). The structural relationship between attitude and behavior statements from the unfolding perspective. Psychological Methods, 3(4), 454-469.

Duncan, O. D. (1985). Probability, disposition and the inconsistency of attitudes and behaviour. Synthese, 42, 21-34.

Duncan, O. D., & Stenbeck, M. (1988). Panels and cohorts: Design and model in the study of voting turnout. In C. C. Clogg (Ed.), Sociological Methodology 1988 (pp. 1-35). Washington, DC: American Sociological Association.

Feinstein, A. R. (1995). Meta-analysis: Statistical alchemy for the 21st century. Journal of Clinical Epidemiology, 48(1), 71-79.

Fisher, W. P., Jr. (2009). Invariance and traceability for measures of human, social, and natural capital: Theory and application. Measurement, 42(9), 1278-1287.

Fisher, W. P., Jr. (2010). Statistics and measurement: Clarifying the differences. Rasch Measurement Transactions, 23(4), 1229-1230.

Fisher, W. P., Jr. (2012, May/June). What the world needs now: A bold plan for new standards [Third place, 2011 NIST/SES World Standards Day paper competition]. Standards Engineering, 64(3), 1 & 3-5.

Fisher, W. P., Jr., & Stenner, A. J. (2013). Overcoming the invisibility of metrology: A reading measurement network for education and the social sciences. Journal of Physics: Conference Series, 459(012024), http://iopscience.iop.org/1742-6596/459/1/012024.

Hunter, J. E., & Schmidt, F. L. (Eds.). (2004). Methods of meta-analysis: Correcting error and bias in research findings. Thousand Oaks, CA: Sage.

Mari, L., & Wilson, M. (2013). A gentle introduction to Rasch measurement models for metrologists. Journal of Physics: Conference Series, 459(1), http://iopscience.iop.org/1742-6596/459/1/012002/pdf/1742-6596_459_1_012002.pdf.

Pendrill, L. (2014). Man as a measurement instrument [Special Feature]. NCSLi Measure: The Journal of Measurement Science, 9(4), 22-33.

Pendrill, L., & Fisher, W. P., Jr. (2015). Counting and quantification: Comparing psychometric and metrological perspectives on visual perceptions of number. Measurement, 71, 46-55. doi: http://dx.doi.org/10.1016/j.measurement.2015.04.010

Schmidt, F. L., & Hunter, J. E. (1977). Development of a general solution to the problem of validity generalization. Journal of Applied Psychology, 62(5), 529-540.

Smith, R. M. (1996). A comparison of methods for determining dimensionality in Rasch measurement. Structural Equation Modeling, 3(1), 25-40.

Wilson, M. R. (2013a). Seeking a balance between the statistical and scientific elements in psychometrics. Psychometrika, 78(2), 211-236.

Wilson, M. R. (2013b). Using the concept of a measurement system to characterize measurement models used in psychometrics. Measurement, 46, 3766-3774.

Wilson, M., Mari, L., Maul, A., & Torres Irribarra, D. (2015). A comparison of measurement concepts across physical science and social science domains: Instrument design, calibration, and measurement. Journal of Physics: Conference Series, 588(012034), http://iopscience.iop.org/1742-6596/588/1/012034.

Wright, B. D., & Stone, M. H. (1979). Best test design: Rasch measurement. Chicago, Illinois: MESA Press.

Moore’s Law at 50

May 13, 2015

Thomas Friedman interviewed Gordon Moore on the occasion of the 50th anniversary of Moore’s 1965 article predicting that computing power would exponentially increase at little additional cost. Moore’s ten-year prediction for the doubling rate of the numbers of transistors on microchips held up, and has now, with small adjustments, guided investments and expectations in electronics for five decades.

Friedman makes an especially important point, saying:

“But let’s remember that it [Moore’s Law] was enabled by a group of remarkable scientists and engineers, in an America that did not just brag about being exceptional, but invested in the infrastructure and basic scientific research, and set the audacious goals, to make it so. If we want to create more Moore’s Law-like technologies, we need to invest in the building blocks that produced that America.”

These kinds of calls for investments in infrastructure and basic research, for new audacious goals, and for more Moore’s Law-like technologies are, of course, some of the primary and recurring themes of this blog and of presentations and publications of the last several years. For instance, Miller and O’Leary’s (2007) close study of how Moore’s Law has aligned and coordinated investments in the electronics industry has been extrapolated into the education context (Fisher, 2012; Fisher & Stenner, 2011).

Education already has had over 60 years’ experience with a close parallel to Moore’s Law in reading measurement. Stenner’s Law retrospectively predicts exactly the same doubling period for the growth, from 1960 to 2010, in the number of children’s reading ability measures made in a common (or equatable) unit with known uncertainty and personalized consistency indicators. Knowledge of this kind has enabled manufacturers, suppliers, marketers, customers, and other stakeholders in the electronics industry to plan five and ten years into the future, preparing products and markets to take advantage of increased power and speed at the same or lower cost. Similarly, that same kind of knowledge could be used in education, health care, social services, and natural resource management to define the rules, roles, and responsibilities of actors and institutions involved in literacy, health, community, and natural capital markets.
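
The planning value of a known doubling period comes from simple arithmetic, which can be sketched as follows. The sketch assumes steady exponential growth, and the counts used are purely hypothetical; they are not figures for transistors or for reading measures.

```python
import math


def doubling_period(count_start, count_end, years_elapsed):
    """Years needed for a quantity to double, assuming steady exponential growth."""
    return years_elapsed * math.log(2) / math.log(count_end / count_start)


def project(count_start, years_ahead, period):
    """Project a count forward under a fixed doubling period."""
    return count_start * 2 ** (years_ahead / period)


# Hypothetical counts: 1,000 units growing to 32,000 units over 10 years.
period = doubling_period(1_000, 32_000, 10)
print("doubling period (years):", round(period, 2))
print("projected count 5 years on:", round(project(32_000, 5, period)))
```

Once a doubling period has held steady for long enough to be trusted, a projection like the second function is what lets suppliers and customers coordinate plans years in advance.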

Reading instruction, for example, requires text complexities to be matched to reader abilities at a comprehension rate that challenges but does not discourage the reader. Uniform grade-level textbooks are often too easy for a third of a given classroom, and too hard for another third. Individualized instruction by teachers in classrooms of 25 and more students is too cumbersome to implement. Connecting classroom reading assessments with known text complexity measures informed by judicious teacher input sets the stage for the realization of new potentials in educational outcomes. Electronic resources tapping existing text complexity measures for millions of articles and books connect individual students’ high stakes and classroom assessments in a common instructional framework (Pearson, for instance, has one such offering). As the numbers of student reading measures made in a common unit continue to grow exponentially, capacities for connecting readers to texts, and for communicating about what works and what doesn’t in education, will grow as well.
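
When reader ability and text complexity are expressed in the same unit, the matching described above reduces to a small calculation. The sketch below assumes a dichotomous Rasch model with both measures in logits, a hypothetical target comprehension rate of 75%, and hypothetical cut-offs for "too hard" and "too easy". Operational reading scales use their own units and conventions, so this is an illustration of the logic only.

```python
import math


def expected_comprehension(reader, text):
    """Expected comprehension rate when reader and text share a logit scale."""
    return 1.0 / (1.0 + math.exp(text - reader))


def matched_text_complexity(reader, target_rate=0.75):
    """Text complexity at which the reader is expected to hit the target rate."""
    return reader - math.log(target_rate / (1.0 - target_rate))


def classify(reader, text, low=0.60, high=0.90):
    """Label a reader-text pairing using hypothetical comprehension cut-offs."""
    rate = expected_comprehension(reader, text)
    if rate < low:
        return "too hard"
    if rate > high:
        return "too easy"
    return "well matched"


# A hypothetical reader at 1.0 logits.
reader = 1.0
target_text = matched_text_complexity(reader)
print("matched text complexity:", round(target_text, 2))
print("grade-level text at 4.0 logits:", classify(reader, 4.0))
```

The same grade-level text can thus be classified differently for each student in a classroom, which is the situation the uniform textbook cannot address.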

This model is exactly the kind of infrastructure, basic scientific research, and audacious goal setting that’s needed if we are to succeed in creating more Moore’s Law-like technologies. If we as a society made the decision to invest deliberately, intentionally, and massively in infrastructure of this kind across education, health care, social services, and natural resource management, who knows what kinds of powerful results might be attained?

References

Fisher, W. P., Jr. (2012). Measure and manage: Intangible assets metric standards for sustainability. In J. Marques, S. Dhiman & S. Holt (Eds.), Business administration education: Changes in management and leadership strategies (pp. 43-63). New York: Palgrave Macmillan.

Fisher, W. P., Jr., & Stenner, A. J. (2011, August 31 to September 2). A technology roadmap for intangible assets metrology. In Fundamentals of measurement science. International Measurement Confederation (IMEKO) TC1-TC7-TC13 Joint Symposium, http://www.db-thueringen.de/servlets/DerivateServlet/Derivate-24493/ilm1-2011imeko-018.pdf, Jena, Germany.

Miller, P., & O’Leary, T. (2007, October/November). Mediating instruments and making markets: Capital budgeting, science and the economy. Accounting, Organizations, and Society, 32(7-8), 701-734.

Living Capital Metrics for Financial and Sustainability Accounting Standards

May 1, 2015

I was very happy a few days ago to come across Jane Gleeson-White’s new book, Six Capitals, or Can Accountants Save the Planet? Rethinking Capitalism for the 21st Century. The special value for me in this book comes in the form of an accessible update on what’s been going on in the world of financial accounting standards. Happily, there’s been a lot of activity (check out, for instance, Amato & White, 2013; Rogers & White, 2015). Less fortunately, the activity seems to be continuing to occur in the same measurement vacuum it always has, despite my efforts in this blog to broaden the conversation to include rigorous measurement theory and practice.

But to back up a bit, recent events around sustainability metric standards don’t seem to be connected to previous controversies around financial standards and economic modeling, which were more academically oriented to problems of defining and expressing value. Gleeson-White doesn’t cite any of the extensive literature in those areas (for instance, Anielski, 2007; Baxter, 1979; Economist, 2010; Ekins, 1992, 1999; Ekins, Dresner, & Dahlstrom, 2008; Ekins, Hillman, & Hutchins, 1992; Ekins & Voituriez, 2009; Fisher, 2009b, 2009c, 2011; Young & Williams, 2010). Valuation is still a problem, of course, as is the analogy between accounting standards and scientific standards (Baxter, 1979). But much of the sensitivity of the older academic debate over accounting standards seems to have been lost in the mad, though well-intentioned, rush to devise metrics for the traditionally externalized nontraditional forms of capital.

Before addressing the thousands of metrics in circulation and the science that needs to be brought to bear on them (the ongoing theme of posts in this blog), some attention to terminology is important. Gleeson-White refers to six capitals (manufactured, liquid, intellectual, human, social, and natural), in contrast with Ekins (1992; Ekins, et al., 2008), who describes four (manufactured, human, social, and natural). Gleeson-White’s liquid capital is cash money, which can be invested in capital (a means of producing value via ongoing services) and which can be extracted as a return on capital, but is not itself capital, as is shown by the repeated historical experience in many countries of printing money without stimulating economic growth and producing value. Of her remaining five forms of capital, intellectual capital is a form of social capital that can satisfactorily be categorized alongside the other forms of organization-level properties and systems involving credibility and trust.

On pages 209-227, Gleeson-White takes up questions relevant to the measurement and information quality topics of this blog. The context here is informed by the International Integrated Reporting Council’s (IIRC) December 2013 framework for accounting reports integrating all forms of capital (Amato & White, 2013), and by related efforts of the Sustainability Accounting Standards Board (SASB) (Rogers & White, 2015). Following the IIRC, Gleeson-White asserts that

“Not all the new capitals can be quantified, yet or perhaps ever–for example, intellectual, human and social capital, much of natural capital–and so integrated reports are not expected to provide quantitative measures of each of the capitals.”

Of course, this opinion flies in the face of established evidence and theory accepted by both metrologists (weights and measures standards engineers and physicists) and psychometricians as to the viability of rigorous measurement standards for the outcomes of education, health care, social services, natural resource management, etc. (Fisher, 2009b, 2011, 2012a, 2012b; Fisher & Stenner, 2011a, 2013, 2015; Fisher & Wilson, 2015; Mari & Wilson, 2013; Pendrill, 2014; Pendrill & Fisher, 2013, 2015; Wilson, 2013; Wilson, Mari, Maul, & Torres Irribarra, 2015). Pendrill (2014, p. 26), an engineer, physicist, and past president of the European Association of National Metrology Institutes, for instance, states that “The Rasch approach…is not simply a mathematical or statistical approach, but instead [is] a specifically metrological approach to human-based measurement.” As is repeatedly shown in this blog, access to scientific measures sets the stage for a dramatic transformation of the potential for succeeding in the goal of rethinking capitalism.

Next, Gleeson-White’s reference to several of the six capitals as the “living” capitals (p. 193) is a literal reference to the fact that human, social, and natural capital are all carried by people, organizations/communities, and ecosystems. The distinction between dead and living capital elaborated by De Soto (2000) and Fisher (2002, 2007, 2010b, 2011), which involves making any form of capital fungible by representing it in abstract forms negotiable in banks and courts of law, is not taken into account, though this would seem to be a basic requirement that must be fulfilled before the rethinking of capitalism could be said to have been accomplished.

Gleeson-White raises the pointed question as to exactly how integrated reporting is supposed to provoke positive growth in the nontraditional forms of capital. The concept of an economic framework integrating all forms of capital relative to the profit motive, as described in Ekins’ work, for instance, and as is elaborated elsewhere in this blog, seems just over the horizon, though repeated mention is made of natural capitalism (Hawken, Lovins, & Lovins, 1999). The posing of the questions provided by Gleeson-White (pp. 216-217) is priceless, however:

“…given integrated reporting’s purported promise to contribute to sustainable development by encouraging more efficient resource allocation, how might it actually achieve this for natural and social capitals on their own terms? It seems integrated reporting does nothing to address a larger question of resource allocation….”

“To me the fact that integrated reporting cannot address such questions suggests that as with the example of human capital, its promise to foster efficient resource allocation pertains only to financial capital and not to the other capitals. If we accept that the only way to save our societies and planet is to reconceive them in terms of capital, surely the efficient valuing and allocation of all six capitals must lie at the heart of any economics and accounting for the planet’s scarce resources in the twenty-first century.
“There is a logical inconsistency here: integrated reporting might be the beginning of a new accounting paradigm, but for the moment it is being practiced by an old-paradigm corporation: essentially, one obliged to make a return on financial capital at the cost of the other capitals.”

The goal requires all forms of capital to be integrated into the financial bottom line. Where accounting for manufactured capital alone burns living capital resources for profit, a comprehensive capital accounting framework defines profit in terms of reduced waste. This is a powerful basis for economics, as waste is the common root cause of human suffering, social discontent and environmental degradation (Hawken, Lovins, & Lovins, 1999).

Multiple bottom lines are counter-productive, as they allow managers the option of choosing which stakeholder group to satisfy, often at the expense of the financial viability of the firm (Jensen, 2001; Fisher, 2010a). Economic sustainability requires that profits be legally, morally, and scientifically contingent on a balance of powers distributed across all forms of capital. Though the devil will no doubt lurk in the details, there is increasing evidence that such a balance of powers can be negotiated.

A key point here not brought up by Gleeson-White concerns the fact that markets are not created by exchange activity, but rather by institutionalized rules, roles, and responsibilities (Miller & O’Leary, 2007) codified in laws, mores, technologies, and expectations. Translating historical market-making activities, as they have played out relative to manufactured capital, into the new domains of human, social, and natural capital faces a number of significant challenges, adapting to a new way of thinking about tests, assessments, and surveys foremost among them (Fisher & Stenner, 2011b).

One of the most important contributions advanced measurement theory and practice (Rasch, 1960; Wright, 1977; Andrich, 1988, 2004; Fisher & Wright, 1994; Wright & Stone, 1999; Bond & Fox, 2007; Wilson, 2005; Engelhard, 2012; Stenner, Fisher, Stone, & Burdick, 2013) can make to the process of rethinking capitalism involves the sorting out of the myriad metrics that have erupted in the last several years. Gleeson-White (p. 223) reports, for instance, that the Bloomberg financial information network now has over 750 ESG (Environmental, Social, Governance) data fields, which were extracted from reports provided by over 5,000 companies in 52 countries. Similarly, Rogers and White (2015) say that

“…today there are more than 100 organizations offering more than 400 corporate sustainability ratings products that assess some 50,000 companies on more than 8,000 metrics of environmental, social and governance (ESG) performance.”

As is also the case with the UN Millennium Development Goals (Fisher, 2011b), the typical use of these metrics as single-item “quantities” is based in counts of relevant events. This procedure misses the basic point that counts of concrete things in the world are not measures. Is it not obvious that I can have ten rocks to your two, and you can still have more rock than I do? The same thing applies to any kind of performance ratings, survey responses, or test scores. We assign the same numeric increase to every addition of one more count, but hardly anyone experimentally tests the hypothesis that the counts all work together to measure the same thing. Those who think there’s no need for precision science in this context are ignoring the decades of successful and widespread technical work in this area, at their own risk.
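
The point that one more count is not always the same amount can be illustrated with the log-odds transformation of a raw score. The sketch below is a deliberate simplification: it assumes the data fit a Rasch model and that all items are equally difficult, in which case the person estimate is the item difficulty plus the log-odds of the raw score. The 20-item test is hypothetical.

```python
import math


def score_to_logit(raw_score, max_score):
    """Log-odds of a raw score; undefined at the extremes 0 and max_score."""
    if not 0 < raw_score < max_score:
        raise ValueError("extreme scores have no finite logit")
    return math.log(raw_score / (max_score - raw_score))


# One additional count near the middle versus near the top of a 20-point scale.
max_score = 20
for low, high in ((10, 11), (18, 19)):
    gain = score_to_logit(high, max_score) - score_to_logit(low, max_score)
    print(f"{low} -> {high}: one more count = {gain:.2f} logits")
```

Under these assumptions, the step from 18 to 19 covers several times the distance of the step from 10 to 11, even though both add exactly one to the count.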

The repetition of history here is fascinating. As Ashworth (2004, p. 1,314) put it, historically, “The requirements of increased trade and the fiscal demands of the state fuelled the march toward a regular form of metrology.” For instance, in 1875 it was noted that “the existence of quantitative correlations between the various forms of energy, imposes upon men of science the duty of bringing all kinds of physical quantity to one common scale of comparison” (Everett, 1875, p. 9). The moral and economic value of common scales was recognized during the French revolution, when, Alder (2002, p. 32) documents, it was asked:

“Ought not a single nation have a uniform set of measures, just as a soldier fought for a single patrie? Had not the Revolution promised equality and fraternity, not just for France, but for all the people of the world? By the same token, should not all of the world’s people use a single set of weights and measures to encourage peaceable commerce, mutual understanding, and the exchange of knowledge? That was the purpose of measuring the world.”

The value of rigorously measuring human, social and natural capital includes the meaningful integration of qualitative substance with quantitative convenience, reduced data volume, the augmentation of measures with uncertainty and consistency indexes, and the capacity to take missing data into account (making possible instrument equating, item banking, etc.). In contrast with the usual methods, rigorous science demands that experiments determine which indicators cohere to measure the same thing by repeatedly giving the same values across samples, over time and space, and across subsets of indicators. Beyond such data-based results, advanced theory makes it possible to arrive at explanatory, predictive methods that add a whole new layer of efficiency to the generation of indicators (De Boeck & Wilson, 2004; Stenner et al., 2013).
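What it means for indicators to give "the same values across samples" can be sketched in a small simulation. This is an illustration under stated assumptions, not a description of any cited study: the data are simulated from a dichotomous Rasch model, the three indicator difficulties and the two sample distributions are hypothetical, and the simple pairwise comparison used here stands in for the fuller estimation methods found in the measurement literature.

```python
import math
import random

def simulate_responses(abilities, difficulties, rng):
    """Generate 0/1 responses from the dichotomous Rasch model."""
    data = []
    for theta in abilities:
        row = []
        for d in difficulties:
            p = 1.0 / (1.0 + math.exp(-(theta - d)))
            row.append(1 if rng.random() < p else 0)
        data.append(row)
    return data

def pairwise_difference(data, i, j):
    """Estimate difficulty[j] - difficulty[i] using only the persons
    who succeed on exactly one of the two indicators."""
    i_only = sum(1 for row in data if row[i] == 1 and row[j] == 0)
    j_only = sum(1 for row in data if row[i] == 0 and row[j] == 1)
    return math.log(i_only / j_only)

rng = random.Random(2015)  # fixed seed so the sketch is repeatable
difficulties = [-1.0, 0.0, 1.0]  # hypothetical calibrations, in logits

# Two hypothetical samples that differ markedly in overall level
low_sample = [rng.gauss(-1.0, 1.0) for _ in range(20000)]
high_sample = [rng.gauss(1.0, 1.0) for _ in range(20000)]

low_data = simulate_responses(low_sample, difficulties, rng)
high_data = simulate_responses(high_sample, difficulties, rng)

# The true difference between indicators 0 and 1 is 1.0 logit
est_low = pairwise_difference(low_data, 0, 1)
est_high = pairwise_difference(high_data, 0, 1)
print(round(est_low, 2), round(est_high, 2))
```

When data fit the model, both samples should return roughly the same difference between the two indicators, even though one sample sits well above the other. Real data that fail such a check signal that the indicators are not working together to measure the same thing.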

Finally, Gleeson-White (pp. 220-221) reports that “In July 2011, the SASB [Sustainability Accounting Standards Board] was launched in the United States to create standardized measures for the new capitals.” “Founded by environmental engineer and sustainability expert Jean Rogers in San Francisco, SASB is creating a full set of industry-specific standards for sustainability accounting, with the aim of making this information more consistent and comparable.” As of May 2014, the SASB vice chair is Mary Schapiro, former SEC chair, and the chairman of SASB is Michael Bloomberg, former mayor of NYC and founder of the Bloomberg financial information empire. The “SASB is developing nonfinancial standards for eighty-nine industries grouped in ten different sectors and aims to have completed this grueling task by February 2015. It is releasing each set of metrics as they are completed.”

Like the SASB and other groups, Gleeson-White (p. 222) reports, Bloomberg

“aims to use its metrics to start ‘standardizing the discourse around sustainability, so we’re all talking about the same things in the same way,’ as Bloomberg’s senior sustainability strategist Andrew Park put it. What companies ‘desperately want,’ he says, is ‘a legitimate voice’ to tell them: ‘This is what you need to do. You exist in this particular sector. Here are the metrics that you need to be reporting out on. So SASB will provide that. And we think that’s important, because that will help clean up the metrics that ultimately the finance community will start using.’
“Bloomberg wants to price environmental, social and governance externalities to legitimize them in the eyes of financial capital.”

Gleeson-White (p. 225) continues, saying

“Bloomberg wants to do more generally what Trucost did for Puma’s natural capital inputs: create standardized measures for the new capitals–such as ecosystem services and social impacts–so that this information can be aggregated and used by investors. Park and Ravenel call the failure to value clean air, water, stable coastlines and other environmental goods ‘as much a failure to measure as it is a market failure per se–one that could be addressed in part by providing these ‘unpriced’ resources with quantitative parameters that would enable their incorporation into market mechanisms. Such mechanisms could then appropriately ‘regulate’ the consumption of those resources.'”

Integrating well-measured living capitals into the context of appropriately configured institutional rules, roles, and responsibilities for efficient markets (Fisher, 2010b) should indeed involve a capacity to price these resources quantitatively, though this capacity alone would likely prove insufficient to the task of creating the markets (Miller & O’Leary, 2007; Williamson, 1981, 1991, 2005). Rasch’s (1960, pp. 110-115) deliberate patterning of his measurement models on the form of Maxwell’s analysis of Newton’s Second Law provides a mathematical basis for connecting psychometrics with both geometry and natural laws, as well as with the law of supply and demand (Fisher, 2010c, 2015; Fisher & Stenner, 2013a).
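The structural parallel can be written out briefly. The notation below is an assumption chosen for illustration and may not match Rasch's own symbols: v indexes the object or person, i indexes the instrument or item, and the second line is the familiar dichotomous form of the model.

```latex
% Newton's Second Law: acceleration as the ratio of force to mass,
% which becomes additive on taking logarithms
a_{vi} = \frac{F_i}{M_v}
\quad\Longleftrightarrow\quad
\ln a_{vi} = \ln F_i - \ln M_v

% Dichotomous Rasch model: the odds of success as the ratio of
% person ability to item difficulty, additive in log-odds (logits)
\frac{P_{vi}}{1 - P_{vi}} = \frac{\xi_v}{\delta_i}
\quad\Longleftrightarrow\quad
\ln\frac{P_{vi}}{1 - P_{vi}} = \beta_v - d_i,
\qquad \beta_v = \ln \xi_v,\quad d_i = \ln \delta_i
```

In both cases an observable outcome is modeled as the ratio of two separable parameters, so that comparisons among objects do not depend on which instrument is used, and vice versa.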

This perspective on measurement is informed by an unmodern or amodern, post-positivist philosophy (Dewey, 2012; Latour, 1990, 1993), as opposed to a modern and positivist, or postmodern and anti-positivist, philosophy (Galison, 1997). The essential difference is that neither a universalist nor a relativist perspective is necessary to the adoption of practices of traceability to metrological standards. Rather, focusing on local, situated, human relationships, as described by Wilson (2004) in education, for instance, offers a way of resolving the false dilemma of that dichotomous contrast. As Golinski (2012, p. 35) puts it, “Practices of translation, replication, and metrology have taken the place of the universality that used to be assumed as an attribute of singular science.” Haraway (1996, pp. 439-440) harmonizes, saying “…embedded relationality is the prophylaxis for both relativism and transcendance.” Latour (2005, pp. 228-229) elaborates, saying:

“Standards and metrology solve practically the question of relativity that seems to intimidate so many people: Can we obtain some sort of universal agreement? Of course we can! Provided you find a way to hook up your local instrument to one of the many metrological chains whose material network can be fully described, and whose cost can be fully determined. Provided there is also no interruption, no break, no gap, and no uncertainty along any point of the transmission. Indeed, traceability is precisely what the whole of metrology is about! No discontinuity allowed, which is just what ANT [Actor Network Theory] needs for tracing social topography. Ours is the social theory that has taken metrology as the paramount example of what it is to expand locally everywhere, all while bypassing the local as well as the universal. The practical conditions for the expansion of universality have been opened to empirical inquiries. It’s not by accident that so much work has been done by historians of science into the situated and material extension of universals. Given how much modernizers have invested into universality, this is no small feat.
“As soon as you take the example of scientific metrology and standardization as your benchmark to follow the circulation of universals, you can do the same operation for other less traceable, less materialized circulations: most coordination among agents is achieved through the dissemination of quasi-standards.”

As Rasch (1980, p. xx) understood, “this is a huge challenge, but once the problem has been formulated it does seem possible to meet it.” Though some metrologically informed traceability networks have begun to emerge in education and health care (for instance, Fisher & Stenner, 2013, 2015; Stenner & Fisher, 2013), virtually everything remains to be done to make the coordination across stakeholders as fully elaborated as the standards in the natural sciences.

References

Alder, K. (2002). The measure of all things: The seven-year odyssey and hidden error that transformed the world. New York: The Free Press.

Amato, N., & White, S. (2013, December 7). IIRC releases International Integrated Reporting Framework. Journal of Accountancy. Retrieved from http://www.journalofaccountancy.com/news/2013/dec/20139207.html

Andrich, D. (1988). Rasch models for measurement (Sage University Paper Series on Quantitative Applications in the Social Sciences, series no. 07-068). Beverly Hills, California: Sage Publications.

Andrich, D. (2004, January). Controversy and the Rasch model: A characteristic of incompatible paradigms? Medical Care, 42(1), I-7–I-16.

Andrich, D. (2010). Sufficiency and conditional estimation of person parameters in the polytomous Rasch model. Psychometrika, 75(2), 292-308.

Anielski, M. (2007). The economics of happiness: Building genuine wealth. Gabriola, British Columbia: New Society Publishers.

Ashworth, W. J. (2004, 19 November). Metrology and the state: Science, revenue, and commerce. Science, 306(5700), 1314-1317.

Baxter, W. T. (1979). Accounting standards: Boon or curse? In The Emmanuel Saxe distinguished lectures in accounting. http://newman.baruch.cuny.edu/digital/saxe/saxe_1978/baxter_79.htm.

Bond, T., & Fox, C. (2007). Applying the Rasch model: Fundamental measurement in the human sciences, 2nd edition. Mahwah, New Jersey: Lawrence Erlbaum Associates.

De Boeck, P., & Wilson, M. (Eds.). (2004). Explanatory item response models: A generalized linear and nonlinear approach. (Statistics for Social and Behavioral Sciences). New York: Springer-Verlag.

De Soto, H. (2000). The mystery of capital: Why capitalism triumphs in the West and fails everywhere else. New York: Basic Books.

Dewey, J. (2012). Unmodern philosophy and modern philosophy (P. Deen, Ed.). Carbondale, Illinois: Southern Illinois University Press.

Editorial. (2010, 10 June). Accounting standards: To FASB or not to FASB? The Economist, http://www.economist.com/node/16319655.

Ekins, P. (1992). A four-capital model of wealth creation. In P. Ekins & M. Max-Neef (Eds.), Real-life economics: Understanding wealth creation (pp. 147-155). London: Routledge.

Ekins, P. (1999). Economic growth and environmental sustainability: The prospects for green growth. New York: Routledge.

Ekins, P., Dresner, S., & Dahlstrom, K. (2008, March/April). The four-capital method of sustainable development evaluation. European Environment, 18(2), 63-80.

Ekins, P., Hillman, M., & Hutchison, R. (1992). The Gaia atlas of green economics (Foreword by Robert Heilbroner). New York: Anchor Books.

Ekins, P., & Voituriez, T. (2009). Trade, globalization and sustainability impact assessment: A critical look at methods and outcomes. London, England: Earthscan Publications Ltd.

Engelhard, G., Jr. (2012). Invariant measurement: Using Rasch models in the social, behavioral, and health sciences. New York: Routledge Academic.

Everett, J. D. (1875). Illustrations of the C. G. S. system of units. London, England: Taylor & Francis.

Fisher, W. P., Jr. (2002, Spring). “The Mystery of Capital” and the human sciences. Rasch Measurement Transactions, 15(4), 854 [http://www.rasch.org/rmt/rmt154j.htm].

Fisher, W. P., Jr. (2007, Summer). Living capital metrics. Rasch Measurement Transactions, 21(1), 1092-1093 [http://www.rasch.org/rmt/rmt211.pdf].

Fisher, W. P., Jr. (2009a, November 19). Draft legislation on development and adoption of an intangible assets metric system. Retrieved 6 January 2011, from https://livingcapitalmetrics.wordpress.com/2009/11/19/draft-legislation/

Fisher, W. P., Jr. (2009b, November). Invariance and traceability for measures of human, social, and natural capital: Theory and application. Measurement, 42(9), 1278-1287.

Fisher, W. P., Jr. (2009c). NIST Critical national need idea White Paper: metrological infrastructure for human, social, and natural capital (Tech. Rep. No. http://www.nist.gov/tip/wp/pswp/upload/202_metrological_infrastructure_for_human_social_natural.pdf). Washington, DC: National Institute of Standards and Technology.

Fisher, W. P., Jr. (2010a, 22 November). Meaningfulness, measurement, value seeking, and the corporate objective function: An introduction to new possibilities. LivingCapitalMetrics.com, Sausalito, California. Retrieved from http://ssrn.com/abstract=1713467

Fisher, W. P., Jr. (2010b). Measurement, reduced transaction costs, and the ethics of efficient markets for human, social, and natural capital, Bridge to Business Postdoctoral Certification, Freeman School of Business, Tulane University (http://ssrn.com/abstract=2340674).

Fisher, W. P., Jr. (2010c). The standard model in the history of the natural sciences, econometrics, and the social sciences. Journal of Physics: Conference Series, 238(1), http://iopscience.iop.org/1742-6596/238/1/012016/pdf/1742-6596_238_1_012016.pdf.

Fisher, W. P., Jr. (2011a). Bringing human, social, and natural capital to life: Practical consequences and opportunities. In N. Brown, B. Duckor, K. Draney & M. Wilson (Eds.), Advances in Rasch Measurement, Vol. 2 (pp. 1-27). Maple Grove, MN: JAM Press.

Fisher, W. P., Jr. (2011b). Measuring genuine progress by scaling economic indicators to think global & act local: An example from the UN Millennium Development Goals project. LivingCapitalMetrics.com. Retrieved 18 January 2011, from Social Science Research Network: http://ssrn.com/abstract=1739386.

Fisher, W. P., Jr. (2012a). Measure and manage: Intangible assets metric standards for sustainability. In J. Marques, S. Dhiman & S. Holt (Eds.), Business administration education: Changes in management and leadership strategies (pp. 43-63). New York: Palgrave Macmillan.

Fisher, W. P., Jr. (2012b, May/June). What the world needs now: A bold plan for new standards [Third place, 2011 NIST/SES World Standards Day paper competition]. Standards Engineering, 64(3), 1 & 3-5 [http://ssrn.com/abstract=2083975].

Fisher, W. P., Jr. (2015). A Rasch perspective on the law of supply and demand. Rasch Measurement Transactions, in press.

Fisher, W. P., Jr., Harvey, R. F., & Kilgore, K. M. (1995). New developments in functional assessment: Probabilistic models for gold standards. NeuroRehabilitation, 5(1), 3-25.

Fisher, W. P., Jr., Harvey, R. F., Taylor, P., Kilgore, K. M., & Kelly, C. K. (1995, February). Rehabits: A common language of functional assessment. Archives of Physical Medicine and Rehabilitation, 76(2), 113-122.

Fisher, W. P., Jr., & Stenner, A. J. (2011a, January). Metrology for the social, behavioral, and economic sciences (Social, Behavioral, and Economic Sciences White Paper Series). Retrieved 12 January 2014, from National Science Foundation: http://www.nsf.gov/sbe/sbe_2020/submission_detail.cfm?upld_id=36.

Fisher, W. P., Jr., & Stenner, A. J. (2011b, August 31 to September 2). A technology roadmap for intangible assets metrology. In Fundamentals of measurement science. International Measurement Confederation (IMEKO) TC1-TC7-TC13 Joint Symposium, http://www.db-thueringen.de/servlets/DerivateServlet/Derivate-24493/ilm1-2011imeko-018.pdf, Jena, Germany.

Fisher, W. P., Jr., & Stenner, A. J. (2013a). On the potential for improved measurement in the human and social sciences. In Q. Zhang & H. Yang (Eds.), Pacific Rim Objective Measurement Symposium 2012 Conference Proceedings (pp. 1-11). Berlin, Germany: Springer-Verlag.

Fisher, W. P., Jr., & Stenner, A. J. (2013b). Overcoming the invisibility of metrology: A reading measurement network for education and the social sciences. Journal of Physics: Conference Series, 459(012024), http://iopscience.iop.org/1742-6596/459/1/012024.

Fisher, W. P., Jr., & Stenner, A. J. (2015). The role of metrology in mobilizing and mediating the language and culture of scientific facts. Journal of Physics: Conference Series, 588(012043).

Fisher, W. P., Jr., & Stenner, A. J. (2015). Theory-based metrological traceability in education: A reading measurement network. Measurement, in review.

Fisher, W. P., Jr., & Wilson, M. (2015). Building a productive trading zone in educational assessment research and practice. Pensamiento Educativo, in review.

Fisher, W. P., Jr., & Wright, B. D. (1994). Introduction to probabilistic conjoint measurement theory and applications (W. P. Fisher, Jr., & B. D. Wright, Eds.) [Special issue]. International Journal of Educational Research, 21(6), 559-568.

Galison, P. (1997). Image and logic: A material culture of microphysics. Chicago: University of Chicago Press.

Gleeson-White, J. (2015). Six capitals, or can accountants save the planet? Rethinking capitalism for the 21st century. New York: Norton.

Golinski, J. (2012). Is it time to forget science? Reflections on singular science and its history. Osiris, 27(1), 19-36.

Haraway, D. J. (1996). Modest witness: Feminist diffractions in science studies. In P. Galison & D. J. Stump (Eds.), The disunity of science: Boundaries, contexts, and power (pp. 428-441). Stanford, California: Stanford University Press.

Hawken, P., Lovins, A., & Lovins, H. L. (1999). Natural capitalism: Creating the next industrial revolution. New York: Little, Brown, and Co.

Jensen, M. C. (2001, Fall). Value maximization, stakeholder theory, and the corporate objective function. Journal of Applied Corporate Finance, 14(3), 8-21.

Latour, B. (1990). Postmodern? No, simply amodern: Steps towards an anthropology of science. Studies in History and Philosophy of Science, 21(1), 145-171.

Latour, B. (1993). We have never been modern. Cambridge, Massachusetts: Harvard University Press.

Latour, B. (2005). Reassembling the social: An introduction to Actor-Network-Theory. (Clarendon Lectures in Management Studies). Oxford, England: Oxford University Press.

Mari, L., & Wilson, M. (2013). A gentle introduction to Rasch measurement models for metrologists. Journal of Physics: Conference Series, 459(1), http://iopscience.iop.org/1742-6596/459/1/012002/pdf/1742-6596_459_1_012002.pdf.

Miller, P., & O’Leary, T. (2007, October/November). Mediating instruments and making markets: Capital budgeting, science and the economy. Accounting, Organizations, and Society, 32(7-8), 701-734.

Pendrill, L. (2014, December). Man as a measurement instrument [Special Feature]. NCSLI Measure: The Journal of Measurement Science, 9(4), 22-33.

Pendrill, L., & Fisher, W. P., Jr. (2013). Quantifying human response: Linking metrological and psychometric characterisations of man as a measurement instrument. Journal of Physics: Conference Series, 459, http://iopscience.iop.org/1742-6596/459/1/012057.

Pendrill, L., & Fisher, W. P., Jr. (2015). Counting and quantification: Comparing psychometric and metrological perspectives on visual perceptions of number. Measurement, in press. doi: http://dx.doi.org/10.1016/j.measurement.2015.04.010.

Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests (Reprint, with Foreword and Afterword by B. D. Wright, Chicago: University of Chicago Press, 1980). Copenhagen, Denmark: Danmarks Paedagogiske Institut.

Rogers, J., & White, A. (2015, April 28). Focusing corporate sustainability ratings on what matters. Huffington Post. Retrieved from http://www.huffingtonpost.com/jean-rogers/focusing-corporate-sustai_b_7156148.html.

Stenner, A. J., & Fisher, W. P., Jr. (2013). Metrological traceability in the social sciences: A model from reading measurement. Journal of Physics: Conference Series, 459(012025), http://iopscience.iop.org/1742-6596/459/1/012025.

Stenner, A. J., Fisher, W. P., Jr., Stone, M. H., & Burdick, D. S. (2013, August). Causal Rasch models. Frontiers in Psychology: Quantitative Psychology and Measurement, 4(536), 1-14 [doi: 10.3389/fpsyg.2013.00536].

Williamson, O. E. (1981, November). The economics of organization: The transaction cost approach. The American Journal of Sociology, 87(3), 548-577.

Williamson, O. E. (1991). Economic institutions: Spontaneous and intentional governance [Special issue]. Journal of Law, Economics, & Organization: Papers from the Conference on the New Science of Organization, 7, 159-187.

Williamson, O. E. (2005). The economics of governance. American Economic Review, 95(2), 1-18.

Wilson, M. (Ed.). (2004). Towards coherence between classroom assessment and accountability (National Society for the Study of Education Yearbooks, Vol. 103, Part II). Chicago, Illinois: University of Chicago Press.

Wilson, M. (2005). Constructing measures: An item response modeling approach. Mahwah, New Jersey: Lawrence Erlbaum Associates.

Wilson, M. R. (2013). Using the concept of a measurement system to characterize measurement models used in psychometrics. Measurement, 46, 3766-3774.

Wilson, M., Mari, L., Maul, A., & Torres Irribarra, D. (2015). A comparison of measurement concepts across physical science and social science domains: Instrument design, calibration, and measurement. Journal of Physics: Conference Series, 588(012034), http://iopscience.iop.org/1742-6596/588/1/012034.

Wright, B. D. (1977). Solving measurement problems with the Rasch model. Journal of Educational Measurement, 14(2), 97-116 [http://www.rasch.org/memo42.htm].

Wright, B. D. (1999). Fundamental measurement for psychology. In S. E. Embretson & S. L. Hershberger (Eds.), The new rules of measurement: What every educator and psychologist should know (pp. 65-104 [http://www.rasch.org/memo64.htm]). Hillsdale, New Jersey: Lawrence Erlbaum Associates.

Wright, B. D., & Stone, M. H. (1999). Measurement essentials. Wilmington, DE: Wide Range, Inc. [http://www.rasch.org/measess/me-all.pdf].

Young, J. J., & Williams, P. F. (2010, August). Sorting and comparing: Standard-setting and “ethical” categories. Critical Perspectives on Accounting, 21(6), 509-521.