Archive for February, 2011

Measurement, Metrology, and the Birth of Self-Organizing, Complex Adaptive Systems

February 28, 2011

On page 145 of his book, The Mathematics of Measurement: A Critical History, John Roche quotes Charles de La Condamine (1701-1774), who, in 1747, wrote:

‘It is quite evident that the diversity of weights and measures of different countries, and frequently in the same province, are a source of embarrassment in commerce, in the study of physics, in history, and even in politics itself; the unknown names of foreign measures, the laziness or difficulty in relating them to our own give rise to confusion in our ideas and leave us in ignorance of facts which could be useful to us.’

Roche (1998, p. 145) then explains what de La Condamine is driving at, saying:

“For reasons of international communication and of civic justice, for reasons of stability over time and for accuracy and reliability, the creation of exact, reproducible and well maintained international standards, especially of length and mass, became an increasing concern of the natural philosophers of the seventeenth and eighteenth centuries. This movement, cooperating with a corresponding impulse in governing circles for the reform of weights and measures for the benefit of society and trade, culminated in late eighteenth century France in the metric system. It established not only an exact, rational and international system of measuring length, area, volume and mass, but introduced a similar standard for temperature within the scientific community. It stimulated a wider concern within science to establish all scientific units with equal rigour, basing them wherever possible on the newly established metric units (and on the older exact units of time and angular measurement), because of their accuracy, stability and international availability. This process gradually brought about a profound change in the notation and interpretation of the mathematical formalism of physics: it brought about, for the first time in the history of the mathematical sciences, a true union of mathematics and measurement.”

As it was in the seventeenth and eighteenth centuries for physics, so it has also been in the twentieth and twenty-first for the psychosocial sciences. The creation of exact, reproducible and well maintained international standards is a matter of increasing concern today for the roles such standards will play in education, health care, the workplace, business intelligence, and the economy at large.

As the economic crises persist and perhaps worsen, demand for common product definitions and for interpretable, meaningful measures of impacts and outcomes in education, health care, social services, environmental management, etc. will reach a crescendo. We need an exact, rational and international system of measuring literacy, numeracy, health, motivations, quality of life, community cohesion, and environmental quality, and we needed it fifty years ago. We need to reinvigorate and revive a wider concern across the sciences to establish all scientific units with equal rigor, and to have all measures used in research and practice based wherever possible on consensus standard metrics valued for their accuracy, stability and availability. We need to replicate in the psychosocial sciences the profound change in the notation and interpretation of the mathematical formalism of physics that occurred in the eighteenth and nineteenth centuries. We need to extend the true union of mathematics and measurement from physics to the psychosocial sciences.

Previous posts in this blog speak to the persistent invariance and objectivity exhibited by many of the constructs measured using ability tests, attitude surveys, performance assessments, etc. A question previously raised in this blog concerning the reproductive logic of living meaning deserves more attention, and can be productively explored in terms of complex adaptive functionality.

In a hierarchy of reasons why mathematically rigorous measurement is valuable, few are closer to the top of the list than facilitating the spontaneous self-organization of networks of agents and actors (Latour, 1987). The conception, gestation, birthing, and nurturing of complex adaptive systems constitute a reproductive logic for sociocultural traditions. Scientific traditions, in particular, form mature self-identities via a mutually implied subject-object relation absorbed into the flow of a dialectical give and take, just as economic systems do.

Complex adaptive systems establish the reproductive viability of their offspring and the coherence of an ecological web of meaningful relationships by means of this dialectic. Taylor (2003, pp. 166-8) describes the five moments in the formation and operation of complex adaptive systems, which must be able

  • to identify regularities and patterns in the flow of matter, energy, and information (MEI) in the environment (business, social, economic, natural, etc.);
  • to produce condensed schematic representations of these regularities so they can be identified as the same if they are repeated;
  • to form reproductively interchangeable variants of these representations;
  • to succeed reproductively by means of the accuracy and reliability of the representations’ predictions of regularities in the MEI data flow; and
  • to adaptively modify and reorganize representations by means of informational feedback from the environment.

All living systems, from bacteria and viruses to plants and animals to languages and cultures, are complex adaptive systems characterized by these five features.

In the history of science, technologically embodied measurement has facilitated complex adaptive systems of various kinds. That history can be used as a basis for a meta-theoretical perspective on what measurement must look like in the social and human sciences. Each of Taylor’s five moments in the formation and operation of complex adaptive systems describes a capacity of measurement systems, in that:

  • data flow regularities are captured in initial, provisional instrument calibrations;
  • condensed local schematic representations are formed when an instrument’s calibrations are anchored at repeatedly observed, invariant values;
  • interchangeable nonlocal versions of these invariances are created by means of instrument equating, item banking, metrological networks, and selective, tailored, adaptive instrument administration;
  • measures read off inaccurate and unreliable instruments will not support successful reproduction of the data flow regularity, but accurate and reliable instruments calibrated in a shared common unit provide a reference standard metric that enhances communication and reproduces the common voice and shared identity of the research community; and
  • consistently inconsistent anomalous observations provide feedback suggesting new possibilities for as yet unrecognized data flow regularities that might be captured in new calibrations.

Measurement in the social sciences is in the process of extending this functionality into practical applications in business, education, health care, government, and elsewhere. Over the course of the last 50 years, measurement research and practice have already iterated many times through these five moments. In the coming years, a new critical mass will be reached in this process, systematically bringing about order-of-magnitude improvements in the efficiency of intangible assets markets.

How? What does a “data flow regularity” look like? How is it condensed into a schematic and used to calibrate an instrument? How are local schematics combined in a pattern used to recognize new instances of themselves? More specifically, how might enterprise resource planning (ERP) software (such as SAP, Oracle, or PeopleSoft) simultaneously provide both the structure needed to support meaningful comparisons and the flexibility needed for good fit with the dynamic complexity of adaptive and generative self-organizing systems?

Prior work in this area proposes a dual-core, loosely coupled organization using ERP software to build social and intellectual capital, instead of using it as an IT solution addressing organizational inefficiencies (Lengnick-Hall, Lengnick-Hall, & Abdinnour-Helm, 2004). The adaptive and generative functionality (Stenner & Stone, 2003) provided by probabilistic measurement models (Rasch, 1960; Andrich, 2002, 2004; Bond & Fox, 2007; Wilson, 2005; Wright, 1977, 1999) makes it possible to model intra- and inter-organizational interoperability (Weichhart, Feiner, & Stary, 2010) at the same time that social and intellectual capital resources are augmented.
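For readers unfamiliar with the models cited in the preceding paragraph, the simplest member of the family, Rasch’s model for dichotomous responses, can be stated in one line: the probability that person n succeeds on item i depends only on the difference between the person’s measure B_n and the item’s calibration D_i, both expressed in log-odds units (logits):

Pr{X_ni = 1} = exp(B_n - D_i) / [1 + exp(B_n - D_i)]

When data fit this model, comparisons between persons do not depend on which particular items happen to be used, and comparisons between items do not depend on which particular persons happen to respond. That separability of parameters is what underwrites the calibration, equating, item banking, and adaptive administration capacities listed above.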

Actor/agent network theory has emerged from social and historical studies of the shared and competing moral, economic, political, and mathematical values disseminated by scientists and technicians in a variety of successful and failed areas of research (Latour, 2005). The resulting sociohistorical descriptions ought to be translated into a practical program for reproducing successful research programs. A metasystem for complex adaptive systems of research is implied in what Roche (1998) calls a “true union of mathematics and measurement.”

Complex adaptive systems are effectively constituted of such a union, even if, in nature, the mathematical character of the data flows and calibrations remains virtual. Probabilistic conjoint models for fundamental measurement are poised to extend this functionality into the human sciences. Though few, if any, have framed the situation in these terms, these and other questions are being explored, explicitly and implicitly, by hundreds of researchers in dozens of fields as they employ unidimensional models for measurement in their investigations.

If so, might we then be on the verge of yet another new reading and writing of Galileo’s “book of nature,” this time restoring the “loss of meaning for life” suffered in Galileo’s “fateful omission” of the means by which nature came to be understood mathematically (Husserl, 1970)? The elements of a comprehensive, mathematical, and experimental design science of living systems appear poised to provide a saturated solution—or better, a nonequilibrium thermodynamic solution—to some of the infamous shortcomings of modern, Enlightenment science. The unity of science may yet be a reality, though not via the reductionist program envisioned by the positivists.

Some 50 years ago, Marshall McLuhan popularized the expression, “The medium is the message.” The special value of quantitative measurement in the history of science does not stem from the mere use of number. Instruments are media on which nature, human or other, inscribes legible messages. A renewal of the true union of mathematics and measurement in the context of intangible assets will lead to a new cultural, scientific, and economic renaissance. As Thomas Kuhn (1977, p. 221) wrote,

“The full and intimate quantification of any science is a consummation devoutly to be wished. Nevertheless, it is not a consummation that can effectively be sought by measuring. As in individual development, so in the scientific group, maturity comes most surely to those who know how to wait.”

Given that we have strong indications of how full and intimate quantification consummates a true union of mathematics and measurement, the time for waiting is now past, and the time to act has come. See prior blog posts here for suggestions on an Intangible Assets Metric System, for resources on methods and research, for other philosophical ruminations, and more. This post is based on work presented at Rasch meetings several years ago (Fisher, 2006a, 2006b).

References

Andrich, D. (2002). Understanding resistance to the data-model relationship in Rasch’s paradigm: A reflection for the next generation. Journal of Applied Measurement, 3(3), 325-59.

Andrich, D. (2004, January). Controversy and the Rasch model: A characteristic of incompatible paradigms? Medical Care, 42(1), I-7–I-16.

Bond, T., & Fox, C. (2007). Applying the Rasch model: Fundamental measurement in the human sciences, 2d edition. Mahwah, New Jersey: Lawrence Erlbaum Associates.

Fisher, W. P., Jr. (2006a, Friday, April 28). Complex adaptive functionality via measurement. Presented at the Midwest Objective Measurement Seminar, M. Lunz (Organizer), University of Illinois at Chicago.

Fisher, W. P., Jr. (2006b, June 27-9). Measurement and complex adaptive functionality. Presented at the Pacific Rim Objective Measurement Symposium, T. Bond & M. Wu (Organizers), The Hong Kong Institute of Education, Hong Kong.

Husserl, E. (1970). The crisis of European sciences and transcendental phenomenology: An introduction to phenomenological philosophy (D. Carr, Trans.). Evanston, Illinois: Northwestern University Press (Original work published 1954).

Kuhn, T. S. (1977). The function of measurement in modern physical science. In T. S. Kuhn, The essential tension: Selected studies in scientific tradition and change (pp. 178-224). Chicago: University of Chicago Press. (Reprinted from Isis, 1961, 52(168), 161-193.)

Latour, B. (1987). Science in action: How to follow scientists and engineers through society. New York: Cambridge University Press.

Latour, B. (2005). Reassembling the social: An introduction to actor-network-theory. (Clarendon Lectures in Management Studies). Oxford, England: Oxford University Press.

Lengnick-Hall, C. A., Lengnick-Hall, M. L., & Abdinnour-Helm, S. (2004). The role of social and intellectual capital in achieving competitive advantage through enterprise resource planning (ERP) systems. Journal of Engineering Technology Management, 21, 307-330.

Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests (Reprint, with Foreword and Afterword by B. D. Wright, Chicago: University of Chicago Press, 1980). Copenhagen, Denmark: Danmarks Paedogogiske Institut.

Roche, J. (1998). The mathematics of measurement: A critical history. London: The Athlone Press.

Stenner, A. J., & Stone, M. (2003). Item specification vs. item banking. Rasch Measurement Transactions, 17(3), 929-30 [http://www.rasch.org/rmt/rmt173a.htm].

Taylor, M. C. (2003). The moment of complexity: Emerging network culture. Chicago: University of Chicago Press.

Weichhart, G., Feiner, T., & Stary, C. (2010). Implementing organisational interoperability–The SUddEN approach. Computers in Industry, 61, 152-160.

Wilson, M. (2005). Constructing measures: An item response modeling approach. Mahwah, New Jersey: Lawrence Erlbaum Associates.

Wright, B. D. (1977). Solving measurement problems with the Rasch model. Journal of Educational Measurement, 14(2), 97-116 [http://www.rasch.org/memo42.htm].

Wright, B. D. (1997, Winter). A history of social science measurement. Educational Measurement: Issues and Practice, 16(4), 33-45, 52 [http://www.rasch.org/memo62.htm].

Wright, B. D. (1999). Fundamental measurement for psychology. In S. E. Embretson & S. L. Hershberger (Eds.), The new rules of measurement: What every educator and psychologist should know (pp. 65-104 [http://www.rasch.org/memo64.htm]). Hillsdale, New Jersey: Lawrence Erlbaum Associates.

Creative Commons License
LivingCapitalMetrics Blog by William P. Fisher, Jr., Ph.D. is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Based on a work at livingcapitalmetrics.wordpress.com.
Permissions beyond the scope of this license may be available at http://www.livingcapitalmetrics.com.

A Technology Road Map for Efficient Intangible Assets Markets

February 24, 2011

Scientific technologies, instruments and conceptual images have been found to play vitally important roles in economic success because of the way they enable accurate predictions of future industry and market states (Miller & O’Leary, 2007). The technology road map for the microprocessor industry, based in Moore’s Law, has successfully guided market expectations and coordinated research investment decisions for over 40 years. When the earlier electromechanical, relay, vacuum tube, and transistor computing technology paradigms are included, the same trajectory has dominated the computer industry for over 100 years (Kurzweil, 2005, pp. 66-67).

We need a similar technology road map to guide the creation and development of intangible asset markets for human, social, and natural (HSN) capital. This will involve intensive research on what the primary constructs are, on what is and is not measurable, and on creating consensus standards for uniform metrics and the metrology networks through which those standards will function. Alignments with these developments will require comprehensively integrated economic models, accounting frameworks, and investment platforms, in addition to specific applications deploying the capital formations.

What I’m proposing is, in a sense, just an extension in a new direction of the metrology challenges and issues summarized in Table ITWG15 on page 48 in the 2010 update to the International Technology Roadmap for Semiconductors (http://www.itrs.net/about.html). Distributed electronic communication facilitated by computers and the Internet is well on the way to creating a globally uniform instantaneous information network. But much of what needs to be communicated through this network remains expressed in locally defined languages that lack common points of reference. Meaningful connectivity demands a shared language.

To those who say we already have the technology necessary and sufficient for the measurement and management of human, social, and natural capital, I say think again. The difference between what we have and what we need is the difference between (a) an economy whose capital resources are not represented in transferable documents like titles and deeds, and whose trade is denominated in a flood of money circulating in different currencies, and (b) an economy whose capital resources are represented in transferable documents and are traded using a single currency with a restricted money supply. The measurement of intangible assets today is akin to the former economy, with little actual living capital and hundreds of incommensurable instruments and scoring systems, when what we need is the latter. (See previous entries in this blog for more on the difference between dead and living capital.)

Given the model of a road map detailing the significant features of the living capital terrain, industry-specific variations will inform the development of explicit market expectations, the alignment of HSN capital budgeting decisions, and the coordination of research investments. The concept of a technology road map for HSN capital is based in and expands on an integration of hierarchical complexity (Commons & Richards, 2002; Dawson, 2004), complex adaptive functionality (Taylor, 2003), Peirce’s semiotic developmental map of creative thought (Wright, 1999), and historical stages in the development of measuring systems (Stenner & Horabin, 1992; Stenner, Burdick, Sanford, & Burdick, 2006).

Technology road maps replace organizational amnesia with organizational learning by providing the structure of a memory that not only stores information, knowledge, understanding, and wisdom, but makes it available for use in new situations. Othman and Hashim (2004) describe organizational amnesia (OA) relative to organizational learning (OL) in a way that opens the door to a rich application of Miller and O’Leary’s (2007) detailed account of how technology road maps contribute to the creation of new markets and industries. Technology road maps function as the higher organizational principles needed for transforming individual and social expertise into economically useful products and services. Organizational learning and adaptability further need to be framed at the inter-organizational level where their various dimensions or facets are aligned not only within individual organizations but between them within the industry as a whole.

The mediation of the individual and organizational levels, and of the organizational and inter-organizational levels, is facilitated by measurement. In the microprocessor industry, Moore’s Law enabled the creation of technology road maps charting the structure, processes, and outcomes that had to be aligned at the individual, organizational, and inter-organizational levels to coordinate the entire microprocessor industry’s economic success. Such road maps need to be created for each major form of human, social, and natural capital, with the associated alignments and coordinations put in play at all levels of every firm, industry, and government.

It is a basic fact of contemporary life that the technologies we employ every day are so complex that hardly anyone understands how they do what they do. Technological miracles are commonplace events, from transportation to entertainment, from health care to manufacturing. And we usually suffer little in the way of adverse consequences from not knowing how an automatic transmission, a thermometer, or digital video reproduction works. It is enough to know how to use the tool.

This passive acceptance of technical details beyond our ken extends into areas in which standards, methods, and products are much less well defined. Managers, executives, researchers, teachers, clinicians, and others who need measurement but who are unaware of its technicalities are then put in the position of being passive consumers accepting the lowest common denominator in the quality of the services and products obtained.

And that’s not all. Just as the mass market of measurement consumers is typically passive and uninformed, in complementary fashion the supply side is fragmented and contentious. There is little agreement among measurement experts as to which quantitative methods set the standard as the state of the art. Virtually any method can be justified in terms of some body of research and practice, so the confused consumer accepts whatever is easily available or is most likely to support a preconceived agenda.

It may be possible, however, to separate the measurement wheat from the chaff. For instance, measurement consumers may value a way of distinguishing among methods that is based in a simple criterion of meaningful utility. What if all measurement consumers’ own interests in, and reasons for, measuring something in particular, such as literacy or community, were emphasized and embodied in a common framework? What if a path of small steps from currently popular methods of less value to more scientific ones of more value could be mapped? Such a continuum of methods could range from those doing the least to advance the users’ business interests to those doing the most to advance those interests.

The aesthetics, simplicity, meaningfulness, rigor, and practical consequences of strong theoretical requirements for instrument calibration provide such criteria for choices as to models and methods (Andrich, 2002, 2004; Busemeyer & Wang, 2000; Myung, 2000; Pitt, Kim, & Myung, 2003; Wright, 1997, 1999). These criteria could be used to develop and guide explicit considerations of data quality, construct theory, instrument calibration, quantitative comparisons, measurement standard metrics, etc. along a continuum from the most passive and least objective to the most actively involved and most objective.

The passive approach to measurement typically starts from and prioritizes content validity. The questions asked on tests, surveys, and assessments are considered relevant primarily on the basis of the words they use and the concepts they appear to address. Evidence that the questions actually cohere together and measure the same thing is not needed. If there is any awareness of the existence of axiomatically prescribed measurement requirements, these are not considered to be essential. That is, if failures of invariance are observed, they usually provoke a turn to less stringent data treatments instead of a push to remove or prevent them. Little or no measurement or construct theory is implemented, meaning that all results remain dependent on local samples of items and people. Passively approaching measurement in this way is then encumbered by the need for repeated data gathering and analysis, and by the local dependency of the results. Researchers working in this mode are akin to the woodcutters who say they are too busy cutting trees to sharpen their saws.

An alternative, active approach to measurement starts from and prioritizes construct validity and the satisfaction of the axiomatic measurement requirements. Failures of invariance provoke further questioning, and there is significant practical use of measurement and construct theory. Results are then independent of local samples, sometimes to the point that researchers and practical applications are not encumbered with usual test- or survey-based data gathering and analysis.
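To make the contrast concrete, here is a minimal sketch (in Python, with simulated data; not code from any of the works cited here) of the kind of invariance check an active approach treats as routine: the same items are calibrated separately in two subsamples, and items whose difficulty estimates drift by more than a chosen tolerance are flagged for the further questioning described above.

```python
# A rough invariance check: calibrate item difficulties separately in two
# subsamples and flag items whose estimates drift. Difficulties are estimated
# with a crude log-odds-of-failure approximation; production work would use
# full Rasch estimation (JMLE, CMLE, etc.) and formal fit statistics.
import numpy as np

def rough_difficulties(responses):
    """responses: a persons-by-items array of 0/1 scores."""
    p = responses.mean(axis=0)           # proportion correct for each item
    p = np.clip(p, 0.01, 0.99)           # guard against infinite logits
    d = np.log((1 - p) / p)              # harder items -> higher values
    return d - d.mean()                  # center the scale at zero

def flag_drift(sample_a, sample_b, tolerance=0.5):
    """Return indices of items whose difficulty shifts by more than `tolerance` logits."""
    drift = rough_difficulties(sample_a) - rough_difficulties(sample_b)
    return [i for i, shift in enumerate(drift) if abs(shift) > tolerance]

# Simulated example: two subsamples of 200 persons answer the same 10 items.
rng = np.random.default_rng(0)
true_d = np.linspace(-2, 2, 10)

def simulate(n_persons, ability_mean):
    theta = rng.normal(ability_mean, 1.0, size=(n_persons, 1))
    prob = 1 / (1 + np.exp(-(theta - true_d)))
    return (rng.random((n_persons, 10)) < prob).astype(int)

print(flag_drift(simulate(200, 0.0), simulate(200, 0.5)))  # typically prints []
```

The estimation shortcut here is deliberately crude; the point is the logic of the check, in which a failure of invariance is treated as a finding to be explained rather than a nuisance to be averaged away.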

As is often the case, this black-and-white portrayal tells far from the whole story. There are multiple shades of grey in the contrast between passive and active approaches to measurement. The actual range of implementations is much more diverse than the simple binary contrast would suggest (see the previous post in this blog for a description of a hierarchy of increasingly complex stages in measurement). Spelling out the variation that exists could be helpful for making deliberate, conscious choices and decisions in measurement practice.

It is inevitable that we would start from the materials we have at hand, and that we would then move through a hierarchy of increasing efficiency and predictive control as understanding of any given variable grows. Previous considerations of the problem have offered different categorizations for the transformations characterizing development on this continuum. Stenner and Horabin (1992) distinguish between 1) impressionistic and qualitative, nominal gradations found in the earliest conceptualizations of temperature, 2) local, data-based quantitative measures of temperature, and 3) generalized, universally uniform, theory-based quantitative measures of temperature.

The latter is prized for the way that thermodynamic theory enables the calibration of individual thermometers with no need for testing each one in empirical studies of its performance. Theory makes it possible to know in advance what the results of such tests would be with enough precision to greatly reduce the burden and expenses of instrument calibration.

Reflecting on the history of psychosocial measurement in this context makes it apparent that these three stages can be further broken down. The previous post in this blog lists the distinguishing features for each of six stages in the evolution of measurement systems, building on the five stages described by Stenner, Burdick, Sanford, and Burdick (2006).

And so what analogue of Moore’s Law might be projected? What kind of timetable can be envisioned for the unfolding of what might be called Stenner’s Law? Guidance for reasonable expectations is found in Kurzweil’s (2005) charting of historical and projected future exponential increases in the volume of information and computer processing speed. The accelerating growth in knowledge taking place in the world today speaks directly to a systematic integration of criteria for what shall count as meaningful new learning. Maps of the roads we’re traveling will provide some needed guidance and make the trip more enjoyable, efficient, and productive. Perhaps somewhere not far down the road we’ll be able to project doubling rates for growth in the volume of fungible literacy capital globally, or halving rates in the cost of health capital stocks. We manage what we measure, so when we begin measuring well what we want to manage well, we’ll all be better off.
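As a purely arithmetical illustration of what projecting a doubling rate involves (the 7% annual growth figure below is invented for the example, not an estimate drawn from Kurzweil or anyone else), the doubling time under steady exponential growth follows directly from the growth law.

```python
# Doubling time under steady exponential growth: t = ln(2) / ln(1 + r).
# The 7% annual growth rate is purely illustrative, not an empirical estimate.
import math

def doubling_time(annual_growth_rate):
    return math.log(2) / math.log(1 + annual_growth_rate)

print(round(doubling_time(0.07), 1))  # about 10.2 years
```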

References

Andrich, D. (2002). Understanding resistance to the data-model relationship in Rasch’s paradigm: A reflection for the next generation. Journal of Applied Measurement, 3(3), 325-59.

Andrich, D. (2004, January). Controversy and the Rasch model: A characteristic of incompatible paradigms? Medical Care, 42(1), I-7–I-16.

Busemeyer, J. R., & Wang, Y.-M. (2000, March). Model comparisons and model selections based on generalization criterion methodology. Journal of Mathematical Psychology, 44(1), 171-189 [http://quantrm2.psy.ohio-state.edu/injae/jmpsp.htm].

Commons, M. L., & Richards, F. A. (2002, July). Organizing components into combinations: How stage transition works. Journal of Adult Development, 9(3), 159-177.

Dawson, T. L. (2004, April). Assessing intellectual development: Three approaches, one sequence. Journal of Adult Development, 11(2), 71-85.

Kurzweil, R. (2005). The singularity is near: When humans transcend biology. New York: Viking Penguin.

Miller, P., & O’Leary, T. (2007, October/November). Mediating instruments and making markets: Capital budgeting, science and the economy. Accounting, Organizations, and Society, 32(7-8), 701-34.

Myung, I. J. (2000). Importance of complexity in model selection. Journal of Mathematical Psychology, 44(1), 190-204.

Othman, R., & Hashim, N. A. (2004). Typologizing organizational amnesia. The Learning Organization, 11(3), 273-84.

Pitt, M. A., Kim, W., & Myung, I. J. (2003). Flexibility versus generalizability in model selection. Psychonomic Bulletin & Review, 10, 29-44.

Stenner, A. J., Burdick, H., Sanford, E. E., & Burdick, D. S. (2006). How accurate are Lexile text measures? Journal of Applied Measurement, 7(3), 307-22.

Stenner, A. J., & Horabin, I. (1992). Three stages of construct definition. Rasch Measurement Transactions, 6(3), 229 [http://www.rasch.org/rmt/rmt63b.htm].

Taylor, M. C. (2003). The moment of complexity: Emerging network culture. Chicago: University of Chicago Press.

Wright, B. D. (1997, Winter). A history of social science measurement. Educational Measurement: Issues and Practice, 16(4), 33-45, 52 [http://www.rasch.org/memo62.htm].

Wright, B. D. (1999). Fundamental measurement for psychology. In S. E. Embretson & S. L. Hershberger (Eds.), The new rules of measurement: What every educator and psychologist should know (pp. 65-104 [http://www.rasch.org/memo64.htm]). Hillsdale, New Jersey: Lawrence Erlbaum Associates.


Stages in the Development of Meaningful, Efficient, and Useful Measures

February 21, 2011

In all learning, we use what we already know as a means of identifying what we do not yet know. When someone can read a written language, knows an alphabet and has a vocabulary, understands grammar and syntax, then that knowledge can be used to learn about the world. Then, knowing what birds are, for instance, one might learn about different kinds of birds or the typical behaviors of one bird species.

And so with measurement, we start from where we find ourselves, as with anything else. There is no need or possibility for everyone to master all the technical details of every different area of life that’s important. But it is essential that we know what is technically possible, so that we can seek out and find the tools that help us achieve our goals. We can’t get what we can’t or don’t ask for. In the domain of measurement, it seems that hardly anyone is looking for what’s actually readily available.

So it seems pertinent to offer a description of a continuum of increasingly meaningful, efficient and useful ways of measuring. Previous considerations of the problem have offered different categorizations for the transformations characterizing development on this continuum. Stenner and Horabin (1992) distinguish between 1) impressionistic and qualitative, nominal gradations found in the earliest conceptualizations of temperature, 2) local, data-based quantitative measures of temperature, and 3) generalized, universally uniform, theory-based quantitative measures of temperature.

Theory-based temperature measurement is prized for the way that thermodynamic theory enables the calibration of individual thermometers with no need for testing each one in empirical studies of its performance. As Lewin (1951, p. 169) put it, “There is nothing so practical as a good theory.” Thus we have electromagnetic theory making it possible to know the conduction and resistance characteristics of electrical cable from the properties of the metal alloys and insulators used, with no need to test more than a small fraction of that cable as a quality check.

Theory makes it possible to know in advance what the results of such tests would be with enough precision to greatly reduce the burden and expenses of instrument calibration. There likely would be no electrical industry at all if the properties of every centimeter of cable and every appliance had to be experimentally tested. This principle has been employed in measuring human, social, and natural capital for some time, but, for a variety of reasons, it has not yet been adopted on a wide scale.
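To see what knowing from theory buys in the electrical example: the resistance of a uniform conductor follows from its resistivity and geometry alone, R = ρL/A, so a cable’s resistance can be stated before any test is run. A small illustrative calculation with textbook values (the cable dimensions are an arbitrary example, not data from any cited source):

```python
# Predicting cable resistance from theory: R = resistivity * length / area.
# The resistivity is a standard textbook constant; the geometry is an
# arbitrary example chosen for illustration.
import math

RHO_COPPER = 1.68e-8            # ohm-meters, copper at about 20 C
length_m = 100.0                # 100 meters of cable
diameter_m = 2.0e-3             # 2 millimeter conductor
area_m2 = math.pi * (diameter_m / 2) ** 2

resistance_ohms = RHO_COPPER * length_m / area_m2
print(round(resistance_ohms, 3))  # about 0.535 ohms
```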

Reflecting on the history of psychosocial measurement in this context makes it apparent that Stenner and Horabin’s (1992) three stages can be further broken down. Listed below are the distinguishing features for each of six stages in the evolution of measurement systems, building on the five stages described by Stenner, Burdick, Sanford, and Burdick (2006). This progression of increasing complexity, meaning, efficiency, and utility can be used as a basis for a technology roadmap that will enable the coordination and alignment of various services and products in the domain of intangible assets, as I will take up in a forthcoming post.

Stage 1. Least meaning, utility, efficiency, and value

Purely passive, receptive

Statistics describe data: What you see is what you get

Content defines measure

Additivity, invariance, etc. not tested, so numbers do not stand for something that adds up like they do

Measurement defined statistically in terms of group-level intervariable relations

Meaning of numbers changes with questions asked and persons answering

No theory

Data must be gathered and analyzed to have results

Commercial applications are instrument-dependent

Standards based in ensuring fair methods and processes

Stage 2

Slightly less passive, receptive but still descriptively oriented

Additivity, invariance, etc. tested, so numbers might stand for something that adds up like they do

Measurement still defined statistically in terms of group-level intervariable relations

Falsification of additive hypothesis effectively derails measurement effort

Descriptive models with interaction effects accepted as viable alternatives

Typically little or no attention to theory of item hierarchy and construct definition

Empirical (data-based) calibrations only

Data must be gathered and analyzed to have results

Initial awareness of measurement theory

Commercial applications are instrument-dependent

Standards based in ensuring fair methods and processes

Stage 3

Even less purely passive & receptive, more active

Instrument still designed relative to content specifications

Additivity, invariance, etc. tested, so numbers might stand for something that adds up like they do

Falsification of additive hypothesis provokes questions as to why

Descriptive models with interaction effects not accepted as viable alternatives

Measurement defined prescriptively in terms of individual-level intravariable invariance

Significant attention to theory of item hierarchy and construct definition

Empirical calibrations only

Data has to be gathered and analyzed to have results

More significant use of measurement theory in prescribing acceptable data quality

Limited construct theory (no predictive power)

Commercial applications are instrument-dependent

Standards based in ensuring fair methods and processes

Stage 4

First stage that is more active than passive

Initial efforts to (re-)design instrument relative to construct specifications and theory

Additivity, invariance, etc. tested in thoroughly prescriptive focus on calibrating instrument

Numbers not accepted unless they stand for something that adds up like they do

Falsification of additive hypothesis provokes questions as to why and corrective action

Models with interaction effects not accepted as viable alternatives

Measurement defined prescriptively in terms of individual-level intravariable invariance

Significant attention to theory of item hierarchy and construct definition relative to instrument design

Empirical calibrations only but model prescribes data quality

Data usually has to be gathered and analyzed to have results

Point-of-use self-scoring forms might provide immediate measurement results to the end user

Some construct theory (limited predictive power)

Some commercial applications are not instrument-dependent (as in computerized adaptive testing, or CAT, item bank implementations)

Standards based in ensuring fair methods and processes

Stage 5

Significantly active approach to measurement

Item hierarchy translated into construct theory

Construct specification equation predicts item difficulties

Theory-predicted (not empirical) calibrations used in applications

Item banks superseded by single-use items created on the fly

Calibrations checked against empirical results but data gathering and analysis not necessary

Point-of-use self-scoring forms or computer apps provide immediate measurement results to the end user

Used routinely in commercial applications

Awareness that standards might be based in metrological traceability to consensus standard uniform metric

Stage 6. Most meaning, utility, efficiency, and value

Most purely active approach to measurement

Item hierarchy translated into construct theory

Construct specification equation predicts item ensemble difficulties (see the sketch following this list)

Theory-predicted calibrations enable single-use items created from context

Checked against empirical results for quality assessment but data gathering and analysis not necessary

Point-of-use self-scoring forms or computer apps provide immediate measurement results to the end user

Used routinely in commercial applications

Standards based in metrological traceability to consensus standard uniform metric
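To make the phrase “construct specification equation” in Stages 5 and 6 less abstract, here is a toy sketch in the spirit of the Lexile approach described by Stenner and colleagues, in which theoretically chosen text features predict item calibrations. The features and coefficients below are invented placeholders for illustration only; they are not the published Lexile equation or anyone’s estimates.

```python
# Toy construct specification equation: predict an item's calibration (in logits)
# from theoretically motivated features of its text. The coefficients are
# invented placeholders, not values from any published study.

def predicted_difficulty(mean_sentence_length, mean_log_word_frequency):
    # Longer sentences are taken to make a passage harder; more frequent
    # (more familiar) words are taken to make it easier.
    return 0.12 * mean_sentence_length - 0.85 * mean_log_word_frequency + 1.5

# With a specification equation in hand, a brand-new passage can be calibrated
# immediately, without first administering it to a norming sample.
print(round(predicted_difficulty(mean_sentence_length=18.0,
                                 mean_log_word_frequency=3.2), 2))  # 0.94
```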

 

References

Lewin, K. (1951). Field theory in social science: Selected theoretical papers (D. Cartwright, Ed.). New York: Harper & Row.

Stenner, A. J., Burdick, H., Sanford, E. E., & Burdick, D. S. (2006). How accurate are Lexile text measures? Journal of Applied Measurement, 7(3), 307-22.

Stenner, A. J., & Horabin, I. (1992). Three stages of construct definition. Rasch Measurement Transactions, 6(3), 229 [http://www.rasch.org/rmt/rmt63b.htm].


Simple ideas, complex possibilities, elegant and beautiful results

February 11, 2011

Possibilities of great subtlety, elegance, and power can follow from the simplest ideas. Leonardo da Vinci is often credited with offering a variation on this theme, but the basic idea is much older. Philosophy, for instance, began with Plato’s distinction between name and concept. This realization that words are not the things they stand for has informed and structured each of several scientific revolutions.

How so? It all begins from the reasons why Plato required his students to have studied geometry. He knew that those familiar with the Pythagorean theorem would understand the difference between any given triangle and the mathematical relationships it represents. No right triangle ever definitively embodies a perfect realization of the assertion that the square of the hypotenuse equals the sum of the squares of the other two sides. The mathematical definition or concept of a triangle is not the same thing as any actual triangle.

The subtlety and power of this distinction became apparent in its repeated application throughout the history of science. In a sense, astronomy is a geometry of the heavens, Newton’s laws are a geometry of gravity, Ohm’s law is a geometry of electromagnetism, and relativity is a geometry of the invariance of mass and energy in relation to the speed of light. Rasch models present a means to geometries of literacy, numeracy, health, trust, and environmental quality.

We are still witnessing the truth, however partial, of Whitehead’s assertion that the entire history of Western culture is a footnote to Plato. As Husserl put it, we’re still struggling with the possibility of creating a geometry of experience, a phenomenology that is not a mere description of data but that achieves a science of living meaning. The work presented in other posts here attests to a basis for optimism that this quest will be fruitful.

Build it and they will come

February 8, 2011

“It” in the popular Kevin Costner movie, “Field of Dreams,” was a baseball diamond. He put it in a corn field. Not only did a ghost team conjure itself from the corn, so did a line of headlights on the road. There would seem to have been a stunning lack of preparation for crowds of fans, as parking, food, and toilet facilities were nowhere in sight.

Those things would be taken care of in due course, but that’s another story. The point has nothing to do with being realistic and everything to do with making dreams come true. Believing in yourself and your dreams is hard. Dreams are inherently unrealistic. As George Bernard Shaw said, reasonable people adapt to life and the world. It’s unreasonable people who think the world should adapt to them. And, accordingly, change comes about only because unreasonable and unrealistic people act to make things different.

I dream of a playing field, too. I can’t just go clear a few acres in a field to build it, though. The kind of clearing I’m dreaming of is more abstract. But the same idea applies. I, too, am certain that, if we build it, they will come.

What is it? Who are they? “It” is a better way for each of us to represent who we are to the world, and to see where we stand in it. It is a new language for speaking the truth of what we are each capable of. It is a way of tuning the instruments of a new science that will enable us to harmonize relationships of all kinds: personal, occupational, social, and economic.

Which brings us to who “they” are. They are us. Humanity. We are the players on this field that we will clear. We are the ones who care and who desire meaning. We are the ones who have been robbed of the trust, loyalty, and commitment we’ve invested in governments, corporations, and decades of failed institutions. We are the ones who know what has been lost, and what yet could still be gained. We are the ones who possess our individual skills, motivations, and health, but yet have no easy, transparent way to represent how much of any one of them we have, what quality it is, or how much it can be traded for. We are the ones who all share in the bounty of the earth’s fecund capacity for self-renewal, but who among us can show exactly how much the work we do every day adds or subtracts from the quality of the environment?

So why do I say, build it and they will come? Because this sort of thing is not something that can be created piecemeal. What if Costner’s character in the movie had not just built the field but had instead tried to find venture capital, recruit his dream team, set up a ticket sales vendor, hire management and staff, order uniforms and equipment, etc.? It never would have happened. It doesn’t work that way.

And so, finally, just what do we need to build? Just this: a new metric system. The task is to construct a system of measures for managing what’s most important in life: our relationships, our health, our capacity for productive and creative employment. We need a system that enables us to track our investments in intangible assets like education, health care, community, and quality of life. We need instruments tuned to the same scales, ones that take advantage of recently developed technical capacities for qualitatively meaningful quantification; for information synthesis across indicators/items/questions; for networked, collective thinking; for adaptive innovation support; and for creating fungible currencies in which human, social, and natural capital can be traded in efficient markets.

But this is not a system that can be built piecemeal. Infrastructure on this scale is too complex and too costly for any single individual, firm, or industry to create by itself. And building one part of it at a time will not work. We need to create the environment in which these new forms of life, these new species, these new markets for living capital, can take root and grow, organically. If we create that environment, with incentives and rewards capable of functioning like fertile soil, warm sun, and replenishing rain, it will be impossible to stop the growth.

You see, there are thousands of people around the world using new measurement methods to calibrate tests, surveys and assessments as valid and reliable instruments. But they are operating in an environment in which the fully viable seeds they have to plant are wasted. There’s no place for them to take root. There’s no sun, no water.

Why is the environment for the meaningful, uniform measurement of intangible assets so inhospitable? The primary answer to this question is cultural. We have ingrained and highly counterproductive attitudes toward what are often supposed to be the inherent properties of numbers. One very common and important attitude of this kind is the assumption that all numbers are quantitative. But lots of scoring systems and percentage reporting schemes involve numbers that do not stand for something that adds up. There is nothing automatic or simple about the way any given unit of calibrated measurement remains the same all up and down a scale. Arriving at a way to construct and maintain such a unit requires as much intensive research and imaginative investigation in the social sciences as it does in the natural sciences. But where the natural sciences and engineering have grown up around a focus on meaningful measurement, the social sciences have not.
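A small, invented numerical example of what numbers that do not add up means in practice: on a ten-item test, going from 5 to 6 items correct and going from 8 to 9 items correct are both gains of “one point,” but expressed as log-odds, the scale on which Rasch measures are linear, the second gain is roughly twice as large. (The exact conversion depends on the item calibrations; the point here is only the nonlinearity.)

```python
# Equal raw-score gains do not represent equal amounts of the ability being
# measured. Converting proportions correct to log-odds shows how the raw-score
# scale is compressed toward its extremes. Illustrative arithmetic only.
import math

def logit(proportion_correct):
    return math.log(proportion_correct / (1 - proportion_correct))

for before, after in [(5, 6), (8, 9)]:
    gain = logit(after / 10) - logit(before / 10)
    print(f"{before} -> {after} correct: gain of {gain:.2f} logits")
# 5 -> 6 correct: gain of 0.41 logits
# 8 -> 9 correct: gain of 0.81 logits
```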

One result of mistaken preconceptions about number is that even when tests, surveys, and assessments measure the same thing, they are disconnected from one another, tuned to different scales. There is no natural environment, no shared ecology, in which the growth of learning can take place in field-wide terms. There’s no common language in which to share what’s been learned. Even when research results are exactly the same, they look different.

But if there were a system of consensus-based reference standard metrics, one for each major construct–reading, writing, and math abilities; health status; physical and psychosocial functioning; quality of life; social and natural capital–there would be the expectation that instruments measuring the same thing should measure in the same unit. Researchers could be contributing to building larger systems when they calibrate new instruments and recalibrate old ones. They would more obviously be adding to the stock of human knowledge, understanding, and wisdom. Divergent results would demand explanations, and convergent ones would give us more confidence as we move forward.
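One sketch of the linking work this would involve, with all numbers invented for the example: when two instruments intended to measure the same construct share a set of common items, the average difference between their calibrations of those items estimates the constant needed to express one instrument’s measures in the other’s frame of reference, and from there in a consensus reference metric.

```python
# Common-item linking under the Rasch model: the mean difference between two
# instruments' calibrations of their shared items estimates the shift that
# places Instrument B's measures on Instrument A's scale. All values invented.
shared_items_a = {"item_03": -0.40, "item_07": 0.10, "item_12": 0.95}   # logits, A's scale
shared_items_b = {"item_03": -0.75, "item_07": -0.30, "item_12": 0.60}  # same items, B's scale

shift = sum(shared_items_a[k] - shared_items_b[k] for k in shared_items_a) / len(shared_items_a)

def b_measure_on_a_scale(measure_on_b):
    return measure_on_b + shift

print(round(shift, 2))                        # 0.37
print(round(b_measure_on_a_scale(1.10), 2))   # a 1.10 on B reads as about 1.47 on A
```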

Most importantly, quality improvement and consumer purchasing decisions and behaviors would be fluidly coordinated with no need for communicating and negotiating the details of each individual comparison. Education and health care lack common product definitions because their outcomes are measured in fragmented, incommensurable metrics. But if we had consensus-based reference standard metrics for every major form of capital employed in the economy, we could develop reasonable expectations expressed in a common language for how much change should typically be obtained in fifth-grade mathematics or from a hip replacement.

As is well-known in the business world, innovation is highly dependent on standards. We cannot empower the front line with the authority to make changes when decisions have to be based on information that is unavailable or impossible to interpret. Most of the previous entries in this blog take up various aspects of this situation.

All of this demands a very different way of thinking about what’s possible in the realm of measurement. The issues are complex. They are usually presented in difficult mathematical terms within specialized research reports. But the biggest problem has to do with thinking laterally, with moving ideas out of the vertical hierarchies of the silos where they are trapped and into a new field we can dream in. And the first seeds to be planted in such a field are the ones that say the dream is worth dreaming. When we hear that message, we are already on the way not just to building this dream, but to creating a world in which everyone can dream and envision more specific possibilities for their lives, their families, their creativity.

