Archive for the ‘metrology’ Category

Project Drawdown, Climate Change, and Measurement: Towards More Intentionally Designed Approaches to Sustainable Change

December 23, 2022

It is not just safe to say—it must be shouted from the rooftops—that without closer attention to measurement, nothing will come of the admirable and essential efforts being made by Project Drawdown as it strives to accelerate the deployment of climate solutions, the development of new leadership, and shifts in the overall conversation from doom and gloom to opportunity and possibility.

It is intensely painful to watch well-intentioned, smart, and caring people act out transparently hollow and ineffective rhetorical moves, operationalized in ways that are guaranteed to fail. Project Drawdown, like virtually every other effort aimed at addressing climate change and sustainable solutions, from the United Nations Agenda 2030 and Sustainable Development Goals (Fisher, et al., 2019; Lips da Cruz, et al., 2019; Fisher & Wilson, 2019) to the Carbon Disclosure Project (Fisher, Melin, & Möller, 2021, 2022), seeks to create sustainable change without ecologizing knowledge infrastructures. Failing to make use of longstanding and highly advantageous principles and methods of measurement and metrology can only lead to disappointing results.

My previous publications on this theme (Fisher, 2009, 2011, 2012a/b, 2020a/b, 2021a/b, 2023; Fisher, et al., 2019, 2021; Lips da Cruz, et al., 2019; etc.) are now joined by a more pointed contrast (Fisher, 2022) showing how confusing numbers with quantities must result in failed sustainable change efforts.

That is, interpreted in the context of Project Drawdown, the new article asks, in effect,

  • How might dramatically accelerated progress, akin to that seen in technological development over the last 20, 50, or 200 years, become possible if efficient markets for human, social, and natural capital were created (Fisher, 2009, 2011, 2012a/b, 2020a, 2021b)?
  • How can “science-based priorities for climate action—across sectors, timescales, and geographies” be set so as “to make more rapid and efficient progress” if no attention is paid to creating meaningful metrics read from instruments deployed in distributed networks and traceable to consensus standard units?
  • How can any reasonable basis for expecting so-called “science-based priorities” to make a difference that matters be substantiated if instruments are not carefully designed to measure higher-order sustainability constructs, as opposed to merely tracking physical volumes of carbon and other greenhouse gases?
  • How can credible plans for “more rapid and efficient progress” be formulated if the constructs measured are not demonstrated, in theory and practice, to exhibit the same structural invariance across sectors, timescales, and geographies?
  • How can systems recognizing that “everyone has a vital part to play in achieving” Project Drawdown’s goals be created if everyone everywhere is not connected in global metrology systems that design, calibrate, and distribute tools for custom-tailored, personalized, legally owned, and financially accountable sustainable change measurement and management (Fisher, 2012b)?
  • How can “changemakers—business leaders, investors, philanthropists, development officials, and more” be informed and supplied “with science-derived strategies to ensure climate solutions scale as quickly and equitably as possible” if systematic approaches to creating metrologically sophisticated participatory social ecologies (Fisher, 2021a; Fisher & Stenner, 2018; Morrison & Fisher, 2018-2023) are not underway?
  • How can universal involvement in making the needed investments and reaping the desired rewards be facilitated without maps of the measured constructs telling individuals, groups, and communities where they stand now in relation to where they were, where they want to be, and what to do next, with clear indications of the exceptions to the rule needing close attention in every unique local circumstance (Black, et al., 2011; Fisher, 2013; Fisher & Stenner, 2023)?

Moving faster to address the urgent challenges of our time is not primarily a matter of finding and applying the willpower and resources needed to do the job. The desire, will, and resources already exist in abundance. As I explain in several previous posts here, what we lack are institutions and systems envisioned, planned, skilled, resourced, and incentivized to harness the power we possess. My new Acta IMEKO article (Fisher, 2022) contrasts today’s way of imagining and approaching sustainable change with a new way that shifts the focus to a broader vision informed by an ecologizing approach to devising sociocognitive infrastructures (Fisher, 2021a; Fisher & Stenner, 2018).

If it were easy to communicate how to shift a paradigm, one might have to wonder how truly paradigmatic the proposed change really is. Though it often feels like nothing but screaming into a hurricane, there is really nothing else to do but persist in spelling out these issues the best I can…

References

Black, P., Wilson, M., & Yao, S. (2011). Road maps for learning: A guide to the navigation of learning progressions. Measurement: Interdisciplinary Research and Perspectives, 9, 1-52.

Fisher, W. P., Jr. (2009, November). Invariance and traceability for measures of human, social, and natural capital: Theory and application. Measurement, 42(9), 1278-1287.

Fisher, W. P., Jr. (2011). Bringing human, social, and natural capital to life: Practical consequences and opportunities. Journal of Applied Measurement, 12(1), 49-66.

Fisher, W. P., Jr. (2012a). Measure and manage: Intangible assets metric standards for sustainability. In J. Marques, S. Dhiman & S. Holt (Eds.), Business administration education: Changes in management and leadership strategies (pp. 43-63). Palgrave Macmillan.

Fisher, W. P., Jr. (2012b, June 1). What the world needs now: A bold plan for new standards [Third place, 2011 NIST/SES World Standards Day paper competition]. Standards Engineering, 64(3), 1 & 3-5 [http://ssrn.com/abstract=2083975].

Fisher, W. P., Jr. (2013). Imagining education tailored to assessment as, for, and of learning: Theory, standards, and quality improvement. Assessment and Learning, 2, 6-22.

Fisher, W. P., Jr. (2020a). Contextualizing sustainable development metric standards: Imagining new entrepreneurial possibilities. Sustainability, 12(9661), 1-22. https://doi.org/10.3390/su12229661

Fisher, W. P., Jr. (2020b). Measuring genuine progress: An example from the UN Millennium Development Goals project. Journal of Applied Measurement, 21(1), 110-133.

Fisher, W. P., Jr. (2021a). Bateson and Wright on number and quantity: How to not separate thinking from its relational context. Symmetry, 13(1415). https://doi.org/10.3390/sym13081415

Fisher, W. P., Jr. (2021b). Separation theorems in econometrics and psychometrics: Rasch, Frisch, two Fishers, and implications for measurement. Journal of Interdisciplinary Economics, 35(1), 29-60. https://journals.sagepub.com/doi/10.1177/02601079211033475

Fisher, W. P., Jr. (2022). Contrasting roles of measurement knowledge systems in confounding or creating sustainable change. Acta IMEKO, 11(4), 1-7. https://acta.imeko.org/index.php/acta-imeko/article/view/1330

Fisher, W. P., Jr. (2023). Measurement systems, brilliant results, and brilliant processes in healthcare: Untapped potentials of person-centered outcome metrology for cultivating trust. In W. P. Fisher, Jr. & S. Cano (Eds.), Person-centered outcome metrology: Principles and applications for high stakes decision making (pp. 357-396). Springer.

Fisher, W. P., Jr., Melin, J., & Möller, C. (2021). Metrology for climate-neutral cities (RISE Research Institutes of Sweden AB No. RISE Report 2021:84). Gothenburg, Sweden: RISE. http://ri.diva-portal.org/smash/record.jsf?pid=diva2%3A1616048&dswid=-7140 (79 pp.)

Fisher, W. P., Jr., Melin, J., & Möller, C. (2022). A preliminary report on metrology for climate-neutral cities. Acta IMEKO, in press.

Fisher, W. P., Jr., Pendrill, L., Lips da Cruz, A., & Felin, A. (2019). Why metrology? Fair dealing and efficient markets for the United Nations’ Sustainable Development Goals. Journal of Physics: Conference Series, 1379(012023). doi:10.1088/1742-6596/1379/1/012023

Fisher, W. P., Jr., & Stenner, A. J. (2018). Ecologizing vs modernizing in measurement and metrology. Journal of Physics Conference Series, 1044(012025). http://iopscience.iop.org/article/10.1088/1742-6596/1044/1/012025

Fisher, W. P., Jr., & Stenner, A. J. (2023). A technology roadmap for intangible assets metrology. In W. P. Fisher, Jr., and P. J. Massengill, Explanatory models, unit standards, and personalized learning in educational measurement: Selected papers by A. Jackson Stenner (pp. 179-198). Springer. https://link.springer.com/book/10.1007/978-981-19-3747-7 

Fisher, W. P., Jr., & Wilson, M. (2019). The BEAR Assessment System Software as a platform for developing and applying UN SDG metrics. Journal of Physics Conference Series, 1379(012041). https://doi.org/10.1088/1742-6596/1379/1/012041

Lips da Cruz, A., Fisher, W. P., Jr., Felin, A., & Pendrill, L. (2019). Accelerating the realization of the United Nations Sustainable Development Goals through metrological multi-stakeholder interoperability. Journal of Physics: Conference Series, 1379(012046). http://iopscience.iop.org/article/10.1088/1742-6596/1379/1/012046

Morrison, J., & Fisher, W. P., Jr. (2018). Connecting learning opportunities in STEM education: Ecosystem collaborations across schools, museums, libraries, employers, and communities. Journal of Physics: Conference Series, 1065(022009). doi:10.1088/1742-6596/1065/2/022009

Morrison, J., & Fisher, W. P., Jr. (2019). Measuring for management in Science, Technology, Engineering, and Mathematics learning ecosystems. Journal of Physics: Conference Series, 1379(012042). doi:10.1088/1742-6596/1379/1/012042

Morrison, J., & Fisher, W. P., Jr. (2020, September 1). The Measure STEM Caliper Development Initiative [Online]. BEAR Seminar Series, University of California, Berkeley. http://bearcenter.berkeley.edu/seminar/measure-stem-caliper-development-initiative-online

Morrison, J., & Fisher, W. P., Jr. (2021a). Caliper: Measuring success in STEM learning ecosystems. Measurement: Sensors, 18, 100327. https://doi.org/10.1016/j.measen.2021.100327

Morrison, J., & Fisher, W. P., Jr. (2021b, June 1). Multilevel measurement for business and industrial workforce development. Presented at the Mathematical and Statistical Methods for Metrology. Joint Workshop of ENBIS and MATHMET, Politecnico di Torino, Torino, Italy.

Morrison, J., & Fisher, W. P., Jr. (2022). Caliper: Steps to an ecologized knowledge infrastructure for STEM learning ecosystems in Israel. Acta IMEKO, in press.

Rasch’s Models for Measurement and Item Response Theory, Yet Again

October 30, 2022

A colleague in the midst of writing a peer review for an educational research journal just wrote to ask how it could happen that the automatic association of Rasch’s models for measurement with Item Response Theory (IRT) could still be so prevalent. In this particular case, it was necessary to inform the article’s authors that the Rasch rating scale and partial credit models are not IRT models. Everyone involved in developing those models refers to measurement theory; if IRT comes up, it is in a critical context.

The point applies to all of the multifaceted, multilevel, multidimensional, and polytomous models developed in relation to Rasch’s original dichotomous model. Rasch (1960, pp. 110-115) derives his model for reading comprehension via an analogy to Newton’s Second Law of Motion (Fisher, 2010b, 2021). Despite a wealth of explanations on the distinctions between statistical models like those advanced in IRT and scientific models like Rasch’s, the quick and easy connection continues to be accepted by many researchers, reviewers, and editors. So it seems that an update to the basic argument provided repeatedly in the past (Andrich, 1989a/b/c; Fisher, 2010a; Wright, 1977, 1984, 1997, 1999; many others) is in order.
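
For readers meeting the distinction for the first time, it may help to state the dichotomous model explicitly (a standard formulation of Rasch’s model, with notation chosen here for convenience):

$$P\{X_{ni}=1\} = \frac{\xi_n}{\xi_n+\delta_i} = \frac{\exp(\beta_n-\sigma_i)}{1+\exp(\beta_n-\sigma_i)}, \qquad \xi_n = e^{\beta_n},\ \delta_i = e^{\sigma_i}.$$

The odds of a correct response, $\xi_n/\delta_i$, have the same two-parameter ratio structure as Newton’s $a = F/m$: one parameter characterizes the object of measurement, the other the agent of measurement, and the comparison of any two objects is independent of which agents are used to make it.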

In the article under review, the works cited (by Andrich, Bond, Masters, Wright, etc.) make no positive mention or constructive use of IRT. This is because the addition of the second and third item parameters renders the models unidentified: they change into incomparable forms across data sets and so cannot support generalized inferences (see Fisher, 2021; San Martin & Rolin, 2013; San Martin, et al., 2009, 2015). As Embretson (1996, p. 211) put it, “if item discrimination parameters are required to obtain fit, total score is not even monotonically related to the IRT theta parameters.” The illogic of this situation is rarely acknowledged in IRT applications: hardly anyone explains why people answering the same questions and obtaining the same scores have different measurements, why the item order jumps around depending on who is responding, or how to keep track of the item hierarchy relevant to each person. Even when multiple IRT item parameters are estimated, end-use applications either remain silent about these interpretive issues or fall back on the unidimensional Rasch scale.
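
The point can be made concrete with a small numerical experiment. In the sketch below (hypothetical item parameters; a simple grid search rather than any production IRT package), two examinees with the same raw score receive identical maximum-likelihood measures under the Rasch model, no matter which items they answered correctly, while the addition of a discrimination parameter makes their estimates diverge:

    import numpy as np

    def mle_theta(resp, b, a):
        """Grid-search maximum-likelihood estimate of theta for a 2PL-form model
        (the Rasch model is the special case in which all discriminations are 1)."""
        grid = np.linspace(-4, 4, 8001)
        p = 1 / (1 + np.exp(-a * (grid[:, None] - b)))  # P(correct): grid x items
        ll = (resp * np.log(p) + (1 - resp) * np.log(1 - p)).sum(axis=1)
        return grid[np.argmax(ll)]

    b = np.array([-1.0, 0.0, 1.0])   # hypothetical item difficulties
    pat1 = np.array([1, 0, 0])       # raw score 1: only the easiest item correct
    pat2 = np.array([0, 0, 1])       # raw score 1: only the hardest item correct

    a = np.ones(3)                   # Rasch: equal discriminations
    print(mle_theta(pat1, b, a), mle_theta(pat2, b, a))  # identical estimates

    a = np.array([0.5, 1.0, 2.0])    # 2PL: varying discriminations
    print(mle_theta(pat1, b, a), mle_theta(pat2, b, a))  # estimates now differ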

Accordingly, Lumsden (1978, p. 22) recommended that “The two- and three-parameter logistic and normal ogive scaling models should be abandoned since, if the unidimensionality requirement is met, the Rasch (1960) one-parameter model will be realized.” Wood (1978, p. 31) similarly said “two- and three-parameter models are not the answer – test scaling models are self-contradictory if they assert both unidimensionality and different slopes for the item characteristic curves.” Wright (1977, p. 220; also see Wright, 1984, 1997, 1999) explained that:

“When scientists measure they intend their measurements to be objective in the sense of being generalizable beyond the moment of measurement. This means that, whatever parameters are thought to characterize the measuring instruments, they must remain relatively stable through the range of intended application and must not interact substantially with the objects being measured. It also means that the parameters intended to describe the process of measurement can be estimated successfully.”

IRT model parameters do not remain stable and in fact focus on the substantive interactions, even though no one deliberately writes items or chooses samples with the intention of fulfilling theoretical expectations that these interactions will occur. That is, no one writes items intending them to change their difficulty order depending on who responds to them. For the second and third item parameters to make sense, though, that variation is exactly what would have to be intended. Such situations do exist, and models based on sufficient statistics have been formulated to rescale logits and to account for systematic differences in the relationships among the unit, discrimination, and guessing (Andrich, et al., 2016; Humphry, 2011). But as Andrich (1989a/b/c) emphasizes, IRT model formulations generally assume that the point is to describe data, no matter how uninterpretable they may be, instead of intentionally designing instruments to produce data satisfying basic principles of inference.
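
The role of sufficiency here can be stated compactly. In the dichotomous Rasch model, the person parameter enters the likelihood of a response vector only through the raw score, which is what licenses invariant score-to-measure conversion tables (a sketch of the standard factorization, not a quotation from any of the sources cited above):

$$\prod_i \frac{\exp\big(x_{ni}(\beta_n-\delta_i)\big)}{1+\exp(\beta_n-\delta_i)} = \frac{\exp\big(r_n\beta_n - \sum_i x_{ni}\delta_i\big)}{\prod_i \big(1+\exp(\beta_n-\delta_i)\big)}, \qquad r_n = \sum_i x_{ni}.$$

Introduce item-specific discriminations $a_i$ and the person parameter instead enters through the weighted score $\sum_i a_i x_{ni}$, so the statistic summarizing a person changes with every recalibration of the $a_i$, and conversion tables lose any invariant meaning.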

Fisher (2021) gives the history documenting Rasch’s and Thurstone’s involvement in the development of the concept of identified models. Moreover, this position has been substantiated in recent years in a number of articles and books connecting Rasch models with measurement science and metrology (Mari & Wilson, 2014; Mari, et al., 2021; Pendrill, 2014, 2019; Pendrill & Fisher, 2015; Fisher & Cano, 2023; etc.). Luca Mari, an electrical engineer involved in the Bureau International des Poids et Mesures (BIPM) SI unit metrological standards deliberations, stated in an article co-authored with Mark Wilson (Mari & Wilson, 2014) that “Rasch models belong to the same class that metrologists consider paradigmatic of measurement.” A past chair of the European Association of National Metrology Institutes, Leslie Pendrill (2014, p. 26), similarly says: “The Rasch approach…is not simply a mathematical or statistical approach, but instead [is] a specifically metrological approach to human-based measurement.”

Of course, as long as the systems of incentives and rewards go on supporting illogical reasoning and emotional, political, and economic attachments to counterproductive methods, arguments like those presented here are not likely to have much impact. It is important to go on the record, however, with reasoned positions, as it does sometimes happen that small numbers of readers are persuaded to test and perhaps change their perspectives.

Education and persuasion have a limited place, though, in the overall strategy being pursued in these efforts to advance the science of measurement. The common languages supported by metrologically traceable and quality-assured measurement systems historically have proven themselves vastly more powerful and efficacious than the chaotic confusion of incomparable metrics. And far from reducing rich complexity to manageable uniformity, metrologically sound measurement science is quite akin to tuning the instruments of the human and social sciences, with all the implications that follow from that metaphor concerning support and opportunities for capitalizing on unique local creative improvisations.

The primary obstacle to creating such systems in education, health care, human resource management, social services, environmental sustainability, etc. is how to organically culture new relationships of trust. Work in this vein that has been underway for decades continues to gain momentum (Fisher, 2023a/b).

References

Andrich, D. (1989a). Constructing fundamental measurements in social psychology. In J. A. Keats, R. Taft, R. A. Heath & S. H. Lovibond (Eds.), Mathematical and theoretical systems: Proceedings of the 24th International Congress of Psychology of the International Union of Psychological Science, Vol. 4 (pp. 17-26). North-Holland.

Andrich, D. (1989b). Distinctions between assumptions and requirements in measurement in the social sciences. In J. A. Keats, R. Taft, R. A. Heath & S. H. Lovibond (Eds.), Mathematical and Theoretical Systems: Proceedings of the 24th International Congress of Psychology of the International Union of Psychological Science, Vol. 4 (pp. 7-16). Elsevier Science Publishers.

Andrich, D. (1989c). Statistical reasoning in psychometric models and educational measurement. Journal of Educational Measurement, 26(1), 81-90.

Andrich, D., Marais, I., & Humphry, S. M. (2016). Controlling guessing bias in the dichotomous Rasch model applied to a large scale, vertically scaled testing program. Educational and Psychological Measurement, 76(3), 412-435.

Embretson, S. E. (1996, September). Item Response Theory models and spurious interaction effects in factorial ANOVA designs. Applied Psychological Measurement, 20(3), 201-212.

Fisher, W. P., Jr. (2010a). IRT and confusion about Rasch measurement. Rasch Measurement Transactions, 24(2), 1288 [http://www.rasch.org/rmt/rmt242.pdf].

Fisher, W. P., Jr. (2010b). The standard model in the history of the natural sciences, econometrics, and the social sciences. Journal of Physics Conference Series, 238(1), http://iopscience.iop.org/1742-6596/238/1/012016/pdf/1742-6596_238_1_012016.pdf.

Fisher, W. P., Jr. (2021). Separation theorems in econometrics and psychometrics: Rasch, Frisch, two Fishers, and implications for measurement. Journal of Interdisciplinary Economics, OnlineFirst, 1-32. https://journals.sagepub.com/doi/10.1177/02601079211033475

Fisher, W. P., Jr. (2023a). Foreword: Koans, semiotics, and metrology in Stenner’s approach to measurement-informed science and commerce. In W. P. Fisher, Jr. & P. J. Massengill (Eds.), Explanatory models, unit standards, and personalized learning in educational measurement: Selected papers by A. Jackson Stenner (pp. ix-lxx). Springer.

Fisher, W. P., Jr. (2023b). Measurement systems, brilliant results, and brilliant processes in healthcare: Untapped potentials of person-centered outcome metrology for cultivating trust. In W. P. Fisher, Jr. & S. Cano (Eds.), Person-centered outcome metrology: Principles and applications for high stakes decision making. Springer.

Fisher, W. P., Jr., & Cano, S. (Eds.). (2023). Person-centred outcome metrology: Principles and applications for high stakes decision making. Springer Series in Measurement Science & Technology. Springer. https://link.springer.com/book/9783031074646

Humphry, S. M. (2011). The role of the unit in physics and psychometrics. Measurement: Interdisciplinary Research and Perspectives, 9(1), 1-24.

Lumsden, J. (1978). Tests are perfectly reliable. British Journal of Mathematical and Statistical Psychology, 31, 19-26.

Mari, L., & Wilson, M. (2014, May). An introduction to the Rasch measurement approach for metrologists. Measurement, 51, 315-327. http://www.sciencedirect.com/science/article/pii/S0263224114000645

Mari, L., Wilson, M., & Maul, A. (2021). Measurement across the sciences: Developing a shared concept system for measurement. Springer Series in Measurement Science and Technology. Springer.

Pendrill, L. R. (2014, December). Man as a measurement instrument [Special Feature]. NCSLi Measure: The Journal of Measurement Science, 9(4), 22-33. http://www.tandfonline.com/doi/abs/10.1080/19315775.2014.11721702

Pendrill, L. R. (2019). Quality assured measurement: Unification across social and physical sciences. Springer.

Pendrill, L., & Fisher, W. P., Jr. (2015). Counting and quantification: Comparing psychometric and metrological perspectives on visual perceptions of number. Measurement, 71, 46-55. doi: http://dx.doi.org/10.1016/j.measurement.2015.04.010

Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests (Reprint, with Foreword and Afterword by B. D. Wright, Chicago: University of Chicago Press, 1980). Danmarks Paedogogiske Institut.

San Martin, E., Gonzalez, J., & Tuerlinckx, F. (2009). Identified parameters, parameters of interest, and their relationships. Measurement: Interdisciplinary Research and Perspectives, 7(2), 97-105.

San Martin, E., Gonzalez, J., & Tuerlinckx, F. (2015). On the unidentifiability of the fixed-effects 3PL model. Psychometrika, 80(2), 450-467.

San Martin, E., & Rolin, J. M. (2013). Identification of parametric Rasch-type models. Journal of Statistical Planning and Inference, 143(1), 116-130.

Wright, B. D. (1977). Misunderstanding the Rasch model. Journal of Educational Measurement, 14(3), 219-225.

Wright, B. D. (1984). Despair and hope for educational measurement. Contemporary Education Review, 3(1), 281-288 [http://www.rasch.org/memo41.htm].

Wright, B. D. (1997, Winter). A history of social science measurement. Educational Measurement: Issues and Practice, 16(4), 33-45, 52 [http://www.rasch.org/memo62.htm]. https://doi.org/10.1111/j.1745-3992.1997.tb00606.x

Wright, B. D. (1999). Fundamental measurement for psychology. In S. E. Embretson & S. L. Hershberger (Eds.), The new rules of measurement: What every educator and psychologist should know (pp. 65-104 [http://www.rasch.org/memo64.htm]). Lawrence Erlbaum Associates.

More on finding the will to address climate change: How to reframe the problem

September 21, 2022

From an article by Lauren Foster published 19 September 2022 in Barron’s:
“This is not pie in the sky. We know we have got technology to do it, we know the money is out there to do it. We just need to muster the will and the decisiveness to act,” said Kristalina Georgieva, Managing Director of the IMF, at an event for the opening of Climate Week in New York City.

In 2009, I first commented in this blog on the error committed when we focus on finding the willpower to act decisively in addressing climate change. The main barrier to climate action is thinking that will and decisiveness are the main problem. It’s a Chinese finger puzzle kind of problem: the more we insist on shouldering the burden of responsibilities, the more onerous those responsibilities become and the more motivation we have to avoid them. To get results, we have to change the way we frame the problem.

We need to set up systems in which everyone will do the right things because the incentives and rewards make it easy and profitable to do so. We need to find the will and decisiveness to change the way we frame the problem. Ironically, we will continue to fail to find the will to act decisively for as long as we frame the problem as primarily defined by willpower and decisiveness.

We should instead ask ourselves how we could set up systems that everyone would want to be part of and contribute to, because these systems would help people everywhere meet their own immediate economic needs while they contribute to the greater good.

Documenting the urgency of the problems and pushing people to take action solely on the basis of their willpower, while doing nothing to change the systems of incentives and rewards, will do nothing but exacerbate the problems. To build on the lessons learned from the last 200 years of prosperity (Bernstein, 2004; Fisher, 2012), successfully investing in the future and the greater social good requires new sciences, new accounting standards, new property rights, and new communications networks. Transforming our systemically inequitable and disempowering institutions into socially just and empowering ones is a hugely complex task (Fisher, 2022). It is plain to see, however, that continuing to address these problems using the same ideas and methods can only dig us deeper into the hole we are already in. Thinking differently is incredibly difficult, but resources for doing so have long been available. See the bottom of this post for a list of previous posts and peer-reviewed articles on this topic.

References

Bernstein, W. J. (2004). The birth of plenty: How the prosperity of the modern world was created. New York: McGraw-Hill.

Fisher, W. P., Jr. (2012, June 1). What the world needs now: A bold plan for new standards [Third place, 2011 NIST/SES World Standards Day paper competition]. Standards Engineering, 64(3), 1 & 3-5 [http://ssrn.com/abstract=2083975].

Fisher, W. P., Jr. (2022). Measurement systems, brilliant results, and brilliant processes in healthcare: Untapped potentials of person-centered outcome metrology for cultivating trust. In W. P. Fisher, Jr. & S. Cano (Eds.), Person-centered outcome metrology: Principles and applications for high stakes decision making. Cham: Springer.


See previous posts in this blog on this issue:

https://livingcapitalmetrics.wordpress.com/2022/08/18/an-ecologizing-approach-to-addressing-viral-epidemics-virally-conflicting-interests-and-public-weariness-with-mandated-precautions/

https://livingcapitalmetrics.wordpress.com/2022/07/31/comments-on-verra-sustainable-development-verified-impact-standards/

https://livingcapitalmetrics.wordpress.com/2021/01/05/day-one-memo-to-biden-harris-administration/

https://livingcapitalmetrics.wordpress.com/2020/05/23/distinguishing-old-and-new-ways-of-thinking-to-solve-problems-of-sustainable-development/

https://livingcapitalmetrics.wordpress.com/2019/04/13/cartesian-problems-and-solutions/

https://livingcapitalmetrics.wordpress.com/2018/08/12/self-sustaining-sustainability-once-again/

https://livingcapitalmetrics.wordpress.com/2010/01/13/reinventing-capitalism/

https://livingcapitalmetrics.wordpress.com/2009/11/22/al-gore-will-is-not-the-problem/

See my peer-reviewed articles on this topic:

Fisher, W. P., Jr. (2009). Invariance and traceability for measures of human, social, and natural capital: Theory and application. Measurement, 42(9), 1278-1287.

Fisher, W. P., Jr. (2010). Measurement, reduced transaction costs, and the ethics of efficient markets for human, social, and natural capital, Bridge to Business Postdoctoral Certification, Freeman School of Business, Tulane University (https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2340674).

Fisher, W. P., Jr. (2011). Bringing human, social, and natural capital to life: Practical consequences and opportunities. Journal of Applied Measurement, 12(1), 49-66.

Fisher, W. P., Jr. (2012). Measure and manage: Intangible assets metric standards for sustainability. In J. Marques, S. Dhiman & S. Holt (Eds.), Business administration education: Changes in management and leadership strategies (pp. 43-63). New York: Palgrave Macmillan.

Fisher, W. P., Jr. (2012, June 1). What the world needs now: A bold plan for new standards [Third place, 2011 NIST/SES World Standards Day paper competition]. Standards Engineering, 64(3), 1 & 3-5 [http://ssrn.com/abstract=2083975].

Fisher, W. P., Jr. (2020). Contextualizing sustainable development metric standards: Imagining new entrepreneurial possibilities. Sustainability, 12(9661), 1-22. Retrieved from https://doi.org/10.3390/su12229661

Fisher, W. P., Jr. (2021). Bateson and Wright on number and quantity: How to not separate thinking from its relational context. Symmetry, 13(1415). Retrieved from https://doi.org/10.3390/sym13081415

An Ecologizing Approach to Addressing Viral Epidemics Virally: Conflicting Interests and Public Weariness with Mandated Precautions

August 18, 2022

A recent article (Mahr, 2022) published online by Politico quotes Rochelle Walensky, the head of the CDC, who acknowledged that “the CDC alone would not be able to bring Covid-19 under control” and who “called for broader investment in public health at the state and local levels.”

“I actually really think many people have thought this is CDC’s responsibility, to fix public health [and] the pandemic,” Walensky said. “The CDC alone can’t fix this. Businesses have to help, the government has to help, school systems have to help. This is too big for the CDC alone.”

Walensky provides here a good point of entry into an alternative, ecosystem-based approach to transforming the CDC and public health efforts in general. The crux of the matter appears in this statement from the article:

“This year, the agency has struggled to strike a balance between the competing interests of a virus that continues to find ways to evade vaccines and natural immunity, and a public that is weary of taking the sort of precautions that federal and state governments have mandated.”

There are two themes of particular interest here: the competing interests of the virus and the public, and the public’s weariness with mandated precautions. These competing interests conflict in their fundamental orientation to relationships. The virus evolves via bottom-up emergent processes that adapt resiliently on the fly to changing circumstances by means of easily communicable and contagious contact methods. The public, in contrast, is able to change only via more belabored and mechanical processes imposed from the outside in, and from the top down. Where information flows quickly and efficiently in a standardized way throughout all the individuals inhabiting the virus’s multilevel networked ecology, much of the crucial information in the public domain is communicated only in incommensurable and incomparable terms that not only result in cumbersome miscommunications but even open the door to private interests’ self-serving efforts at spreading misinformation.

Balancing the virus’s and the public’s competing interests is as difficult and challenging as it is because the public interest is encumbered by institutions unable to deal with the virus on its own terms. But just as it is sometimes necessary to fight fire with fire, here it may be necessary to fight viruses virally. What does this mean?

Consider this: the virus is structured in terms of a formal genotype, a standardized phenotype, and an adaptive mutability harnessed by natural selection in a seemingly endless process of innovative evolution.

But public healthcare institutions, like institutions in general, are not structured as social ecologies giving rise to productively evolving communicable social contagions of innovative care. Instead, these institutions are counterproductively structured so as to make themselves particularly vulnerable to epidemics of misinformation. I would like to propose that this vulnerability emerges as a product of the ecological fallacy (Alker, 1969; Rousseau, 1985; Sedgwick, 2015), which involves a failure to create knowledge infrastructures sensitive to the differences between the forest and the trees, or between the map and the territory. Whitehead (1925, pp. 52-58) referred to a closely related error as the fallacy of misplaced concreteness, and Bateson’s (1972, 1978; Star & Ruhleder, 1996) concept of the “ecology of mind” stressed the epistemological error made when individuals’ cognitive processes are disconnected from the relationships in which they are embedded.

This error is virtually endemic as a primary feature of the dominant paradigm of statistical modeling in epidemiology and public health. This paradigm takes uncontrolled variations in the meaning of numeric differences for granted as an insuperable constraint. Numbers are assumed to automatically be quantitative, even though it is patently obvious that having ten rocks in no way guarantees possessing more rock mass than someone else with two rocks. This contradictory situation is routinely and systematically ignored in the vast majority of statistical comparisons of test scores and survey ratings. Though this is not always the case (Barney & Fisher, 2016; Fisher & Stenner, 2016; Stenner, et al., 2013, 2016), quantitative methods in education, psychology, and the social sciences generally, and mistakenly, assume that theoretically rigorous and reproducible interval-level measurement is impossible.
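
The rock example can be run as literal arithmetic. In the sketch below (masses invented for illustration), the count comparison and the mass comparison point in opposite directions, which is all the example needs to show:

    rocks_a = [0.2, 0.3, 0.1, 0.2, 0.3, 0.1, 0.2, 0.3, 0.1, 0.2]  # ten pebbles, kg
    rocks_b = [5.0, 7.5]                                          # two boulders, kg

    print(len(rocks_a) > len(rocks_b))  # True: more rocks
    print(sum(rocks_a) > sum(rocks_b))  # False: far less rock mass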

Scientific alternatives to accepting the constant confusion of incommensurable instrument-dependent ordinal units, however, have been available for decades (Rasch, 1960, 1961, 1977; Wright, 1977, 1997; Narens & Luce, 1986; Andrich & Marais, 2019; Fisher & Wright, 1994) but are rarely put to use in systematic applications. Interest in developing such applications has increased in recent years as the metrological potentials of advanced measurement modeling have become more widely known (Cano, et al., 2019; Fisher & Cano, 2022; Mari & Wilson, 2014; Mari, et al., 2021; Pendrill, 2014, 2019; Pendrill & Fisher, 2015).

Keeping everyone connected so they can think together in common languages sets up possibilities for virally communicable, evolving contagions of care. These kinds of social forms of life need to be created to be reproductively viable, with genotypic, phenotypic, and mutability characteristics analogous to those of biological forms of life (Pattee, 1985, 2012).

These characteristics correspond to developmental and semiotic levels of complexity, where formal theories and concepts, abstract instruments and words, and concrete data and things are integrated within systems, systems of systems (metasystems), and in paradigmatic supersystems (Commons & Bresette, 2006; Nöth, 2018). The hierarchy of these levels is not one in which lower levels are homogeneously reduced to monotonous sameness in higher levels, as though they are subjected to some kind of purification. Instead, higher levels integrate stochastic patterns repeating over time and space at lower levels in forms that remain identifiable but not identical across different groups of individuals (Fisher, 2020a, 2021a).

The genotype’s formal encapsulation of the conceptual instructions for organizing a standardized phenotype’s morphology comprises an explanatory model, a theory, predicting the structure and function of the physical form of a social or biological body. The predictive power of thermodynamics then experimentally validates the empirical performances of thermometers, just as the predictive power of syntactic and semantic complexity experimentally validates the empirical performances of reading comprehension tests.

The repeatability and reproducibility of the formal theory and abstract instrumentation comprising the genotypes and phenotypes of social forms of life provide scientifically defensible confidence that workable partnerships have been established between them and human interests. The explanatory and theoretical models involved are, however, probabilistic, meaning that the concrete data never conform exactly with expectations. No matter how improbable an observation might be, it is likely to occur at some expected frequency. When a form of life executes a reproductive strategy involving trillions or quadrillions of opportunities for possible mutations to occur, it actually becomes highly unlikely that improbable combinations will NOT happen. The problem of fighting a virus virally is one of creating the social ecologies in which highly improbable creative improvisations offering evolutionary innovations become likely to occur.
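
The arithmetic behind this claim is simple. If a particular mutation occurs with probability p at each of N independent replication events, the chance that it never occurs is (1 - p)^N ≈ exp(-Np); with illustrative values of p = 10^-9 and N = 10^12, at least one occurrence is all but certain:

    import math

    p = 1e-9   # illustrative per-replication probability of a given mutation
    N = 1e12   # illustrative number of replication events
    print(1 - math.exp(-N * p))  # probability of at least one occurrence: ~1.0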

We tend to systematically ignore unique local variations in physical variables like time, temperature, mass, etc. because they are either negligible or deeply embedded in the environment, like the daily shifting in the shadows cast by the sun, or the differences in the time the sun sets when traveling across a time zone.

But when anomalies are encountered in research, the evolutionary potentials of mutable concrete observations come into their own. Just as perturbations in the orbit of Uranus led to the discovery of Neptune, so also did a misplaced lead plate reveal x-rays; a nonsticking glue yielded Post-It Notes; and a dead culture in a Petri dish, penicillin. To systematically create contexts in which we can make the unexpected as obvious as possible, we need to make our expectations as clear and widely distributed as possible.

The disclosure of anomalies has long been recognized as a primary function of measurement in science (Cook, 1914/1979, pp. 400, 427-439; Kuhn, 1961/1977, p. 219; Rasch, 1960, p. 124). Not enough has been done to systematically construct measurements calibrated in common metrics with expectations focused clearly enough to make unexpected results stand out and demand explanation. Even so, Latour (2004, p. 217) recognized that:

“Social sciences may become as scientific…as the natural sciences, on the condition that they run the same risks, which means rethinking their methods and reshaping their settings from top to bottom on the occasion of what those they articulate say. [The] …general principle becomes: devise your inquiries so that they maximize the recalcitrance of those you interrogate.”

In the domains of education and social science, consistent inconsistencies in test and survey data speak to the presence of multiple constructs and opportunities for clarifying communications by addressing them one at a time. Once each construct has been formally described by a predictive theory and explanatory model that accounts for variation in items’ empirically estimated scale locations at fit-for-purpose levels of uncertainty, precision, and reliability (De Boeck & Wilson, 2004; Embretson, 2010; Fischer, 1973; Fisher & Stenner, 2017; Stenner, et al., 2013, 2016), they may then be combined in multidimensional arrays or indexes (Wilson & Gochyyev, 2020) that are more interpretable and meaningful than nonlinear, ordinal, sample-dependent, and ecologically fallacious scores could ever be.
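
As a concrete sketch of what a theory accounting for variation in items’ scale locations can look like, the linear logistic test model (Fischer, 1973) decomposes calibrated item difficulties into effects of theorized item features. The toy example below (feature matrix and difficulties invented for illustration) fits that decomposition by ordinary least squares and reports how closely theory predicts calibration:

    import numpy as np

    # Q: one row per item; columns are hypothesized difficulty-driving features
    # (e.g., sentence length, vocabulary rarity); values invented for illustration.
    Q = np.array([[1.0, 0.2],
                  [2.0, 0.5],
                  [3.0, 0.9],
                  [4.0, 1.1],
                  [5.0, 1.6]])
    delta = np.array([-1.2, -0.4, 0.3, 0.9, 1.8])  # calibrated item difficulties

    # LLTM-style decomposition: delta ~ Q @ eta, fit by least squares
    eta, *_ = np.linalg.lstsq(Q, delta, rcond=None)
    r = np.corrcoef(Q @ eta, delta)[0, 1]
    print("feature effects:", eta)
    print(f"theory-calibration correlation: {r:.3f}")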

When inconsistencies do not accumulate to a level contradicting either the construct theory or the instrument calibrations, they may nonetheless point toward new and actionable information useful in applications or pointing in as-yet unexplored new directions. For instance, when an identifiable gender or ethnic subgroup expresses disagreement on Caliper’s (Morrison & Fisher, 2018-2022) generally agreeable partnership items while agreeing with the generally more disagreeable systems items, a previously inaccessible opportunity for revealing and negating institutionally systemic bias may be in hand.

Such opportunities are made available by not treating each item as a separate universe reported as a mean rating in a long list of numbers. Instead, a theory of how the measured construct changes provides a narrative account that informs interpretation, situating all the items together in a shared quantitative frame of reference. Now, the theoretically and empirically validated models of the measured construct constitute an ecological environment giving voice to any and all unique concrete expressions.

There is, then, a general failure in our institutions to conceive problems in terms of participatory social ecologies involving communications structured by semiotic levels of hierarchical complexity (i.e., as heterogeneously distributed boundary objects; see Bowker, et al., 2015; Fisher, 2020a, 2022c; Fisher & Wilson, 2015; Star & Griesemer, 1989; also see Fisher & Stenner, 2018). Though this domain is inherently conceptually challenging, it is no more technically difficult than a good many other areas of human endeavor in which huge successes have been won.

Returning now to the second point concerning the public’s weariness with mandated precautions, what is the alternative to force-feeding solutions to a public largely unable to comprehend or appreciate the epidemiology of infection rates and the efficacy of vaccines? What differences will follow from culturing participatory social ecologies that do not commit the epistemological error of assuming that concrete numeric counts can safely be taken as abstract quantities? How might effective social contagions of communicable care invigorate a public worn and weary with tiresome efforts that pay back only intangible returns? How can institutionalized systems of incentives and rewards be restructured to captivate imaginations and inspire new entrepreneurial innovations? How might the complexities of public health measures and management be made accessible to everyday people in the same way that no understanding of thermodynamics is required to make good use of thermometers? How can instruments be designed, calibrated, and distributed so that the imaginative level of the entire population is lifted without altering the intelligence or vision of any individuals (following here Whitehead’s (1925, p. 107) description of the change in scientists’ thinking in the wake of the quantum revolution in physics; also see Hankins & Silverman, 1999; Latour, 1987, pp. 247-257)?

A playful absorption into the flow of language games provides the energetic, labor-saving lift that fuels the economy of language (Banks, 2004; Fisher, 2004, 2020a, 2022b/c; Franck, 2002, 2019). Shared languages and common metrics prethink the world for us, sparing us the trouble of creating our own languages and translating between them. Linguistic standards, metrologically traceable unit quantities, and currency unions all leverage the same principle of virally communicable meaning (Fisher, 2012a/b, 2020, 2021a/b, 2022a/b/c). Efficient markets obtain their virally communicable shifts in capital flows in large part because high quality measurement standards are incorporated in legally defensible property rights and financial accountability systems (Allen & Sriram, 2000; Ashworth, 2004; Barber, 1987; Barzel, 1982; many others). There are no logical or moral reasons for not trying to create similar markets for human, social, and natural capital; in fact, making the effort would be far more logical and moral than not trying.

By mapping a trajectory of citizen involvement and empowerment in public health initiatives, as measured and managed in the Caliper assessment of ecosystem success (Morrison & Fisher, 2018-2022), individuals could be systematically, socially, and financially motivated to see where they stand relative to personal and societal goals, to their past accomplishments, to what they should do next, and to any exceptional opportunities for leverageable remediation or advantageous strengths. Community and organizational initiatives could ascertain the typical cost of achieving certain outcomes and could reward lean innovations that improve quality by removing wasteful resource investments.

Entrepreneurs could mount solutions that function across previously siloed sectors, expanding new markets for previously intangible assets. Because processes and outcomes are measured in an objective unit quantity, they need not be produced internally but can be purchased in market transactions. Because skills and performances are represented formally and abstractly independent of the individuals involved, fair prices can support profitable returns on investments. The alignment of human, social, and environmental values with financial values will make it impossible to extract monetary profits when genuine wealth is being destroyed (Fisher, 2012a/b, 2020a, 2021b).

References

Alker, H. R. (1969). A typology of ecological fallacies. In M. Dogan & S. Rokkan (Eds.), Quantitative ecological analysis in the social sciences (pp. 69-86). Cambridge, MA: MIT Press.

Allen, R. H., & Sriram, R. D. (2000). The role of standards in innovation. Technological Forecasting and Social Change, 64, 171-181.

Andrich, D., & Marais, I. (2019). A course in Rasch measurement theory: Measuring in the educational, social, and health sciences. Cham, Switzerland: Springer.

Ashworth, W. J. (2004, November 19). Metrology and the state: Science, revenue, and commerce. Science, 306(5700), 1314-1317.

Banks, E. (2004). The philosophical roots of Ernst Mach’s economy of thought. Synthese, 139(1), 23-53.

Barber, J. M. (1987). Economic rationale for government funding of work on measurement standards. In R. Dobbie, J. Darrell, K. Poulter & R. Hobbs (Eds.), Review of DTI work on measurement standards (p. Annex 5). London: Department of Trade and Industry.

Barney, M., & Fisher, W. P., Jr. (2016). Adaptive measurement and assessment. Annual Review of Organizational Psychology and Organizational Behavior, 3, 469-490. Retrieved from https://www.annualreviews.org/doi/abs/10.1146/annurev-orgpsych-041015-062329

Barzel, Y. (1982). Measurement costs and the organization of markets. Journal of Law and Economics, 25, 27-48.

Bateson, G. (1972). Steps to an ecology of mind: Collected essays in anthropology, psychiatry, evolution, and epistemology. Chicago: University of Chicago Press.

Bateson, G. (1978, Spring). Number is different from quantity. CoEvolution Quarterly, 17, 44-46 [Reprinted from pp. 53-58 in Bateson, G. (1979). Mind and Nature: A Necessary Unity. New York: E. P. Dutton.]. Retrieved from http://www.wholeearth.com/issue/2017/article/295/number.is.different.from.quantity

Bowker, G., Timmermans, S., Clarke, A. E., & Balka, E. (Eds). (2015). Boundary objects and beyond: Working with Leigh Star. Cambridge, MA: MIT Press.

Cano, S., Pendrill, L., Melin, J., & Fisher, W. P., Jr. (2019). Towards consensus measurement standards for patient-centered outcomes. Measurement, 141, 62-69. Retrieved from https://doi.org/10.1016/j.measurement.2019.03.056

Commons, M. L., & Bresette, L. M. (2006). Illuminating major creative scientific innovators with postformal stages. In C. Hoare (Ed.), Handbook of adult development and learning (pp. 255-280). Oxford, UK: Oxford University Press.

Cook, T. A. (1914/1979). The curves of life. New York: Dover.

De Boeck, P., & Wilson, M. (Eds.). (2004). Explanatory item response models: A generalized linear and nonlinear approach. (Statistics for Social and Behavioral Sciences). New York: Springer-Verlag.

Embretson, S. E. (2010). Measuring psychological constructs: Advances in model-based approaches. Washington, DC: American Psychological Association.

Fischer, G. H. (1973). The linear logistic test model as an instrument in educational research. Acta Psychologica, 37, 359-374.

Fisher, W. P., Jr. (2004, October). Meaning and method in the social sciences. Human Studies: A Journal for Philosophy and the Social Sciences, 27(4), 429-454.

Fisher, W. P., Jr. (2012). Measure and manage: Intangible assets metric standards for sustainability. In J. Marques, S. Dhiman & S. Holt (Eds.), Business administration education: Changes in management and leadership strategies (pp. 43-63). New York: Palgrave Macmillan.

Fisher, W. P., Jr. (2012, June 1). What the world needs now: A bold plan for new standards [Third place, 2011 NIST/SES World Standards Day paper competition]. Standards Engineering, 64(3), 1 & 3-5 [http://ssrn.com/abstract=2083975].

Fisher, W. P., Jr. (2020a). Contextualizing sustainable development metric standards: Imagining new entrepreneurial possibilities. Sustainability, 12(9661), 1-22. Retrieved from https://doi.org/10.3390/su12229661

Fisher, W. P., Jr. (2020b). Measuring genuine progress: An example from the UN Millennium Development Goals project. Journal of Applied Measurement, 21(1), 110-133.

Fisher, W. P., Jr. (2021a). Bateson and Wright on number and quantity: How to not separate thinking from its relational context. Symmetry, 13(1415). Retrieved from https://doi.org/10.3390/sym13081415

Fisher, W. P., Jr. (2021b). Separation theorems in econometrics and psychometrics: Rasch, Frisch, two Fishers, and implications for measurement. Journal of Interdisciplinary Economics, OnlineFirst, 1-32. Retrieved from https://journals.sagepub.com/doi/10.1177/02601079211033475

Fisher, W. P., Jr. (2022a). Aiming higher in conceptualizing manageable measures in production research. In N. Durakbasa & M. G. Gençyilmaz (Eds.), Digitizing production systems: Selected papers from ISPR2021, October 07-09, 2021 online, Turkey (pp. xix-xxxix). Berlin: Springer Verlag. Retrieved from https://link.springer.com/book/10.1007/978-3-030-90421-0/1.pdf

Fisher, W. P., Jr. (2022b). Foreword: Koans, semiotics, and metrology in Stenner’s approach to measurement-informed science and commerce. In W. P. Fisher, Jr. & P. J. Massengill (Eds.), Explanatory models, unit standards, and personalized learning in educational measurement: Selected papers by A. Jackson Stenner (pp. ix-lxx). Cham: Springer.

Fisher, W. P., Jr. (2022c). Measurement systems, brilliant results, and brilliant processes in healthcare: Untapped potentials of person-centered outcome metrology for cultivating trust. In W. P. Fisher, Jr. & S. Cano (Eds.), Person-centered outcome metrology: Principles and applications for high stakes decision making. Cham: Springer.

Fisher, W. P., Jr., & Cano, S. (Eds.). (2022). Person-centred outcome metrology: Principles and applications for high stakes decision making. Springer Series in Measurement Science & Technology. Cham: Springer. Retrieved from https://link.springer.com/book/9783031074646

Fisher, W. P., Jr., & Stenner, A. J. (2016). Theory-based metrological traceability in education: A reading measurement network. Measurement, 92, 489-496. Retrieved from http://www.sciencedirect.com/science/article/pii/S0263224116303281

Fisher, W. P., Jr., & Stenner, A. J. (2017, September 18). Towards an alignment of engineering and psychometric approaches to uncertainty in measurement: Consequences for the future. 18th International Congress of Metrology, 12004, 1-9. Retrieved from https://doi.org/10.1051/metrology/201712004

Fisher, W. P., Jr., & Stenner, A. J. (2018). Ecologizing vs modernizing in measurement and metrology. Journal of Physics Conference Series, 1044(012025). http://iopscience.iop.org/article/10.1088/1742-6596/1044/1/012025

Fisher, W. P., Jr., & Wilson, M. (2015). Building a productive trading zone in educational assessment research and practice. Pensamiento Educativo: Revista de Investigacion Educacional Latinoamericana, 52(2), 55-78. Retrieved from https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2688260

Fisher, W. P., Jr., & Wilson, M. (2020). An online platform for sociocognitive metrology: The BEAR Assessment System Software. Measurement Science and Technology, 31(034006). Retrieved from https://iopscience.iop.org/article/10.1088/1361-6501/ab5397/meta

Fisher, W. P., Jr., & Wright, B. D. (Eds.). (1994). Applications of probabilistic conjoint measurement. International Journal of Educational Research, 21(6), 557-664.

Franck, G. (2002). The scientific economy of attention: A novel approach to the collective rationality of science. Scientometrics, 55(1), 3-26.

Franck, G. (2019). The economy of attention. Journal of Sociology, 55(1), 8-19.

Hankins, T. L., & Silverman, R. J. (1999). Instruments and the imagination. Princeton, New Jersey: Princeton University Press.

Kuhn, T. S. (1961). The function of measurement in modern physical science. Isis, 52(168), 161-193. (Rpt. in T. S. Kuhn, (Ed.). (1977). The essential tension: Selected studies in scientific tradition and change (pp. 178-224). Chicago: University of Chicago Press. Retrieved from https://www.journals.uchicago.edu/doi/abs/10.1086/349468)

Latour, B. (1987). Science in action: How to follow scientists and engineers through society. Cambridge, MA: Harvard University Press.

Latour, B. (2004). How to talk about the body? The normative dimension of science studies. Body & Society, 10(2-3), 205-229.

Mahr, K. (2022, August 17). CDC director orders agency overhaul, admitting flawed Covid-19 response. Politico. Retrieved from https://www.politico.com/news/2022/08/17/cdc-agency-overhaul-covid-19-response-00052384?cid=apn

Mari, L., & Wilson, M. (2014, May). An introduction to the Rasch measurement approach for metrologists. Measurement, 51, 315-327. Retrieved from http://www.sciencedirect.com/science/article/pii/S0263224114000645

Mari, L., Wilson, M., & Maul, A. (2021). Measurement across the sciences. Cham: Springer.

Morrison, J., & Fisher, W. P., Jr. (2018). Connecting learning opportunities in STEM education: Ecosystem collaborations across schools, museums, libraries, employers, and communities. Journal of Physics: Conference Series, 1065(022009). doi:10.1088/1742-6596/1065/2/022009

Morrison, J., & Fisher, W. P., Jr. (2019). Measuring for management in Science, Technology, Engineering, and Mathematics learning ecosystems. Journal of Physics: Conference Series, 1379(012042). doi:10.1088/1742-6596/1379/1/012042

Morrison, J., & Fisher, W. P., Jr. (2020, September 1). The Measure STEM Caliper Development Initiative [Online]. In http://bearcenter.berkeley.edu/seminar/measure-stem-caliper-development-initiative-online, BEAR Seminar Series. BEAR Center, Graduate School of Education: University of California, Berkeley.

Morrison, J., & Fisher, W. P., Jr. (2021). Caliper: Measuring success in STEM learning ecosystems. Measurement: Sensors, 18, 100327. Retrieved from https://doi.org/10.1016/j.measen.2021.100327

Morrison, J., & Fisher, W. P., Jr. (2022). Caliper: Steps to an ecologized knowledge infrastructure for STEM learning ecosystems in Israel. Acta IMEKO, in press.

Narens, L., & Luce, R. D. (1986). Measurement: The theory of numerical assignments. Psychological Bulletin, 99(2), 166-180.

Nöth, W. (2018). The semiotics of models. Sign Systems Studies, 46(1), 7-43.

Pattee, H. H. (1985). Universal principles of measurement and language functions in evolving systems. In J. L. Casti & A. Karlqvist (Eds.), Complexity, language, and life: Mathematical approaches (pp. 268-281). Berlin: Springer Verlag. Retrieved from https://core.ac.uk/download/pdf/33894060.pdf#page=282

Pattee, H. H. (2012). Causation, control, and the evolution of complexity. In H. H. Pattee & J. Raczaszek-Leonardi (Eds.), Laws, language and life: Howard Pattee’s classic papers on the physics of symbols with contemporary commentary (pp. 261-274). Dordrecht: Springer.

Pendrill, L. R. (2014). Man as a measurement instrument [Special Feature]. NCSLi Measure: The Journal of Measurement Science, 9(4), 22-33. Retrieved from http://www.tandfonline.com/doi/abs/10.1080/19315775.2014.11721702

Pendrill, L. R. (2019). Quality assured measurement: Unification across social and physical sciences. Cham: Springer.

Pendrill, L., & Fisher, W. P., Jr. (2015). Counting and quantification: Comparing psychometric and metrological perspectives on visual perceptions of number. Measurement, 71, 46-55. doi: http://dx.doi.org/10.1016/j.measurement.2015.04.010

Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests (Reprint, with Foreword and Afterword by B. D. Wright, Chicago: University of Chicago Press, 1980). Copenhagen, Denmark: Danmarks Paedogogiske Institut.

Rasch, G. (1961). On general laws and the meaning of measurement in psychology. In J. Neyman (Ed.), Proceedings of the fourth Berkeley symposium on mathematical statistics and probability: Volume IV: Contributions to biology and problems of medicine (pp. 321-333 [http://www.rasch.org/memo1960.pdf]). Berkeley, California: University of California Press.

Rasch, G. (1977). On specific objectivity: An attempt at formalizing the request for generality and validity of scientific statements. Danish Yearbook of Philosophy, 14, 58-94. Retrieved from https://www.rasch.org/memo18.htm

Rousseau, D. M. (1985). Issues of level in organizational research: Multi-level and cross-level perspectives. Research in Organizational Behavior, 7(1), 1-37.

Sedgwick, P. (2015). Understanding the ecological fallacy. BMJ, 351.

Star, S. L., & Griesemer, J. R. (1989, August). Institutional ecology, ‘translations,’ and boundary objects: Amateurs and professionals in Berkeley’s Museum of Vertebrate Zoology, 1907-39. Social Studies of Science, 19(3), 387-420.

Star, S. L., & Ruhleder, K. (1996, March). Steps toward an ecology of infrastructure: Design and access for large information spaces. Information Systems Research, 7(1), 111-134.

Stenner, A. J., Fisher, W. P., Jr., Stone, M. H., & Burdick, D. S. (2013, August). Causal Rasch models. Frontiers in Psychology: Quantitative Psychology and Measurement, 4(536), 1-14. doi: 10.3389/fpsyg.2013.00536

Stenner, A. J., Fisher, W. P., Jr., Stone, M. H., & Burdick, D. S. (2016). Causal Rasch models in language testing: An application rich primer. In Q. Zhang (Ed.), Pacific Rim Objective Measurement Symposium (PROMS) 2015 Conference Proceedings (pp. 1-14). Singapore: Springer.

Whitehead, A. N. (1925). Science and the modern world. New York: Macmillan.

Wilson, M., & Gochyyev, P. (2020, February). Having your cake and eating it too: Multiple dimensions and a composite. Measurement, 151(107247).

Wright, B. D. (1977). Solving measurement problems with the Rasch model. Journal of Educational Measurement, 14(2), 97-116 [http://www.rasch.org/memo42.htm].

Wright, B. D. (1997). A history of social science measurement. Educational Measurement: Issues and Practice, 16(4), 33-45, 52 [http://www.rasch.org/memo62.htm]. Retrieved from https://doi.org/10.1111/j.1745-3992.1997.tb00606.x

Comments on VERRA Sustainable Development Verified Impact Standards

July 31, 2022

The landing page at https://verra.org states that:
“Verra catalyzes tangible climate action and sustainable development outcomes. Verra’s standards drive large-scale investment towards high-impact activities that tackle some of the most pressing environmental and social issues of our day.”

Verra’s six listed standards and programs include one entitled “Sustainable Development Verified Impact Standards.” Two documents providing details on this kind of standard are available for download. One concerns “Methodology for Coastal Resilience Benefits from Restoration and Protection of Tidal Wetlands.” The methodology lays out a descriptive group-level statistical model of an ordinal unit, not a prescriptive individual-level measurement model of an interval unit. Even though the stem ‘measur-’ at the root of ‘measured,’ ‘measurement,’ etc. appears 119 times in the standard’s 51 pages, there is no definition of a composite Coastal Resilience Benefits interval unit quantity and associated uncertainty, nor is there any mention of experimental tests of the hypothesis that such a unit quantity can be identified and estimated.

The standard takes it for granted that physical measurements of distance, mass, volume, time, temperature, etc. are sufficient to the task of measuring coastal resilience benefits. But this is what is termed in logic a category mistake or an ecological fallacy, and what Alfred North Whitehead (1925, pp. 52-58) called the “fallacy of misplaced concreteness.” Gregory Bateson (1972, pp. 73, 180-185, 491-495) similarly made much of the epistemological errors committed when the map is mistaken for the territory, or the forest for the trees. In short, measuring coastal resilience benefits demands that this construct (known in metrological terms as a measurand) itself be modeled, estimated, and calibrated in an identified and defined interval unit quantity.

Extensive and longstanding authoritative resources on measurement models supporting metrologically quality-assured instrument calibration traceability of this kind are available (Luce & Tukey, 1964; Rasch, 1960, 1961; Wright, 1977, 1997; Bond & Fox, 2015; Fisher & Wilson, 2015; Fisher & Wright, 1994; Mari & Wilson, 2014; Mari, et al., 2021; Pendrill, 2019; Pendrill & Fisher, 2015; Wilson, 2005, 2013a/b; Wilson & Fisher, 2016, 2019; etc.), with a similarly voluminous array of sustainable development applications (Cano, et al., 2019; Fisher, 2020a/b, 2021a/b; Fisher, et al., 2019, 2021; Fisher & Wilson, 2019; Madhala & Fisher, 2022; Moral, et al., 2006, 2014, 2016; Kaiser & Wilson, 2000, 2004; etc.). Narens and Luce (1986, pp. 167-169) pointed out that additive conjoint log-interval models developed in the 1960s (Luce & Tukey, 1964; Rasch, 1960, 1961; Wright, 1997) were “widely accepted” as providing access to fundamental measurement. Unfortunately, we have yet even to begin capitalizing on the opportunities for scientific, economic, social, and environmental progress offered by these models (Fisher, 2011, 2012a/b, 2020a).
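For reference, the core of the models at issue is compact. The dichotomous Rasch (1960) model specifies the probability that person (or project) n succeeds on item i as a function of the difference between two parameters located on one interval scale:

    \Pr\{X_{ni}=1\} = \frac{e^{\beta_n-\delta_i}}{1+e^{\beta_n-\delta_i}}, \qquad \ln\frac{\Pr\{X_{ni}=1\}}{1-\Pr\{X_{ni}=1\}} = \beta_n - \delta_i

Because the right-hand side is a simple difference, comparisons of any two measures are invariant with respect to which items happen to be used, and comparisons of any two item calibrations are invariant with respect to who happens to be measured: the structural invariance referred to throughout this post.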

Where statistical models are concerned with group-level processes occurring in the relations between variables, measurement models focus on substantive processes as they impact individuals. Actionable, meaningful management gets a grip on things in the world only in terms of measurements that give insight into what can be done in specific instances, and that can then be communicated in a common language across those instances. Statistical models have a number of debilitating shortcomings that make them highly unsatisfactory as a basis for quantification (Fisher, 2022). In addition to not positing and testing for interval quantities, these models do not:

  • articulate the individual-level response process;
  • map the development continuum;
  • provide individual level quantity, uncertainty, or consistency estimates;
  • meaningfully reduce data volume by an order of magnitude;
  • support the development of a metrologically quality assured instrument calibration network;
  • report out individual and group measurements in a way showing what has been accomplished relative to overall goals, what comes next, and special strengths and weaknesses;
  • enable the cost accounting, arbitrage, and pricing of unit outcomes;
  • nonreductively quantify living processes in ways that make them objectively reproducible over time and space;
  • represent individual- and group-level properties in comparable terms that support legal title to personal stocks of human, social, and natural capital, and profitable investments in and returns from those stocks.

For instance, Sections 9.1 and 9.2 in the coastal wetlands standard list all the parameters to be monitored. Two comments pertain. First, these “parameters” are actually indicators that ought to be combined into an overarching composite model testing the statistical sufficiency of the observations–i.e., their capacity to serve as a basis for estimating interval unit quantities and uncertainties. In measurement theory and practice, the model parameters are the mathematical terms in the equation specifying the stimulus and response variables being quantified. Estimates then signify the value obtained from the combined inputs of all the indicators, no matter which particular subset of them is administered, and no matter which particular sample is measured.
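The subset- and sample-invariance at stake here can be shown in a few lines. What follows is a minimal sketch, not an implementation of any Verra methodology: the indicator calibrations, the simulated project location, and the Newton-Raphson estimator are all assumptions made purely for illustration.

    import numpy as np

    rng = np.random.default_rng(7)

    # Hypothetical indicator calibrations on a common logit scale.
    difficulties = np.linspace(-2.0, 2.0, 20)
    theta_true = 0.5  # the simulated project's location on the construct

    # Simulate dichotomous observations under the Rasch model.
    p_true = 1 / (1 + np.exp(-(theta_true - difficulties)))
    x = rng.binomial(1, p_true)

    def estimate(x_sub, d_sub, theta=0.0, iters=50):
        """Newton-Raphson maximum-likelihood location estimate and its
        standard error, given calibrated indicator difficulties.
        (Perfect scores would need special handling, omitted here.)"""
        for _ in range(iters):
            p = 1 / (1 + np.exp(-(theta - d_sub)))
            info = (p * (1 - p)).sum()      # Fisher information
            theta += (x_sub - p).sum() / info
        return theta, 1 / np.sqrt(info)

    # Two disjoint subsets of indicators yield statistically equivalent
    # estimates of the same measurand, expressed in the same unit.
    for idx in (np.arange(0, 20, 2), np.arange(1, 20, 2)):
        est, se = estimate(x[idx], difficulties[idx])
        print(f"estimate = {est:+.2f} logits, SE = {se:.2f}")

The two printed estimates should agree to within their uncertainties, which is what it means for the indicators to function as interchangeable windows onto one quantity.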

The second point concerns the content of the indicators, which have been chosen because they are fairly easily measured in the physical values of distance, mass, volume, time, temperature, etc. A composite model and metrological unit system should also include a more actionable and meaningful definition of the measurand, one articulated as a developmental progression defined along a continuum ranging from most easily implemented to least easily implemented. This kind of integrated psychophysics is eminently suited to taking advantage of advanced measurement modeling (Camargo & Henson, 2013, 2015; Fisher, Melin, & Möller, 2021; Massof & McDonnell, 2012; Pendrill & Fisher, 2015; Powers & Fisher, 2018, 2021).

The practical application of the metrics and their comparability depend on obtaining usefully precise (a) theoretical predictions of indicator and project locations on the instrument (a construct map); (b) repeated demonstrations of reproducible and empirically stable data-based indicator and project location estimates (Wright maps); and (c) end user reports displaying indicator response values ordered along the mapped variable, showing what has been accomplished, where the project stands in relation to its goals, what comes next in advancing its program, and where its actionable strengths and weaknesses lie.
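Even a bare text rendering conveys what such a report adds over a table of disconnected indicator values. The indicator names, calibrations, project location, and uncertainty below are invented solely for illustration:

    # A minimal text construct map: hypothetical indicators ordered along
    # the measured variable, with a project's estimated location marked.
    difficulties = {
        "storm surge attenuation": 1.10,   # hardest benefit to achieve
        "sediment accretion": 0.30,
        "vegetation restored": -0.70,
        "shoreline stabilized": -1.60,     # easiest benefit to achieve
    }
    project_est, project_se = 0.35, 0.40

    print("  logit  indicator (harder at top)")
    for name, d in sorted(difficulties.items(), key=lambda kv: -kv[1]):
        flag = "  <-- project location (+/- 1 SE)" if abs(d - project_est) <= project_se else ""
        print(f"{d:+7.2f}  {name}{flag}")

Read this way, the report locates the project between what it has likely already accomplished (below its location) and what comes next (above it).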

Clear and significant progress in addressing the urgent needs for solutions to today’s pressing challenges cannot reasonably be expected until advanced measurement modeling in support of quality-assured metrological traceability is explicitly included in the design and implementation of Verra’s and others’ sustainable development standards. Hopefully the day will soon arrive when that will be the case.

References

Bateson, G. (1972). Steps to an ecology of mind: Collected essays in anthropology, psychiatry, evolution, and epistemology. Chicago: University of Chicago Press.

Bond, T., & Fox, C. (2015). Applying the Rasch model: Fundamental measurement in the human sciences (3rd ed.). New York: Routledge.

Camargo, F. R., & Henson, B. (2013). Aligning physical elements with persons’ attitude: An approach using Rasch measurement theory. Journal of Physics Conference Series, 459(1), http://iopscience.iop.org/1742-6596/459/1/012009/pdf/1742-6596_459_1_012009.pdf.

Camargo, F. R., & Henson, B. (2015). Conceptualising computerized adaptive testing for measurement of latent variables associated with physical objects. Journal of Physics Conference Series, 588(1), 012012. doi:10.1088/1742-6596/588/1/012012

Cano, S., Pendrill, L., Melin, J., & Fisher, W. P., Jr. (2019). Towards consensus measurement standards for patient-centered outcomes. Measurement, 141, 62-69. Retrieved from https://doi.org/10.1016/j.measurement.2019.03.056

Fisher, W. P., Jr. (2011). Bringing human, social, and natural capital to life: Practical consequences and opportunities. In N. Brown, B. Duckor, K. Draney & M. Wilson (Eds.), Advances in Rasch Measurement, Vol. 2 (pp. 1-27). Maple Grove, MN: JAM Press.

Fisher, W. P., Jr. (2012a). Measure and manage: Intangible assets metric standards for sustainability. In J. Marques, S. Dhiman & S. Holt (Eds.), Business administration education: Changes in management and leadership strategies (pp. 43-63). New York: Palgrave Macmillan.

Fisher, W. P., Jr. (2012b, June 1). What the world needs now: A bold plan for new standards [Third place, 2011 NIST/SES World Standards Day paper competition]. Standards Engineering, 64(3), 1 & 3-5 [http://ssrn.com/abstract=2083975].

Fisher, W. P., Jr. (2020a). Contextualizing sustainable development metric standards: Imagining new entrepreneurial possibilities. Sustainability, 12(9661), 1-22. Retrieved from https://doi.org/10.3390/su12229661

Fisher, W. P., Jr. (2020b). Measuring genuine progress: An example from the UN Millennium Development Goals project. Journal of Applied Measurement, 21(1), 110-133.

Fisher, W. P., Jr. (2021a). Bateson and Wright on number and quantity: How to not separate thinking from its relational context. Symmetry, 13(1415). Retrieved from https://doi.org/10.3390/sym13081415

Fisher, W. P., Jr. (2021b). Separation theorems in econometrics and psychometrics: Rasch, Frisch, two Fishers, and implications for measurement. Journal of Interdisciplinary Economics, OnlineFirst, 1-32. Retrieved from https://journals.sagepub.com/doi/10.1177/02601079211033475

Fisher, W. P., Jr. (2022). Contrasting roles of measurement knowledge systems in confounding or creating sustainable change. Acta IMEKO, in press.

Fisher, W. P., Jr., Melin, J., & Möller, C. (2021). Metrology for climate-neutral cities (RISE Research Institutes of Sweden AB No. RISE Report 2021:84). Gothenburg, Sweden: RISE. Retrieved from http://ri.diva-portal.org/smash/record.jsf?pid=diva2%3A1616048&dswid=-7140

Fisher, W. P., Jr., Pendrill, L., Lips da Cruz, A., & Felin, A. (2019). Why metrology? Fair dealing and efficient markets for the United Nations’ Sustainable Development Goals. Journal of Physics: Conference Series, 1379(012023). doi:10.1088/1742-6596/1379/1/012023

Fisher, W. P., Jr., & Wilson, M. (2015). Building a productive trading zone in educational assessment research and practice. Pensamiento Educativo: Revista de Investigacion Educacional Latinoamericana, 52(2), 55-78. Retrieved from https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2688260

Fisher, W. P., Jr., & Wilson, M. (2019). The BEAR Assessment System Software as a platform for developing and applying UN SDG metrics. Journal of Physics Conference Series, 1379(012041). Retrieved from https://doi.org/10.1088/1742-6596/1379/1/012041

Fisher, W. P., Jr., & Wright, B. D. (Eds.). (1994). Applications of probabilistic conjoint measurement. International Journal of Educational Research, 21(6), 557-664.

Kaiser, F. G., & Wilson, M. (2000). Assessing people’s general ecological behavior: A cross-cultural measure. Journal of Applied Social Psychology, 30(5), 952-978.

Kaiser, F. G., & Wilson, M. (2004). Goal-directed conservation behavior: The specific composition of a general performance. Personality and Individual Differences, 36(7), 1531-1544. Retrieved from https://doi.org/10.1016/j.paid.2003.06.003

Luce, R. D., & Tukey, J. W. (1964). Simultaneous conjoint measurement: A new kind of fundamental measurement. Journal of Mathematical Psychology, 1(1), 1-27.

Madhala, T., & Fisher, W. P., Jr. (2022). Clothing, textile, and fashion industry sustainable impact measurement and management. Acta IMEKO, in press.

Mari, L., & Wilson, M. (2014, May). An introduction to the Rasch measurement approach for metrologists. Measurement, 51, 315-327. Retrieved from http://www.sciencedirect.com/science/article/pii/S0263224114000645

Mari, L., Wilson, M., & Maul, A. (2021). Measurement across the sciences (R. Morawski, G. Rossi, et al., Series Eds.). Springer Series in Measurement Science and Technology. Cham: Springer.

Massof, R. W., & McDonnell, P. J. (2012, April). Latent dry eye disease state variable. Investigative Ophthalmology & Visual Science, 53(4), 1905-1916. Retrieved from https://iovs.arvojournals.org/article.aspx?articleid=2188166

Moral, F. J., Álvarez, P., & Canito, J. L. (2006). Mapping and hazard assessment of atmospheric pollution in a medium sized urban area using the Rasch model and geostatistics techniques. Atmospheric Environment, 40(8), 1408-1418.

Moral, F. J., Rebollo, F. J., & Méndez, F. (2014). Using an objective model to estimate overall ozone levels at different urban locations. Stochastic Environmental Research and Risk Assessment, 28(3), 455-465.

Moral, F. J., Rebollo, F. J., Paniagua, L. L., García, A., & de Salazar, E. M. (2016). Application of climatic indices to analyse viticultural suitability in Extremadura, south-western Spain. Theoretical and Applied Climatology, 123(1-2), 277-289.

Narens, L., & Luce, R. D. (1986). Measurement: The theory of numerical assignments. Psychological Bulletin, 99(2), 166-180.

Pendrill, L. R. (2019). Quality assured measurement: Unification across social and physical sciences. Cham: Springer.

Pendrill, L., & Fisher, W. P., Jr. (2015). Counting and quantification: Comparing psychometric and metrological perspectives on visual perceptions of number. Measurement, 71, 46-55. doi: http://dx.doi.org/10.1016/j.measurement.2015.04.010

Powers, M., & Fisher, W. P., Jr. (2018). Toward a standard for measuring functional binocular vision: Modeling visual symptoms and visual skills. Journal of Physics Conference Series, 1065(132009). doi:10.1088/1742-6596/1065/13/132009

Powers, M., & Fisher, W. P., Jr. (2021). Physical and psychological measures quantifying functional binocular vision. Measurement: Sensors, 18, 100320. Retrieved from https://doi.org/10.1016/j.measen.2021.100320

Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests (Reprint, with Foreword and Afterword by B. D. Wright, Chicago: University of Chicago Press, 1980). Copenhagen, Denmark: Danmarks Paedagogiske Institut.

Rasch, G. (1961). On general laws and the meaning of measurement in psychology. In J. Neyman (Ed.), Proceedings of the fourth Berkeley symposium on mathematical statistics and probability: Volume IV: Contributions to biology and problems of medicine (pp. 321-333 [http://www.rasch.org/memo1960.pdf]). Berkeley, California: University of California Press.

Whitehead, A. N. (1925). Science and the modern world. New York: Macmillan.

Wilson, M. R. (2005). Constructing measures: An item response modeling approach. Mahwah, New Jersey: Lawrence Erlbaum Associates.

Wilson, M. R. (2013a, April). Seeking a balance between the statistical and scientific elements in psychometrics. Psychometrika, 78(2), 211-236.

Wilson, M. R. (2013b). Using the concept of a measurement system to characterize measurement models used in psychometrics. Measurement, 46, 3766-3774. Retrieved from http://www.sciencedirect.com/science/article/pii/S0263224113001061

Wilson, M., & Fisher, W. P., Jr. (2016). Preface: 2016 IMEKO TC1-TC7-TC13 Joint Symposium: Metrology across the Sciences: Wishful Thinking? Journal of Physics Conference Series, 772(1), 011001. Retrieved from http://iopscience.iop.org/article/10.1088/1742-6596/772/1/011001/pdf

Wilson, M., & Fisher, W. P., Jr. (2019). Preface of special issue, Psychometric Metrology. Measurement, 145, 190. Retrieved from https://www.sciencedirect.com/journal/measurement/special-issue/10C49L3R8GT

Wright, B. D. (1977). Solving measurement problems with the Rasch model. Journal of Educational Measurement, 14(2), 97-116 [http://www.rasch.org/memo42.htm].

Wright, B. D. (1997). A history of social science measurement. Educational Measurement: Issues and Practice, 16(4), 33-45, 52 [http://www.rasch.org/memo62.htm]. Retrieved from https://doi.org/10.1111/j.1745-3992.1997.tb00606.x

Comments on NeuroMET News

March 16, 2022

The NeuroMET project’s extension of Wright and Stone’s (1979) study of the Knox Cube Test is a remarkable testimony to the lasting value of their contributions in the history of measurement.

The persistent and eminently real invariance of the structures of short-term memory and attention span is an excellent place to begin building out metrological standards informing clinical care, as this NeuroMET project aims to do.

This project provides a model to be followed in other areas, as the now decades-long reproduction of constructs across samples and instruments presents undeniable evidence of the metrological potential of log-interval scales. Continuing to ignore the massive amounts of accumulated evidence and validated theory supporting our capacity to think together in common languages is increasingly akin to willful ignorance and a neurotic state of denial.

Though the log-interval fifth “level” of measurement proposed by Stevens (1957, p. 177; 1959, p. 24) is almost never mentioned, Narens and Luce (1986) note:

  • that the natural sciences are “full of log-interval scales” (pH acidity, decibels, Richter scale, information function, etc.) (p. 169),
  • that “the scope of fundamental measurement is broader than Campbell had alleged” (p. 169),
  • that “it was only with the introduction of conjoint measurement–with its simple techniques and its possible applicability throughout the social sciences as well as physics–that this view [on the scope of fundamental measurement] became widely accepted” (p. 169), and
  • that additive conjoint models operationalizing log-interval scales in psychology and the social sciences (Rasch, 1960; Wright, 1968, 1977, 1999; Newby, et al., 2009; etc.) have “laid to rest the claim that the only possible basis for measurement is extensive structures” (p. 177; the formulation is sketched below).
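The log-interval structure at issue is explicit in Rasch’s own multiplicative formulation: the odds of success are a ratio of a person parameter to an item parameter, and taking logarithms yields the additive conjoint form,

    \frac{P_{ni}}{1-P_{ni}} = \frac{\xi_n}{\epsilon_i}, \qquad \ln\frac{P_{ni}}{1-P_{ni}} = \beta_n - \delta_i, \qquad \xi_n = e^{\beta_n}, \;\; \epsilon_i = e^{\delta_i},

placing person measures and item calibrations on one scale that is multiplicative in the parameters and interval in their logarithms, just as with pH or decibels.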

At the close of his inaugural address to the AERA Rasch Measurement SIG, Ben Wright (1988) said:

“So we come to my last words. The Rasch model is not a data model at all. You may use it with data, but it’s not a data model. The Rasch model is a definition of measurement, a law of measurement. Indeed it’s the law of measurement.”

In short, measurement is not primarily a function of centrally planned and controlled data gathering and analysis. It is rather primarily a matter of reading instruments calibrated in quality assured metrics at the point of use in distributed metrological systems. The whole point of mathematical proofs that scores are minimally sufficient estimators of the parameters in identified measurement models is to support the economy of thought achieved in distributed systems of measurement standards.
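The practical force of those sufficiency proofs is easy to demonstrate: in the Rasch model, a person’s maximum-likelihood estimate depends on the response pattern only through its raw score. A minimal sketch, with purely illustrative item calibrations:

    import numpy as np

    def person_mle(x, d, theta=0.0, iters=50):
        """Rasch person estimate: the Newton update depends on the
        response vector x only through its sum, the raw score."""
        for _ in range(iters):
            p = 1 / (1 + np.exp(-(theta - d)))
            theta += (x.sum() - p.sum()) / (p * (1 - p)).sum()
        return theta

    d = np.array([-1.5, -0.5, 0.0, 0.5, 1.5])  # illustrative calibrations
    pattern_a = np.array([1, 1, 1, 0, 0])      # raw score 3, easiest items correct
    pattern_b = np.array([0, 1, 1, 1, 0])      # raw score 3, different items correct
    print(person_mle(pattern_a, d))            # identical estimates: the score
    print(person_mle(pattern_b, d))            # carries all the information

Once an instrument’s calibrations are fixed, then, a score read at the point of use maps directly onto a measure in the common unit, with no central reanalysis required.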

The labor-saving symbolization of converging construct theories and experimental evidence brings what has been learned in measurement research into measurement practice. Failing to follow through from the proofs of sufficiency and the decades of evidence supporting their practical utility to the creation of metrological systems is a perverse travesty of reason, an inane refusal to accept the gifts of beauty and meaning being offered by these unasked-for but incredibly persistent and real structural invariances.

How can it be that the entire histories of education and health care in every human culture globally are based in developmental sequences and healing trajectories that have been documented and observed ad nauseam for millennia, without having been brought into common languages of measurement and management? In an age as numbed by repeated shocks as ours, it may seem impossible to be dumbfounded by much of anything, but, to me, this just takes the cake. On the one hand, the need to be able to work together to address catastrophically urgent issues has never been greater; on the other hand, we refuse to take up and use the solutions we hold in our hands. What on earth is going on?

A growing body of publications substantiates the structures, processes, and outcomes of metrological research in psychology and the social sciences. In addition to Pendrill (2019), Mari, Wilson, and Maul (2021), Fisher (2009, 2012, etc.), and others, watch for the forthcoming Fisher and Cano (2022), which offers chapters by Andrich, Linacre, Massof, Melin, Pendrill, and others.

Kudos to the NeuroMET project team!

William Fisher

Fisher, W. P., Jr. (2009). Invariance and traceability for measures of human, social, and natural capital: Theory and application. Measurement, 42(9), 1278-1287.

Fisher, W. P., Jr. (2012, June 1). What the world needs now: A bold plan for new standards [Third place, 2011 NIST/SES World Standards Day paper competition]. Standards Engineering, 64(3), 1 & 3-5 [http://ssrn.com/abstract=2083975].

Fisher, W. P., Jr., & Cano, S. (Eds.). (2022). Person-centred outcome metrology. (R. Morawski, G. Rossi, et al., Series Eds.). (Springer Series in Measurement Science and Technology). Springer.

Mari, L., Wilson, M., & Maul, A. (2021). Measurement across the sciences (R. Morawski, G. Rossi, et al., Series Eds.). (Springer Series in Measurement Science and Technology). Springer.

Narens, L., & Luce, R. D. (1986). Measurement: The theory of numerical assignments. Psychological Bulletin, 99(2), 166-180.

Newby, V. A., Conner, G. R., Grant, C. P., & Bunderson, C. V. (2009). The Rasch model and additive conjoint measurement. Journal of Applied Measurement, 10(4), 348-354.

Pendrill, L. R. (2019). Quality assured measurement: Unification across social and physical sciences. (R. Morawski, G. Rossi, et al., Series Eds.). (Springer Series in Measurement Science and Technology). Springer.

Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests (Reprint, with Foreword and Afterword by B. D. Wright, Chicago: University of Chicago Press, 1980). Danmarks Paedagogiske Institut.

Rasch, G. (1961). On general laws and the meaning of measurement in psychology. In J. Neyman (Ed.), Proceedings of the fourth Berkeley symposium on mathematical statistics and probability: Volume IV: Contributions to biology and problems of medicine (pp. 321-333 [http://www.rasch.org/memo1960.pdf]). University of California Press.

Stevens, S. S. (1957). On the psychophysical law. Psychological Review, 64(3), 153-181.

Stevens, S. S. (1959). Measurement, psychophysics and utility. In C. W. Churchman & P. Ratoosh (Eds.), Measurement: Definitions and theories (pp. 18-63). Wiley.

Wright, B. D. (1968). Sample-free test calibration and person measurement. In Proceedings of the 1967 invitational conference on testing problems (pp. 85-101 [http://www.rasch.org/memo1.htm]). Educational Testing Service.

Wright, B. D. (1977). Solving measurement problems with the Rasch model. Journal of Educational Measurement, 14(2), 97-116 [http://www.rasch.org/memo42.htm].

Wright, B. D. (1988). Georg Rasch and measurement. Rasch Measurement Transactions, 2(3), 25-32 [http://www.rasch.org/rmt/rmt23a.htm].

Wright, B. D. (1999). Fundamental measurement for psychology. In S. E. Embretson & S. L. Hershberger (Eds.), The new rules of measurement: What every educator and psychologist should know (pp. 65-104 [http://www.rasch.org/memo64.htm]). Lawrence Erlbaum Associates.

Wright, B. D., & Stone, M. H. (1979). Best test design: Rasch measurement. MESA Press.

Day One Memo to the Biden-Harris Administration

January 5, 2021

William P. Fisher, Jr.

Living Capital Metrics LLC, BEAR Center, Graduate School of Education, UC Berkeley, and

the Research Institute of Sweden, Gothenburg

4 January 2021

I. Summary

As was observed by Reginald McGregor in the STEM learning ecosystems Zoom call today preparing for the Biden-Harris Town Hall meetings, past policies addressing equity, quality programming, funding, professional development, after school/school alignment, and other issues in education have not had the desired impacts on outcomes. McGregor then asked, what must we do differently to obtain the results we want and need? In short, what we must do differently is to focus systematically on how to create a viral contagion of trust–not just with each other but with our data and our institutions. Trust depends intrinsically on verifiable facts, personal ownership, and proven productive consequences–and we have a wealth of untapped resources for systematically building trust in mass scalable ways, for creating a social contagion of trust that disseminates the authentic wealth of learning and valued relationships. This proposal describes those resources, where they can be found, who the experts in these areas are, which agencies have historically been involved in developing them, what is being done to put them to work, and how we should proceed from here. Because it will set the tone for everything that follows, and because there is no better time for such a seismic shift in the ground than at the beginning, a clear and decisive statement of what needs to be done differently ought to be a Day One priority for the Biden-Harris administration. Though this memo was initiated in response to the STEM learning ecosystems town hall meetings, its theme is applicable across a wide range of policy domains, and should be read as such.

II. Challenge and Opportunity

What needs to be done differently hinges on the realization that a theme common to all of the issues identified by McGregor concerns the development of trusting relationships. Igniting viral contagions of trust systematically at mass scales requires accomplishing two apparently contradictory goals simultaneously: creating communications and information standards that are both universally transparent and individually personalized. It may appear that these two goals cannot be achieved at the same time, but in fact they are already integrated in everyday language. The navigable continuity of communications and information standards need not be inconsistent with the unique strengths, weaknesses, and creative improvisations of custom-tailored local conversations. Standards do not automatically entail pounding square pegs into round holes.

Transparent communications of meaningful high quality information cultivate trust by inspiring confidence in the repeated veracity and validity of what is said. Capacities for generalizing lessons learned across localities augment that trust and support the spread of innovations. Personalized information applicable to unique individual circumstances cultivates trust as students, teachers, parents, administrators, researchers, employers, and others are each able (a) to recognize their own special uniqueness reflected in information on their learning outcomes, (b) to see the patterns of their learning and growth reflected in that information over time, and (c) to see themselves in others’ information, and others in themselves. Systematic support and encouragement for policies and practices integrating these seemingly contradictory goals would constitute truly new approaches to old problems. Given that longstanding and widespread successes in combining these goals have already been achieved, new hope for resounding impacts becomes viable, feasible, and desirable.

III. Plan of Action

To stop the maddening contradiction of expecting different results from repetitions of the same behaviors, decisive steps must be taken toward making better use of existing models and methods, ones that coherently inform new behaviors leading to new outcomes. We are not speaking here of small incremental gains produced via intensive but microscopically focused efforts. We are raising the possibility that we may be capable of igniting viral contagions of trust. Just as the Arab Spring was in many ways fostered by the availability of new and unfettered technologically mediated social networks like Facebook and Twitter, so, also, will the creation of new outcomes communications platforms in education, healthcare, social services, and environmental resource management unleash powerful social forces. In the same way that smartphones are both incredibly useful for billions of people globally and are also highly technical devices involving complexities beyond the ken of the vast majority of those using them, so, too, do the complex models and methods at issue here have similar potentials for mass scaling.

To efficiently share transferable lessons as to what works, we need the common quantitative languages of outcome measurement standards, where (a) quantities are defined not in the ordinal terms of test scores but in the interval terms of metrologically traceable units with associated uncertainties, and (b) where those quantities are estimated not from just one set of assessment questions or items but from linked collections of diverse arrays of different kinds of self, observational, portfolio, peer, digital, and other assessments (or even from theory). To support individuals’ creative improvisations and unique circumstances, those standards, like the alphabets, grammars, and dictionaries setting the semiotic standards of everyday language, must enable new kinds of qualitative conversations negotiating the specific hurdles of local conditions. Custom tailored individual reports making use of interval unit estimates and uncertainties have been in use globally for decades.
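To make the linking of diverse assessments concrete, here is a minimal sketch of common-item equating under a Rasch model; the anchor calibrations are invented stand-ins for real assessment data:

    import numpy as np

    # Hypothetical difficulty calibrations for five anchor items shared by
    # two assessments that were originally scaled separately.
    form_a = np.array([-1.20, -0.35, 0.10, 0.85, 1.40])  # Form A's scale
    form_b = np.array([-0.55, 0.30, 0.70, 1.50, 2.05])   # same items, Form B's scale

    # Under the Rasch model the two scales can differ only by a shift, so
    # the mean anchor difference links the units.
    shift = (form_a - form_b).mean()

    # Any measure from Form B is now expressible in the common unit:
    b_measure = 1.10
    print(f"shift = {shift:+.2f}; equated measure = {b_measure + shift:+.2f}")

With larger item banks, the same logic connects self-assessments, observations, portfolios, and other instruments to one traceable unit, with misfitting anchors flagged and dropped along the way.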

Existing efforts in this area have been underway since the work of Thurstone in the 1920s, Rasch and Wright in the period from the 1950s through the 1990s, and of thousands of others since then. Over the course of the last several decades, the work of these innovators has been incorporated into hundreds of research studies funded by the Institute of Education Sciences, the National Science Foundation, and the National Institutes of Health. Most of these applications have, however, been hobbled by limited conceptualizations restricting expectations to the narrow terms of statistical hypothesis testing instead of opening onto the far more expansive possibilities offered by an integration of metrological standards and individualized reporting. This is a key way of expressing the crux of the shift proposed here. We are moving away from merely numeric statistical operations conducted via centrally planned and controlled analytic methods, and toward fully quantitative quality-assured measurement operations conducted via widely distributed and socially self-organized methods.

Because history shows existing institutions rarely successfully alter their founding principles, it is likely necessary for a government agency previously not involved in this work to now take the lead. That agency should be the National Institute of Standards and Technology (NIST). This recommendation is supported by the recent emergence of new alliances of psychometricians and metrologists clarifying the theory and methods needed for integrating the two seemingly opposed goals of comparable standards and custom tailored applications. The International Measurement Confederation (IMEKO) of national metrology institutes has provided a forum for reports in this area since 2008, as has, since 2017, the International Metrology Congress, held in Paris. An international meeting bringing together equal numbers of metrologists and psychometricians was held at UC Berkeley in 2016 (NIST’s Antonio Possolo gave a keynote), dozens of peer-reviewed journal articles in this new area have appeared since 2009, two authoritative books have appeared since 2019, and multiple ongoing collaborations internationally focused on the development of new unit standards and traceable instrumentation for education, health care, and other fields are underway.

Important leaders in this area capable of guiding the formation of the measurement-specific policies for research and practice include David Andrich (U Western Australia, Perth), Matt Barney (Leaderamp, Vacaville, CA), Betty Bergstrom (Pearson VUE, Chicago), Stefan Cano (Modus Outcomes, UK), Theo Dawson (Lectica, Northampton, MA), Peter Hagell (U Kristianstad, Sweden), Martin Ho (FDA), Mike Linacre (Winsteps.com), Larry Ludlow (Boston College), Luca Mari (U Cattaneo, Italy), Robert Massof (Johns Hopkins), Andrew Maul (UC Santa Barbara), Jeanette Melin (RISE, Sweden), Janice Morrison (TIES, Cleveland), Leslie Pendrill (RISE, Sweden), Maureen Powers (Gemstone Optometry, Berkeley), Andrea Pusic (Brigham & Women’s, Boston), Matthew Rabbitt (USDA), Thomas Salzberger (U Vienna, Austria), Karen Schmidt (U Virginia), Mark Wilson (UC Berkeley), and many others.

Partnerships across economic sectors are essential to the success of this initiative. Standards provide the media by which different groups of stakeholders can advance their unique interests more effectively in partnership than they can in isolation. Calls for proposals should stress the vital importance of establishing the multidisciplinary functionality of boundary objects residing at the borders between disciplines. Just as has been accomplished for the SI Unit metrological standards in the natural sciences, educators’ needs for comparable but customized information must be aligned with the analogous needs of stakeholders in other domains, such as management, clinical practice, law, accounting, finance, economics, etc. Of the actors in this domain listed above, at this time, the Research Institute of Sweden (RISE) is most energetically engaged in forming the needed cross-disciplinary collaborations.

Though the complexity and cost of such efforts appear almost insurmountable, beginning the process of envisioning how to address the challenges and capitalize on the opportunities is far more realistic and productive than continuing to flounder without direction, as we currently are and have been for decades. Estimates of the cost of creating, maintaining, and improving existing standards come to about 8% of GDP, with returns on investment estimated by NIST to be in the range of about 40% to over 400%, with a mean of about 140%. The levels of investment needed in the new metrological efforts, and the returns to be gained from those investments, will not likely differ significantly from these estimates.

IV. Conclusion

This proposal is important because it offers a truly original response to the question of what needs to be done differently in STEM education and elsewhere to avoid continuing to reproduce the same tired and ineffective results. The originality of the proposal is complemented by the depth at which it taps the historical successes of the natural sciences and the economics of standards: efficient markets for trading on trust in productive ways could lead to viral contagions of caring relationships. The proposal is also supported by the intuitive plausibility of taking natural language as a model for the creation of new common languages for the communication and improvement of learning, healthcare, employment, and other outcomes. As is the case for any authentic paradigm shift, opposition to the proposal is usually rooted in assumptions that existing expertise, methods, and tools are sufficient to the task, even when massive amounts of evidence point to the need for change. Simple, small, and inexpensive projects can be designed as tests of the concept and as means of attracting interest in the paradigm shift. Convening cross-sector groups of collaborators for the purposes of designing and conducting small demonstration projects may be an effective way of beginning. Finally, the potential for creating economically self-sustaining cycles of investments and returns could be an attractive way of incentivizing private sector participation, especially when this is expressed in terms of the alignment of financial wealth with the authentic wealth of trusting relationships.

V. About the author

William P. Fisher, Jr., Ph.D. received his doctorate from the University of Chicago, where he was mentored by Benjamin D. Wright and supported by a Spencer Foundation Dissertation Research Fellowship. He has been on the staff of the BEAR Center in the Graduate School of Education at UC Berkeley since 2011, and has consulted independently via Living Capital Metrics LLC since 2009. In 2020, Dr. Fisher joined the staff of the Research Institute of Sweden as a Senior Research Scientist. Dr. Fisher is recognized for contributions to measurement theory and practice that span the full range from the philosophical to the applied in fields as diverse as special education, mindfulness practice, nursing, rehabilitation, clinical chemistry, metrology, health outcomes, and survey research.

VI. Supporting literature

Andrich, David. “A Rating Formulation for Ordered Response Categories.” Psychometrika 43, no. 4, December 1978: 561-73.

Andrich, David. Rasch Models for Measurement. Sage University Paper Series on Quantitative Applications in the Social Sciences, series no. 07-068. Beverly Hills, California: Sage, 1988.

Andrich, David, and Ida Marais. A Course in Rasch Measurement Theory: Measuring in the Educational, Social, and Health Sciences. Cham, Switzerland: Springer, 2019.

Barber, John M. “Economic Rationale for Government Funding of Work on Measurement Standards.” In Review of DTI Work on Measurement Standards, ed. R. Dobbie, J. Darrell, K. Poulter and R. Hobbs, Annex 5. London: Department of Trade and Industry, 1987.

Barney, Matt, and William P. Fisher, Jr. “Adaptive Measurement and Assessment.” Annual Review of Organizational Psychology and Organizational Behavior 3, April 2016: 469-90.

Cano, Stefan, Leslie Pendrill, Jeanette Melin, and William P. Fisher, Jr. “Towards Consensus Measurement Standards for Patient-Centered Outcomes.” Measurement 141, 2019: 62-69, https://doi.org/10.1016/j.measurement.2019.03.056.

Chien, Tsair-Wei, John Michael Linacre, and Wen-Chung Wang. “Examining Student Ability Using KIDMAP Fit Statistics of Rasch Analysis in Excel.” In Communications in Computer and Information Science, ed. Honghua Tan and Mark Zhou, 578-85. Berlin: Springer Verlag, 2011.

Chuah, Swee-Hoon, and Robert Hoffman. The Evolution of Measurement Standards. Tech. Rept. no. 5. Nottingham, England: Nottingham University Business School, 2004.

Fisher, William P., Jr. “The Mathematical Metaphysics of Measurement and Metrology: Towards Meaningful Quantification in the Human Sciences.” In Renascent Pragmatism: Studies in Law and Social Science, ed. Alfonso Morales, 118-53. Brookfield, VT: Ashgate Publishing Co., 2003.

Fisher, William P., Jr. “Meaning and Method in the Social Sciences.” Human Studies: A Journal for Philosophy and the Social Sciences 27, no. 4, October 2004: 429-54.

Fisher, William P., Jr. “Invariance and Traceability for Measures of Human, Social, and Natural Capital: Theory and Application.” Measurement 42, no. 9, November 2009: 1278-87.

Fisher, William P., Jr. NIST Critical National Need Idea White Paper: Metrological Infrastructure for Human, Social, and Natural Capital. Tech. Rept. no. http://www.nist.gov/tip/wp/pswp/upload/202_metrological_infrastructure_for_human_social_natural.pdf. Washington, DC: National Institute for Standards and Technology, 2009.

Fisher, William P., Jr. “Measurement, Reduced Transaction Costs, and the Ethics of Efficient Markets for Human, Social, and Natural Capital,” Bridge to Business Postdoctoral Certification, Freeman School of Business, Tulane University, 2010, https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2340674.

Fisher, William P., Jr. “What the World Needs Now: A Bold Plan for New Standards [Third Place, 2011 NIST/SES World Standards Day Paper Competition].” Standards Engineering 64, no. 3, 1 June 2012: 1 & 3-5 [http://ssrn.com/abstract=2083975].

Fisher, William P., Jr. “Imagining Education Tailored to Assessment as, for, and of Learning: Theory, Standards, and Quality Improvement.” Assessment and Learning 2, 2013: 6-22.

Fisher, William P., Jr. “Metrology, Psychometrics, and New Horizons for Innovation.” 18th International Congress of Metrology, Paris, September 2017: 09007, doi: 10.1051/metrology/201709007.

Fisher, William P., Jr. “A Practical Approach to Modeling Complex Adaptive Flows in Psychology and Social Science.” Procedia Computer Science 114, 2017: 165-74, https://doi.org/10.1016/j.procs.2017.09.027.

Fisher, William P., Jr. “Modern, Postmodern, Amodern.” Educational Philosophy and Theory 50, 2018: 1399-400. Reprinted in What Comes After Postmodernism in Educational Theory? ed. Michael Peters, Marek Tesar, Liz Jackson and Tina Besley, 104-105, New York: Routledge, DOI: 10.1080/00131857.2018.1458794.

Fisher, William P., Jr. “Contextualizing Sustainable Development Metric Standards: Imagining New Entrepreneurial Possibilities.” Sustainability 12, no. 9661, 2020: 1-22, https://doi.org/10.3390/su12229661.

Fisher, William P., Jr. “Measurements Toward a Future SI.” In Sensors and Measurement Science International (SMSI) 2020 Proceedings, ed. Gerald Gerlach and Klaus-Dieter Sommer, 38-39. Wunstorf, Germany: AMA Service GmbH, 2020, https://www.smsi-conference.com/assets/Uploads/e-Booklet-SMSI-2020-Proceedings.pdf.

Fisher, William P., Jr. “Wright, Benjamin D.” In SAGE Research Methods Foundations, ed. P. Atkinson, S. Delamont, A. Cernat, J. W. Sakshaug and R.A. Williams. Thousand Oaks, CA: Sage Publications, 2020, https://methods.sagepub.com/foundations/wright-benjamin-d.

Fisher, William P., Jr., and A. Jackson Stenner. “Theory-Based Metrological Traceability in Education: A Reading Measurement Network.” Measurement 92, 2016: 489-96, http://www.sciencedirect.com/science/article/pii/S0263224116303281.

Fisher, William P., Jr., and Mark Wilson. “Building a Productive Trading Zone in Educational Assessment Research and Practice.” Pensamiento Educativo: Revista de Investigacion Educacional Latinoamericana 52, no. 2, 2015: 55-78, https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2688260.

Gallaher, Michael P., Brent R. Rowe, Alex V. Rogozhin, Stephanie A. Houghton, J. Lynn Davis, Michael K. Lamvik, and John S. Geikler. Economic Impact of Measurement in the Semiconductor Industry. Tech. Rept. no. 07-2. Gaithersburg, MD: National Institute for Standards and Technology, 2007.

He, W., and G. G. Kingsbury. “A Large-Scale, Long-Term Study of Scale Drift: The Micro View and the Macro View.” Journal of Physics Conference Series 772, 2016: 012022, https://iopscience.iop.org/article/10.1088/1742-6596/772/1/012022/meta.

Holster, Trevor A., and J. W. Lake. “From Raw Scores to Rasch in the Classroom.” Shiken 19, no. 1, April 2015: 32-41.

Hunter, J. Stuart. “The National System of Scientific Measurement.” Science 210, no. 21, 1980: 869-74.

Linacre, John Michael. “Individualized Testing in the Classroom.” In Advances in Measurement in Educational Research and Assessment, ed. Geoffrey N. Masters and John P. Keeves, 186-94. New York: Pergamon, 1999.

Mari, Luca, and Mark Wilson. “An Introduction to the Rasch Measurement Approach for Metrologists.” Measurement 51, May 2014: 315-27, http://www.sciencedirect.com/science/article/pii/S0263224114000645.

Mari, Luca, Mark Wilson, and Andrew Maul. Measurement Across the Sciences [in Press]. Springer Series in Measurement Science and Technology. Cham: Springer, 2021.

Massof, Robert W. “Editorial: Moving Toward Scientific Measurements of Quality of Life.” Ophthalmic Epidemiology 15, 1 August 2008: 209-11.

Masters, Geoffrey N. “KIDMAP – a History.” Rasch Measurement Transactions 8, no. 2, 1994: 366 [http://www.rasch.org/rmt/rmt82k.htm].

Morrison, Jan, and William P. Fisher, Jr. “Connecting Learning Opportunities in STEM Education: Ecosystem Collaborations Across Schools, Museums, Libraries, Employers, and Communities.” Journal of Physics: Conference Series 1065, no. 022009, 2018, doi:10.1088/1742-6596/1065/2/022009.

Morrison, Jan, and William P. Fisher, Jr. “Measuring for Management in Science, Technology, Engineering, and Mathematics Learning Ecosystems.” Journal of Physics: Conference Series 1379, no. 012042, 2019, doi:10.1088/1742-6596/1379/1/012042.

National Institute for Standards and Technology. “Appendix C: Assessment Examples. Economic Impacts of Research in Metrology.” In Assessing Fundamental Science: A Report from the Subcommittee on Research, Committee on Fundamental Science, ed. Committee on Fundamental Science Subcommittee on Research. Washington, DC: National Standards and Technology Council, 1996, https://wayback.archive-it.org/5902/20150628164643/http://www.nsf.gov/statistics/ostp/assess/nstcafsk.htm#Topic%207.

National Institute for Standards and Technology. Outputs and Outcomes of NIST Laboratory Research. 18 December 2009. NIST. Last visited 18 April 2020 <https://www.nist.gov/director/outputs-and-outcomes-nist-laboratory-research>.

North, Douglass C. Structure and Change in Economic History. New York: W. W. Norton & Co., 1981.

Pendrill, Leslie. Quality Assured Measurement: Unification Across Social and Physical Sciences. Cham: Springer, 2019.

Pendrill, Leslie, and William P. Fisher, Jr. “Counting and Quantification: Comparing Psychometric and Metrological Perspectives on Visual Perceptions of Number.” Measurement 71, 2015: 46-55, doi: http://dx.doi.org/10.1016/j.measurement.2015.04.010.

Poposki, Nicola, Nineta Majcen, and Philip Taylor. “Assessing Publically Financed Metrology Expenditure Against Economic Parameters.” Accreditation and Quality Assurance: Journal for Quality, Comparability and Reliability in Chemical Measurement 14, no. 7, July 2009: 359-68.

Rasch, Georg. Probabilistic Models for Some Intelligence and Attainment Tests. Reprint, University of Chicago Press, 1980. Copenhagen, Denmark: Danmarks Paedagogiske Institut, 1960.

Rasch, Georg. “On General Laws and the Meaning of Measurement in Psychology.” In Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability: Volume IV: Contributions to Biology and Problems of Medicine, ed. Jerzy Neyman, 321-33 [http://www.rasch.org/memo1960.pdf]. Berkeley: University of California Press, 1961.

Solloway, Sharon, and William P. Fisher, Jr. “Mindfulness in Measurement: Reconsidering the Measurable in Mindfulness.” International Journal of Transpersonal Studies 26, 2007: 58-81 [http://digitalcommons.ciis.edu/ijts-transpersonalstudies/vol26/iss1/8].

Stenner, A. Jackson, William P. Fisher, Jr., Mark H. Stone, and Don S. Burdick. “Causal Rasch Models.” Frontiers in Psychology: Quantitative Psychology and Measurement 4, no. 536, August 2013: 1-14 [doi: 10.3389/fpsyg.2013.00536].

Sumner, Jane, and William P. Fisher, Jr. “The Moral Construct of Caring in Nursing as Communicative Action: The Theory and Practice of a Caring Science.” Advances in Nursing Science 31, no. 4, 2008: E19-36.

Swann, G. M. P. The Economics of Metrology and Measurement. Report for the National Measurement Office and Department of Business, Innovation and Skills. London, England: Innovative Economics, Ltd, 2009.

Williamson, Gary L. “Exploring Reading and Mathematics Growth Through Psychometric Innovations Applied to Longitudinal Data.” Cogent Education 5, no. 1464424, 2018: 1-29.

Wilson, Mark, Ed. Towards Coherence Between Classroom Assessment and Accountability. National Society for the Study of Education, vol. 103, Part II. Chicago: University of Chicago Press, 2004.

Wilson, Mark R. Constructing Measures. Mahwah, NJ: Lawrence Erlbaum Associates, 2005.

Wilson, Mark R. “Seeking a Balance Between the Statistical and Scientific Elements in Psychometrics.” Psychometrika 78, no. 2, April 2013: 211-36.

Wilson, Mark. “Making Measurement Important for Education: The Crucial Role of Classroom Assessment.” Educational Measurement: Issues and Practice 37, no. 1, 2018: 5-20.

Wilson, Mark, and William P. Fisher, Jr. “Preface: 2016 IMEKO TC1-TC7-TC13 Joint Symposium: Metrology Across the Sciences: Wishful Thinking?” Journal of Physics Conference Series 772, no. 1, 2016: 011001, http://iopscience.iop.org/article/10.1088/1742-6596/772/1/011001/pdf.

Wilson, Mark, and William P. Fisher, Jr., Eds. Psychological and Social Measurement: The Career and Contributions of Benjamin D. Wright. Springer Series in Measurement Science and Technology, ed. M. G. Cain, G. B. Rossi, J. Tesař, M. van Veghel and K.-Y. Jhang. Cham, Switzerland: Springer Nature, 2017, https://link.springer.com/book/10.1007/978-3-319-67304-2.

Wilson, Mark, and William P. Fisher, Jr. “Preface of Special Issue, Psychometric Metrology.” Measurement 145, 2019: 190, https://www.sciencedirect.com/journal/measurement/special-issue/10C49L3R8GT.

Wilson, Mark, and Kathleen Scalise. “Assessment of Learning in Digital Networks.” In Assessment and Teaching of 21st Century Skills: Methods and Approach, ed. Patrick Griffin and Esther Care, 57-81. Dordrecht: Springer Netherlands, 2015.

Wilson, Mark, and Y. Toyama. “Formative and Summative Assessments in Science and Literacy Integrated Curricula: A Suggested Alternative Approach.” In Language, Literacy, and Learning in the STEM Disciplines, ed. Alison L. Bailey, Carolyn A. Maher and Louise C. Wilkinson, 231-60. New York: Routledge, 2018.

Wright, Benjamin D. “Sample-Free Test Calibration and Person Measurement.” In Proceedings of the 1967 Invitational Conference on Testing Problems, 85-101 [http://www.rasch.org/memo1.htm]. Princeton, New Jersey: Educational Testing Service, 1968.

Wright, Benjamin D. “Solving Measurement Problems with the Rasch Model.” Journal of Educational Measurement 14, no. 2, 1977: 97-116 [http://www.rasch.org/memo42.htm].

Wright, Benjamin D. “Despair and Hope for Educational Measurement.” Contemporary Education Review 3, no. 1, 1984: 281-88 [http://www.rasch.org/memo41.htm].

Wright, Benjamin D. “Additivity in Psychological Measurement.” In Measurement and Personality Assessment, ed. Edward Roskam, 101-12. North Holland: Elsevier Science Ltd, 1985.

Wright, Benjamin D. “A History of Social Science Measurement.” Educational Measurement: Issues and Practice 16, no. 4, Winter 1997: 33-45, 52. https://doi.org/10.1111/j.1745-3992.1997.tb00606.x.

Wright, Benjamin D., and G. N. Masters. Rating Scale Analysis. Chicago: MESA Press, 1982. Full text: https://www.rasch.org/BTD_RSA/pdf%20%5Breduced%20size%5D/Rating%20Scale%20Analysis.pdf.

Wright, Benjamin D., R. J. Mead, and L. H. Ludlow. KIDMAP: Person-by-Item Interaction Mapping. Tech. Rept. no. MESA Memorandum #29. Chicago: MESA Press [http://www.rasch.org/memo29.pdf], 1980.

Wright, Benjamin D., and Mark H Stone. Best Test Design. Chicago: MESA Press, 1979, Full text: https://www.rasch.org/BTD_RSA/pdf%20%5Breduced%20size%5D/Best%20Test%20Design.pdf.


Measurement Choices in Sustainable Development

June 28, 2020

Dividing Us, or Unifying Us?

Showing the Way, or Leading Astray?

Sustainable development measurement choices have significant effects on our capacities to coordinate and manage our efforts. The usual approach to sustainability metrics requires that all parties comparing impacts use the same indicators. Communities or organizations using different metrics are not comparable. Applications of the metrics to judge progress or to evaluate the effects of different programs focus on comparing results from individual indicators. The indicators with the biggest differences are the areas in which accomplishments are rewarded, or failings provoke rethinking.

A number of scientific and logical problems can be identified in this procedure, and these will be taken up in due course. At the moment, however, let us only note that advanced scientific modeling approaches to measuring sustainable development do not require all parties to employ the same indicators, since different sets of indicators can be made comparable via instrument equating and item banking methods. And instead of focusing on differences across indicators, these alternative approaches use the indicators to map the developmental sequence. These maps enable end users to locate and orient themselves relative to where they have been, where they want to go, and where to go next on their sustainability journey.

Separating sustainable development efforts into incommensurable domains becomes a thing of the past when advanced scientific modeling approaches are used. At the same time, these modeling approaches also plot navigable maps of the sustainability terrain.

Scientific modeling of sustainability measures offers other advantages as well.

  • First, scientific measures always contextualize reported quantities with a standard error term, whereas typical metrics are reported as though they are perfectly precise, with no uncertainty.
  • Second, scientific measures are calibrated as interval measures on the basis of predictive theory and experimental evidence, whereas sustainability metrics are typically ordinal counts of events (persons served, etc.), percentages, or ratings.
  • Third, scientific measures summarize multiple indicators in a single quantity and uncertainty term, with no loss of information, whereas sustainability metrics are often reported as large volumes of numbers.

The advantages of investing in a scientific measurement modeling approach follow from its combination of general comparability across data sets, the mapping of the thing measured, the reporting of uncertainty terms, the interval quantities produced, and the elimination of information overload.
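For the probabilistic measurement models cited below, the first of these advantages takes a simple closed form: the uncertainty attached to each reported measure is the inverse square root of the information supplied by the indicators,

    SE(\hat{\beta}) = \frac{1}{\sqrt{\sum_i p_i (1 - p_i)}},

where p_i is the modeled probability of success on indicator i, so that adding well-targeted indicators visibly tightens the reported uncertainty.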

For more information, see other entries in this blog and:

Fisher, W. P., Jr. (2007, Summer). Living capital metrics. Rasch Measurement Transactions, 21(1), 1092-1093 [http://www.rasch.org/rmt/rmt211.pdf].

Fisher, W. P., Jr. (2012, June 1). What the world needs now: A bold plan for new standards [Third place, 2011 NIST/SES World Standards Day paper competition]. Standards Engineering, 64(3), 1 & 3-5 [http://ssrn.com/abstract=2083975].

Fisher, W. P., Jr. (2012). Measure and manage: Intangible assets metric standards for sustainability. In J. Marques, S. Dhiman & S. Holt (Eds.), Business administration education: Changes in management and leadership strategies (pp. 43-63). New York: Palgrave Macmillan.

Fisher, W. P., Jr. (2013). Imagining education tailored to assessment as, for, and of learning: Theory, standards, and quality improvement. Assessment and Learning, 2, 6-22.

Fisher, W. P., Jr. (2020). Measurements toward a future SI: On the longstanding existence of metrology-ready precision quantities in psychology and the social sciences. In G. Gerlach & K.-D. Sommer (Eds.), SMSI 2020 Proceedings (pp. 38-39). Wunstorf, Germany: AMA Service GmbH. Retrieved from https://www.smsi-conference.com/assets/Uploads/e-Booklet-SMSI-2020-Proceedings.pdf

Fisher, W. P., Jr. (2020). Measuring genuine progress: An example from the UN Millennium Development Goals project. Journal of Applied Measurement, 21(1), 110-133.

Fisher, W. P., Jr., Pendrill, L., Lips da Cruz, A., & Felin, A. (2019). Why metrology? Fair dealing and efficient markets for the United Nations’ Sustainable Development Goals. Journal of Physics: Conference Series, 1379(012023). doi:10.1088/1742-6596/1379/1/012023

Fisher, W. P., Jr., & Stenner, A. J. (2016). Theory-based metrological traceability in education: A reading measurement network. Measurement, 92, 489-496. Retrieved from http://www.sciencedirect.com/science/article/pii/S0263224116303281

Fisher, W. P., Jr., & Stenner, A. J. (2017, September 18). Towards an alignment of engineering and psychometric approaches to uncertainty in measurement: Consequences for the future. 18th International Congress of Metrology, 12004, 1-9. Retrieved from https://doi.org/10.1051/metrology/201712004

Fisher, W. P., Jr., & Stenner, A. J. (2018). Ecologizing vs modernizing in measurement and metrology. Journal of Physics Conference Series, 1044(012025), [http://iopscience.iop.org/article/10.1088/1742-6596/1044/1/012025].

Fisher, W. P., Jr., & Wilson, M. (2015). Building a productive trading zone in educational assessment research and practice. Pensamiento Educativo: Revista de Investigacion Educacional Latinoamericana, 52(2), 55-78. Retrieved from https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2688260

Fisher, W. P., Jr., & Wilson, M. (2020). An online platform for sociocognitive metrology: The BEAR Assessment System Software. Measurement Science and Technology, 31(034006). Retrieved from https://iopscience.iop.org/article/10.1088/1361-6501/ab5397/meta

Fisher, W. P., Jr., & Wright, B. D. (Eds.). (1994). Applications of probabilistic conjoint measurement. International Journal of Educational Research, 21(6), 557-664.

Lips da Cruz, A., Fisher, W. P., Jr., Felin, A., & Pendrill, L. (2019). Accelerating the realization of the United Nations Sustainability Development Goals through metrological multi-stakeholder interoperability. Journal of Physics: Conference Series, 1379(012046).

Mari, L., & Wilson, M. (2014, May). An introduction to the Rasch measurement approach for metrologists. Measurement, 51, 315-327. Retrieved from http://www.sciencedirect.com/science/article/pii/S0263224114000645

Mari, L., & Wilson, M. (2020). Measurement across the sciences [in press]. Cham: Springer.

Pendrill, L. (2019). Quality assured measurement: Unification across social and physical sciences. Cham: Springer.

Pendrill, L., & Fisher, W. P., Jr. (2015). Counting and quantification: Comparing psychometric and metrological perspectives on visual perceptions of number. Measurement, 71, 46-55. doi: http://dx.doi.org/10.1016/j.measurement.2015.04.010

Pendrill, L., & Petersson, N. (2016). Metrology of human-based and other qualitative measurements. Measurement Science and Technology, 27(9), 094003. Retrieved from https://doi.org/10.1088/0957-0233/27/9/094003

Wilson, M., & Fisher, W. (2016). Preface: 2016 IMEKO TC1-TC7-TC13 Joint Symposium: Metrology across the Sciences: Wishful Thinking? Journal of Physics Conference Series, 772(1), 011001. Retrieved from http://iopscience.iop.org/article/10.1088/1742-6596/772/1/011001/pdf

Wilson, M., & Fisher, W. (2019). Preface of special issue, Psychometric Metrology. Measurement, 145, 190. Retrieved from https://www.sciencedirect.com/journal/measurement/special-issue/10C49L3R8GT

Wilson, M., Mari, L., Maul, A., & Torres Irribarra, D. (2015). A comparison of measurement concepts across physical science and social science domains: Instrument design, calibration, and measurement. Journal of Physics Conference Series, 588(012034). Retrieved from http://iopscience.iop.org/1742-6596/588/1/012034

Remarks on a Survey Concerning Results Replications in Psychology

May 22, 2020

(The following reply was sent in response to an invitation from researchers at the University of Queensland in Brisbane, Australia to participate in a survey on the replication crisis in psychology.)

Thank you for alerting me to your important survey, and for providing an opportunity to address issues of results replications in psychology. Given the content of the survey, it seems appropriate to offer an alternative perspective on the nature of the situation.

Initially, I had a look at the first question in your survey on the replication crisis in psychology and closed the page. It does not seem to me that the question can be properly answered given the information provided. Later I went back and responded as reasonably as I could, given that the entire survey is biased toward the standard misconceptions of psychological measurement: namely, that ordinal scores gathered with the aim of applying descriptive statistics are definitive, and that quantitative methods have no need for hypotheses, models, proofs, or evidence of meaningful interval units of comparison.

To my mind, the replication crisis in psychology is in part a function of ignoring the distinction between statistical models and scientific models (Cohen, 1994; Duncan, 1992; Fisher, 2010b; Meehl, 1967; Michell, 1986; Wilson, 2013a; Zhu, 2012). The statistical motivation for making models probabilistic concerns sampling; the scientific motivation concerns the individual response process. As Duncan and Stenbeck (1988, pp. 24-25) put it,

“The main point to emphasize here is that the postulate of probabilistic response must be clearly distinguished in both concept and research design from the stochastic variation of data that arises from random sampling of a heterogeneous population. The distinction is completely blurred in our conventional statistical training and practice of data analysis, wherein the stochastic aspects of the statistical model are most easily justified by the idea of sampling from a population distribution. We seldom stop to wonder if sampling is the only reason for making the model stochastic. The perverse consequence of doing good statistics is, therefore, to suppress curiosity about the actual processes that generate the data.”

This distinction between scientific and statistical models is old and worn. It often seems that the mainstream will never pick up on it, despite the fact that whenever the sum of counts or ratings from an individual-level response process is treated inferentially as a sufficient statistic (i.e., as a score to which no outside information is added), a scientific measurement model of a particular, identified form is assumed, whether or not the researcher or analyst is aware of it or actually applies it (Andersen, 1977, 1999; Fischer, 1981; San Martin, Gonzalez, & Tuerlinckx, 2009).
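The point about sufficiency can be checked directly. In the minimal Python sketch below (item difficulties invented for illustration), Rasch measures are estimated by maximum likelihood, and every response pattern with the same raw score is seen to yield the same measure, because the estimating equations depend on the data only through that score.

```python
import math
from itertools import combinations

# Score sufficiency in miniature: under the Rasch model, the maximum
# likelihood person estimate depends on the responses only through the
# raw score. Item difficulties below are invented for illustration.

difficulties = [-1.0, -0.3, 0.4, 1.2]

def mle(responses, difficulties, iters=25):
    """Newton-Raphson ML estimate of a person measure (non-extreme scores)."""
    theta = 0.0
    for _ in range(iters):
        probs = [1.0 / (1.0 + math.exp(d - theta)) for d in difficulties]
        information = sum(p * (1.0 - p) for p in probs)
        theta += (sum(responses) - sum(probs)) / information
    return theta

# Every possible pattern with a raw score of 2 out of 4 items:
for ones in combinations(range(4), 2):
    pattern = [1 if i in ones else 0 for i in range(4)]
    print(pattern, f"theta = {mle(pattern, difficulties):+.3f}")
# All six patterns print the same theta: the raw score is sufficient.
```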

Forty-three years ago, the situation was described by Wright (1977, p. 114):

“Unweighted scores are appropriate for person measurement if and only if what happens when a person responds to an item can be usefully approximated by a Rasch model…. Ironically, for anyone who claims skepticism about ‘the assumptions’ of the Rasch model, those who use unweighted scores are, however unwittingly, counting on the Rasch model to see them through. Whether this is useful in practice is a question not for more theorizing, but for empirical study.”

Insofar as measurement results are replicable, they converge on a common construct and unit definition, and support collective learning processes, the coherence of communities of research and practice, and the emergence of metrological standards (Barbic, et al., 2019; Cano, et al., 2019; Fisher, 1997a/b, 2004, 2009, 2010a, 2012, 2017a; Fisher & Stenner, 2016; Mari & Wilson, 2014, 2020; Pendrill, 2014, 2019; Pendrill & Fisher, 2015; Wilson, 2013b).

Researchers’ subjective guesses as to what measured constructs look like and how they behave tend to be borne out, more or less, in ways that allow us all to learn from each other, if and when we take the trouble to prepare, scale, and present our results in the form required to make that happen (for guidance in this regard, see Fisher & Wright, 1994; Smith, 2005; Stone, Wright, & Stenner, 1999; Wilson, 2005, 2009, 2018; Wilson, et al., 2012; Wright & Stone, 1979, 1999, 2003).

You would never know it from the kind of research assumed in your online survey, but the successful replication of results naturally should and does lead to the detailed mapping of variables (constructs), and the definition of unit standards that bring the thing measured into language as common metrics and shared objects of reference.
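As a small indication of what such variable mapping looks like, the Python sketch below prints a crude Wright map, locating hypothetical persons (as X's) and item labels on one logit scale; all names and values are invented for illustration.

```python
# Minimal sketch of a Wright map (variable map): persons and items on one
# logit scale, so that measures read directly as positions on the mapped
# construct. All labels and values below are invented for illustration.

items = {"basic recall": -1.5, "application": -0.2,
         "analysis": 0.9, "synthesis": 1.8}
persons = [-1.0, -0.4, 0.1, 0.6, 1.3]   # measured persons (logits)

# Print half-logit bands from +2.0 down to -2.0.
for i in range(4, -5, -1):
    lo = i * 0.5
    xs = "X" * sum(1 for t in persons if lo <= t < lo + 0.5)
    labels = ", ".join(n for n, d in items.items() if lo <= d < lo + 0.5)
    print(f"{lo:+4.1f} | {xs:<4}| {labels}")
```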

This is not a new idea, nor an unproven one (Luce & Tukey, 1964; Narens & Luce, 1986; Rasch, 1960; Thurstone, 1928; Wright, 1997; among many others). Proofs of the form of the model following from the sufficiency of the scores are cited above, and experimental proofs of the utility of the models for designing and calibrating interval unit measures are provided in thousands of peer-reviewed publications. Explanatory scientific models predicting item calibrations have been in development and in practical use for decades (Embretson, 2010; Fischer, 1973; Latimer, 1982; Prien, 1989; Stenner & Smith, 1982; Stenner, et al., 2013; Wright & Stone, 1979; among many others).
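To give a sense of what such explanatory models involve, here is a minimal Python sketch in the spirit of Fischer's (1973) linear logistic test model (LLTM), in which item difficulties are decomposed into contributions from substantive item features. The feature matrix and calibrations are invented for illustration, and LLTM proper estimates the feature weights inside the measurement model rather than by regressing calibrations on features afterward, as done here.

```python
import numpy as np

# Minimal sketch of the LLTM idea: predict Rasch item difficulties from
# substantive item features, so that theory, not just data, explains the
# calibrations. All labels and values below are invented for illustration.

# Columns: intercept, then counts of two hypothetical difficulty drivers.
Q = np.array([[1, 1, 0],
              [1, 2, 0],
              [1, 1, 1],
              [1, 2, 1],
              [1, 2, 2]], dtype=float)

# Item difficulties (logits) from a prior Rasch calibration (invented).
delta = np.array([-1.1, -0.2, 0.1, 0.9, 2.0])

# Least-squares estimates of each feature's contribution to difficulty.
eta, *_ = np.linalg.lstsq(Q, delta, rcond=None)
predicted = Q @ eta

print("baseline and feature weights:", np.round(eta, 2))
for observed, pred in zip(delta, predicted):
    print(f"observed {observed:+.2f}   predicted {pred:+.2f}")
```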

Preconceptions and unexamined assumptions about measurement blind many researchers, limiting their vision of what is possible to conventional repetitions of more of the same, even when the methods used do not work and have been shown ineffectual repeatedly over decades. In this regard, it is worth noting, contra widespread assumptions, that another difference between statistical and scientific models is the reductionist whole-is-the-sum-of-the-parts perspective of the former, versus the emergent whole-is-greater-than-the-sum-of-the-parts perspective of the latter (Fisher, 2004, 2017b, 2019b; Fisher & Stenner, 2018). In contrast to the lack of vision and imagination resulting from the myopia of statistical methods, I think it essential that we seek a capacity to extend everyday language, informing locally situated dialogues and negotiations via the mediation of meaningful common metrics that integrate concrete things with formal concepts, as has routinely been the case in a wide range of applications for quite some time (Chien, et al., 2009; Masters, 1994; Masters, et al., 1994, 1999; Wilson, 2018; Wilson, et al., 2012; Wright, et al., 1980; among many others).

Sweden’s national metrology institute (the Research Institutes of Sweden, RISE) is aggressively taking up research in this domain (Cano, et al., 2019; Fisher, 2019a; Pendrill, 2014, 2019; Pendrill & Fisher, 2015), as are a number of other national metrology institutes around the world that have been involved over the last decade in the meetings of the International Measurement Confederation (IMEKO; Fisher, 2008, 2010c, 2012a). An IMEKO Joint Symposium that Mark Wilson and I hosted at UC Berkeley in 2016 drew nearly equal numbers of psychometricians and metrology engineers (Wilson & Fisher, 2016). This and later Joint Symposia have included enough full-length papers to fill special issues of IMEKO’s Measurement journal (Wilson & Fisher, 2018, 2019).

Though psychology and the social sciences seem hopelessly stuck on statistical significance tests and ordinal scores as the paradigm of measurement, the field has now garnered the attention of metrologists, and a sound basis has emerged for hoping that new directions will be explored on broader scales and at greater depths. The new partnerships being sought out and the research initiatives being proposed at RISE, for instance, promise to enhance awareness across fields as to the challenges and opportunities at hand.

References

Andersen, E. B. (1977). Sufficient statistics and latent trait models. Psychometrika, 42(1), 69-81.

Andersen, E. B. (1999). Sufficient statistics in educational measurement. In G. N. Masters & J. P. Keeves (Eds.), Advances in measurement in educational research and assessment (pp. 122-125). New York: Pergamon.

Andrich, D. (2010). Sufficiency and conditional estimation of person parameters in the polytomous Rasch model. Psychometrika, 75(2), 292-308.

Barbic, S., Cano, S. J., Tee, K., & Mathias, S. (2019). Patient-centered outcome measurement in psychiatry: How metrology can optimize health services and outcomes. TMQ–Techniques, Methodologies and Quality, 10(Special Issue on Health Metrology), 10-19.

Cano, S., Pendrill, L., Melin, J., & Fisher, W. P., Jr. (2019). Towards consensus measurement standards for patient-centered outcomes. Measurement, 141, 62-69.

Chien, T.-W., Wang, W.-C., Wang, H.-Y., & Lin, H.-J. (2009). Online assessment of patients’ views on hospital performances using Rasch model’s KIDMAP diagram. BMC Health Services Research, 9, 135.

Cohen, J. (1994). The earth is round (p < 0.05). American Psychologist, 49, 997-1003.

Duncan, O. D. (1992, September). What if? Contemporary Sociology, 21(5), 667-668.

Duncan, O. D., & Stenbeck, M. (1988). Panels and cohorts: Design and model in the study of voting turnout. In C. C. Clogg (Ed.), Sociological Methodology 1988 (pp. 1-35). Washington, DC: American Sociological Association.

Embretson, S. E. (2010). Measuring psychological constructs: Advances in model-based approaches. Washington, DC: American Psychological Association.

Fischer, G. H. (1973). The linear logistic test model as an instrument in educational research. Acta Psychologica, 37, 359-374.

Fischer, G. H. (1981). On the existence and uniqueness of maximum-likelihood estimates in the Rasch model. Psychometrika, 46(1), 59-77.

Fisher, W. P., Jr. (1997a). Physical disability construct convergence across instruments: Towards a universal metric. Journal of Outcome Measurement, 1(2), 87-113 [http://jampress.org/JOM_V1N2.pdf].

Fisher, W. P., Jr. (1997b). What scale-free measurement means to health outcomes research. Physical Medicine & Rehabilitation State of the Art Reviews, 11(2), 357-373.

Fisher, W. P., Jr. (2004). Meaning and method in the social sciences. Human Studies: A Journal for Philosophy and the Social Sciences, 27(4), 429-454.

Fisher, W. P., Jr. (2008). Notes on IMEKO symposium. Rasch Measurement Transactions, 22(1), 1147 [http://www.rasch.org/rmt/rmt221.pdf].

Fisher, W. P., Jr. (2009). Invariance and traceability for measures of human, social, and natural capital: Theory and application. Measurement, 42(9), 1278-1287.

Fisher, W. P., Jr. (2010a). The standard model in the history of the natural sciences, econometrics, and the social sciences. Journal of Physics Conference Series, 238(012016).

Fisher, W. P., Jr. (2010b). Statistics and measurement: Clarifying the differences. Rasch Measurement Transactions, 23(4), 1229-1230 [http://www.rasch.org/rmt/rmt234.pdf].

Fisher, W. P., Jr. (2010c). Unifying the language of measurement. Rasch Measurement Transactions, 24(2), 1278-1281 [http://www.rasch.org/rmt/rmt242.pdf].

Fisher, W. P., Jr. (2012a). 2011 IMEKO conference proceedings available online. Rasch Measurement Transactions, 25(4), 1349 [http://www.rasch.org/rmt/rmt254.pdf].

Fisher, W. P., Jr. (2012b, June 1). What the world needs now: A bold plan for new standards [Third place, 2011 NIST/SES World Standards Day paper competition]. Standards Engineering, 64(3), 1 & 3-5 [http://ssrn.com/abstract=2083975].

Fisher, W. P., Jr. (2017a). Metrology, psychometrics, and new horizons for innovation. 18th International Congress of Metrology, Paris, 09007. doi: 10.1051/metrology/201709007

Fisher, W. P., Jr. (2017b). A practical approach to modeling complex adaptive flows in psychology and social science. Procedia Computer Science, 114, 165-174. Retrieved from https://doi.org/10.1016/j.procs.2017.09.027

Fisher, W. P., Jr. (2018). Update on Rasch in metrology. Rasch Measurement Transactions, 32(1), 1685-1687.

Fisher, W. P., Jr. (2019a). News from Sweden’s National Metrology Institute. Rasch Measurement Transactions, 32(3), 1719-1723.

Fisher, W. P., Jr. (2019b). A nondualist social ethic: Fusing subject and object horizons in measurement. TMQ–Techniques, Methodologies, and Quality [Special Issue on Health Metrology], 10, 21-40.

Fisher, W. P., Jr., & Stenner, A. J. (2016). Theory-based metrological traceability in education: A reading measurement network. Measurement, 92, 489-496.

Fisher, W. P., Jr., & Stenner, A. J. (2018). On the complex geometry of individuality and growth: Cook’s 1914 ‘Curves of Life’ and reading measurement. Journal of Physics Conference Series, 1065, 072040.

Fisher, W. P., Jr., & Wright, B. D. (Eds.). (1994). Applications of probabilistic conjoint measurement. International Journal of Educational Research, 21(6), 557-664.

Green, K. E., & Smith, R. M. (1987). A comparison of two methods of decomposing item difficulties. Journal of Educational Statistics, 12(4), 369-381.

Latimer, S. L. (1982). Using the Linear Logistic Test Model to investigate a discourse-based model of reading comprehension. Education Research and Perspectives, 9(1), 73-94 [http://www.rasch.org/erp7.htm].

Luce, R. D., & Tukey, J. W. (1964). Simultaneous conjoint measurement: A new kind of fundamental measurement. Journal of Mathematical Psychology, 1(1), 1-27.

Mari, L., & Wilson, M. (2014, May). An introduction to the Rasch measurement approach for metrologists. Measurement, 51, 315-327.

Mari, L., & Wilson, M. (2020). Measurement across the sciences [in press]. Cham: Springer.

Masters, G. N. (1994). KIDMAP – a history. Rasch Measurement Transactions, 8(2), 366 [http://www.rasch.org/rmt/rmt82k.htm].

Masters, G. N., Adams, R. J., & Lokan, J. (1994). Mapping student achievement. International Journal of Educational Research, 21(6), 595-610.

Masters, G. N., Adams, R. J., & Wilson, M. (1999). Charting of student progress. In G. N. Masters & J. P. Keeves (Eds.), Advances in measurement in educational research and assessment (pp. 254-267). New York: Pergamon.

Meehl, P. E. (1967). Theory-testing in psychology and physics: A methodological paradox. Philosophy of Science, 34(2), 103-115.

Michell, J. (1986). Measurement scales and statistics: A clash of paradigms. Psychological Bulletin, 100, 398-407.

Narens, L., & Luce, R. D. (1986). Measurement: The theory of numerical assignments. Psychological Bulletin, 99(2), 166-180.

Pendrill, L. (2014). Man as a measurement instrument [Special Feature]. NCSLi Measure: The Journal of Measurement Science, 9(4), 22-33.

Pendrill, L. (2019). Quality assured measurement: Unification across social and physical sciences. Cham: Springer.

Pendrill, L., & Fisher, W. P., Jr. (2015). Counting and quantification: Comparing psychometric and metrological perspectives on visual perceptions of number. Measurement, 71, 46-55.

Prien, B. (1989). How to predetermine the difficulty of items of examinations and standardized tests. Studies in Educational Evaluation, 15, 309-317.

Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests (Reprint, with Foreword and Afterword by B. D. Wright, Chicago: University of Chicago Press, 1980). Copenhagen, Denmark: Danmarks Paedogogiske Institut.

San Martin, E., Gonzalez, J., & Tuerlinckx, F. (2009). Identified parameters, parameters of interest, and their relationships. Measurement: Interdisciplinary Research and Perspectives, 7(2), 97-105.

Smith, E. V., Jr. (2005). Representing treatment effects with variable maps. In N. Bezruczko (Ed.), Rasch measurement in health sciences (pp. 247-259). Maple Grove, MN: JAM Press.

Stenner, A. J., Fisher, W. P., Jr., Stone, M. H., & Burdick, D. S. (2013). Causal Rasch models. Frontiers in Psychology: Quantitative Psychology and Measurement, 4(536), 1-14.

Stenner, A. J., & Smith, M., III. (1982). Testing construct theories. Perceptual and Motor Skills, 55, 415-426.

Stone, M. H., Wright, B., & Stenner, A. J. (1999). Mapping variables. Journal of Outcome Measurement, 3(4), 308-322. [http://jampress.org/JOM_V3N4.pdf]

Thurstone, L. L. (1928). Attitudes can be measured. American Journal of Sociology, XXXIII, 529-544. (Rpt. in L. L. Thurstone, (1959). The measurement of values (pp. 215-233). Chicago, Illinois: University of Chicago Press, Midway Reprint Series.)

Wilson, M. (2005). Constructing measures: An item response modeling approach. Mahwah, New Jersey: Lawrence Erlbaum Associates.

Wilson, M. R. (2009). Measuring progressions: Assessment structures underlying a learning progression. Journal of Research in Science Teaching, 46, 716-730.

Wilson, M. R. (2013a). Seeking a balance between the statistical and scientific elements in psychometrics. Psychometrika, 78(2), 211-236.

Wilson, M. R. (2013b). Using the concept of a measurement system to characterize measurement models used in psychometrics. Measurement, 46, 3766-3774.

Wilson, M. (2018). Making measurement important for education: The crucial role of classroom assessment. Educational Measurement: Issues and Practice, 37(1), 5-20.

Wilson, M., Bejar, I., Scalise, K., Templin, J., Wiliam, D., & Torres Irribarra, D. (2012). Perspectives on methodological issues. In P. Griffin, B. McGaw & E. Care (Eds.), Assessment and teaching of 21st century skills (pp. 67-141). Dordrecht: Springer Netherlands.

Wilson, M., & Fisher, W. (2016). Preface: 2016 IMEKO TC1-TC7-TC13 Joint Symposium: Metrology across the Sciences: Wishful Thinking? Journal of Physics Conference Series, 772(1), 011001.

Wilson, M., & Fisher, W. (2018). Preface of special issue, Metrology across the Sciences: Wishful Thinking? Measurement, 127, 577.

Wilson, M., & Fisher, W. (2019). Preface of special issue, Psychometric Metrology. Measurement, 145, 190.

Wright, B. D. (1977). Solving measurement problems with the Rasch model. Journal of Educational Measurement, 14(2), 97-116 [http://www.rasch.org/memo42.htm].

Wright, B. D. (1997). A history of social science measurement. Educational Measurement: Issues and Practice, 16(4), 33-45, 52 [http://www.rasch.org/memo62.htm].

Wright, B. D., Mead, R. J., & Ludlow, L. H. (1980). KIDMAP: Person-by-item interaction mapping (Tech. Rep. No. MESA Memorandum #29). Chicago: MESA Press [http://www.rasch.org/memo29.pdf].

Wright, B. D., & Stone, M. H. (1979). Best test design: Rasch measurement. Chicago, Illinois: MESA Press.

Wright, B. D., & Stone, M. H. (1999). Measurement essentials. Wilmington, DE: Wide Range, Inc. [http://www.rasch.org/measess/me-all.pdf].

Wright, B. D., & Stone, M. H. (2003). Five steps to science: Observing, scoring, measuring, analyzing, and applying. Rasch Measurement Transactions, 17(1), 912-913 [http://www.rasch.org/rmt/rmt171j.htm].

Zhu, W. (2012). Sadly, the earth is still round (p < 0.05). Journal of Sport and Health Science, 1(1), 9-11.


IMEKO Joint Symposium in St. Petersburg, Russia, 2-5 July 2019

June 26, 2019

The IMEKO Joint Symposium will be next week, 2-5 July, at the Original Sokos Hotel Olympia Garden, located at Batayskiy Pereulok, 3А, in St. Petersburg, Russia. Kudos to Kseniia Sapozhnikova, Giovanni Rossi, Eric Benoit, and the organizing committee for putting together such an impressive program, which is posted at: https://imeko19-spb.org/wp-content/uploads/2019/06/Program-of-the-Symposium.pdf

Presenters on measurement across the sciences include metrology engineers and psychometricians from around the world: Andrich, Cavanagh, Fitkov-Norris, Huang, Mari, Melin, Nguyen, Oon, Powers, Salzberger, and Wilson, along with multiple co-authors, including Adams, Cano, Maul, Pendrill, and more.

For background on this rapidly developing new conversation on measurement across the sciences, see the references listed below. The late Ludwig Finkelstein, editor of IMEKO’s Measurement journal from 1982 to 2000, was a primary instigator of work in this area. At the 2010 Joint Symposium he co-hosted in London, Finkelstein said: “It is increasingly recognised that the wide range and diverse applications of measurement are based on common logical and philosophical principles and share common problems” (Finkelstein, 2010, p. 2). The IMEKO Joint Symposium continues to advance in the direction he foresaw.

Sessions will also include a round table discussion on “Terminology issues related to expanding boundaries of measurements,” chaired by Mari and Chunovkina.

Paper titles include:

Andrich on “Exemplifying natural science measurement in the social sciences with Rasch measurement theory”

Benoit, et al. on “Musical instruments for the measurement of autism sensory disorders”

Budylina and Danilov on “Methods to ensure the reliability of measurements in the age of Industry 4.0”

Cavanagh, Asano-Cavanagh, and Fisher on “Natural semantic metalanguage as an approach to measuring meaning”

Crenna and Rossi on “Squat biomechanics in weightlifting: Foot attitude effects”

Fisher, Pendrill, Lips da Cruz, and Felin on “Why metrology? Fair dealing and efficient markets for the UN SDGs”

Fisher and Wilson on “The BEAR Assessment System Software as a platform for developing and applying UN SDG metrics”

Fitkov-Norris and Yeghiazarian on “Is context the hidden spanner in the works of educational measurement: Exploring the impact of context on mode of learning preferences”

Gavrilenkov, et al. on “Multicriteria approach to design of strain gauge force transducers”

Grednovskaya, et al. on “Measuring non-physical quantities in the procedures of philosophical practice”

Huang, Oon, and Fisher on “Coherence in measuring student evaluation of teaching: A new paradigm”

Katkov on “The status of and prospects for development of voltage quantum standards”

Kneller and Fayans on “Solving interdisciplinary tasks: The challenge and the ways to surmount it”

Kostromina and Gnedykh on “Problems and prospects of complex psychological phenomena measurement”

Lips da Cruz, Fisher, Pendrill, and Felin on “Accelerating the realization of the UN SDGs through metrological multi-stakeholder interoperability”

Lyubimtsev, et al. on “Measuring systems designed for working with living organisms as biosensors: Features of their metrological maintenance”

Mari, Chunovkina, and Ehrlich on “The complex concept of quantity in the past and (possibly) the future of the International Vocabulary of Metrology”

Mari, Maul, and Wilson on “Can there be one meaning of ‘measurement’ across the sciences?”

Melin, Pendrill, Cano, and the EMPIR NeuroMET 15HLT04 Consortium on “Towards patient-centred cognition metrics”

Morrison and Fisher on “Measuring for management in Science, Technology, Engineering, and Mathematics learning ecosystems”

Nguyen on “The feasibility of using an international common reading progression to measure reading across languages: A case study of the Vietnamese language”

Nguyen, Nguyen, and Adams on “Assessment of the generic problem-solving construct across different contexts”

Oon, Hoi-Ka, and Fisher on “Metrologically coherent assessment for learning: What, why, and how”

Pandurevic, et al. on “Methods for quantitative evaluation of force and technique in competitive sport climbing”

Pavese on “Musing on extreme quantity values in physics and the problem of removing infinity”

Powers and Fisher on “Advances in modelling visual symptoms and visual skills”

Salzberger, Cano, et al. on “Addressing traceability in social measurement: Establishing a common metric for dependence”

Sapozhnikova, et al. on “Music and growl of a lion: Anything in common? Measurement model optimized with the help of AI will answer”

Soratto, Nunes, and Cassol on “Legal metrological verification in health area in Brazil”

Wilson and Dulhunty on “Interpreting the relationship between item difficulty and DIF: Examples from educational testing”

Wilson, Mari, and Maul on “The status of the concept of reference object in measurement in the human sciences compared to the physical sciences”

Background References

Finkelstein, L. (1975). Representation by symbol systems as an extension of the concept of measurement. Kybernetes, 4(4), 215-223.

Finkelstein, L. (2003, July). Widely, strongly and weakly defined measurement. Measurement, 34(1), 39-48.

Finkelstein, L. (2005). Problems of measurement in soft systems. Measurement, 38(4), 267-274.

Finkelstein, L. (2009). Widely-defined measurement–An analysis of challenges. Measurement: Concerning Foundational Concepts of Measurement Special Issue Section (L. Finkelstein, Ed.), 42(9), 1270-1277.

Finkelstein, L. (2010). Measurement and instrumentation science and technology: The educational challenges. Journal of Physics Conference Series, 238(012001). doi:10.1088/1742-6596/238/1/012001

Fisher, W. P., Jr. (2009). Invariance and traceability for measures of human, social, and natural capital: Theory and application. Measurement: Concerning Foundational Concepts of Measurement Special Issue (L. Finkelstein, Ed.), 42(9), 1278-1287.

Mari, L. (2000). Beyond the representational viewpoint: A new formalization of measurement. Measurement, 27, 71-84.

Mari, L., Maul, A., Torres Irribarra, D., & Wilson, M. (2016, March). Quantities, quantification, and the necessary and sufficient conditions for measurement. Measurement, 100, 115-121. Retrieved from http://www.sciencedirect.com/science/article/pii/S0263224116307497

Mari, L., & Wilson, M. (2014, May). An introduction to the Rasch measurement approach for metrologists. Measurement, 51, 315-327. Retrieved from http://www.sciencedirect.com/science/article/pii/S0263224114000645

Pendrill, L. (2014, December). Man as a measurement instrument [Special Feature]. NCSLi Measure: The Journal of Measurement Science, 9(4), 22-33. Retrieved from http://www.tandfonline.com/doi/abs/10.1080/19315775.2014.11721702

Pendrill, L., & Fisher, W. P., Jr. (2015). Counting and quantification: Comparing psychometric and metrological perspectives on visual perceptions of number. Measurement, 71, 46-55. doi: http://dx.doi.org/10.1016/j.measurement.2015.04.010

Pendrill, L., & Petersson, N. (2016). Metrology of human-based and other qualitative measurements. Measurement Science and Technology, 27(9), 094003. Retrieved from https://doi.org/10.1088/0957-0233/27/9/094003

Wilson, M. R. (2013). Using the concept of a measurement system to characterize measurement models used in psychometrics. Measurement, 46, 3766-3774. Retrieved from http://www.sciencedirect.com/science/article/pii/S0263224113001061

Wilson, M., & Fisher, W. (2016). Preface: 2016 IMEKO TC1-TC7-TC13 Joint Symposium: Metrology across the Sciences: Wishful Thinking? Journal of Physics Conference Series, 772(1), 011001. Retrieved from http://iopscience.iop.org/article/10.1088/1742-6596/772/1/011001/pdf

Wilson, M., & Fisher, W. (2018). Preface of special issue, Metrology across the Sciences: Wishful Thinking? Measurement, 127, 577.

Wilson, M., & Fisher, W. (2019). Preface of special issue, Psychometric Metrology. Measurement, 145, 190.


Creative Commons License
LivingCapitalMetrics Blog by William P. Fisher, Jr., Ph.D. is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Based on a work at livingcapitalmetrics.wordpress.com.
Permissions beyond the scope of this license may be available at http://www.livingcapitalmetrics.com.