Day One Memo to the Biden-Harris Administration

January 5, 2021

William P. Fisher, Jr.

Living Capital Metrics LLC, BEAR Center, Graduate School of Education, UC Berkeley, and

the Research Institute of Sweden, Gothenburg

4 January 2021

I. Summary

As Reginald McGregor observed in today’s STEM learning ecosystems Zoom call preparing for the Biden-Harris Town Hall meetings, past policies addressing equity, quality programming, funding, professional development, after-school/school alignment, and other issues in education have not had the desired impacts on outcomes. McGregor then asked: what must we do differently to obtain the results we want and need? In short, we must focus systematically on how to create a viral contagion of trust, not just with each other but with our data and our institutions. Trust depends intrinsically on verifiable facts, personal ownership, and proven productive consequences, and we have a wealth of untapped resources for building trust in massively scalable ways, for creating a social contagion of trust that disseminates the authentic wealth of learning and valued relationships. This proposal describes those resources, where they can be found, who the experts in these areas are, which agencies have historically been involved in developing them, what is being done to put them to work, and how we should proceed from here. Because it will set the tone for everything that follows, and because there is no better time for such a seismic shift than at the beginning, a clear and decisive statement of what needs to be done differently ought to be a Day One priority for the Biden-Harris administration. Though this memo was initiated in response to the STEM learning ecosystems town hall meetings, its theme applies across a wide range of policy domains and should be read as such.

II. Challenge and Opportunity

What needs to be done differently hinges on the realization that a theme common to all of the issues identified by McGregor is the development of trusting relationships. Igniting viral contagions of trust systematically at mass scales requires accomplishing two apparently contradictory goals simultaneously: creating communications and information standards that are both universally transparent and individually personalized. It may appear that these two goals cannot be achieved at the same time, but in fact they are integrated in everyday language. The navigable continuity of communications and information standards need not be inconsistent with the unique strengths, weaknesses, and creative improvisations of custom-tailored local conversations. Standards do not automatically entail pounding square pegs into round holes.

Transparent communications of meaningful high quality information cultivate trust by inspiring confidence in the repeated veracity and validity of what is said. Capacities for generalizing lessons learned across localities augment that trust and support the spread of innovations. Personalized information applicable to unique individual circumstances cultivates trust as students, teachers, parents, administrators, researchers, employers, and others are each able (a) to recognize their own special uniqueness reflected in information on their learning outcomes, (b) to see the patterns of their learning and growth reflected in that information over time, and (c) to see themselves in others’ information, and others in themselves. Systematic support and encouragement for policies and practices integrating these seemingly contradictory goals would constitute truly new approaches to old problems. Given that longstanding and widespread successes in combining these goals have already been achieved, new hope for resounding impacts becomes viable, feasible, and desirable.

III. Plan of Action

To stop the maddening contradiction of expecting different results from repetitions of the same behaviors, decisive steps must be taken toward making better use of existing models and methods, ones that coherently inform new behaviors leading to new outcomes. We are not speaking here of small incremental gains produced via intensive but microscopically focused efforts. We are raising the possibility that we may be capable of igniting viral contagions of trust. Just as the Arab Spring was in many ways fostered by the availability of new and unfettered technologically mediated social networks like Facebook and Twitter, so, too, will the creation of new outcomes communications platforms in education, healthcare, social services, and environmental resource management unleash powerful social forces. In the same way that smartphones are incredibly useful for billions of people globally while also being highly technical devices involving complexities beyond the ken of the vast majority of their users, so, too, do the complex models and methods at issue here have similar potentials for mass scaling.

To efficiently share transferable lessons as to what works, we need the common quantitative languages of outcome measurement standards, where (a) quantities are defined not in the ordinal terms of test scores but in the interval terms of metrologically traceable units with associated uncertainties, and (b) those quantities are estimated not from just one set of assessment questions or items but from linked collections of diverse arrays of different kinds of self, observational, portfolio, peer, digital, and other assessments (or even from theory). To support individuals’ creative improvisations and unique circumstances, those standards, like the alphabets, grammars, and dictionaries setting the semiotic standards of everyday language, must enable new kinds of qualitative conversations negotiating the specific hurdles of local conditions. Custom-tailored individual reports making use of interval unit estimates and uncertainties have been in use globally for decades.

Existing efforts in this area have been underway since the work of Thurstone in the 1920s, of Rasch and Wright from the 1950s through the 1990s, and of thousands of others since then. Over the last several decades, the work of these innovators has been incorporated into hundreds of research studies funded by the Institute of Education Sciences, the National Science Foundation, and the National Institutes of Health. Most of these applications have, however, been hobbled by limited conceptualizations restricting expectations to the narrow terms of statistical hypothesis testing instead of opening onto the far more expansive possibilities offered by an integration of metrological standards and individualized reporting. This is the crux of the shift proposed here: we are moving away from merely numeric statistical operations conducted via centrally planned and controlled analytic methods, and toward fully quantitative, quality-assured measurement operations conducted via widely distributed and socially self-organized methods.

Because history shows that existing institutions rarely succeed in altering their founding principles, it is likely necessary for a government agency not previously involved in this work to take the lead. That agency should be the National Institute of Standards and Technology (NIST). This recommendation is supported by the recent emergence of new alliances of psychometricians and metrologists clarifying the theory and methods needed for integrating the two seemingly opposed goals of comparable standards and custom-tailored applications. The International Measurement Confederation (IMEKO) of national metrology institutes has provided a forum for reports in this area since 2008, as has, since 2017, the International Metrology Congress, held in Paris. An international meeting bringing together equal numbers of metrologists and psychometricians was held at UC Berkeley in 2016 (NIST’s Antonio Possolo gave a keynote), dozens of peer-reviewed journal articles in this new area have appeared since 2009, two authoritative books have appeared since 2019, and multiple international collaborations focused on developing new unit standards and traceable instrumentation for education, health care, and other fields are underway.

Important leaders in this area capable of guiding the formation of the measurement-specific policies for research and practice include David Andrich (U Western Australia, Perth), Matt Barney (Leaderamp, Vacaville, CA), Betty Bergstrom (Pearson VUE, Chicago), Stefan Cano (Modus Outcomes, UK), Theo Dawson (Lectica, Northampton, MA), Peter Hagell (U Kristianstad, Sweden), Martin Ho (FDA), Mike Linacre (Winsteps.com), Larry Ludlow (Boston College), Luca Mari (U Cattaneo, Italy), Robert Massof (Johns Hopkins), Andrew Maul (UC Santa Barbara), Jeanette Melin (RISE, Sweden), Janice Morrison (TIES, Cleveland), Leslie Pendrill (RISE, Sweden), Maureen Powers (Gemstone Optometry, Berkeley), Andrea Pusic (Brigham & Women’s, Boston), Matthew Rabbitt (USDA), Thomas Salzberger (U Vienna, Austria), Karen Schmidt (U Virginia), Mark Wilson (UC Berkeley), and many others.

Partnerships across economic sectors are essential to the success of this initiative. Standards provide the media by which different groups of stakeholders can advance their unique interests more effectively in partnership than they can in isolation. Calls for proposals should stress the vital importance of establishing the multidisciplinary functionality of boundary objects residing at the borders between disciplines. Just as has been accomplished for the SI unit metrological standards in the natural sciences, educators’ needs for comparable but customized information must be aligned with the analogous needs of stakeholders in other domains, such as management, clinical practice, law, accounting, finance, and economics. Of the actors listed above, the Research Institute of Sweden (RISE) is currently the most energetically engaged in forming the needed cross-disciplinary collaborations.

Though the complexity and cost of such efforts appear almost insurmountable, beginning the process of envisioning how to address the challenges and capitalize on the opportunities is far more realistic and productive than continuing to flounder without direction, as we have for decades. Estimates of the cost of creating, maintaining, and improving existing standards come to about 8% of GDP, with returns on investment estimated by NIST to be in the range of about 40% to over 400%, with a mean of about 140%. The levels of investment needed in the new metrological efforts, and the returns to be gained from them, are unlikely to differ significantly from these estimates.

IV. Conclusion

This proposal is important because it offers a truly original response to the question of what needs to be done differently in STEM education and elsewhere to avoid reproducing the same tired and ineffective results. The originality of the proposal is complemented by the depth at which it taps the historical successes of the natural sciences and the economics of standards: efficient markets for trading on trust in productive ways could lead to viral contagions of caring relationships. The proposal is also supported by the intuitive plausibility of taking natural language as a model for the creation of new common languages for the communication and improvement of learning, healthcare, employment, and other outcomes. As with any authentic paradigm shift, opposition is usually rooted in assumptions that existing expertise, methods, and tools are sufficient to the task, even when massive amounts of evidence point to the need for change. Simple, small, and inexpensive projects can be designed as tests of the concept and as means of attracting interest in the paradigm shift. Convening cross-sector groups of collaborators to design and conduct small demonstration projects may be an effective way of beginning. Finally, the potential for creating economically self-sustaining cycles of investments and returns could be an attractive way of incentivizing private sector participation, especially when expressed in terms of the alignment of financial wealth with the authentic wealth of trusting relationships.

V. About the author

William P. Fisher, Jr., Ph.D. received his doctorate from the University of Chicago, where he was mentored by Benjamin D. Wright and supported by a Spencer Foundation Dissertation Research Fellowship. He has been on the staff of the BEAR Center in the Graduate School of Education at UC Berkeley since 2011, and has consulted independently via Living Capital Metrics LLC since 2009. In 2020, Dr. Fisher joined the staff of the Research Institute of Sweden as a Senior Research Scientist. Dr. Fisher is recognized for contributions to measurement theory and practice that span the full range from the philosophical to the applied in fields as diverse as special education, mindfulness practice, nursing, rehabilitation, clinical chemistry, metrology, health outcomes, and survey research.

VI. Supporting literature

Andrich, David. “A Rating Formulation for Ordered Response Categories.” Psychometrika 43, no. 4, December 1978: 561-73.

Andrich, David. Rasch Models for Measurement. Sage University Paper Series on Quantitative Applications in the Social Sciences, vol. series no. 07-068. Beverly Hills, California: Sage, 1988.

Andrich, David, and Ida Marais. A Course in Rasch Measurement Theory: Measuring in the Educational, Social, and Health Sciences. Cham, Switzerland: Springer, 2019.

Barber, John M. “Economic Rationale for Government Funding of Work on Measurement Standards.” In Review of DTI Work on Measurement Standards, ed. R. Dobbie, J. Darrell, K. Poulter and R. Hobbs, Annex 5. London: Department of Trade and Industry, 1987.

Barney, Matt, and William P. Fisher, Jr. “Adaptive Measurement and Assessment.” Annual Review of Organizational Psychology and Organizational Behavior 3, April 2016: 469-90.

Cano, Stefan, Leslie Pendrill, Jeanette Melin, and William P. Fisher, Jr. “Towards Consensus Measurement Standards for Patient-Centered Outcomes.” Measurement 141, 2019: 62-69, https://doi.org/10.1016/j.measurement.2019.03.056.

Chien, Tsair-Wei, John Michael Linacre, and Wen-Chung Wang. “Examining Student Ability Using KIDMAP Fit Statistics of Rasch Analysis in Excel.” In Communications in Computer and Information Science, ed. Honghua Tan and Mark Zhou, 578-85. Berlin: Springer Verlag, 2011.

Chuah, Swee-Hoon, and Robert Hoffman. The Evolution of Measurement Standards. Tech. Rept. no. 5. Nottingham, England: Nottingham University Business School, 2004.

Fisher, William P., Jr. “The Mathematical Metaphysics of Measurement and Metrology: Towards Meaningful Quantification in the Human Sciences.” In Renascent Pragmatism: Studies in Law and Social Science, ed. Alfonso Morales, 118-53. Brookfield, VT: Ashgate Publishing Co., 2003.

Fisher, William P., Jr. “Meaning and Method in the Social Sciences.” Human Studies: A Journal for Philosophy and the Social Sciences 27, no. 4, October 2004: 429-54.

Fisher, William P., Jr. “Invariance and Traceability for Measures of Human, Social, and Natural Capital: Theory and Application.” Measurement 42, no. 9, November 2009: 1278-87.

Fisher, William P., Jr. NIST Critical National Need Idea White Paper: Metrological Infrastructure for Human, Social, and Natural Capital. Washington, DC: National Institute of Standards and Technology, 2009, http://www.nist.gov/tip/wp/pswp/upload/202_metrological_infrastructure_for_human_social_natural.pdf.

Fisher, William P., Jr. “Measurement, Reduced Transaction Costs, and the Ethics of Efficient Markets for Human, Social, and Natural Capital,” Bridge to Business Postdoctoral Certification, Freeman School of Business, Tulane University, 2010, https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2340674.

Fisher, William P., Jr. “What the World Needs Now: A Bold Plan for New Standards [Third Place, 2011 NIST/SES World Standards Day Paper Competition].” Standards Engineering 64, no. 3, 1 June 2012: 1 & 3-5 [http://ssrn.com/abstract=2083975].

Fisher, William P., Jr. “Imagining Education Tailored to Assessment as, for, and of Learning: Theory, Standards, and Quality Improvement.” Assessment and Learning 2, 2013: 6-22.

Fisher, William P., Jr. “Metrology, Psychometrics, and New Horizons for Innovation.” 18th International Congress of Metrology, Paris, September 2017: 09007, doi: 10.1051/metrology/201709007.

Fisher, William P., Jr. “A Practical Approach to Modeling Complex Adaptive Flows in Psychology and Social Science.” Procedia Computer Science 114, 2017: 165-74, https://doi.org/10.1016/j.procs.2017.09.027.

Fisher, William P., Jr. “Modern, Postmodern, Amodern.” Educational Philosophy and Theory 50, 2018: 1399-400. Reprinted in What Comes After Postmodernism in Educational Theory? ed. Michael Peters, Marek Tesar, Liz Jackson and Tina Besley, 104-105, New York: Routledge, DOI: 10.1080/00131857.2018.1458794.

Fisher, William P., Jr. “Contextualizing Sustainable Development Metric Standards: Imagining New Entrepreneurial Possibilities.” Sustainability 12, no. 9661, 2020: 1-22, https://doi.org/10.3390/su12229661.

Fisher, William P., Jr. “Measurements Toward a Future SI.” In Sensors and Measurement Science International (SMSI) 2020 Proceedings, ed. Gerald Gerlach and Klaus-Dieter Sommer, 38-39. Wunstorf, Germany: AMA Service GmbH, 2020, https://www.smsi-conference.com/assets/Uploads/e-Booklet-SMSI-2020-Proceedings.pdf.

Fisher, William P., Jr. “Wright, Benjamin D.” In SAGE Research Methods Foundations, ed. P. Atkinson, S. Delamont, A. Cernat, J. W. Sakshaug and R.A. Williams. Thousand Oaks, CA: Sage Publications, 2020, https://methods.sagepub.com/foundations/wright-benjamin-d.

Fisher, William P., Jr., and A. Jackson Stenner. “Theory-Based Metrological Traceability in Education: A Reading Measurement Network.” Measurement 92, 2016: 489-96, http://www.sciencedirect.com/science/article/pii/S0263224116303281.

Fisher, William P., Jr., and Mark Wilson. “Building a Productive Trading Zone in Educational Assessment Research and Practice.” Pensamiento Educativo: Revista de Investigacion Educacional Latinoamericana 52, no. 2, 2015: 55-78, https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2688260.

Gallaher, Michael P., Brent R. Rowe, Alex V. Rogozhin, Stephanie A. Houghton, J. Lynn Davis, Michael K. Lamvik, and John S. Geikler. Economic Impact of Measurement in the Semiconductor Industry. Tech. Rept. no. 07-2. Gaithersburg, MD: National Institute of Standards and Technology, 2007.

He, W., and G. G. Kingsbury. “A Large-Scale, Long-Term Study of Scale Drift: The Micro View and the Macro View.” Journal of Physics Conference Series 772, 2016: 012022, https://iopscience.iop.org/article/10.1088/1742-6596/772/1/012022/meta.

Holster, Trevor A., and J. W. Lake. “From Raw Scores to Rasch in the Classroom.” Shiken 19, no. 1, April 2015: 32-41.

Hunter, J. Stuart. “The National System of Scientific Measurement.” Science 210, no. 21, 1980: 869-74.

Linacre, John Michael. “Individualized Testing in the Classroom.” In Advances in Measurement in Educational Research and Assessment, ed. Geoffrey N. Masters and John P. Keeves, 186-94. New York: Pergamon, 1999.

Mari, Luca, and Mark Wilson. “An Introduction to the Rasch Measurement Approach for Metrologists.” Measurement 51, May 2014: 315-27, http://www.sciencedirect.com/science/article/pii/S0263224114000645.

Mari, Luca, Mark Wilson, and Andrew Maul. Measurement Across the Sciences [in Press]. Springer Series in Measurement Science and Technology. Cham: Springer, 2021.

Massof, Robert W. “Editorial: Moving Toward Scientific Measurements of Quality of Life.” Ophthalmic Epidemiology 15, 1 August 2008: 209-11.

Masters, Geoffrey N. “KIDMAP – a History.” Rasch Measurement Transactions 8, no. 2, 1994: 366 [http://www.rasch.org/rmt/rmt82k.htm].

Morrison, Jan, and William P. Fisher, Jr. “Connecting Learning Opportunities in STEM Education: Ecosystem Collaborations Across Schools, Museums, Libraries, Employers, and Communities.” Journal of Physics: Conference Series 1065, no. 022009, 2018, doi:10.1088/1742-6596/1065/2/022009.

Morrison, Jan, and William P. Fisher, Jr. “Measuring for Management in Science, Technology, Engineering, and Mathematics Learning Ecosystems.” Journal of Physics: Conference Series 1379, no. 012042, 2019, doi:10.1088/1742-6596/1379/1/012042.

National Institute of Standards and Technology. “Appendix C: Assessment Examples. Economic Impacts of Research in Metrology.” In Assessing Fundamental Science: A Report from the Subcommittee on Research, Committee on Fundamental Science, ed. Committee on Fundamental Science Subcommittee on Research. Washington, DC: National Science and Technology Council, 1996, https://wayback.archive-it.org/5902/20150628164643/http://www.nsf.gov/statistics/ostp/assess/nstcafsk.htm#Topic%207.

National Institute of Standards and Technology. Outputs and Outcomes of NIST Laboratory Research. 18 December 2009. Last visited 18 April 2020, https://www.nist.gov/director/outputs-and-outcomes-nist-laboratory-research.

North, Douglass C. Structure and Change in Economic History. New York: W. W. Norton & Co., 1981.

Pendrill, Leslie. Quality Assured Measurement: Unification Across Social and Physical Sciences. Cham: Springer, 2019.

Pendrill, Leslie, and William P. Fisher, Jr. “Counting and Quantification: Comparing Psychometric and Metrological Perspectives on Visual Perceptions of Number.” Measurement 71, 2015: 46-55, doi: http://dx.doi.org/10.1016/j.measurement.2015.04.010.

Poposki, Nicola, Nineta Majcen, and Philip Taylor. “Assessing Publically Financed Metrology Expenditure Against Economic Parameters.” Accreditation and Quality Assurance: Journal for Quality, Comparability and Reliability in Chemical Measurement 14, no. 7, July 2009: 359-68.

Rasch, Georg. Probabilistic Models for Some Intelligence and Attainment Tests. Reprint, University of Chicago Press, 1980. Copenhagen, Denmark: Danmarks Paedogogiske Institut, 1960.

Rasch, Georg. “On General Laws and the Meaning of Measurement in Psychology.” In Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability: Volume IV: Contributions to Biology and Problems of Medicine, ed. Jerzy Neyman, 321-33 [http://www.rasch.org/memo1960.pdf]. Berkeley: University of California Press, 1961.

Solloway, Sharon, and William P. Fisher, Jr. “Mindfulness in Measurement: Reconsidering the Measurable in Mindfulness.” International Journal of Transpersonal Studies 26, 2007: 58-81 [http://digitalcommons.ciis.edu/ijts-transpersonalstudies/vol26/iss1/8 ].

Stenner, A. Jackson, William P. Fisher, Jr., Mark H. Stone, and Don S. Burdick. “Causal Rasch Models.” Frontiers in Psychology: Quantitative Psychology and Measurement 4, no. 536, August 2013: 1-14 [doi: 10.3389/fpsyg.2013.00536].

Sumner, Jane, and William P. Fisher, Jr. “The Moral Construct of Caring in Nursing as Communicative Action: The Theory and Practice of a Caring Science.” Advances in Nursing Science 31, no. 4, 2008: E19-36.

Swann, G. M. P. The Economics of Metrology and Measurement. Report for the National Measurement Office and Department of Business, Innovation and Skills. London, England: Innovative Economics, Ltd, 2009.

Williamson, Gary L. “Exploring Reading and Mathematics Growth Through Psychometric Innovations Applied to Longitudinal Data.” Cogent Education 5, no. 1464424, 2018: 1-29.

Wilson, Mark, Ed. Towards Coherence Between Classroom Assessment and Accountability. National Society for the Study of Education, vol. 103, Part II. Chicago: University of Chicago Press, 2004.

Wilson, Mark R. Constructing Measures. Mahwah, NJ: Lawrence Erlbaum Associates, 2005.

Wilson, Mark R. “Seeking a Balance Between the Statistical and Scientific Elements in Psychometrics.” Psychometrika 78, no. 2, April 2013: 211-36.

Wilson, Mark. “Making Measurement Important for Education: The Crucial Role of Classroom Assessment.” Educational Measurement: Issues and Practice 37, no. 1, 2018: 5-20.

Wilson, Mark, and William P. Fisher, Jr. “Preface: 2016 IMEKO TC1-TC7-TC13 Joint Symposium: Metrology Across the Sciences: Wishful Thinking?” Journal of Physics Conference Series 772, no. 1, 2016: 011001, http://iopscience.iop.org/article/10.1088/1742-6596/772/1/011001/pdf.

Wilson, Mark, and William P. Fisher, Jr., Eds. Psychological and Social Measurement: The Career and Contributions of Benjamin D. Wright. Springer Series in Measurement Science and Technology, ed. M. G. Cain, G. B. Rossi, J. Tesai, M. van Veghel and K.-Y Jhang. Cham, Switzerland: Springer Nature, 2017, https://link.springer.com/book/10.1007/978-3-319-67304-2.

Wilson, Mark, and William P. Fisher, Jr. “Preface of Special Issue, Psychometric Metrology.” Measurement 145, 2019: 190, https://www.sciencedirect.com/journal/measurement/special-issue/10C49L3R8GT.

Wilson, Mark, and Kathleen Scalise. “Assessment of Learning in Digital Networks.” In Assessment and Teaching of 21st Century Skills: Methods and Approach, ed. Patrick Griffin and Esther Care, 57-81. Dordrecht: Springer Netherlands, 2015.

Wilson, Mark, and Y. Toyama. “Formative and Summative Assessments in Science and Literacy Integrated Curricula: A Suggested Alternative Approach.” In Language, Literacy, and Learning in the STEM Disciplines, ed. Alison L. Bailey, Carolyn A. Maher and Louise C. Wilkinson, 231-60. New York: Routledge, 2018.

Wright, Benjamin D. “Sample-Free Test Calibration and Person Measurement.” In Proceedings of the 1967 Invitational Conference on Testing Problems, 85-101 [http://www.rasch.org/memo1.htm]. Princeton, New Jersey: Educational Testing Service, 1968.

Wright, Benjamin D. “Solving Measurement Problems with the Rasch Model.” Journal of Educational Measurement 14, no. 2, 1977: 97-116 [http://www.rasch.org/memo42.htm].

Wright, Benjamin D. “Despair and Hope for Educational Measurement.” Contemporary Education Review 3, no. 1, 1984: 281-88 [http://www.rasch.org/memo41.htm].

Wright, Benjamin D. “Additivity in Psychological Measurement.” In Measurement and Personality Assessment, ed. Edward Roskam, 101-12. North Holland: Elsevier Science Ltd, 1985.

Wright, Benjamin D. “A History of Social Science Measurement.” Educational Measurement: Issues and Practice 16, no. 4, Winter 1997: 33-45, 52. https://doi.org/10.1111/j.1745-3992.1997.tb00606.x.

Wright, Benjamin D., and Geoffrey N. Masters. Rating Scale Analysis. Chicago: MESA Press, 1982. Full text: https://www.rasch.org/BTD_RSA/pdf%20%5Breduced%20size%5D/Rating%20Scale%20Analysis.pdf.

Wright, Benjamin D., R. J. Mead, and L. H. Ludlow. KIDMAP: Person-by-Item Interaction Mapping. Tech. Rept. no. MESA Memorandum #29. Chicago: MESA Press [http://www.rasch.org/memo29.pdf], 1980.

Wright, Benjamin D., and Mark H. Stone. Best Test Design. Chicago: MESA Press, 1979. Full text: https://www.rasch.org/BTD_RSA/pdf%20%5Breduced%20size%5D/Best%20Test%20Design.pdf.

LivingCapitalMetrics Blog by William P. Fisher, Jr., Ph.D. is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Based on a work at livingcapitalmetrics.wordpress.com.
Permissions beyond the scope of this license may be available at http://www.livingcapitalmetrics.com.

On the recent Pew poll contrasting differences as to the “very big” problems we face today

October 20, 2018

An online news item appearing on 15 October 2018 proclaims that “Americans don’t just disagree on the issues. They disagree on what the issues are.” The article, by Dylan Scott on the Vox website, reports on a poll conducted by the Pew Research Center, involving registered voters in the U.S., between 24 September and 7 October. Polarizing disagreement is a recurring theme in the world, and keeping the tension up sells ads, so it is not surprising to see the emphasis in both the article and in the Pew report on major differences in people’s perceptions of what counts as a “very big” problem in the U.S. today. But a closer look at the data gives hope for finding ways to communicate across barriers that may look more significant than they actually are.

There’s no mention in the article of sampling error, uncertainty, or confidence level, but the Pew site indicates that, overall, the sampling error is 1.5%. The Vox article also mentions only the total sample size, failing to note that the registered-voter portion of the respondents is smaller by a couple of thousand. Further, the sampling error jumps to 2.6% for respondents indicating support for a Republican candidate, and to 2.3% for respondents supporting a Democrat. Again, the differences being played up are quite large, so there’s little risk of making too much of a small difference. It’s good to know just how much of a difference makes a difference, though.

That said, neither Pew nor the Vox story mentions the very strong agreement between the different groups supporting opposing party candidates when the focus is on the relative magnitudes of agreement on aligned issues. Survey research typically focuses, of course, on percentages of responses to individual questions. Only measurement geeks like me wonder whether questions addressing a common theme could be related in a way that might convey more information. My curiosity was piqued, even though it is impossible to properly evaluate a model of this kind from the mere summary percentages. I knew if I found any correspondences they might just be accidents or coincidences, but I wanted to see what would happen.

So I typed up the text of the 18 issues concerning the seriousness of the problems being confronted in the US today, along with the percentages of registered voters saying each is a “very big” problem today. I put it all into SPSS and made a few technical checks to see if any major problems of interpretation would emerge from the nonlinear and ordinal percentages. The plots and correlations I did indicated that the same general results could be inferred from both the Pew percentages and their logit transformations.
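The logit step mentioned above is easy to reproduce outside SPSS. Here is a minimal Python sketch; the percentages in it are made-up illustrative values, not Pew’s actual figures:

```python
import math

def logit(p):
    """Convert a proportion (0 < p < 1) to log-odds (logits)."""
    return math.log(p / (1 - p))

# Illustrative (made-up) shares of each group calling an issue
# a "very big" problem -- NOT the actual Pew percentages.
republican = [0.08, 0.40, 0.62]
democrat = [0.55, 0.75, 0.83]

rep_logits = [logit(p) for p in republican]
dem_logits = [logit(p) for p in democrat]

# Near the extremes (e.g., 8%), the logit stretches differences that
# raw percentages compress, which is why a point that looks like an
# outlier on the percentage scale can fall back into line in logits.
print([round(x, 2) for x in rep_logits])
```

The transformation is symmetric around 50% (logit of 0.5 is 0), which is one reason linear relations among ordinal percentages are easier to inspect on this scale.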

While I was looking at a scatter plot of the Republican vs Democrat agreement percentages I noticed something interesting. I had been wondering if perhaps the striking differences in the groups’ willingness to say problems were serious might be a matter of relative emphases. Might the Republican supporters be less willing to find anything a big problem, but to nonetheless rank the issues in the same order as the Democrat supporters? This is, after all, exactly the kind of pattern commonly found in data from various surveys, assessments, and tests. No matter whether a respondent scores low overall, or scores high, the relative order of things stays the same.

Now, this is true in the kind of data I work with because considerable care is invested in composing questions intended to hang together like that. The idea is to deliberately vary the agreeability or difficulty of the questions so they all tap the same basic construct and demonstrably measure the same thing. When these kinds of data are obtained, different questions measuring the same thing can be asked of different people without compromising the unit of measurement. That is, each examinee or respondent can answer a unique set of questions and still have a measure comparable with anyone else’s. Like I said, this does not just happen by itself, but comes about through a careful process of design and calibration. The basic principles, however, are well established and of longstanding, proven value across wide areas of research and practice.
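The invariance being described here is the property formalized in the Rasch model cited throughout the supporting literature: the probability of an agreeable response depends only on the difference between the person’s measure and the item’s calibration, so respondents at different overall levels still order the items identically. A minimal Python sketch, with made-up illustrative calibrations:

```python
import math

def rasch_probability(person_measure, item_difficulty):
    """Rasch model: P(success) = exp(b - d) / (1 + exp(b - d))."""
    return 1 / (1 + math.exp(-(person_measure - item_difficulty)))

# Illustrative item calibrations in logits (not from any real instrument).
items = {"easy": -1.0, "medium": 0.0, "hard": 1.5}

# Two respondents with different overall measures still order the items
# identically: probabilities fall as difficulty rises for both, which is
# the invariant ordering described in the text.
low = [rasch_probability(-0.5, d) for d in items.values()]
high = [rasch_probability(1.0, d) for d in items.values()]
assert low == sorted(low, reverse=True)
assert high == sorted(high, reverse=True)
```

Because only the difference b - d enters the model, the item ordering cannot change from person to person; that separability is what lets different people answer different question subsets on a common scale.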

So I was wondering if there might be one or more subsets of questions in the Pew data that would define the same problem magnitude dimension for supporters of both Republican and Democratic candidates. And as soon as I looked at the scatterplot of the percentages from the two groups, I saw that yes, indeed, there appeared to be four groups of issues that lined up along shared slopes. A color-coded version of that plot is in Figure 1.

The one statistical inference problem that emerged in examining these ordinal data concerns the yellow dot lowest and furthest to the left. At 8% agreement from the Republican supporters, it is pulled further from the linear relation than the other points. When transformed into log-odds units, however, that single problematic difference lines up well with the other yellow dots further to the right.

The identity line in the figure shows where exact agreement between the two groups would be. That line marks out the connection between the same percentages of respondents agreeing an issue is a “very big” problem. We can see that the three green dots fall very nearly on that identity line. Just below them is a row of blue dots almost parallel to the identity line. Then there is a third row of yellow dots further down, indicating more absolute disagreement between the two groups on these issues, but also quite strong agreement as to the issues’ relative magnitudes. Finally, a red line of dots in the lower right corner of the figure marks out a still more extreme range of absolute disagreement, yet it, too, runs quite parallel to the identity line.


Figure 1. Initial plot of Republican vs Democrat percentage agreement as to “Very Big” problems

Figures 2-5 below illustrate each of these groups of issues separately, giving further information on the problems and showing the regression lines and correlations for each contrast. The same colors have been retained to aid in seeing which groups of issues in Figure 1 are being shown.

The four areas of problems seem to me to correspond to perceived major threats (Figure 2), accountability and access issues (Figure 3), equal opportunity issues (Figure 4), and systemic problems (Figure 5). Each of these content areas could be explored conceptually and qualitatively to assess whether some initial sense of a measured construct can be formed. If the individual person-by-person response data could be analyzed for fit to a proper measurement model, the presence of invariant structure could be determined far more rigorously.

But even without undertaking that work, these results already suggest a basis for productive conversations between the supposedly polarized groups. To start with the low-hanging fruit, the three problems the two groups agree on to within a couple of sampling errors (Figure 2) present topics of common ground. Both Democrats and Republicans identify violent crime, the federal budget deficit, and drug addiction as matters of equally shared concern. The point is not that these are the highest-rated problems for either group, but rather that the groups agree, within the limits of statistical precision, on the extent to which these are “very big” problems. Setting shared priorities for addressing these problems could ground new relationships in the experience of having accomplished something productive together.
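The phrase “a couple of sampling errors” can be made concrete. For a simple random sample, the standard error of a percentage is sqrt(p(1-p)/n). A rough Python sketch with an assumed subsample size (Pew’s actual sample sizes and design effects would change the exact figures):

```python
import math

def percentage_se(pct, n):
    """Approximate standard error, in percentage points, of a sample
    percentage under simple random sampling."""
    p = pct / 100.0
    return 100.0 * math.sqrt(p * (1.0 - p) / n)

# Assumed subsample size for illustration; not Pew's actual n.
n = 1000
for pct in (10, 50):
    print(f"{pct}% with n={n}: SE ~ {percentage_se(pct, n):.1f} points")
```

Two groups whose percentages differ by less than two or three such standard errors cannot be confidently distinguished, which is the sense in which the issues in Figure 2 count as shared concerns.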

This new approach to building social capital might then proceed by taking up progressively more difficult areas of disagreement about what the “very big” problems are. Even though Republicans rate each area as less likely to be a “very big” problem, within each of the four groups of issues they agree with Democrats as to the issues’ relative magnitudes. News like this might not sell a lot of ads, but it does offer hope for finding new ways of approaching relationships and crossing divides.


Figure 2. Republican vs Democrat areas of agreement as to “Very Big” problems


Figure 3. Republican vs Democrat areas of some disagreement as to “Very Big” problems


Figure 4. Republican vs Democrat areas of marked disagreement as to “Very Big” problems


Figure 5. Republican vs Democrat areas of fundamental disagreement as to “Very Big” problems

Creative Commons License
LivingCapitalMetrics Blog by William P. Fisher, Jr., Ph.D. is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Based on a work at livingcapitalmetrics.wordpress.com.
Permissions beyond the scope of this license may be available at http://www.livingcapitalmetrics.com.

Revisiting The Federalist Paper No. 31 by Alexander Hamilton: An Analogy from Geometry

July 10, 2018

[John Platt’s chapters on social chain reactions in his 1966 book, The Step to Man, provoked my initial interest in his work. That work appears to be an independent development of themes found in better-known works by Tarde, Hayek, McLuhan, Latour, and others, which are of course of primary concern in thinking through metrological and ecosystem issues in psychological and social measurement. My interest also comes in the context of Platt’s supervision of Ben Wright in Robert Mulliken’s physics lab at the University of Chicago in 1948. Other chapters in the book, however, concern deeper issues of complexity and governance that cross still more disciplinary boundaries. One chapter, for instance, examines the Federalist Papers and remarks on a geometric analogy drawn by Alexander Hamilton concerning moral and political forms of knowledge. The parallel with my own thinking is such that I have restated Hamilton’s theme in my own words within a contemporary context. The following is my effort in this regard. No source citations are given, but a list of supporting references is included at bottom. Hamilton’s original text is available at: https://www.congress.gov/resources/display/content/The+Federalist+Papers#TheFederalistPapers-31 ]

 

Communication requires that we rely on the shared understandings of a common language. Language puts in play combinations of words, concepts, and things that enable us to relate to one another at varying levels of complexity. Often, we need only to convey the facts of a situation in a simple denotative statement about something learned (“the cat is on the mat”). We also need to be able to think at a higher level of conceptual complexity referred to as metalinguistic, where we refer to words themselves and how we learn about what we’ve learned (“the word ‘cat’ has no fur”). At a third, metacommunicative, level of complexity, we make statements about statements, deriving theories of learning and judgments from repeated experiences of metalinguistic learning about learning (“I was joking when I said the cat was on the mat”).

Human reason moves freely between expressions and representations of denotative facts, metalinguistic instruments like words, and metacommunicative theories. The combination of assurances obtained from the mutual support each of these provides the others establishes the ground in which the seeds of social, political, and economic life take root and grow. Thought itself emerges from the way the correspondence of things, words, and concepts precedes and informs the possibility of understanding and communication.

When understanding and communication fail, that failure may come about because of mistaken perceptions concerning the facts, a lack of vocabulary, misconceptions colored by interests, passions, or prejudices, or some combination of the three.

The maxims of geometry exhibit exactly this same pattern, combining concrete data on things in the world, instruments for abstract measurement, and formal theoretical concepts. Geometry is the primary and ancient example of how the beauty of aesthetic proportions teaches us to understand meaning. Contrary to common sense, which finds these kinds of discontinuities incomprehensible, philosophy since the time of Plato’s Symposium has taught how to make meaning in the face of seemingly irreconcilable differences between the local facts of a situation and the principles to which we may feel obliged to adhere. Geometry meaningfully and usefully represents, for instance, the undrawable infinite divisibility of line segments, as with the irrational length of the hypotenuse of a right isosceles triangle whose other two sides have lengths of 1.

This apparently absurd and counter-intuitive skipping over of the facts in the construction of the triangular figure and the summary reference to the unstateable infinity of the square root of two is so widely accepted as to provide a basis for real estate property rights that are defensible in courts of law and financially fungible. And in this everyday commonplace we have a model for separating and balancing denotative facts, instrumental words, and judicial theories in moral and political domains.

Humanity has proven far less tractable than geometry over the course of its history regarding possible sciences of morals and politics. This is understandable given humanity’s involvement in its own ongoing development. As Freud put it, humanity’s narcissistic feeling of being the center of the universe, the crown of creation, and the master of its own mind has suffered a series of blows as it has come to terms with the works of Copernicus, Darwin, and Freud himself. The struggle to establish a common human identity while also celebrating individual uniqueness is an epic adventure involving billions of tragic and comedic stories of hubris, sacrifice, and accomplishment. Humanity has arrived at a point now, however, at which a certain obstinate, perverse, and disingenuous resistance to self-understanding has gone too far.

Although the mathematical sciences excel in refining the precision of their tools, longstanding but largely untapped resources for improving the meaningfulness and value of moral and political knowledge have been available for decades. “The obscurity is much oftener in the passions and prejudices of the reasoner than in the subject.” Methods for putting passions on the table, where they can be sorted out, take advantage of the lessons beauty teaches about meaning, and thereby support each of the three levels of complexity in communication.

At this point we encounter the special relevance of those three levels of complexity to the separation and balance of powers in government. The concrete denotative factuality of data is the concern of the executive branch, as befits its orientation to matters of practical application. The abstract metalinguistic instrumentation of words is the concern of the legislative branch, in accord with its focus on the enactment of laws and measures. And formal metacommunicative explanatory theories are the concern of the judicial branch, as is appropriate to its focus on constitutional issues.

For each of us to give our own individual understandings fair play in ways that do not give free rein to unfettered prejudices entangled in words and subtle confusions, we need to be able to communicate in terms that, so far as possible, function equally well within and across each of these levels of complexity. It is only stating the obvious to say that we lack the language needed for communication of this kind. Our moral and political sciences have not yet systematically focused on creating such languages. Outside of a few scattered works, they have not even consciously hypothesized the possibility of creating them. It is nonetheless demonstrably the case that such languages are feasible, viable, and desirable.

Though good will towards all and a desire to refrain so far as possible from overt exclusionary prejudices for or against one or another group cannot always be assumed, these are the conditions necessary for a social contract and are taken as the established basis for what follows. The choice between discourse and violence includes careful attention to avoiding the violence of the premature conclusion. If we are ever to achieve improved communication and a fuller realization of both individual liberties and social progress, the care we invest in supports for life, liberty, and the pursuit of happiness must flow from this deep source.

Given the discontinuities between language’s levels of complexity, avoiding premature conclusions means needing individualized uncertainty estimates and an associated tolerance for departures from expectations set up by established fact-word-concept associations. For example, we cannot allow a three-legged horse to alter our definition of horses as four-legged animals. Neither should we allow a careless error or lucky guess to lead to immediate and unqualified judgments of learning in education. Setting up the context in which individual data points can be understood and explained is the challenge we face. Information infrastructures supporting this kind of contextualization have been in development for years.

To meet the need for new communicative capacities, these information infrastructures will have to include individualized behavioral feedback mechanisms, minimal encroachments on private affairs, manageability, modifiability, and opportunities for simultaneously enhancing one’s own interests and the greater good.

It is in this latter area that our interests are now especially focused. Our audacious but not implausible goal is to find ways of enhancing communication and the quality of information infrastructures by extending beauty’s lessons for meaning into new areas. In the same way that geometry facilitates leaps from concrete figures to abstract constructions and from there to formal ideals, so, too, must we learn, learn about that learning, and develop theories of learning in other, less well materialized areas, such as student-centered education and patient-centered health care. Doing so will set the stage for new classes of human, social, and natural capital property rights that are just as defensible in courts of law and financially fungible as real estate.

When that language is created, when those rights are assigned, and when that legal defensibility and financial fungibility are obtained, a new construction of government will follow. In it, the separation and balance of executive, legislative, and judicial powers will be applied with equal regularity and precision down to the within-individual micro level, as well as at the between-individual meso level, and at the social macro level. This distribution of freedom and responsibility across levels and domains will feed into new educational, market, health, and governmental institutions of markedly different character than we have at present.

A wide range of research publications appearing over the last several decades documents unfolding developments in this regard, and so those themes will not be repeated here. Some of these publications are listed below for those interested. Far more remains to be done in this area than has yet been accomplished, to say the least.

 

 

Sources consulted or implied

Andrich, D. (2010). Sufficiency and conditional estimation of person parameters in the polytomous Rasch model. Psychometrika, 75(2), 292-308.

Bateson, G. (1972). Steps to an ecology of mind: Collected essays in anthropology, psychiatry, evolution, and epistemology. Chicago: University of Chicago Press.

Black, P., & Wiliam, D. (2009). Developing the theory of formative assessment. Educational Assessment, Evaluation and Accountability, 21, 5-31.

Black, P., Wilson, M., & Yao, S. (2011). Road maps for learning: A guide to the navigation of learning progressions. Measurement: Interdisciplinary Research & Perspectives, 9, 1-52.

Fisher, W. P., Jr. (2002, Spring). “The Mystery of Capital” and the human sciences. Rasch Measurement Transactions, 15(4), 854 [http://www.rasch.org/rmt/rmt154j.htm].

Fisher, W. P., Jr. (2005, August 1-3). Data standards for living human, social, and natural capital. In Session G: Concluding Discussion, Future Plans, Policy, etc. Conference on Entrepreneurship and Human Rights [http://www.fordham.edu/economics/vinod/ehr05.htm], Pope Auditorium, Lowenstein Bldg, Fordham University.

Fisher, W. P., Jr. (2007, Summer). Living capital metrics. Rasch Measurement Transactions, 21(1), 1092-1093 [http://www.rasch.org/rmt/rmt211.pdf].

Fisher, W. P., Jr. (2009, November 19). Draft legislation on development and adoption of an intangible assets metric system. Retrieved 6 January 2011, from Living Capital Metrics blog: https://livingcapitalmetrics.wordpress.com/2009/11/19/draft-legislation/

Fisher, W. P., Jr. (2009, November). Invariance and traceability for measures of human, social, and natural capital: Theory and application. Measurement: Concerning Foundational Concepts of Measurement Special Issue Section, 42(9), 1278-1287.

Fisher, W. P., Jr. (2009). NIST critical national need idea white paper: Metrological infrastructure for human, social, and natural capital (Tech. Rep. No. http://www.nist.gov/tip/wp/pswp/upload/202_metrological_infrastructure_for_human_social_natural.pdf). Washington, DC: National Institute of Standards and Technology.

Fisher, W. P., Jr. (2010). Measurement, reduced transaction costs, and the ethics of efficient markets for human, social, and natural capital, Bridge to Business Postdoctoral Certification, Freeman School of Business, Tulane University (https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2340674).

Fisher, W. P., Jr. (2010). The standard model in the history of the natural sciences, econometrics, and the social sciences. Journal of Physics Conference Series, 238(1), 012016.

Fisher, W. P., Jr. (2011). Bringing human, social, and natural capital to life: Practical consequences and opportunities. Journal of Applied Measurement, 12(1), 49-66.

Fisher, W. P., Jr. (2011). Stochastic and historical resonances of the unit in physics and psychometrics. Measurement: Interdisciplinary Research & Perspectives, 9, 46-50.

Fisher, W. P., Jr. (2012). Measure and manage: Intangible assets metric standards for sustainability. In J. Marques, S. Dhiman & S. Holt (Eds.), Business administration education: Changes in management and leadership strategies (pp. 43-63). New York: Palgrave Macmillan.

Fisher, W. P., Jr. (2012, May/June). What the world needs now: A bold plan for new standards [Third place, 2011 NIST/SES World Standards Day paper competition]. Standards Engineering, 64(3), 1 & 3-5 [http://ssrn.com/abstract=2083975].

Fisher, W. P., Jr. (2015). A probabilistic model of the law of supply and demand. Rasch Measurement Transactions, 29(1), 1508-1511 [http://www.rasch.org/rmt/rmt291.pdf].

Fisher, W. P., Jr. (2018). How beauty teaches us to understand meaning. Educational Philosophy and Theory, in review.

Fisher, W. P., Jr. (2018). A nondualist social ethic: Fusing subject and object horizons in measurement. TMQ–Techniques, Methodologies, and Quality, in review.

Fisher, W. P., Jr., Oon, E. P.-T., & Benson, S. (2018). Applying Design Thinking to systemic problems in educational assessment information management. Journal of Physics Conference Series, 1044, 012012.

Fisher, W. P., Jr., Oon, E. P.-T., & Benson, S. (2018). Rethinking the role of educational assessment in classroom communities: How can design thinking address the problems of coherence and complexity? Measurement, in review.

Fisher, W. P., Jr., & Stenner, A. J. (2013). On the potential for improved measurement in the human and social sciences. In Q. Zhang & H. Yang (Eds.), Pacific Rim Objective Measurement Symposium 2012 Conference Proceedings (pp. 1-11). Berlin, Germany: Springer-Verlag.

Fisher, W. P., Jr., & Stenner, A. J. (2016). Theory-based metrological traceability in education: A reading measurement network. Measurement, 92, 489-496.

Fisher, W. P., Jr., & Stenner, A. J. (2018). Ecologizing vs modernizing in measurement and metrology. Journal of Physics Conference Series, 1044, 012025.

Gadamer, H.-G. (1980). Dialogue and dialectic: Eight hermeneutical studies on Plato (P. C. Smith, Trans.). New Haven: Yale University Press.

Gari, S. R., Newton, A., Icely, J. D., & Delgado-Serrano, M. D. M. (2017). An analysis of the global applicability of Ostrom’s design principles to diagnose the functionality of common-pool resource institutions. Sustainability, 9(7), 1287.

Gelven, M. (1984). Eros and projection: Plato and Heidegger. In R. W. Shahan & J. N. Mohanty (Eds.), Thinking about Being: Aspects of Heidegger’s thought (pp. 125-136). Norman, Oklahoma: Oklahoma University Press.

Hamilton, A. (1788, January 1). Concerning the general power of taxation (continued). The New York Packet. (Rpt. in J. E. Cooke (Ed.). (1961). The Federalist (Hamilton, Alexander; Madison, James; Jay, John) (No. 31, pp. 193-198). Middletown, Conn: Wesleyan University Press.)

Lunz, M. E., Bergstrom, B. A., & Gershon, R. C. (1994). Computer adaptive testing. International Journal of Educational Research, 21(6), 623-634.

Ostrom, E. (2015). Governing the commons: The evolution of institutions for collective action. Cambridge, UK: Cambridge University Press (Original work published 1990).

Pendrill, L., & Fisher, W. P., Jr. (2015). Counting and quantification: Comparing psychometric and metrological perspectives on visual perceptions of number. Measurement, 71, 46-55.

Penuel, W. R. (2015, 22 September). Infrastructuring as a practice for promoting transformation and equity in design-based implementation research. In Keynote. International Society for Design and Development in Education (ISDDE) 2015 Conference, Boulder, CO. Retrieved from http://learndbir.org/resources/ISDDE-Keynote-091815.pdf

Platt, J. R. (1966). The step to man. New York: John Wiley & Sons.

Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests (Reprint, with Foreword and Afterword by B. D. Wright, Chicago: University of Chicago Press, 1980). Copenhagen, Denmark: Danmarks Paedogogiske Institut.

Ricoeur, P. (1966). The project of a social ethic. In D. Stewart & J. Bien, (Eds.). (1974). Political and social essays (pp. 160-175). Athens, Ohio: Ohio University Press.

Ricoeur, P. (1970). Freud and philosophy: An essay on interpretation. Evanston, IL: Northwestern University Press.

Ricoeur, P. (1974). Violence and language. In D. Stewart & J. Bien (Eds.), Political and social essays by Paul Ricoeur (pp. 88-101). Athens, Ohio: Ohio University Press.

Ricoeur, P. (1977). The rule of metaphor: Multi-disciplinary studies of the creation of meaning in language (R. Czerny, Trans.). Toronto: University of Toronto Press.

Star, S. L., & Ruhleder, K. (1996, March). Steps toward an ecology of infrastructure: Design and access for large information spaces. Information Systems Research, 7(1), 111-134.

Wilson, M. (2005). Constructing measures: An item response modeling approach. Mahwah, New Jersey: Lawrence Erlbaum Associates.

Wright, B. D. (1958, 7). On behalf of a personal approach to learning. The Elementary School Journal, 58, 365-375. (Rpt. in M. Wilson & W. P. Fisher, Jr., (Eds.). (2017). Psychological and social measurement: The career and contributions of Benjamin D. Wright (pp. 221-232). New York: Springer Nature.)

Wright, B. D. (1999). Fundamental measurement for psychology. In S. E. Embretson & S. L. Hershberger (Eds.), The new rules of measurement: What every educator and psychologist should know (pp. 65-104 [http://www.rasch.org/memo64.htm]). Hillsdale, New Jersey: Lawrence Erlbaum Associates.
