Archive for the ‘Communication’ Category

Psychology and the social sciences: An atheoretical, scattered, and disconnected body of research

February 16, 2019

A new article in Nature Human Behaviour (NHB) points toward the need for better theory and more rigorous mathematical models in psychology and the social sciences (Muthukrishna & Henrich, 2019). The authors rightly say that the lack of an overarching cumulative theoretical framework makes it very difficult to see whether new results fit well with previous work, or if something surprising has come to light. Mathematical models are especially emphasized as being of value in specifying clear and precise expectations.

The point that the social sciences and psychology need better theories and models is painfully obvious. But there are in fact thousands of published studies and practical real-world applications that not only provide, but often surpass, the kinds of predictive theories and mathematical models called for in the NHB article. Not only does the article make no mention of any of this work, but its argument is also framed entirely in a statistical context instead of the more appropriate context of measurement science.

The concept of reliability provides an excellent point of entry. Most behavioral scientists think of reliability statistically, as a coefficient with a positive numeric value usually between 0.00 and 1.00. The tangible sense of reliability as indicating exactly how predictable an outcome is does not usually figure in most researchers’ thinking. But that sense of the specific predictability of results has been the focus of attention in social and psychological measurement science for decades.

For instance, the measurement of time is reliable in the sense that the position of the sun relative to the earth can be precisely predicted from geographic location, the time of day, and the day of the year. The numbers and words assigned to noon time are closely associated with the Sun being at the high point in the sky (though there are political variations by season and location across time zones).

That kind of a reproducible association is rarely sought in psychology and the social sciences, but it is far from nonexistent. One can discern different degrees to which that kind of association is included in models of measured constructs. Though most behavioral research doesn’t mention the connection between linear amounts of a measured phenomenon and a reproducible numeric representation of it (level 0), quite a significant body of work focuses on that connection (level 1). The disappointing thing about that level 1 work is that the relentless obsession with statistical methods prevents most researchers from connecting a reproducible quantity with a single expression of it in a standard unit, and with an associated uncertainty term (level 2). That is, level 1 researchers conceive measurement in statistical terms, as a product of data analysis. Even when results across data sets are highly correlated and could be equated to a common metric, level 1 researchers do not leverage that source of potential value for simplified communication and accumulated comparability.

And then, for their part, level 2 researchers usually do not articulate theories about the measured constructs by augmenting the mathematical data model with an explanatory model predicting variation (level 3). Level 2 researchers are empirically grounded in data, and can expand their network of measures only by gathering more data and analyzing it in ways that bring it into their standard unit’s frame of reference.

Level 3 researchers, however, have come to see what makes their measures tick. They understand the mechanisms that make their questions vary. They can write new questions to their theoretical specifications, test those questions by asking them of a relevant sample, and produce the predicted calibrations. For instance, reading comprehension is well established to be a function of the difference between a person’s reading ability and the complexity of the text they encounter (see articles by Stenner in the list below). We have built our entire educational system around this idea, as we deliberately introduce children first to the alphabet, then to the most common words, then to short sentences, and then to ever longer and more complicated text. But stating the construct model, testing it against data, calibrating a unit to which all tests and measures can be traced, and connecting together all the books, articles, tests, curricula, and students is a process that began (in English and Spanish) only in the 1980s. The process still is far from finished, and most reading research still does not use the common metric.
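The relation described above, with comprehension depending on the difference between reader ability and text complexity, can be sketched with the dichotomous Rasch model, in which the log-odds of success equal that difference. This is only a minimal illustration, assuming both quantities are expressed in logits; the function name and the example values are hypothetical and do not reproduce the scaling of any particular reading framework.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of success under the dichotomous Rasch model.

    Both arguments are assumed to be expressed in logits (log-odds units).
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# Hypothetical reader ability and text difficulties, in logits.
reader = 1.0
for text in (-1.0, 1.0, 3.0):
    p = rasch_probability(reader, text)
    print(f"text difficulty {text:+.1f}: expected success rate {p:.2f}")
```

When ability and difficulty are equal, the expected success rate is 50%; easier texts push it up and harder texts push it down, which is the sense in which a construct theory makes specific, testable predictions.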

In this kind of theory-informed context, new items can be automatically generated on the fly at the point of measurement. Those items and inferences made from them are validated by the consistency of the responses and the associated expression of the expected probability of success, agreement, etc. The expense of constant data gathering and analysis can be cut to a very small fraction of what it is at levels 0-2.

Level 3 research methods are not widely known or used, but they are not new. They are gaining traction as their use by national metrology institutes globally grows. As high profile critiques of social and psychological research practices continue to emerge, perhaps more attention will be paid to this important body of work. A few key references are provided below, and virtually every post in this blog pertains to these issues.

References

Baghaei, P. (2008). The Rasch model as a construct validation tool. Rasch Measurement Transactions, 22(1), 1145-6 [http://www.rasch.org/rmt/rmt221a.htm].

Bergstrom, B. A., & Lunz, M. E. (1994). The equivalence of Rasch item calibrations and ability estimates across modes of administration. In M. Wilson (Ed.), Objective measurement: Theory into practice, Vol. 2 (pp. 122-128). Norwood, New Jersey: Ablex.

Cano, S., Pendrill, L., Barbic, S., & Fisher, W. P., Jr. (2018). Patient-centred outcome metrology for healthcare decision-making. Journal of Physics: Conference Series, 1044, 012057.

Dimitrov, D. M. (2010). Testing for factorial invariance in the context of construct validation. Measurement & Evaluation in Counseling & Development, 43(2), 121-149.

Embretson, S. E. (2010). Measuring psychological constructs: Advances in model-based approaches. Washington, DC: American Psychological Association.

Fischer, G. H. (1973). The linear logistic test model as an instrument in educational research. Acta Psychologica, 37, 359-374.

Fischer, G. H. (1983). Logistic latent trait models with linear constraints. Psychometrika, 48(1), 3-26.

Fisher, W. P., Jr. (1992). Reliability statistics. Rasch Measurement Transactions, 6(3), 238 [http://www.rasch.org/rmt/rmt63i.htm].

Fisher, W. P., Jr. (2008). The cash value of reliability. Rasch Measurement Transactions, 22(1), 1160-1163 [http://www.rasch.org/rmt/rmt221.pdf].

Fisher, W. P., Jr., & Stenner, A. J. (2016). Theory-based metrological traceability in education: A reading measurement network. Measurement, 92, 489-496.

Green, S. B., Lissitz, R. W., & Mulaik, S. A. (1977). Limitations of coefficient alpha as an index of test unidimensionality. Educational and Psychological Measurement, 37(4), 827-833.

Hattie, J. (1985). Methodology review: Assessing unidimensionality of tests and items. Applied Psychological Measurement, 9(2), 139-64.

Hobart, J. C., Cano, S. J., Zajicek, J. P., & Thompson, A. J. (2007). Rating scales as outcome measures for clinical trials in neurology: Problems, solutions, and recommendations. Lancet Neurology, 6, 1094-1105.

Irvine, S. H., Dunn, P. L., & Anderson, J. D. (1990). Towards a theory of algorithm-determined cognitive test construction. British Journal of Psychology, 81, 173-195.

Kline, T. L., Schmidt, K. M., & Bowles, R. P. (2006). Using LinLog and FACETS to model item components in the LLTM. Journal of Applied Measurement, 7(1), 74-91.

Lunz, M. E., & Linacre, J. M. (2010). Reliability of performance examinations: Revisited. In M. Garner, G. Engelhard, Jr., W. P. Fisher, Jr. & M. Wilson (Eds.), Advances in Rasch Measurement, Vol. 1 (pp. 328-341). Maple Grove, MN: JAM Press.

Mari, L., & Wilson, M. (2014). An introduction to the Rasch measurement approach for metrologists. Measurement, 51, 315-327.

Markward, N. J., & Fisher, W. P., Jr. (2004). Calibrating the genome. Journal of Applied Measurement, 5(2), 129-141.

Maul, A., Mari, L., Torres Irribarra, D., & Wilson, M. (2018). The quality of measurement results in terms of the structural features of the measurement process. Measurement, 116, 611-620.

Muthukrishna, M., & Henrich, J. (2019). A problem in theory. Nature Human Behaviour, 1-9.

Obiekwe, J. C. (1999, August 1). Application and validation of the linear logistic test model for item difficulty prediction in the context of mathematics problems. Dissertation Abstracts International: Section B: The Sciences & Engineering, 60(2-B), 0851.

Pendrill, L. (2014). Man as a measurement instrument [Special Feature]. NCSLi Measure: The Journal of Measurement Science, 9(4), 22-33.

Pendrill, L., & Fisher, W. P., Jr. (2015). Counting and quantification: Comparing psychometric and metrological perspectives on visual perceptions of number. Measurement, 71, 46-55.

Pendrill, L., & Petersson, N. (2016). Metrology of human-based and other qualitative measurements. Measurement Science and Technology, 27(9), 094003.

Sijtsma, K. (2009). Correcting fallacies in validity, reliability, and classification. International Journal of Testing, 8(3), 167-194.

Sijtsma, K. (2009). On the use, the misuse, and the very limited usefulness of Cronbach’s alpha. Psychometrika, 74(1), 107-120.

Stenner, A. J. (2001). The necessity of construct theory. Rasch Measurement Transactions, 15(1), 804-5 [http://www.rasch.org/rmt/rmt151q.htm].

Stenner, A. J., Fisher, W. P., Jr., Stone, M. H., & Burdick, D. S. (2013). Causal Rasch models. Frontiers in Psychology: Quantitative Psychology and Measurement, 4(536), 1-14.

Stenner, A. J., & Horabin, I. (1992). Three stages of construct definition. Rasch Measurement Transactions, 6(3), 229 [http://www.rasch.org/rmt/rmt63b.htm].

Stenner, A. J., Stone, M. H., & Fisher, W. P., Jr. (2018). The unreasonable effectiveness of theory based instrument calibration in the natural sciences: What can the social sciences learn? Journal of Physics: Conference Series, 1044, 012070.

Stone, M. H. (2003). Substantive scale construction. Journal of Applied Measurement, 4(3), 282-297.

Wilson, M. (2005). Constructing measures: An item response modeling approach. Mahwah, New Jersey: Lawrence Erlbaum Associates.

Wilson, M. R. (2013). Using the concept of a measurement system to characterize measurement models used in psychometrics. Measurement, 46, 3766-3774.

Wright, B. D., & Stone, M. H. (1979). Chapter 5: Constructing a variable. In Best test design: Rasch measurement (pp. 83-128). Chicago, Illinois: MESA Press.

Wright, B. D., & Stone, M. H. (1999). Measurement essentials. Wilmington, DE: Wide Range, Inc. [http://www.rasch.org/measess/me-all.pdf].

Wright, B. D., Stone, M., & Enos, M. (2000). The evolution of meaning in practice. Rasch Measurement Transactions, 14(1), 736 [http://www.rasch.org/rmt/rmt141g.htm].

Creative Commons License
LivingCapitalMetrics Blog by William P. Fisher, Jr., Ph.D. is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Based on a work at livingcapitalmetrics.wordpress.com.
Permissions beyond the scope of this license may be available at http://www.livingcapitalmetrics.com.

Making sustainability impacts universally identifiable, individually owned, efficiently exchanged, and profitable

February 2, 2019

Sustainability impacts plainly and obviously lack common product definitions, objective measures, efficient markets, and associated capacities for competing on improved quality. The absence of these landmarks in the domain of sustainability interests is a result of inattention and cultural biases far more than it is a result of the inherent characteristics or nature of sustainability itself. Given the economic importance of these kinds of capacities and the urgent need for new innovations supporting sustainable development, it is curious how even those most stridently advocating new ways of thinking seem to systematically ignore well-established opportunities for advancing their cause. The wealth of historical examples of rapidly emerging, transformative, disruptive, and highly profitable innovations would seem to motivate massive interest in how to extend those successes in new directions.

Economists have long noted how common currencies reduce transaction costs, support property rights, and promote market efficiencies (for references and more information, see previous entries in this blog over the last ten years and more). Language itself is well known for functioning as an economical labor-saving device in the way that useful concepts representing things in the world as words need not be re-invented by everyone for themselves, but can simply be copied. In the same ways that common languages ease communication, and common currencies facilitate trade, so, too, do standards for common product definitions contribute to the creation of markets.

Metrologically traceable measurements make it possible for everyone everywhere to know how much of something in particular there is. This is important, first of all, because things have to be identifiable in shared ways if we are to be able to include them in our lives, socially. Anyone interested in obtaining or producing that kind of thing has to be able to know it and share information about it as something in particular. Common languages capable of communicating specifically what a thing is, and how much of it there is, support claims to ownership and to the fruits of investments in entrepreneurial innovations.

Technologies for precision measurement key to these communications are one of the primary products of science. Instruments measuring in SI units embody common currencies for the exchange of scientific capital. The calibration and distribution of such instruments in the domain of sustainability impact investing and innovation ought to be a top-level priority. How else will sustainable impacts be made universally identifiable, individually owned, efficiently exchanged, and profitable?

The electronics, computer, and telecommunications industries provide ample evidence of precision measurement’s role in reducing transaction costs, establishing common product definitions, and reaping huge profits. The music industry’s use of these technologies combines the science and economics of precision measurement with the artistic creativity of intensive improvisations constructed from instruments tuned to standardized scales that achieve wholly unique levels of individual innovation.

Much stands to be learned, and even more to be gained, in focusing sustainability development on ways in which we can harness the economic power of the profit motive by combining collective efforts with individual imaginations in the domains of human, social, and natural capital. Aligning financial, monetary wealth with the authentic wealth and genuine productivity of gains in human, community, and environmental value ought to be the defining mission of this generation. The time to act is now.

So you say knowledge wants to be free?

January 26, 2019

If knowledge wants to be free, why do we work so hard keeping it trapped in scores and ratings whose meanings change depending on which questions were asked and who answered them?

Why don’t we liberate knowledge from its many prisons by embodying it in measurement systems that mean the same thing (within the range of uncertainty) no matter which questions on a topic are asked and no matter who answers them?

We routinely share knowledge quickly and easily when it’s about time, length, temperature, energy, mass, etc. Methods, theories, models, and tools developed over the last 90+ years show how we could be doing the same thing for literacy, health, functionality, environmental management, and every other major area of concern in the UN Sustainable Development Goals.

There’s a lot of talk among sustainability advocates about how urgent the need is for transformative efforts, investments, and technologies. It seems to me that sense of urgency will never be more than empty talk as long as we go on willfully ignoring the fact that we hold the keys to the chains that bind us.


On the recent Pew poll contrasting differences as to the “very big” problems we face today

October 20, 2018

An online news item appearing on 15 October 2018 proclaims that “Americans don’t just disagree on the issues. They disagree on what the issues are.” The article, by Dylan Scott on the Vox website, reports on a poll conducted by the Pew Research Center, involving registered voters in the U.S., between 24 September and 7 October. Polarizing disagreement is a recurring theme in the world, and keeping the tension up sells ads, so it is not surprising to see the emphasis in both the article and in the Pew report on major differences in people’s perceptions of what counts as a “very big” problem in the U.S. today. But a closer look at the data gives hope for finding ways to communicate across barriers that may look more significant than they actually are.

The article makes no mention of sampling error, uncertainty, or confidence level, but the Pew site indicates that the overall sampling error is 1.5%. The Vox article also mentions only the total sample size and fails to say that the registered voter portion of the respondents is smaller by a couple of thousand. Further, the sampling error jumps to 2.6% for respondents indicating support for a Republican candidate, and to 2.3% for respondents supporting a Democrat. The differences being played up are quite large, so there’s little risk of making too much out of a small difference. It’s good to know just how much of a difference makes a difference, though.
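As a rough illustration of why sampling error grows for subgroups, the textbook approximation for a 95% margin of error on a proportion from a simple random sample is z * sqrt(p(1 - p) / n). The sketch below uses that formula with hypothetical sample sizes chosen only for illustration. Pew’s published figures also reflect survey weighting and design effects, so this sketch should not be expected to reproduce them.

```python
import math


def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Approximate margin of error for a proportion, assuming a simple random sample."""
    return z * math.sqrt(p * (1.0 - p) / n)


def difference_margin(n1: int, n2: int, p1: float = 0.5, p2: float = 0.5,
                      z: float = 1.96) -> float:
    """Approximate margin of error for the difference between two independent proportions."""
    return z * math.sqrt(p1 * (1.0 - p1) / n1 + p2 * (1.0 - p2) / n2)


# Hypothetical sample sizes (not the Pew figures).
for n in (8000, 3000, 2000):
    print(f"n = {n}: +/- {100 * margin_of_error(n):.1f} percentage points")

gap = 100 * difference_margin(2000, 3000)
print(f"difference between groups of 2000 and 3000: +/- {gap:.1f} points")
```

The smaller the group, the wider the margin, and the margin on a difference between two groups is wider than the margin for either group alone. That is what "how much of a difference makes a difference" comes down to.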

That said, neither Pew nor the Vox story mentions the very strong agreement between the different groups supporting opposing party candidates when the focus is on the relative magnitudes of agreement on aligned issues. Survey research typically focuses, of course, on percentages of responses to individual questions. Only measurement geeks like me wonder whether questions addressing a common theme could be related in a way that might convey more information. My curiosity was piqued, even though it is impossible to properly evaluate a model of this kind from the mere summary percentages. I knew if I found any correspondences they might just be accidents or coincidences, but I wanted to see what would happen.

So I typed up the text of the 18 issues concerning the seriousness of the problems being confronted in the US today, along with the percentages of registered voters saying each is a “very big” problem today. I put it all into SPSS and made a few technical checks to see if any major problems of interpretation would emerge from the nonlinear and ordinal percentages. The plots and correlations I did indicated that the same general results could be inferred from both the Pew percentages and their logit transformations.
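For readers unfamiliar with the logit transformation mentioned here, a minimal sketch follows. It converts a percentage to log-odds, ln(p / (1 - p)). The percentages below are hypothetical placeholders, not the Pew values, and the function name is my own.

```python
import math


def logit(percent: float) -> float:
    """Convert a percentage (strictly between 0 and 100) to log-odds."""
    p = percent / 100.0
    if not 0.0 < p < 1.0:
        raise ValueError("percentage must be strictly between 0 and 100")
    return math.log(p / (1.0 - p))


# Hypothetical percentages, for illustration only.
for pct in (8, 25, 50, 75, 92):
    print(f"{pct:3d}% -> {logit(pct):+.2f} logits")
```

The transformation stretches the scale near its ends: the gap from 8% to 25% is smaller than the gap from 25% to 50% in percentage points, but larger in logits. That is why a value near the floor of the percentage scale can look out of line until it is re-expressed in log-odds.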

While I was looking at a scatter plot of the Republican vs Democrat agreement percentages I noticed something interesting. I had been wondering if perhaps the striking differences in the groups’ willingness to say problems were serious might be a matter of relative emphases. Might the Republican supporters be less willing to find anything a big problem, but to nonetheless rank the issues in the same order as the Democrat supporters? This is, after all, exactly the kind of pattern commonly found in data from various surveys, assessments, and tests. No matter whether a respondent scores low overall, or scores high, the relative order of things stays the same.

Now, this is true in the kind of data I work with because considerable care is invested in composing questions that are intended to hang together like that. The idea is to deliberately vary the agreeability or difficulty of the questions so they all tap the same basic construct and demonstrably measure the same thing. When these kinds of data are obtained, different questions measuring the same thing can be asked of different people without compromising the unit of measurement. That is, each different examinee or respondent can answer a unique set of questions and still have a measure comparable with anyone else’s. Like I said, this does not just happen by itself, but has to come about through a careful process of design and calibration. But the basic principles are well-established as being of longstanding and proven value across wide areas of research and practice.
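The pattern described here, in which the relative order of the questions stays the same whether a respondent scores low or high, can be illustrated with a small sketch. It assumes the responses follow a Rasch model; the item locations and group locations are hypothetical values in logits, invented only to show the idea.

```python
import math


def endorsement_probability(person: float, item: float) -> float:
    """Rasch-model probability that a person endorses an item (both in logits)."""
    return 1.0 / (1.0 + math.exp(-(person - item)))


def log_odds(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


# Hypothetical item locations (higher = harder to endorse) and two respondent groups.
items = {"A": -1.0, "B": 0.0, "C": 1.5}
groups = {"low": -1.0, "high": 1.0}

for name, location in groups.items():
    probs = {k: endorsement_probability(location, d) for k, d in items.items()}
    ranking = sorted(probs, key=probs.get, reverse=True)
    gap_ab = log_odds(probs["A"]) - log_odds(probs["B"])
    print(name, ranking, f"A-B gap: {gap_ab:.2f} logits")
```

Under these assumptions, the two groups endorse every item at very different rates, yet they rank the items identically, and the log-odds gap between any two items is the same for both groups. Real data only approximate this, which is why fit to the model has to be tested rather than assumed.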

So I was wondering if there might be one or more subsets of questions in the Pew data that would define the same problem magnitude dimension for supporters of both Republican and Democratic candidates. And as soon as I looked at the scatterplot of the percentages from the two groups, I saw that yes, indeed, there appeared to be four groups of issues that lined up along shared slopes. A color-coded version of that plot is in Figure 1.

The one statistical inference problem that emerged in examining these ordinal data concerns the yellow dot that is lowest and furthest to the left. At 8% agreement from the Republican supporters it was pulled away from the linear relation further than the other correspondences. When transformed into a log-odds unit, that single problematic difference lines up well with the other yellow dots further to the right.

The identity line in the figure shows where exact agreement between the two groups would be. That line marks out the connection between the same percentages of respondents agreeing an issue is a “very big” problem. We can see that the three green dots very nearly fall on that identity line. Just below them is a row of blue dots almost parallel with the identity line. Then there’s a third row of yellow dots further down, indicating more absolute disagreement between the two groups on these issues, but also showing a quite strong agreement as to their relative magnitudes within that group. Finally, there is another, red, line of dots in the lower right corner of the figure that marks out a more extreme range of absolute disagreement, but is also quite parallel to the identity line.

Figure 1 Initial plot of Republican vs Democrat Percentages agreement as to “Very Big” problems

Figures 2-5 below illustrate each of these groups of issues separately, giving further information on the problems and showing the regression lines and correlations for each contrast. The same colors have been retained to aid in seeing which groups of issues in Figure 1 are being shown.

The four areas of problems seem to me to correspond to issues of perceived major threats (Figure 2), accountability and access issues (Figure 3), equal opportunity issues (Figure 4), and systemic problems (Figure 5). Each of these content areas could be explored conceptually and qualitatively to assess whether some initial sense of a measured construct can be formed. If the by-person individual response data could be analyzed for fit to a proper measurement model, a much better job of determining the presence of invariant structure could be done.

But even without undertaking that work, these results already suggest a basis for productive conversations between the supposedly polarized groups. To start from the low-hanging fruit, the three problems the two groups agree on to within a couple of sampling errors (Figure 2) present topics of common agreement. Both Democrats and Republicans identify violent crime, the federal budget deficit, and drug addiction as matters of equally shared concern. The point is not that these are the highest rated problems for either group, but, rather, that they agree within the limits of statistical precision as to the extent that these are “very big” problems. It may be that setting shared priorities for addressing these problems could ground new relationships in that experience of having accomplished something productive together.

This new approach to building social capital might then proceed by taking up progressively more difficult areas of disagreement as to what “very big” problems are. Even though Republicans rate each area as less likely to be a “very big” problem, within each of the four groups of issues, they agree with Democrats as to their relative magnitudes. News like this might not sell a lot of ads, but it does offer hope for finding new ways of approaching relationships and crossing divides.

Figure 2 Republican vs Democrat areas of agreement as to “Very Big” problems

Figure 3 Republican vs Democrat areas of some disagreement as to “Very Big” problems

Figure 4 Republican vs Democrat areas of marked disagreement as to “Very Big” problems

Figure 5 Republican vs Democrat areas of fundamental disagreement as to “Very Big” problems

A Yet Simpler Take on Making Sustainability Self-Sustaining

September 1, 2018

The point of focusing on sustainability is to balance human interests with a long term view of life on earth. Depleting resources as though they will be always available plainly is no way to plan for a safe and pleasant future. But it seems to me something is missing in the way we approach sustainability. Every time I see any efforts aimed at rebalancing resource usage with a long term view of the Earth’s capacity to support us, what do I see? I see solutions that cost a lot, and people saying that the costs are the price we have to pay for the mistakes that have been made, and for a viable future. And so I also see a lot of procrastination, delays, and reluctance to commit to sustainable policies and practices.

Why? Because, first, there are a great many people who cannot afford to live in the world as it is, right now, simply bearing their existing day-to-day costs. Even in the richest countries, huge proportions of people live hand to mouth, or very nearly so. Second, it’s hard to detect and punish freeloaders. Many people, companies, and governments are willing to hold off committing to sustainability in the hope that some technological fix will come along and spare them avoidable costs.

So, my question is, and I do not say this at all in jest or with any sense of irony or sarcasm: how do we make sustainability fun and profitable? How can we make sustainability economically self-sustaining? How can we make sustainability into a growth industry?

My answer to those questions is, by improving the quality of information on sustainability impacts. What does that mean? Why should that have anything to do with making sustainability fun and profitable? What improving the quality of information on sustainability impacts means is measuring it well, using methods and models that have been used in research and practice for more than 90 years. What we need is a Human, Social, and Natural Capital Metric System, or an International System of Units for Human, Social, and Natural Capital.

As we all know from the existing SI (metric system) units, high quality information makes it much easier to communicate value. Easier communication means lower transaction costs, and lower transaction costs mean that it becomes very inexpensive to find out how much of a sustainability impact is available, and what quality it is. High quality information enables grassroots bottom up efforts coordinating the decisions and behaviors of everyone everywhere. Managers would be able to dramatically improve quality in domains of human, social, and environmental value the way they do now for manufactured value. And investors would be able to reward innovation in those areas in ways they currently cannot.

For instance, with high quality sustainability impact measures, you’d be able to buy shares of stock in a new global carbon reduction effort that realistically projects it is on track to reverse climate change back to its 1980 status. If someone came out with a better carbon reduction product that would make it possible to get the job done faster or at lower cost, we would have the information we need to quickly shift the flow of resources to the better product.

Speaking to other components of the UN’s Sustainable Development Goals, maybe people need to wonder why they cannot go buy 250 units of additional literacy right now? Why can’t you get a good price on a specific amount of literacy gain for your ten-year-old child from a few minutes of competitive shopping? And while you’re at it, maybe you could catch a special sale on 470 units of improved physical functionality for your great aunt who just had a hip replacement. Oh, she doesn’t need it because she’s got herself listed in a health capital investment bond likely to pay a 6% return? Well, maybe you should sink some funds into one of those contracts!

To take up the SDG 16.1 issue, if efforts to reduce armed violence were measured with the same level of information quality as kilowatt hours, that form of social capital product would be available in market transactions just the same way manufactured capital products like electricity are now. Conversely, your personal efforts at reducing armed violence, or improving someone’s literacy, or helping your great aunt with gains in physical functionality—all of these are investments of your skills and abilities that will pay back cash value to you. And because having fun with the kids, and getting out for recreational activities, are healthful things to do, enjoyment also should pay dividends.

Maybe this focus on fun and profit in making sustainability economically self-sustaining might finally find some traction for efforts in this area. Sustainability commerce could be a way of talking about these issues that will speak to matters more directly and practically. We’ll see how that works out as I try it on people in the near future.



Revisiting The Federalist Paper No. 31 by Alexander Hamilton: An Analogy from Geometry

July 10, 2018

[John Platt’s chapters on social chain reactions in his 1966 book, The Steps to Man, provoked my initial interest in looking into his work. That work appears to be an independent development of themes that appear in better-known works by Tarde, Hayek, McLuhan, Latour, and others, which of course are of primary concern in thinking through metrological and ecosystem issues in psychological and social measurement. My interest also comes in the context of Platt’s supervision of Ben Wright in Robert Mulliken’s physics lab at the U of Chicago in 1948. However, other chapters in this book concern deeper issues of complexity and governance that cross yet more disciplinary boundaries. One of the chapters in the book, for instance, examines the Federalist Papers and remarks on a geometric analogy drawn by Alexander Hamilton concerning moral and political forms of knowledge. The parallel with my own thinking is such that I have restated Hamilton’s theme in my own words within the contemporary context. The following is my effort in this regard. No source citations are given, but a list of supporting references is included at bottom. Hamilton’s original text is available at: https://www.congress.gov/resources/display/content/The+Federalist+Papers#TheFederalistPapers-31.]

 

Communication requires that we rely on the shared understandings of a common language. Language puts in play combinations of words, concepts, and things that enable us to relate to one another at varying levels of complexity. Often, we need only to convey the facts of a situation in a simple denotative statement about something learned (“the cat is on the mat”). We also need to be able to think at a higher level of conceptual complexity referred to as metalinguistic, where we refer to words themselves and how we learn about what we’ve learned (“the word ‘cat’ has no fur”). At a third, metacommunicative, level of complexity, we make statements about statements, deriving theories of learning and judgments from repeated experiences of metalinguistic learning about learning (“I was joking when I said the cat was on the mat”).

Human reason moves freely between expressions of and representations of denotative facts, metalinguistic instruments like words, and metacommunicative theories. The combination of assurances obtained from the mutual supports each of these provides the others establishes the ground in which the seeds of social, political, and economic life take root and grow. Thought itself emerges from within the way the correspondence of things, words, and concepts precedes and informs the possibility of understanding and communication.

When understanding and communication fail, that failure may come about because of mistaken perceptions concerning the facts, a lack of vocabulary, or misconceptions colored by interests, passions, or prejudices, or some combination of these three.

The maxims of geometry exhibit exactly this same pattern, combining concrete data on things in the world, instruments for abstract measurement, and formal theoretical concepts. Geometry is the primary and ancient example of how the beauty of aesthetic proportions teaches us to understand meaning. Contrary to common sense, which finds these kinds of discontinuities incomprehensible, philosophy since the time of Plato’s Symposium teaches how to make meaning in the face of seemingly irreconcilable differences between the local facts of a situation and the principles to which we may feel obliged to adhere. Geometry meaningfully and usefully represents, for instance, the undrawable infinite divisibility of line segments, as with the irrational length of the hypotenuse of an isosceles right triangle whose two legs each have a length of 1.

This apparently absurd and counter-intuitive skipping over of the facts in the construction of the triangular figure and the summary reference to the unstateable infinity of the square root of two is so widely accepted as to provide a basis for real estate property rights that are defensible in courts of law and financially fungible. And in this everyday commonplace we have a model for separating and balancing denotative facts, instrumental words, and judicial theories in moral and political domains.

Humanity has proven far less tractable than geometry over the course of its history regarding possible sciences of morals and politics. This is understandable given humanity’s involvement in its own ongoing development. As Freud put it, humanity’s narcissistic feeling of being the center of the universe, the crown of creation, and the master of its own mind has suffered a series of blows as it has had to come to terms with the works of Copernicus, Darwin, and Freud himself. The struggle to establish a common human identity while also celebrating individual uniqueness is an epic adventure involving billions of tragic and comedic stories of hubris, sacrifice, and accomplishment. Humanity has arrived at a point now, however, at which a certain obstinate, perverse, and disingenuous resistance to self-understanding has gone too far.

Although the mathematical sciences excel in refining the precision of their tools, longstanding but largely untapped resources for improving the meaningfulness and value of moral and political knowledge have been available for decades. As Hamilton wrote in Federalist No. 31, “The obscurity is much oftener in the passions and prejudices of the reasoner than in the subject.” Methods for putting passions on the table for sorting out take advantage of the lessons beauty teaches about meaning and thereby support each of the three levels of complexity in communication.

At this point we encounter the special relevance of those three levels of complexity to the separation and balance of powers in government. The concrete denotative factuality of data is the concern of the executive branch, as befits its orientation to matters of practical application. The abstract metalinguistic instrumentation of words is the concern of the legislative branch, in accord with its focus on the enactment of laws and measures. And formal metacommunicative explanatory theories are the concern of the judicial branch, as is appropriate to its focus on constitutional issues.

For each of us to give our own individual understandings fair play in ways that do not give free rein to unfettered prejudices entangled in words and subtle confusions, we need to be able to communicate in terms that, so far as possible, function equally well within and across each of these levels of complexity. It is only to state the obvious to say that we lack the language needed for communication of this kind. Our moral and political sciences have not yet systematically focused on creating such languages. Outside of a few scattered works, they have not even yet consciously hypothesized the possibility of creating these languages. It is nonetheless demonstrably the case that these languages are feasible, viable, and desirable.

Though good will towards all and a desire to refrain so far as possible from overt exclusionary prejudices for or against one or another group cannot always be assumed, these are the conditions necessary for a social contract and are taken as the established basis for what follows. The choice between discourse and violence includes careful attention to avoiding the violence of the premature conclusion. If we are ever to achieve improved communication and a fuller realization of both individual liberties and social progress, the care we invest in supports for life, liberty, and the pursuit of happiness must flow from this deep source.

Given the discontinuities between language’s levels of complexity, avoiding premature conclusions means needing individualized uncertainty estimates and an associated tolerance for departures from expectations set up by established fact-word-concept associations. For example, we cannot allow a three-legged horse to alter our definition of horses as four-legged animals. Neither should we allow a careless error or lucky guess to lead to immediate and unqualified judgments of learning in education. Setting up the context in which individual data points can be understood and explained is the challenge we face. Information infrastructures supporting this kind of contextualization have been in development for years.

To meet the need for new communicative capacities, features of these information infrastructures will have to include individualized behavioral feedback mechanisms, minimal encroachments on private affairs, manageability, modifiability, and opportunities for simultaneously enhancing one’s own interests and the greater good.

It is in this latter area that our interests are now especially focused. Our audacious but not implausible goal is to find ways of enhancing communication and the quality of information infrastructures by extending beauty’s lessons for meaning into new areas. In the same way that geometry facilitates leaps from concrete figures to abstract constructions and from there to formal ideals, so, too, must we learn, learn about that learning, and develop theories of learning in other less well materialized areas, such as student-centered education and patient-centered health care. Doing so will set the stage for new classes of human, social, and natural capital property rights that are just as defensible in courts of law and financially fungible as real estate.

When that language is created, when those rights are assigned, and when that legal defensibility and financial fungibility are obtained, a new construction of government will follow. In it, the separation and balance of executive, legislative, and judicial powers will be applied with equal regularity and precision down to the within-individual micro level, as well as at the between-individual meso level, and at the social macro level. This distribution of freedom and responsibility across levels and domains will feed into new educational, market, health, and governmental institutions of markedly different character than we have at present.

A wide range of research publications appearing over the last several decades documents unfolding developments in this regard, and so those themes will not be repeated here. Some of these publications are listed below for those interested. Far more remains to be done in this area than has yet been accomplished, to say the least.

 

 

Sources consulted or implied

Andrich, D. (2010). Sufficiency and conditional estimation of person parameters in the polytomous Rasch model. Psychometrika, 75(2), 292-308.

Bateson, G. (1972). Steps to an ecology of mind: Collected essays in anthropology, psychiatry, evolution, and epistemology. Chicago: University of Chicago Press.

Black, P., & Wiliam, D. (2009). Developing the theory of formative assessment. Educational Assessment, Evaluation and Accountability, 21, 5-31.

Black, P., Wilson, M., & Yao, S. (2011). Road maps for learning: A guide to the navigation of learning progressions. Measurement: Interdisciplinary Research & Perspectives, 9, 1-52.

Fisher, W. P., Jr. (2002, Spring). “The Mystery of Capital” and the human sciences. Rasch Measurement Transactions, 15(4), 854 [http://www.rasch.org/rmt/rmt154j.htm].

Fisher, W. P., Jr. (2005, August 1-3). Data standards for living human, social, and natural capital. In Session G: Concluding Discussion, Future Plans, Policy, etc. Conference on Entrepreneurship and Human Rights [http://www.fordham.edu/economics/vinod/ehr05.htm], Pope Auditorium, Lowenstein Bldg, Fordham University.

Fisher, W. P., Jr. (2007, Summer). Living capital metrics. Rasch Measurement Transactions, 21(1), 1092-1093 [http://www.rasch.org/rmt/rmt211.pdf].

Fisher, W. P., Jr. (2009, November 19). Draft legislation on development and adoption of an intangible assets metric system. Retrieved 6 January 2011, from Living Capital Metrics blog: https://livingcapitalmetrics.wordpress.com/2009/11/19/draft-legislation/

Fisher, W. P., Jr. (2009, November). Invariance and traceability for measures of human, social, and natural capital: Theory and application. Measurement: Concerning Foundational Concepts of Measurement Special Issue Section, 42(9), 1278-1287.

Fisher, W. P., Jr. (2009). NIST Critical national need idea White Paper: metrological infrastructure for human, social, and natural capital (Tech. Rep. No. http://www.nist.gov/tip/wp/pswp/upload/202_metrological_infrastructure_for_human_social_natural.pdf). Washington, DC: National Institute of Standards and Technology.

Fisher, W. P., Jr. (2010). Measurement, reduced transaction costs, and the ethics of efficient markets for human, social, and natural capital, Bridge to Business Postdoctoral Certification, Freeman School of Business, Tulane University (https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2340674).

Fisher, W. P., Jr. (2010). The standard model in the history of the natural sciences, econometrics, and the social sciences. Journal of Physics Conference Series, 238(1), 012016.

Fisher, W. P., Jr. (2011). Bringing human, social, and natural capital to life: Practical consequences and opportunities. Journal of Applied Measurement, 12(1), 49-66.

Fisher, W. P., Jr. (2011). Stochastic and historical resonances of the unit in physics and psychometrics. Measurement: Interdisciplinary Research & Perspectives, 9, 46-50.

Fisher, W. P., Jr. (2012). Measure and manage: Intangible assets metric standards for sustainability. In J. Marques, S. Dhiman & S. Holt (Eds.), Business administration education: Changes in management and leadership strategies (pp. 43-63). New York: Palgrave Macmillan.

Fisher, W. P., Jr. (2012, May/June). What the world needs now: A bold plan for new standards [Third place, 2011 NIST/SES World Standards Day paper competition]. Standards Engineering, 64(3), 1 & 3-5 [http://ssrn.com/abstract=2083975].

Fisher, W. P., Jr. (2015). A probabilistic model of the law of supply and demand. Rasch Measurement Transactions, 29(1), 1508-1511 [http://www.rasch.org/rmt/rmt291.pdf].

Fisher, W. P., Jr. (2018). How beauty teaches us to understand meaning. Educational Philosophy and Theory, in review.

Fisher, W. P., Jr. (2018). A nondualist social ethic: Fusing subject and object horizons in measurement. TMQ–Techniques, Methodologies, and Quality, in review.

Fisher, W. P., Jr., Oon, E. P.-T., & Benson, S. (2018). Applying Design Thinking to systemic problems in educational assessment information management. Journal of Physics Conference Series, 1044, 012012.

Fisher, W. P., Jr., Oon, E. P.-T., & Benson, S. (2018). Rethinking the role of educational assessment in classroom communities: How can design thinking address the problems of coherence and complexity? Measurement, in review.

Fisher, W. P., Jr., & Stenner, A. J. (2013). On the potential for improved measurement in the human and social sciences. In Q. Zhang & H. Yang (Eds.), Pacific Rim Objective Measurement Symposium 2012 Conference Proceedings (pp. 1-11). Berlin, Germany: Springer-Verlag.

Fisher, W. P., Jr., & Stenner, A. J. (2016). Theory-based metrological traceability in education: A reading measurement network. Measurement, 92, 489-496.

Fisher, W. P., Jr., & Stenner, A. J. (2018). Ecologizing vs modernizing in measurement and metrology. Journal of Physics Conference Series, 1044, 012025.

Gadamer, H.-G. (1980). Dialogue and dialectic: Eight hermeneutical studies on Plato (P. C. Smith, Trans.). New Haven: Yale University Press.

Gari, S. R., Newton, A., Icely, J. D., & Delgado-Serrano, M. D. M. (2017). An analysis of the global applicability of Ostrom’s design principles to diagnose the functionality of common-pool resource institutions. Sustainability, 9(7), 1287.

Gelven, M. (1984). Eros and projection: Plato and Heidegger. In R. W. Shahan & J. N. Mohanty (Eds.), Thinking about Being: Aspects of Heidegger’s thought (pp. 125-136). Norman, Oklahoma: Oklahoma University Press.

Hamilton, A. (1788, 1 January). Concerning the general power of taxation (continued). The New York Packet. (Rpt. in J. E. Cooke, (Ed.). (1961). The Federalist (Hamilton, Alexander; Madison, James; Jay, John). (pp. No. 31, 193-198). Middletown, Conn: Wesleyan University Press.)

Lunz, M. E., Bergstrom, B. A., & Gershon, R. C. (1994). Computer adaptive testing. International Journal of Educational Research, 21(6), 623-634.

Ostrom, E. (2015). Governing the commons: The evolution of institutions for collective action. Cambridge, UK: Cambridge University Press (Original work published 1990).

Pendrill, L., & Fisher, W. P., Jr. (2015). Counting and quantification: Comparing psychometric and metrological perspectives on visual perceptions of number. Measurement, 71, 46-55.

Penuel, W. R. (2015, 22 September). Infrastructuring as a practice for promoting transformation and equity in design-based implementation research. In Keynote. International Society for Design and Development in Education (ISDDE) 2015 Conference, Boulder, CO. Retrieved from http://learndbir.org/resources/ISDDE-Keynote-091815.pdf

Platt, J. R. (1966). The step to man. New York: John Wiley & Sons.

Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests (Reprint, with Foreword and Afterword by B. D. Wright, Chicago: University of Chicago Press, 1980). Copenhagen, Denmark: Danmarks Paedagogiske Institut.

Ricoeur, P. (1966). The project of a social ethic. In D. Stewart & J. Bien, (Eds.). (1974). Political and social essays (pp. 160-175). Athens, Ohio: Ohio University Press.

Ricoeur, P. (1970). Freud and philosophy: An essay on interpretation. Evanston, IL: Northwestern University Press.

Ricoeur, P. (1974). Violence and language. In D. Stewart & J. Bien (Eds.), Political and social essays by Paul Ricoeur (pp. 88-101). Athens, Ohio: Ohio University Press.

Ricoeur, P. (1977). The rule of metaphor: Multi-disciplinary studies of the creation of meaning in language (R. Czerny, Trans.). Toronto: University of Toronto Press.

Star, S. L., & Ruhleder, K. (1996, March). Steps toward an ecology of infrastructure: Design and access for large information spaces. Information Systems Research, 7(1), 111-134.

Wilson, M. (2005). Constructing measures: An item response modeling approach. Mahwah, New Jersey: Lawrence Erlbaum Associates.

Wright, B. D. (1958, 7). On behalf of a personal approach to learning. The Elementary School Journal, 58, 365-375. (Rpt. in M. Wilson & W. P. Fisher, Jr., (Eds.). (2017). Psychological and social measurement: The career and contributions of Benjamin D. Wright (pp. 221-232). New York: Springer Nature.)

Wright, B. D. (1999). Fundamental measurement for psychology. In S. E. Embretson & S. L. Hershberger (Eds.), The new rules of measurement: What every educator and psychologist should know (pp. 65-104 [http://www.rasch.org/memo64.htm]). Hillsdale, New Jersey: Lawrence Erlbaum Associates.
