Archive for the ‘instruments’ Category

IMEKO Joint Symposium in St. Petersburg, Russia, 2-5 July 2019

June 26, 2019

The IMEKO Joint Symposium will be next week, 2-5 July, at the Original Sokos Hotel Olympia Garden, located at Batayskiy Pereulok, 3А, in St. Petersburg, Russia. Kudos to Kseniia Sapozhnikova, Giovanni Rossi, Eric Benoit, and the organizing committee for putting together such an impressive program, which is posted at: https://imeko19-spb.org/wp-content/uploads/2019/06/Program-of-the-Symposium.pdf

Metrology engineers and psychometricians from around the world will present on measurement across the sciences. Presenters include Andrich, Cavanagh, Fitkov-Norris, Huang, Mari, Melin, Nguyen, Oon, Powers, Salzberger, and Wilson, with co-authors including Adams, Cano, Maul, Pendrill, and more.

For background on this rapidly developing new conversation on measurement across the sciences, see the references listed below. The late Ludwig Finkelstein, editor of IMEKO’s Measurement journal from 1982 to 2000, was a primary instigator of work in this area. At the 2010 Joint Symposium he co-hosted in London, Finkelstein said: “It is increasingly recognised that the wide range and diverse applications of measurement are based on common logical and philosophical principles and share common problems” (Finkelstein, 2010, p. 2). The IMEKO Joint Symposium continues to advance in the direction foreseen by Finkelstein.

The program includes a round table discussion, chaired by Mari and Chunovkina, on “Terminology issues related to expanding boundaries of measurements.”

Paper titles include:

Andrich on “Exemplifying natural science measurement in the social sciences with Rasch measurement theory”

Benoit, et al. on “Musical instruments for the measurement of autism sensory disorders”

Budylina and Danilov on “Methods to ensure the reliability of measurements in the age of Industry 4.0”

Cavanagh, Asano-Cavanagh, and Fisher on “Natural semantic metalanguage as an approach to measuring meaning”

Crenna and Rossi on “Squat biomechanics in weightlifting: Foot attitude effects”

Fisher, Pendrill, Lips da Cruz, and Felin on “Why metrology? Fair dealing and efficient markets for the UN SDGs”

Fisher and Wilson on “The BEAR Assessment System Software as a platform for developing and applying UN SDG metrics”

Fitkov-Norris and Yeghiazarian on “Is context the hidden spanner in the works of educational measurement: Exploring the impact of context on mode of learning preferences”

Gavrilenkov, et al. on “Multicriteria approach to design of strain gauge force transducers”

Grednovskaya, et al. on “Measuring non-physical quantities in the procedures of philosophical practice”

Huang, Oon, and Fisher on “Coherence in measuring student evaluation of teaching: A new paradigm”

Katkov on “The status of and prospects for development of voltage quantum standards”

Kneller and Fayans on “Solving interdisciplinary tasks: The challenge and the ways to surmount it”

Kostromina and Gnedykh on “Problems and prospects of complex psychological phenomena measurement”

Lips da Cruz, Fisher, Pendrill, and Felin on “Accelerating the realization of the UN SDGs through metrological multi-stakeholder interoperability”

Lyubimtsev, et al. on “Measuring systems designed for working with living organisms as biosensors: Features of their metrological maintenance”

Mari, Chunovkina, and Ehrlich on “The complex concept of quantity in the past and (possibly) the future of the International Vocabulary of Metrology”

Mari, Maul, and Wilson on “Can there be one meaning of ‘measurement’ across the sciences?”

Melin, Pendrill, Cano, and the EMPIR NeuroMET 15HLT04 Consortium on “Towards patient-centred cognition metrics”

Morrison and Fisher on “Measuring for management in Science, Technology, Engineering, and Mathematics learning ecosystems”

Nguyen on “The feasibility of using an international common reading progression to measure reading across languages: A case study of the Vietnamese language”

Nguyen, Nguyen, and Adams on “Assessment of the generic problem-solving construct across different contexts”

Oon, Hoi-Ka, and Fisher on “Metrologically coherent assessment for learning: What, why, and how”

Pandurevic, et al. on “Methods for quantitative evaluation of force and technique in competitive sport climbing”

Pavese on “Musing on extreme quantity values in physics and the problem of removing infinity”

Powers and Fisher on “Advances in modelling visual symptoms and visual skills”

Salzberger, Cano, et al. on “Addressing traceability in social measurement: Establishing a common metric for dependence”

Sapozhnikova, et al. on “Music and growl of a lion: Anything in common? Measurement model optimized with the help of AI will answer”

Soratto, Nunes, and Cassol on “Legal metrological verification in health area in Brazil”

Wilson and Dulhunty on “Interpreting the relationship between item difficulty and DIF: Examples from educational testing”

Wilson, Mari, and Maul on “The status of the concept of reference object in measurement in the human sciences compared to the physical sciences”

Background References

Finkelstein, L. (1975). Representation by symbol systems as an extension of the concept of measurement. Kybernetes, 4(4), 215-223.

Finkelstein, L. (2003, July). Widely, strongly and weakly defined measurement. Measurement, 34(1), 39-48.

Finkelstein, L. (2005). Problems of measurement in soft systems. Measurement, 38(4), 267-274.

Finkelstein, L. (2009). Widely-defined measurement–An analysis of challenges. Measurement: Concerning Foundational Concepts of Measurement Special Issue Section (L. Finkelstein, Ed.), 42(9), 1270-1277.

Finkelstein, L. (2010). Measurement and instrumentation science and technology-the educational challenges. Journal of Physics Conference Series, 238, doi:10.1088/1742-6596/238/1/012001.

Fisher, W. P., Jr. (2009). Invariance and traceability for measures of human, social, and natural capital: Theory and application. Measurement: Concerning Foundational Concepts of Measurement Special Issue (L. Finkelstein, Ed.), 42(9), 1278-1287.

Mari, L. (2000). Beyond the representational viewpoint: A new formalization of measurement. Measurement, 27, 71-84.

Mari, L., Maul, A., Irribarra, D. T., & Wilson, M. (2016, March). Quantities, quantification, and the necessary and sufficient conditions for measurement. Measurement, 100, 115-121. Retrieved from http://www.sciencedirect.com/science/article/pii/S0263224116307497

Mari, L., & Wilson, M. (2014, May). An introduction to the Rasch measurement approach for metrologists. Measurement, 51, 315-327. Retrieved from http://www.sciencedirect.com/science/article/pii/S0263224114000645

Pendrill, L. (2014, December). Man as a measurement instrument [Special Feature]. NCSLi Measure: The Journal of Measurement Science, 9(4), 22-33. Retrieved from http://www.tandfonline.com/doi/abs/10.1080/19315775.2014.11721702

Pendrill, L., & Fisher, W. P., Jr. (2015). Counting and quantification: Comparing psychometric and metrological perspectives on visual perceptions of number. Measurement, 71, 46-55. doi: http://dx.doi.org/10.1016/j.measurement.2015.04.010

Pendrill, L., & Petersson, N. (2016). Metrology of human-based and other qualitative measurements. Measurement Science and Technology, 27(9), 094003. Retrieved from https://doi.org/10.1088/0957-0233/27/9/094003

Wilson, M. R. (2013). Using the concept of a measurement system to characterize measurement models used in psychometrics. Measurement, 46, 3766-3774. Retrieved from http://www.sciencedirect.com/science/article/pii/S0263224113001061

Wilson, M., & Fisher, W. (2016). Preface: 2016 IMEKO TC1-TC7-TC13 Joint Symposium: Metrology across the Sciences: Wishful Thinking? Journal of Physics Conference Series, 772(1), 011001. Retrieved from http://iopscience.iop.org/article/10.1088/1742-6596/772/1/011001/pdf

Wilson, M., & Fisher, W. (2018). Preface of special issue, Metrology across the Sciences: Wishful Thinking? Measurement, 127, 577.

Wilson, M., & Fisher, W. (2019). Preface of special issue, Psychometric Metrology. Measurement, 145, 190.

Creative Commons License
LivingCapitalMetrics Blog by William P. Fisher, Jr., Ph.D. is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Based on a work at livingcapitalmetrics.wordpress.com.
Permissions beyond the scope of this license may be available at http://www.livingcapitalmetrics.com.

Taking the Scales of Justice Seriously as a Model for Sustainable Political Economies

February 28, 2019

We all take standards of measurement for granted as background assumptions that we never have to think about. But as technical, mundane, and boring as these standards are, they define our systems of fair dealing and just relations. The image of blind justice holding a balance scale is a universal ideal being compromised in multiple ways by chaotic forces in today’s complicated world arena.

Even so, astoundingly little effort is being invested in systematically exploring how the scales of justice might be more meaningfully and resiliently embedded within our social, economic, educational, health care, and political institutions. This may well be because the idea that people’s abilities, behaviors, and knowledge could be precisely weighed on a scale, like fruit in a grocery store, seems outrageously immoral, opening the door to treating people like commodities to be bought and sold. And even if the political will for such measures could be found, the regulatory enforcement of legally binding contracts and accounting standards appears so implausibly complicated as to make the whole matter not worth any serious consideration at all.

On the face of it, a literal application of the scales of justice to human affairs echoes ideas discredited so thoroughly and for so long that bringing them up in the here and now seems utterly ridiculous, at least, and perhaps truly dangerous, with no possible result except the crushing reduction of human beings to cogs in a soulless machine.

But what if there is some basic way in which measurement is misunderstood when it is taken to mean people will be treated like mass produced commodities for sale? What if we could measure, legally own, invest in, and profit from our literacy, health, and trustworthiness, in the same way we do with property and material things? What if precision measurement was not a tool for oppressive manipulation but a means of obtaining, sharing, and communicating valuable information? What if local contextual situations can be allowed a latitude of variation that does not negatively compromise navigable continuity?

Circumstances are conspiring to take humanity in new directions. Complex new necessities are nurturing the conception and birth of new innovations. A wealth of diverse possibilities for adaptive experimentation proposed in the past–sometimes the distant past–are finding new life in today’s technological context. And science has changed a lot in the last 100 years. In fact, the public is largely unaware that the old paradigm of mechanical reduction has been completely demolished and replaced with a new paradigm of organic emergence and complex adaptive systems. Even Newtonian mechanics and the basic number theory of arithmetic have had to be reworked. It is also true that very few experts have thought through what the demise of the mechanical root metaphor, and the birth of an organic ecosystem metaphor, means philosophically, socially, historically, and culturally.

Bottom-up manifestations of repeating patterns that can be scaled, measured, quantified, and explained open up a wide array of new opportunities for learning from shared experiences. And, just as humanity has long understood about music, we know now how to contextualize group and individual assessment and survey response patterns in ways that let everyone be what they are, uniquely improvising playful creative performances expressed using high tech instruments tuned to shared standards. A huge amount of conceptual and practical work needs to be done, but there are multiple historical precedents suggesting that betting against human ingenuity would be a losing wager.

Two new projects I’m involved in concerning sustainability impact investing and a metrology center for categorical measures begin a new exploration of the consequences of this paradigm shift for our image of the scales of justice as representing a moral imperative. These projects ask whether more complex combinations of mathematics, experiment, technology, and theory can be overtly conceived and implemented in terms of participatory and democratic social and cognitive ecosystems. If so, we may then find our way to new standards of measurement, new languages, and new forms of social organization sufficient to redefine what we take for granted as satisfying our shared sense of fair dealing and just relations.

Making sustainability impacts universally identifiable, individually owned, efficiently exchanged, and profitable

February 2, 2019

Sustainability impacts plainly and obviously lack common product definitions, objective measures, efficient markets, and associated capacities for competing on improved quality. The absence of these landmarks in the domain of sustainability interests is a result of inattention and cultural biases far more than it is a result of the inherent characteristics or nature of sustainability itself. Given the economic importance of these kinds of capacities and the urgent need for new innovations supporting sustainable development, it is curious how even those most stridently advocating new ways of thinking seem to systematically ignore well-established opportunities for advancing their cause. The wealth of historical examples of rapidly emerging, transformative, disruptive, and highly profitable innovations would seem to motivate massive interest in how to extend those successes in new directions.

Economists have long noted how common currencies reduce transaction costs, support property rights, and promote market efficiencies (for references and more information, see previous entries in this blog over the last ten years and more). Language itself is well known for functioning as an economical labor-saving device in the way that useful concepts representing things in the world as words need not be re-invented by everyone for themselves, but can simply be copied. In the same ways that common languages ease communication, and common currencies facilitate trade, so, too, do standards for common product definitions contribute to the creation of markets.

Metrologically traceable measurements make it possible for everyone everywhere to know how much of something in particular there is. This is important, first of all, because things have to be identifiable in shared ways if we are to be able to include them in our lives, socially. Anyone interested in obtaining or producing that kind of thing has to be able to know it and share information about it as something in particular. Common languages capable of communicating specifically what a thing is, and how much of it there is, support claims to ownership and to the fruits of investments in entrepreneurial innovations.

Technologies for precision measurement key to these communications are one of the primary products of science. Instruments measuring in SI units embody common currencies for the exchange of scientific capital. The calibration and distribution of such instruments in the domain of sustainability impact investing and innovation ought to be a top-level priority. How else will sustainable impacts be made universally identifiable, individually owned, efficiently exchanged, and profitable?

The electronics, computer, and telecommunications industries provide ample evidence of precision measurement’s role in reducing transaction costs, establishing common product definitions, and reaping huge profits. The music industry’s use of these technologies combines the science and economics of precision measurement with the artistic creativity of intensive improvisations constructed from instruments tuned to standardized scales that achieve wholly unique levels of individual innovation.

Much stands to be learned, and even more to be gained, in focusing sustainability development on ways in which we can harness the economic power of the profit motive by combining collective efforts with individual imaginations in the domains of human, social, and natural capital. Aligning financial, monetary wealth with the authentic wealth and genuine productivity of gains in human, community, and environmental value ought to be the defining mission of this generation. The time to act is now.

Evaluating Questionnaires as Measuring Instruments

June 23, 2018

An email came in today asking whether three different short (4- and 5-item) questionnaires could be expected to provide reasonable quality measurement. Here’s my response.

—–

Thanks for raising this question. The questionnaire plainly was not designed to provide data suitable for measurement. Though much can be learned about making constructs measurable from data produced by this kind of questionnaire, “Rasch analysis” cannot magically create a silk purse from a sow’s ear (as the old expression goes). Use Linacre’s (1993) generalizability theory nomograph to see what reliabilities are expected for each subscale, given the numbers of items and rating categories, and applying a conservative estimate of the adjusted standard deviations (1.0 logit, for instance). Convert the reliability coefficients into strata (Fisher, 1992, 2008; Wright & Masters, 1982, pp. 92, 105-106) to make the practical meaning of the precision obtained obvious.
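As a minimal sketch of that last conversion, assuming the usual relations in which the separation index is G = sqrt(R / (1 - R)) and the number of strata is H = (4G + 1) / 3, the arithmetic can be written out in Python. The function names are illustrative only and do not come from any software package.

```python
import math


def separation(reliability: float) -> float:
    """Separation index G implied by a reliability coefficient R."""
    if not 0.0 <= reliability < 1.0:
        raise ValueError("reliability must be in the interval [0, 1)")
    return math.sqrt(reliability / (1.0 - reliability))


def strata(reliability: float) -> float:
    """Statistically distinct strata H = (4G + 1) / 3."""
    return (4.0 * separation(reliability) + 1.0) / 3.0


# Illustrative reliabilities, not results from any real subscale.
for r in (0.50, 0.70, 0.80, 0.90):
    print(f"reliability {r:.2f}: separation {separation(r):.2f}, "
          f"strata {strata(r):.2f}")
```

A reliability of 0.80 corresponds to a separation of 2 and about 3 strata, whereas 0.50 gives a separation of 1 and fewer than 2 strata, which is the practical difference the coefficients alone tend to hide.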

So if you have data, analyze it and compare the expected and observed reliabilities. If the uncertainties are quite different, is that because of targeting issues? But before you do that, ask experts in the area to rank order:

  • the courses by relevance to the job;
  • the evaluation criteria from easy to hard; and
  • the skills/competencies in order of importance to job performance.

Then study the correspondence between the rankings and the calibration results. Where do they converge and diverge? Why? What’s unexpected? What can be learned?
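One hedged way to begin that comparison is a Spearman rank correlation between the experts' ordering and the calibrated item difficulties, followed by a look at the items whose ranks disagree most. The sketch below is plain Python, and the item labels, expert ranks, and logit values are invented for illustration only.

```python
import math


def average_ranks(values):
    """Rank values from 1 (lowest), giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0  # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: expert ordering (1 = easiest) and calibrations in logits.
items = ["A", "B", "C", "D", "E"]
expert_order = [1, 2, 3, 4, 5]
calibrations = [-1.2, -0.4, 0.1, -0.1, 1.6]

rho = spearman(expert_order, calibrations)
print(f"Spearman correlation: {rho:.2f}")

# List the items where expectation and calibration disagree in rank.
for name, e, c in zip(items, average_ranks(expert_order),
                      average_ranks(calibrations)):
    if e != c:
        print(f"item {name}: expert rank {e:.0f}, calibrated rank {c:.0f}")
```

The coefficient only summarizes the agreement. The items that swap places are where the questions above (Why? What's unexpected?) have to be taken up with the experts.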

Analyze all of the items in each area (student, employer, instructor) together in Winsteps and study each of the three tables 23.x, setting PRCOMP=S. Remember that the total variance explained is not interpreted simply in terms of “more is better” and that the total variance explained is not as important as the ratio of that variance to the variance in the first contrast (see Linacre, 2006, 2008). If the ratio is greater than 3, the scale is essentially unidimensional (though significant problems may remain to be diagnosed and corrected).
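The ratio check is simple arithmetic, sketched here in Python on the assumption that both quantities are read from the output in the same units (both in eigenvalue units, or both as percentages). The numbers are made up, and the threshold of 3 is the rule of thumb stated above, not a statistical test.

```python
def contrast_ratio(explained_by_measures: float, first_contrast: float) -> float:
    """Ratio of variance explained by the measures to first-contrast variance."""
    if first_contrast <= 0:
        raise ValueError("first-contrast variance must be positive")
    return explained_by_measures / first_contrast


def essentially_unidimensional(explained_by_measures: float,
                               first_contrast: float,
                               threshold: float = 3.0) -> bool:
    """Apply the rule of thumb: a ratio greater than the threshold."""
    return contrast_ratio(explained_by_measures, first_contrast) > threshold


# Hypothetical values in eigenvalue units, for illustration only.
explained, contrast = 12.4, 2.1
print(f"ratio = {contrast_ratio(explained, contrast):.1f}")
print("passes rule of thumb:", essentially_unidimensional(explained, contrast))
```

Passing this check does not end the diagnosis. As noted above, significant problems may remain even when the ratio is comfortably over 3.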

Common practice holds that unexplained variance eigenvalues should be less than 1.5, but this overly simplistic rule of thumb (Chou & Wang, 2010; Raîche, 2005) has been contradicted in practice many times, since, even if one or more eigenvalues are over 1.5, theory may say the items belong to the same construct, and the disattenuated correlations of the measures implied by the separate groups of items (provided in tables 23.x) may still approach 1.00, indicating that the same measures are produced across subscales. See Green (1996) and Smith (1996), among others, for more on this.
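For readers who want to see what a disattenuated correlation does, here is a minimal Python sketch of the classical correction for attenuation: the observed correlation divided by the square root of the product of the two reliabilities. The input values are hypothetical, and the output tables report the coefficient directly, so this only illustrates the arithmetic.

```python
import math


def disattenuated_correlation(observed_r: float,
                              reliability_x: float,
                              reliability_y: float) -> float:
    """Correct an observed correlation for measurement error in both measures."""
    if reliability_x <= 0 or reliability_y <= 0:
        raise ValueError("reliabilities must be positive")
    return observed_r / math.sqrt(reliability_x * reliability_y)


# Hypothetical person measures from two clusters of items.
observed = 0.72   # correlation between the two sets of measures
rel_a = 0.75      # reliability of measures from the first cluster
rel_b = 0.80      # reliability of measures from the second cluster

corrected = disattenuated_correlation(observed, rel_a, rel_b)
print(f"observed {observed:.2f} -> disattenuated {corrected:.2f}")
# Estimation error can push the corrected value above 1.00; treat such
# values as indicating a correlation of essentially 1.00.
```

With short subscales the reliabilities are low, so a modest observed correlation can still imply that the item clusters are measuring the same thing once error is taken into account.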

If subscales within each of the three groups of items are markedly different in the measures they produce, then separate them in different analyses. If these further analyses reveal still more multidimensionalities, it’s time to go back to the drawing board, given how short these scales are. If you define a plausible scale, study the item difficulty orders closely with one or more experts in the area. If there is serious interest in precision measurement and its application to improved management, and not just a bureaucratic need for data to satisfy empty demands for a mere appearance of quality assessment, then trace the evolution of the construct as it changes from less to more across the items.

What, for instance, is the common theme addressed across the courses that makes them all relevant to job performance? The courses were each created with an intention and they were brought together into a curriculum for a purpose. These intentions and purposes are the raw material of a construct theory. Spell out the details of how the courses build competency in translation.

Furthermore, I imagine that this curriculum, by definition, was set up to be effective in training students no matter who is in the courses (within the constraints of the admission criteria), and no matter which particular challenges relevant to job performance are sampled from the universe of all possible challenges. You will recognize these unexamined and unarticulated assumptions as what need to be explicitly stated as hypotheses informing a model of the educational enterprise. This model transforms implicit assumptions into requirements that are never fully satisfied but can be very usefully approximated.

As I’ve been saying for a long time (Fisher, 1989), please do not accept the shorthand language of references to “the Rasch model”, “Rasch scaling”, “Rasch analysis”, etc. Rasch did not invent the form of these models, which are at least as old as Plato. And measurement is not a function of data analysis. Data provide experimental evidence testing model-based hypotheses concerning construct theories. When explanatory theory corroborates and validates data in calibrated instrumentation, the instrument can be applied at the point of use with no need for data analysis, to produce measures, uncertainty (error) estimates, and graphical fit assessments (Connolly, Nachtman, & Pritchett, 1971; Davis, et al., 2008; Fisher, 2006; Fisher, Harvey, & Kilgore, 1995; Linacre, 1997; many others).

So instead of using those common shorthand phrases, please speak directly to the problem of modeling the situation in order to produce a practical tool for managing it.

Further information is available in the references below.

Aryadoust, S. V. (2009). Mapping Rasch-based measurement onto the argument-based validity framework. Rasch Measurement Transactions, 23(1), 1192-3 [http://www.rasch.org/rmt/rmt231.pdf].

Chang, C.-H. (1996). Finding two dimensions in MMPI-2 depression. Structural Equation Modeling, 3(1), 41-49.

Chou, Y. T., & Wang, W. C. (2010). Checking dimensionality in item response models with principal component analysis on standardized residuals. Educational and Psychological Measurement, 70, 717-731.

Connolly, A. J., Nachtman, W., & Pritchett, E. M. (1971). Keymath: Diagnostic Arithmetic Test. Circle Pines, Minnesota: American Guidance Service. Retrieved 23 June 2018 from https://images.pearsonclinical.com/images/pa/products/keymath3_da/km3-da-pub-summary.pdf

Davis, A. M., Perruccio, A. V., Canizares, M., Tennant, A., Hawker, G. A., Conaghan, P. G. et al. (2008, May). The development of a short measure of physical function for hip OA HOOS-Physical Function Shortform (HOOS-PS): An OARSI/OMERACT initiative. Osteoarthritis Cartilage, 16(5), 551-559.

Fisher, W. P., Jr. (1989). What we have to offer. Rasch Measurement Transactions, 3(3), 72 [http://www.rasch.org/rmt/rmt33d.htm].

Fisher, W. P., Jr. (1992). Reliability statistics. Rasch Measurement Transactions, 6(3), 238 [http://www.rasch.org/rmt/rmt63i.htm].

Fisher, W. P., Jr. (2006). Survey design recommendations [expanded from Fisher, W. P. Jr. (2000) Popular Measurement, 3(1), pp. 58-59]. Rasch Measurement Transactions, 20(3), 1072-1074 [http://www.rasch.org/rmt/rmt203.pdf].

Fisher, W. P., Jr. (2008). The cash value of reliability. Rasch Measurement Transactions, 22(1), 1160-1163 [http://www.rasch.org/rmt/rmt221.pdf].

Fisher, W. P., Jr., Harvey, R. F., & Kilgore, K. M. (1995). New developments in functional assessment: Probabilistic models for gold standards. NeuroRehabilitation, 5(1), 3-25.

Green, K. E. (1996). Dimensional analyses of complex data. Structural Equation Modeling, 3(1), 50-61.

Linacre, J. M. (1993). Rasch-based generalizability theory. Rasch Measurement Transactions, 7(1), 283-284 [http://www.rasch.org/rmt/rmt71h.htm].

Linacre, J. M. (1997). Instantaneous measurement and diagnosis. Physical Medicine and Rehabilitation State of the Art Reviews, 11(2), 315-324 [http://www.rasch.org/memo60.htm].

Linacre, J. M. (1998). Detecting multidimensionality: Which residual data-type works best? Journal of Outcome Measurement, 2(3), 266-83.

Linacre, J. M. (1998). Structure in Rasch residuals: Why principal components analysis? Rasch Measurement Transactions, 12(2), 636 [http://www.rasch.org/rmt/rmt122m.htm].

Linacre, J. M. (2003). PCA: Data variance: Explained, modeled and empirical. Rasch Measurement Transactions, 17(3), 942-943 [http://www.rasch.org/rmt/rmt173g.htm].

Linacre, J. M. (2006). Data variance explained by Rasch measures. Rasch Measurement Transactions, 20(1), 1045 [http://www.rasch.org/rmt/rmt201a.htm].

Linacre, J. M. (2008). PCA: Variance in data explained by Rasch measures. Rasch Measurement Transactions, 22(1), 1164 [http://www.rasch.org/rmt/rmt221j.htm].

Raîche, G. (2005). Critical eigenvalue sizes in standardized residual Principal Components Analysis. Rasch Measurement Transactions, 19(1), 1012 [http://www.rasch.org/rmt/rmt191h.htm].

Schumacker, R. E., & Linacre, J. M. (1996). Factor analysis and Rasch. Rasch Measurement Transactions, 9(4), 470 [http://www.rasch.org/rmt/rmt94k.htm].

Smith, E. V., Jr. (2002). Detecting and evaluating the impact of multidimensionality using item fit statistics and principal component analysis of residuals. Journal of Applied Measurement, 3(2), 205-31.

Smith, R. M. (1996). A comparison of methods for determining dimensionality in Rasch measurement. Structural Equation Modeling, 3(1), 25-40.

Wright, B. D. (1996). Comparing Rasch measurement and factor analysis. Structural Equation Modeling, 3(1), 3-24.

Wright, B. D., & Masters, G. N. (1982). Rating scale analysis: Rasch measurement. Chicago, Illinois: MESA Press.

Excerpts and Notes from Goldberg’s “Billions of Drops…”

December 23, 2015

Goldberg, S. H. (2009). Billions of drops in millions of buckets: Why philanthropy doesn’t advance social progress. New York: Wiley.

p. 8:
Transaction costs: “…nonprofit financial markets are highly disorganized, with considerable duplication of effort, resource diversion, and processes that ‘take a fair amount of time to review grant applications and to make funding decisions’ [citing Harvard Business School Case No. 9-391-096, p. 7, Note on Starting a Nonprofit Venture, 11 Sept 1992]. It would be a major understatement to describe the resulting capital market as inefficient.”

A McKinsey study found that nonprofits spend 2.5 to 12 times more raising capital than for-profits do. When administrative costs are factored in, nonprofits spend 5.5 to 21.5 times more.

For-profit and nonprofit funding efforts contrasted on pages 8 and 9.

p. 10:
Balanced scorecard rating criteria

p. 11:
“Even at double-digit annual growth rates, it will take many years for social entrepreneurs and their funders to address even 10% of the populations in need.”

p. 12:
Exhibit 1.5 shows that the percentages of various needs served by leading social enterprises are barely drops in the respective buckets; they range from 0.07% to 3.30%.

pp. 14-16:
Nonprofit funding is not tied to performance. Even when a nonprofit makes the effort to show measured improvement in impact, it does little or nothing to change their funding picture. It appears that there is some kind of funding ceiling implicitly imposed by funders, since nonprofit growth and success seems to persuade capital sources that their work there is done. Mediocre and low performing nonprofits seem to be able to continue drawing funds indefinitely from sympathetic donors who don’t require evidence of effective use of their money.

p. 34:
“…meaningful reductions in poverty, illiteracy, violence, and hopelessness will require a fundamental restructuring of nonprofit capital markets. Such a restructuring would need to make it much easier for philanthropists of all stripes–large and small, public and private, institutional and individual–to fund nonprofit organizations that maximize social impact.”

p. 54:
Exhibit 2.3 is a chart showing that fewer people rose from poverty, and more remained in it or fell deeper into it, in the period of 1988-1998 compared with 1969-1979.

pp. 70-71:
Kotter’s (1996) change cycle.

p. 75:
McKinsey’s seven elements of nonprofit capacity and capacity assessment grid.

pp. 94-95:
Exhibits 3.1 and 3.2 contrast the way financial markets reward for-profit performance with the way nonprofit markets reward fund raising efforts.

Financial markets
1. Market aggregates and disseminates standardized data
2. Analysts publish rigorous research reports
3. Investors proactively search for strong performers
4. Investors penalize weak performers
5. Market promotes performance
6. Strong performers grow

Nonprofit markets
1. Social performance is difficult to measure
2. NPOs don’t have resources or expertise to report results
3. Investors can’t get reliable or standardized results data
4. Strong and weak NPOs spend 40 to 60% of time fundraising
5. Market promotes fundraising
6. Investors can’t fund performance; NPOs can’t scale

p. 95:
“…nonprofits can’t possibly raise enough money to achieve transformative social impact within the constraints of the existing fundraising system. I submit that significant social progress cannot be achieved without what I’m going to call ‘third-stage funding,’ that is, funding that doesn’t suffer from disabling fragmentation. The existing nonprofit capital market is not capable of [p. 97] providing third-stage funding. Such funding can arise only when investors are sufficiently well informed to make big bets at understandable and manageable levels of risk. Existing nonprofit capital markets neither provide investors with the kinds of information needed–actionable information about nonprofit performance–nor provide the kinds of intermediation–active oversight by knowledgeable professionals–needed to mitigate risk. Absent third-stage funding, nonprofit capital will remain irreducibly fragmented, preventing the marshaling of resources that nonprofit organizations need to make meaningful and enduring progress against $100 million problems.”

pp. 99-114:
Text and diagrams on innovation, market adoption, transformative impact.

p. 140:
Exhibit 4.2: Capital distribution of nonprofits, highlighting mid-caps

Pages 192-193 make the case for the difference between a regular market and the current state of philanthropic, social capital markets.

p. 192:
“So financial markets provide information investors can use to compare alternative investment opportunities based on their performance, and they provide a dynamic mechanism for moving money away from weak performers and toward strong performers. Just as water seeks its own level, markets continuously recalibrate prices until they achieve a roughly optimal equilibrium at which most companies receive the ‘right’ amount of investment. In this way, good companies thrive and bad ones improve or die.
“The social sector should work the same way. … But philanthropic capital doesn’t flow toward effective nonprofits and away from ineffective nonprofits for a simple reason: contributors can’t tell the difference between the two. That is, philanthropists just don’t [p. 193] know what various nonprofits actually accomplish. Instead, they only know what nonprofits are trying to accomplish, and they only know that based on what the nonprofits themselves tell them.”

p. 193:
“The signs that the lack of social progress is linked to capital market dysfunctions are unmistakable: fundraising remains the number-one [p. 194] challenge of the sector despite the fact that nonprofit leaders divert some 40 to 60% of their time from productive work to chasing after money; donations raised are almost always too small, too short, and too restricted to enhance productive capacity; most mid-caps are ensnared in the ‘social entrepreneur’s trap’ of focusing on today and neglecting tomorrow; and so on. So any meaningful progress we could make in the direction of helping the nonprofit capital market allocate funds as effectively as the private capital market does could translate into tremendous advances in extending social and economic opportunity.
“Indeed, enhancing nonprofit capital allocation is likely to improve people’s lives much more than, say, further increasing the total amount of donations. Why? Because capital allocation has a multiplier effect.”

“If we want to materially improve the performance and increase the impact of the nonprofit sector, we need to understand what’s preventing [p. 195] it from doing a better job of allocating philanthropic capital. And figuring out why nonprofit capital markets don’t work very well requires us to understand why the financial markets do such a better job.”

p. 197:
“When all is said and done, securities prices are nothing more than convenient approximations that market participants accept as a way of simplifying their economic interactions, with a full understanding that market prices are useful even when they are way off the mark, as they so often are. In fact, that’s the whole point of markets: to aggregate the imperfect and incomplete knowledge held by vast numbers of traders about how much various securities are worth and still make allocation choices that are better than we could without markets.
“Philanthropists face precisely the same problem: how to make better use of limited information to maximize output, in this case, social impact. Considering the dearth of useful tools available to donors today, the solution doesn’t have to be perfect or even all that good, at least at first. It just needs to improve the status quo and get better over time.
“Much of the solution, I believe, lies in finding useful adaptations of market mechanisms that will mitigate the effects of the same lack of reliable and comprehensive information about social sector performance. I would even go so far as to say that social enterprises can’t hope to realize their ‘one day, all children’ visions without a funding allocation system that acts more like a market.
“We can, and indeed do, make incremental improvements in nonprofit funding without market mechanisms. But without markets, I don’t see how we can fix the fragmentation problem or produce transformative social impact, such as ensuring that every child in America has a good education. The problems we face are too big and have too many moving parts to ignore the self-organizing dynamics of market economics. As Thomas Friedman said about the need to impose a carbon tax at a time of falling oil prices, ‘I’ve wracked my brain trying to think of ways to retool America around clean-power technologies without a price signal–i.e., a tax–and there are no effective ones.’”

p. 199:
“Prices enable financial markets to work the way nonprofit capital markets should–by sending informative signals about the most effective organizations so that money will flow to them naturally…”

p. 200:
[Quotes Kurtzman citing De Soto on the mystery of capital. Also see p. 209, below.]
“‘Solve the mystery of capital and you solve many seemingly intractable problems along with it.’”
[That’s from page 69 in Kurtzman, 2002.]

p. 201:
[Goldberg says he’s quoting Daniel Yankelovich here, but the footnote does not appear to have anything to do with this quote:]
“‘The first step is to measure what can easily be measured. The second is to disregard what can’t be measured, or give it an arbitrary quantitative value. This is artificial and misleading. The third step is to presume that what can’t be measured easily isn’t very important. This is blindness. The fourth step is to say that what can’t be easily measured really doesn’t exist. This is suicide.’”

Goldberg gives an example here of $10,000 invested with a 10% increase in value, compared with $10,000 put into a nonprofit. “But if the nonprofit makes good use of the money and, let’s say, brings the reading scores of 10 elementary school students up from below grade level to grade level, we can’t say how much my initial investment is ‘worth’ now. I could make the argument that the value has increased because the students have received a demonstrated educational benefit that is valuable to them. Since that’s the reason I made the donation, the achievement of higher scores must have value to me, as well.”

p. 202:
Goldberg wonders whether donations to nonprofits would be better conceived as purchases than investments.

p. 207:
Goldberg quotes Jon Gertner from the March 9, 2008, issue of the New York Times Magazine devoted to philanthropy:

“‘Why shouldn’t the world’s smartest capitalists be able to figure out more effective ways to give out money now? And why shouldn’t they want to make sure their philanthropy has significant social impact? If they can measure impact, couldn’t they get past the resistance that [Warren] Buffett highlighted and finally separate what works from what doesn’t?’”

p. 208:
“Once we abandon the false notions that financial markets are precision instruments for measuring unambiguous phenomena, and that the business and nonprofit sectors are based in mutually exclusive principles of value, we can deconstruct the true nature of the problems we need to address and adapt market-like mechanisms that are suited to the particulars of the social sector.
“All of this is a long way (okay, a very long way) of saying that even ordinal rankings of nonprofit investments can have tremendous value in choosing among competing donation opportunities, especially when the choices are so numerous and varied. If I’m a social investor, I’d really like to know which nonprofits are likely to produce ‘more’ impact and which ones are likely to produce ‘less.’”

“It isn’t necessary to replicate the complex working of the modern stock markets to fashion an intelligent and useful nonprofit capital allocation mechanism. All we’re looking for is some kind of functional indication that would (1) isolate promising nonprofit investments from among the confusing swarm of too many seemingly worthy social-purpose organizations and (2) roughly differentiate among them based on the likelihood of ‘more’ or ‘less’ impact. This is what I meant earlier by increasing [p. 209] signals and decreasing noise.”

p. 209:
Goldberg apparently didn’t read De Soto, as he says that the mystery of capital is posed by Kurtzman and says it is solved via the collective intelligence and wisdom of crowds. This completely misses the point of the crucial value that transparent representations of structural invariance hold in market functionality. Goldberg is apparently offering a loose kind of market for which there is an aggregate index of stocks for nonprofits that are built up from their various ordinal performance measures. I think I find a better way in my work, building more closely from De Soto (Fisher, 2002, 2003, 2005, 2007, 2009a, 2009b).

p. 231:
Goldberg quotes Harvard’s Allen Grossman (1999) on the cost-benefit boundaries of more effective nonprofit capital allocation:

“‘Is there a significant downside risk in restructuring some portion of the philanthropic capital markets to test the effectiveness of performance driven philanthropy? The short answer is, ‘No.’ The current reality is that most broad-based solutions to social problems have eluded the conventional and fragmented approaches to philanthropy. It is hard to imagine that experiments to change the system to a more performance driven and rational market would negatively impact the effectiveness of the current funding flows–and could have dramatic upside potential.’”

p. 232:
Quotes Douglas Hubbard’s How to Measure Anything book that Stenner endorsed, and Linacre and I didn’t.

p. 233:
Cites Stevens on the four levels of measurement and uses it to justify his position concerning ordinal rankings, recognizing that “we can’t add or subtract ordinals.”

pp. 233-5:
Justifies ordinal measures via example of Google’s PageRank algorithm. [I could connect from here using Mary Garner’s (2009) comparison of PageRank with Rasch.]
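For readers unfamiliar with PageRank, a minimal sketch of the power-iteration idea may help show why its output is naturally read as an ordinal ranking. This is an illustration only, not Goldberg’s example and not Google’s production algorithm; the four-page link graph, the function name `pagerank`, and the 0.85 damping value are assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank scores by repeated redistribution of rank.

    `links` maps each page to the list of pages it links to. Every
    linked-to page must also appear as a key (hypothetical toy format).
    """
    pages = sorted(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start from a uniform guess

    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            # A page with no outlinks spreads its rank over all pages.
            targets = outlinks or pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical link graph: C is linked to by A, B, and D.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
scores = pagerank(links)

# What users actually see is the ordering, not the scores themselves.
ordering = sorted(scores, key=scores.get, reverse=True)
print(ordering)
```

For this toy graph the ordering comes out C, A, B, D. The scores sum to one and say which page ranks above which, but nothing in the procedure establishes a unit in which differences between scores mean the same thing across the range.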

p. 236:
Goldberg tries to justify the use of ordinal measures by citing their widespread use in social science and health care. He conveniently ignores the fact that virtually all of the same problems and criticisms that apply to philanthropic capital markets also apply in these areas. In not grasping the fundamental value of De Soto’s concept of transferable and transparent representations, and in knowing nothing of Rasch measurement, he was unable to properly evaluate the potential of ordinal data’s role in the formation of philanthropic capital markets. Ordinal measures aren’t just not good enough, they represent a dangerous diversion of resources that will be put into systems that take on lives of their own, creating a new layer of dysfunctional relationships that will be hard to overcome.
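The difference between ordinal scores and interval measures can be illustrated with a small sketch. It assumes the simplest possible case, a test made up of equally difficult right/wrong items, for which the Rasch ability estimate differs from the log-odds of the raw score only by a constant. The function name `logit_measure` and the 20-item test are hypothetical, and real calibrations estimate item difficulties rather than assuming them equal.

```python
import math


def logit_measure(raw_score: int, n_items: int) -> float:
    """Log-odds of success for a raw score on n_items equally difficult items."""
    if not 0 < raw_score < n_items:
        raise ValueError("extreme scores (0 or all correct) have no finite logit")
    return math.log(raw_score / (n_items - raw_score))


n_items = 20  # hypothetical test length

# The same 2-point raw score gain, taken at two places on the scale.
for low, high in [(10, 12), (16, 18)]:
    gain = logit_measure(high, n_items) - logit_measure(low, n_items)
    print(f"raw {low} -> {high}: +2 points, {gain:.2f} logits")
```

The same two-point raw score gain corresponds to about 0.41 logits in the middle of the scale and about 0.81 logits near the top. That is the sense in which raw scores and rankings are ordinal: equal differences in counts do not stand for equal differences in the thing measured.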

p. 261 [Goldberg shows here his complete ignorance about measurement. He is apparently totally unaware of the work that is in fact most relevant to his cause, going back to Thurstone in the 1920s, Rasch in the 1950s-1970s, and Wright in the 1960s to 2000. Both of the problems he identifies have long since been solved in theory and in practice in a wide range of domains in education, psychology, health care, etc.]:
“Having first studied performance evaluation some 30 years ago, I feel confident in saying that all the foundational work has been done. There won’t be a ‘eureka!’ breakthrough where someone finally figures out the one true way to gauge nonprofit effectiveness.
“Indeed, I would venture to say that we know virtually everything there is to know about measuring the performance of nonprofit organizations with only two exceptions: (1) How can we compare nonprofits with different missions or approaches, and (2) how can we make actionable performance assessments common practice for growth-ready mid-caps and readily available to all prospective donors?”

p. 263:
“Why would a social entrepreneur divert limited resources to impact assessment if there were no prospects it would increase funding? How could an investor who wanted to maximize the impact of her giving possibly put more golden eggs in fewer impact-producing baskets if she had no way to distinguish one basket from another? The result: there’s no performance data to attract growth capital, and there’s no growth capital to induce performance measurement. Until we fix that Catch-22, performance evaluation will not become an integral part of social enterprise.”

pp. 264-5:
Long quotation from Ken Berger at Charity Navigator on their ongoing efforts at developing an outcome measurement system. [wpf, 8 Nov 2009: I read the passage quoted by Goldberg in Berger’s blog when it came out and have been watching and waiting ever since for the new system. wpf, 8 Feb 2012: The new system has been online for some time but still does not include anything on impacts or outcomes. It has expanded from a sole focus on financials to also include accountability and transparency. But it does not yet address Goldberg’s concerns as there still is no way to tell what works from what doesn’t.]

p. 265:
“The failure of the social sector to coordinate independent assets and create a whole that exceeds the sum of its parts results from an absence of … ‘platform leadership’: ‘the ability of a company to drive innovation around a particular platform technology at the broad industry level.’ The object is to multiply value by working together: ‘the more people who use the platform products, the more incentives there are for complement producers to introduce more complementary products, causing a virtuous cycle.’” [Quotes here from Cusumano & Gawer (2002). The concept of platform leadership speaks directly to the system of issues raised by Miller & O’Leary (2007) that must be addressed to form effective human, social, and natural (HSN) capital markets.]

p. 266:
“…the nonprofit sector has a great deal of both money and innovation, but too little available information about too many organizations. The result is capital fragmentation that squelches growth. None of the stakeholders has enough horsepower on its own to impose order on this chaos, but some kind of realignment could release all of that pent-up potential energy. While command-and-control authority is neither feasible nor desirable, the conditions are ripe for platform leadership.”

“It is doubtful that the IMPEX could amass all of the resources internally needed to build and grow a virtual nonprofit stock market that could connect large numbers of growth-capital investors with large numbers of [p. 267] growth-ready mid-caps. But it might be able to convene a powerful coalition of complementary actors that could achieve a critical mass of support for performance-based philanthropy. The challenge would be to develop an organization focused on filling the gaps rather than encroaching on the turf of established firms whose participation and innovation would be required to build a platform for nurturing growth of social enterprise…”

pp. 268-9:
Intermediated nonprofit capital market shifts fundraising burden from grantees to intermediaries.

p. 271:
“The surging growth of national donor-advised funds, which simplify and reduce the transaction costs of methodical giving, exemplifies the kind of financial innovation that is poised to leverage market-based investment guidance.” [President of Schwab Charitable quoted as wanting to make charitable giving information- and results-driven.]

p. 272:
Rating agencies and organizations: Charity Navigator, Guidestar, Wise Giving Alliance.
Online donor rankings: GlobalGiving, GreatNonprofits, SocialMarkets
Evaluation consultants: Mathematica

Google’s mission statement: “to organize the world’s information and make it universally accessible and useful.”

p. 273:
Exhibit 9.4 Impact Index Whole Product
Image of stakeholders circling IMPEX:
Trading engine
Listed nonprofits
Data producers and aggregators
Trading community
Researchers and analysts
Investors and advisors
Government and business supporters

p. 275:
“That’s the starting point for replication [of social innovations that work]: finding and funding; matching money with performance.”

[WPF bottom line: Because Goldberg misses De Soto’s point about transparent representations resolving the mystery of capital, he is unable to see his way toward making the nonprofit capital markets function more like financial capital markets, with the difference being the focus on the growth of human, social, and natural capital. Though Goldberg intuits good points about the wisdom of crowds, he doesn’t know enough about the flaws of ordinal measurement relative to interval measurement, or about the relatively easy access to interval measures that can be had, to do the job.]

References

Cusumano, M. A., & Gawer, A. (2002, Spring). The elements of platform leadership. MIT Sloan Management Review, 43(3), 58.

De Soto, H. (2000). The mystery of capital: Why capitalism triumphs in the West and fails everywhere else. New York: Basic Books.

Fisher, W. P., Jr. (2002, Spring). “The Mystery of Capital” and the human sciences. Rasch Measurement Transactions, 15(4), 854 [http://www.rasch.org/rmt/rmt154j.htm].

Fisher, W. P., Jr. (2003). Measurement and communities of inquiry. Rasch Measurement Transactions, 17(3), 936-8 [http://www.rasch.org/rmt/rmt173.pdf].

Fisher, W. P., Jr. (2005). Daredevil barnstorming to the tipping point: New aspirations for the human sciences. Journal of Applied Measurement, 6(3), 173-9 [http://www.livingcapitalmetrics.com/images/FisherJAM05.pdf].

Fisher, W. P., Jr. (2007, Summer). Living capital metrics. Rasch Measurement Transactions, 21(1), 1092-3 [http://www.rasch.org/rmt/rmt211.pdf].

Fisher, W. P., Jr. (2009a). Bringing human, social, and natural capital to life: Practical consequences and opportunities. In M. Wilson, K. Draney, N. Brown & B. Duckor (Eds.), Advances in Rasch Measurement, Vol. Two (p. in press [http://www.livingcapitalmetrics.com/images/BringingHSN_FisherARMII.pdf]). Maple Grove, MN: JAM Press.

Fisher, W. P., Jr. (2009b, November). Invariance and traceability for measures of human, social, and natural capital: Theory and application. Measurement (Elsevier), 42(9), 1278-1287.

Garner, M. (2009, Autumn). Google’s PageRank algorithm and the Rasch measurement model. Rasch Measurement Transactions, 23(2), 1201-2 [http://www.rasch.org/rmt/rmt232.pdf].

Grossman, A. (1999). Philanthropic social capital markets: Performance driven philanthropy (Social Enterprise Series 12 No. 00-002). Harvard Business School Working Paper.

Kotter, J. (1996). Leading change. Cambridge, Massachusetts: Harvard Business School Press.

Kurtzman, J. (2002). How the markets really work. New York: Crown Business.

Miller, P., & O’Leary, T. (2007, October/November). Mediating instruments and making markets: Capital budgeting, science and the economy. Accounting, Organizations, and Society, 32(7-8), 701-34.

Measuring Instruments as Media for the Expression of Creative Passions in Education

June 26, 2015

Measurement is often viewed as a reduction of complex phenomena to numbers. It is accordingly also often conceived as mechanical, and disconnected from the world of life. Educational examinations are seen by many as an especially egregious form of inappropriate reduction. This perspective is contradicted, however, by a perspective that sees an analogy between educational assessment and music. Calibrated instruments, mathematical scales, and high technology play key roles in the production of music, which, ironically, is widely considered the most alive, captivating and emotionally powerful of the arts. Though behavioral psychology has indeed learned how to use music to manipulate consumer purchasing decisions, music is unabashedly accepted nonetheless as the highest expression of passion in art.

The question then arises as to if and how measurement in other areas, such as in education, might be conceived, designed, and practiced as a medium for the expression and fulfillment of creative passions. Key issues involved in substantively realizing a musical metaphor in human and social measurement include capacities to tune instruments, to define common scales, to score performances, to orchestrate harmonious relationships, to enhance choral grace note effects, and to combine elements in unique but pleasing and recognizable rhythmic arrangements.

Practical methods for making educational measurement the medium for the expression of creative passions for learning are in place in thousands of schools nationally and internationally. With such tools in hand, formative applications of integrated instruction and assessment could be conceived as intuitive media for composing and conducting expressions of creative passions. Student outcomes in reading, mathematics, and other domains may then come to be seen in terms of portfolios of works akin to those produced by musicians, sculptors, film makers, or painters.

Hundreds of thousands of books and millions of articles tuned to the same text complexity scale, for instance, provide readers an extensive palette of colorful tones and timbres for expressing their desires and capacities for learning. Graphical presentations of individual students’ outcomes, as well as outcomes aggregated by classroom, school, district, etc., could be presented, interpreted and experienced as public performances of artful developmental narratives enabling dramatic performances of personal uniqueness and social generality.

Measurement instrumentation in education is able to capture, aggregate, and organize literacy, numeracy, socio-emotional intelligence, and other performances into special portfolios documenting the play and dance of emerging new understandings. As in any creative process, accidents, errors, and idiosyncratic patterns of strengths and weaknesses may evoke powerful and dramatic expressions of beauty, and human and social value. And just as members of musical ensembles may complement one another’s skills, using rhythm and harmony to improve each other’s playing abilities in practice, so, too, instruments of formative assessment tuned to the same scale can be used to coordinate and enhance individual student and teacher skill levels.

Possibilities for orchestrating such performances across educational, health care, social service, environmental management, and other fields could similarly take advantage of existing instrument calibration and measurement technologies.

Measurement as a Medium for the Expression of Creative Passions in Education

April 23, 2014

Measurement is often viewed as a purely technical task involving a reduction of complex phenomena to numbers. It is accordingly also experienced as mechanical in nature, and disconnected from the world of life. Educational examinations are often seen as an especially egregious form of inappropriate reduction.

This perspective on measurement is contradicted, however, by the essential roles of calibrated instrumentation, mathematical scales, and high technology in the production of music, which, ironically, is widely considered the most alive, captivating and emotionally powerful of the arts.

The question then arises as to if and how measurement in other areas, such as in education, might be conceived, designed, and practiced as a medium for the expression and fulfillment of creative passions. Key issues involved in substantively realizing a musical metaphor in human and social measurement include capacities to tune instruments, to define common scales, to orchestrate harmonious relationships, to enhance choral grace note effects, and to combine elements in unique but pleasing and recognizable forms.

Practical methods of this kind are in place in hundreds of schools nationally and internationally. With such tools in hand, formative applications of integrated instruction and assessment could be conceived as intuitive media for composing and conducting expressions of creative passions.

Student outcomes in reading, mathematics, and other domains may then come to be seen in terms of portfolios of works akin to those produced by musicians, sculptors, film makers, or painters. Hundreds of thousands of books and millions of articles tuned to the same text complexity scale provide readers an extensive palette of colorful tones and timbres for expressing their desires and capacities for learning. Graphical presentations of individual students’ outcomes, as well as outcomes aggregated by classroom, school, district, etc., may be interpreted and experienced as public performances of artful developmental narratives enabling dramatic performances of personal uniqueness and social generality.

Technical canvases capture, aggregate, and organize literacy performances into special portfolios documenting the play and dance of emerging new understandings. As in any creative process, accidents, errors, and idiosyncratic patterns of strengths and weaknesses may evoke powerful expressions of beauty, and human and social value. Just as members of musical ensembles may complement one another’s skills, using rhythm and harmony to improve each other’s playing abilities in practice, so, too, instruments of formative assessment tuned to the same scale can be used to enhance individual teacher skill levels.

Possibilities for orchestrating such performances across educational, health care, social service, environmental management, and other fields could similarly take advantage of existing instrument calibration and measurement technologies.

Creatively Expressing How Love Matters for Justice: Setting the Stage and Tuning the Instruments

April 16, 2014

Nussbaum (2013) argues for the political importance of connecting with our bodies without shame and disgust, and for the relevance of musical and poetic public expressions of varieties of love to conceptions of justice. Institutions embodying principles of loving justice require media integrating emotional expression with technical calculation, in exactly the same way music does. Being able to dance at the revolution demands instruments tuned to shared scales, whether equal temperament, just intonation, meantone tuning, or any of a variety of other well, or irregular, temperaments is chosen.

The physicality of dancing, so often evoking romance and courtship, provides a point of entry to a metaphoric logic of reproduction applicable to the Socratic midwifery of ideas and to the products of social intercourse. Tuning the instruments of the human, social, and environmental arts and sciences to harmonize and choreograph relationships may then enable formulation of nonreductionist approaches to the problem of how to reconcile political emotions with physical or geometrical accounts of the scales of justice.

Historical accounts of (musical, medical, electrical, etc.) metrological standards describe ways in which passionate concern for shared vulnerabilities and common joys has sometimes succeeded in deploying systems realizing higher forms of just relations (Alder, 2002; Berg and Timmermans, 2000; Isacoff, 2001; Schaffer, 1992). The question of the day is whether we will succeed in creating yet new forms of such relations in the many areas of life where they are needed.

Yes, as Nussbaum (2013, p. 396) admits, the demand for love is a tall order, and unrealistic. But all heuristic fictions, from Pythagorean triangles to the mathematical pendulum, are unrealistic and are never actually observed in practice, as has been pointed out by a number of historians and philosophers (Butterfield 1957, pp. 16-17; Heidegger, 1967, p. 89; Rasch, 1960, pp. 37-38, 1973/2011). These fictions are, however, eminently useful as guides, goals, and as coherent ways of telling our stories, and that is the criterion by which they should be judged.

 

Alder, K. (2002). The measure of all things: The seven-year odyssey and hidden error that transformed the world. New York: The Free Press.

Berg, M., & Timmermans, S. (2000). Orders and their others: On the constitution of universalities in medical work. Configurations, 8(1), 31-61.

Butterfield, H. (1957). The origins of modern science (revised edition). New York: The Free Press.

Heidegger, M. (1967). What is a thing? (W. B. Barton, Jr. & V. Deutsch, Trans.). South Bend, Indiana: Regnery/Gateway.

Isacoff, S. M. (2001). Temperament: The idea that solved music’s greatest riddle. New York: Alfred A. Knopf.

Nussbaum, M. (2013). Political emotions: Why love matters for justice. Cambridge, MA: The Belknap Press of Harvard University Press.

Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests (Reprint, with Foreword and Afterword by B. D. Wright, Chicago: University of Chicago Press, 1980). Copenhagen, Denmark: Danmarks Paedagogiske Institut.

Rasch, G. (1973/2011, Spring). All statistical models are wrong! Comments on a paper presented by Per Martin-Löf, at the Conference on Foundational Questions in Statistical Inference, Aarhus, Denmark, May 7-12, 1973. Rasch Measurement Transactions, 24(4), 1309 [http://www.rasch.org/rmt/rmt244.pdf].

Schaffer, S. (1992). Late Victorian metrology and its instrumentation: A manufactory of Ohms. In R. Bud & S. E. Cozzens (Eds.), Invisible connections: Instruments, institutions, and science (pp. 23-56). Bellingham, WA: SPIE Optical Engineering Press.

The New Information Platform No One Sees Coming

December 6, 2012

I’d like to draw your attention to a fundamentally important area of disruptive innovations no one seems to see coming. The biggest thing rising in the world of science today that does not appear to be on anyone’s radar is measurement. Transformative potential beyond that of the Internet itself is available.

Realizing that potential will require an Intangible Assets Metric System. This system will connect together all the different ways any one thing is measured, bringing common languages for representing human, social, and economic value into play everywhere. We need these metrics on the front lines of education, health care, social services, and in human, reputation, and natural resource management, as well as in the economic models and financial spreadsheets informing policy, and in the scientific research conducted in dozens of fields.

All reading ability measures, for instance, should be transparently, inexpensively, and effortlessly expressed in a universally uniform metric, in the same way that standardized measures of weight and volume inform grocery store purchasing decisions. We have made starts at such systems for reading, writing, and math ability measures, and for health status, functionality, and chronic disease management measures. There oddly seems to be, however, little awareness of the full value that stands to be gained from uniform metrics in these areas, despite the overwhelming human, economic, and scientific value derived from standardized units in the existing economy. There has accordingly been virtually no leadership or investment in this area.

Measurement practice in business is woefully out of touch with the true paradigm shift that has been underway in psychometrics for years, even though the mantra “you manage what you measure” is repeated far and wide. In a fascinating twist, practically the only ones who notice the business world’s conceptual shortfall in measurement practice are the contrarians who observe that quantification can often be more of a distraction from management than the medium of its execution—but this is true only when measures are poorly conceived, designed, and implemented.

Demand for better measurement—measurement that reduces data volume not only with no loss of information but with the addition of otherwise unavailable interstitial information; that supports mass customized comparability for informed purchasing and quality improvement decisions; and that enables common product definitions for outcomes-based budgeting—is growing hand in hand with the spread of resilient, nimble, lean, and adaptive business models, and with the ongoing geometrical growth in data volume.

An even bigger source of demand for the features of advanced measurement is the increasing dependence of the economy on intangible assets, those forms of human, social, and natural capital that comprise 90% or more of the total capital under management. We will bring these now economically dead forms of capital to life by systematically standardizing representations of their quality and quantity. The Internet is the planetary nervous system through which basic information travels, and the Intangible Assets Metric System will be the global cerebrum, where higher order thinking takes place.

It will not be possible to realize the full potential of lean thinking in the information- and service-based economy without an Intangible Assets Metric System. Given the long-proven business value of standards and the role of measurement in management, it seems self-evident that our ongoing economic difficulties stem largely from our failure to develop and deploy an Intangible Assets Metric System providing common currencies for the exchange of authentic wealth. The future of sustainable and socially responsible business practices must surely depend extensively on universal access to flexible and practical uniform metrics for intangible assets.

Of course, for global intangible assets standards to be viable, they must be adaptable to local business demands and conditions without compromising their comparability. And that is just what is most powerfully disruptive about contemporary measurement methods: they make mass customization a reality. They’ve been doing so in computerized testing since the 1970s. Isn’t it time we started putting this technology to systematic use in a wide range of applications, from human and environmental resource management to education, health care, and social services?

Comments on the New ANSI Human Capital Investor Metrics Standard

April 16, 2012

The full text of the proposed standard is available here.

It’s good to see a document emerge in this area, especially one with such a broad base of support from a diverse range of stakeholders. As is stated in the standard, the metrics defined in it are a good place to start and in many instances will likely improve the quality and quantity of the information made available to investors.

There are several issues to keep in mind as the value of standards for human capital metrics becomes more widely appreciated. First, in the context of a comprehensively defined investment framework, human capital is just one of the four major forms of capital, the other three being social, natural, and manufactured (Ekins, 1992; Ekins, Dresner, and Dahlstrom, 2008). To ensure as far as possible the long term stability and sustainability of their profits, and of the economic system as a whole, investors will certainly want to expand the range of the available standards to include social and natural capital along with human capital.

Second, though we manage what we measure, investment management is seriously compromised by having high quality scientific measurement standards only for manufactured capital (length, weight, volume, temperature, energy, time, kilowatts, etc.). Over 80 years of research on ability tests, surveys, rating scales, and assessments has reached a place from which it is prepared to revolutionize the management of intangible forms of capital (Fisher, 2007, 2009a, 2009b, 2010, 2011a, 2011b; Fisher & Stenner, 2011a, 2011b; Wilson, 2011; Wright, 1999). The very large reductions in transaction costs effected by standardized metrics in the economy at large (Barzel, 1982; Benham and Benham, 2000) are likely to have a similarly profound effect on the economics of human, social, and natural capital (Fisher, 2011a, 2012a, 2012b).

The potential for dramatic change in the conceptualization of metrics is most evident in the proposed standard in the sections on leadership quality and employee engagement. For instance, in the section on leadership quality, it is stated that “Investors will be able to directly compare all organizations that are using the same vendor’s methodology.” This kind of dependency should not be allowed to stand as a significant factor in a measurement standard. Properly constructed and validated scientific measures, such as those that have been in wide use in education, psychology and health care for several decades (Andrich, 2010; Bezruczko, 2005; Bond and Fox, 2007; Fisher and Wright, 1994; Rasch, 1960; Salzberger, 2009; Wright, 1999), are equated to a common unit. Comparability should never depend on which vendor is used. Rather, any instrument that actually measures the construct of interest (leadership quality or employee engagement) should do so in a common unit and within an acceptable range of error. “Normalizing” measures for comparability, as is suggested in the standard, means employing psychometric methods that are 50 years out of date and that are far less rigorous and practical than need be. Transparency in measurement means looking through the instrument to the thing itself. If particular instruments color or reshape what is measured, or merely change the meaning of the numbers reported, then the integrity of the standard as a standard should be re-examined.
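
To make the contrast between vendor-dependent scores and a common unit more concrete, here is a minimal sketch in Python, assuming the dichotomous Rasch model (Rasch, 1960). The item difficulties, the function names, and the two "instruments" are hypothetical, invented for illustration, and the sketch uses model-expected scores in place of real response data. It is not the method of the proposed standard or of any vendor.

```python
import math


def rasch_probability(theta, difficulty):
    """Probability of a positive response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def expected_score(theta, difficulties):
    """Model-expected raw score for a person at theta on a set of items."""
    return sum(rasch_probability(theta, b) for b in difficulties)


def measure_from_score(raw_score, difficulties, tol=1e-10, max_iter=100):
    """Find the measure (in logits) whose expected score equals raw_score.

    Uses Newton-Raphson with the step capped at one logit for stability.
    """
    if not 0 < raw_score < len(difficulties):
        raise ValueError("extreme scores have no finite estimate")
    theta = 0.0
    for _ in range(max_iter):
        probs = [rasch_probability(theta, b) for b in difficulties]
        residual = raw_score - sum(probs)
        information = sum(p * (1.0 - p) for p in probs)
        step = max(-1.0, min(1.0, residual / information))
        theta += step
        if abs(step) < tol:
            break
    return theta


# Hypothetical item calibrations, already expressed on one shared logit scale.
easy_instrument = [-2.0, -1.5, -1.0, -0.5, 0.0]
hard_instrument = [0.5, 1.0, 1.5, 2.0, 2.5]

# One person, assumed to sit at 1.0 logits, takes both instruments.
person_measure = 1.0
score_easy = expected_score(person_measure, easy_instrument)
score_hard = expected_score(person_measure, hard_instrument)

# Raw scores differ by instrument; the recovered measures do not.
measure_easy = measure_from_score(score_easy, easy_instrument)
measure_hard = measure_from_score(score_hard, hard_instrument)
print(score_easy, score_hard)
print(measure_easy, measure_hard)
```

Under these assumptions, the same person earns quite different raw scores on the two instruments, yet both scores convert to the same measure in logits. That is the sense in which comparability need not depend on which instrument, or which vendor, is used.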

Third, for investments in human capital to be effectively managed, each distinct aspect of it (motivations, skills and abilities, health) needs to be measured separately, just as height, weight, and temperature are. New technologies have already transformed measurement practices in ways that make the necessary processes precise and inexpensive. Of special interest are adaptively administered precalibrated instruments supporting mass customized, but globally comparable, measures (for instance, see the examples at http://blog.lexile.com/tag/oasis/ and those presented at the recent Pearson Global Research Conference in Fremantle, Australia, http://www.pearson.com.au/marketing/corporate/pearson_global/default.html; also see Wright and Bell, 1984; Lunz, Bergstrom, and Gershon, 1994; Bejar, et al., 2003).
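
As a rough illustration of what adaptive administration from a precalibrated item bank can look like, the following Python sketch selects, at each step, the unused item that is most informative at the current estimate under the dichotomous Rasch model, and then re-estimates the measure. The bank, the two simulated respondents, the function names, and the one-logit rule for all-right or all-wrong response strings are assumptions made for this example, not a description of any operational system cited above.

```python
import math


def rasch_probability(theta, difficulty):
    """Probability of a positive response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def select_next_item(estimate, bank, administered):
    """Index of the unused item with the most information at the estimate."""
    def information(i):
        p = rasch_probability(estimate, bank[i])
        return p * (1.0 - p)
    candidates = [i for i in range(len(bank)) if i not in administered]
    return max(candidates, key=information)


def estimate_measure(responses, difficulties):
    """Maximum-likelihood measure in logits.

    Assumption for this sketch: with an all-right or all-wrong string,
    simply move one logit beyond the hardest or easiest item so far.
    """
    score = sum(responses)
    if score == 0:
        return min(difficulties) - 1.0
    if score == len(responses):
        return max(difficulties) + 1.0
    theta = sum(difficulties) / len(difficulties)
    for _ in range(100):
        probs = [rasch_probability(theta, b) for b in difficulties]
        information = sum(p * (1.0 - p) for p in probs)
        step = max(-1.0, min(1.0, (score - sum(probs)) / information))
        theta += step
        if abs(step) < 1e-8:
            break
    return theta


def administer(bank, answer, n_items):
    """Run a short adaptive session; answer(difficulty) returns 1 or 0."""
    estimate, administered, responses = 0.0, [], []
    for _ in range(n_items):
        index = select_next_item(estimate, bank, administered)
        administered.append(index)
        responses.append(answer(bank[index]))
        estimate = estimate_measure(responses, [bank[i] for i in administered])
    return estimate, administered


# Hypothetical bank: 13 items calibrated from -3.0 to 3.0 logits.
bank = [x * 0.5 for x in range(-6, 7)]

# Two idealized respondents who succeed on items up to a fixed difficulty.
high_measure, high_items = administer(bank, lambda b: 1 if b <= 1.2 else 0, 6)
low_measure, low_items = administer(bank, lambda b: 1 if b <= -1.2 else 0, 6)
print(high_measure, [bank[i] for i in high_items])
print(low_measure, [bank[i] for i in low_items])
```

The two respondents are given almost entirely different items, each set tailored to the person, but because every item was calibrated on the same scale beforehand, the resulting measures are expressed in the same unit and can be compared directly.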

Fourth, the ownership of human capital needs clarification and legal status. If we consider each individual to own their abilities, health, and motivations, and to be solely responsible for decisions made concerning the disposition of those properties, then, in accord with their proven measured amounts of each type of human capital, everyone ought to have legal title to a specific number of shares or credits of each type. This may transform employment away from wage-based job classification compensation to an individualized investment-based continuous quality improvement platform. The same kind of legal titling system will, of course, need to be worked out for social and natural capital, as well.

Fifth, given scientific standards for each major form of capital, practical measurement technologies, and legal title to our shares of capital, we will need expanded financial accounting standards and tools for managing our individual and collective investments. Ongoing research and debates concerning these standards and tools (Siegel and Borgia, 2006; Young and Williams, 2010) have yet to connect with the larger scientific, economic, and legal issues raised here, but developments in this direction should be emerging in due course.

Sixth, a number of lingering moral, ethical and political questions are cast in a new light in this context. The significance of individual behaviors and decisions is informed and largely determined by the context of the culture and institutions in which those behaviors and decisions are executed. Many of the morally despicable but not illegal investment decisions leading to the recent economic downturn put individuals in the position of either setting themselves apart and threatening their careers or doing what was best for their portfolios within the limits of the law. Current efforts intended to devise new regulatory constraints are misguided in focusing on ever more microscopically defined particulars. What is needed instead is a system in which profits are contingent on the growth of human, social, and natural capital. In that framework, legal but ultimately unfair practices would drive down social capital stock values, counterbalancing ill-gotten gains and making them unprofitable.

Seventh, the International Vocabulary of Metrology, now in its third edition (VIM3), is a standard recognized by all eight international standards accrediting bodies (BIPM, etc.). The VIM3 (http://www.bipm.org/en/publications/guides/vim.html) and forthcoming VIM4 are intended to provide a uniform set of concepts and terms for all fields that employ measures across the natural and social sciences. A new dialogue on these issues has commenced in the context of the International Measurement Confederation (IMEKO), whose member organizations are the weights and standards measurement institutes from countries around the world (Conference note, 2011). The 2012 President of the Psychometric Society, Mark Wilson, gave an invited address at the September 2011 IMEKO meeting (Wilson, 2011), and a member of the VIM3 editorial board, Luca Mari, is invited to speak at the July 2012 International Meeting of the Psychometric Society. I encourage all interested parties to become involved in efforts of these kinds in their own fields.

References

Andrich, D. (2010). Sufficiency and conditional estimation of person parameters in the polytomous Rasch model. Psychometrika, 75(2), 292-308.

Barzel, Y. (1982). Measurement costs and the organization of markets. Journal of Law and Economics, 25, 27-48.

Bejar, I., Lawless, R. R., Morley, M. E., Wagner, M. E., Bennett, R. E., & Revuelta, J. (2003, November). A feasibility study of on-the-fly item generation in adaptive testing. The Journal of Technology, Learning, and Assessment, 2(3), 1-29; http://ejournals.bc.edu/ojs/index.php/jtla/article/view/1663.

Benham, A., & Benham, L. (2000). Measuring the costs of exchange. In C. Ménard (Ed.), Institutions, contracts and organizations: Perspectives from new institutional economics (pp. 367-375). Cheltenham, UK: Edward Elgar.

Bezruczko, N. (Ed.). (2005). Rasch measurement in health sciences. Maple Grove, MN: JAM Press.

Bond, T., & Fox, C. (2007). Applying the Rasch model: Fundamental measurement in the human sciences, 2nd edition. Mahwah, New Jersey: Lawrence Erlbaum Associates.

Conference note. (2011). IMEKO Symposium: August 31-September 2, 2011, Jena, Germany. Rasch Measurement Transactions, 25(1), 1318.

Ekins, P. (1992). A four-capital model of wealth creation. In P. Ekins & M. Max-Neef (Eds.), Real-life economics: Understanding wealth creation (pp. 147-155). London: Routledge.

Ekins, P., Dresner, S., & Dahlstrom, K. (2008). The four-capital method of sustainable development evaluation. European Environment, 18(2), 63-80.

Fisher, W. P., Jr. (2007). Living capital metrics. Rasch Measurement Transactions, 21(1), 1092-3 [http://www.rasch.org/rmt/rmt211.pdf].

Fisher, W. P., Jr. (2009a). Invariance and traceability for measures of human, social, and natural capital: Theory and application. Measurement, 42(9), 1278-1287.

Fisher, W. P., Jr. (2009b). NIST Critical national need idea White Paper: metrological infrastructure for human, social, and natural capital (http://www.nist.gov/tip/wp/pswp/upload/202_metrological_infrastructure_for_human_social_natural.pdf). Washington, DC: National Institute of Standards and Technology.

Fisher, W. P., Jr. (2010). Rasch, Maxwell’s method of analogy, and the Chicago tradition. In G. Cooper (Chair), https://conference.cbs.dk/index.php/rasch/Rasch2010/paper/view/824. Probabilistic models for measurement in education, psychology, social science and health: Celebrating 50 years since the publication of Rasch’s Probabilistic Models.., University of Copenhagen School of Business, FUHU Conference Centre, Copenhagen, Denmark.

Fisher, W. P., Jr. (2011a). Bringing human, social, and natural capital to life: Practical consequences and opportunities. In N. Brown, B. Duckor, K. Draney & M. Wilson (Eds.), Advances in Rasch Measurement, Vol. 2 (pp. 1-27). Maple Grove, MN: JAM Press.

Fisher, W. P., Jr. (2011b). Measurement, metrology and the coordination of sociotechnical networks. In S. Bercea (Chair), New Education and Training Methods. International Measurement Confederation (IMEKO), http://www.db-thueringen.de/servlets/DerivateServlet/Derivate-24491/ilm1-2011imeko-017.pdf, Jena, Germany.

Fisher, W. P., Jr. (2012a). Measure local, manage global: Intangible assets metric standards for sustainability. In J. Marques, S. Dhiman & S. Holt (Eds.), Business administration education: Changes in management and leadership strategies (pp. in press). New York: Palgrave Macmillan.

Fisher, W. P., Jr. (2012b). What the world needs now: A bold plan for new standards. Standards Engineering, 64, in press.

Fisher, W. P., Jr., & Stenner, A. J. (2011a). Metrology for the social, behavioral, and economic sciences (Social, Behavioral, and Economic Sciences White Paper Series). Retrieved 25 October 2011, from National Science Foundation: http://www.nsf.gov/sbe/sbe_2020/submission_detail.cfm?upld_id=36.

Fisher, W. P., Jr., & Stenner, A. J. (2011b). A technology roadmap for intangible assets metrology. In Fundamentals of measurement science. International Measurement Confederation (IMEKO) TC1-TC7-TC13 Joint Symposium, http://www.db-thueringen.de/servlets/DerivateServlet/Derivate-24493/ilm1-2011imeko-018.pdf, Jena, Germany.

Fisher, W. P., Jr., & Wright, B. D. (Eds.). (1994). Applications of probabilistic conjoint measurement. International Journal of Educational Research, 21(6), 557-664.

Lunz, M. E., Bergstrom, B. A., & Gershon, R. C. (1994). Computer adaptive testing. International Journal of Educational Research, 21(6), 623-634.

Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests (Reprint, with Foreword and Afterword by B. D. Wright, Chicago: University of Chicago Press, 1980). Copenhagen, Denmark: Danmarks Paedagogiske Institut.

Salzberger, T. (2009). Measurement in marketing research: An alternative framework. Northampton, MA: Edward Elgar.

Siegel, P., & Borgia, C. (2006). The measurement and recognition of intangible assets. Journal of Business and Public Affairs, 1(1).

Wilson, M. (2011). The role of mathematical models in measurement: A perspective from psychometrics. In L. Mari (Chair), Plenary lecture. International Measurement Confederation (IMEKO), http://www.db-thueringen.de/servlets/DerivateServlet/Derivate-24178/ilm1-2011imeko-005.pdf, Jena, Germany.

Wright, B. D. (1999). Fundamental measurement for psychology. In S. E. Embretson & S. L. Hershberger (Eds.), The new rules of measurement: What every educator and psychologist should know (pp. 65-104 [http://www.rasch.org/memo64.htm]). Hillsdale, New Jersey: Lawrence Erlbaum Associates.

Wright, B. D., & Bell, S. R. (1984, Winter). Item banks: What, why, how. Journal of Educational Measurement, 21(4), 331-345 [http://www.rasch.org/memo43.htm].

Young, J. J., & Williams, P. F. (2010, August). Sorting and comparing: Standard-setting and “ethical” categories. Critical Perspectives on Accounting, 21(6), 509-521.

Creative Commons License
LivingCapitalMetrics Blog by William P. Fisher, Jr., Ph.D. is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Based on a work at livingcapitalmetrics.wordpress.com.
Permissions beyond the scope of this license may be available at http://www.livingcapitalmetrics.com.