Posts Tagged ‘natural law’

Externalities are to markets as anomalies are to scientific laws

October 28, 2011

Economic externalities are to efficient markets as any consistent anomaly is to a lawful regularity. Government intervention in markets is akin to fudging the laws of physics to explain the wobble in Uranus’ orbit, or to explain why magnetized masses would not behave like wooden or stone masses in a metal catapult (Rasch’s example). Further, government intervention in markets is necessary only as long as efficient markets for externalized forms of capital are not created. The anomalous exceptions to the general rule of market efficiency have long since been shown to be internally consistent lawful regularities in their own right, amenable to configuration as markets for human, social, and natural forms of capital.

There is an opportunity here for the concise and elegant statement of the efficient markets hypothesis, the observation of certain anomalies, the formulation of new theories concerning these forms of capital, the framing of efficient markets hypotheses concerning the behavior of these anomalies, tests of these hypotheses in terms of the inverse proportionality of two of the parameters relative to the third, proposals as to the uniform metrics by which the scientific laws will be made commercially viable expressions of capital value, etc.

We suffer from the illusion that trading activity somehow spontaneously emerges from social interactions. It’s as though comparable equivalent value is some kind of irrefutable, incontestable feature of the world to which humanity adapts its institutions. But this order of things plainly puts the cart before the horse when the emergence of markets is viewed historically. The idea of fair trade, how it is arranged, how it is recognized, when it is appropriate, etc. varies markedly across cultures and over time.

Yes, “’the price of things is in inverse ratio to the quantity offered and in direct ratio to the quantity demanded’ (Walras 1965, I, 216-17)” (Mirowski, 1988, p. 20). Yes, Pareto made “a direct extrapolation of the path-independence of equilibrium energy states in rational mechanics and thermodynamics” to “the path-independence of the realization of utility” (Mirowski, 1988, p. 21). Yes, as Ehrenfest showed, “an analogy between thermodynamics and economics” can be made, and economic concepts can be formulated “as parallels of thermodynamic concepts, with the concept of equilibrium occupying the central position in both theories” (Boumans, 2005, p. 31).  But markets are built up around these lawful regularities by skilled actors who articulate the rules, embody the roles, and initiate the relationships comprising economic, legal, and scientific institutions. “The institutions define the market, rather than the reverse” (Miller & O’Leary, 2007, p. 710). What we need are new institutions built up around the lawful regularities revealed by Rasch models. The problem is how to articulate the rules, embody the roles, and initiate the relationships.

Noyes (1936, pp. 2, 13; quoted in De Soto 2000, p. 158) provides some useful pointers:

“The chips in the economic game today are not so much the physical goods and actual services that are almost exclusively considered in economic text books, as they are that elaboration of legal relations which we call property…. One is led, by studying its development, to conceive the social reality as a web of intangible bonds–a cobweb of invisible filaments–which surround and engage the individual and which thereby organize society…. And the process of coming to grips with the actual world we live in is the process of objectivizing these relations.”

 Noyes (1936, p. 20, quoted in De Soto 2000, p. 163) continues:

“Human nature demands regularity and certainty and this demand requires that these primitive judgments be consistent and thus be permitted to crystallize into certain rules–into ‘this body of dogma or systematized prediction which we call law.’ … The practical convenience of the public … leads to the recurrent efforts to systematize the body of laws. The demand for codification is a demand of the people to be released from the mystery and uncertainty of unwritten or even of case law.” [This is quite an apt statement of the largely unstated demands of the Occupy Wall Street movement.]

  De Soto (2000, p. 158) explains:

“Lifting the bell jar [integrating legal and extralegal property rights], then, is principally a legal challenge. The official legal order must interact with extralegal arrangements outside the bell jar to create a social contract on property and capital. To achieve this integration, many other disciplines are of course necessary … [economists, urban planners, agronomists, mappers, surveyors, IT specialists, etc.]. But ultimately, an integrated national social contract will be concretized only in laws.”

“Implementing major legal change is a political responsibility. There are various reasons for this. First, law is generally concerned with protecting property rights. However, the real task in developing and former communist countries is not so much to perfect existing rights as to give everyone a right to property rights–‘meta-rights,’ if you will. [Paraphrasing, the real task in the undeveloped domains of human, social, and natural capital is not so much the perfection of existing rights as it is to harness scientific measurement in the name of economic justice and grant everyone legal title to their shares of their ownmost personal properties, their abilities, health, motivations, and trustworthiness, along with their shares of the common stock of social and natural resources.] Bestowing such meta-rights, emancipating people from bad law, is a political job. Second, very small but powerful vested interests–mostly represented [p. 159] by the countries’ best commercial lawyers–are likely to oppose change unless they are convinced otherwise. Bringing well-connected and moneyed people onto the bandwagon requires not consultants committed to serving their clients but talented politicians committed to serving their people. Third, creating an integrated system is not about drafting laws and regulations that look good on paper but rather about designing norms that are rooted in people’s beliefs and are thus more likely to be obeyed and enforced. Being in touch with real people is a politician’s task. Fourth, prodding underground economies to become legal is a major political sales job.”

 De Soto continues (p. 159), intending to refer only to real estate but actually speaking of the need for formal legal title to personal property of all kinds, which ought to include human, social, and natural capital:

  “Without succeeding on these legal and political fronts, no nation can overcome the legal apartheid between those who can create capital and those who cannot. Without formal property, no matter how many assets they accumulate or how hard they work, most people will not be able to prosper in a capitalist society. They will continue to remain beyond the radar of policymakers, out of the reach of official records, and thus economically invisible.”

Boumans, M. (2005). How economists model the world into numbers. New York: Routledge.

De Soto, H. (2000). The mystery of capital: Why capitalism triumphs in the West and fails everywhere else. New York: Basic Books.

Miller, P., & O’Leary, T. (2007, October/November). Mediating instruments and making markets: Capital budgeting, science and the economy. Accounting, Organizations and Society, 32(7-8), 701-734.

Mirowski, P. (1988). Against mechanism: Protecting economics from science. Lanham, MD: Rowman & Littlefield.

Noyes, C. R. (1936). The institution of property. New York: Longmans, Green.

Creative Commons License
LivingCapitalMetrics Blog by William P. Fisher, Jr., Ph.D. is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Based on a work at livingcapitalmetrics.wordpress.com.
Permissions beyond the scope of this license may be available at http://www.livingcapitalmetrics.com.

Rasch Measurement as a Basis for a New Standards Framework

October 26, 2011

The 2011 U.S. celebration of World Standards Day took place on October 13 at the Fairmont Hotel in Washington, D.C., with the theme of “Advancing Safety and Sustainability Standards Worldwide.” The evening began with a reception in a hall of exhibits from the celebration’s sponsors, which included the National Institute of Standards and Technology (NIST), the Society for Standards Professionals (SES), the American National Standards Institute (ANSI), Microsoft, IEEE, Underwriters Laboratories, the Consumer Electronics Association, ASME, ASTM International, Qualcomm, Techstreet, and many others. Several speakers took the podium after dinner to welcome the 400 or so attendees and to present the World Standards Day Paper Competition Awards and the Ronald H. Brown Standards Leadership Award.

Dr. Patrick Gallagher, Under Secretary of Commerce for Standards and Technology, and Director of NIST, was the first speaker after dinner. He directed his remarks at the value of a decentralized, voluntary, and demand-driven system of standards in promoting innovation and economic prosperity. Gallagher emphasized that “standards provide the common language that keeps domestic and international trade flowing,” concluding that “it is difficult to overestimate their critical value to both the U.S. and global economy.”

James Shannon, President of the National Fire Protection Association (NFPA), accepted the R. H. Brown Standards Leadership Award in recognition for his work initiating or improving the National Electrical Code, the Life Safety Code, and the Fire Safe Cigarette and Residential Sprinkler Campaigns.

Ellen Emard, President of SES, introduced the paper competition award winners. As of this writing the titles and authors of the first and second place awards are not yet available on the SES web site (http://www.ses-standards.org/displaycommon.cfm?an=1&subarticlenbr=56). I took third place for my paper, “What the World Needs Now: A Bold Plan for New Standards.” Where the other winning papers took up traditional engineering issues concerning the role of standards in advancing safety and sustainability, my paper spoke to the potential scientific and economic benefits that could be realized by standard metrics and common product definitions for outcomes in education, health care, social services, and environmental resource management. All three of the award-winning papers will appear in a forthcoming issue of Standards Engineering, the journal of SES.

I was coincidentally seated at the dinner alongside Gordon Gillerman, winner of third place in the 2004 paper competition (http://www.ses-standards.org/associations/3698/files/WSD%202004%20-%203%20-%20Gillerman.pdf) and currently Chief of the Standards Services Division at NIST. Gillerman has a broad range of experience in coordinating standards across multiple domains, including environmental protection, homeland security, safety, and health care. Having recently been involved in a workshop focused on measuring, evaluating, and improving the usability of electronic health records (http://www.nist.gov/healthcare/usability/upload/EHR-Usability-Workshop-2011-6-03-2011_final.pdf), Gillerman was quite interested in the potential Rasch measurement techniques hold for reducing data volume with no loss of information, and so for streamlining computer interfaces.

Robert Massof of Johns Hopkins University accompanied me to the dinner, and was seated at a nearby table. Also at Massof’s table were several representatives of the National Institute of Building Sciences, some of whom Massof had recently met at a workshop on adaptations for persons with low vision disabilities. Massof’s work equating the main instruments used for assessing visual function in low vision rehabilitation could lead to a standard metric useful in improving the safety and convenience of buildings.

As is stated in educational materials distributed at the World Standards Day celebration by ANSI, standards are a constant behind-the-scenes presence in nearly all areas of everyday life. Everything from air, water, and food to buildings, clothing, automobiles, roads, and electricity is produced in conformity with voluntary consensus standards of various kinds. In the U.S. alone, more than 100,000 standards specify product and system features and interconnections, making it possible for appliances to tap the electrical grid with the same results no matter where they are plugged in, and for products of all kinds to be purchased with confidence. Life is safer and more convenient, and science and industry are more innovative and profitable, because of standards.

The point of my third-place paper is that life could be even safer and more convenient, and science and industry could be yet more innovative and profitable, if standards and conformity assessment procedures for outcomes in education, health care, social services, and environmental resource management were developed and implemented. Rasch measurement demonstrates the consistent reproducibility of meaningful measures across samples and different collections of construct-relevant items. Within any specific area of interest, then, Rasch measures have the potential of serving as the kind of mediating instruments or objects recognized as essential to the process of linking science with the economy (Fisher & Stenner, 2011b; Hussenot & Missonier, 2010; Miller & O’Leary, 2007). Recent white papers published by NIST and NSF document the challenges and benefits likely to be encountered and produced by initiatives moving in this direction (Fisher, 2009; Fisher & Stenner, 2011a).

A diverse array of Rasch measurement presentations was made at the recent International Measurement Confederation (IMEKO) meeting of metrology engineers in Jena, Germany (see RMT 25 (1), p. 1318). That start at a new dialogue between the natural and social sciences, the NIST and NSF white papers, and the award in the World Standards Day paper competition together show that the U.S. and international standards development communities are interested in exploring possibilities for a new array of standard units of measurement, standardized outcome product definitions, standard conformity assessment procedures, and outcome product quality standards. The increasing acceptance and recognition of the viability of such standards is a logical consequence of observations like these:

  • “Where this law [relating reading ability and text difficulty to comprehension rate] can be applied it provides a principle of measurement on a ratio scale of both stimulus parameters and object parameters, the conceptual status of which is comparable to that of measuring mass and force. Thus…the reading accuracy of a child…can be measured with the same kind of objectivity as we may tell its weight” (Rasch, 1960, p. 115).
  • “Today there is no methodological reason why social science cannot become as stable, as reproducible, and hence as useful as physics” (Wright, 1997, p. 44).
  • “…when the key features of a statistical model relevant to the analysis of social science data are the same as those of the laws of physics, then those features are difficult to ignore” (Andrich, 1988, p. 22).

Rasch’s work has been wrongly assimilated in social science research practice as just another example of the “standard model” of statistical analysis. Rasch measurement rightly ought instead to be treated as a general articulation of the three-variable structure of natural law useful in framing the context of scientific practice. That is, Rasch’s models ought to be employed primarily in calibrating instruments quantitatively interpretable at the point of use in a mathematical language shared by a community of research and practice. To be shared in this way as a universally uniform coin of the realm, that language must be embodied in a consensus standard defining universally uniform units of comparison.

Rasch measurement offers the potential of shifting the focus of quantitative psychosocial research away from data analysis to integrated qualitative and quantitative methods enabling the definition of standard units and the calibration of instruments measuring in those units. An intangible assets metric system will, in turn, support the emergence of new product- and performance-based standards, management system standards, and personnel certification standards. Reiterating Rasch’s (1960, p. xx) insight, we can acknowledge with him that “this is a huge challenge, but once the problem has been formulated it does seem possible to meet it.”

 References

Andrich, D. (1988). Rasch models for measurement (Sage University Paper Series on Quantitative Applications in the Social Sciences, no. 07-068). Beverly Hills, California: Sage Publications.

Fisher, W. P., Jr. (2009). Metrological infrastructure for human, social, and natural capital (NIST Critical National Need Idea White Paper Series, retrieved 25 October 2011 from http://www.nist.gov/tip/wp/pswp/upload/202_metrological_infrastructure_for_human_social_natural.pdf). Washington, DC: National Institute of Standards and Technology.

Fisher, W. P., Jr., & Stenner, A. J. (2011a, January). Metrology for the social, behavioral, and economic sciences (Social, Behavioral, and Economic Sciences White Paper Series). Retrieved 25 October 2011 from http://www.nsf.gov/sbe/sbe_2020/submission_detail.cfm?upld_id=36. Washington, DC: National Science Foundation.

Fisher, W. P., Jr., & Stenner, A. J. (2011b). A technology roadmap for intangible assets metrology. In Fundamentals of measurement science. International Measurement Confederation (IMEKO), Jena, Germany, August 31 to September 2.

Hussenot, A., & Missonier, S. (2010). A deeper understanding of evolution of the role of the object in organizational process: The concept of ‘mediation object.’ Journal of Organizational Change Management, 23(3), 269-286.

Miller, P., & O’Leary, T. (2007, October/November). Mediating instruments and making markets: Capital budgeting, science and the economy. Accounting, Organizations and Society, 32(7-8), 701-734.

Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests (Reprint, with Foreword and Afterword by B. D. Wright, Chicago: University of Chicago Press, 1980). Copenhagen, Denmark: Danmarks Paedagogiske Institut.

Wright, B. D. (1997, Winter). A history of social science measurement. Educational Measurement: Issues and Practice, 16(4), 33-45, 52 [http://www.rasch.org/memo62.htm].

Simple ideas, complex possibilities, elegant and beautiful results

February 11, 2011

Possibilities of great subtlety, elegance, and power can follow from the simplest ideas. Leonardo da Vinci is often credited with offering a variation on this theme, but the basic idea is much older. Philosophy, for instance, began with Plato’s distinction between name and concept. This realization that words are not the things they stand for has informed and structured each of several scientific revolutions.

How so? It all begins with the reasons why Plato required his students to have studied geometry. He knew that those familiar with the Pythagorean theorem would understand the difference between any given triangle and the mathematical relationships it represents. No right triangle ever definitively embodies a perfect realization of the assertion that the square of the hypotenuse equals the sum of the squares of the other two sides. The mathematical definition or concept of a triangle is not the same thing as any actual triangle.

The subtlety and power of this distinction became apparent in its repeated application throughout the history of science. In a sense, astronomy is a geometry of the heavens, Newton’s laws are a geometry of gravity, Ohm’s law is a geometry of electromagnetism, and relativity is a geometry of the invariance of mass and energy in relation to the speed of light. Rasch models present a means to geometries of literacy, numeracy, health, trust, and environmental quality.

We are still witnessing the truth, however partial, of Whitehead’s assertion that the entire history of Western culture is a footnote to Plato. As Husserl put it, we’re still struggling with the possibility of creating a geometry of experience, a phenomenology that is not a mere description of data but that achieves a science of living meaning. The work presented in other posts here attests to a basis for optimism that this quest will be fruitful.

Geometrical and algebraic expressions of scientific laws

April 12, 2010

Geometry provides a model of scientific understanding that has repeatedly proven itself over the course of history. Einstein (1922) considered geometry to be “the most ancient branch of physics” (p. 14). He accorded “special importance” to his view that “all linear measurement in physics is practical geometry,” “because without it I should have been unable to formulate the theory of relativity” (p. 14).

Burtt (1954) concurs, pointing out that the essential question for Copernicus was not “Does the earth move?” but, rather, “…what motions should we attribute to the earth in order to obtain the simplest and most harmonious geometry of the heavens that will accord with the facts?” (p. 39). Maxwell similarly employed a geometrical analogy in working out his electromagnetic theory, saying

“By referring everything to the purely geometrical idea of the motion of an imaginary fluid, I hope to attain generality and precision, and to avoid the dangers arising from a premature theory professing to explain the cause of the phenomena. If the results of mere speculation which I have collected are found to be of any use to experimental philosophers, in arranging and interpreting their results, they will have served their purpose, and a mature theory, in which physical facts will be physically explained, will be formed by those who by interrogating Nature herself can obtain the only true solution of the questions which the mathematical theory suggests.” (Maxwell, 1965/1890, p. 159).

Maxwell was known for thinking visually, once as a student offering a concise geometrical solution to a problem that resisted a lecturer’s lengthy algebraic efforts (Forfar, 2002, p. 8). His approach seemed to be one of playing with images with the aim of arriving at simple mathematical representations, instead of thinking linearly through a train of analysis. A similar method is said to have been used by Einstein (Holton, 1988, pp. 385-388).

Gadamer (1980) speaks of the mathematical transparency of geometric figures to convey Plato’s reasons for requiring mathematical training of the students in his Academy, saying:

“Geometry requires figures which we draw, but its object is the circle itself…. Even he who has not yet seen all the metaphysical implications of the concept of pure thinking but only grasps something of mathematics—and as we know, Plato assumed that such was the case with his listeners—even he knows that in a manner of speaking one looks right through the drawn circle and keeps the pure thought of the circle in mind.” (p. 101)

But exactly how do geometrical visualizations lend themselves to algebraic formulae? More specifically, is it possible to see the algebraic structure of scientific laws in geometry?

Yes, it is. Here’s how. Starting from the Pythagorean theorem, we know that the square of a right triangle’s hypotenuse is equal to the sum of the squares of the other two sides. For convenience, imagine that the lengths of the sides of the triangle, as shown in Figure 1, are 3, 4, and 5, for sides a, b, and c, respectively. We can count the unit squares within each side’s square and see that the 25 in the square on the hypotenuse equal the sum of the 9 in the square on side a and the 16 in the square on side b.

That mathematical relationship can, of course, be written as

a^2 + b^2 = c^2

which, for Figure 1, is

3^2 + 4^2 = 5^2, that is, 9 + 16 = 25

Now, most scientific laws are written in a multiplicative form, like this:

m = f / a

or

f = m * a

which, of course, is how Maxwell presented Newton’s Second Law. So how would the Pythagorean Theorem be written like a physical law?

Since the advent of small, cheap electronic calculators, slide rules have fallen out of fashion. But these eminently useful tools are built to take advantage of the way the natural logarithm and the number e (2.71828…) make division interchangeable with subtraction, and multiplication interchangeable with addition.

That means the Pythagorean Theorem could be written like Newton’s Second Law of Motion, or the Combined Gas Law. Here’s how it works. The Pythagorean Theorem is normally written as

a^2 + b^2 = c^2

but does it make sense to write it as follows?

a^2 * b^2 = c^2

Using the convenient values for a, b, and c from above

3^2 + 4^2 = 5^2

and

9 + 16 = 25

so, plainly, simply changing the plus sign to a multiplication sign will not work, since 9 * 16 is 144. This is where the number e comes in. What happens if e is taken as a base raised to the power of each of the parameters in the equation? Does this equation work?

e^9 * e^16 = e^25

which, substituting a for e^9, b for e^16, and c for e^25, could be represented by

a * b = c

and which can be checked numerically as

8103 * 8,886,111 ≈ 72,004,899,337

Yes, it works, and so it is possible to divide through by e^16 and arrive at the form of the law used by Maxwell and Rasch:

8103 ≈ 72,004,899,337 / 8,886,111

or

e^9 = e^25 / e^16

or, again substituting a for e^9, b for e^16, and c for e^25, could be represented by

a = c / b

which, when converted back to the additive form, looks like this:

a = c – b

and this

9 = 25 − 16.
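As a quick check on this chain of equivalences, here is a minimal Python sketch (the variable names are mine, chosen for illustration only) verifying that exponentiation turns the additive identity into the multiplicative, law-like form, and that the natural logarithm turns it back:

import math

a2, b2, c2 = 9, 16, 25   # the squared sides of the 3-4-5 triangle

# Additive form: 9 + 16 = 25.
assert a2 + b2 == c2

# Raising e to each term turns addition into multiplication: e^9 * e^16 = e^25.
assert math.isclose(math.exp(a2) * math.exp(b2), math.exp(c2))

# Dividing through by e^16 gives the multiplicative, law-like form a = c / b.
assert math.isclose(math.exp(a2), math.exp(c2) / math.exp(b2))

# Taking natural logs converts back to the additive form: 9 = 25 - 16.
assert math.isclose(math.log(math.exp(c2) / math.exp(b2)), c2 - b2)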

Rasch wrote his model in the multiplicative form of

ε_vi = θ_v σ_i

and it is often written in the form of

Pr{X_ni = 1} = e^(β_n − δ_i) / [1 + e^(β_n − δ_i)]

or

P_ni = exp(B_n − D_i) / [1 + exp(B_n − D_i)]

which is to say that the probability of a correct response from person n on item i is equal to e taken to the power of the difference between the estimate β (or B) of person n‘s ability and the estimate δ (or D) of item i‘s difficulty, divided by one plus e to that same power.

Logit estimates of Rasch model parameters taken straight from software output usually range between −3.0 or so and +3.0. So what happens if a couple of arbitrary values are plugged into these equations? If someone has a measure of 2 logits, what is that person’s probability of a correct answer on an item that calibrates at 0.5 logits? The answer should be

e^(2 − 0.5) / (1 + e^(2 − 0.5)).

Now,

e^1.5 = 2.71828^1.5 ≈ 4.481685…

and

4.481685 / (1 + 4.481685) ≈ 0.8176

For a table of the relationships between logit differences, odds, and probabilities, see Table 1.4.1 in Wright & Stone (1979, p. 16), or Table 1 in Wright (1977).

This form of the model

P_ni = exp(B_n − D_i) / [1 + exp(B_n − D_i)]

can be rewritten in an equivalent form as

[P_ni / (1 − P_ni)] = exp(B_n − D_i).

Taking the natural logarithm of the response probabilities expresses the model in perhaps its most intuitive form, often written as

ln[P_ni / (1 − P_ni)] = B_n − D_i.

Substituting a for ln[P_ni / (1 − P_ni)], b for D_i, and c for B_n, we have the same equation as we had for the Pythagorean Theorem, above

a = c − b.

Plugging in the same values of 2.0 and 0.5 logits for B_n and D_i,

ln[P_ni / (1 − P_ni)] = 2.0 − 0.5 = 1.5.

The logit value of 1.5 corresponds to response odds [P_ni / (1 − P_ni)] of about 4.5, making, again, P_ni equal to about 0.82.
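Both routes to the 0.82 probability are easy to verify numerically. A minimal Python sketch (illustrative only, not code from any Rasch package):

import math

B, D = 2.0, 0.5   # person measure and item calibration, in logits

# Exponential form: P = exp(B - D) / [1 + exp(B - D)]
P = math.exp(B - D) / (1 + math.exp(B - D))
print(round(P, 4))   # 0.8176

# Logit form: ln[P / (1 - P)] = B - D
odds = P / (1 - P)   # about 4.4817
print(round(math.log(odds), 2))   # 1.5, recovering B - D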

Working from Maxwell, Rasch wrote his model like this:

A_vj = F_j / M_v.

So when catapult j’s force F of 50 newtons (361.65 poundals) is applied to object v’s mass M of 10 kilograms (22.046 pounds), the resulting acceleration is 5 meters (16.404 feet) per second per second. Increases in force relative to the same mass result in proportionate increases in acceleration, etc.

The same consistent and invariant structural relationship is posited and often found in Rasch model applications: reasonable matches between expected and observed response probabilities are found for various differences between the ability, attitude, or performance measures B_n and the difficulty calibrations D_i of the items on the scale, between different measures relative to any given item, and between different calibrations relative to any given person. Of course, any number of parameters may be added, as long as they are included in an initial calibration design in which they are linked together in a common frame of reference.
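That structural invariance can be seen directly in a sketch: because the model involves only the difference B_n − D_i, adding any constant to every measure and every calibration leaves all of the expected probabilities unchanged. (The values below are made up for the demonstration; nothing here comes from an actual calibration.)

import math

def p(b, d):
    # Rasch expected probability of success for measure b and calibration d
    return math.exp(b - d) / (1 + math.exp(b - d))

measures = [-1.0, 0.5, 2.0]        # three persons, in logits
calibrations = [-0.5, 0.0, 1.5]    # three items, in logits
shift = 3.7                        # an arbitrary change of origin

for b in measures:
    for d in calibrations:
        assert math.isclose(p(b, d), p(b + shift, d + shift))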

Model fit statistics, principal components analysis of the standardized residuals, statistical studies of differential item/person functioning, and graphical methods are all applied to the study of departures from the modeled expectations.

I’ve shown here how the additive expression of the Pythagorean theorem, the multiplicative expression of natural laws, and the additive and multiplicative forms of Rasch models all participate in the same simultaneous, conjoint relation of two parameters mediated by a third. For those who think geometrically, perhaps the connections drawn here will be helpful in visualizing the design of experiments testing hypotheses of converging yet separable parameters. For those who think algebraically, perhaps the structure of lawful regularity in question and answer processes will be helpful in focusing attention on how to proceed step by step from one definite idea to another, in the manner so well demonstrated by Maxwell (Forfar, 2002, p. 8). Either way, the geometrical and/or algebraic figures and symbols ought to work together to provide a transparent view on the abstract mathematical relationships that stand independent from whatever local particulars are used as the medium of their representation.

Just as Einstein held that it would have been impossible to formulate the theory of relativity without the concepts, relationships, and images of practical geometry, so, too, may it one day turn out that key advances in the social and human sciences depend on the invariance of measures related to one another in the simple and lawful regularities of geometry.

Figure 1. A geometrical proof of the Pythagorean Theorem

References

Burtt, E. A. (1954). The metaphysical foundations of modern physical science (Rev. ed.) [First edition published in 1924]. Garden City, New York: Doubleday Anchor.

Einstein, A. (1922). Geometry and experience (G. B. Jeffery, W. Perrett, Trans.). In Sidelights on relativity (pp. 12-23). London, England: Methuen & Co. LTD.

Forfar, J. (2002, June). James Clerk Maxwell: His qualities of mind and personality as judged by his contemporaries. Mathematics Today, 38(3), 83.

Gadamer, H.-G. (1980). Dialogue and dialectic: Eight hermeneutical studies on Plato (P. C. Smith, Trans.). New Haven: Yale University Press.

Holton, G. (1988). Thematic origins of scientific thought (Revised ed.). Cambridge, Massachusetts: Harvard University Press.

Maxwell, J. C. (1965/1890). The scientific papers of James Clerk Maxwell (W. D. Niven, Ed.). New York: Dover Publications.

Wright, B. D. (1977). Solving measurement problems with the Rasch model. Journal of Educational Measurement, 14(2), 97-116 [http://www.rasch.org/memo42.htm].

Wright, B. D., & Stone, M. H. (1979). Best test design: Rasch measurement. Chicago, Illinois: MESA Press.

Contesting the Claim, Part II: Are Rasch Measures Really as Objective as Physical Measures?

July 22, 2009

When a raw score is sufficient to the task of measurement, the model is the Rasch model: we can estimate the parameters consistently and evaluate the fit of the data to the model. The invariance properties that follow from a sufficient statistic include virtually the entire class of invariant rules (Hall, Wijsman, & Ghosh, 1965; Arnold, 1985), and similar relationships with other key measurement properties follow from there (Fischer, 1981, 1995; Newby, Conner, Grant, & Bunderson, 2009; Wright, 1977, 1997).

What does this all actually mean? Imagine we were able to ask an infinite number of people an infinite number of questions that all work together to measure the same thing. Because (1) the scores are sufficient statistics, (2) the ruler is not affected by what is measured, (3) the parameters separate, and (4) the data fit the model, any subset of the questions asked would give the same measure. This means that any subscore for any person measured would be a function of any and all other subscores. When a sufficient statistic is a function of all other sufficient statistics, it is not only sufficient, it is necessary, and is referred to as a minimally sufficient statistic. Thus, if separable, independent model parameters can be estimated, the model must be the Rasch model, and the raw score is both sufficient and necessary (Andersen, 1977; Dynkin, 1951; van der Linden, 1992).

This means that scores, ratings, and percentages actually stand for something measurable only when they fit a Rasch model.  After all, what actually would be the point of using data that do not support the estimation of independent parameters? If the meaning of the results is tied in unknown ways to the specific particulars of a given situation, then those results are meaningless, by definition (Roberts & Rosenbaum, 1986; Falmagne & Narens, 1983; Mundy, 1986; Narens, 2002; also see Embretson, 1996; Romanoski and Douglas, 2002). There would be no point in trying to learn anything from them, as whatever happened was a one-time unique event that tells us nothing we can use in any future event (Wright, 1977, 1997).
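To make the subset claim concrete, here is a toy simulation (entirely my own sketch; the sample sizes, item difficulties, and estimation routine are assumptions made for illustration, not anyone’s published method). It generates model-fitting responses for one person, then estimates that person’s measure twice, from disjoint halves of the items; up to sampling error, the two measures agree:

import math, random

random.seed(42)

def p(b, d):
    return math.exp(b - d) / (1 + math.exp(b - d))

def measure(responses, difficulties):
    # Maximum-likelihood person measure, given known item calibrations.
    # Note that the data enter only through sum(responses): the raw score
    # is the sufficient statistic.
    b = 0.0
    for _ in range(50):   # Newton-Raphson iterations
        expected = [p(b, d) for d in difficulties]
        gradient = sum(responses) - sum(expected)
        curvature = -sum(e * (1 - e) for e in expected)
        b -= gradient / curvature
    return b

true_b = 1.0
difficulties = [random.uniform(-2.0, 2.0) for _ in range(400)]
responses = [1 if random.random() < p(true_b, d) else 0 for d in difficulties]

b_even = measure(responses[0::2], difficulties[0::2])
b_odd = measure(responses[1::2], difficulties[1::2])
print(round(b_even, 2), round(b_odd, 2))   # two estimates, close to each other and to 1.0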

What we’ve done here is akin to taking a narrative stroll through a garden of mathematical proofs. These conceptual analyses can be very convincing, but actual demonstrations of them are essential. Demonstrations would be especially persuasive if there were some way of showing three things. First, shouldn’t there be some way of constructing ordinal ratings or scores for one or another physical variable that, when scaled, give us measures that are the same as the usual measures we are accustomed to?

This would show that we can use the type of instrument usually found in the social sciences to construct physical measures with the characteristics we expect. There are four available examples, in fact, involving paired comparisons of weights (Choi, 1998), measures of short lengths (Fisher, 1988), ratings of medium-range distances (Moulton, 1993), and a recovery of the density scale (Pelton & Bunderson, 2003). In each case, the Rasch-calibrated experimental instruments produced measures equivalent to the controls, as shown in linear plots of the pairs of measures.

A second thing to build out from the mathematical proofs is a set of experiments in which we check the purported stability of measures and calibrations. We can do this by splitting large data sets, using different groups of items to produce two or more measures for each person, or using different groups of respondents/examinees to provide data for two or more sets of item calibrations. This is a routine experimental procedure in many psychometric labs, and results tend to conform with theory, with strong associations found between increasing sample sizes and increasing reliability coefficients for the respective measures or calibrations. These associations can be plotted (Fisher, 2008), as can the pairs of calibrations estimated from different samples (Fisher, 1999), and the pairs of measures estimated from different instruments (Fisher, Harvey, Kilgore, et al., 1995; Smith & Taylor, 2004). The theoretical expectation of tighter plots for better designed instruments, larger sample sizes, and longer tests is confirmed so regularly that it should itself have the status of a law of nature (Linacre, 1993).
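A toy version of the split-sample procedure looks like the sketch below (simulated data and a deliberately crude calibration routine of my own; real studies would use proper estimation software such as Winsteps or the R package eRm). The same items are calibrated from two disjoint halves of the sample, and the paired calibrations can then be compared or plotted:

import math, random

random.seed(1)

def p(b, d):
    return math.exp(b - d) / (1 + math.exp(b - d))

true_d = [random.uniform(-1.5, 1.5) for _ in range(20)]   # 20 items
abilities = [random.gauss(0, 1) for _ in range(2000)]     # 2,000 simulated examinees
data = [[1 if random.random() < p(b, d) else 0 for d in true_d] for b in abilities]

def crude_calibrations(rows):
    # Centered log-odds of failure: a rough, monotone stand-in for Rasch
    # difficulty estimates, adequate for comparing two samples to each other.
    n = len(rows)
    cals = []
    for i in range(len(true_d)):
        right = sum(row[i] for row in rows)
        cals.append(math.log((n - right + 0.5) / (right + 0.5)))
    mean = sum(cals) / len(cals)
    return [c - mean for c in cals]   # common origin at the item mean

half_a = crude_calibrations(data[:1000])
half_b = crude_calibrations(data[1000:])
for ca, cb in zip(half_a, half_b):
    print(f"{ca:6.2f}  {cb:6.2f}")   # the pairs hug the identity line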

A third convincing demonstration is to compare studies of the same thing conducted in different times and places by different researchers using different instruments on different samples. If the instruments really measure the same thing, there will not only be obvious similarities in their item contents, but similar items will calibrate in similar positions on the metric across samples. Results of this kind have been obtained in at least three published studies (Fisher, 1997a, 1997b; Belyukova, Stone, & Fox, 2004).

All of these arguments are spelled out in greater length and detail, with illustrations, in a forthcoming article (Fisher, 2009). I learned all of this from Benjamin Wright, who worked directly with Rasch himself, and who, perhaps more importantly, was prepared for what he could learn from Rasch in his previous career as a physicist. Before encountering Rasch in 1960, Wright had worked with Feynman at Cornell, Townes at Bell Labs, and Mulliken at the University of Chicago. Taught and influenced not just by three of the great minds of twentieth-century physics, but also by Townes’ philosophical perspectives on meaning and beauty, Wright had left physics in search of life. He was happy to transfer his experience with computers into his new field of educational research, but he was dissatisfied with the quality of the data and how it was treated.

Rasch’s ideas gave Wright the conceptual tools he needed to integrate his scientific values with the demands of the field he was in. Over the course of his 40-year career in measurement, Wright wrote the first software for estimating Rasch model parameters and continuously improved it; he adapted new estimation algorithms for Rasch’s models and was involved in the articulation of new models; he applied the models to hundreds of data sets using his software; he vigorously invested himself in students and colleagues; he founded new professional societies, meetings, and journals;  and he never stopped learning how to think anew about measurement and the meaning of numbers. Through it all, there was always a yardstick handy as a simple way of conveying the basic requirements of measurement as we intuitively understand it in physical terms.

Those of us who spend a lot of time working with these ideas and trying them out on lots of different kinds of data forget or never realize how skewed our experience is relative to everyone else’s. You live in a different world when you have the sustained luxury of working with very large databases, as I have had, and you see the constancy and stability of well-designed measures and calibrations over time, across instruments, and over repeated samples ranging from 30 to several million.

When you have that experience, it becomes a basic description of reasonable expectation to read the work of a colleague and see him say that “when the key features of a statistical model relevant to the analysis of social science data are the same as those of the laws of physics, then those features are difficult to ignore” (Andrich, 1988, p. 22). After calibrating dozens of instruments over 25 years, some of them many times over, it just seems like the plainest statement of the obvious to see the same guy say “Our measurement principles should be the same for properties of rocks as for the properties of people. What we say has to be consistent with physical measurement” (Andrich, 1998, p. 3).

And I find myself wishing more people held the opinion expressed by two other colleagues, that “scientific measures in the social sciences must hold to the same standards as do measures in the physical sciences if they are going to lead to the same quality of generalizations” (Bond & Fox, 2001, p. 2). When these sentiments are taken to their logical conclusion in a practical application, the real value of “attempting for reading comprehension what Newtonian mechanics achieved for astronomy” (Burdick & Stenner, 1996) becomes apparent. Rasch’s analogy between the structure of his model for reading tests and the structure of Newton’s Second Law can be restated relative to any physical law expressed as universal conditionals among variable triplets; a theory of the variable measured capable of predicting item calibrations provides the causal story for the observed variation (Burdick, Stone, & Stenner, 2006; DeBoeck & Wilson, 2004).

Knowing what I know, from the mathematical principles I’ve been trained in and from the extensive experimental work I’ve done, it seems amazing that so little attention is actually paid to tools and concepts that receive daily lip service as to their central importance in every facet of life, from health care to education to economics to business. Measurement technology rose up decades ago in preparation for the demands of today’s challenges. It is just plain weird the way we’re not using it at anything anywhere near its potential.

I’m convinced, though, that the problem is not a matter of persuasive rhetoric applied to the minds of the right people. Rather, someone, hopefully me, has got to configure the right combination of players in the right situation at the right time and at the right place to create a new form of real value that can’t be created any other way. Like they say, money talks. Persuasion is all well and good, but things will really take off only when people see that better measurement can aid in removing inefficiencies from the management of human, social, and natural capital, that better measurement is essential to creating sustainable and socially responsible policies and practices, and that better measurement means new sources of profitability.  I’m convinced that advanced measurement techniques are really nothing more than a new form of IT or communications technology. They will fit right into the existing networks and multiply their efficiencies many times over.

And when they do, we may be in a position to finally

“confront the remarkable fact that throughout the gigantic range of physical knowledge numerical laws assume a remarkably simple form provided fundamental measurement has taken place. Although the authors cannot explain this fact to their own satisfaction, the extension to behavioral science is obvious: we may have to await fundamental measurement before we will see any real progress in quantitative laws of behavior. In short, ordinal scales (even continuous ordinal scales) are perhaps not good enough and it may not be possible to live forever with a dozen different procedures for quantifying the same piece of behavior, each making strong but untestable and basically unlikely assumptions which result in nonlinear plots of one scale against another. Progress in physics would have been impossibly difficult without fundamental measurement and the reader who believes that all that is at stake in the axiomatic treatment of measurement is a possible criterion for canonizing one scaling procedure at the expense of others is missing the point” (Ramsay, Bloxom, and Cramer, 1975, p. 262).

Contesting the Claim, Part I: Are Rasch Measures Really as Objective as Physical Measures?

July 21, 2009

Psychometricians, statisticians, metrologists, and measurement theoreticians tend to be pretty unassuming kinds of people. They’re unobtrusive and retiring, by and large. But there is one thing some of them are prone to say that will raise the ire of others in a flash, and the poor innocent geek will suddenly be subjected to previously unknown forms and degrees of social exclusion.

What is that one thing? “Instruments calibrated by fitting data to a Rasch model measure with the same kind of objectivity as is obtained with physical measures.” That’s one version. Another could be along these lines: “When data fit a Rasch model, we’ve discovered a pattern in human attitudes or behaviors so regular that it is conceptually equivalent to a law of nature.”

Maybe it is the implication of objectivity as something that must be politically incorrect that causes the looks of horror and recoiling retreats in the nonmetrically inclined when they hear things like this. Maybe it is the ingrained cultural predisposition to thinking such claims outrageously preposterous that makes those unfamiliar with 80 years of developments and applications so dismissive. Maybe it’s just fear of the unknown, or a desire not to have to be responsible for knowing something important that hardly anyone else knows.

Of course, it could just be a simple misunderstanding. When people hear the word “objective” do most of them have an image of an object in mind? Does objectivity connote physical concreteness to most people? That doesn’t hold up well for me, since we can be objective about events and things people do without any confusions involving being able to touch and feel what’s at issue.

No, I think something else is going on. I think it has to do with the persistent idea that objectivity requires a disconnected, alienated point of view, one that ignores the mutual implication of subject and object in favor of analytically tractable formulations of problems that, though solvable, are irrelevant to anything important or real. But that is hardly the only available meaning of objectivity, and it isn’t anywhere near the best. It certainly is not what is meant in the world of measurement theory and practice.

It’s better to think of objectivity as something having to do with things like the object of a conversation, or an object of linguistic reference: “chair” as referring to the entire class of all forms of seating technology, for instance. In these cases, we know right away that we’re dealing with what might be considered a heuristic ideal, an abstraction. It also helps to think of objectivity in terms of fairness and justice. After all, don’t we want our educational, health care, and social services systems to respect the equality of all individuals and their rights?

That is not, of course, how measurement theoreticians in psychology have always thought about objectivity. In fact, it was only 70-80 years ago that most psychologists gave up on objective measurement because they couldn’t find enough evidence of concrete phenomena to support the claims to objectivity they wanted to make (Michell, 1999). The focus on the reflex arc led a lot of psychologists into psychophysics, and the effects of operant conditioning led others to behaviorism. But a lot of the problems studied in these fields, though solvable, turned out to be uninteresting and unrelated to the larger issues of life demanding attention.

And so, with no physical entity that could be laid end-to-end and concatenated in the way weights are in a balance scale, psychologists just redefined measurement to suit what they perceived to be the inherent limits of their subject matter. Measurement didn’t have to be just ratio or interval, it could also be ordinal and even nominal. The important thing was to get numbers that could be statistically manipulated. That would provide more than enough credibility, or obfuscation, to create the appearance of legitimate science.

But while mainstream psychology was focused on hunting for statistically significant p-values, there were others trying to figure out if attitudes, abilities, and behaviors could be measured in a rigorously meaningful way.

Louis Thurstone, a former electrical engineer turned psychologist, was among the first to formulate the problem. Writing in 1928, Thurstone rightly focused attention on the instrument:

“The scale must transcend the group measured.–One crucial experimental test must be applied to our method of measuring attitudes before it can be accepted as valid. A measuring instrument must not be seriously affected in its measuring function by the object of measurement. To the extent that its measuring function is so affected, the validity of the instrument is impaired or limited. If a yardstick measured differently because of the fact that it was a rug, a picture, or a piece of paper that was being measured, then to that extent the trustworthiness of that yardstick as a measuring device would be impaired. Within the range of objects for which the measuring instrument is intended, its function must be independent of the object of measurement” (Thurstone, 1959, p. 228).

Thurstone aptly captures what is meant when it is said that attitudes, abilities, or behaviors can be measured with the same kind of objectivity as is obtained in the natural sciences. Objectivity is realized when a test, survey, or assessment functions the same way no matter who is being measured, and, conversely (Thurstone took this up, too), an attitude, ability, or behavior exhibits the same amount of what is measured no matter which instrument is used.

This claim, too, may seem to some to be so outrageously improbable as to be worthy of rejecting out of hand. After all, hasn’t everyone learned how the fact of being measured changes the measure? Thing is, this is just as true in physics and ecology as it is in psychiatry or sociology, and the natural sciences haven’t abandoned their claims to objectivity. So what’s up?

What’s up is that all sciences now have participant observers. The old Cartesian duality of the subject-object split still resides in various rhetorical choices and affects our choices and behaviors, but, in actual practice, scientific methods have always had to deal with the way questions imply particular answers.

And there’s more. Qualitative methods have grown out of some of the deep philosophical introspections of the twentieth century, such as phenomenology, hermeneutics, deconstruction, postmodernism, etc. But most researchers who are adopting qualitative methods over quantitative ones don’t know that the philosophers legitimating the new focuses on narrative, interpretation, and the construction of meaning did quite a lot of very good thinking about mathematics and quantitative reasoning. Much of my own published work engages with these philosophers to find new ways of thinking about measurement (Fisher, 2004, for instance). And there are some very interesting connections to be made that show quantification does not necessarily have to involve a positivist, subject-object split.

So where does that leave us? Well, with probability. Not in the sense of statistical hypothesis testing, but in the sense of calibrating instruments with known probabilistic characteristics. If the social sciences are ever to be scientific, null hypothesis significance tests are going to have to be replaced with universally uniform metrics embodying and deploying the regularities of natural laws, as is the case in the physical sciences. Various arguments on this issue have been offered for decades (Cohen, 1994; Meehl, 1967, 1978; Goodman, 1999; Guttman, 1985; Rozeboom, 1960). The point is not to proscribe allowable statistics based on scale type  (Velleman & Wilkinson, 1993). Rather, we need to shift and simplify the focus of inference from the statistical analysis of data to the calibration and distribution of instruments that support distributed cognition, unify networks, lubricate markets, and coordinate collective thinking and acting (Fisher, 2000, 2009). Persuasion will likely matter far less in resolving the matter than an ability to create new value, efficiencies, and profits.

In 1964, Luce and Tukey gave us another way of stating what Thurstone was getting at:

“The axioms of conjoint measurement apply naturally to problems of classical physics and permit the measurement of conventional physical quantities on ratio scales…. In the various fields, including the behavioral and biological sciences, where factors producing orderable effects and responses deserve both more useful and more fundamental measurement, the moral seems clear: when no natural concatenation operation exists, one should try to discover a way to measure factors and responses such that the ‘effects’ of different factors are additive.”

In other words, if we cannot find some physical thing that we can make add up the way numbers do, as we did with length, weight, volts, temperature, time, etc., then we ought to ask questions in a way that allows the answers to reveal the kind of patterns we expect to see when things do concatenate. What Thurstone and others working in his wake have done is to see that we could possibly do some things virtually in terms of abstract relations that we cannot do actually in terms of concrete relations.
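A minimal sketch of the kind of pattern being sought (the numbers are invented for illustration): when expected responses are any strictly monotone function of additively combined person and item effects, every row of the person-by-item table is ordered the same way by the items, and every column the same way by the persons, with no crossings anywhere.

import math

abilities = [-1.0, 0.0, 1.5]       # person effects, low to high
difficulties = [-0.5, 0.5, 1.0]    # item effects, easy to hard

# Expected-response table built from the additive combination a - d.
table = [[1 / (1 + math.exp(-(a - d))) for d in difficulties] for a in abilities]

# Every row decreases across harder items...
for row in table:
    assert all(x > y for x, y in zip(row, row[1:]))
# ...and every column increases with abler persons.
for col in zip(*table):
    assert all(x < y for x, y in zip(col, col[1:]))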

The concept is no more difficult to comprehend than understanding the difference between playing solitaire with actual cards and writing a computer program to play solitaire with virtual cards. Either way, the same relationships hold.

A Danish mathematician, Georg Rasch, understood this. Working in the 1950s with data from psychological and reading tests, Rasch drew on his training in the natural sciences and mathematics to arrive at a conception of measurement that would apply equally well in the natural and human sciences. He realized that

“…the acceleration of a body cannot be determined; the observation of it is admittedly liable to … ‘errors of measurement’, but … this admittance is paramount to defining the acceleration per se as a parameter in a probability distribution — e.g., the mean value of a Gaussian distribution — and it is such parameters, not the observed estimates, which are assumed to follow the multiplicative law [acceleration = force / mass, or mass * acceleration = force].

“Thus, in any case an actual observation can be taken as nothing more than an accidental response, as it were, of an object — a person, a solid body, etc. — to a stimulus — a test, an item, a push, etc. — taking place in accordance with a potential distribution of responses — the qualification ‘potential’ referring to experimental situations which cannot possibly be [exactly] reproduced.

“In the cases considered [earlier in the book] this distribution depended on one relevant parameter only, which could be chosen such as to follow the multiplicative law.

“Where this law can be applied it provides a principle of measurement on a ratio scale of both stimulus parameters and object parameters, the conceptual status of which is comparable to that of measuring mass and force. Thus, … the reading accuracy of a child … can be measured with the same kind of objectivity as we may tell its weight …” (Rasch, 1960, p. 115).

Rasch’s model not only sets the parameters for data sufficient to the task of measurement, it lays out the relationships that must be found in data for objective results to be possible. Rasch studied with Ronald Fisher in London in 1935, expanded his understanding of statistical sufficiency with him, and then applied it in his measurement work, but not in the way that most statisticians understand it. Yes, in the context of group-level statistics, sufficiency concerns the reproducibility of a normal distribution when all that is known are the mean and the standard deviation. But sufficiency is something quite different in the context of individual-level measurement. Here, counts of correct answers or sums of ratings serve as sufficient statistics  for any statistical model’s parameters when they contain all of the information needed to establish that the parameters are independent of one another, and are not interacting in ways that keep them tied together. So despite his respect for Ronald Fisher and the concept of sufficiency, Rasch’s work with models and methods that worked equally well with many different kinds of distributions led him to jokingly suggest (Andersen, 1995, p. 385) that all textbooks mentioning the normal distribution should be burned!

In plain English, all that we’re talking about here is what Thurstone said: the ruler has to work the same way no matter what or who it is measuring, and we have to get the same results for what or who we are measuring no matter which ruler we use. When parameters are not separable, when they stick together because some measures change depending on which questions are asked or because some calibrations change depending on who answers them, we have encountered a “failure of invariance” that tells us something is wrong. If we are to persist in our efforts to determine if something objective exists and can be measured, we need to investigate these interactions and explain them. Maybe there was a data entry error. Maybe a form was misprinted. Maybe a question was poorly phrased. Maybe we have questions that address different constructs all mixed together. Maybe math word problems work like reading test items for students who can’t read the language they’re written in.  Standard statistical modeling ignores these potential violations of construct validity in favor of adding more parameters to the model.

But that’s another story for another time. Tomorrow we’ll take a closer look at sufficiency, in both conceptual and practical terms. Cited references are always available on request, but I’ll post them in a couple of days.

The “Standard Model,” Part II: Natural Law, Economics, Measurement, and Capital

July 15, 2009

At Tjalling Koopmans’ invitation, Rasch became involved with the Cowles Commission, working at the University of Chicago in the 1947 academic year, and giving presentations in the same seminar series as Milton Friedman, Kenneth Arrow, and Jimmie Savage (Linacre, 1998; Cowles Foundation, 1947, 1952; Rasch, 1953). Savage would later be instrumental in bringing Rasch back to Chicago in 1960.

Rasch was prompted to approach Savage about giving a course at Chicago after receiving a particularly strong response to some of his ideas from his old mentor, Frisch, when Frisch came to Copenhagen in 1959 to receive an honorary doctorate. Frisch shared the first Nobel Prize in economics with Tinbergen, co-founded the Econometric Society with Irving Fisher, coined the terms “econometrics” and “macro-economics,” and edited Econometrica for many years. As recounted by Rasch (1977, pp. 63-66; also see Andrich, 1997; Wright, 1980, 1998), Frisch was struck by the disappearance of the person parameter from the comparisons of item calibrations in the series of equations Rasch presented. In response to Frisch’s reaction, Rasch formalized his mathematical ideas in a Separability Theorem.
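What Frisch saw can be reconstructed in two lines. In the multiplicative form of Rasch’s model for dichotomous items (the notation here is a standard modern rendering, not Rasch’s original), a person with ability ξ_v facing an item with easiness ε_i has odds of success

    $$ \frac{P(x_{vi}=1)}{P(x_{vi}=0)} = \xi_v\,\varepsilon_i, \qquad\text{so that}\qquad \frac{\xi_v\,\varepsilon_i}{\xi_v\,\varepsilon_j} = \frac{\varepsilon_i}{\varepsilon_j}. $$

The person parameter cancels from the comparison of any two items and, symmetrically, the item parameter cancels from the comparison of any two persons. This is the disappearance that struck Frisch, and that the Separability Theorem formalizes.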

Why were separable parameters significant to Frisch? Because they addressed the problem at the center of his network of concepts: autonomy, better known today as structural invariance (Aldrich, 1989, p. 15; Boumans, 2005, pp. 51 ff.; Haavelmo, 1948). Autonomy concerns the capacity of data to represent a pattern of relationships that holds up across the local particulars. It is, in effect, Frisch’s own way of extending the Standard Model. Irving Fisher (1930) had similarly stated what he termed a Separation Theorem which, in the manner of previous work by Walras, Jevons, and others, was also presented in terms of a multiplicative relation among three variables. Frisch (1930) complemented Irving Fisher’s instrumental approach with a mathematical, axiomatic one (Boumans, 2005), offering necessary and sufficient conditions for tests of Fisher’s theorem.

When Rasch left Frisch, he went directly to London to work with Ronald Fisher, where he remained for a year. In the following decades, Rasch became known as the foremost advocate of Ronald Fisher’s ideas in Denmark. In particular, he stressed the value of statistical sufficiency, calling it the “high mark” of Fisher’s work (Fisher, 1922). Rasch’s student, Erling Andersen, later showed that when raw scores are both necessary and sufficient statistics for autonomous, separable parameters, the model employed is Rasch’s (Andersen, 1977; Fischer, 1981; van der Linden, 1992).

Whether or not Rasch’s conditions exactly reproduce Frisch’s, and whether or not his Separability Theorem is identical with Irving Fisher’s Separation Theorem, it seems clear that the time with Frisch exerted a significant influence on Rasch, likely focusing his attention on statistical sufficiency, the autonomy implied by separable parameters, and multiplicative relations among triples of variables.

These developments, and those documented in my previous posts, suggest the existence of powerful, untapped potentials hidden within psychometrics and econometrics. The story told thus far remains incomplete. However compelling the logic and the personal histories may be, central questions remain unanswered. To provide a more rounded assessment of the situation, we must take up several unresolved philosophical issues (Fisher, 2003a, 2003b, 2004).

It is my contention that, for better measurement to become mainstream, a certain kind of cultural shift will have to happen. This shift has been underway for decades and has precedents that go back centuries. Its features become more apparent as long-term economic sustainability is understood to involve significant investments in humanly, socially, and environmentally responsible practices. For such practices to be more than superficial expressions of intentions more interested in selfish gain than in the greater good, they have to emerge organically from cultural roots that are already alive and thriving.

It is not difficult to see how such an organic emergence might happen, though describing it appropriately requires keeping the relationship of the local individual to the global universal always in mind. And even when such a description is in hand, it in no way shows how the emergence could be brought about. All we can do is persist in preparing ourselves for the opportunities that arise: reading, thinking, discussing, and practicing. Then, and only then, might we start to plant the seeds, nurture them, and see them grow.

References

Aldrich, J. (1989). Autonomy. Oxford Economic Papers, 41, 15-34.

Andersen, E. B. (1977). Sufficient statistics and latent trait models. Psychometrika, 42(1), 69-81.

Andrich, D. (1997). Georg Rasch in his own words [excerpt from a 1979 interview]. Rasch Measurement Transactions, 11(1), 542-3. [http://www.rasch.org/rmt/rmt111.htm#Georg].

Bjerkholt, O. (2001). Tracing Haavelmo’s steps from Confluence Analysis to the Probability Approach (Tech. Rep. No. 25). Oslo, Norway: Department of Economics, University of Oslo, in cooperation with The Frisch Centre for Economic Research.

Boumans, M. (1993). Paul Ehrenfest and Jan Tinbergen: A case of limited physics transfer. In N. De Marchi (Ed.), Non-natural social science: Reflecting on the enterprise of “More Heat than Light” (pp. 131-156). Durham, NC: Duke University Press.

Boumans, M. (2001). Fisher’s instrumental approach to index numbers. In M. S. Morgan & J. Klein (Eds.), The age of economic measurement (pp. 313-44). Durham, NC: Duke University Press.

Boumans, M. (2005). How economists model the world into numbers. New York: Routledge.

Burdick, D. S., Stone, M. H., & Stenner, A. J. (2006). The Combined Gas Law and a Rasch Reading Law. Rasch Measurement Transactions, 20(2), 1059-60 [http://www.rasch.org/rmt/rmt202.pdf].

Cowles Foundation for Research in Economics. (1947). Report for period 1947, Cowles Commission for Research in Economics. Retrieved 7 July 2009, from Yale University Dept. of Economics: http://cowles.econ.yale.edu/P/reports/1947.htm.

Cowles Foundation for Research in Economics. (1952). Biographies of Staff, Fellows, and Guests, 1932-1952. Retrieved 7 July 2009 from Yale University Dept. of Economics: http://cowles.econ.yale.edu/P/reports/1932-52d.htm#Biographies.

Fischer, G. H. (1981, March). On the existence and uniqueness of maximum-likelihood estimates in the Rasch model. Psychometrika, 46(1), 59-77.

Fisher, I. (1930). The theory of interest. New York: Macmillan.

Fisher, R. A. (1922). On the mathematical foundations of theoretical statistics. Philosophical Transactions of the Royal Society of London, A, 222, 309-368.

Fisher, W. P., Jr. (1992). Objectivity in measurement: A philosophical history of Rasch’s separability theorem. In M. Wilson (Ed.), Objective measurement: Theory into practice. Vol. I (pp. 29-58). Norwood, NJ: Ablex Publishing Corporation.

Fisher, W. P., Jr. (2003a, December). Mathematics, measurement, metaphor, metaphysics: Part I. Implications for method in postmodern science. Theory & Psychology, 13(6), 753-90.

Fisher, W. P., Jr. (2003b, December). Mathematics, measurement, metaphor, metaphysics: Part II. Accounting for Galileo’s “fateful omission.” Theory & Psychology, 13(6), 791-828.

Fisher, W. P., Jr. (2004, October). Meaning and method in the social sciences. Human Studies: A Journal for Philosophy and the Social Sciences, 27(4), 429-54.

Fisher, W. P., Jr. (2007, Summer). Living capital metrics. Rasch Measurement Transactions, 21(1), 1092-3 [http://www.rasch.org/rmt/rmt211.pdf].

Fisher, W. P., Jr. (2008, March 28). Rasch, Frisch, two Fishers and the prehistory of the Separability Theorem. In Session 67.056. Reading Rasch Closely: The History and Future of Measurement. American Educational Research Association, Rasch Measurement SIG, New York University, New York City.

Frisch, R. (1930). Necessary and sufficient conditions regarding the form of an index number which shall meet certain of Fisher’s tests. Journal of the American Statistical Association, 25, 397-406.

Haavelmo, T. (1948). The autonomy of an economic relation. In R. Frisch et al. (Eds.), Autonomy of economic relations (pp. 25-38). Oslo, Norway: Memo DE-UO.

Heilbron, J. L. (1993). Weighing imponderables and other quantitative science around 1800. Historical Studies in the Physical and Biological Sciences, 24 (Supplement), Part I, 1-337.

Jammer, M. (1999). Concepts of mass in contemporary physics and philosophy. Princeton, NJ: Princeton University Press.

Linacre, J. M. (1998). Rasch at the Cowles Commission. Rasch Measurement Transactions, 11(4), 603.

Maas, H. (2001). An instrument can make a science: Jevons’s balancing acts in economics. In M. S. Morgan & J. Klein (Eds.), The age of economic measurement (pp. 277-302). Durham, NC: Duke University Press.

Mirowski, P. (1988). Against mechanism. Lanham, MD: Rowman & Littlefield.

Rasch, G. (1953, March 17-19). On simultaneous factor analysis in several populations. From the Uppsala Symposium on Psychological Factor Analysis. Nordisk Psykologi’s Monograph Series, 3, 65-71, 76-79, 82-88, 90.

Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests (Reprint, with Foreword and Afterword by B. D. Wright, Chicago: University of Chicago Press, 1980). Copenhagen, Denmark: Danmarks Paedagogiske Institut.

Rasch, G. (1977). On specific objectivity: An attempt at formalizing the request for generality and validity of scientific statements. Danish Yearbook of Philosophy,  14, 58-94.

van der Linden, W. J. (1992). Sufficient and necessary statistics. Rasch Measurement Transactions, 6(3), 231 [http://www.rasch.org/rmt/rmt63d.htm].

Wright, B. D. (1980). Foreword, Afterword. In G. Rasch, Probabilistic models for some intelligence and attainment tests (pp. ix-xix, 185-199) [Reprint; original work published 1960 by the Danish Institute for Educational Research]. Chicago, IL: University of Chicago Press. http://www.rasch.org/memo63.htm.

Wright, B. D. (1994, Summer). Theory construction from empirical observations. Rasch Measurement Transactions, 8(2), 362 [http://www.rasch.org/rmt/rmt82h.htm].

Wright, B. D. (1998, Spring). Georg Rasch: The man behind the model. Popular Measurement, 1, 15-22 [http://www.rasch.org/pm/pm1-15.pdf].