Archive for the ‘Computer Adaptive Measurement’ Category

A Framework for Competitive Advantage in Managing Intangible Assets

July 26, 2011

It has long been recognized that externalities like social costs could be brought into the market should ways of measuring them objectively be devised. Markets, however, do not emerge spontaneously from the mere desire to be able to buy and sell; they are, rather, the products of actors and agencies that define the rules, roles, and relationships within which transaction costs are reduced and from which value, profits, and authentic wealth may be extracted. Objective measurement is necessary to reduce transaction costs but is by itself insufficient to the making of markets. Thus, markets for intangible assets, such as human, social, and natural capital, remain inefficient and undeveloped even though scientific theories, models, methods, and results demonstrating their objective measurability have been available for over 80 years.

Why has the science of objectively measured intangible assets not yet led to efficient markets for those assets? The crux of the problem, the pivot point at which an economic Archimedes could move the world of business, has to do with verifiable trust. It may seem like stating the obvious, but there is much to be learned from recognizing that shared narratives of past performance and a shared vision of the future are essential to the atmosphere of trust and verifiability needed for the making of markets. The key factor is the level of detail reliably tapped by such narratives.

For instance, some markets seem to have the weight of an immovable mass when the dominant narrative describes a static past and future with no clearly defined trajectory of leverageable development. But when a path of increasing technical capacity or precision over time can be articulated, entrepreneurs have the time frames they need to be able to coordinate, align, and manage budgeting decisions vis-à-vis investments, suppliers, manufacturers, marketing, sales, and customers. For example, the building out of the infrastructure of highways, electrical power, and water and sewer services assured manufacturers of automobiles, appliances, and homes that they could develop products for which there would be ready customers. Similarly, the mapping out of a path of steady increases in technical precision at no additional cost in Moore’s Law has been a key factor enabling the microprocessor industry’s ongoing history of success.

Of course, as has been the theme of this blog since day one, similar paths for the development of new infrastructural capacities could be vital factors for making new markets for human, social, and natural capital. I’ll be speaking on this topic at the forthcoming IMEKO meeting in Jena, Germany, August 31 to September 2. Watch this spot for more on this theme in the near future.

Creative Commons License
LivingCapitalMetrics Blog by William P. Fisher, Jr., Ph.D. is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Based on a work at livingcapitalmetrics.wordpress.com.
Permissions beyond the scope of this license may be available at http://www.livingcapitalmetrics.com.


Stages in the Development of Meaningful, Efficient, and Useful Measures

February 21, 2011

In all learning, we use what we already know as a means of identifying what we do not yet know. When someone can read a written language, knows an alphabet and has a vocabulary, understands grammar and syntax, then that knowledge can be used to learn about the world. Then, knowing what birds are, for instance, one might learn about different kinds of birds or the typical behaviors of one bird species.

And so with measurement, we start from where we find ourselves, as with anything else. There is no need or possibility for everyone to master all the technical details of every different area of life that’s important. But it is essential that we know what is technically possible, so that we can seek out and find the tools that help us achieve our goals. We can’t get what we can’t or don’t ask for. In the domain of measurement, it seems that hardly anyone is looking for what’s actually readily available.

So it seems pertinent to offer a description of a continuum of increasingly meaningful, efficient and useful ways of measuring. Previous considerations of the problem have offered different categorizations for the transformations characterizing development on this continuum. Stenner and Horabin (1992) distinguish between 1) impressionistic and qualitative, nominal gradations found in the earliest conceptualizations of temperature, 2) local, data-based quantitative measures of temperature, and 3) generalized, universally uniform, theory-based quantitative measures of temperature.

Theory-based temperature measurement is prized for the way that thermodynamic theory enables the calibration of individual thermometers with no need for testing each one in empirical studies of its performance. As Lewin (1951, p. 169) put it, “There is nothing so practical as a good theory.” Thus we have electromagnetic theory making it possible to know the conduction and resistance characteristics of electrical cable from the properties of the metal alloys and insulators used, with no need to test more than a small fraction of that cable as a quality check.

Theory makes it possible to know in advance what the results of such tests would be with enough precision to greatly reduce the burden and expenses of instrument calibration. There likely would be no electrical industry at all if the properties of every centimeter of cable and every appliance had to be experimentally tested. This principle has been employed in measuring human, social, and natural capital for some time, but, for a variety of reasons, it has not yet been adopted on a wide scale.

Reflecting on the history of psychosocial measurement in this context, it becomes apparent that Stenner and Horabin’s (1992) three stages can be further broken down. Listed below are the distinguishing features of each of six stages in the evolution of measurement systems, building on the five stages described by Stenner, Burdick, Sanford, and Burdick (2006). This progression of increasing complexity, meaning, efficiency, and utility can be used as a basis for a technology roadmap that will enable the coordination and alignment of various services and products in the domain of intangible assets, as I will take up in a forthcoming post.

Stage 1. Least meaning, utility, efficiency, and value

Purely passive, receptive

Statistics describe data: What you see is what you get

Content defines measure

Additivity, invariance, etc. not tested, so numbers do not stand for something that adds up like they do

Measurement defined statistically in terms of group-level intervariable relations

Meaning of numbers changes with questions asked and persons answering

No theory

Data must be gathered and analyzed to have results

Commercial applications are instrument-dependent

Standards based in ensuring fair methods and processes

Stage 2

Slightly less passive, receptive but still descriptively oriented

Additivity, invariance, etc. tested, so numbers might stand for something that adds up like they do

Measurement still defined statistically in terms of group-level intervariable relations

Falsification of additive hypothesis effectively derails measurement effort

Descriptive models with interaction effects accepted as viable alternatives

Typically little or no attention to theory of item hierarchy and construct definition

Empirical (data-based) calibrations only

Data must be gathered and analyzed to have results

Initial awareness of measurement theory

Commercial applications are instrument-dependent

Standards based in ensuring fair methods and processes

Stage 3

Even less purely passive & receptive, more active

Instrument still designed relative to content specifications

Additivity, invariance, etc. tested, so numbers might stand for something that adds up like they do

Falsification of additive hypothesis provokes questions as to why

Descriptive models with interaction effects not accepted as viable alternatives

Measurement defined prescriptively in terms of individual-level intravariable invariance

Significant attention to theory of item hierarchy and construct definition

Empirical calibrations only

Data has to be gathered and analyzed to have results

More significant use of measurement theory in prescribing acceptable data quality

Limited construct theory (no predictive power)

Commercial applications are instrument-dependent

Standards based in ensuring fair methods and processes

Stage 4

First stage that is more active than passive

Initial efforts to (re-)design instrument relative to construct specifications and theory

Additivity, invariance, etc. tested in thoroughly prescriptive focus on calibrating instrument

Numbers not accepted unless they stand for something that adds up like they do

Falsification of additive hypothesis provokes questions as to why and corrective action

Models with interaction effects not accepted as viable alternatives

Measurement defined prescriptively in terms of individual-level intravariable invariance

Significant attention to theory of item hierarchy and construct definition relative to instrument design

Empirical calibrations only but model prescribes data quality

Data usually has to be gathered and analyzed to have results

Point-of-use self-scoring forms might provide immediate measurement results to end user

Some construct theory (limited predictive power)

Some commercial applications are not instrument-dependent (as in CAT item bank implementations)

Standards based in ensuring fair methods and processes

Stage 5

Significantly active approach to measurement

Item hierarchy translated into construct theory

Construct specification equation predicts item difficulties

Theory-predicted (not empirical) calibrations used in applications

Item banks superseded by single-use items created on the fly

Calibrations checked against empirical results but data gathering and analysis not necessary

Point-of-use self-scoring forms or computer apps provide immediate measurement results to end user

Used routinely in commercial applications

Awareness that standards might be based in metrological traceability to consensus standard uniform metric
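The construct specification equation named in this stage can be sketched as a simple linear model that predicts item difficulties from theory-relevant item features. The feature names, weights, and intercept below are hypothetical illustrations chosen for this sketch, not the coefficients of any published specification equation.

```python
# A minimal sketch of a construct specification equation: a linear model
# predicting item difficulty (in logits) from item features.
# All feature names and coefficient values here are hypothetical.

def predict_difficulty(features, weights, intercept):
    """Predicted item difficulty as a weighted sum of item features."""
    return intercept + sum(w * features[name] for name, w in weights.items())

# Hypothetical features for a reading item: longer sentences make items
# harder; more frequent (familiar) words make them easier.
weights = {"mean_sentence_length": 0.06, "log_word_frequency": -0.40}
item = {"mean_sentence_length": 18.0, "log_word_frequency": 2.5}

d = predict_difficulty(item, weights, intercept=-0.5)  # -0.5 + 1.08 - 1.0
```

In practice such weights would be estimated by regressing empirical item calibrations on the features, and the equation would then be judged by how well it predicts the difficulties of newly written items.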

Stage 6. Most meaning, utility, efficiency, and value

Most purely active approach to measurement

Item hierarchy translated into construct theory

Construct specification equation predicts item ensemble difficulties

Theory-predicted calibrations enable single-use items created from context

Checked against empirical results for quality assessment but data gathering and analysis not necessary

Point-of-use self-scoring forms or computer apps provide immediate measurement results to end user

Used routinely in commercial applications

Standards based in metrological traceability to consensus standard uniform metric

 

References

Lewin, K. (1951). Field theory in social science: Selected theoretical papers (D. Cartwright, Ed.). New York: Harper & Row.

Stenner, A. J., Burdick, H., Sanford, E. E., & Burdick, D. S. (2006). How accurate are Lexile text measures? Journal of Applied Measurement, 7(3), 307-22.

Stenner, A. J., & Horabin, I. (1992). Three stages of construct definition. Rasch Measurement Transactions, 6(3), 229 [http://www.rasch.org/rmt/rmt63b.htm].


Open Letter to the Impact Investment Community

May 4, 2010

It is very encouraging to discover your web sites (GIIN, IRIS, and GIIRS) and to see the work you’re doing in advancing the concept of impact investing. The defining issue of our time is figuring out how to harness the profit motive for socially responsible and environmentally sustainable prosperity. The economic, social, and environmental disasters of today might all have been prevented or significantly mitigated had social and environmental impacts been taken into account in all investing.

My contribution is to point out that, though the profit motive must be harnessed as the engine driving responsible and sustainable business practices, the force of that power is dissipated and negated by the lack of efficient human, social, and natural capital markets. If we cannot make these markets function more like financial markets, so that money naturally flows to those places where it produces the greatest returns, we will never succeed in the fundamental reorientation of the economy toward responsible sustainability. The goal has to be one of tying financial profits to growth in realized human potential, community, and environmental quality, but to do that we need measures of these intangible forms of capital that are as scientifically rigorous as they are eminently practical and convenient.

Better measurement is key to reducing the market frictions that inflate the cost of human, social, and natural capital transactions. A truly revolutionary paradigm shift has occurred in measurement theory and practice over the last fifty years and more. New methods make it possible

* to reduce data volume dramatically with no loss of information,
* to custom tailor measures by selectively adapting indicators to the entity rated, without compromising comparability,
* to remove rater leniency or severity effects from the measures,
* to design optimally efficient measurement systems that provide the level of precision needed to support decision making,
* to establish reference standard metrics that remain universally uniform across variations in local impact assessment indicator configurations, and
* to calibrate instruments that measure in metrics intuitively meaningful to stakeholders and end users.

Unfortunately, almost all the admirable energy and resources being poured into business intelligence measures skip over these “new” developments, defaulting to mistaken assumptions about numbers and the nature of measurement. Typical ratings, checklists, and scores provide units of measurement that

* change size depending on which question is asked, which rating category is assigned, and who or what is rated,
* increase data volume with every new question asked,
* push measures up and down in uncontrolled ways depending on who is judging the performance,
* are of unknown precision, and
* cannot be compared across different composite aggregations of ratings.

I have over 25 years’ experience in the use of advanced measurement and instrument calibration methods, backed up with MA and PhD degrees from the University of Chicago. The methods in which I am trained have been standard practice in educational testing for decades, and in the last 20 years have become the methods of choice in health care outcomes assessment.

I am passionately committed to putting these methods to work in the domain of impact investing, business intelligence, and ecological economics. As is shown in my attached CV, I have dozens of peer-reviewed publications presenting technical and philosophical research in measurement theory and practice.

In the last few years, I have taken my work in the direction of documenting the ways in which measurement can and should reduce information overload and transaction costs; enhance human, social, and natural capital market efficiencies; provide the instruments embodying common currencies for the exchange of value; and inform a new kind of Genuine Progress Indicator or Happiness Index.

For more information, please see the attached 2009 article I published in Measurement on these topics, and the attached White Paper I produced last July in response to a call from NIST for critical national need ideas. Various entries in my blog (https://livingcapitalmetrics.wordpress.com) elaborate on measurement technicalities, history, and philosophy, as do my web site at http://www.livingcapitalmetrics.com and my profile at http://www.linkedin.com/in/livingcapitalmetrics.

For instance, the blog post at https://livingcapitalmetrics.wordpress.com/2009/11/22/al-gore-will-is-not-the-problem/ explores the idea with which I introduced myself to you here, that the profit motive embodies our collective will for responsible and sustainable business practices, but we hobble ourselves with self-defeating inattention to the ways in which capital is brought to life in efficient markets. We have the solutions to our problems at hand, though there are no panaceas, and the challenges are huge.

Please feel free to contact me at your convenience. Whether we are ultimately able to work together or not, I enthusiastically wish you all possible success in your endeavors.

Sincerely,

William P. Fisher, Jr., Ph.D.
LivingCapitalMetrics.com
919-599-7245

We are what we measure.
It’s time we measured what we want to be.


How bad will the financial crises have to get before…?

April 30, 2010

More and more states and nations around the world face the possibility of defaulting on their financial obligations. The financial crises are of epic historical proportions. This is a disaster of the first order. And yet, it is so odd: we have the solutions and preventative measures we need at our fingertips, but no one knows about them or is looking for them.

So, I am persuaded to once again wonder if there might now be some real interest in the possibilities of capitalizing on

  • measurement’s well-known capacity for reducing transaction costs by improving information quality and reducing information volume;
  • instruments calibrated to measure in constant units (not ordinal ones) within known error ranges (not as though the measures are perfectly precise) with known data quality;
  • measures made meaningful by their association with invariant scales defined in terms of the questions asked;
  • adaptive instrument administration methods that make all measures equally precise by targeting the questions asked;
  • judge calibration methods that remove the person rating performances as a factor influencing the measures;
  • the metaphor of transparency, realized by calibrating instruments so clear that we look right through them at the thing measured (risk, governance, abilities, health, performance, etc.);
  • efficient markets for human, social, and natural capital by means of the common currencies of uniform metrics, calibrated instrumentation, and metrological networks;
  • the means available for tuning the instruments of the human, social, and environmental sciences to well-tempered scales that enable us to more easily harmonize, orchestrate, arrange, and choreograph relationships;
  • our understandings that universal human rights require universal uniform measures, that fair dealing requires fair measures, and that our measures define who we are and what we value; and, last but very far from least,
  • the power of love–the back and forth of probing questions and honest answers in caring social intercourse plants seminal ideas in fertile minds that can be nurtured to maturity and Socratically midwifed as living meaning born into supportive ecologies of caring relations.

How bad do things have to get before we systematically and collectively implement the long-established and proven methods we have at our disposal? It is the most surreal kind of schizophrenia or passive-aggressive avoidance pathology to keep on tormenting ourselves with problems for which we have solutions.

For more information on these issues, see prior blogs posted here, the extensive documentation provided, and http://www.livingcapitalmetrics.com.


Mass Customization: Tailoring Tests, Surveys, and Assessments to Individuals without Sacrificing Comparability

January 11, 2010

One of the recurring themes in this blog concerns the technical capacities for more precise and meaningful measurement that remain unrecognized and under-utilized in business, finance, and economics. One of the especially productive capacities I have in mind relates to the techniques of adaptive measurement. These techniques make it possible to tailor measuring tools to the needs of the people measured, the diametric opposite of standard practice, which typically assumes that people must adapt to the needs of the measuring instrument.

Think about what it means to try to measure every case using the same statements. When you define the limits of your instrument in terms of common content, you are looking for a one-size-fits-all solution. This design requires that you restrict the content of the statements to those that will be relevant in every case. The reason for proceeding in this way hinges on the assumption that you need to administer all of the items to every case in order to make the measures comparable, but this is not true. To conceive measurement in this way is to be shackled to an obsolete technology. Instead of operating within the constraints of an overly limiting set of assumptions, you could be designing a system that takes missing data into account and that supports adaptive item administration, so that the instrument is tailored to the needs of the measurement situation. The benefits from taking this approach are extensive.

Think of the statements comprising the instrument as defining a hierarchy or continuum that extends from the most important, most agreeable, or easiest-to-achieve things at the bottom, and the least important, least agreeable, and hardest to achieve at the top. Imagine that your data are consistent, so that the probability of importance, agreeability, or success steadily decreases for any individual case as you read up the scale.

Obtaining data consistency like this is not always easy, but it is essential to measurement and to calibrating a scientific instrument. Even when data do not provide the needed consistency, much can be learned from them as to what needs to be done to get it.

Now hold that thought: you have a matrix of complete data, with responses to every item for every case. In the typically assumed design scenario, in which all items are applied to every case, the items calibrated at the top of the scale must be administered no matter how low a measure is, even if we know from long experience and repeated recalibrations across multiple samples that the response probabilities of importance, agreement, or success are virtually 0.00 for these items.

Conversely, no matter how high a measure is, the usual design demands that all items be administered, even if we know from experience that the response probabilities for the items at the bottom of the scale are virtually 1.00.
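These near-certain and near-impossible response probabilities follow directly from the Rasch model for dichotomous items. A minimal sketch, assuming hypothetical person measures and item difficulties expressed in logits:

```python
import math

def p_success(measure, difficulty):
    """Rasch model: probability of success (or agreement, or endorsement)
    given a person measure and an item difficulty, both in logits."""
    return 1.0 / (1.0 + math.exp(difficulty - measure))

# A hypothetical item hierarchy, easiest at the bottom, hardest at the top.
difficulties = [-6.0, -3.0, 0.0, 3.0, 6.0]

# For a centrally located measure, the probability of success steadily
# decreases as you read up the scale, from virtually 1.00 to virtually 0.00.
probs = [p_success(0.0, d) for d in difficulties]
```

Items far below a person’s measure are answered successfully almost every time, and items far above it almost never, which is why administering them yields essentially no new information about that person.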

In this scenario, we are wasting time and resources obtaining data on items for which we already know the answers. We are furthermore not asking other questions that would be particularly relevant to different individual cases because to include them in a complete data design where one size fits all would make the instrument too long. So we are stuck with a situation in which perhaps only a tenth of the overall instrument is actually being used for cases with measures toward the extremes.

One of the consequences of this is that we have much less information about the very low and very high measures, and so we have much less confidence about where the measures are than we do for more centrally located measures.
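This loss of confidence can be made concrete. For dichotomous Rasch items, the standard error of a measure is the inverse square root of the test information, and each item contributes p(1-p) to that information, which is nearly zero for off-target items. A sketch with hypothetical calibrations:

```python
import math

def p_success(measure, difficulty):
    """Rasch probability of success, measure and difficulty in logits."""
    return 1.0 / (1.0 + math.exp(difficulty - measure))

def standard_error(measure, difficulties):
    """Standard error of a Rasch measure: 1 / sqrt(test information),
    where each dichotomous item contributes p * (1 - p)."""
    information = sum(p_success(measure, d) * (1.0 - p_success(measure, d))
                      for d in difficulties)
    return 1.0 / math.sqrt(information)

# An instrument targeted at the middle of the scale (logits).
items = [-2.0, -1.0, 0.0, 1.0, 2.0]

central_se = standard_error(0.0, items)  # well targeted: smaller error
extreme_se = standard_error(4.0, items)  # poorly targeted: larger error
```

The same set of items measures a central case far more precisely than an extreme one, which is exactly the asymmetry that adaptive administration removes.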

If measurement projects were oriented toward the development of an item bank, however, these problems can be overcome. You might develop and calibrate dozens, hundreds, or thousands of items. The bank might be administered in such a way that the same sets of items are applied to different cases only rarely. To the extent that the basic research on the bank shows that the items all measure the same thing, so that different item subsets all give the same result in terms of resolving the location of the measure on the quantitative continuum, comparability is not compromised.

The big plus is that all cases can now be measured with the same degree of meaningfulness, precision, and confidence. We can administer the same number of items to every case, even the same number as in the one-size-fits-all design, but now the items are targeted at each individual, providing maximum information. But the quantitative properties are only half the story. Real measurement integrates qualitative meaningfulness with quantitative precision.

As illustrated in the description of the typically assumed one-size-fits-all scenario, we interpret the measures in terms of the item calibrations. In the one-size-fits-all design, very low and very high measures can be associated with consistent variation on only a few items, as there is no variation on most of the items, since they are too easy or hard for this case. And it might happen that even cases in the middle of the scale are found to have response probabilities of 1.00 and 0.00 for the items at the very bottom and top of the scale, respectively, further impairing the efficiency of the measurement process.

In the adaptive scenario, though, items are selected from the item bank via an algorithm that uses the expected response probabilities to target the respondent. Success on an easy item causes the algorithm to pick a harder item, and vice versa. In this way, the instrument is tailored for the individual case. This kind of mass customization can also be qualitatively based. Items that are irrelevant to the particular characteristics of an individual case can be excluded from consideration.
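A minimal sketch of such a selection algorithm, assuming a bank of dichotomous items calibrated in logits. The fixed-step update rule is a deliberate simplification for illustration; operational CAT systems re-estimate the measure after each response by maximum-likelihood or Bayesian methods, and apply stopping rules based on the standard error.

```python
def update(estimate, correct, step=0.5):
    """Crude fixed-step update: move the estimate up after a success,
    down after a failure. (A stand-in for maximum-likelihood estimation.)"""
    return estimate + step if correct else estimate - step

def run_cat(bank, respond, n_items=5):
    """Adaptively administer n_items from a bank of calibrated item
    difficulties (logits). `respond(difficulty)` returns True or False."""
    bank = sorted(bank)
    estimate = 0.0
    for _ in range(n_items):
        # Pick the remaining item best targeted at the current estimate,
        # i.e., with expected response probability closest to 0.50.
        item = min(bank, key=lambda d: abs(d - estimate))
        bank.remove(item)
        estimate = update(estimate, respond(item))
    return estimate

# A hypothetical deterministic respondent who succeeds on any item
# easier than 1.0 logits.
measure = run_cat([-3, -2, -1, 0, 1, 2, 3], respond=lambda d: d < 1.0)
```

Success pushes the algorithm toward harder items and failure toward easier ones, so the administered items cluster around the respondent’s own level instead of being spread over the whole bank.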

And adaptive designs do not necessarily have to be computerized, since respondents, examinees, and judges can be instructed to complete a given number of contiguous items in a sequence ordered by calibration values. This effects a kind of self-targeting that effectively reduces the number of overall items administered without the need for expensive investments in programming or hardware.

The literature on adaptive instrument administration is over 40 years old, and is quite technical and extensive. I’ve provided a sample of articles below, including some providing programming guidelines.

The concepts of item banking and adaptive administration of course are the technical mechanisms on which will be built metrological networks of instruments linked to reference standards. See previously posted blog entries here for more on metrology and traceability.

References

Association of Test Publishers. (2001, Fall). Benjamin D. Wright, Ph.D. honored with the Career Achievement Award in Computer-Based Testing. Test Publisher, 8(2). Retrieved 20 May 2009, from http://www.testpublishers.org/newsletter7.htm#Wright.

Bergstrom, B. A., Lunz, M. E., & Gershon, R. C. (1992). Altering the level of difficulty in computer adaptive testing. Applied Measurement in Education, 5(2), 137-149.

Choppin, B. (1968). An item bank using sample-free calibration. Nature, 219, 870-872.

Choppin, B. (1976). Recent developments in item banking. In D. N. M. DeGruitjer & L. J. van der Kamp (Eds.), Advances in Psychological and Educational Measurement (pp. 233-245). New York: Wiley.

Cook, K., O’Malley, K. J., & Roddey, T. S. (2005, October). Dynamic Assessment of Health Outcomes: Time to Let the CAT Out of the Bag? Health Services Research, 40(Suppl 1), 1694-1711.

Dijkers, M. P. (2003). A computer adaptive testing simulation applied to the FIM instrument motor component. Archives of Physical Medicine & Rehabilitation, 84(3), 384-93.

Halkitis, P. N. (1993). Computer adaptive testing algorithm. Rasch Measurement Transactions, 6(4), 254-255.

Linacre, J. M. (1999). Individualized testing in the classroom. In G. N. Masters & J. P. Keeves (Eds.), Advances in measurement in educational research and assessment (pp. 186-94). New York: Pergamon.

Linacre, J. M. (2000). Computer-adaptive testing: A methodology whose time has come. In S. Chae, U. Kang, E. Jeon & J. M. Linacre (Eds.), Development of Computerized Middle School Achievement Tests [in Korean] (MESA Research Memorandum No. 69). Seoul, South Korea: Komesa Press. Available in English at http://www.rasch.org/memo69.htm.

Linacre, J. M. (2006). Computer adaptive tests (CAT), standard errors, and stopping rules. Rasch Measurement Transactions, 20(2), 1062 [http://www.rasch.org/rmt/rmt202f.htm].

Lunz, M. E., & Bergstrom, B. A. (1991). Comparability of decision for computer adaptive and written examinations. Journal of Allied Health, 20(1), 15-23.

Lunz, M. E., & Bergstrom, B. A. (1994). An empirical study of computerized adaptive test administration conditions. Journal of Educational Measurement, 31(3), 251-263.

Lunz, M. E., & Bergstrom, B. A. (1995). Computerized adaptive testing: Tracking candidate response patterns. Journal of Educational Computing Research, 13(2), 151-162.


Creative Commons License
LivingCapitalMetrics Blog by William P. Fisher, Jr., Ph.D. is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Based on a work at livingcapitalmetrics.wordpress.com.
Permissions beyond the scope of this license may be available at http://www.livingcapitalmetrics.com.

Contrasting Network Communities: Transparent, Efficient, and Invested vs Not

November 30, 2009

Different networks and different communities have different amounts of social capital going for them. As Putnam (1993) originally described, some networks are organized hierarchically in a command-and-control structure. The top layers are occupied by the autocrats, nobility, or bosses who run the show, and rigid conformity is the price of getting by. Those in power can make or break anyone. Market transactions in this context are characterized by the thumb on the scale, the bribe, and the kickback. Everyone is watching out for themselves.

At the opposite extreme are horizontal networks characterized by altruism and a sense that doing what’s good for everyone will eventually come back around to be good for me. The ideal here is a republic in which the law rules and everyone has the same price of entry into the market.

What I’d like to focus on is what’s going on in these horizontal networks. What makes one a more tightly-knit community than another? The closeness people feel should not be oppressive, claustrophobic, or smothering. I’m thinking of community relations in which people feel safe, not just personally but creatively. How and when are diversity, dissent, and innovation not just tolerated but celebrated? What makes it possible for a market in new ideas and new ways of doing things to take off?

And how does a community like this differ from another one that is just as horizontally structured but that does not give rise to anything at all creative?

The answers to all of these questions seem to me to hinge on the transparency, efficiency, and volume of investments in the relationships making up the networks. What kinds of investments? All kinds: emotional, social, intellectual, financial, spiritual, etc. Relationships built on opaque, inefficient, and low-volume investments lack the thickness and complexity of those we can see through, that are well lubricated, and that are reinforced with frequent visits.

Putnam (1993, p. 183) has a very illuminating way of putting this: “The harmonies of a choral society illustrate how voluntary collaboration can create value that no individual, no matter how wealthy, no matter how wily, could produce alone.” Social capital is the coordination of thought and behavior that embodies trust, good will, and loyalty. Social capital is at play when an individual can rely on a thickly elaborated network of largely unknown others who provide clean water, nutritious food, effective public health practices (sanitation, restaurant inspections, and sewers), fire and police protection, a fair and just judiciary, electrical and information technology, affordably priced consumer goods, medical care, and who ensure the future by educating the next generation.

Life would be incredibly difficult if we could not trust others to obey traffic laws, or to do their jobs without taking unfair advantage of access to special knowledge (credit card numbers, cash, inside information). But beyond that, we gain huge efficiencies in our lives because of the way our thoughts and behaviors are harmonized and coordinated on mass scales. We simply do not have to worry about millions of things that are being taken care of, things that would completely freeze us in our tracks if they weren’t being done.

Thus, later on the same page, Putnam also observes that, “For political stability, for government effectiveness, and even for economic progress social capital may be even more important than physical or human capital.” And so, he says, “Where norms and networks of civic engagement are lacking, the outlook for collective action appears bleak.”

But what if two communities have identical norms and networks, but differ in one crucial way: one relies on everyday language, used in conversations and written messages, to get things done, while the other has a new language, one with a heightened capacity for transparent meaningfulness and precise efficiency? Which one is likely to be more creative and innovative?

The question can be re-expressed in terms of Gladwell’s (2000) sense of the factors contributing to reaching a tipping point: the mavens, connectors, salespeople, and the stickiness of the messages. What if the mavens in two communities are equally knowledgeable, the connectors just as interconnected, and the salespeople just as persuasive, but messages are dramatically less sticky in one community than in the other? In one network of networks, saying something once gets the right response 99% of the time; in the other, things have to be repeated seven times before the right response comes back even 50% of the time, and hardly anyone makes the effort to repeat things that many times. Guess which community will be safer, more creative, and thriving?
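The arithmetic behind this thought experiment can be put on paper. Here is a minimal back-of-envelope sketch: the 99%, 50%, and seven-repeat figures are the hypothetical ones above, and treating each attempt at communication as an independent trial is my own simplifying assumption, not anything in Gladwell.

```python
# Back-of-envelope comparison of communication cost in two communities
# that differ only in message "stickiness". All figures are the
# hypothetical ones from the text, not empirical data.

def expected_sends(p_right_response: float, repeats_per_attempt: int = 1) -> float:
    """Expected number of individual messages sent before the right
    response comes back, modeling each attempt as an independent
    trial that succeeds with probability p_right_response."""
    return repeats_per_attempt / p_right_response

# Community A: one message gets the right response 99% of the time.
cost_a = expected_sends(0.99)                         # ~1.01 messages

# Community B: each attempt means repeating the message seven times,
# and even then the right response comes back only 50% of the time.
cost_b = expected_sends(0.50, repeats_per_attempt=7)  # 14 messages

print(f"Community A: {cost_a:.2f} messages per coordinated action")
print(f"Community B: {cost_b:.2f} messages per coordinated action")
print(f"Relative communication cost: {cost_b / cost_a:.1f}x")
```

Even this crude model suggests an order-of-magnitude gap in the cost of coordinating anything, before accounting for the attempts that are simply abandoned.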

All of this, of course, is just another way to bring out the importance of improved measurement for improving network quality and community life. As Surowiecki put it in The Wisdom of Crowds, the SARS virus was sequenced in a matter of weeks by a network of labs sharing common technical standards; without those standards, it would have taken any one of them far longer to do the same job alone. The messages these labs sent back and forth had an elevated stickiness index because they were more transparently and efficiently codified than messages were before the technical standards existed.

So the question emerges: given the means to create common languages with enhanced stickiness properties, such as we have in advanced measurement models, what kinds of creativity and innovation can we expect when these languages are introduced in the domains of human, social, and natural capital markets? That is the question of the age, it seems to me…

Gladwell, M. (2000). The tipping point: How little things can make a big difference. Boston: Little, Brown, and Company.

Putnam, R. D. (1993). Making democracy work: Civic traditions in modern Italy. Princeton, New Jersey: Princeton University Press.

Surowiecki, J. (2004). The wisdom of crowds: Why the many are smarter than the few and how collective wisdom shapes business, economies, societies and nations. New York: Doubleday.
