Robert Philen's Blog: statistics

Thursday, February 7, 2008

Central Tendency Measurement and Non-Enumerative Data

As I suggested in my previous post, statistical measures and concepts are one set of analytical tools that can be useful for a variety of research purposes. This can even be true with regard to research on phenomena that, while quantifiable (all phenomena have quantity), are difficult or impossible to measure in a highly enumerated fashion. (Take the example of kinship. One could measure the presence or absence of matrilineality. One could count up the number of households or family groups practicing matrilineality in a given community. One could assess in rough terms whether filiation is strongly or weakly matrilineal. It’s difficult to imagine how one would precisely measure matrilineality on a numerical scale, though.)

One important statistical concept is that of central tendency, and central tendency measures can be usefully applied to a variety of quantities, including some non-enumerable entities.

For example, in his textbook Traditional Cultures, Glenn King uses the notion of modal patterns as a central measure of broad cultural patterns for a variety of world areas. This is not a “normative” approach to the representation of cultures and culture areas in the sense of presenting universal patterns that inevitably essentialize and homogenize the areas in question. Instead, King is careful to point out that identifying a modal pattern simply means identifying, for any particular component of culture, the pattern that is more common than any other within the spatial frame of reference at hand, and that, almost by definition, to speak of modal patterns is to recognize that there will be exceptions, perhaps copious exceptions, to the identified central tendency.

The mode is a particularly useful central tendency measure for phenomena that are hard or impossible to enumerate. Take kinship again. One could say (and King’s textbook does) that among Eastern Native North Americans prior to European contact, matrilineality was the modal pattern, and that’s a useful piece of information. On the other hand, with this and much other information anthropologists are interested in, I’m not sure how one would usefully apply other central tendency measures – so I’m definitely not arguing for over-statisticalization of the discipline. For example, what would a mean or median kinship system be? (I suppose one could take possible rough measures of degree of filiation, rank them on an arbitrary scale, e.g. 1 = strong patrilineal filiation, 2 = weak patrilineal filiation, 3 = bilateral or bilineal filiation, 4 = weak matrilineal filiation, 5 = strong matrilineal filiation, and compute mean or median tendencies on that basis, but that strikes me as exceedingly artificial and I’m at a loss to imagine the use for such figures.)
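To make the contrast concrete, here is a minimal sketch in Python. The observations are invented purely for illustration (they are not drawn from King's textbook or any real survey), and the 1 to 5 coding is the arbitrary scale described above. The mode can be read directly off the category labels, while a median only becomes available once the labels are forced onto a numeric scale.

```python
from collections import Counter
from statistics import median, multimode

# Hypothetical observations: one filiation label per community (invented data).
observations = [
    "strong matrilineal", "strong matrilineal", "weak matrilineal",
    "bilateral", "strong matrilineal", "weak patrilineal",
    "weak matrilineal",
]

# The mode needs nothing more than the labels themselves.
counts = Counter(observations)
modal_patterns = multimode(observations)  # every label tied for most common

# A median requires imposing an order, here the arbitrary 1-5 scale.
scale = {
    "strong patrilineal": 1,
    "weak patrilineal": 2,
    "bilateral": 3,
    "weak matrilineal": 4,
    "strong matrilineal": 5,
}
codes = [scale[label] for label in observations]
median_code = median(codes)

print("counts:", dict(counts))
print("modal pattern(s):", modal_patterns)
print("median on the arbitrary scale:", median_code)
```

The modal result is a statement about the communities themselves. The median is only as meaningful as the scale that was imposed to obtain it, which is the artificiality at issue.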

Even in cases where statistical concepts and measures (whether in basic terms as I’ve been discussing or through the use of more complex analyses and tests) are useful, scholarship remains simultaneously intrinsically qualitative.

To assess modal tendencies is to first define what entities are to be assessed as present or not and counted. With something like kinship, different tendencies could potentially be measured depending on whether one focused on individuals, households, or families (with those last two needing careful definition in research planning and interpretation as well). To create a hypothetical situation, I could imagine that many Iroquois communities experienced transformations in the early 19th century, through the influence of things like religious conversion and revitalization, inter-marriage with Anglos, the encroachment of white settlers, etc., where within communities there may have been co-presence of many small bilaterally-trending neolocal households alongside a small number of large matrilineal matrilocal households. In some communities at certain points in time, there may have been no clear modal pattern – or rather multiple modal patterns might have co-existed. For example, the modal household may have been small and neolocal, while the modal individual may have lived in a large matrilocal household. For such a purely hypothetical context, both would be important measures that would depend on attention to qualitative details in order to be assessable.
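The difference between the modal household and the modal individual can be sketched with a few lines of Python. The household counts and sizes below are invented for the sake of the hypothetical and carry no historical weight. The only thing that changes between the two tallies is the unit being counted.

```python
from collections import Counter

# Hypothetical community: (residence pattern, number of residents) per household.
# Twelve small neolocal households and three large matrilocal ones (invented figures).
households = [("neolocal", 4)] * 12 + [("matrilocal", 25)] * 3

# Unit of analysis 1: each household counts once.
by_household = Counter(pattern for pattern, _ in households)

# Unit of analysis 2: each resident counts once.
by_individual = Counter()
for pattern, size in households:
    by_individual[pattern] += size

modal_household = by_household.most_common(1)[0][0]
modal_individual = by_individual.most_common(1)[0][0]

print("households per pattern:", dict(by_household))
print("individuals per pattern:", dict(by_individual))
print("modal household:", modal_household)
print("modal individual lives in a household that is:", modal_individual)
```

With these made-up numbers, most households are neolocal while most people live in matrilocal households. Neither tally is wrong; which one matters depends on a qualitative decision about what is being counted.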

Lastly, I am arguing for transcendence of the false qualitative/quantitative divide in social science and humanities research. I’m also arguing that, as part of this, statistical concepts and analysis can provide one set of tools for many research purposes, including with data that are not particularly amenable to enumeration.

I’m not arguing at all that statistics are the answer to everything. As with any task, the proper analytical tools to use depend on the task at hand. Sometimes statistics are the wrong tool, and sometimes they’re overkill.

Tuesday, February 5, 2008

Statistics and Lies

I was recently having a discussion with a group of students, specifically about Marvin Harris’ discussion of the importance of statements of co-variance and his call for a more statistically oriented anthropology in The Rise of Anthropological Theory (affectionately – or disaffectionately – referred to as The RAT during my time as a master’s student at the University of Georgia).

One student objected that “Statistics are basically just lies.”

I was a bit taken aback by this.

Statistics can be used to mislead or distort things. For example, it’s fairly common to encounter figures on median income for U.S. households in the mainstream mass media. There’s no particular reason to doubt the accuracy of such figures in most cases, but one could begin to wonder why reportage of mean household income is much less common, and why the two central tendency measures are so rarely seen together. But statistics per se aren’t lies.
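A small sketch in Python shows why seeing the two measures side by side can be informative. The incomes are invented round numbers, not real U.S. figures; the point is only that one very large value pulls the mean far above the median while leaving the median untouched.

```python
from statistics import mean, median

# Hypothetical household incomes in dollars (invented, not real data).
incomes = [22_000, 35_000, 48_000, 52_000, 61_000, 75_000, 1_200_000]

median_income = median(incomes)  # the middle value of the sorted list
mean_income = mean(incomes)      # the total divided by the number of households

print(f"median household income: {median_income:,.0f}")
print(f"mean household income:   {mean_income:,.0f}")
```

Each figure is accurate on its own terms. Reported together, the gap between them says something about how unevenly income is distributed, which neither conveys alone.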

Statistics involves a set of analytical tools and ways of thinking about sets of data. As with any other tool, statistics can be misused. But saying that statistics are lies because they can be used to lie strikes me as a bit like saying that words are inherently lies because words are used to lie. (There are some who think that – but they’re lying.)

Still, there is a real and strong distrust of statistics among many cultural anthropologists and scholars in the humanities disciplines. This seems to me to derive from the now old (and tired) divide between “quantitative” and “qualitative” scholarship and the strong mutual distrust that has permeated that divide.

I’ve written before that this is a false divide. There is no non-quantitative research. All scholarship involves an awareness of quantity, whether in the binary mathematics of presence/absence; rough quantification along the lines of something being present in small or large amount, or happening frequently, continuously, or infrequently; or the highly enumerated quantification of precise counting. There is no non-qualitative research. All scholarship involves choice of what to pay attention to, count, etc.

Moreover, the emphasis on the qualitative/quantitative labels tends to obscure what all good scholarship shares in common, which is measurement and interpretation (see “Measurement and Interpretation”). If one moves past the qual/quant divide (the sort of attitude of “I’m not the sort of scholar who does statistics” or “I’m not the sort who pays attention to anything that can’t be quantified” [by which most mean enumeration, because again, there’s nothing that’s without quantity]) then a whole range of analytical tools and ways of thinking are opened up as possibilities, to be deployed as best fits the research question at hand rather than as best fits an ideological commitment to being “qualitative” or “quantitative.”
