[Aug. 30th, 2006|10:22 pm]
So after seeing Nick's CHI'06 Geographical Distribution, I got interested in other statistical work with the CHI archive. (Rant: fer chrissake, ACM, could you make things available in any way other than raw HTML? Even some bloody structured HTML would be a step up. I shouldn't be searching for strong tags in an attempt to find names. Get your act together and have some decent CSS classes. Rant over.)
One thing I've wondered about for a while is gender. Is HCI becoming a more evenly mixed conference? Is it an old boys club and going to stay that way? What about all the awards? This year, Judith Olson was the first woman to win the Lifetime Achievement Award, jointly with Gary Olson; the other eight recipients are all men. There are 32 male members of the CHI Academy, and 5 women. But for the Distinguished Service Award... six men, five women. Interesting, eh? What's going on there? Now, let's face it, computing in general has been male dominated for a long time, and a lot of men have done a lot of very good work, and I do not deny in any way that there are good researchers on these lists who have every right in the world to be there. (But I remember hearing the graduate students of one very respected European female researcher suggesting that they buy her an honorary penis so that she'd be eligible for the Academy rather than the Service award...) So what is the gender breakdown of CHI publications, anyway, as a baseline measure of CHI involvement?
Have a look at this:
Now, one big problem is that I don't have a good list of names by gender. I looked around for a public domain name list on the web, and couldn't find one.
So I went to the US Census Website and took their top 1000 names by gender for the last six decades, and put those all into a file with some weighting based on the number of times they showed up (e.g. 'George' was on the list for women for a while back in the day.) Results are here in no particular order: they're of the form name #ofinstancesofmalename #ofinstancesoffemalename. Then I wrote some code to go through the screen-scraped lists of CHI'xx papers, dig out the names, and figure out if they're male, female, just initials, or if we don't know which they are. Raw spreadsheet data here. As you'll note from above, I'm differentiating between being unable to tell because there are only initials in the database (as in three years in the eighties), and being unable to tell because it's a name the list doesn't know, like "Hiroshi".
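For the curious, the classification step could be sketched roughly like this in Python. This is not the code I actually ran, just a minimal illustration of the idea: the function names (`load_name_counts`, `classify`, `tally`), the majority-vote rule, the rule for spotting initials, and the counts in the usage example are all assumptions made up for the sketch.

```python
import re
from collections import Counter

# Assumption: "initials" means a lone letter ("J") or dotted letters ("J.", "J.R.").
INITIALS = re.compile(r"^(?:[A-Za-z]\.)+$|^[A-Za-z]$")


def load_name_counts(lines):
    """Parse lines of the form 'name male_count female_count' into a dict."""
    counts = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        name, male, female = parts
        counts[name.lower()] = (int(male), int(female))
    return counts


def classify(first_name, counts):
    """Return 'male', 'female', 'initials', or 'unknown' for one first name."""
    token = first_name.strip()
    if not token:
        return "unknown"
    if INITIALS.match(token):
        return "initials"
    male, female = counts.get(token.lower(), (0, 0))
    if male == female:
        # Either not in the list at all (0, 0) or a dead tie: can't say.
        return "unknown"
    # Assumption: a simple majority vote on the weighted counts.
    return "male" if male > female else "female"


def tally(author_names, counts):
    """Count the categories over full author names, using the first token."""
    result = Counter()
    for full_name in author_names:
        tokens = full_name.split()
        first = tokens[0] if tokens else ""
        result[classify(first, counts)] += 1
    return result
```

Usage would look something like this (the counts are invented for illustration, not real census figures):

```python
counts = load_name_counts(["George 950 12", "Judith 0 800"])
print(tally(["Judith Olson", "G. Olson", "Hiroshi Example"], counts))
```

Keeping "initials" and "unknown" as separate buckets is the point: one is a gap in the archive, the other is a gap in the name list.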
And therein lies the rub. A quick look at the above picture will show you my problem: I just don't have a good enough list of gendered names. I think you can surmise that what's happening is an *increase* in cultural diversity: as time goes on we see more non-US names that don't show up in the US census, and so I can't place those. Does anyone know of a good source for a list of gendered names? I've found a few commercial packages googling around, but nothing that I can get for free.
This is still, clearly, in its early stages, and I don't think there's anything much to say from its current state. I do think it's an interesting thing to look at, however, and we can only benefit from thinking about it.
ps. Disclaimer: There are some basic assumptions about binary gender that are problematic on a deeper look, but that's not really the focus here.
pps. I have such a major cold right now it's horrible. I went home and slept for a few hours at 4pm today. This is the sort of relaxing I like to do when I'm sick. Twisted, eh?
ppps. See updated statistics here