Gender
A key element of our proposed work involves classifying the gender of each editorial board member whose name we gather from the internet. Classifying gender from names alone is known to be a difficult computational task, and many cases require human judgment. For these hard-to-determine names, we will use Amazon Mechanical Turk (MTurk) (described in more detail in this lab note).
Our use of MTurk involves asking other people ("workers") to state the gender associated with a given editor based on the editor's name and on other information that might be found on the internet, including text and images referring to the editor. Workers are also asked to indicate their degree of certainty in their guess. Finally, each editorial board member's gender will be guessed by multiple workers to validate the data. In keeping with best practices and modern understanding, workers may assign an "alternate identification" if they feel the male/female binary choice does not apply.
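One way the multiple worker responses could be combined is a certainty-weighted vote. The sketch below is only an illustration under assumptions of ours: the label names, the 0-to-1 certainty scale, the `aggregate_votes` function, and the `min_share` agreement threshold are all hypothetical and are not a description of the procedure the project will actually use.

```python
from collections import defaultdict

# Hypothetical label set: the binary choices plus the "alternate identification" option.
LABELS = ("female", "male", "alternate")


def aggregate_votes(votes, min_share=0.6):
    """Combine worker guesses for one editor into a single label.

    votes: list of (label, certainty) pairs, with certainty assumed in (0, 1].
    min_share: assumed minimum share of total certainty weight the leading
        label must reach; below it the editor is left unresolved.
    Returns (label or None, share of weight held by the leading label).
    """
    totals = defaultdict(float)
    for label, certainty in votes:
        if label not in LABELS:
            raise ValueError(f"unknown label: {label!r}")
        totals[label] += certainty  # each guess counts in proportion to its certainty

    total_weight = sum(totals.values())
    if total_weight == 0:
        return None, 0.0  # no usable responses

    best = max(totals, key=totals.get)
    share = totals[best] / total_weight
    if share < min_share:
        return None, share  # workers disagree too much; flag for further review
    return best, share


if __name__ == "__main__":
    # Three workers, two fairly certain and one unsure.
    print(aggregate_votes([("female", 0.9), ("female", 0.8), ("male", 0.3)]))
```

Returning no label when agreement is low keeps uncertain cases out of the final counts instead of forcing a guess, which fits the aim of using multiple workers as a validation step.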
Most importantly, we recognize that an individual's gender may not fit neatly into traditionally used categories, and that gender is most appropriately expressed and explained by that individual. Still, we must have gender data in order to begin quantifying the representation of women. Also, our crowdsourced approach will result in a more complete data set than would an attempted survey of many thousands of board members, which we anticipate would have low yield.