thesis proposal (from the ITP thesis blog)



TITLE Hate Tapestries

ELEVATOR PITCH A series of embroidered tapestries visualizing the usage and contexts of specific hateful words on the Internet. Each tapestry is representative of a single word used against one demographic group over a set period of time.

DESCRIPTION I plan to scrape specific words that are universally recognized as "hate speech" from the Internet. I will perform a keyword search on specific websites for words associated with four demographic groups (right now I'm planning to use women, the LGBT community, African-Americans and Muslims/Muslim-Americans) over a set period of time (24 hours or one week). I will then use natural language processing in Python to place each usage of the specified word on a spectrum ranging from "benign/re-appropriation" to "extremely hateful". From there, I will use Processing to assign each usage a color based on where it falls on the spectrum (each demographic group will have a hue associated with it, and I will vary the saturation and brightness), create a pattern that equates one pixel with one instance of the word, and cross-stitch a tapestry that visualizes all of the uses in a physical form. Each tapestry will ideally be 16" x 20" (but this may change as I begin the visualizations and experiment with different aspect ratios), and I hope to make 4 of them by the end of the semester (one for each group). However, I can envision this as an ongoing project where I scrape data every week for two or three months, maybe longer, to enable viewers to see the discrepancies and changes in usage over a longer period of time. I can imagine a larger set being displayed in a gallery, or during some kind of fundraising event benefiting one or more of the affected groups.
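
As a rough illustration of the color and pattern step, here is a minimal sketch in Python (the final version is planned for Processing). It assumes each usage already has a score between 0 ("benign/re-appropriation") and 1 ("extremely hateful"). The group names, hue values, and the exact saturation and brightness ranges are placeholders of mine, not decisions made in this proposal.

```python
import colorsys
import math

# Hypothetical hue per group (0.0-1.0 around the color wheel); placeholder values.
GROUP_HUES = {"group_a": 0.0, "group_b": 0.58}


def score_to_rgb(group, score):
    """Map a score in [0, 1] to an (r, g, b) tuple of 0-255 integers.

    The hue is fixed per group; saturation rises and brightness falls
    as the score moves toward "extremely hateful" (an assumed mapping).
    """
    score = min(max(score, 0.0), 1.0)
    hue = GROUP_HUES[group]
    saturation = 0.2 + 0.8 * score
    brightness = 1.0 - 0.5 * score
    r, g, b = colorsys.hsv_to_rgb(hue, saturation, brightness)
    return (round(r * 255), round(g * 255), round(b * 255))


def grid_size(n, aspect_w=16, aspect_h=20):
    """Return (cols, rows) for a grid of roughly aspect_w:aspect_h holding n stitches."""
    cols = max(1, math.ceil(math.sqrt(n * aspect_w / aspect_h)))
    rows = math.ceil(n / cols)
    return cols, rows


def layout(group, scores):
    """Turn a list of scores into rows of colors: one pixel (stitch) per instance."""
    cols, rows = grid_size(len(scores))
    colors = [score_to_rgb(group, s) for s in scores]
    return [colors[r * cols:(r + 1) * cols] for r in range(rows)]
```

For example, `grid_size(320)` gives a 16-column by 20-row pattern, which matches the 16" x 20" proportions if each stitch is square. The last row may be shorter than the others when the number of instances does not fill the grid.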

WHY YOU WANT TO MAKE THIS I'm extremely interested in the immense power of language to dehumanize and ostracize entire groups of people based on race, gender, physical ability, and other cultural or ethnic criteria. I am also interested in the way that hate speech is used on the Internet, particularly in the way that these words can be completely stripped of their extremely historically loaded meanings and used almost as something to add flavor to a sentence, while simultaneously being used in their original contexts and loaded with pointed, vitriolic rage (the most recent and popular example being the expletive-filled tweets and posts referring to President Obama as the "n-word" after his re-election and during his second inauguration). In both cases, it's so much easier to use this type of language online than it is in real life; words that you would never dream of saying to someone's face are all of a sudden thrown around on the Internet without a second thought, because there are no real consequences to the action. I'm concerned that as more and more people essentially grow up on the Internet, the boundaries will begin to blur, and using hate speech and disrespectful language in real life will become as accepted and commonplace as it seems to have become online.

I want to examine how the use of hateful language is changing as the Internet increasingly becomes a part of the way we live our lives. This project will serve as a snapshot of hate speech on the Internet today and as a potential first step into broader work on this issue.

I want to create tapestries as the final output for three reasons. One, embroidery (and cross-stitch specifically) lends itself really well to data representation (each instance of a word can be represented by one pixel, which translates into one stitch); two, I see the analog handicraft form as a nod to the fact that hate speech is a centuries-old "tradition" that existed long before the dawn of the Internet; and three, I envision the final pieces as artifacts that can be given to someone who has felt stigmatized by the use of the represented word, so that the person now has a beautiful, visually interesting object that was born out of something personally detrimental and negative. To that end, beyond the scope of my ITP thesis, as I make more of these tapestries I can see making them available online; a further step could be to run my text scraping script continuously for a long period of time, and then make tapestries for certain weeks based on requests from people who would like to own one of the pieces.

RESEARCH PLAN I'm currently reading about the history of the social usage of hate speech, as well as some legal background about its association with free speech, the First Amendment, and usage of hate speech on the Internet (legally oriented books by Samuel Walker and Michael Herz/Peter Molnar, articles from various law reviews, and the work of organizations like Hatewatch and Partners Against Hate). Over the course of the next two to three weeks, I plan to do extensive research into blog posts, tweets, message boards, and forums to determine context words for explicitly hateful usage. I can then use these to create a dictionary that will evaluate how the words from my searches are being used and place them on a spectrum appropriately: the more context words appear, or the more severe or profane they are, the more hateful the usage is. I will do this while writing the Python code to scrape and collect the data, so that I can test and evaluate this process continuously. By the midterm presentation, I hope to have one screen-based or print visualization completed that I can show as an example. The rest of the semester will be spent on creating the final visualizations and stitching the tapestries. My plan is to have at least 4 completed tapestries by Thesis Week.
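
The dictionary idea above could be sketched roughly as follows. This is only an assumed starting point: the context words, their severity weights, and the `saturation_point` cutoff are placeholders I made up for illustration, and a plain word count like this cannot recognize re-appropriation or sarcasm on its own.

```python
import re

# Hypothetical context dictionary: context word -> severity weight.
# Real entries would come out of the research described above.
CONTEXT_WEIGHTS = {"hate": 2.0, "disgusting": 1.5, "stupid": 1.0}


def hatefulness(text, weights=CONTEXT_WEIGHTS, saturation_point=5.0):
    """Score a post from 0.0 (benign) to 1.0 (extremely hateful).

    Each context word adds its weight, so more context words or more
    severe ones push the score up. Scores are capped at 1.0 once the
    total weight reaches saturation_point (an assumed cutoff).
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    raw = sum(weights.get(token, 0.0) for token in tokens)
    return min(raw / saturation_point, 1.0)
```

A post containing "hate" and "stupid" would total a weight of 3.0 and score 0.6 with these placeholder values, while a post with no context words scores 0.0.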

I'm planning on scraping text data from Twitter, Reddit, and the New York Times comments; however, if, in the next week, I find a better source with an accessible comment search API, I will replace one of these sites (most likely the Times).
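
Whichever sources I end up using, the collected posts will need to be filtered down to individual instances of a word within the chosen time period. The sketch below assumes the posts have already been gathered into simple dictionaries with `text`, `source`, and a timezone-aware `created_at`; that record shape and the function name are my own placeholders, and no site's API is called here.

```python
import re
from datetime import timedelta


def matches_in_window(posts, keyword, start, window=timedelta(hours=24)):
    """Return one record per instance of keyword posted in [start, start + window).

    posts is an iterable of dicts with 'text', 'source', and 'created_at'
    keys (an assumed format). A post that uses the word twice yields two
    records, since each instance becomes one stitch.
    """
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    end = start + window
    hits = []
    for post in posts:
        if not (start <= post["created_at"] < end):
            continue
        for _ in pattern.finditer(post["text"]):
            hits.append({
                "source": post["source"],
                "created_at": post["created_at"],
                "text": post["text"],
            })
    return hits
```

Passing `window=timedelta(weeks=1)` switches from the 24-hour period to the one-week period.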

Roopa Vasudevan