This is the schedule for Advanced Seminar in Computational Linguistics: Computational Social Science, Fall 2015.
THIS SCHEDULE IS A WORK IN PROGRESS!
In addition, some topic areas may take longer than expected, so keep an eye on the class mailing list or e-mail me for "official" dates.
Also note that some links point to pay-for-access publishers, but the links are accessible for free from UMD IP addresses.
I've organized this week's readings into two groups: readings that we can use to structure the discussion and readings that will inform the discussion. There are a lot of readings listed in the latter group, but none of them have any technical complexity (not even the emotional contagion paper) and most of them are quite short, so please at least look them over.
Readings to structure the discussion
Readings that should also inform the discussion
With regard to computational social science, the Supreme Court is a great area of study. There are large, available sources of data that include voluminous language of various types (e.g. merits briefs, amicus briefs, majority and minority opinions, oral arguments), along with tons of metadata (vote data, age, party of the appointing president, etc.).
And from the perspective of this seminar, the Supreme Court is a fascinating area of study because in this setting the connection between language and mental state is paramount. What can language tell us about the underlying opinion or ideology of justices or the people arguing the case? To what extent do we see linguistic evidence of influence or persuasion? What approach do opposing sides take to framing the same issue? How might power relationships be reflected in language use?
Appetizers
Main course
Some additional notes on ideal point models
For those who are helped a lot by understanding the intuitive basis for the model, the first section of Sim et al. (2015) has a nice summary of previous ideas (see next week's readings). See also Section 1 of Bafumi et al. http://www.stat.columbia.edu/~gelman/research/published/171.pdf, which has a nice explanation of what this kind of model is doing.
For those who want to see the mathematical discussion in a little more detail, Clinton et al. http://politics.as.nyu.edu/docs/IO/4756/jackman_nemp.pdf (Section 3) fleshes out the discussion in Martin and Quinn Section 3.1 where they formalize the justice's decision process. See also Clinton et al. (2004), http://www.cs.princeton.edu/courses/archive/fall09/cos597A/papers/ClintonJackmanRivers2004.pdf.
Ideal point models in political science are related to item response theory (IRT), which is discussed in the educational assessment literature: in politics, the probability of a yes vote as a function of ideological position is analogous to the probability of giving a correct answer on a test as a function of your ability. There is a nice discussion of IRT in Partchev (2004), https://www.metheval.uni-jena.de/irt/VisualIRT.pdf; see in particular the 2PL model (Section 5). Slides 22-23 at http://jonathantemplin.com/files/irt/irt11icpsr/irt11icpsr_lecture14.pdf derive the form of the model we're looking at from the 2PL IRT model.
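To make the IRT analogy concrete, here is a minimal sketch, assuming the standard logistic form of the 2PL model and a slope-intercept parameterization of the vote model. The function and parameter names (two_pl, ideal_point_vote, irt_to_ideal_point, beta, alpha) are my own, chosen for illustration. The papers above may use a probit link rather than a logistic one, so treat this as a picture of the correspondence, not as any particular paper's model.

```python
import math


def two_pl(theta, a, b):
    """2PL IRT: probability of a correct answer, given test-taker
    ability theta, item discrimination a, and item difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def ideal_point_vote(x, beta, alpha):
    """Logistic vote model (assumed form): probability of a yes vote,
    given ideal point x, case slope beta, and case intercept alpha."""
    return 1.0 / (1.0 + math.exp(-(beta * x - alpha)))


def irt_to_ideal_point(a, b):
    """Reparameterize 2PL item parameters as vote-model parameters.
    Since a * (theta - b) = a * theta - a * b, take beta = a, alpha = a * b."""
    return a, a * b


# Same probability under both parameterizations (illustrative numbers only).
a, b, position = 1.2, -0.3, 0.5
beta, alpha = irt_to_ideal_point(a, b)
p_irt = two_pl(position, a, b)
p_vote = ideal_point_vote(position, beta, alpha)
```

In this reading, ability plays the role of the ideal point, discrimination says how sharply a case separates justices along the dimension, and difficulty is the position at which a yes vote has probability 0.5.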
Data resources
We have access to pretty much anything one could ask for with regard to the Supreme Court: judge-case-level metadata, case-level metadata, processed opinion content, merits briefs, amicus briefs, transcripts of oral arguments. Here are a few useful links.
More generally, there are lots of really interesting sources out there for code and data. One nice compendium I've found is from Bicoastal Datafest: analyzing money's influence on politics, which includes a nice list of well defined project ideas as well as pointers to projects that were done, along with great lists of data and tools.
Framing, on the other hand, is not about what gets talked about but how; Entman (1993) writes that "framing essentially involves selection and salience. To frame is to select some aspects of a perceived reality and make them more salient in a communicating text, in such a way as to promote a particular problem definition, causal interpretation, moral evaluation, and/or treatment recommendation for the item described."
There is a truly extensive literature on both of these topics -- enough for an entire course in itself. This week we will get familiar with one widely cited discussion, Scheufele's article considering these concepts from the perspective of cognitive effects. Then we'll look at two computational papers related to these concepts. Related to agenda setting, we discuss Leskovec et al., which was a very innovative development in understanding how "memes" get on the radar in news and blogs. Related to framing, we cover Nguyen et al., who develop a model extending the idea of using topic models to define ideal point dimensions (cf. Lauderdale and Clark, last week) to a hierarchical topic model inspired by the treatment of framing as second-level agenda setting.
Readings
Also of interest
Framing
Also of interest
Readings