Data Archive

Guidelines for Using Four Data Sets
Congress, the Press, and Political Accountability
R. Douglas Arnold


This is a guide to using the four data sets that I collected for Congress, the Press, and Political Accountability (Princeton University Press and the Russell Sage Foundation, spring 2004).

The first data set shows how a random sample of twenty-five local newspapers covered a random sample of twenty-five representatives for nearly two years. The data set, which consists of 8,003 articles, contains every news story, editorial, opinion column, letter, and list in a local newspaper that mentioned a local representative between January 1, 1993, and November 8, 1994.

The second data set shows how twelve newspapers - a random sample of six newspapers and six competing papers from the same cities - covered a random sample of six representatives for nearly two years. The data set, which consists of 2,175 articles, contains every news story, editorial, opinion column, letter, and list that mentioned a local representative between January 1, 1993, and November 8, 1994.

The third data set includes information about the volume and timing of coverage for a large sample of newspapers and representatives. This data set shows how 67 local newspapers covered 187 representatives during 1993 and 1994, with a total of 242 representative/newspaper dyads. Each of the 61,084 citations records the headline, date, section, page, and byline, but not the full text. The citations allow one to analyze how the amount and timing of coverage depend on the newsworthiness of individual representatives, the competitiveness of elections, and the resources and constraints that face individual newspapers.
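As a rough illustration of the dyad structure described above, the following Python sketch rolls individual citations up into counts per representative/newspaper pair, and per pair and year. The records, names, and function names are made up for the example and are not the variable names used in the archive's files.

```python
from collections import Counter
from datetime import date

# Made-up citation records: (representative, newspaper, publication date).
# The real files use their own variable names; these are illustrative only.
citations = [
    ("Rep A", "Paper X", date(1993, 3, 1)),
    ("Rep A", "Paper X", date(1994, 10, 20)),
    ("Rep A", "Paper Y", date(1994, 10, 21)),
    ("Rep B", "Paper Y", date(1993, 6, 5)),
]


def count_by_dyad(records):
    """Volume of coverage: number of citations per representative/newspaper dyad."""
    return Counter((rep, paper) for rep, paper, _ in records)


def count_by_dyad_and_year(records):
    """Timing of coverage: number of citations per dyad in each calendar year."""
    return Counter((rep, paper, day.year) for rep, paper, day in records)


dyad_counts = count_by_dyad(citations)
yearly_counts = count_by_dyad_and_year(citations)
```

In this toy example one representative appears in two newspapers, which is why the number of dyads can exceed the number of representatives.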

The fourth data set is designed to determine whether the volume of newspaper coverage affected what citizens knew about their representatives. This data set was constructed by linking information about how extensively the 67 newspapers in the third data set covered particular representatives with information about citizens' knowledge of their local representatives, as recorded in the autumn 1994 survey conducted by the National Election Studies. The unit of analysis is the individual citizen. Added to the usual attitudinal data about each citizen is information about how a local newspaper covered that citizen's representative during 1993 and 1994. The original 1994 NES data set had 1,795 respondents. I have information about local newspaper coverage for 675 of these respondents.
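The kind of linkage described above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: it keys coverage on a hypothetical district label, uses made-up field names and counts, and drops respondents for whom no coverage information exists. The actual linking procedure is documented in the guide to assembling the fourth data set.

```python
from collections import namedtuple

# Hypothetical survey record; field names are illustrative, not NES variable names.
Respondent = namedtuple(
    "Respondent", ["respondent_id", "district", "knows_representative"]
)


def link_coverage(respondents, coverage_by_district):
    """Attach a coverage count to each respondent, one output row per citizen.

    Respondents whose district has no coverage information are left out.
    """
    linked = []
    for r in respondents:
        articles = coverage_by_district.get(r.district)
        if articles is None:
            continue  # no coverage information for this respondent
        linked.append({**r._asdict(), "articles": articles})
    return linked


respondents = [
    Respondent(1, "District 1", True),
    Respondent(2, "District 1", False),
    Respondent(3, "District 2", True),
]
coverage_by_district = {"District 1": 310}  # made-up article count

linked = link_coverage(respondents, coverage_by_district)
```

The unit of analysis stays the individual citizen, so two respondents with the same representative carry the same coverage value.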

The collection of these data sets was supported by the Russell Sage Foundation, the National Science Foundation (SBR-9422386; SES-0209609), Princeton University, the Earhart Foundation, the Dirksen Congressional Center, the Caterpillar Foundation, and the Goldsmith Awards Program at the Joan Shorenstein Center on the Press, Politics, and Public Policy at Harvard University.

Congress, the Press, and Political Accountability is copyrighted by the Russell Sage Foundation (©2004). Several paragraphs in the three documents that describe the procedures for assembling the various data sets were taken from that book and retain their copyright.

The following guides are in PDF format, created with Adobe Acrobat 5.0. The data sets are in SPSS format, created with SPSS 11.5. The data sets can be opened with SPSS, R, and many other statistical programs.

Comparison of Four Data Sets 
Lists of Newspapers and Representatives 
 Table A Newspapers and Representatives - First Data Set 
 Table B Newspapers and Representatives - Second Data Set 
 Table C Newspapers and Representatives - Third Data Set 
First and Second Data Sets 
 Procedures for Assembling the First and Second Data Sets 
 Coding Newspaper Articles in the First and Second Data Sets 
 Issue and Bill Codes in the First and Second Data Sets 
 Origin of Variables in the First and Second Data Sets 
 Values for Variables in the First and Second Data Sets 
 First Data Set (N = 8,003 articles) (SPSS file) 
 Second Data Set (N = 2,175 articles) (SPSS file) 
 First and Second Data Sets Combined (N = 9,125 articles) (SPSS file) 
Third Data Set 
 Procedures for Assembling the Third Data Set 
 Origin of Variables in the Third Data Set 
 Number of Citations in the Third Data Set 
 Third Data Set (N = 242 representative/newspaper dyads) (SPSS file) 
Fourth Data Set 
 Procedures for Assembling the Fourth Data Set 
 Origin of Variables in the Fourth Data Set 
 Variable Information for the Fourth Data Set 
 Fourth Data Set (N = 675 citizens) (SPSS file)