Data

Clean, Tidy Data Sets

The data sets below are ready to visualize and model. They are tidy. I filtered the data down to useful subsets. I selected and descriptively renamed the most important variables. I meaningfully reordered factors.

Name	Data Files	Codebook	Description
`gamson`	[`.csv`] [`.dta`] [`.rds`] [`.xlsx`]	[html]	From Warwick and Druckman (2006).
`health`	[`.csv`] [`.dta`] [`.rds`] [`.xlsx`]	[html]	From Barrilleaux and Rainey (2014).
`nominate`	[`.csv`] [`.dta`] [`.rds`] [`.xlsx`]	[html]	Ideology scores for members of the U.S. House.
`parties`	[`.csv`] [`.dta`] [`.rds`] [`.xlsx`]	[html]	From Clark and Golder (2006).
`state-legislators`	[`.csv`] [`.dta`] [`.rds`] [`.xlsx`]

Original Data Sources

For replication data sets, I recommend starting with the Dataverse archives for AJPS [web] and PSRM.
For really raw data for wrangling practice, I recommend (ordered from least to most difficult) Donald Trump’s tweets [GitHub], the World Bank’s World Development Indicators [GitHub], Google political ads data [web, Dropbox], or 10 million dyadic events [Dataverse].
For data on international politics, I recommend COW [web], DESTA [web], and Matt Fuhrmann’s data on nuclear weapons [web].
For data on political institutions, I recommend Polity IV [web], DD [web], Freedom House [web], and DES [web].
For US state politics, I recommend the Correlates of State Policy data set [web], which combines variables from many projects in a single, enormous collection.
For data on legislator ideology, I recommend NOMINATE [web] and the American Legislatures Project [web].
For data on human rights, I recommend Human Rights Scores [web], PTS [web], CIRI [web], and ITT [web].
For survey data, I recommend the ANES [web], CCES [web], CSES [web], and the World Values Survey [web].
For data from randomized experiments, I recommend TESS [web].
Google now has a search for data sets.
Let’s go meta: PolData is a data set of data sets.