The table was originally going to stand alone, but I wanted to provide some sort of visual overview, and that led me to the tag plot. The plot shows tags for all the statements in the filtered set. Larger type and a higher position reflect greater frequency (please note that the y axis is set to a log-10 scale to make the lower half of the plot easier to read).
Color and left-right position show whether the tag appears more often with one party or the other. The x axis is based on a simple index. A value of -1 means the tag only appears in statements by Democrats (in the filtered set), and a value of 1 means the tag only appears in statements by Republicans. A value of zero means it’s an even split. Please note that I lumped both independents in Congress with the Democrats, because they caucus with that party (and because this shortcut saved me many headaches).
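The exact formula behind the index isn’t spelled out here, but one simple calculation that hits all three reference points (-1, 0, and 1) is the difference between the party counts divided by their total. The sketch below is written in Python for illustration; the function name `party_index` and its arguments are hypothetical, and the real index may be computed differently.

```python
def party_index(dem_count: int, rep_count: int) -> float:
    """Return a value between -1 and 1 for one tag.

    -1 means the tag appears only in Democratic statements,
     1 means it appears only in Republican statements,
     0 means an even split.

    Assumption: statements by independents are already counted in dem_count.
    """
    total = dem_count + rep_count
    if total == 0:
        raise ValueError("tag does not appear in the filtered set")
    return (rep_count - dem_count) / total


# A tag used 3 times by Democrats and once by Republicans leans Democratic.
print(party_index(3, 1))
```

Under this assumed formula, a tag used three times by Democrats and once by Republicans lands halfway between the center and the Democratic edge of the plot.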
The tag plot can get crowded very quickly, so I’ve provided a filter on the right that can limit the display to the top tags in the filtered set. The plot can also display dots instead of text, but this feature is still in development.
Much credit for the tag plot goes to Daniel McNichol, who thought about all the ways he was disappointed with word clouds and came up with a much better approach.
In the example above, the table is filtered on the tag “Abraham Lincoln.” We see that Democrats are mentioning Lincoln more often than Republicans, and are invoking him during discussions of voting, shutdowns, and Congress. Both parties have used Lincoln while discussing anti-Semitism, with Democrats doing so slightly more often. Only Republicans have invoked Lincoln while discussing abortion.
Please contact me with any other uses and ideas.
A home-brewed script, written in R and making extensive use of the Tidyverse and Rpart packages, examines each sentence and serves up the ones that are most likely to contain some historical content. I then remove the statements that are not concerned with using the past to advance policy or politics.
That’s a lot of whittling down, but there’s still a lot left over. Some of this is due to the political impulse to place the current moment in history or justify a current policy with a historical argument. And there’s also the fact that politicians repeat themselves. Trump and Vice President Pence repeat themselves a lot. And when members of Congress agree on a talking point, it becomes a drumbeat.
But I think it’s important to include them all. The repetition means that a particular group or party finds this use of the past important enough to hammer home.
On the other hand, I’m not including references to the past contained in the great many tributes, honorifics, and celebratory speeches that appear on the House and Senate floor. These are political, to an extent, but have little to do with policy. Including them would utterly flood the database. Similarly, I’ve excluded purely biographical and autobiographical statements. I’m also passing over casual uses of the past; when a speaker mentions that a certain event was “historic” but does not connect that to a broader interpretation or make the connection to politics or policy, I leave it out.
For example, when members of Congress rose to speak about the shooting in Las Vegas in 2017, each one made note that it was the worst mass shooting in US history. Few followed through to connect that historical statement to their policy objectives. Trump, on the other hand, frequently refers to his own presidency as historic and his actions as unprecedented. This could arguably be called “casual,” but it is more likely serving a distinct political purpose in maintaining his image as a new and unconventional force in Washington, and the policies aligned with that image.
Finally, I had to decide what time frame counts as the historical past. There’s no unassailable answer here, so I settled on an arbitrary but defensible border of roughly 20 years. So if it’s referenced by a politician and it happened in the 1990s or before, I’ve included it. It’s an admittedly porous line, but a line had to be drawn somewhere.
The tags are organized into broad categories. Selecting a category will retrieve all the tags in that grouping in the tag browser’s second search box.
For example, users interested in Latin America might select that broad category and then see the list of countries that appear in the dataset. More amorphous categories include “Government,” which covers impeachment, elections, and the census, and “Social,” which covers domestic issues and social groups.
It is not necessary to select a tag after selecting a category. If a user wants to see all statements under the umbrella “Latin America,” they should leave the “Tags” search box blank.
The search box above the table is a global search, and it is not literal. For instance, a search on “voting rights” will find either of those words, together or separated by other words, in any column. This is handy for searching the headline and the tags at the same time.
The searches above the speaker, party, and institution columns are limited to the contents of that column. Start searching for “green” in the speaker column and you will quickly be given a list of speakers with that word in their name (Reps. Al Green and Mark Green). Select one or more from the list to filter. Keep in mind that if you select only one person, the x axis in the tag plot won’t have much meaning.
In the tags and headline columns, you can use regular expressions. A few useful examples are below.
Retrieves statements tagged with either Founders or Jefferson
Retrieves statements tagged with both Quotes and Founders, in either order. (You can do this more simply in the global search box by typing “quotes founders,” but you’ll also be searching the other columns.)
Will match Africa but not African
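For readers who want to try these three behaviors outside the table, here is a rough sketch in Python. The patterns are standard regular-expression idioms that produce the results described, not necessarily the exact expressions intended for the search boxes. The tag strings in `rows` are made up for illustration, and the table's own search engine may treat case or syntax differently.

```python
import re

# Hypothetical tag strings, one per statement (not real rows from the dataset).
rows = [
    "Founders, Quotes",
    "Jefferson, Slavery",
    "Quotes, Abraham Lincoln, Founders",
    "Africa, Foreign Policy",
    "African Americans, Voting",
]

# Alternation: either Founders or Jefferson.
either = re.compile(r"Founders|Jefferson")

# Two lookaheads: both Quotes and Founders, in either order.
both = re.compile(r"^(?=.*Quotes)(?=.*Founders)")

# Word boundary: Africa, but not African.
whole_word = re.compile(r"Africa\b")


def matches(pattern, tag_rows):
    """Return the rows whose tag string matches the pattern anywhere."""
    return [row for row in tag_rows if pattern.search(row)]


for label, pattern in [("either", either), ("both", both), ("whole word", whole_word)]:
    print(label, matches(pattern, rows))
```

The lookahead version is the wordiest of the three, which is why the global search box is the easier route when searching the other columns too is acceptable.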