I am trying to load Wikipedia's data on US Supreme Court Justices into R:
library(rvest)
html = html("http://en.wikipedia.org/wiki/List_of_Justices_of_the
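Maybe like this: parse the table, then remove the span nodes in the second column (they repeat each name in "Last, First" form) and parse it again.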
library(XML)
library(rvest)
html = html("http://en.wikipedia.org/wiki/List_of_Justices_of_the_Supreme_Court_of_the_United_States")
judges = html_table(html_nodes(html, "table")[[2]])
head(judges[,2])
# [1] "Wilson, JamesJames Wilson" "Jay, JohnJohn Jay†" "Cushing, WilliamWilliam Cushing" "Blair, JohnJohn Blair, Jr."
# [5] "Rutledge, JohnJohn Rutledge" "Iredell, JamesJames Iredel
removeNodes(getNodeSet(html, "//table/tr/td[2]/span"))
judges = html_table(html_nodes(html, "table")[[2]])
head(judges[,2])
# [1] "James Wilson" "John Jay†" "William Cushing" "John Blair, Jr." "John Rutledge" "James Iredell"
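If you are on a newer rvest that is built on xml2 (where `html()` has been replaced by `read_html()`), `XML::removeNodes()` will not work on the parsed page. A minimal sketch of the same idea with `xml2::xml_remove()` might look like the following. The inline HTML is a made-up, simplified stand-in for the Wikipedia table so the example runs offline; the real page's markup may differ, so treat the XPath as an assumption to check.

```r
library(rvest)
library(xml2)

# Hypothetical, simplified stand-in for the Wikipedia markup: each name cell
# holds a hidden sort-key <span> followed by the visible link.
snippet <- '
<table>
  <tr><th>#</th><th>Name</th></tr>
  <tr><td>1</td><td><span style="display:none">Wilson, James</span><a href="#">James Wilson</a></td></tr>
  <tr><td>2</td><td><span style="display:none">Jay, John</span><a href="#">John Jay</a></td></tr>
</table>'

page <- read_html(snippet)

# Before removal, html_table() concatenates the sort key and the visible name.
before <- html_table(html_nodes(page, "table")[[1]])

# Remove the sort-key spans from the second column in place,
# then parse the table again.
xml_remove(xml_find_all(page, "//table//tr/td[2]/span"))
after <- html_table(html_nodes(page, "table")[[1]])

print(before$Name)
print(after$Name)
```

For the real page you would pass the URL to `read_html()` instead of `snippet` and pick the right table index, as in the code above.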
You could use rvest with a CSS selector that picks the link right after each span:
library(rvest)
html("http://en.wikipedia.org/wiki/List_of_Justices_of_the_Supreme_Court_of_the_United_States") %>%
  html_nodes("span+ a") %>%
  html_text()
It's not perfect, so you might want to refine the CSS selector, but it gets you fairly close.
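One way to refine it, as a rough sketch: scope the selector to the second column of a table so that links following a span elsewhere on the page are skipped. The HTML below is a made-up stand-in (including the unrelated "See also" link), and the selector `table td:nth-child(2) > span + a` is an assumption about the page layout, not something verified against the live article.

```r
library(rvest)

# Hypothetical snippet: one unrelated span + link pair outside the table,
# plus two table rows shaped like the name cells on the Wikipedia page.
snippet <- '
<p><span>See also</span> <a href="#">Unrelated link</a></p>
<table>
  <tr><th>#</th><th>Name</th></tr>
  <tr><td>1</td><td><span style="display:none">Wilson, James</span><a href="#">James Wilson</a></td></tr>
  <tr><td>2</td><td><span style="display:none">Jay, John</span><a href="#">John Jay</a></td></tr>
</table>'

page <- read_html(snippet)

# The broad selector also matches the link outside the table.
broad <- page %>% html_nodes("span + a") %>% html_text()

# The narrower selector only matches links directly inside second-column cells.
narrow <- page %>% html_nodes("table td:nth-child(2) > span + a") %>% html_text()

print(broad)
print(narrow)
```

On this snippet the narrow selector should drop the unrelated link and keep only the two names; on the real page you may need to adjust the column index or add a table class.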