I am using the following code to scrape an HTML table on AFL player data:
library(rvest)
website <-read_html(\"https://afltables.com/afl/stats/teams/adel
Firstly, and unrelated to your question: don't use table as a name for your objects, because that name already belongs to a base R function, table(). Shadowing it is considered bad practice, and I've been told that it will come back to bite you somewhere down the line.
Moving on to the question: you are struggling with the type of data that html_table() gives you. It returns a list that contains a regular data.frame. ncol() and nrow() return NULL for that list because a list has no dimensions; it simply has one element, the data.frame. By selecting that first (and only) element of the list, you get the data frame you're actually interested in. This data frame has 27 columns and 34 rows:
library(rvest)  # also provides the %>% pipe

website <- read_html("https://afltables.com/afl/stats/teams/adelaide/2017_gbg.html")
scraped <- website %>%
  html_nodes("table") %>%
  .[1] %>%          # keep only the first table node
  html_table() %>%  # returns a list of data frames
  `[[`(1)           # select the first element of the list, like scraped[[1]]
ncol(scraped)
# 27
nrow(scraped)
# 34
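
If you want to see the list-versus-data-frame difference without hitting the website, here is a minimal sketch in base R. The object names (tables, first) and the values in the data frame are made up for illustration; the list only stands in for the kind of object html_table() returns when it is given a set of table nodes.

```r
# Stand-in for the result of html_table(): a list holding one data frame.
# The player names and numbers are invented for this example.
tables <- list(
  data.frame(player = c("A", "B"), goals = c(3, 1))
)

class(tables)   # "list"
length(tables)  # 1
ncol(tables)    # NULL, because a list has no dim attribute
nrow(tables)    # NULL

# Extract the single element to get at the data frame itself.
first <- tables[[1]]
ncol(first)     # 2
nrow(first)     # 2
```

Note that single brackets, tables[1], would still give you a list of length one, so ncol() would keep returning NULL; the double brackets are what unwrap the data frame.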