Extracting <tr> values from multiple HTML files
Question

I am new to web scraping. I have 3000+ html/htm files, and I need to extract the "tr" values from them and transform them into a data frame for further analysis.

The code I have used is:

    html <- list.files(pattern="\\.(htm|html)$")
    mydata <- lapply(html,read_html)%>% html_nodes("tr")%>% html_text()

It fails with this error:

    Error in UseMethod("xml_find_all") : no applicable method for 'xml_find_all' applied to an object of class "character"

What am I doing wrong?

To extract the result into a data frame, I have this code:

    u <- as.data.frame
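
One likely cause, offered as a sketch and not a definitive answer: html_nodes() expects a single parsed document, but in the pipeline above it receives the whole result of lapply() instead of one document at a time. Moving the extraction inside the function applied to each file avoids that. The helper name extract_rows, the column names file and row_text, and the assumption that the files sit in the current working directory are illustrative and not from the original post.

```r
library(rvest)

# Assumption: the HTML files are in the current working directory.
files <- list.files(pattern = "\\.(htm|html)$")

# Hypothetical helper: parse one file and return the text of each <tr> node.
extract_rows <- function(path) {
  doc <- read_html(path)
  html_text(html_nodes(doc, "tr"))
}

# Apply the helper to every file; the result is a list of character vectors.
rows_per_file <- lapply(files, extract_rows)

# Flatten into a data frame with one row per <tr>, keeping the source file name.
result <- data.frame(
  file = rep(files, lengths(rows_per_file)),
  row_text = as.character(unlist(rows_per_file, use.names = FALSE)),
  stringsAsFactors = FALSE
)
```

Each row of result holds the concatenated text of one table row. If the individual cells are needed as separate columns, rvest's html_table() may be a better fit, depending on how regular the tables are.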