R: rvest - is not proper UTF-8, indicate encoding?

不想你离开。 posted on 2019-12-01 14:09:56

Looking at the page source, they claim to be using UTF-8 encoding:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

So the question is: is the page actually in a different encoding, one different enough that we need to worry about it, or can we just convert to UTF-8 and accept that any resulting errors will be negligible?
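One way to answer that question is to test whether the bytes really are valid UTF-8. The sketch below uses base R's validUTF8() on two made-up sample strings (good and bad are hypothetical names, not taken from the page in question); it only illustrates the check and is not a diagnosis of this particular site.

```r
# Hypothetical samples: "café" encoded as UTF-8 and as Latin-1.
good <- "caf\u00e9"                                   # UTF-8: the é is stored as c3 a9
bad  <- rawToChar(as.raw(c(0x63, 0x61, 0x66, 0xe9)))  # Latin-1: the é is the single byte e9

# validUTF8() reports, element by element, whether the bytes form valid UTF-8.
is_valid <- validUTF8(c(good, bad))
print(is_valid)  # TRUE for good, FALSE for bad
```

Assuming page_source[[1]] is a character string, as the call further down suggests, validUTF8(page_source[[1]]) should show whether the page's charset=utf-8 claim holds up.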

If you are happy with a quick and dirty approach, and some potential mojibake, you can just force UTF-8 using iconv:

TV_Audio_Video_Marca <- read_html(iconv(page_source[[1]], to = "UTF-8"), encoding = "utf8")

In general, this is a bad idea: it is better to tell iconv which encoding the text is actually in. In this case the error may well be on their side, so the quick and dirty approach might be OK.
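For comparison, here is a minimal sketch of the cleaner route, where the source encoding is named explicitly through iconv's from argument. It assumes the text is really Latin-1, which is only an illustrative guess and not something known about this page; latin1_text is a made-up sample.

```r
# Hypothetical sample: "café" stored as Latin-1 bytes (é = e9).
latin1_text <- rawToChar(as.raw(c(0x63, 0x61, 0x66, 0xe9)))

# Name the source encoding explicitly instead of relying on the locale default.
fixed <- iconv(latin1_text, from = "latin1", to = "UTF-8")

# The result is now valid UTF-8, with é stored as the two bytes c3 a9.
print(validUTF8(fixed))
print(charToRaw(fixed))
```

If the real encoding of the page were known, the same from value could be used on page_source[[1]] before passing the result to read_html.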
