While scraping some websites, I see that some of the text contains HTML tags, CSS styles, undefined characters, etc. Because of these characters, I get an error while inserting it into