Web page recommender system
I am trying to build a recommender system that recommends web pages to a user based on their actions: Google searches, clicks, and explicit ratings of web pages. For a sense of what I mean, consider Google News, which displays news articles from across the web on a particular topic. In technical terms that is clustering, but my aim is similar. Mine would be content-based recommendation driven by the user's actions.

So my questions are:

1. How can I crawl the internet to find related web pages?
2. What algorithm should I use to extract data from a web page? Are textual analysis and word frequency the only way?
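
To make the word-frequency idea concrete, here is a minimal sketch of content-based scoring with TF-IDF weights and cosine similarity, using only the Python standard library. Everything in it is an assumption for illustration: the function names (`tfidf_vectors`, `build_profile`, `recommend`), the sample pages, and the idea of representing the user as a rating-weighted sum of page vectors. It assumes the page text has already been fetched and stripped of markup, and it is only one possible baseline, not the only way to do this.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Naive tokenizer (assumption): lowercase alphabetic words only.
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Map each page name to a sparse {term: tf-idf weight} dict."""
    tokens = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(docs)
    doc_freq = Counter()
    for toks in tokens.values():
        doc_freq.update(set(toks))  # count each term once per page
    vectors = {}
    for name, toks in tokens.items():
        counts = Counter(toks)
        total = len(toks) or 1  # avoid division by zero on empty pages
        vectors[name] = {
            # Smoothed idf so that no weight is zero or undefined.
            term: (c / total) * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, c in counts.items()
        }
    return vectors


def cosine(a, b):
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_profile(vectors, ratings):
    """User profile = sum of rated page vectors, weighted by rating."""
    profile = Counter()
    for name, rating in ratings.items():
        for term, w in vectors[name].items():
            profile[term] += rating * w
    return dict(profile)


def recommend(vectors, ratings, k=3):
    """Rank pages the user has not rated by similarity to the profile."""
    profile = build_profile(vectors, ratings)
    scored = [
        (name, cosine(profile, vec))
        for name, vec in vectors.items()
        if name not in ratings
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical sample data: page name -> extracted text.
pages = {
    "py_intro": "python programming tutorial for beginners learn python code",
    "py_adv": "advanced python programming code patterns and tutorial",
    "cooking": "easy pasta recipe with tomato sauce and fresh basil",
    "garden": "growing tomato plants and basil in a small garden",
}
page_vectors = tfidf_vectors(pages)
ranked = recommend(page_vectors, {"py_intro": 1.0})
```

Clicks and searches could be folded into the same `ratings` dict with smaller weights than explicit ratings, but how to weight them is a design choice this sketch does not settle.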