I'm revisiting some old code of mine and have stumbled upon a method for getting the title of a website based on its URL. It's not really what you would call a stable method.
A simpler way to get the content:
WebClient x = new WebClient();
string source = x.DownloadString("http://www.singingeels.com/");
A simpler, more reliable way to get the title:
string title = Regex.Match(source, @"\<title\b[^>]*\>\s*(?<Title>[\s\S]*?)\</title\>",
RegexOptions.IgnoreCase).Groups["Title"].Value;
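As a rough sketch, the regex above could be wrapped in a small helper that also handles a missing title, decodes HTML entities, and collapses whitespace. The names `TitleExtractor` and `ExtractTitle` are hypothetical, not part of the original answer, and a regex can still miss unusual markup.

```csharp
using System.Net;
using System.Text.RegularExpressions;

public static class TitleExtractor
{
    // Same pattern as above, created once and reused for every call.
    private static readonly Regex TitleRegex = new Regex(
        @"\<title\b[^>]*\>\s*(?<Title>[\s\S]*?)\</title\>",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // Returns the decoded title, or null when no <title> element is found.
    public static string ExtractTitle(string html)
    {
        if (string.IsNullOrEmpty(html)) return null;

        Match match = TitleRegex.Match(html);
        if (!match.Success) return null;

        // Decode entities such as &amp; and collapse runs of whitespace.
        string raw = WebUtility.HtmlDecode(match.Groups["Title"].Value);
        return Regex.Replace(raw, @"\s+", " ").Trim();
    }
}
```

With this in place, the call site becomes `string title = TitleExtractor.ExtractTitle(source);`, and a null result signals that the page had no title.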
In order to accomplish this, you are going to need to do a couple of things.
I have done this before with SEO bots and have been able to handle almost 10,000 requests at a time. You just need to make sure that each web request is self-contained in its own thread.
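One way to keep each request self-contained is sketched below. It uses tasks with a concurrency limit rather than one raw thread per request, which is a swapped-in technique and not necessarily what the answer above used. `TitleFetcher`, `FetchTitlesAsync`, the `maxConcurrency` default of 20, and the injectable `download` delegate are all hypothetical choices made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Text.RegularExpressions;
using System.Threading;
using System.Threading.Tasks;

public static class TitleFetcher
{
    // One shared HttpClient; it supports concurrent requests.
    private static readonly HttpClient Client = new HttpClient();

    // Maps each distinct URL to its title (null on failure or when no title exists),
    // running at most maxConcurrency downloads at the same time.
    public static async Task<Dictionary<string, string>> FetchTitlesAsync(
        IEnumerable<string> urls,
        int maxConcurrency = 20,
        Func<string, Task<string>> download = null)
    {
        // The delegate is injectable so the logic can be exercised without network access.
        if (download == null) download = url => Client.GetStringAsync(url);

        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            var tasks = urls.Distinct().Select(async url =>
            {
                await gate.WaitAsync();
                try
                {
                    string html = await download(url);
                    Match m = Regex.Match(html,
                        @"<title\b[^>]*>\s*(?<Title>[\s\S]*?)</title>",
                        RegexOptions.IgnoreCase);
                    string title = m.Success ? m.Groups["Title"].Value.Trim() : null;
                    return new KeyValuePair<string, string>(url, title);
                }
                catch (Exception)
                {
                    // A failure stays local to this request; the others continue.
                    return new KeyValuePair<string, string>(url, null);
                }
                finally
                {
                    gate.Release();
                }
            }).ToList();

            KeyValuePair<string, string>[] results = await Task.WhenAll(tasks);
            return results.ToDictionary(r => r.Key, r => r.Value);
        }
    }
}
```

Each request carries its own URL, result, and error handling, so one slow or broken site cannot affect the rest. The limit would need tuning for the machine and the target servers.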
Perhaps this suggestion opens up a new world for you. I also had this question and arrived at the following:
Download "Html Agility Pack" from http://html-agility-pack.net/?z=codeplex
Or get it from NuGet: https://www.nuget.org/packages/HtmlAgilityPack/ and add it as a reference.
Add the following using directive to the code file:
using HtmlAgilityPack;
Write the following code in your method:
var webGet = new HtmlWeb();
var document = webGet.Load(url);
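// SelectSingleNode returns null when the page has no <title>, so use ?. to avoid a NullReferenceException.
var title = document.DocumentNode.SelectSingleNode("html/head/title")?.InnerText;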
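A slightly more defensive variant is sketched below, assuming the HtmlAgilityPack package is referenced. The class and method names (`HapTitle`, `FromHtml`, `FromUrl`) are hypothetical. It looks for the first `<title>` anywhere in the document rather than only under `html/head`, and decodes entities in the result.

```csharp
using HtmlAgilityPack;

public static class HapTitle
{
    // Parses already-downloaded HTML and returns the title, or null if there is none.
    public static string FromHtml(string html)
    {
        var document = new HtmlDocument();
        document.LoadHtml(html);
        return TitleOf(document);
    }

    // Downloads the page with HtmlWeb, as in the snippet above, then reuses the same lookup.
    public static string FromUrl(string url)
    {
        HtmlDocument document = new HtmlWeb().Load(url);
        return TitleOf(document);
    }

    private static string TitleOf(HtmlDocument document)
    {
        // "//title" takes the first <title> in document order, so the lookup
        // does not depend on the exact html/head nesting being present.
        HtmlNode node = document.DocumentNode.SelectSingleNode("//title");
        if (node == null) return null;

        // Decode entities such as &amp; and trim surrounding whitespace.
        return HtmlEntity.DeEntitize(node.InnerText).Trim();
    }
}
```

Usage would be `var title = HapTitle.FromUrl(url);`, with a null check before the value is displayed or stored.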
Sources:
https://codeshare.co.uk/blog/how-to-scrape-meta-data-from-a-url-using-htmlagilitypack-in-c/
HtmlAgilityPack obtain Title and meta