How to get a website's title in C#

夕颜 2020-12-15 07:35

I'm revisiting some old code of mine and have stumbled upon a method for getting the title of a website based on its URL. It's not really what you would call a stable method

3 answers
  • 2020-12-15 07:49

    A simpler way to get the content:

    // WebClient is IDisposable, so release it as soon as the download finishes.
    string source;
    using (WebClient x = new WebClient())
        source = x.DownloadString("http://www.singingeels.com/");
    

    A simpler, more reliable way to get the title:

    string title = Regex.Match(source, @"\<title\b[^>]*\>\s*(?<Title>[\s\S]*?)\</title\>",
        RegexOptions.IgnoreCase).Groups["Title"].Value;
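
    To make the regex step reusable, the extraction can be wrapped in a small helper. This is only a sketch: the class name TitleExtractor and the method ExtractTitle are hypothetical, the pattern is an equivalent form of the one above without the redundant backslashes, and the entity decoding and whitespace cleanup are additions that the original snippet does not perform.

    ```csharp
    using System;
    using System.Net;
    using System.Text.RegularExpressions;

    // Hypothetical helper, not part of the original answer.
    public static class TitleExtractor
    {
        // Equivalent to the pattern above, built once and reused.
        private static readonly Regex TitlePattern = new Regex(
            @"<title\b[^>]*>\s*(?<Title>[\s\S]*?)</title>",
            RegexOptions.IgnoreCase | RegexOptions.Compiled);

        // Returns the cleaned-up title, or null when no <title> element is found.
        public static string ExtractTitle(string html)
        {
            if (string.IsNullOrEmpty(html)) return null;

            Match match = TitlePattern.Match(html);
            if (!match.Success) return null;

            // Decode entities such as &amp; and collapse line breaks inside the title.
            string decoded = WebUtility.HtmlDecode(match.Groups["Title"].Value);
            return Regex.Replace(decoded, @"\s+", " ").Trim();
        }
    }
    ```

    With the downloaded page in source, the call would be TitleExtractor.ExtractTitle(source). Returning null for a missing title lets the caller decide how to handle pages without one.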
    
  • 2020-12-15 07:51

    In order to accomplish this, you will need to do a few things.

    • Make your app multithreaded, so that you can process multiple requests at a time and maximize the number of HTTP requests in flight.
    • During the async request, download only as much data as you need. You could probably parse the data as it comes back, looking for
    • You will probably want to use a regex to pull out the title.

    I have done this before with SEO bots and was able to handle almost 10,000 requests at a time. You just need to make sure that each web request is self-contained in its own thread.
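
    A rough sketch of the points above, under several assumptions: it uses async/await with HttpClient rather than one dedicated thread per request, it assumes the pages are UTF-8, and the names PartialTitleFetcher, ReadTitleAsync, FetchTitleAsync and FetchTitlesAsync, as well as the 64 KB limit, are made up for illustration. It has no error handling or throttling, which a real crawler would need.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Linq;
    using System.Net.Http;
    using System.Text;
    using System.Text.RegularExpressions;
    using System.Threading.Tasks;

    // Hypothetical helper, not part of the original answer.
    public static class PartialTitleFetcher
    {
        // One shared HttpClient for all requests.
        private static readonly HttpClient Client = new HttpClient();

        // Reads the stream chunk by chunk and stops as soon as a complete
        // <title> element has arrived or maxBytes have been read.
        public static async Task<string> ReadTitleAsync(Stream stream, int maxBytes = 64 * 1024)
        {
            var buffer = new byte[4096];
            var received = new MemoryStream();

            while (received.Length < maxBytes)
            {
                int wanted = (int)Math.Min(buffer.Length, maxBytes - received.Length);
                int read = await stream.ReadAsync(buffer, 0, wanted);
                if (read == 0) break; // end of stream
                received.Write(buffer, 0, read);

                // Assumption: the page is UTF-8 encoded.
                string html = Encoding.UTF8.GetString(received.GetBuffer(), 0, (int)received.Length);
                Match m = Regex.Match(html, @"<title\b[^>]*>\s*([\s\S]*?)</title>", RegexOptions.IgnoreCase);
                if (m.Success) return m.Groups[1].Value.Trim();
            }
            return null; // no title within the limit
        }

        public static async Task<string> FetchTitleAsync(string url)
        {
            // ResponseHeadersRead returns once the headers arrive, so the body
            // is streamed on demand instead of being buffered in full.
            using (HttpResponseMessage response = await Client.GetAsync(url, HttpCompletionOption.ResponseHeadersRead))
            using (Stream body = await response.Content.ReadAsStreamAsync())
            {
                return await ReadTitleAsync(body);
            }
        }

        // Starts all requests at once and waits for every title.
        public static Task<string[]> FetchTitlesAsync(IEnumerable<string> urls)
        {
            return Task.WhenAll(urls.Select(url => FetchTitleAsync(url)));
        }
    }
    ```

    Disposing the response after the title is found abandons the rest of the body, which is where the bandwidth saving comes from. For thousands of URLs, some cap on concurrent requests (for example a SemaphoreSlim) would likely be needed on top of this.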

  • 2020-12-15 07:55

    Perhaps this suggestion opens up a new world for you. I also had this question and arrived at the following:

    Download "Html Agility Pack" from http://html-agility-pack.net/?z=codeplex

    Or get it from NuGet: https://www.nuget.org/packages/HtmlAgilityPack/ and add the reference to your project.

    Add the following using directive to the code file:

    using HtmlAgilityPack;
    

    Write the following code in your method:

    var webGet = new HtmlWeb();
    var document = webGet.Load(url);
    // SelectSingleNode returns null when the page has no <title>, so title may be null.
    var title = document.DocumentNode.SelectSingleNode("html/head/title")?.InnerText;
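
    If the HTML is already in a string, the same lookup can be done without HtmlWeb. A minimal sketch, assuming the HtmlAgilityPack package is referenced; the class name HapTitleReader and the method GetTitle are hypothetical, and the "//title" XPath simply takes the first title element anywhere in the document, which is usually but not necessarily the one in the head.

    ```csharp
    using System;
    using HtmlAgilityPack;

    // Hypothetical helper, not part of the original answer.
    public static class HapTitleReader
    {
        // Returns the decoded, trimmed title, or null when the page has none.
        public static string GetTitle(string html)
        {
            var document = new HtmlDocument();
            document.LoadHtml(html);

            // First <title> element in document order; null if there is none.
            HtmlNode titleNode = document.DocumentNode.SelectSingleNode("//title");
            if (titleNode == null) return null;

            // InnerText keeps entities such as &amp; as written, so decode them.
            return HtmlEntity.DeEntitize(titleNode.InnerText).Trim();
        }
    }
    ```

    Unlike the regex approach, this keeps working when the markup around the title is unusual, at the cost of an extra dependency.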
    

    Sources:

    https://codeshare.co.uk/blog/how-to-scrape-meta-data-from-a-url-using-htmlagilitypack-in-c/
    HtmlAgilityPack obtain Title and meta
