Simple web crawler in C#

囚心锁ツ 2020-12-04 18:55

I have created a simple web crawler, but I want to add recursion so that for every page that is opened I can get the URLs on that page. However, I have no idea how I can d

4 answers

粉色の甜心 (original poster)

2020-12-04 19:20

I fixed your GetContent method as follows to get new links from the crawled page:

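using System.Collections.Generic;
using System.Text.RegularExpressions;

public ISet<string> GetNewLinks(string content)
{
    // Assumption: the pattern is cut off after "(?<=" in this copy of the answer;
    // this one captures the quoted value of href in <a href="..."> tags.
    Regex regexLink = new Regex("(?<=<a\\s*?href=(?:'|\"))[^'\"]*?(?=(?:'|\"))");

    // HashSet<string>.Add ignores duplicates, so no Contains check is needed.
    ISet<string> newLinks = new HashSet<string>();
    foreach (Match match in regexLink.Matches(content))
    {
        newLinks.Add(match.Value);
    }

    return newLinks;
}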

Updated

Fixed: regex should be regexLink. Thanks @shashlearner for pointing this out (my mistype).
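
For the recursion part of the question, here is a minimal sketch of how a link extractor like GetNewLinks could be called recursively. The names CrawlSketch, Crawl, Visit, the fetch delegate and the maxDepth parameter are all hypothetical and not from the original post, and the href pattern is an assumption. A visited set prevents pages that link to each other from causing infinite recursion, and maxDepth limits how many links are followed along any one path.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CrawlSketch
{
    // Assumed pattern: captures the quoted value of href in <a href="..."> tags.
    private static readonly Regex LinkRegex =
        new Regex("(?<=<a\\s*?href=(?:'|\"))[^'\"]*?(?=(?:'|\"))", RegexOptions.IgnoreCase);

    public static ISet<string> GetNewLinks(string content)
    {
        var links = new HashSet<string>();
        foreach (Match match in LinkRegex.Matches(content))
            links.Add(match.Value);
        return links;
    }

    // fetch is a hypothetical delegate: it returns the page HTML, or null on failure.
    // startUrl is assumed to be an absolute http(s) URL.
    public static ISet<string> Crawl(string startUrl, Func<string, string> fetch, int maxDepth)
    {
        var visited = new HashSet<string>();
        Visit(startUrl, fetch, maxDepth, visited);
        return visited;
    }

    private static void Visit(string url, Func<string, string> fetch, int depth, ISet<string> visited)
    {
        // Stop when the depth budget is used up or the page was already seen.
        if (depth < 0 || !visited.Add(url)) return;

        string content = fetch(url);
        if (content == null) return;

        foreach (string link in GetNewLinks(content))
        {
            // Resolve relative links against the current page and keep only http(s).
            if (Uri.TryCreate(new Uri(url), link, out Uri absolute) &&
                (absolute.Scheme == Uri.UriSchemeHttp || absolute.Scheme == Uri.UriSchemeHttps))
            {
                Visit(absolute.AbsoluteUri, fetch, depth - 1, visited);
            }
        }
    }
}
```

Passing the download step in as a delegate keeps the recursion independent of how pages are fetched. In a real crawler it could wrap whatever download code you already have, for example a call to HttpClient.GetStringAsync.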
