NSXMLDocument, search with nodesForXPath:

Submitted by 天大地大妈咪最大 on 2019-12-14 04:25:23

Question


I need to search an HTML document for two specific pieces of text in Cocoa. I create an NSXMLDocument from the web page (Page Example), then try to search it for the app title and the URL of the icon. I am currently using this code to run the two searches:

NSString *xpathQueryStringTitle = @"//div[@id='desktopContentBlockId']/div[@id='content']/div[@class='padder']/div[@id='title' @class='intro has-gcbadge']/h1";
NSString *xpathQueryStringIcon = @"//div[@id='desktopContentBlockId']/div[@id='content']/div[@class='padder']/div[@id='left-stack']/div[@class='lockup product application']/a";
NSArray *titleItemsNodes = [document nodesForXPath:xpathQueryStringTitle error:&error];
if (error)
{
    [[NSAlert alertWithError:error] runModal];
    return;
}
error = nil;
NSArray *iconItemsNodes = [document nodesForXPath:xpathQueryStringIcon error:&error];
if (error)
{
    [[NSAlert alertWithError:error] runModal];
    return;
}

When I try to search for these strings I get the error: "XQueryError:3 - "invalid token (@) - ./*/div[@id='desktopContentBlockId']/div[@id='content']/div[@class='padder']/div[@id='title' @class='intro has-gcbadge']/h1" at line:1"

I am loosely following this tutorial.

I also tried this without any of the @ symbols in the XPath, and it also returns an error. My XPath syntax is obviously wrong. What would the basic syntax be for this path? I've seen plenty of examples with a basic XML tree, but not with HTML.


Answer 1:


I suspect it's the part near the end, where you test for two attributes inside a single predicate:

/div[@id='title' @class='intro has-gcbadge']/h1";

Try changing it to:

/div[@id='title'][@class='intro has-gcbadge']/h1";
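To make the difference concrete, here is a minimal sketch that runs the corrected query against a small hand-written fragment. The helper name `StringValuesForXPath`, the constants, and the sample markup are assumptions for illustration and are not taken from the question's page. It also includes the `and` form, which is the other standard XPath 1.0 way to combine two attribute tests in one predicate. ARC is assumed.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the string value of every node that `xpath`
// matches in `xml`, or nil if parsing or the query fails.
static NSArray<NSString *> *StringValuesForXPath(NSString *xml, NSString *xpath)
{
    NSError *error = nil;
    NSXMLDocument *document = [[NSXMLDocument alloc] initWithXMLString:xml
                                                               options:NSXMLNodeOptionsNone
                                                                 error:&error];
    if (document == nil)
    {
        NSLog(@"Parse failed: %@", error);
        return nil;
    }

    // Detect failure from the nil return value rather than from `error`.
    NSArray<NSXMLNode *> *nodes = [document nodesForXPath:xpath error:&error];
    if (nodes == nil)
    {
        NSLog(@"XPath failed: %@", error);
        return nil;
    }

    NSMutableArray<NSString *> *values = [NSMutableArray array];
    for (NSXMLNode *node in nodes)
    {
        [values addObject:node.stringValue ?: @""];
    }
    return values;
}

// Made-up sample markup that mirrors the structure described in the question.
static NSString *const kSampleXML =
    @"<div id='content'><div id='title' class='intro has-gcbadge'>"
    @"<h1>App Title</h1></div></div>";

// Two equivalent ways to require both attributes on the same element.
static NSString *const kChainedPredicates =
    @"//div[@id='title'][@class='intro has-gcbadge']/h1";
static NSString *const kAndPredicate =
    @"//div[@id='title' and @class='intro has-gcbadge']/h1";
```

Calling `StringValuesForXPath(kSampleXML, kChainedPredicates)` should return a one-element array holding the heading text, and the `and` form should match the same node. Writing two `@` tests side by side with nothing between them is what triggers the "invalid token" error.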



Answer 2:


OP's additional questions (from comments):

but I need to modify the returned strings. For the first string, I get "<h1>App Title</h1>". What would I add to get just the text inside the <h1>?

Use (only the tail of the full expression is shown):

/div[@id='title' and @class='intro has-gcbadge']/h1/text()

or use:

string(/div[@id='title' and @class='intro has-gcbadge']/h1)
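On the Cocoa side, a related option is to read the matched node's `stringValue` instead of its serialized markup. The sketch below is only an illustration under assumptions: `FirstMatch` and the sample fragment are hypothetical names and data, and it guesses that the `<h1>…</h1>` output came from printing the element itself. ARC is assumed.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: evaluates `xpath` against `xml` and returns either the
// serialized markup or the text content of the first matching node.
static NSString *FirstMatch(NSString *xml, NSString *xpath, BOOL asMarkup)
{
    NSError *error = nil;
    NSXMLDocument *document = [[NSXMLDocument alloc] initWithXMLString:xml
                                                               options:NSXMLNodeOptionsNone
                                                                 error:&error];
    if (document == nil)
    {
        return nil;
    }

    NSArray<NSXMLNode *> *nodes = [document nodesForXPath:xpath error:&error];
    NSXMLNode *node = nodes.firstObject;
    if (node == nil)
    {
        return nil;
    }

    // XMLString serializes the node with its tags; stringValue is the text only.
    return asMarkup ? node.XMLString : node.stringValue;
}

// Made-up sample markup for illustration.
static NSString *const kTitleXML =
    @"<div id='title' class='intro has-gcbadge'><h1>App Title</h1></div>";

// Selects the <h1> element itself.
static NSString *const kHeadingElement =
    @"//div[@id='title' and @class='intro has-gcbadge']/h1";
// Selects the text node inside the <h1>.
static NSString *const kHeadingText =
    @"//div[@id='title' and @class='intro has-gcbadge']/h1/text()";
```

With this helper, `FirstMatch(kTitleXML, kHeadingElement, NO)` and `FirstMatch(kTitleXML, kHeadingText, NO)` should both yield the bare title, while passing `YES` for the element query should give the tagged form. Either changing the XPath or changing which property is read gets to the same text.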

On the second string, I get the entire <img width="111" src="link">. How would I return the value of link from the src attribute?

Use:

YourSecond-Not-Shown-Expression/@src

or use:

string(YourSecond-Not-Shown-Expression/@src)
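As a rough illustration of the attribute case, the sketch below reads `src` in two ways: by selecting the attribute node with `/@src`, and by selecting the `<img>` element and asking it for the attribute through `attributeForName:`. The function names, the sample markup, and the `example.com` URL are placeholders invented for this example, and the `//a/img` path is only a guess at the shape of the expression that was not shown. ARC is assumed.

```objc
#import <Foundation/Foundation.h>

// Made-up sample markup; the URL is a placeholder.
static NSString *const kIconXML =
    @"<div class='lockup product application'><a href='#'>"
    @"<img width='111' src='https://example.com/icon.png'/></a></div>";

// Hypothetical helper: parses `xml` and returns the nodes matching `xpath`
// together with the owning document, so the nodes stay attached while in use.
static NSArray<NSXMLNode *> *NodesForXPath(NSString *xml, NSString *xpath,
                                           NSXMLDocument **outDocument)
{
    NSError *error = nil;
    NSXMLDocument *document = [[NSXMLDocument alloc] initWithXMLString:xml
                                                               options:NSXMLNodeOptionsNone
                                                                 error:&error];
    if (outDocument != NULL)
    {
        *outDocument = document;
    }
    return [document nodesForXPath:xpath error:&error];
}

// Option 1: let XPath select the attribute node and read its value.
static NSString *IconSourceViaXPath(NSString *xml)
{
    NSXMLDocument *document = nil;
    NSArray<NSXMLNode *> *nodes = NodesForXPath(xml, @"//a/img/@src", &document);
    return nodes.firstObject.stringValue;
}

// Option 2: select the <img> element and look the attribute up by name.
static NSString *IconSourceViaElement(NSString *xml)
{
    NSXMLDocument *document = nil;
    NSArray<NSXMLNode *> *nodes = NodesForXPath(xml, @"//a/img", &document);
    NSXMLNode *node = nodes.firstObject;
    if (node == nil || node.kind != NSXMLElementKind)
    {
        return nil;
    }
    return [[(NSXMLElement *)node attributeForName:@"src"] stringValue];
}
```

Both functions should return the same URL string for `kIconXML`. The second form can be convenient when several attributes of the same element are needed, since the element is fetched once.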


Source: https://stackoverflow.com/questions/8055817/nsxmldocument-search-with-nodesforxpath
