How to check for a sub-property at all levels expanded from a SPARQL * wildcard?

Submitted by 一世执手 on 2019-12-14 03:48:50

Question


In Wikidata, I want to find an item's country. Either directly if the item has a country directly, or by climbing up the P131s (located in the administrative territorial entity) until I find a country. Here is the query:

?item wdt:P131*/wdt:P17 ?country.

The query above works fine... except when a sub-division used to belong to another country, like for Q25270 (Prishtina). In such case, the result can be anachronistic. That's what I want to fix.

Great news: in such cases we should only consider the unique P131 (located in the administrative territorial entity) that has no P582 (end time) sub-property attached to it, and the problem is solved!

My question: how to alter my query above to achieve that?

Example: Let's say MyItem is in MyStreet is in MyTown is in MyRegion is in MyCountry, I must make sure that MyStreet, MyTown, and MyRegion do not have a P582 (end time).

(If "sub-property" is not the correct term, please let me know the right term and I will fix the question, thanks!)

An attempt

The query below works in most cases, but unfortunately it has a bug: It finds the wrong country in cases where the current country was also the country in the past (for instance Alsace belonged to France until 1871 then to Germany and currently to France again).

SELECT DISTINCT ?country WHERE {
  wd:Q6556803 wdt:P131* ?area .
  ?area wdt:P17 ?country .
  OPTIONAL {
    wd:Q6556803 wdt:P131*/p:P131 [
      pq:P582 ?endTime; ps:P131/wdt:P131* ?area
    ] .
  } .
  FILTER( !BOUND( ?endTime ) ) .
}

Answer 1:


Wikidata uses different properties for direct links and links with extra information. So, for the statement "Prishtina is located in the administrative territorial entity Socialist Autonomous Province of Kosovo", there's the simple triple:

wd:Q25270 wdt:P131 wd:Q646035

And the long form with additional information (the end time):

wd:Q25270 p:P131 wds:Q25270-7df79cec-4938-8b6d-4e11-4dde6f72d73b .

wds:Q25270-7df79cec-4938-8b6d-4e11-4dde6f72d73b ps:P131 wd:Q646035 ;
    pq:P582 "1990-01-01T00:00:00Z"
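To see both forms side by side, a query along these lines should list every P131 statement of Prishtina together with its end time, if it has one. This is only a sketch: it assumes the standard prefixes that the Wikidata Query Service predefines, and the variable names are illustrative.

```sparql
# List each "located in" statement of Q25270 and its optional end time.
SELECT ?area ?areaLabel ?endTime WHERE {
  wd:Q25270 p:P131 ?statement .               # statement node (long form)
  ?statement ps:P131 ?area .                  # the value of the statement
  OPTIONAL { ?statement pq:P582 ?endTime . }  # end time qualifier, if present
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
}
```

Rows where ?endTime is unbound are the statements with no end time qualifier.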

So, we need to filter out all paths with an end time (pq:P582):

SELECT DISTINCT ?s ?sLabel ?country ?countryLabel {
  VALUES ?s {
    wd:Q25270 
  }
  ?s wdt:P131* ?area .
  ?area wdt:P17 ?country .
  FILTER NOT EXISTS {
    ?s p:P131/(ps:P131/p:P131)* ?statement .
    ?statement ps:P131 ?area .
    ?s p:P131/(ps:P131/p:P131)* ?intermediateStatement .
    ?intermediateStatement (ps:P131/p:P131)* ?statement .
    ?intermediateStatement pq:P582 ?endTime .
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
}
LIMIT 50

Here, ?intermediateStatement is a statement with an end time on the path from ?s to a country.

This query does seem to time out when ?s is given more than one value. It also does not handle the case where an item has several links to the same area, one with an end time and one without: both paths are filtered out.



Source: https://stackoverflow.com/questions/44301893/how-to-check-for-a-sub-property-at-all-levels-expanded-from-a-sparql-wildcard
