Getting only English property value

元气小坏坏 submitted on 2021-02-11 17:37:30

Question


I am trying to get a list of countries, including their English short names:

# get a list of countries with the corresponding ISO code
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
SELECT ?country ?countryLabel ?shortName (MAX(?pop) as ?population) ?coord ?isocode
WHERE 
{
  # instance of country
  ?country wdt:P31 wd:Q3624078.
  OPTIONAL {
     ?country rdfs:label ?countryLabel filter (lang(?countryLabel) = "en").
   }
  OPTIONAL {
    # https://www.wikidata.org/wiki/Property:P1813
    ?country wdt:P1813 ?shortName.
  }   
  OPTIONAL { 
    # get the population
     # https://www.wikidata.org/wiki/Property:P1082
     ?country wdt:P1082 ?pop. 
  }
  # get the iso countryCode
  { ?country wdt:P297 ?isocode }.
  # get the coordinate
  OPTIONAL { ?country wdt:P625 ?coord }.
} 
GROUP BY ?country ?countryLabel ?shortName ?coord ?isocode
ORDER BY ?countryLabel


Unfortunately, flags and non-English versions of "shortName" are also returned. I tried using a subquery, but that timed out. I'd like to avoid the wikibase label service, since I need to run the query on my local Wikidata copy, which uses Apache Jena.

How can I get the English short names of countries, e.g. China for People's Republic of China and USA for United States of America?


Answer 1:


There are two issues here:

  1. We need to keep only the English short names, i.e. we need a filter (lang(?shortName) = "en") clause inside the second OPTIONAL pattern.
  2. For some reason, some flags carry an English language tag, so we have to exclude them. Fortunately, a statement qualifier helps here: the flag statements have an instance of (P31) qualifier pointing to the Wikidata entity emoji flag sequence (Q28840786).

So, overall, we replace

OPTIONAL {
    # https://www.wikidata.org/wiki/Property:P1813
    ?country wdt:P1813 ?shortName.
} 

by

OPTIONAL {
  ?country p:P1813 ?shortNameStmt. # get the short name statement
  ?shortNameStmt ps:P1813 ?shortName. # get the short name value from the statement
  filter (lang(?shortName) = "en") # filter for English short names only
  filter not exists {?shortNameStmt pq:P31 wd:Q28840786} # ignore flags (aka emojis)
}

Still, some countries will have multiple entries because they have more than one English short name. One way to work around this is to apply an aggregate function such as sample or min/max to the short name and pick just a single value per country.
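
Putting the pieces together, the whole query might look roughly like the sketch below. It is untested against both the public endpoint and a local Jena store, so treat it as a starting point. The variable name ?shortNameEn is my own choice. The p:, ps: and pq: prefixes are declared explicitly because the question's prefix list does not include them and a local Jena setup will not predefine them. The short name is dropped from GROUP BY and aggregated with SAMPLE instead.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wd:   <http://www.wikidata.org/entity/>
PREFIX wdt:  <http://www.wikidata.org/prop/direct/>
PREFIX p:    <http://www.wikidata.org/prop/>
PREFIX ps:   <http://www.wikidata.org/prop/statement/>
PREFIX pq:   <http://www.wikidata.org/prop/qualifier/>

SELECT ?country ?countryLabel
       (SAMPLE(?shortNameEn) AS ?shortName)   # one short name per country
       (MAX(?pop) AS ?population)
       ?coord ?isocode
WHERE {
  # same class and ISO code pattern as in the question
  ?country wdt:P31 wd:Q3624078 ;
           wdt:P297 ?isocode .
  OPTIONAL {
    ?country rdfs:label ?countryLabel .
    FILTER (lang(?countryLabel) = "en")
  }
  OPTIONAL {
    ?country p:P1813 ?shortNameStmt .            # short name statement
    ?shortNameStmt ps:P1813 ?shortNameEn .       # value of that statement
    FILTER (lang(?shortNameEn) = "en")           # English only
    FILTER NOT EXISTS { ?shortNameStmt pq:P31 wd:Q28840786 }  # skip emoji flags
  }
  OPTIONAL { ?country wdt:P1082 ?pop }
  OPTIONAL { ?country wdt:P625 ?coord }
}
GROUP BY ?country ?countryLabel ?coord ?isocode
ORDER BY ?countryLabel
```

SAMPLE returns an arbitrary value from each group, so the chosen short name may differ between runs or engines. If a stable result matters, MIN or MAX gives a deterministic (alphabetical) pick instead.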



Source: https://stackoverflow.com/questions/63773185/getting-only-english-property-value
