How to handle case-insensitive SPARQL data in MarkLogic

Submitted by 流过昼夜 on 2021-02-18 22:01:47


I'm trying to understand how best to handle literals in MarkLogic SPARQL data which may be in any case. I'd like to be able to do a case-insensitive search, but I believe that isn't possible with semantic queries. As a simplistic example, I want:

WHERE { ?s ?p "Red"}

and

WHERE { ?s ?p "red"}

to return all values whether the object is "Red", "RED", "red" or "rED".

My data is from another source which has variable capitalisation rules. At the moment the only thing I can think of is to add an extra triple which always contains the text in lower case, so that I can always search on that value. Alternatively, would it make sense to create a new range index in MarkLogic with a case-insensitive collation (if that's possible on triple data)?
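The extra-triple idea could look roughly like the sketch below. It assumes SPARQL Update is available on the store and uses a hypothetical predicate, ex:lowerValue, under a made-up http://example.org/ namespace; neither comes from the original data.

```sparql
PREFIX ex: <http://example.org/>   # hypothetical namespace, for illustration only

# For every literal object, add a lowercased copy under ex:lowerValue.
# Note: this sketch does not record which predicate the value came from.
INSERT { ?s ex:lowerValue ?lower }
WHERE {
  ?s ?p ?o .
  FILTER (isLiteral(?o))
  BIND (LCASE(STR(?o)) AS ?lower)
}
```

A search would then match the pattern ?s ex:lowerValue "red" exactly, with no filter needed, at the cost of storing and maintaining the additional triples.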


You could use a filter that ignores case.

select * where {
  ?s ?p ?o
  FILTER (lcase(str(?o)) = "red")
}

Based on the answer to another question.
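As a variation on the same idea, standard SPARQL also has a REGEX function that accepts an "i" (case-insensitive) flag. This is only a sketch of an alternative, not something suggested in the original answer, and it has the same cost: every candidate triple still has to be filtered.

```sparql
SELECT *
WHERE {
  ?s ?p ?o
  # Anchored pattern so that only the whole value "red" matches, in any case.
  FILTER (isLiteral(?o) && REGEX(STR(?o), "^red$", "i"))
}
```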

Edit: I asked Steve Buxton, MarkLogic's PM for semantics features, and he suggested this:

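(: Restrict the store to triples whose object matches "red" in any case,
   then run the SPARQL query against that reduced set. :)
let $store := sem:store( (), cts:element-value-query(xs:QName("sem:object"), "red", "case-insensitive") )
return
  sem:sparql('
    SELECT ?o
    WHERE {
      ?s ?p ?o
      FILTER (lcase(str(?o)) = "red")
    }', (), (), $store
  )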

sem:store is a MarkLogic 8 (now available through Early Access) function that selects a group of triples. The SPARQL query then runs on the reduced set, limiting the number of triples that need to be filtered.

