Given a list of company names, how to fetch company names, website URL, year established, number of employees, etc.

Submitted by 核能气质少年 on 2019-12-04 13:22:54

You can start using a query like this:

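# Declared explicitly so the query does not depend on an endpoint's
# predefined prefixes.
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
PREFIX dbpprop: <http://dbpedia.org/property/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>

select * where {
  # A prefixed name cannot end in ".", so Apple Inc. is written as a full IRI.
  values ?company { dbpedia:Microsoft
                    <http://dbpedia.org/resource/Apple_Inc.>
                    dbpedia:Kimberly-Clark
                  } 
  # Logos: accept either property, but only when the value is an IRI.
  OPTIONAL { { ?company dbpprop:logo ?logo  FILTER(isIRI(?logo)) }
             UNION 
             { ?company foaf:depiction ?logo FILTER(isIRI(?logo)) } }
  # Keep only the English abstract.
  OPTIONAL { ?company dbpedia-owl:abstract ?abstract 
             FILTER(langMatches(lang(?abstract),"EN")) }
  OPTIONAL { ?company geo:lat ?latitude ;
                      geo:long ?longitude }
  OPTIONAL { ?company dbpedia-owl:foundingDate ?foundingDate }
  OPTIONAL { ?company dbpedia-owl:wikiPageExternalLink ?externalLink }
  OPTIONAL { ?company dbpprop:symbol ?stockSymbol }
  OPTIONAL { ?company dbpedia-owl:subsidiary ?subsidiaryPage }
}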

SPARQL Results
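Since the question starts from a list of companies, the values block in the query above can be generated instead of typed by hand. Here is a rough sketch in Python, assuming you already know each company's DBpedia resource name (the part after http://dbpedia.org/resource/). The function names resource_iri and build_company_query are made up for this example, and only two of the OPTIONAL clauses are included to keep it short. Every company is written as a full IRI, which avoids the problem that a prefixed name cannot end in a dot.

```python
import re

DBPEDIA_RESOURCE = "http://dbpedia.org/resource/"

# Characters that the SPARQL grammar does not allow between < and >.
_ILLEGAL_IN_IRI = re.compile(r'[<>"{}|^`\\\x00-\x20]')


def resource_iri(name):
    """Turn a resource name such as 'Apple_Inc.' into a bracketed IRI."""
    # DBpedia resource names use underscores where the title has spaces.
    local = name.strip().replace(" ", "_")
    if not local or _ILLEGAL_IN_IRI.search(local):
        raise ValueError(f"not usable as a DBpedia resource name: {name!r}")
    return f"<{DBPEDIA_RESOURCE}{local}>"


def build_company_query(names):
    """Build a query with one values entry per company (hypothetical helper)."""
    values = "\n    ".join(resource_iri(n) for n in names)
    return f"""PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
select * where {{
  values ?company {{
    {values}
  }}
  OPTIONAL {{ ?company dbpedia-owl:foundingDate ?foundingDate }}
  OPTIONAL {{ ?company dbpedia-owl:wikiPageExternalLink ?externalLink }}
}}"""


if __name__ == "__main__":
    print(build_company_query(["Microsoft", "Apple Inc.", "Kimberly-Clark"]))
```

Names containing characters that cannot appear inside an IRI are rejected rather than silently rewritten, so a bad entry in the list fails early instead of producing a query that matches nothing.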

I based this on the properties I saw on the DBpedia pages for Microsoft, Kimberly-Clark, and Apple Inc. The data isn't particularly clean, and because of that, I added a few filters to the query:

  • Not all of these companies list subsidiaries, and the subsidiary property for Microsoft doesn't point to individual subsidiaries but to a page that presumably enumerates some of them.

  • Some of the companies have bad information for the logos (hence the FILTERs with isIRI). For instance, Apple's dbpprop:logo is the integer 150. I think that that comes from the Wikipedia infobox line | logo = [[File:{{#property:p154}}|150px]], where 150 is getting pulled out rather than a more meaningful value. Filtering by isIRI helps a little bit.

  • Some of the companies have multiple founding dates. I'm not sure how you might decide which of the multiple ones to use.

  • While the company page is usually listed as an external link, not all of the external links associated with a page are the company page. I'm not sure how you could select one as the company page.
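Because several of these properties can have more than one value, the result set contains one row per combination of values, so a single company may show up in many rows. One way to cope is to collapse the rows per company on the client side and then apply whatever selection rule you settle on. The sketch below is only an illustration: it assumes the rows have already been flattened into plain dicts mapping variable name to string (or None when unbound), and taking the earliest founding date is an arbitrary policy chosen for the example, not something the data tells you is right. The helper names are hypothetical.

```python
from collections import defaultdict


def collapse_rows(rows, key="company"):
    """Merge rows that share a company into one dict of value sets."""
    merged = defaultdict(lambda: defaultdict(set))
    for row in rows:
        company = row[key]
        for var, value in row.items():
            # Skip the grouping key itself and unbound (None) variables.
            if var != key and value is not None:
                merged[company][var].add(value)
    return {company: dict(values) for company, values in merged.items()}


def earliest(dates):
    """Pick the smallest date; ISO dates (YYYY-MM-DD) sort correctly as strings."""
    return min(dates) if dates else None


if __name__ == "__main__":
    # Made-up rows for illustration only; these are not DBpedia values.
    rows = [
        {"company": "ex:A", "foundingDate": "1980-01-01"},
        {"company": "ex:A", "foundingDate": "1977-06-15"},
    ]
    merged = collapse_rows(rows)
    print(earliest(merged["ex:A"]["foundingDate"]))
```

Keeping all values in a set until the last step means you can swap in a different rule later (for example, preferring a link whose host matches the company name) without rerunning the query.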

All that said, it looks like you can get a lot of this information from DBpedia.

Pierre

You could start with the following SPARQL query. It retrieves all the triples for a subject whose foaf:name is "Apple Inc.".

select distinct ?subject ?predicate ?object where { 
  ?subject ?predicate ?object .
  ?subject <http://xmlns.com/foaf/0.1/name> "Apple Inc."@en .
}

SPARQL results

subject     predicate   object
http://dbpedia.org/resource/Apple_Inc.  http://www.w3.org/1999/02/22-rdf-syntax-ns#type     http://www.w3.org/2002/07/owl#Thing
http://dbpedia.org/resource/Apple_Inc.  http://www.w3.org/1999/02/22-rdf-syntax-ns#type     http://dbpedia.org/ontology/Company
http://dbpedia.org/resource/Apple_Inc.  http://www.w3.org/1999/02/22-rdf-syntax-ns#type     http://www.opengis.net/gml/_Feature
http://dbpedia.org/resource/Apple_Inc.  http://www.w3.org/1999/02/22-rdf-syntax-ns#type     http://dbpedia.org/ontology/Organisation
http://dbpedia.org/resource/Apple_Inc.  http://www.w3.org/1999/02/22-rdf-syntax-ns#type     http://dbpedia.org/ontology/Agent
http://dbpedia.org/resource/Apple_Inc.  http://www.w3.org/1999/02/22-rdf-syntax-ns#type     http://schema.org/Organization
http://dbpedia.org/resource/Apple_Inc.  http://www.w3.org/1999/02/22-rdf-syntax-ns#type     http://dbpedia.org/class/yago/ComputerCompaniesOfTheUnitedStates
http://dbpedia.org/resource/Apple_Inc.  http://www.w3.org/1999/02/22-rdf-syntax-ns#type     http://dbpedia.org/class/yago/SoftwareCompaniesOfTheUnitedStates
http://dbpedia.org/resource/Apple_Inc.  http://www.w3.org/1999/02/22-rdf-syntax-ns#type     http://dbpedia.org/class/yago/RetailCompaniesOfTheUnitedStates
http://dbpedia.org/resource/Apple_Inc.  http://www.w3.org/1999/02/22-rdf-syntax-ns#type     http://dbpedia.org/class/yago/CompaniesEstablishedIn1976
http://dbpedia.org/resource/Apple_Inc.  http://www.w3.org/1999/02/22-rdf-syntax-ns#type     http://dbpedia.org/class/yago/ComputerHardwareCompanies
http://dbpedia.org/resource/Apple_Inc.  http://www.w3.org/1999/02/22-rdf-syntax-ns#type     http://umbel.org/umbel/rc/Organization
http://dbpedia.org/resource/Apple_Inc.  http://www.w3.org/1999/02/22-rdf-syntax-ns#type     http://dbpedia.org/class/yago/Company108058098
http://dbpedia.org/resource/Apple_Inc.  http://www.w3.org/1999/02/22-rdf-syntax-ns#type     http://dbpedia.org/class/yago/HomeComputerHardwareCompanies
http://dbpedia.org/resource/Apple_Inc.  http://www.w3.org/1999/02/22-rdf-syntax-ns#type     http://dbpedia.org/class/yago/CompaniesBasedInCupertino,California
http://dbpedia.org/resource/Apple_Inc.  http://www.w3.org/1999/02/22-rdf-syntax-ns#type     http://dbpedia.org/class/yago/MobilePhoneManufacturers
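
The same lookup can be extended from one name to a list of names by putting the names in a values block. A minimal sketch in Python follows; sparql_string and build_name_lookup are hypothetical helper names, and the "en" language tag is an assumption carried over from the query above. Note that this is an exact match on the literal, so a name has to be spelled exactly as DBpedia stores it ("Apple" alone would not match "Apple Inc."@en).

```python
def sparql_string(text, lang="en"):
    """Quote text as a SPARQL string literal with a language tag."""
    # Escape the backslash first so later escapes are not doubled.
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"@{lang}'


def build_name_lookup(names, lang="en"):
    """Build a query returning every triple for subjects with these names."""
    values = " ".join(sparql_string(n, lang) for n in names)
    return (
        "select distinct ?subject ?predicate ?object where {\n"
        f"  values ?name {{ {values} }}\n"
        "  ?subject <http://xmlns.com/foaf/0.1/name> ?name .\n"
        "  ?subject ?predicate ?object .\n"
        "}"
    )


if __name__ == "__main__":
    print(build_name_lookup(["Apple Inc.", "Microsoft"]))
```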