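My Wikidata Query Service query sometimes takes 35,000 ms (35 seconds) to complete. I'm not very good at SPARQL. The query below does work (apart from occasional duplicate rows). I want to get "celebrities" by supplying a birth day and month, and get back their name, date of birth, image (Wikimedia), and occupation. I also restrict the results to people from the United States and the United Kingdom (the query filters on country of citizenship, P27).
I added a variable called "sitelinks": I count how many sitelinks each person has and use that as a popularity metric (if there is a better way to measure popularity, I'm open to ideas). Is there a way to optimize this further? Again, the query works, it is just slow.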
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?person ?personLabel ?birthdate ?countryLabel (COUNT(DISTINCT(?sitelink)) as ?sites) (GROUP_CONCAT(DISTINCT ?occupationLabel; separator=", ") as ?occupations) (SAMPLE(?image) as ?uniqueImage)
WHERE {
?person wdt:P31 wd:Q5 ; # Instance of human
wdt:P569 ?birthdate ; # Date of birth
wdt:P27 ?country ; # Citizenship
wdt:P106 ?occupation ; # Occupation
wdt:P18 ?uniqueImage . # Image
?country rdfs:label ?countryLabel .
?occupation rdfs:label ?occupationLabel .
?sitelink schema:about ?person .
FILTER(LANG(?countryLabel) = "en")
FILTER(LANG(?occupationLabel) = "en")
FILTER(MONTH(?birthdate) = 5 && DAY(?birthdate) = 20)
FILTER(?country IN (wd:Q30, wd:Q145))
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
GROUP BY ?person ?personLabel ?birthdate ?countryLabel ?uniqueImage ORDER BY DESC(?sites)
LIMIT 50
If anyone wants to paste the query into the Wikidata Query Service, here is the link: https://query.wikidata.org/
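Some minor improvements:
- Since you already use SERVICE wikibase:label {...}, you no longer need rdfs:label, unless you want to require that the item has an English label or want to use the label variable in the SELECT part (as the query below still does for ?occupationLabel).
- Use VALUES ?country { wd:Q30 wd:Q145 } rather than filtering the values of ?country after the fact with FILTER(...).
PREFIX wd: <http://www.wikidata.org/entity/>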
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?person ?personLabel ?birthdate ?countryLabel (COUNT(DISTINCT(?sitelink)) as ?sites) (GROUP_CONCAT(DISTINCT ?occupationLabel; separator=", ") as ?occupations) (SAMPLE(?image) as ?uniqueImage)
WHERE {
?person wdt:P31 wd:Q5 ; # Instance of human
wdt:P569 ?birthdate ; # Date of birth
wdt:P27 ?country ; # Citizenship
wdt:P106 ?occupation ; # Occupation
wdt:P18 ?uniqueImage . # Image
?sitelink schema:about ?person .
?occupation rdfs:label ?occupationLabel .
FILTER(LANG(?occupationLabel) = "en")
FILTER(MONTH(?birthdate) = 5 && DAY(?birthdate) = 20)
VALUES ?country { wd:Q30 wd:Q145 }
SERVICE wikibase:label { bd:serviceParam wikibase:language "en,[AUTO_LANGUAGE]". }
}
GROUP BY ?person ?personLabel ?birthdate ?countryLabel ?uniqueImage ORDER BY DESC(?sites)
LIMIT 50
I don't think you can do much better than this. From experience, I can tell you that the schema:about property is very resource-intensive.
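I think it is the sitelink count that is killing your query, and I think I have fixed the duplicate-record problem. Would something like this work for your use case?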
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT
?person
?personLabel
?birthdate
?countryLabel
?website
(GROUP_CONCAT(DISTINCT ?occupationLabel; separator=", ") AS ?occupations)
(MIN(?image) AS ?uniqueImage)
WHERE {
?person wdt:P31 wd:Q5 ; # Instance of human
wdt:P569 ?birthdate ; # Date of birth
wdt:P27 ?country . # Citizenship
OPTIONAL { ?person wdt:P856 ?website } # Official website
?country rdfs:label ?countryLabel .
FILTER(?country IN (wd:Q30, wd:Q145))
FILTER(LANG(?countryLabel) = "en")
FILTER(DATATYPE(?birthdate) = xsd:dateTime
&& MONTH(?birthdate) = 5
&& DAY(?birthdate) = 20)
OPTIONAL {
?person wdt:P106 ?occupation . # Occupation
?occupation rdfs:label ?occupationLabel .
FILTER(LANG(?occupationLabel) = "en")
}
OPTIONAL { ?person wdt:P18 ?image } # Image
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
GROUP BY ?person ?personLabel ?birthdate ?countryLabel ?website
ORDER BY DESC(?website)
LIMIT 50
If you really do need the site count, though, you could probably use something like this:
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT
?person
?personLabel
?birthdate
?countryLabel
(COUNT(DISTINCT ?sitelink) AS ?sites)
(GROUP_CONCAT(DISTINCT ?occupationLabel; separator=", ") AS ?occupations)
(MIN(?image) AS ?uniqueImage)
WHERE {
?person wdt:P31 wd:Q5 ; # Instance of human
wdt:P569 ?birthdate ; # Date of birth
wdt:P27 ?country . # Citizenship
OPTIONAL { ?person wdt:P18 ?image } # Image
?sitelink schema:about ?person .
?country rdfs:label ?countryLabel .
FILTER(?country IN (wd:Q30, wd:Q145))
FILTER(LANG(?countryLabel) = "en")
FILTER(DATATYPE(?birthdate) = xsd:dateTime
&& MONTH(?birthdate) = 5
&& DAY(?birthdate) = 20)
OPTIONAL {
?person wdt:P106 ?occupation . # Occupation
?occupation rdfs:label ?occupationLabel .
FILTER(LANG(?occupationLabel) = "en")
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
GROUP BY ?person ?personLabel ?birthdate ?countryLabel
ORDER BY DESC(?sites)
LIMIT 50
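As a further sketch along the same lines: Wikidata's RDF model also stores a precomputed sitelink count on each item as wikibase:sitelinks, so the popularity figure can be read directly instead of being counted through schema:about. The variation below is untested against the live service and makes a few assumptions: it relies on the prefixes that the Wikidata Query Service predefines (wd:, wdt:, wikibase:, bd:, rdfs:), it reuses the 20 May example date, and it may or may not be faster for your data.

```sparql
SELECT
  ?person
  ?personLabel
  ?birthdate
  ?countryLabel
  ?sites
  (GROUP_CONCAT(DISTINCT ?occupationLabel; separator=", ") AS ?occupations)
  (SAMPLE(?image) AS ?uniqueImage)
WHERE {
  # Restrict the countries up front instead of filtering afterwards
  VALUES ?country { wd:Q30 wd:Q145 }
  ?person wdt:P31 wd:Q5 ;               # Instance of human
          wdt:P569 ?birthdate ;         # Date of birth
          wdt:P27 ?country ;            # Citizenship
          wikibase:sitelinks ?sites .   # Precomputed number of sitelinks
  FILTER(MONTH(?birthdate) = 5 && DAY(?birthdate) = 20)
  OPTIONAL { ?person wdt:P18 ?image }   # Image
  OPTIONAL {
    ?person wdt:P106 ?occupation .      # Occupation
    ?occupation rdfs:label ?occupationLabel .
    FILTER(LANG(?occupationLabel) = "en")
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
GROUP BY ?person ?personLabel ?birthdate ?countryLabel ?sites
ORDER BY DESC(?sites)
LIMIT 50
```

Because ?sites is a plain value here rather than an aggregate, it has to appear in GROUP BY. A person who holds both citizenships, or who has more than one recorded date of birth, would still come back as more than one row.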