How To Use Wikipedia API To Get Section Of Sidebar?
I have a command-line program that is passed the name of a species (e.g. Fusulinida). It needs to return the plaintext of the taxonomy section of the sidebar. I can get
Solution 1:
I hope this helps:
import requests

def getTaxonomy(title):
    # Fetch the wikitext of section 0 (the lead), which contains the taxobox.
    # Passing params lets requests URL-encode the title.
    r = requests.get('https://en.wikipedia.org/w/api.php', params={
        'action': 'query', 'titles': title, 'prop': 'revisions',
        'rvprop': 'content', 'rvsection': 0, 'format': 'json'})
    #https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&prop=revisions&format=json&rvprop=content&rvsection=0&rvcontentformat=text%2Fx-wiki&titles=Foraminifera
    a = ''
    t = r.json()
    # 'pages' is keyed by page ID; a single title yields a single entry.
    for i in t['query']['pages']:
        a = t['query']['pages'][i]['revisions'][0]['*']
    # Keep everything after the opening '{{Taxobox' (matched case-insensitively).
    taxobox = a[a.upper().index('{{TAXOBOX') + len('{{taxobox'):]
    # Skip ahead to the first line that starts with a wikilink ...
    taxobox = taxobox[taxobox.index("\n[["):]
    # ... and cut at the first closing braces.
    taxobox = taxobox[:taxobox.index("}}")]
    # Strip wiki markup.
    taxobox = taxobox.replace('[[', '')
    taxobox = taxobox.replace(']]', '')
    taxobox = taxobox.replace('<br>', '')
    taxobox = taxobox.replace("''", '')
    taxobox = taxobox.replace("&nbsp;", ' ')  # non-breaking space entity
    t = []
    for i in taxobox.split("\n"):
        if len(i) > 0:
            if '|' in i:  # piped link: keep the display text after '|'
                t.append(i.split('|')[1])
            else:
                t.append(i)
    return "\n".join(t)

print(getTaxonomy('Foraminifera'))
print(getTaxonomy('Fusulinida'))
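The string handling in getTaxonomy can be tried without a network call by separating it from the HTTP request. What follows is only a minimal sketch and not part of the original answer: parse_taxobox and SAMPLE_WIKITEXT are hypothetical names, and the sample is a hand-written fragment in the shape the answer appears to assume (a {{Taxobox}} whose classification lines start with [[), not real article content.

```python
def parse_taxobox(wikitext):
    """Extract plain-text classification lines from {{Taxobox ...}} wikitext.

    Hypothetical helper that follows the same parsing steps as getTaxonomy.
    Raises ValueError (from str.index) if an expected marker is missing.
    """
    # Everything after the opening '{{Taxobox', matched case-insensitively.
    start = wikitext.upper().index('{{TAXOBOX') + len('{{taxobox')
    body = wikitext[start:]
    # From the first line that begins with a wikilink to the first '}}'.
    body = body[body.index("\n[["):]
    body = body[:body.index("}}")]
    # Strip link brackets, line breaks, and italics markup.
    for markup in ('[[', ']]', '<br>', "''"):
        body = body.replace(markup, '')
    body = body.replace('&nbsp;', ' ')
    lines = []
    for line in body.split("\n"):
        if not line:
            continue
        # Piped link 'Target|Label': keep the label.
        lines.append(line.split('|')[1] if '|' in line else line)
    return "\n".join(lines)


# Hand-written sample in the assumed shape; not taken from a real article.
SAMPLE_WIKITEXT = (
    "{{Taxobox\n"
    "| name = Example\n"
    "| regnum =\n"
    "[[Rhizaria]]<br>\n"
    "[[Foraminifera|Forams]]\n"
    "}}\n"
    "Article text."
)

print(parse_taxobox(SAMPLE_WIKITEXT))
```

Because str.index raises ValueError when a marker is absent, a caller may want to catch that exception for pages whose lead section does not contain a template literally named Taxobox.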