Scrape a Wikipedia table using Python
Today I am going to demonstrate how to scrape a Wikipedia table using Python and export it to a CSV file. Before starting, you have to install two Python packages.
External libraries:
1-pandas library
2-wikipedia library
Installation Process:
1-pip install pandas
2-pip install wikipedia
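To confirm that the installation worked before running the scraper, a small check like the one below can help. This is only a sketch: the helper name missing_packages is my own, and lxml is listed as an assumption because pandas needs an HTML parser (lxml, or BeautifulSoup with html5lib) to read tables from HTML.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


# lxml is an assumption here: pandas.read_html needs an HTML parser,
# and lxml is the one it tries first.
required = ["pandas", "wikipedia", "lxml"]
missing = missing_packages(required)

if missing:
    print("Install first: pip install " + " ".join(missing))
else:
    print("All required packages are available.")
```

If anything is reported as missing, install it with pip and run the check again.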
Import the libraries at the top of your Python program:
import pandas as pd
import wikipedia as wiki
Sample Program:
Example: Scrape the list of IIT colleges
Code:
from io import StringIO

import pandas as pd
import wikipedia as wiki

# Get the HTML source of the page that matches the keywords
html = wiki.page("List of IIT in india").html()
# Parse every table on the page and keep the second one (index 1)
df = pd.read_html(StringIO(html))[1]
# Write the table to CSV; header=False omits the column names (use header=True to keep them)
df.to_csv('wiki_table.csv', header=False, index=False)
# Print the data frame
print(df)
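Picking the table by position ([1]) breaks as soon as the page layout changes. One possible alternative is to let pandas select the table by a piece of text it contains, using the match parameter of read_html. The sketch below runs on a small hand-written HTML sample instead of a live Wikipedia page, so SAMPLE_HTML and the helper find_table are illustrative assumptions, not part of the original program.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for the string returned by wiki.page(...).html()
SAMPLE_HTML = """
<table>
  <tr><th>Note</th></tr>
  <tr><td>Unrelated table</td></tr>
</table>
<table>
  <tr><th>Name</th><th>Established</th></tr>
  <tr><td>IIT Kharagpur</td><td>1951</td></tr>
  <tr><td>IIT Bombay</td><td>1958</td></tr>
</table>
"""


def find_table(html: str, text: str) -> pd.DataFrame:
    """Return the first table in `html` that contains `text`."""
    # StringIO wraps the markup so pandas treats it as a file-like object.
    # match keeps only the tables whose text matches the given string.
    tables = pd.read_html(StringIO(html), match=text)
    return tables[0]


df = find_table(SAMPLE_HTML, "Established")
print(df)
```

With a real page, you would pass the HTML from the wikipedia library and a column title you expect in the table. If no table matches, read_html raises a ValueError, which is easier to notice than silently getting the wrong table.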
You can find the complete code in my GitHub repository.