I recently learned about screen scraping from my trainer, Mr. Kamesh, and it was a good one.
Scraping:
Screen scraping is the process of extracting data from a web page: a computer program pulls out just the data we want from a website.
Sometimes it is really handy to retrieve data from our favourite website without CTRL+C and CTRL+V,
just to view the data we want to see.
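To make the idea concrete, here is a minimal sketch that pulls the visible text out of a small piece of HTML with a regular expression. The HTML snippet and the function name extract_text are made up for illustration, and a regex like this only suits simple, well-behaved markup.

```python
import re

# Hypothetical HTML snippet, made up for illustration only.
SAMPLE_HTML = "<ul><li>Alice</li><li>Bob</li></ul>"

def extract_text(html):
    """Return the non-blank pieces of text that sit between tags."""
    # Capture any run of characters between a '>' and the next '<'.
    pieces = re.findall(r'>([^<]+)<', html)
    # Drop pieces that are only whitespace (for example newlines between tags).
    return [p.strip() for p in pieces if p.strip()]

print(extract_text(SAMPLE_HTML))
```

The program gets the names without any copying and pasting by hand, which is the whole point of scraping.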
So I tried it with a program in Python. I tried to retrieve the name of a candidate who was recently placed from our college.
My target URL: www.saec.ac.in
The data I scraped is the person from the M.B.A department of our college who got placed.
Finally, I ended up with this code, which gives me the result:
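# Written for Python 3.
import re
import urllib.request

url = 'http://saec.ac.in/campus/lcube.html'
response = urllib.request.urlopen(url)
# The page encoding is assumed to be UTF-8; undecodable bytes are replaced.
content = response.read().decode('utf-8', errors='replace')

# Keep only the first table on the page.
start = content.index("<table")
stop = content.index("</table")
content = content[start:stop]

# Skip the first row by cutting just past its closing "</tr".
start = content.index("</tr")
content = content[start + 4:]

# Keep only the next row, which holds the data we want.
start = content.index("<tr")
stop = content.index("</tr")
content = content[start:stop]

# Print every piece of text found between tags, all on one line.
for field in re.findall('>([^<]+)<', content):
    print(field.replace('\n ', ''), end=' ')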
I have just tried a bit of scraping here, and I am building on it to scrape data in bulk.
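One possible direction for the bulk case is to read every row of the table instead of just one. The sketch below uses Python's built-in html.parser module rather than string slicing. The class name TableRows, the helper table_rows, and the sample table are all hypothetical, and the real page may be laid out differently.

```python
from html.parser import HTMLParser

class TableRows(HTMLParser):
    """Collect the cell text of every <tr> as a list of strings."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text chunks of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == 'tr':
            self._row = []
        elif tag in ('td', 'th') and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ('td', 'th') and self._cell is not None:
            self._row.append(''.join(self._cell).strip())
            self._cell = None
        elif tag == 'tr' and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None

def table_rows(html):
    """Return all table rows found in html, each as a list of cell strings."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows

# Hypothetical table, made up for illustration only.
SAMPLE = """
<table>
<tr><th>Name</th><th>Department</th></tr>
<tr><td>Student A</td><td>M.B.A</td></tr>
<tr><td>Student B</td><td>M.B.A</td></tr>
</table>
"""

# Skip the header row and print the rest, one row per line.
for row in table_rows(SAMPLE)[1:]:
    print(', '.join(row))
```

To try it on a real page, the downloaded and decoded page text would be passed to table_rows in place of SAMPLE.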