Friday 6 January 2012

Screen scraping....

I just got an introduction to screen scraping from my trainer, Mr. Kamesh, and it was a good one.

Scraping:

Screen scraping is the process of extracting data from a web page. It is a technique in which a computer program pulls the data we want out of a website.

Sometimes it is really nice to retrieve data from our favourite website without using CTRL+C and CTRL+V to get at the data we want to see.

So I tried it with a program in Python. I tried to retrieve the name of the candidate who was recently placed from our college.

My target URL: www.saec.ac.in

The data I scraped is the person from the M.B.A department of our college who got placed, and I finally ended up with the code below, which gives me that result.

import re
import urllib.request

url = 'http://saec.ac.in/campus/lcube.html'
response = urllib.request.urlopen(url)
content = response.read().decode('utf-8', errors='replace')

# Keep only the first table on the page.
start = content.index("<table")
stop = content.index("</table")
content = content[start:stop]

# Skip past the first row (everything up to the first </tr).
start = content.index("</tr")
content = content[start + 4:]

# Keep only the row that follows it.
start = content.index("<tr")
stop = content.index("</tr")
content = content[start:stop]

# Print the text found between tags, i.e. the cell values of that row.
for field in re.findall('>([^<]+)<', content):
    print(field.replace('\n ', ''), end=' ')

That was just a first bit of scraping. I'm now building on it to scrape data in bulk.
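
As a rough idea of what the bulk version could look like, here is a minimal sketch that collects every row of a table instead of just one. It uses the standard library's html.parser rather than string slicing, which is my own substitution and not necessarily how the final version will be written. The names TableRows and scrape_rows and the sample HTML are made up for illustration; the real page's layout may differ.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every table cell, grouped by row (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text pieces of the cell being read, or None

    def _end_cell(self):
        # Store the current cell with its whitespace collapsed.
        if self._cell is not None and self._row is not None:
            self._row.append(" ".join("".join(self._cell).split()))
        self._cell = None

    def _end_row(self):
        self._end_cell()
        if self._row is not None:
            self.rows.append(self._row)
        self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._end_row()   # copes with a missing </tr>
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._end_cell()  # copes with a missing </td>
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._end_cell()
        elif tag in ("tr", "table"):
            self._end_row()


def scrape_rows(html):
    """Return all table rows found in an HTML string as lists of cell text."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


# Made-up sample standing in for the downloaded page content.
sample = """
<table>
  <tr><th>Name</th><th>Department</th></tr>
  <tr><td>Student A</td><td>M.B.A</td></tr>
  <tr><td>Student B</td><td>M.B.A</td></tr>
</table>
"""

for row in scrape_rows(sample)[1:]:  # skip the header row
    print(", ".join(row))
```

To try it on a real page, the downloaded and decoded page content would be passed to scrape_rows in place of sample.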

 
