Thank you for your reply.
But I have to use only BeautifulSoup, because I want to improve my programming skills.
Below is my simple Python code, which retrieves the name, price, and link
of every mobile phone on flipkart.com:
import MySQLdb
import urllib2
from bs4 import BeautifulSoup

db = MySQLdb.connect("localhost", "root", "", "rate")
cursor = db.cursor()
# Parameterized query: the driver escapes the values, so a quote in a
# phone name cannot break the statement.
sql = "INSERT INTO flipkart (M_NAME, M_PRICE, ADDRESS) VALUES (%s, %s, %s)"
for i in range(1, 2317):
    # One HTTP request per value of the start parameter.
    url = ("http://www.flipkart.com/mobiles/pr?sid=tyy,4io&start=" + str(i) +
           "&otracker=nmenu_sub_electronics_0_All%20Brands")
    page = urllib2.urlopen(url)
    soup = BeautifulSoup(page)
    # find() returns only the first matching element on the page.
    links = soup.find("a", {"class": "fk-display-block"})
    prices = soup.find("span", {"class": "fk-font-17 fk-bold"})
    name = links.text.lstrip()
    price = prices.text
    address = "flipkart.com" + links.get('href')
    print '\n\nPhone No = ', i
    print name + '\n' + price + '\n' + address
    try:
        cursor.execute(sql, (name, price, address))
        db.commit()
    except MySQLdb.Error:
        db.rollback()
db.close()
This code gets all the details successfully, but not quickly.
Is there any way to make it faster?
Thanks in advance...
With Regards
S. Praveen
http://praveenlearner.wordpress.com
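
On the speed question, a rough sketch of one possible direction, not a tested fix for this script: the loop waits for each HTTP request to finish before starting the next one, and commits after every row. Fetching several pages at once with a thread pool and writing the rows in one batch may help. Everything below is an assumption for illustration: it targets Python 3 (concurrent.futures), scrape_one, scrape_all and save_rows are hypothetical names, scrape_one returns dummy data in place of the real fetch-and-parse step, and sqlite3 stands in for MySQLdb so the sketch runs without a MySQL server.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def scrape_one(i):
    # Hypothetical placeholder: the real version would fetch listing i and
    # extract name, price and link with BeautifulSoup, as the loop in the
    # script does. Dummy values keep the sketch runnable offline.
    return ("Phone %d" % i, "Rs. %d" % (1000 + i), "flipkart.com/item/%d" % i)


def scrape_all(start_indexes, workers=8):
    # Threads suit this job because each worker spends most of its time
    # waiting on the network. map() yields results in input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(scrape_one, start_indexes))


def save_rows(db, rows):
    # One executemany() and one commit instead of one commit per row.
    # sqlite3 uses "?" placeholders; MySQLdb uses "%s" instead.
    cursor = db.cursor()
    cursor.executemany(
        "INSERT INTO flipkart (M_NAME, M_PRICE, ADDRESS) VALUES (?, ?, ?)",
        rows)
    db.commit()


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE flipkart (M_NAME TEXT, M_PRICE TEXT, ADDRESS TEXT)")
    rows = scrape_all(range(1, 21))
    save_rows(db, rows)
    print(db.execute("SELECT COUNT(*) FROM flipkart").fetchone()[0])
```

The worker count is a guess; a small number is kinder to the site, and too many parallel requests may get the client blocked.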
On Thu, Apr 3, 2014 at 11:36 PM, Shrinivasan T <tshrinivasan at gmail.com> wrote:
Not a direct answer, but related.
Portia is a tool for visually scraping web sites without any
programming knowledge. Just annotate web pages with a point and click
editor to indicate what data you want to extract, and Portia will
learn how to scrape similar pages from the site. Portia has a web
based UI served by a Twisted server, so you can install it on almost
any modern platform.
http://blog.scrapinghub.com/2014/04/01/announcing-portia/
https://github.com/scrapinghub/portia
These may be useful for you.
--
Regards,
T.Shrinivasan
My Life with GNU/Linux : http://goinggnu.wordpress.com
Free E-Magazine on Free Open Source Software in Tamil : http://kaniyam.com
Get CollabNet Subversion Edge : http://www.collab.net/svnedge
Kanchilug Blog : http://kanchilug.wordpress.com