python - How can I use Beautiful Soup to get the current price of a stock on Google Finance?

- July 15, 2014

I have the following Python code, and my goal is to get the current price of the stock, $110.80.

import urlparse
import urllib2
import pdb
from bs4 import BeautifulSoup
from pprint import pprint

url = "https://www.google.com.hk/finance?q=0001&ei=yf14vyc4f4wd0asb64cocw"

def webcrawl(url):
    htmltext = urllib2.urlopen(url).read()
    soup = BeautifulSoup(htmltext)
    p = soup.find()
    print p

webcrawl(url)

Now, when I print soup, the number 110.80 appears in multiple places, for example:

{u:"/finance?q=hkg:0001",name:"0001",cp:"-1.07",p:"110.80",cid:"164573760542896"}

and

<span id="ref_164573760542896_l">110.80</span>

and

<meta content="110.80" itemprop="price"/>

First question: which is the right place within the HTML text to read the current price of the stock, since the price seems to occur in multiple areas of the HTML?

Second question: what should I put in soup.find() or soup.find_all() so that I can obtain the current price of a particular stock? Can anyone help me out here, please?

find() lets you find a tag within the HTML DOM. For example, if you want the title of a website you can do something like bs.find("title"), which returns the first instance of title (like: <title>some title here</title>). You can also filter tags by attributes. A lot of websites have tons of divs, so if you want the divs that have the class red, do: bs.find('div', attrs={'class': 'red'}). This returns the first div that has the class red. Read the documentation for more detail.
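As a minimal offline sketch of the two calls described above, the snippet below parses a small hand-written HTML string instead of a live page. The markup, the variable names, and the use of Python 3 with the built-in "html.parser" backend are assumptions made for illustration; they are not part of the original question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, invented for this example only.
html = """
<html>
  <head><title>some title here</title></head>
  <body>
    <div class="blue">first div</div>
    <div class="red">second div</div>
    <div class="red">third div</div>
  </body>
</html>
"""

soup = BeautifulSoup(html, "html.parser")

# find() returns the first matching tag, or None if nothing matches.
title = soup.find("title")
print(title.text)

# Filter by attribute: only the first div whose class is "red" is returned.
first_red = soup.find("div", attrs={"class": "red"})
print(first_red.text)

# find_all() returns every match as a list.
all_red = soup.find_all("div", attrs={"class": "red"})
print(len(all_red))
```

Because find() returns None when nothing matches, it is usually worth checking the result before reading .text from it.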

For example, to obtain the stock price:

import urllib2
from bs4 import BeautifulSoup

url = "https://www.google.com.hk/finance?q=0001&ei=yf14vyc4f4wd0asb64cocw"

def webcrawl(url):
    htmltext = urllib2.urlopen(url).read()
    soup = BeautifulSoup(htmltext)
    # Look up the span by its id and read its text content.
    p = soup.find("span", attrs={"id": "ref_164573760542896_l"}).text
    print p

webcrawl(url)

For the meta tag you can do:

p = soup.find("meta", attrs={"itemprop": "price"})
print p['content']
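To try the span and meta lookups without depending on the live page, the following sketch runs both against the two fragments quoted in the question. It is written for Python 3 rather than the Python 2 of the snippets above. The helper name extract_price and the order of the fallback are assumptions for illustration. The span id is simply the one from the question; it embeds that stock's cid, so it would presumably differ for other stocks.

```python
from bs4 import BeautifulSoup

# The two HTML fragments quoted in the question, used here as static input.
SAMPLE_HTML = """
<span id="ref_164573760542896_l">110.80</span>
<meta content="110.80" itemprop="price"/>
"""


def extract_price(html):
    """Return the price as a string, or None if neither tag is present.

    Hypothetical helper: tries the meta tag first, then falls back to the span.
    """
    soup = BeautifulSoup(html, "html.parser")

    meta = soup.find("meta", attrs={"itemprop": "price"})
    if meta is not None and meta.has_attr("content"):
        return meta["content"]

    span = soup.find("span", attrs={"id": "ref_164573760542896_l"})
    if span is not None:
        return span.text.strip()

    return None


print(extract_price(SAMPLE_HTML))
```

Checking for None before reading .text or indexing ['content'] avoids the AttributeError or TypeError that the one-line versions would raise if the page layout changes.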
