python - How can I use Beautiful Soup to get the current price of a stock on Google Finance?
I have the following Python code, and the goal is to get the current price of the stock, $110.80.
    import urlparse
    import urllib2
    import pdb
    from bs4 import BeautifulSoup
    from pprint import pprint

    url = "https://www.google.com.hk/finance?q=0001&ei=yf14vyc4f4wd0asb64cocw"

    def webcrawl(url):
        htmltext = urllib2.urlopen(url).read()
        soup = BeautifulSoup(htmltext)
        p = soup.find()
        print p

    webcrawl(url)
Now, when I print soup, the number 110.80 appears in multiple places, for example:
{u:"/finance?q=hkg:0001",name:"0001",cp:"-1.07",p:"110.80",cid:"164573760542896"}
and
<span id="ref_164573760542896_l">110.80</span>
and
<meta content="110.80" itemprop="price"/>
First question: which is the right place within the HTML text to read the current price of the stock, since the price seems to occur in multiple areas of the HTML text?
Second question: what should I put in the soup.find() or soup.find_all() call so that I can obtain the current price of a particular stock? Can anyone help me out here, please?
find() allows you to find a tag within the HTML DOM. For example, if you want the title of a website, you can do something like bs.find("title"), which will return the first instance of title (like: <title>some title here</title>). You can also filter tags by their attributes. A lot of websites have tons of divs, so if you only want divs that have the class red, you can do: bs.find('div', attrs={'class': 'red'}). This will return the first div that has the class red. Read the documentation for more detail.
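As a minimal sketch of the behaviour described above, the snippet below runs find() and find_all() against a small made-up HTML string rather than a live page. The markup, the variable names, and the choice of the built-in "html.parser" are assumptions for illustration, and it is written for Python 3 with bs4 installed.

```python
from bs4 import BeautifulSoup

# Made-up markup for illustration only; not taken from any real page.
html = """
<html>
  <head><title>some title here</title></head>
  <body>
    <div class="blue">first</div>
    <div class="red">second</div>
    <div class="red">third</div>
  </body>
</html>
"""

bs = BeautifulSoup(html, "html.parser")

title_tag = bs.find("title")                           # first <title> tag
first_red = bs.find("div", attrs={"class": "red"})     # first div with class "red"
all_red = bs.find_all("div", attrs={"class": "red"})   # every matching div, as a list
missing = bs.find("table")                             # no match: find() returns None

print(title_tag.text)
print(first_red.text)
print([d.text for d in all_red])
print(missing)
```

Since find() returns None when nothing matches, it is worth checking the result before reading .text or an attribute from it.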
For example, to obtain the stock price:
    import urllib2
    from bs4 import BeautifulSoup

    url = "https://www.google.com.hk/finance?q=0001&ei=yf14vyc4f4wd0asb64cocw"

    def webcrawl(url):
        htmltext = urllib2.urlopen(url).read()
        soup = BeautifulSoup(htmltext)
        # The span id is the one shown in the question's HTML for this stock.
        p = soup.find("span", attrs={"id": "ref_164573760542896_l"}).text
        print p

    webcrawl(url)
For the meta tag, you can do:
    p = soup.find("meta", attrs={"itemprop": "price"})
    print p['content']
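Because a page's layout can change, or a request can return something other than the expected page, either lookup may come back as None. The following is a hedged sketch of a more defensive version, not a definitive implementation. extract_price is a hypothetical helper name, the "ref_<digits>_l" id pattern is only inferred from the single span quoted in the question, and the sample strings simply repeat the fragments quoted there. It assumes Python 3 with bs4 installed.

```python
import re
from bs4 import BeautifulSoup

# Assumption: price span ids follow the "ref_<digits>_l" shape seen in the question.
PRICE_SPAN_ID = re.compile(r"^ref_\d+_l$")

def extract_price(html):
    """Return the price as a float, or None if it cannot be found or parsed."""
    soup = BeautifulSoup(html, "html.parser")

    # Prefer the meta tag, since it does not depend on a per-stock id.
    meta = soup.find("meta", attrs={"itemprop": "price"})
    if meta is not None and meta.get("content"):
        raw = meta["content"]
    else:
        # Fall back to the span whose id matches the assumed pattern.
        span = soup.find("span", attrs={"id": PRICE_SPAN_ID})
        if span is None:
            return None
        raw = span.text

    try:
        # Drop thousands separators and whitespace before converting.
        return float(raw.replace(",", "").strip())
    except ValueError:
        return None

# Fragments repeated from the question, used here as offline test input.
sample = '<span id="ref_164573760542896_l">110.80</span><meta content="110.80" itemprop="price"/>'
span_only = '<span id="ref_164573760542896_l">110.80</span>'

print(extract_price(sample))
print(extract_price(span_only))
print(extract_price("<p>no price here</p>"))
```

Trying the meta tag first avoids hard-coding the numeric id, which appears to be specific to one stock.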