Python web crawler note
4.4 Google Finance API

Section 3.4 obtained Google Finance stock information by scraping the web page directly. This section shows how to get a similar result through the Google Finance API.

import requests
import json
from datetime import datetime, timedelta


GOOGLE_FINANCE_API_URL = 'http://finance.google.com/finance/info?client=ig&q='
GOOGLE_FINANCE_HISTORY_API_URL = 'http://www.google.com/finance/getprices?q='


def get_stock(query):
    # Several stocks can be queried at once by separating them with ","
    resp = requests.get(GOOGLE_FINANCE_API_URL + query)
    if resp.status_code != 200:
        print('Invalid url or query param: ' + resp.url)
        return None
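    else:
        # The response body starts with '//'. Strip only that leading marker,
        # so that any '//' inside the JSON values (e.g. URLs) stays intact
        text = resp.text.strip()
        if text.startswith('//'):
            text = text[2:]
        return json.loads(text)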


def get_stock_history(stock_id, stock_mkt):
    resp = requests.get(GOOGLE_FINANCE_HISTORY_API_URL + stock_id + '&x=' + stock_mkt + '&i=86400&p=1M')
    ''' e.g.,
    EXCHANGE%3DTPE
    MARKET_OPEN_MINUTE=540
    MARKET_CLOSE_MINUTE=810
    INTERVAL=86400
    COLUMNS=DATE,CLOSE,HIGH,LOW,OPEN,VOLUME
    DATA=
    TIMEZONE_OFFSET=480
    a1488346200,186,188.5,186,188.5,46176000
    1,186,188.5,185,188,39914000
    2,184,185,184,184.5,28085000
    5,183.5,184.5,183.5,184,12527000
    ...
    '''
    index = -1
    records = resp.text.split('\n')
    for i, record in enumerate(records):
        # A row starting with 'a' marks the first stock record
        if record.startswith('a'):
            index = i
            break
    if index >= 0:
        records = records[index:]
        # The first record starts with 'a' followed by a Unix timestamp;
        # convert it to a human-readable datetime
        unix_time = int(records[0].split(',')[0][1:])
        init_time = datetime.fromtimestamp(unix_time)

        # Handle the first row
        first_row = records[0].split(',')
        first_row[0] = init_time

        history = list()
        history.append(first_row)

        # To handle the rest of stock records
        for record in records[1:]:
            if record:
                data = record.split(',')
                delta = int(data[0])
                data[0] = init_time + timedelta(days=delta)
                history.append(data)
        return history
    else:
        return None


def main():
    query = 'TPE:2330'
    print('Real time stock price for ' + query)
    stocks = get_stock(query)
    # get_stock() returns None when the request fails
    if stocks:
        print(stocks[0])
    print('\n')
    stock_id = '2330'
    stock_mkt = 'TPE'
    print('Stock price history for ' + stock_mkt + ":" + stock_id)
    print('(Date, Close, High, Low, Open, Volume)')
    history = get_stock_history(stock_id, stock_mkt)
    # get_stock_history() returns None when no price rows are found
    for hist in history or []:
        print(hist[0].strftime("%Y/%m/%d"), hist[1:])


if __name__ == '__main__':
    main()
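
The row format handled by get_stock_history can also be tried without a network request. The following is a minimal sketch, not the author's implementation: `SAMPLE` is copied from the example comment inside get_stock_history, and `parse_prices` is a hypothetical helper introduced only for illustration. It assumes that the integer at the start of each later row is an offset counted in units of `INTERVAL` (one day here, which matches `timedelta(days=delta)` above) and that `TIMEZONE_OFFSET` is given in minutes east of UTC.

```python
from datetime import datetime, timedelta, timezone

# Sample response text, copied from the comment in get_stock_history
SAMPLE = """EXCHANGE%3DTPE
MARKET_OPEN_MINUTE=540
MARKET_CLOSE_MINUTE=810
INTERVAL=86400
COLUMNS=DATE,CLOSE,HIGH,LOW,OPEN,VOLUME
DATA=
TIMEZONE_OFFSET=480
a1488346200,186,188.5,186,188.5,46176000
1,186,188.5,185,188,39914000
2,184,185,184,184.5,28085000
5,183.5,184.5,183.5,184,12527000
"""


def parse_prices(text):
    """Hypothetical helper: turn the text into (datetime, [fields]) rows."""
    interval = 86400   # seconds per step; replaced by the INTERVAL header
    tz = timezone.utc  # replaced by TIMEZONE_OFFSET (assumed to be minutes)
    base = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith('INTERVAL='):
            interval = int(line.split('=', 1)[1])
        elif line.startswith('TIMEZONE_OFFSET='):
            tz = timezone(timedelta(minutes=int(line.split('=', 1)[1])))
        elif line.startswith('a'):
            # 'a' + Unix timestamp: the absolute anchor for the rows after it
            stamp, *fields = line.split(',')
            base = datetime.fromtimestamp(int(stamp[1:]), tz)
            rows.append((base, fields))
        elif base is not None and line[0].isdigit():
            # Plain integer: offset from the anchor, assumed to be in intervals
            offset, *fields = line.split(',')
            step = timedelta(seconds=int(offset) * interval)
            rows.append((base + step, fields))
    return rows


for when, fields in parse_prices(SAMPLE):
    print(when.strftime('%Y/%m/%d'), fields)
```

Unlike `datetime.fromtimestamp(unix_time)` in the listing above, which uses the local time zone of the machine running the script, this sketch applies the offset from the response header, so the dates do not depend on where the code runs.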

Output:

Real time stock price for TPE:2330
{'id': '674465', 't': '2330', 'e': 'TPE', 'l': '207.00', 'l_fix': '207.00', 'l_cur': 'NT$207.00', 's': '0', 'ltt': '1:30PM GMT+8', 'lt': 'May 26, 1:30PM GMT+8', 'lt_dts': '2017-05-26T13:30:02Z', 'c': '0.00', 'c_fix': '0.00', 'cp': '0.00', 'cp_fix': '0.00', 'ccol': 'chb', 'pcls_fix': '207'}


Stock price history for TPE:2330
(Date, Close, High, Low, Open, Volume)
2017/04/28 ['194.5', '194.5', '193', '193.5', '34837000']
2017/05/02 ['196.5', '199', '195.5', '198.5', '44102000']
2017/05/03 ['198', '198.5', '197', '198', '25702000']
2017/05/04 ['198', '199', '197', '198.5', '22076000']
2017/05/05 ['197.5', '198.5', '197', '197', '17022000']
2017/05/08 ['202.5', '202.5', '199', '199', '36514000']
2017/05/09 ['203.5', '207', '203.5', '205.5', '48047000']
2017/05/10 ['205.5', '206', '204', '204', '28296000']
2017/05/11 ['207.5', '208.5', '204', '204.5', '43692000']
2017/05/12 ['206', '207', '205', '205', '24996000']
2017/05/15 ['206', '206', '204', '204', '25655000']
2017/05/16 ['204.5', '205', '203.5', '205', '33212000']
2017/05/17 ['204', '204', '203', '203', '21558000']
2017/05/18 ['203.5', '204', '201.5', '202.5', '22448000']
2017/05/19 ['203', '204.5', '202.5', '203.5', '16483000']
2017/05/22 ['205', '205', '203', '203.5', '13714000']
2017/05/23 ['205', '207', '204.5', '205', '19988000']
2017/05/24 ['205.5', '206', '205', '205', '13378000']
2017/05/25 ['207', '207', '205.5', '206', '22311000']
2017/05/26 ['207', '207', '205', '205', '32492000']
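
Every price field in the history rows is still a string. Before doing any arithmetic, the rows need to be converted to numbers. The sketch below is one possible way to do that, using three rows copied from the output above; `to_record` and the dictionary keys are hypothetical names chosen for this example, and the field order follows the printed header (Close, High, Low, Open, Volume).

```python
from datetime import datetime

# Three rows copied from the output: (date, [close, high, low, open, volume])
ROWS = [
    ('2017/05/24', ['205.5', '206', '205', '205', '13378000']),
    ('2017/05/25', ['207', '207', '205.5', '206', '22311000']),
    ('2017/05/26', ['207', '207', '205', '205', '32492000']),
]


def to_record(date_text, fields):
    """Hypothetical helper: convert one printed row into typed values."""
    close, high, low, open_, volume = fields
    return {
        'date': datetime.strptime(date_text, '%Y/%m/%d').date(),
        'close': float(close),
        'high': float(high),
        'low': float(low),
        'open': float(open_),
        'volume': int(volume),
    }


records = [to_record(date_text, fields) for date_text, fields in ROWS]

# With numeric values, simple statistics become straightforward
average_close = sum(r['close'] for r in records) / len(records)
total_volume = sum(r['volume'] for r in records)
lowest = min(records, key=lambda r: r['low'])

print('Average close:', average_close)
print('Total volume:', total_volume)
print('Lowest price:', lowest['low'], 'on', lowest['date'])
```

Keeping the conversion in one small function means the parsing code can stay as it is, and the same records can later be written to CSV or SQLite as described in chapter 5.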
