Python web crawler note

4.3 IMDb Movie Information Lookup

Last updated 5 years ago

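IMDb is a very popular movie information site, but it does not offer a public API of its own. This section therefore retrieves IMDb movie data through a third-party service called OMDb API. Using OMDb API requires an API key, which you need to pay for and obtain yourself (API_KEY).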
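As a minimal sketch of how the two kinds of OMDb requests used in this section could be assembled, the snippet below builds the URLs with `urllib.parse.urlencode` so that spaces and special characters in the keyword are escaped automatically. The query parameters `apikey`, `s`, `i`, and `page` are the ones used by this section's script; the helper names `build_search_url` and `build_id_url` are hypothetical and only serve the illustration.

```python
from urllib.parse import urlencode

# Same host as used in this section's script.
OMDB_ENDPOINT = 'http://www.omdbapi.com/'


def build_search_url(api_key, keyword, page=1):
    """Hypothetical helper: URL for a keyword search (the `s` parameter)."""
    params = {'apikey': api_key, 's': keyword, 'page': page}
    # urlencode escapes the values, e.g. 'iron man' becomes 'iron+man'.
    return OMDB_ENDPOINT + '?' + urlencode(params)


def build_id_url(api_key, imdb_id):
    """Hypothetical helper: URL for a lookup by IMDb ID (the `i` parameter)."""
    return OMDB_ENDPOINT + '?' + urlencode({'apikey': api_key, 'i': imdb_id})


if __name__ == '__main__':
    print(build_search_url('YOUR_KEY', 'iron man', page=2))
    print(build_id_url('YOUR_KEY', 'tt0371746'))
```

Building the query string this way avoids hand-joining keywords with `+`, and it keeps working if a keyword contains characters such as `&` or `#`.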

import math
from collections import Counter

import requests

# Please pay for the key yourself.
API_KEY = ''
OMDB_URL = 'http://www.omdbapi.com/?apikey=' + API_KEY


def get_movie_data(url):
    # OMDb reports success as the string 'True' in the 'Response' field.
    data = requests.get(url).json()
    if data.get('Response') == 'True':
        return data
    return None


def search_ids_by_keyword(keywords):
    movie_ids = list()
    # e.g., "Iron Man" -> Iron+Man
    query = '+'.join(keywords.split())
    url = OMDB_URL + '&s=' + query
    data = get_movie_data(url)

    if data:
        for item in data['Search']:
            movie_ids.append(item['imdbID'])
        total = int(data['totalResults'])
        # 10 results per page; ceil avoids requesting an empty extra page
        # when the total is an exact multiple of 10.
        num_pages = math.ceil(total / 10)

        # The first page is already processed, so continue from page 2.
        for i in range(2, num_pages + 1):
            url = OMDB_URL + '&s=' + query + '&page=' + str(i)
            data = get_movie_data(url)
            if data:
                for item in data['Search']:
                    movie_ids.append(item['imdbID'])
    return movie_ids


def search_by_id(movie_id):
    url = OMDB_URL + '&i=' + movie_id
    return get_movie_data(url)


def main():
    keyword = 'iron man'
    m_ids = search_ids_by_keyword(keyword)
    print('There are %s movies contain the keyword %s.' % (len(m_ids), keyword))
    print('Retrieving movie data...')
    movies = list()
    for m_id in m_ids:
        movie = search_by_id(m_id)
        # Skip failed lookups so the statistics below never see None.
        if movie:
            movies.append(movie)
    print('Top 5 movie results:')
    for movie in movies[:5]:
        print(movie)
    years = [movie['Year'] for movie in movies]
    year_dist = Counter(years)
    print('Publish year distribution: ', year_dist)
    ratings = [float(movie['imdbRating']) for movie in movies if movie['imdbRating'] != 'N/A']
    # Guard against division by zero when no movie has a rating.
    if ratings:
        print('Average rating: %.2f' % (sum(ratings)/len(ratings)))


if __name__ == '__main__':
    main()
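The paging logic in the script relies on the search results arriving 10 per page, with `totalResults` giving the overall count. The sketch below isolates that arithmetic under the same assumption; the names `count_pages` and `remaining_pages` are hypothetical. Note that a `floor(total / 10) + 1` formula would ask for one page too many whenever the total is an exact multiple of 10, whereas `ceil` does not.

```python
import math

# Assumption taken from the script's arithmetic: 10 search results per page.
RESULTS_PER_PAGE = 10


def count_pages(total_results, per_page=RESULTS_PER_PAGE):
    """Number of pages needed to cover `total_results` items."""
    return math.ceil(total_results / per_page)


def remaining_pages(total_results):
    """Page numbers still to fetch once page 1 has been processed."""
    return list(range(2, count_pages(total_results) + 1))


if __name__ == '__main__':
    for total in (7, 80, 81):
        print(total, count_pages(total), remaining_pages(total))
```

With 81 results, as in the recorded run, this gives 9 pages, so pages 2 through 9 remain after the first request.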

Output:

There are 81 movies contain the keyword iron man.
Retrieving movie data...
Top 5 movie results:
{'Title': 'Iron Man', 'Year': '2008', 'Rated': 'PG-13', 'Released': '02 May 2008', 'Runtime': '126 min', 'Genre': 'Action, Adventure, Sci-Fi', 'Director': 'Jon Favreau', 'Writer': 'Mark Fergus (screenplay), Hawk Ostby (screenplay), Art Marcum (screenplay), Matt Holloway (screenplay), Stan Lee (characters), Don Heck (characters), Larry Lieber (characters), Jack Kirby (characters)', 'Actors': 'Robert Downey Jr., Terrence Howard, Jeff Bridges, Gwyneth Paltrow', 'Plot': 'After being held captive in an Afghan cave, billionaire engineer Tony Stark creates a unique weaponized suit of armor to fight evil.', 'Language': 'English, Persian, Urdu, Arabic, Hungarian', 'Country': 'USA', 'Awards': 'Nominated for 2 Oscars. Another 19 wins & 64 nominations.', 'Poster': 'https://images-na.ssl-images-amazon.com/images/M/MV5BMTczNTI2ODUwOF5BMl5BanBnXkFtZTcwMTU0NTIzMw@@._V1_SX300.jpg', 'Ratings': [{'Source': 'Internet Movie Database', 'Value': '7.9/10'}, {'Source': 'Rotten Tomatoes', 'Value': '94%'}, {'Source': 'Metacritic', 'Value': '79/100'}], 'Metascore': '79', 'imdbRating': '7.9', 'imdbVotes': '737,777', 'imdbID': 'tt0371746', 'Type': 'movie', 'DVD': 'N/A', 'BoxOffice': '$318,298,180', 'Production': 'Paramount Pictures', 'Website': 'http://www.ironmanmovie.com/', 'Response': 'True'}
{'Title': 'Iron Man 3', 'Year': '2013', 'Rated': 'PG-13', 'Released': '03 May 2013', 'Runtime': '130 min', 'Genre': 'Action, Adventure, Sci-Fi', 'Director': 'Shane Black', 'Writer': 'Drew Pearce (screenplay), Shane Black (screenplay), Stan Lee (based on the Marvel comic book by), Don Heck (based on the Marvel comic book by), Larry Lieber (based on the Marvel comic book by), Jack Kirby (based on the Marvel comic book by), Warren Ellis (based on the "Extremis" mini-series written by), Adi Granov (based on the "Extremis" mini-series illustrated by)', 'Actors': 'Robert Downey Jr., Gwyneth Paltrow, Don Cheadle, Guy Pearce', 'Plot': "When Tony Stark's world is torn apart by a formidable terrorist called the Mandarin, he starts an odyssey of rebuilding and retribution.", 'Language': 'English', 'Country': 'China, USA', 'Awards': 'Nominated for 1 Oscar. Another 17 wins & 59 nominations.', 'Poster': 'https://images-na.ssl-images-amazon.com/images/M/MV5BMTkzMjEzMjY1M15BMl5BanBnXkFtZTcwNTMxOTYyOQ@@._V1_SX300.jpg', 'Ratings': [{'Source': 'Internet Movie Database', 'Value': '7.2/10'}, {'Source': 'Rotten Tomatoes', 'Value': '79%'}, {'Source': 'Metacritic', 'Value': '62/100'}], 'Metascore': '62', 'imdbRating': '7.2', 'imdbVotes': '590,791', 'imdbID': 'tt1300854', 'Type': 'movie', 'DVD': 'N/A', 'BoxOffice': '$408,992,272', 'Production': 'Walt Disney Pictures', 'Website': 'http://IronManMovie3.com', 'Response': 'True'}
{'Title': 'Iron Man 2', 'Year': '2010', 'Rated': 'PG-13', 'Released': '07 May 2010', 'Runtime': '124 min', 'Genre': 'Action, Adventure, Sci-Fi', 'Director': 'Jon Favreau', 'Writer': 'Justin Theroux (screenplay), Stan Lee (Marvel comic book), Don Heck (Marvel comic book), Larry Lieber (Marvel comic book), Jack Kirby (Marvel comic book)', 'Actors': 'Robert Downey Jr., Gwyneth Paltrow, Don Cheadle, Scarlett Johansson', 'Plot': "With the world now aware of his identity as Iron Man, Tony Stark must contend with both his declining health and a vengeful mad man with ties to his father's legacy.", 'Language': 'English, French, Russian', 'Country': 'USA', 'Awards': 'Nominated for 1 Oscar. Another 7 wins & 40 nominations.', 'Poster': 'https://images-na.ssl-images-amazon.com/images/M/MV5BMTM0MDgwNjMyMl5BMl5BanBnXkFtZTcwNTg3NzAzMw@@._V1_SX300.jpg', 'Ratings': [{'Source': 'Internet Movie Database', 'Value': '7.0/10'}, {'Source': 'Rotten Tomatoes', 'Value': '72%'}, {'Source': 'Metacritic', 'Value': '57/100'}], 'Metascore': '57', 'imdbRating': '7.0', 'imdbVotes': '555,962', 'imdbID': 'tt1228705', 'Type': 'movie', 'DVD': 'N/A', 'BoxOffice': '$312,057,433', 'Production': 'Paramount Studios', 'Website': 'http://www.ironmanmovie.com/', 'Response': 'True'}
{'Title': 'The Man in the Iron Mask', 'Year': '1998', 'Rated': 'PG-13', 'Released': '13 Mar 1998', 'Runtime': '132 min', 'Genre': 'Action, Adventure', 'Director': 'Randall Wallace', 'Writer': 'Alexandre Dumas (novels), Randall Wallace (screenplay)', 'Actors': 'Leonardo DiCaprio, Jeremy Irons, John Malkovich, Gérard Depardieu', 'Plot': 'The cruel King Louis XIV of France has a secret twin brother who he keeps imprisoned. Can the twin be substituted for the real king?', 'Language': 'English, Italian', 'Country': 'USA, France', 'Awards': '3 wins & 4 nominations.', 'Poster': 'https://images-na.ssl-images-amazon.com/images/M/MV5BZjM2YzcxMmQtOTc2Mi00YjdhLWFlZjUtNmFmMDQzYzU2YTk5L2ltYWdlXkEyXkFqcGdeQXVyNTAyODkwOQ@@._V1_SX300.jpg', 'Ratings': [{'Source': 'Internet Movie Database', 'Value': '6.5/10'}, {'Source': 'Rotten Tomatoes', 'Value': '33%'}, {'Source': 'Metacritic', 'Value': '48/100'}], 'Metascore': '48', 'imdbRating': '6.5', 'imdbVotes': '132,182', 'imdbID': 'tt0120744', 'Type': 'movie', 'DVD': '11 Aug 1998', 'BoxOffice': 'N/A', 'Production': 'MGM Home Entertainment', 'Website': 'N/A', 'Response': 'True'}
{'Title': 'The Man with the Iron Fists', 'Year': '2012', 'Rated': 'R', 'Released': '02 Nov 2012', 'Runtime': '95 min', 'Genre': 'Action', 'Director': 'RZA', 'Writer': 'RZA (story), RZA (screenplay), Eli Roth (screenplay)', 'Actors': 'RZA, Rick Yune, Russell Crowe, Lucy Liu', 'Plot': 'On the hunt for a fabled treasure of gold, a band of warriors, assassins, and a rogue British soldier descend upon a village in feudal China, where a humble blacksmith looks to defend himself and his fellow villagers.', 'Language': 'English, Mandarin', 'Country': 'USA, Hong Kong', 'Awards': '3 nominations.', 'Poster': 'https://images-na.ssl-images-amazon.com/images/M/MV5BMTg5ODI3ODkzOV5BMl5BanBnXkFtZTcwMTQxNjUwOA@@._V1_SX300.jpg', 'Ratings': [{'Source': 'Internet Movie Database', 'Value': '5.4/10'}, {'Source': 'Rotten Tomatoes', 'Value': '49%'}, {'Source': 'Metacritic', 'Value': '51/100'}], 'Metascore': '51', 'imdbRating': '5.4', 'imdbVotes': '54,984', 'imdbID': 'tt1258972', 'Type': 'movie', 'DVD': 'N/A', 'BoxOffice': '$15,608,545', 'Production': 'Universal Studios', 'Website': 'http://www.ironfists.com', 'Response': 'True'}
Publish year distribution:  Counter({'2013': 14, '2010': 9, '2008': 7, '2012': 5, '2014': 4, '2016': 4, '1998': 2, '2010–': 2, '1974': 2, '1931': 2, '1996': 2, '1915': 2, '1914': 2, '1989': 1, '2007': 1, '2015': 1, '1981': 1, '2008–': 1, '1994–1996': 1, '1977': 1, '1939': 1, '1956': 1, '1966–': 1, '1993': 1, '1951': 1, '2006': 1, '1935': 1, '1985': 1, '2011': 1, '2004': 1, '1997': 1, '1928': 1, '2017': 1, '1925': 1, '1924': 1, '1903': 1, '1968–': 1})
Average rating: 6.24

Process finished with exit code 0
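In the year distribution above, series and multi-year entries show up as separate keys such as '2010–' or '1994–1996', which splits counts that arguably belong to the same starting year. One possible way to tidy this up, sketched below, is to keep only the leading four-digit year before counting, and to compute the average rating defensively. The helper names `start_year` and `average_rating` are hypothetical, and the sample records are made-up minimal dictionaries that only mimic the `Year` and `imdbRating` fields seen in the output.

```python
import re
from collections import Counter


def start_year(year_field):
    """Return the leading four-digit year of an OMDb 'Year' value, or None."""
    match = re.match(r'\d{4}', year_field)
    return match.group(0) if match else None


def average_rating(movies):
    """Mean of the numeric 'imdbRating' values; None if there are none."""
    ratings = []
    for movie in movies:
        if not movie:
            continue  # skip failed lookups (None)
        value = movie.get('imdbRating', 'N/A')
        if value != 'N/A':
            ratings.append(float(value))
    return sum(ratings) / len(ratings) if ratings else None


if __name__ == '__main__':
    # Made-up sample records, not real API responses.
    sample = [
        {'Year': '2010', 'imdbRating': '7.0'},
        {'Year': '2010–', 'imdbRating': 'N/A'},
        {'Year': '1994–1996', 'imdbRating': '7.9'},
        None,
    ]
    years = [start_year(m['Year']) for m in sample if m]
    print(Counter(years))
    print(average_rating(sample))
```

Whether merging '2010–' into '2010' is desirable depends on the analysis; keeping the raw strings, as the original script does, preserves the distinction between films and series.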

Source code

OMDb API
Click here