[Python] Collecting YouTube Comments Using the YouTube Data API

Step 1: Request your own API key

You need to enable the YouTube Data API v3 for your own Google Cloud project at this link (https://console.cloud.google.com/apis/library/youtube.googleapis.com). The API is free to use, but every request counts against a daily quota.
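To get a feel for how far the daily limit stretches, the sketch below estimates how many requests a comment download needs. The figures are assumptions, not guarantees: a default quota of 10,000 units per day, a cost of 1 unit per list call, and up to 100 comments per page when maxResults=100 is passed. Check the Quotas page of your own Google Cloud project for the actual values. The helper names estimate_requests and fits_in_daily_quota are hypothetical.

```python
import math

# Assumed values; confirm them in your Google Cloud console.
ASSUMED_DAILY_QUOTA_UNITS = 10_000
ASSUMED_UNITS_PER_LIST_CALL = 1


def estimate_requests(total_comments, page_size=100):
    """Return how many paginated list calls are needed for total_comments."""
    if total_comments <= 0:
        return 0
    return math.ceil(total_comments / page_size)


def fits_in_daily_quota(total_comments, page_size=100):
    """Return True if the estimated cost stays within the assumed daily quota."""
    cost = estimate_requests(total_comments, page_size) * ASSUMED_UNITS_PER_LIST_CALL
    return cost <= ASSUMED_DAILY_QUOTA_UNITS


if __name__ == "__main__":
    print(estimate_requests(2_500))      # 25 requests at 100 comments per page
    print(estimate_requests(2_500, 20))  # 125 requests at 20 comments per page
```

Under these assumptions, a larger page size means fewer requests for the same number of comments, so it is the simplest way to stretch the quota.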

On the next screen, click “Create credentials” and fill in your information.

This generates your API key. Copy it so you can paste it into the Python code, and do not share it with others.
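One way to keep the key out of any code you share is to read it from an environment variable. This is only a sketch: the variable name YOUTUBE_API_KEY and the helper load_api_key are assumptions made for illustration, not something the API requires.

```python
import os


def load_api_key(env_var="YOUTUBE_API_KEY", environ=None):
    """Read the API key from an environment variable instead of the source file."""
    environ = os.environ if environ is None else environ
    key = environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key.")
    return key


# Example with an explicit mapping so the snippet runs without a real key:
print(load_api_key(environ={"YOUTUBE_API_KEY": "demo-key"}))  # demo-key
```

With the variable set in your shell, DEVELOPER_KEY = load_api_key() could replace the hard-coded string used in Step 3.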

Step 2: Find the key of the YouTube playlist or video

Now you need the key (ID) of the YouTube playlist or video whose comments you want to collect. It’s really simple: you can find it in the URL of the playlist or video.

For example, if the URL of the playlist is “https://www.youtube.com/watch?v=ZfCNFYAd77o&list=PL5gua8hQg_DoHCEBeOISWjUK2I3r00puR”, then PL5gua8hQg_DoHCEBeOISWjUK2I3r00puR, which follows &list=, is the playlist key. The video key is ZfCNFYAd77o, which follows watch?v=.
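If you would rather not copy the keys by hand, they can be pulled out of the URL with the standard library. This is a minimal sketch that assumes a regular watch URL with v and list query parameters, as in the example above; other URL shapes, such as shortened links, are not handled. The function name extract_ids is hypothetical.

```python
from urllib.parse import urlparse, parse_qs


def extract_ids(url):
    """Return (video_id, playlist_id) from a YouTube URL; missing parts are None."""
    query = parse_qs(urlparse(url).query)
    video_id = query.get("v", [None])[0]
    playlist_id = query.get("list", [None])[0]
    return video_id, playlist_id


url = "https://www.youtube.com/watch?v=ZfCNFYAd77o&list=PL5gua8hQg_DoHCEBeOISWjUK2I3r00puR"
video_id, playlist_id = extract_ids(url)
print(video_id)     # ZfCNFYAd77o
print(playlist_id)  # PL5gua8hQg_DoHCEBeOISWjUK2I3r00puR
```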

Step 3: Run it in Python

First, import the libraries and replace “put-your-key-here” with your API key.

Python
from googleapiclient.discovery import build
from googleapiclient.errors import HttpError
import pandas as pd

DEVELOPER_KEY = 'put-your-key-here'
YOUTUBE_API_SERVICE_NAME = 'youtube'
YOUTUBE_API_VERSION = 'v3'

Python Code to Collect Comments from a YouTube Playlist

Python
def get_playlist_video_ids(service, **kwargs):
    video_ids = []
    results = service.playlistItems().list(**kwargs).execute()
    while results:
        for item in results['items']:
            video_ids.append(item['snippet']['resourceId']['videoId'])

        # check if there are more videos
        if 'nextPageToken' in results:
            kwargs['pageToken'] = results['nextPageToken']
            results = service.playlistItems().list(**kwargs).execute()
        else:
            break

    return video_ids
    
def get_video_comments(service, **kwargs):
    comments, dates, likes, video_titles = [], [], [], []
    results = service.commentThreads().list(**kwargs).execute()

    # look up the video title once, not once per comment, to save API quota
    video_response = service.videos().list(part='snippet', id=kwargs['videoId']).execute()
    video_title = video_response['items'][0]['snippet']['title']

    while results:
        for item in results['items']:
            snippet = item['snippet']['topLevelComment']['snippet']
            comment = snippet['textDisplay']
            date = snippet['publishedAt']
            like = snippet['likeCount']

            comments.append(comment)
            dates.append(date)
            likes.append(like)
            video_titles.append(video_title)

        # check if there are more comments
        if 'nextPageToken' in results:
            kwargs['pageToken'] = results['nextPageToken']
            results = service.commentThreads().list(**kwargs).execute()
        else:
            break

    return pd.DataFrame({'Video Title': video_titles, 'Comments': comments, 'Date': dates, 'Likes': likes})
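Both functions above repeat the same pattern: request a page, read its items, and keep going while the response contains a nextPageToken. The sketch below isolates that loop so it can be tried without network access. The names iter_pages, fake_fetch, and FAKE_PAGES are hypothetical, and the fake pages are made up purely to demonstrate the control flow.

```python
def iter_pages(fetch_page, **kwargs):
    """Yield each response page, following nextPageToken until it is absent."""
    params = dict(kwargs)  # copy so the caller's arguments are not modified
    while True:
        response = fetch_page(**params)
        yield response
        token = response.get("nextPageToken")
        if not token:
            break
        params["pageToken"] = token


# Stand-in for the API, used only to demonstrate the loop offline.
FAKE_PAGES = {
    None: {"items": ["a", "b"], "nextPageToken": "p2"},
    "p2": {"items": ["c"]},
}


def fake_fetch(pageToken=None, **_ignored):
    """Return the made-up page that matches the given page token."""
    return FAKE_PAGES[pageToken]


items = [item for page in iter_pages(fake_fetch, part="snippet") for item in page["items"]]
print(items)  # ['a', 'b', 'c']
```

With the real service, fetch_page could be something like lambda **kw: youtube.playlistItems().list(**kw).execute(), which makes the same calls as the loop in get_playlist_video_ids.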

The following code returns a pandas DataFrame of all comments in the playlist, including a Video Title column.

Python
def main():
    # build the service
    youtube = build(YOUTUBE_API_SERVICE_NAME, YOUTUBE_API_VERSION, developerKey=DEVELOPER_KEY)

    # get playlist video ids
    playlist_id = 'PLxNb_gmvauiRtxQrQsKLEWlFVUmRixmtS'
    video_ids = get_playlist_video_ids(youtube, part='snippet', maxResults=50, playlistId=playlist_id)

    # get the comments from each video
    frames = []

    for video_id in video_ids:
        try:
            frames.append(get_video_comments(youtube, part='snippet', videoId=video_id, textFormat='plainText'))
        except HttpError as e:
            # e.g. comments are disabled for this video; report it and move on
            print(f"An HTTP error {e.resp.status} occurred:\n{e.content}")

    # combine the per-video results once, after the loop
    all_comments_df = pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()

    return all_comments_df  # return the DataFrame

if __name__ == '__main__':
    df = main()
    print(df)  # print the DataFrame here

Python Code to Collect Comments from a Single YouTube Video

If you want to collect the comments from a single video, you can run this code instead of the code above.

Python
def get_video_comments(service, **kwargs):
    comments, dates, likes = [], [], []
    results = service.commentThreads().list(**kwargs).execute()

    while results:
        for item in results['items']:
            comment = item['snippet']['topLevelComment']['snippet']['textDisplay']
            date = item['snippet']['topLevelComment']['snippet']['publishedAt']
            like = item['snippet']['topLevelComment']['snippet']['likeCount']

            comments.append(comment)
            dates.append(date)
            likes.append(like)

        # check if there are more comments
        if 'nextPageToken' in results:
            kwargs['pageToken'] = results['nextPageToken']
            results = service.commentThreads().list(**kwargs).execute()
        else:
            break

    return pd.DataFrame({'Comments': comments, 'Date': dates, 'Likes': likes})
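The nested lookups in get_video_comments are easier to follow against a concrete item. The sketch below flattens one item into a plain dictionary. The sample item is hand-written to mimic the nesting the function relies on; its values are made up, and a real response carries many more fields. The name thread_to_row is hypothetical.

```python
def thread_to_row(item):
    """Flatten one commentThreads item into a dict of the fields collected above."""
    snippet = item["snippet"]["topLevelComment"]["snippet"]
    return {
        "Comments": snippet["textDisplay"],
        "Date": snippet["publishedAt"],
        "Likes": snippet["likeCount"],
    }


# Hand-written sample that mimics the nesting of a response item.
sample_item = {
    "snippet": {
        "topLevelComment": {
            "snippet": {
                "textDisplay": "Great video!",
                "publishedAt": "2023-06-20T12:00:00Z",
                "likeCount": 3,
            }
        }
    }
}

row = thread_to_row(sample_item)
print(row["Likes"])  # 3
```

Note that only the top-level comment of each thread is read here, which matches what the function above collects.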

The following returns a pandas DataFrame with the comment text, date, and number of likes 🙂

def main():
    # Build the service
    youtube = build(YOUTUBE_API_SERVICE_NAME, YOUTUBE_API_VERSION, developerKey=DEVELOPER_KEY)

    # Get the comments
    video_id = 'your-video-key-here'
    return get_video_comments(youtube, part='snippet', videoId=video_id, textFormat='plainText')

if __name__ == '__main__':
    comments_df = main()
    print(comments_df)

Step 4: Export it to CSV format

The final step is to export the data so you can use it later or share it with others. With a pandas DataFrame, this takes one line (in the playlist example above, the DataFrame is named df rather than comments_df).

comments_df.to_csv('output.csv')
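By default, to_csv also writes the DataFrame index as an extra first column. If you do not want that column, index=False leaves it out. The sketch below shows this on a small made-up DataFrame and writes to an in-memory buffer instead of a file, so it can be run without the API; sample_df and its rows are invented for illustration.

```python
import io

import pandas as pd

# Made-up rows standing in for the collected comments.
sample_df = pd.DataFrame({
    "Comments": ["Great video!", "Thanks, very helpful"],
    "Date": ["2023-06-20T12:00:00Z", "2023-06-21T08:30:00Z"],
    "Likes": [3, 0],
})

buffer = io.StringIO()
sample_df.to_csv(buffer, index=False)  # index=False omits the row-number column

# Read the CSV text back to check that it round-trips.
buffer.seek(0)
restored = pd.read_csv(buffer)
print(list(restored.columns))  # ['Comments', 'Date', 'Likes']
```

The same argument works with a file path, for example comments_df.to_csv('output.csv', index=False).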
  • June 20, 2023