{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# WEEK 6: Web Crawling & Twitter API" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Web Crawling\n", "We will introduce two methods to collect data: web crawling (this week) and calling an API (next week).
\n", "Web crawling means designing an automated bot that imitates human browsing behavior." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Understanding HTML\n", "- HTML stands for **Hyper Text Markup Language**, which is used to define the structure of a web page.\n", "- All HTML content is hierarchical and structured.\n", " - Basic elements: `Tag` and `Text`\n", " - Text is the content shown on the screen. **A tag is not displayed but is used to render the text.**\n", " - Text is wrapped by start and end tags.\n", " - Tag: denoted by a pair of angle brackets <>\n", " - Start Tag\n", " - Tag Name\n", " - Attributes (optional): attributes provide additional information about the element\n", " - Attribute Name\n", " - Attribute Value\n", " - format: <...>\n", " - End Tag\n", " - format: </...>\n", " - Most tags are used in pairs; void elements such as the line break tag <br> and the input box tag <input> have no end tag." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Input Types\n", "\n", "```html\n", "\n", " \n", " This is a title\n", " \n", " \n", " Go to our Home Page\n", "

Please input your user name:

\n", " \n", "

Please input your password:

\n", " \n", "
\n", " Do you like Python?\n", "
\n", " Do you like HTML?\n", "
\n", " \n", " \n", " \n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", " \n", " This is a title\n", " \n", " \n", " Go to our Home Page\n", "

Please input your user name:

\n", " \n", "

Please input your password:

\n", " \n", "
\n", " Do you like Python?\n", "
\n", " Do you like HTML?\n", "
\n", " \n", " \n", " \n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To assign default value, you can use `value` attribute.\n", "\n", "```html\n", "\n", " \n", " This is a title\n", " \n", " \n", " Go to our Home Page\n", "

Please input your user name:

\n", " \n", "

Please input your password:

\n", " \n", "
\n", " \n", " \n", " \n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", " \n", " This is a title\n", " \n", " \n", " Go to our Home Page\n", "

Please input your user name:

\n", " \n", "

Please input your password:

\n", " \n", "
\n", " \n", " \n", " \n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Publish HTML page\n", "Please save your HTML code as a file named \"week5.html\".\n", "Double-click the file to render the page locally.\n", "If you have a server, you can upload this file to it and publish it as an online web page." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Practice:\n", "Please create a page that looks like the one shown below, save it as \"week5_practice.html\" and render it on your computer." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", " \n", " survey\n", " \n", " \n", "

Online Survey by JMSC

\n", "

Applicable to HKU students only.

\n", "

Q1: Is Common Core helpful for broadening your intellectual perspective?

\n", " Strongly Agree\n", "
\n", " Agree\n", "
\n", " Neutral\n", "
\n", " Disagree\n", "
\n", " Strongly Disagree\n", "
\n", "

Q2: Is Common Core helpful for building friendships across faculties for you?

\n", " Strongly Agree\n", "
\n", " Agree\n", "
\n", " Neutral\n", "
\n", " Disagree\n", "
\n", " Strongly Disagree\n", "
\n", " \n", " \n", " \n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Using Selenium\n", "\n", "We will use `selenium` package to collect data, which is applicable to both static and dynamic websites.
\n", "Please download Chrome driver from this link: https://chromedriver.storage.googleapis.com/index.html?path=73.0.3683.20/" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "from selenium import webdriver\n", "from selenium.webdriver.common.keys import Keys" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "driver=webdriver.Chrome(executable_path='C:\\\\Python27\\\\selenium\\\\webdriver\\\\chrome\\\\chromedriver.exe') #load the browser" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "driver.get('file:///C:/Users/yuner/Desktop/week5.html') #use absolute path to open local html file" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'This is a title'" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "driver.title #print the title" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'file:///C:/Users/yuner/Desktop/week5.html'" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "driver.current_url #get the url of the page" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Locate Element by Xpath" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can locate elements by their relative/absolute paths in the file with additional hints about their tag name, attribute name, and attribute value.
\n", "- XPath is an expression that describes the path to an HTML element\n", " - `/` is the sign of an **absolute path**:\n", " - if used at the beginning: the XPath starts from the root node\n", " - if used in the middle: it refers to an element **at the next level**\n", " - e.g. the XPath of <body> can be written as \"html/body\" or \"/html/body\". \n", " - If you write \"/body\", no element is matched and Selenium raises an error, because <body> is not the root node.\n", " - `//` is the sign of a **relative path**: it refers to any element that matches the pattern, no matter where it is.\n", " - e.g. the XPath of <body> can be written as \"//body\"\n", " - `[@attribute name=attribute value]`: we can include an attribute in the matching pattern\n", " - e.g. \"//input[@type='reset']\"\n", " - The most reliable attribute is `id`, because an `id` is meant to identify a single element uniquely." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "#use the find_element_by_xpath function to find an element by its relative xpath\n", "body=driver.find_element_by_xpath('//body')" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'Go to our Home Page\\nPlease input your user name:\\nPlease input your password:'" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "body.text #get the text of the matched element" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Go to our Home Page\n", "Please input your user name:\n", "Please input your password:\n" ] } ], "source": [ "#or by absolute xpath\n", "body=driver.find_element_by_xpath('/html/body')\n", "print(body.text)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "#use the find_elements_by_xpath function to find a list of elements that share a pattern\n", "inputs=driver.find_elements_by_xpath('//input')" ] }, { "cell_type": "code", "execution_count": 10, 
"metadata": {}, "outputs": [ { "data": { "text/plain": [ "4" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(inputs)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "junior\n" ] } ], "source": [ "#1 way\n", "first_input=inputs[0]\n", "print(first_input.get_attribute('value'))" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "#2nd way\n", "first_input=driver.find_element_by_xpath('//input[1]')" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "junior\n" ] } ], "source": [ "print(first_input.get_attribute('value'))" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "junior\n" ] } ], "source": [ "#3rd way\n", "first_input=driver.find_element_by_xpath('//input[@type=\"text\"]')\n", "print(first_input.get_attribute('value'))" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "text\n" ] } ], "source": [ "print(first_input.get_attribute('type'))" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "ps=driver.find_elements_by_xpath('//p')" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2\n", "Please input your user name:\n", "Please input your password:\n" ] } ], "source": [ "print(len(ps)) #count how many
<p> elements
are in the html\n", "print(ps[0].text) #first element's text\n", "print(ps[1].text) #second element's text" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Imitate Browsing Behavior" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Some frequently used behaviors:\n", "1. Click: `element.click()`\n", "2. Type: `element.send_keys('something')`\n", "3. Clear existing content: `element.clear()`\n", "4. Scroll: \n", " - Scroll to bottom: `driver.execute_script(\"window.scrollTo(0, document.body.scrollHeight);\")`\n", " - Scroll to specific location: i.e. scroll down by 400px, `driver.execute_script(\"window.scrollTo(0, 400);\")`" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [], "source": [ "#clean default name and fill in your name\n", "name_box=inputs[0]\n", "name_box.clear()\n", "name_box.send_keys('your name')" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [], "source": [ "#clean default password and fill in any random keys\n", "password_box=inputs[1]\n", "password_box.clear()\n", "password_box.send_keys('abcd')\n" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [], "source": [ "#click the link of \"GO to our Home Page\"\n", "link=driver.find_element_by_xpath('//a')\n", "link.click()" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [], "source": [ "#navigate to another online page and inspect the page\n", "driver.get('https://juniorworld.github.io/python-workshop-2018/week5/1.html')" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Q1: Is Common Core helpful for broadening your intellectual perspective?\n", "Q2: Is Common Core helpful for building friendships across faculties for you?\n" ] } ], "source": [ "#copy the xpath and fill it into the bracket\n", "Q1=driver.find_element_by_xpath('//*[@id=\"1\"]')\n", "print(Q1.text)\n", 
"Q2=driver.find_element_by_xpath('//*[@id=\"2\"]')\n", "print(Q2.text)" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [], "source": [ "#click the submit button\n", "submit=driver.find_element_by_xpath('/html/body/input[11]') #copy the xpath from inspect window will not look into attributes other than id\n", "submit=driver.find_element_by_xpath('//input[@type=\"submit\"]') #or you can specify xpath by yourself\n", "submit.click()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Practice:\n", "Open Google page (https://www.google.com/), search for \"JMSC\" and click the \"Google Search\" button." ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [], "source": [ "#write your code here\n", "driver.get('https://www.google.com/')" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [], "source": [ "search_box=driver.find_element_by_xpath('//*[@id=\"tsf\"]/div[2]/div/div[1]/div/div[1]/input')\n", "search_box.send_keys('JMSC')" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [], "source": [ "search_button=driver.find_element_by_xpath('//*[@id=\"tsf\"]/div[2]/div/div[3]/center/input[1]')\n", "search_button.click()" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [], "source": [ "#collect all results on the first page\n", "results=driver.find_elements_by_xpath('//div[@class=\"rc\"]')" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "6" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#how many results are listed on the first page\n", "len(results)" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Journalism and Media Studies Centre, The University of Hong Kong https://jmsc.hku.hk/ 
http://www.handbook.hku.hk/ug/full-time-2018-19/arrangements-during-bad-weather. Copyright © 2019 Journalism and Media Studies Centre, The University of ...\n", "Journalism and Media Studies Centre 香港大學新聞及傳媒研究 ... - HKU https://www4.hku.hk/hkumcd/index.php/eng/unit/111_Journalism_and_Media_Studies_Centre 2018年1月10日 - The Journalism and Media Studies Centre has brought professional journalism education to Hong Kong's premier university, creating an ...\n", "JMSC (@JMSCHKU) | Twitter https://twitter.com/jmschku The latest Tweets from JMSC (@JMSCHKU). Founded in 1999, the Journalism and Media Studies Centre of The University of Hong Kong offers professional ...\n", "Manuscript Submission - Editorial Manager https://www.editorialmanager.com/jmsc/ 沒有這個頁面的資訊。\n", "瞭解原因\n", "JMSC - 7th ATC http://www.7atc.army.mil/JMSC/ The Joint Multinational Simulation Center, headquartered in Grafenwoehr, Germany, trains the art and science of command and control, from company-level to a ...\n", "About JMSC | The Society of Publishers in Asia https://www.sopasia.com/home/about-jmsc/ The mission of JMSC is to pursue excellence in journalism and foster Asian voices in the international media. 
JMSC has a long association with SOPA, and ...\n" ] } ], "source": [ "#print every result\n", "for result in results:\n", "    result_link=result.find_element_by_xpath('div[@class=\"r\"]/a') #we can also find elements under the current node\n", "    result_link_text=result_link.find_element_by_xpath('h3').text\n", "    result_link_href=result_link.get_attribute('href')\n", "    result_description=result.find_element_by_xpath('div[@class=\"s\"]').text\n", "    print(result_link_text,result_link_href,result_description)" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [], "source": [ "#save results as tab-separated lines\n", "output_file=open('week5_google.txt','w',encoding='utf-8')\n", "for result in results:\n", "    result_link=result.find_element_by_xpath('div[@class=\"r\"]/a') #we can also find elements under the current node\n", "    result_link_text=result_link.find_element_by_xpath('h3').text\n", "    result_link_href=result_link.get_attribute('href')\n", "    result_description=result.find_element_by_xpath('div[@class=\"s\"]').text\n", "    output_file.write(result_link_text+'\\t'+result_link_href+'\\t'+result_description+'\\n')\n", "output_file.close()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "# Break\n", "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Twitter API\n", "API stands for Application Programming Interface, which is provided and maintained by a company as an official way to fetch data from its servers automatically. Almost all IT giants, such as Twitter, Facebook and Google, have their own APIs. 
Therefore, knowing how to use an API is a critical skill for anyone who aims to do social media analytics.\n", "Please follow these instructions to apply for Twitter API access: https://juniorworld.github.io/python-workshop-2018/doc/Instructions_on_Twitter_API.pdf" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [], "source": [ "import requests\n", "import time\n", "import base64\n", "import pandas as pd" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [], "source": [ "#Authorize your App\n", "\n", "api_key = 'KyQ9A6AkM9fkopbKHu2eRQGxM'\n", "api_secret = 'M0mckxZVYIPXXsJSmXZWfWsnt0LJcesdKm1hn5UkQQW1lbGs0c'\n", "\n", "key_secret = api_key+':'+api_secret\n", "b64_encoded_key = base64.b64encode(key_secret.encode('ascii')).decode('ascii')\n", "\n", "auth_url = 'https://api.twitter.com/oauth2/token'\n", "\n", "auth_headers = {\n", " 'Authorization': 'Basic '+b64_encoded_key,\n", " 'Content-Type': 'application/x-www-form-urlencoded;charset=UTF-8'\n", "}\n", "\n", "auth_data = {\n", " 'grant_type': 'client_credentials'\n", "}\n", "\n", "auth_resp = requests.post(auth_url, headers=auth_headers, data=auth_data)" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "200" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "auth_resp.status_code #status code \"200\" means authorization succeeds, \"400\" bad request, \"401\" unauthorized, \"403\" forbidden " ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [], "source": [ "access_token=auth_resp.json()['access_token'] #get your bearer access token" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [], "source": [ "headers = {'Authorization': 'Bearer '+access_token} #we will use this header throughout the course" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "data": { "text/plain": [ 
"'AAAAAAAAAAAAAAAAAAAAAM8M9gAAAAAA33GKu2zHP%2BCcelTcDGw%2FIK0KQGg%3DkLls0647xj9UYpmfgd0x8IduB3DdNurBTEYAYyFF43w84Ak8j9'" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "access_token" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Search API\n", "We can use the Search API to search for posts or users on the Twitter platform." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1. Search for Posts" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Since we are using the free version of the API, we can only collect posts from the past 7 days. This limitation can be worked around by scheduling a routine program that collects data at least once every 7 days.
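
Collecting on a schedule means that consecutive runs can overlap and return the same tweet twice. A minimal sketch of merging several runs while dropping duplicates, assuming each tweet is a dictionary with an `id_str` key as in the Search API output; `merge_runs` and the sample data are hypothetical names used only for illustration:

```python
def merge_runs(*runs):
    '''Merge several lists of tweet dictionaries, keeping the first copy of each id_str.'''
    seen = set()
    merged = []
    for run in runs:
        for tweet in run:
            if tweet['id_str'] not in seen:
                seen.add(tweet['id_str'])
                merged.append(tweet)
    return merged

# hypothetical sample data standing in for two weekly collections
week_1 = [{'id_str': '1', 'text': 'a'}, {'id_str': '2', 'text': 'b'}]
week_2 = [{'id_str': '2', 'text': 'b'}, {'id_str': '3', 'text': 'c'}]
all_tweets = merge_runs(week_1, week_2)
print(len(all_tweets))
```

The string form `id_str` is used as the key here so that the check does not depend on how large integers are stored.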
\n", "The Search API functions in a way similar to Twitter advanced search: https://twitter.com/search-advanced
\n", "The key to search is creating a query url containing search parameters." ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [], "source": [ "search_url = 'https://api.twitter.com/1.1/search/tweets.json'" ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [], "source": [ "params = {\n", " 'q': '\"#hongkong\"', #search string\n", " 'result_type': 'recent', #mixed,recent,popular\n", " 'count': 100 #up to 100\n", "}\n", "\n", "search_resp = requests.get(search_url, headers=headers, params=params)" ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "dict" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(search_resp.json())" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "dict_keys(['statuses', 'search_metadata'])" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "search_resp.json().keys()" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "list" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(search_resp.json()['statuses'])" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "100\n" ] } ], "source": [ "print(len(search_resp.json()['statuses'])) #a list of tweet objects" ] }, { "cell_type": "code", "execution_count": 44, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "dict_keys(['created_at', 'id', 'id_str', 'text', 'truncated', 'entities', 'extended_entities', 'metadata', 'source', 'in_reply_to_status_id', 'in_reply_to_status_id_str', 'in_reply_to_user_id', 'in_reply_to_user_id_str', 'in_reply_to_screen_name', 'user', 'geo', 'coordinates', 'place', 'contributors', 'retweeted_status', 
'is_quote_status', 'retweet_count', 'favorite_count', 'favorited', 'retweeted', 'possibly_sensitive', 'lang'])" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "search_resp.json()['statuses'][0].keys()" ] }, { "cell_type": "code", "execution_count": 45, "metadata": {}, "outputs": [], "source": [ "results=search_resp.json()['statuses'] #save first 100 results" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For more information about tweet object, please refer to: https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/intro-to-tweet-json" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2. Navigate to next page of results (step-by-step breakdown)" ] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'completed_in': 0.07,\n", " 'max_id': 1101769423252123649,\n", " 'max_id_str': '1101769423252123649',\n", " 'next_results': '?max_id=1101750455875584000&q=%22%23hongkong%22&count=100&include_entities=1&result_type=recent',\n", " 'query': '%22%23hongkong%22',\n", " 'refresh_url': '?since_id=1101769423252123649&q=%22%23hongkong%22&result_type=recent&include_entities=1',\n", " 'count': 100,\n", " 'since_id': 0,\n", " 'since_id_str': '0'}" ] }, "execution_count": 46, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#have a look at the metadata\n", "search_resp.json()['search_metadata'] #the link of next_results is the one we need" ] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [], "source": [ "#let's do next run search\n", "next_page=search_resp.json()['search_metadata']['next_results'] #please extract the link from the dictionary and save it as \"next_page\" variable\n", "search_resp=requests.get(search_url+next_page,headers=headers)" ] }, { "cell_type": "code", "execution_count": 48, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "100" ] }, "execution_count": 48, 
"metadata": {}, "output_type": "execute_result" } ], "source": [ "len(search_resp.json()['statuses']) #another 100 posts are in place" ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [], "source": [ "#update the results\n", "results.extend(search_resp.json()['statuses'])" ] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "200" ] }, "execution_count": 50, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(results)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3. Navigate to next N page of results (integrated)" ] }, { "cell_type": "code", "execution_count": 51, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1 pages have been collected\n", "2 pages have been collected\n", "3 pages have been collected\n", "4 pages have been collected\n", "5 pages have been collected\n", "DONE!\n" ] } ], "source": [ "#you can use a for loop to collect specific pages of results\n", "for page in range(5):\n", " next_page=search_resp.json()['search_metadata']['next_results']\n", " search_resp=requests.get(search_url+next_page,headers=headers)\n", " results.extend(search_resp.json()['statuses'])\n", " print(page+1,'pages have been collected')\n", " time.sleep(15)\n", "print('DONE!')" ] }, { "cell_type": "code", "execution_count": 52, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1 pages have been collected\n", "2 pages have been collected\n", "3 pages have been collected\n", "4 pages have been collected\n", "5 pages have been collected\n", "6 pages have been collected\n", "7 pages have been collected\n", "8 pages have been collected\n", "9 pages have been collected\n", "10 pages have been collected\n", "11 pages have been collected\n", "12 pages have been collected\n", "13 pages have been collected\n", "14 pages have been collected\n", "15 pages have been collected\n" ] }, 
{ "ename": "KeyboardInterrupt", "evalue": "", "output_type": "error", "traceback": [ "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[1;31mKeyboardInterrupt\u001b[0m Traceback (most recent call last)", "\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m\u001b[0m\n\u001b[0;32m 8\u001b[0m \u001b[0mresults\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mextend\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0msearch_resp\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mjson\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;34m'statuses'\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 9\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mpage\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;34m'pages have been collected'\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m---> 10\u001b[1;33m \u001b[0mtime\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0msleep\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;36m15\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 11\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m'DONE!'\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", "\u001b[1;31mKeyboardInterrupt\u001b[0m: " ] } ], "source": [ "#you can use a while loop to exhaust all posts\n", "#Reminder: put some time delay so that you won't exceed the rate limit\n", "page=0\n", "while 'next_results' in search_resp.json()['search_metadata'].keys():\n", " page+=1\n", " next_page=search_resp.json()['search_metadata']['next_results']\n", " search_resp=requests.get(search_url+next_page,headers=headers)\n", " results.extend(search_resp.json()['statuses'])\n", " print(page,'pages have been collected')\n", " time.sleep(15)\n", "print('DONE!')" ] }, { "cell_type": "code", "execution_count": 53, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2127" ] }, "execution_count": 53, "metadata": {}, 
"output_type": "execute_result" } ], "source": [ "len(results)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 4. Preliminary Analysis" ] }, { "cell_type": "code", "execution_count": 54, "metadata": { "scrolled": false }, "outputs": [], "source": [ "#turn results into a dataframe\n", "table=pd.DataFrame.from_records(results)" ] }, { "cell_type": "code", "execution_count": 55, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "Index(['contributors', 'coordinates', 'created_at', 'entities',\n", " 'extended_entities', 'favorite_count', 'favorited', 'geo', 'id',\n", " 'id_str', 'in_reply_to_screen_name', 'in_reply_to_status_id',\n", " 'in_reply_to_status_id_str', 'in_reply_to_user_id',\n", " 'in_reply_to_user_id_str', 'is_quote_status', 'lang', 'metadata',\n", " 'place', 'possibly_sensitive', 'quoted_status', 'quoted_status_id',\n", " 'quoted_status_id_str', 'retweet_count', 'retweeted',\n", " 'retweeted_status', 'source', 'text', 'truncated', 'user'],\n", " dtype='object')" ] }, "execution_count": 55, "metadata": {}, "output_type": "execute_result" } ], "source": [ "table.columns" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 4(a) Co-hashtag Analysis" ] }, { "cell_type": "code", "execution_count": 56, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "{'hashtags': [{'text': '晴天', 'indices': [61, 64]},\n", " {'text': 'HongKong', 'indices': [65, 74]},\n", " {'text': 'アットジャム', 'indices': [75, 82]}],\n", " 'symbols': [],\n", " 'user_mentions': [{'screen_name': 'sakuraebi_staff',\n", " 'name': '桜エビ~ず',\n", " 'id': 3254175722,\n", " 'id_str': '3254175722',\n", " 'indices': [3, 19]}],\n", " 'urls': [],\n", " 'media': [{'id': 1101675110212132864,\n", " 'id_str': '1101675110212132864',\n", " 'indices': [83, 106],\n", " 'media_url': 'http://pbs.twimg.com/media/D0nvrQIVYAAyLgs.jpg',\n", " 'media_url_https': 'https://pbs.twimg.com/media/D0nvrQIVYAAyLgs.jpg',\n", " 'url': 
'https://t.co/sFDRhRnefM',\n", " 'display_url': 'pic.twitter.com/sFDRhRnefM',\n", " 'expanded_url': 'https://twitter.com/sakuraebi_staff/status/1101675118990782464/photo/1',\n", " 'type': 'photo',\n", " 'sizes': {'medium': {'w': 1024, 'h': 674, 'resize': 'fit'},\n", " 'thumb': {'w': 150, 'h': 150, 'resize': 'crop'},\n", " 'large': {'w': 1024, 'h': 674, 'resize': 'fit'},\n", " 'small': {'w': 680, 'h': 448, 'resize': 'fit'}},\n", " 'source_status_id': 1101675118990782464,\n", " 'source_status_id_str': '1101675118990782464',\n", " 'source_user_id': 3254175722,\n", " 'source_user_id_str': '3254175722'}]}" ] }, "execution_count": 56, "metadata": {}, "output_type": "execute_result" } ], "source": [ "table['entities'][0]" ] }, { "cell_type": "code", "execution_count": 57, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "dict_keys(['hashtags', 'symbols', 'user_mentions', 'urls', 'media'])" ] }, "execution_count": 57, "metadata": {}, "output_type": "execute_result" } ], "source": [ "table['entities'][0].keys() #entities is a dictionary of items embedded in the tweet text (hashtags, mentions, urls, media)" ] }, { "cell_type": "code", "execution_count": 64, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'OpenDataDay'" ] }, "execution_count": 64, "metadata": {}, "output_type": "execute_result" } ], "source": [ "table['entities'][1]['hashtags'][1]['text'] #first 1: tweet (row) index; second 1: hashtag index" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For more information about entities, please refer to the official documentation: https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/entities-object" ] }, { "cell_type": "code", "execution_count": 65, "metadata": {}, "outputs": [], "source": [ "hashtags=[]\n", "for entity in table['entities']:\n", "    for hashtag in entity['hashtags']:\n", "        hashtags.append(hashtag['text'])" ] }, { "cell_type": "code", "execution_count": 66, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5645" ] }, "execution_count": 66, "metadata": 
{}, "output_type": "execute_result" } ], "source": [ "len(hashtags)" ] }, { "cell_type": "code", "execution_count": 67, "metadata": {}, "outputs": [], "source": [ "hashtag_freq=pd.value_counts(hashtags) #frequency distribution of hashtags" ] }, { "cell_type": "code", "execution_count": 68, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "HongKong 923\n", "hongkong 332\n", "香港 137\n", "RaspberryPi 68\n", "4IN1 68\n", "dtype: int64" ] }, "execution_count": 68, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hashtag_freq.head() #first 5 rows" ] }, { "cell_type": "code", "execution_count": 69, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'hongkong'" ] }, "execution_count": 69, "metadata": {}, "output_type": "execute_result" } ], "source": [ "'HongKong'.lower()" ] }, { "cell_type": "code", "execution_count": 70, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "hongkong 1315\n", "香港 137\n", "china 90\n", "4in1 68\n", "raspberrypi 68\n", "dtype: int64" ] }, "execution_count": 70, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#convert uppercase to lowercase\n", "hashtags=[i.lower() for i in hashtags] #Write your code here\n", "hashtag_freq=pd.value_counts(hashtags) #data type: Series, index: hashtag\n", "hashtag_freq.head()" ] }, { "cell_type": "code", "execution_count": 71, "metadata": {}, "outputs": [], "source": [ "pd.DataFrame(hashtags).to_csv('hashtags.txt')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can create a word cloud of co-hashtags of #hongkong in https://wordcloud.timdream.org/" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Practice:\n", "---\n", "Please collect most recent 500 tweets using hashtag #FinishTheWall and visualize its co-hashtags with word cloud.
\n", " Please use a variable name other than \"table\" to store your results, because we will use `table` again later. \n", "
" ] }, { "cell_type": "code", "execution_count": 72, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1 pages have been collected\n", "2 pages have been collected\n", "3 pages have been collected\n", "4 pages have been collected\n", "DONE!\n" ] } ], "source": [ "#Write your code here\n", "params = {\n", " 'q': '\"#FinishTheWall\"', #search string\n", " 'result_type': 'recent', #mixed,recent,popular\n", " 'count': 100 #up to 100\n", "}\n", "\n", "search_resp = requests.get(search_url, headers=headers, params=params)\n", "results=search_resp.json()['statuses']\n", "for page in range(4):\n", " next_page=search_resp.json()['search_metadata']['next_results']\n", " search_resp=requests.get(search_url+next_page,headers=headers)\n", " results.extend(search_resp.json()['statuses'])\n", " print(page+1,'pages have been collected')\n", " time.sleep(15)\n", "print('DONE!')" ] }, { "cell_type": "code", "execution_count": 73, "metadata": {}, "outputs": [], "source": [ "table2=pd.DataFrame.from_records(results)" ] }, { "cell_type": "code", "execution_count": 75, "metadata": {}, "outputs": [], "source": [ "hashtags2=[]\n", "for entity in table2['entities']:\n", " for hashtag in entity['hashtags']:\n", " hashtags2.append(hashtag['text'])" ] }, { "cell_type": "code", "execution_count": 77, "metadata": {}, "outputs": [], "source": [ "pd.DataFrame(hashtags2).to_csv('hashtags2.txt')" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.5" }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": { "080e2c47f8ad43f8af72a4742cf1e138": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "CheckboxModel", "state": { 
"description": "float -> integer", "disabled": false, "layout": "IPY_MODEL_d1bb9617ca9a469293c3d7872e048cf5", "style": "IPY_MODEL_46b4655aef3d497f97d962708c65df33", "value": false } }, "09ce660d36034e27a17d579f4c9f9ab7": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.1.0", "model_name": "LayoutModel", "state": {} }, "0fc283fcb76f4be3a960745eed2085fa": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.1.0", "model_name": "LayoutModel", "state": {} }, "0fe58378d42648d0b73ff0f6454e0057": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.1.0", "model_name": "LayoutModel", "state": {} }, "12f31b4141844a6ea96726c39acc5e30": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "CheckboxModel", "state": { "description": "list -> boolean", "disabled": false, "layout": "IPY_MODEL_78ab52ea991743b590e8322cb50a6f7c", "style": "IPY_MODEL_f73a070b6b814a358c1e6a2c1b5176d3", "value": false } }, "1b2c6cab5f134194b9263db4b76a30f9": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "CheckboxModel", "state": { "description": "string -> list", "disabled": false, "layout": "IPY_MODEL_dc3500e672b441f184cb66c9164f46c5", "style": "IPY_MODEL_3dbcc07594dc4af69db3bd8ed0488aab", "value": false } }, "2402d123a94a4e88a71a9241817b4ddd": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "CheckboxModel", "state": { "description": "list -> integer", "disabled": false, "layout": "IPY_MODEL_588e4cdc3f85445587c599183cef6b82", "style": "IPY_MODEL_48f6441a930a4442b8a9774607eb2607", "value": false } }, "263851430804425ba4c9912992ba392e": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "VBoxModel", "state": { "children": [ "IPY_MODEL_331a9e88e84d4657901d770b98811c89", "IPY_MODEL_dd3cd9d0b4da42e38851025c52c016d7", "IPY_MODEL_c7f9522104d044b2865c5b5ace3b5838", 
"IPY_MODEL_1b2c6cab5f134194b9263db4b76a30f9" ], "layout": "IPY_MODEL_f95f5547e73b4ebebb45141e029cd7fa" } }, "2d169e7d0d824f619d6fe9134201dc49": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "DescriptionStyleModel", "state": { "description_width": "" } }, "2e47d0304aa142c68ae79ce9af1e5de0": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "CheckboxModel", "state": { "description": "integer -> string", "disabled": false, "layout": "IPY_MODEL_d2a31aff6ca44e299e4d3b3e0a688e0e", "style": "IPY_MODEL_86e22148b4ea403a9965d15f61ed7773", "value": false } }, "2e65245b455242068b5d5e5121a66778": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "DescriptionStyleModel", "state": { "description_width": "" } }, "31896adee2084ee8a38b6c7dca0bd78e": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "DescriptionStyleModel", "state": { "description_width": "" } }, "32e66f8a5302444d9741b0b052a23fa7": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.1.0", "model_name": "LayoutModel", "state": {} }, "331a9e88e84d4657901d770b98811c89": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "CheckboxModel", "state": { "description": "string -> integer", "disabled": false, "layout": "IPY_MODEL_8b1fad754fa14059b682ca11c1469f8f", "style": "IPY_MODEL_a6eebb365fae434bba48e3fb56f3797d", "value": false } }, "33c5cdcf55fb4721a57a50c2a5b5598a": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.1.0", "model_name": "LayoutModel", "state": {} }, "3730c393db284068afb44c0336c9fe69": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "CheckboxModel", "state": { "description": "boolean -> float", "disabled": false, "layout": "IPY_MODEL_33c5cdcf55fb4721a57a50c2a5b5598a", "style": 
"IPY_MODEL_8ac5c8abd45b45468472270c97163281", "value": false } }, "3bcbe7f5655b48ec8a0e11a971305e5b": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "CheckboxModel", "state": { "description": "float -> list", "disabled": false, "layout": "IPY_MODEL_d0a3786bf784405d8a3b1d62fff9d4df", "style": "IPY_MODEL_eb9e207aef45431d979929a70d6dcdfe", "value": false } }, "3dbcc07594dc4af69db3bd8ed0488aab": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "DescriptionStyleModel", "state": { "description_width": "" } }, "4639370494564980a83c25276bf1c525": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "VBoxModel", "state": { "children": [ "IPY_MODEL_78281b8599ef48e0ad7504242d9ec655", "IPY_MODEL_3730c393db284068afb44c0336c9fe69", "IPY_MODEL_dd92bb017b4a443cb4a32e6e96346ad6", "IPY_MODEL_d440a7f911d04b60a5ea23a47cac51d5", "IPY_MODEL_2402d123a94a4e88a71a9241817b4ddd", "IPY_MODEL_4756cdff64d0476eb778792ceafed665", "IPY_MODEL_12f31b4141844a6ea96726c39acc5e30", "IPY_MODEL_95012cc5cabb42c7b78294c683a57edb" ], "layout": "IPY_MODEL_ea32c50591e04ef984c8538179c8750b" } }, "46b4655aef3d497f97d962708c65df33": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "DescriptionStyleModel", "state": { "description_width": "" } }, "4756cdff64d0476eb778792ceafed665": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "CheckboxModel", "state": { "description": "list -> float", "disabled": false, "layout": "IPY_MODEL_781854e64bce447ea5646b2a2d37a21e", "style": "IPY_MODEL_63b666b6466f49a787ce5b4f04ecf795", "value": false } }, "48f237b0a5e843a286d0c2061a771cda": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.1.0", "model_name": "LayoutModel", "state": {} }, "48f6441a930a4442b8a9774607eb2607": { "model_module": "@jupyter-widgets/controls", "model_module_version": 
"1.4.0", "model_name": "DescriptionStyleModel", "state": { "description_width": "" } }, "524e0a435d7e4ead92056561e3013788": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "DescriptionStyleModel", "state": { "description_width": "" } }, "563a591ca8f449e1a238a611497977a4": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "HBoxModel", "state": { "children": [ "IPY_MODEL_aadb0e9a766b4ff88e2c66572d38a37d", "IPY_MODEL_4639370494564980a83c25276bf1c525", "IPY_MODEL_263851430804425ba4c9912992ba392e" ], "layout": "IPY_MODEL_5c3803e3ce814277b676f43f4079bc38" } }, "588e4cdc3f85445587c599183cef6b82": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.1.0", "model_name": "LayoutModel", "state": {} }, "5c3803e3ce814277b676f43f4079bc38": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.1.0", "model_name": "LayoutModel", "state": {} }, "63b666b6466f49a787ce5b4f04ecf795": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "DescriptionStyleModel", "state": { "description_width": "" } }, "781854e64bce447ea5646b2a2d37a21e": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.1.0", "model_name": "LayoutModel", "state": {} }, "78281b8599ef48e0ad7504242d9ec655": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "CheckboxModel", "state": { "description": "boolean -> integer", "disabled": false, "layout": "IPY_MODEL_a06e54ff34914ffcbfae071fb4de80ed", "style": "IPY_MODEL_524e0a435d7e4ead92056561e3013788", "value": false } }, "78ab52ea991743b590e8322cb50a6f7c": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.1.0", "model_name": "LayoutModel", "state": {} }, "86e22148b4ea403a9965d15f61ed7773": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "DescriptionStyleModel", "state": { 
"description_width": "" } }, "888434c979f04e12aac8f2646c93369f": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "DescriptionStyleModel", "state": { "description_width": "" } }, "8ac5c8abd45b45468472270c97163281": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "DescriptionStyleModel", "state": { "description_width": "" } }, "8b1fad754fa14059b682ca11c1469f8f": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.1.0", "model_name": "LayoutModel", "state": {} }, "911eb96c40f744bfa93164cdc9cac0f5": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "DescriptionStyleModel", "state": { "description_width": "" } }, "95012cc5cabb42c7b78294c683a57edb": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "CheckboxModel", "state": { "description": "list -> string", "disabled": false, "layout": "IPY_MODEL_0fc283fcb76f4be3a960745eed2085fa", "style": "IPY_MODEL_feb0b4e84a17424ea6790a8a6dbd7139", "value": false } }, "980b04576b014e998b763b98e353141f": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "DescriptionStyleModel", "state": { "description_width": "" } }, "9b9fd7a1d0174c0d9843763e900aab5a": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "DescriptionStyleModel", "state": { "description_width": "" } }, "a06e54ff34914ffcbfae071fb4de80ed": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.1.0", "model_name": "LayoutModel", "state": {} }, "a6eebb365fae434bba48e3fb56f3797d": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "DescriptionStyleModel", "state": { "description_width": "" } }, "a7caa6bc99b64fe485c43f75cea49f4b": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "CheckboxModel", 
"state": { "description": "integer -> list", "disabled": false, "layout": "IPY_MODEL_0fe58378d42648d0b73ff0f6454e0057", "style": "IPY_MODEL_911eb96c40f744bfa93164cdc9cac0f5", "value": false } }, "aa34f9124dce4f2981ccdc41cfd2026b": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.1.0", "model_name": "LayoutModel", "state": {} }, "aadb0e9a766b4ff88e2c66572d38a37d": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "VBoxModel", "state": { "children": [ "IPY_MODEL_ee2a141c799e4ab1ae864b4984b1992e", "IPY_MODEL_e6ac8cb0bd5c429d8b97840b0dae2523", "IPY_MODEL_a7caa6bc99b64fe485c43f75cea49f4b", "IPY_MODEL_2e47d0304aa142c68ae79ce9af1e5de0", "IPY_MODEL_080e2c47f8ad43f8af72a4742cf1e138", "IPY_MODEL_f6ff3964100241c19ac10dd537866b65", "IPY_MODEL_3bcbe7f5655b48ec8a0e11a971305e5b", "IPY_MODEL_b973e671a4fc4741bec46c7a4fc6801d" ], "layout": "IPY_MODEL_e4d0940ebbde4afea334bc5a50d58b77" } }, "b973e671a4fc4741bec46c7a4fc6801d": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "CheckboxModel", "state": { "description": "float -> string", "disabled": false, "layout": "IPY_MODEL_e1517a5ece41466c962bd97efd23f41d", "style": "IPY_MODEL_d96e918c59714edca31e188fbdba400f", "value": false } }, "c7f9522104d044b2865c5b5ace3b5838": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "CheckboxModel", "state": { "description": "string -> boolean", "disabled": false, "layout": "IPY_MODEL_48f237b0a5e843a286d0c2061a771cda", "style": "IPY_MODEL_2e65245b455242068b5d5e5121a66778", "value": false } }, "ce97ddd4384242d7b4f4d8973f3d934d": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.1.0", "model_name": "LayoutModel", "state": {} }, "d0a3786bf784405d8a3b1d62fff9d4df": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.1.0", "model_name": "LayoutModel", "state": {} }, "d1bb9617ca9a469293c3d7872e048cf5": { 
"model_module": "@jupyter-widgets/base", "model_module_version": "1.1.0", "model_name": "LayoutModel", "state": {} }, "d2a31aff6ca44e299e4d3b3e0a688e0e": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.1.0", "model_name": "LayoutModel", "state": {} }, "d440a7f911d04b60a5ea23a47cac51d5": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "CheckboxModel", "state": { "description": "boolean -> string", "disabled": false, "layout": "IPY_MODEL_d5598195117f439d8eed0dd556a546c3", "style": "IPY_MODEL_31896adee2084ee8a38b6c7dca0bd78e", "value": false } }, "d5598195117f439d8eed0dd556a546c3": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.1.0", "model_name": "LayoutModel", "state": {} }, "d96e918c59714edca31e188fbdba400f": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "DescriptionStyleModel", "state": { "description_width": "" } }, "dc3500e672b441f184cb66c9164f46c5": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.1.0", "model_name": "LayoutModel", "state": {} }, "dd3cd9d0b4da42e38851025c52c016d7": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "CheckboxModel", "state": { "description": "string -> float", "disabled": false, "layout": "IPY_MODEL_f30b1f4890d84faeb4ab78e1f01ebbed", "style": "IPY_MODEL_2d169e7d0d824f619d6fe9134201dc49", "value": false } }, "dd92bb017b4a443cb4a32e6e96346ad6": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "CheckboxModel", "state": { "description": "boolean -> list", "disabled": false, "layout": "IPY_MODEL_32e66f8a5302444d9741b0b052a23fa7", "style": "IPY_MODEL_980b04576b014e998b763b98e353141f", "value": false } }, "e1517a5ece41466c962bd97efd23f41d": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.1.0", "model_name": "LayoutModel", "state": {} }, 
"e39cc79c87574f648ea15c0c7ece4595": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "DescriptionStyleModel", "state": { "description_width": "" } }, "e4d0940ebbde4afea334bc5a50d58b77": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.1.0", "model_name": "LayoutModel", "state": {} }, "e6ac8cb0bd5c429d8b97840b0dae2523": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "CheckboxModel", "state": { "description": "integer -> boolean", "disabled": false, "layout": "IPY_MODEL_ce97ddd4384242d7b4f4d8973f3d934d", "style": "IPY_MODEL_9b9fd7a1d0174c0d9843763e900aab5a", "value": false } }, "ea32c50591e04ef984c8538179c8750b": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.1.0", "model_name": "LayoutModel", "state": {} }, "eb9e207aef45431d979929a70d6dcdfe": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "DescriptionStyleModel", "state": { "description_width": "" } }, "ee2a141c799e4ab1ae864b4984b1992e": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "CheckboxModel", "state": { "description": "integer -> float", "disabled": false, "layout": "IPY_MODEL_09ce660d36034e27a17d579f4c9f9ab7", "style": "IPY_MODEL_e39cc79c87574f648ea15c0c7ece4595", "value": false } }, "f30b1f4890d84faeb4ab78e1f01ebbed": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.1.0", "model_name": "LayoutModel", "state": {} }, "f6ff3964100241c19ac10dd537866b65": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "CheckboxModel", "state": { "description": "float -> boolean", "disabled": false, "layout": "IPY_MODEL_aa34f9124dce4f2981ccdc41cfd2026b", "style": "IPY_MODEL_888434c979f04e12aac8f2646c93369f", "value": false } }, "f73a070b6b814a358c1e6a2c1b5176d3": { "model_module": "@jupyter-widgets/controls", "model_module_version": 
"1.4.0", "model_name": "DescriptionStyleModel", "state": { "description_width": "" } }, "f95f5547e73b4ebebb45141e029cd7fa": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.1.0", "model_name": "LayoutModel", "state": {} }, "feb0b4e84a17424ea6790a8a6dbd7139": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.4.0", "model_name": "DescriptionStyleModel", "state": { "description_width": "" } } }, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 2 }