Creating the Data Frame and Saving into CSV File

Here, we will create a data frame of all the tweet data that we have downloaded. This way, we can utilize the tweet data for other experimental purposes. Later, all the processed data will be saved to a CSV file in the local system.

```python
#Creating Dataframe of Tweets
#Cleaning searched tweets and converting into Dataframe
my_list_of_dicts = []
for each_json_tweet in searched_tweets:
    my_list_of_dicts.append(each_json_tweet._json)

with open('tweet_json_Data.txt', 'w') as file:
    file.write(json.dumps(my_list_of_dicts, indent=4))

my_demo_list = []
with open('tweet_json_Data.txt', encoding='utf-8') as json_file:
    all_data = json.load(json_file)
    for each_dictionary in all_data:
        # The JSON keys were lost in formatting; the standard Twitter
        # field names matching the variable names are assumed here
        tweet_id = each_dictionary['id']
        text = each_dictionary['text']
        favorite_count = each_dictionary['favorite_count']
        retweet_count = each_dictionary['retweet_count']
        created_at = each_dictionary['created_at']
        my_demo_list.append({'tweet_id': tweet_id,
                             'text': text,
                             'favorite_count': favorite_count,
                             'retweet_count': retweet_count,
                             'created_at': created_at})
tweet_dataset = pd.DataFrame(my_demo_list,
                             columns=['tweet_id', 'text', 'favorite_count',
                                      'retweet_count', 'created_at'])

#Writing tweet dataset to csv file for future reference
tweet_dataset.to_csv('tweet_data.csv')
```

Cleaning Tweet Texts using NLP Operations

As we are ready now with the tweet data set, we will analyze our dataset and clean this data in the following segments.

```python
#Cleaning Data
#Removing handle
def remove_pattern(input_txt, pattern):
    r = re.findall(pattern, input_txt)
    for i in r:
        input_txt = re.sub(i, '', input_txt)
    return input_txt

# The arguments were lost in formatting; removing handles from the
# 'text' column with the pattern "@[\w]*" is assumed here
tweet_dataset['text'] = np.vectorize(remove_pattern)(tweet_dataset['text'], r"@[\w]*")
tweet_dataset.head()
```

As retweet markers ('rt') and URL fragments ('http', 'https') are still present in the tweets, we need to remove all that unnecessary information. Here, as we are ready with the clean tweet data, we will perform NLP operations on the tweet texts, including taking only alphabets, converting all to lower case, tokenization and stemming.

```python
#Cleaning Tweets
corpus = []
for i in range(0, 1000):
    # Keep only alphabets; the regex was lost in formatting and
    # '[^a-zA-Z]' is assumed from the description above
    tweet = re.sub('[^a-zA-Z]', ' ', tweet_dataset['text'][i])
    tweet = tweet.lower()
    tweet = re.sub('rt', '', tweet)
    # Strip 'https' before 'http' so a trailing 's' is not left behind
    tweet = re.sub('https', '', tweet)
    tweet = re.sub('http', '', tweet)
    tweet = tweet.split()
    ps = PorterStemmer()
    # Stemming and stopword removal were lost in formatting; they are
    # reconstructed from the imports and the description above
    tweet = [ps.stem(word) for word in tweet
             if word not in set(stopwords.words('english'))]
    tweet = ' '.join(tweet)
    corpus.append(tweet)
```
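To see what `remove_pattern` does in isolation, here is a small self-contained sketch using only the standard `re` module; the sample tweet text is made up for illustration:

```python
import re

def remove_pattern(input_txt, pattern):
    # find every substring matching the pattern and strip each one out
    r = re.findall(pattern, input_txt)
    for i in r:
        input_txt = re.sub(i, '', input_txt)
    return input_txt

# r"@[\w]*" matches Twitter handles such as "@alice"
cleaned = remove_pattern("RT @alice: lockdown update from @bob_1", r"@[\w]*")
print(cleaned)  # -> "RT : lockdown update from "
```

Note that `np.vectorize` simply applies this scalar function element-wise over the whole `text` column.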
To use the 'tweepy' API, you need to create an account with Twitter Developer. After creating the account, go to the 'Get Started' option and navigate to the 'Create an app' option. After you create the app, note down the below required credentials from there.

```python
#Authorization and Search tweets
#Getting authorization
consumer_key = 'XXXXXXXXXXXXXXX'
consumer_key_secret = 'XXXXXXXXXXXXXXX'
access_token = 'XXXXXXXXXXXXXXX'
access_token_secret = 'XXXXXXXXXXXXXXX'
auth = tweepy.OAuthHandler(consumer_key, consumer_key_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth, wait_on_rate_limit=True)
```

You can pass the keyword of your interest here and the maximum number of tweets to be downloaded through the tweepy API.

```python
#Defining Search keyword and number of tweets and searching tweets
query = 'lockdown'
max_tweets = 2000
# The actual search call was lost in formatting; a Cursor-based
# search like this is assumed:
searched_tweets = [tweet for tweet in
                   tweepy.Cursor(api.search, q=query).items(max_tweets)]
```

Sentiment Analysis

We will now analyze the sentiments of the tweets that we have downloaded and then visualize them here.

```python
#Sentiment Analysis Report
#Finding sentiment analysis (+ve, -ve and neutral)
pos = 0
neg = 0
neu = 0
for tweet in searched_tweets:
    analysis = TextBlob(tweet.text)
    if analysis.sentiment.polarity > 0:
        pos = pos + 1
    elif analysis.sentiment.polarity < 0:
        neg = neg + 1
    else:
        neu = neu + 1
print("Total Positive = ", pos)
print("Total Negative = ", neg)
print("Total Neutral = ", neu)

#Plotting sentiments
labels = 'Positive', 'Negative', 'Neutral'
sizes = [pos, neg, neu]
# The color list was lost in formatting; any three matplotlib colors work
colors = ['yellowgreen', 'lightcoral', 'gold']
explode = (0.1, 0, 0)  # explode 1st slice
plt.pie(sizes, explode=explode, labels=labels, colors=colors,
        autopct='%1.1f%%', shadow=True, startangle=140)
plt.axis('equal')
plt.show()
```
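The counting logic above does not actually depend on tweepy or TextBlob; given a list of polarity scores, the same positive/negative/neutral tally can be sketched as follows (the function name `count_sentiments` and the sample scores are ours, not part of the article's code):

```python
def count_sentiments(polarities):
    # tally scores: > 0 is positive, < 0 is negative, 0 is neutral,
    # mirroring the TextBlob-based loop above
    pos = neg = neu = 0
    for p in polarities:
        if p > 0:
            pos += 1
        elif p < 0:
            neg += 1
        else:
            neu += 1
    return pos, neg, neu

print(count_sentiments([0.5, -0.2, 0.0, 0.8, -0.9, 0.0]))  # -> (2, 2, 2)
```

Separating the tally from the API calls like this also makes the classification step easy to unit-test without any network access.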
```python
#Importing Libraries
import tweepy
from textblob import TextBlob
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import re
import nltk
nltk.download('stopwords')
from nltk.corpus import stopwords
from nltk.stem.porter import PorterStemmer
from wordcloud import WordCloud
import json
from collections import Counter
```

Downloading the data from Twitter
Here, we will discuss a hands-on approach to download and analyze Twitter data. We will import all the required libraries. Make sure to install the 'tweepy', 'textblob' and 'wordcloud' libraries first, using 'pip install tweepy', 'pip install textblob' and 'pip install wordcloud'.