Data analysis journey

Visualizing text data

Visualizing keywords in text dataset

Published February 23, 2024

I am an industrial engineer using Python to gain deeper insights from data.

I am currently learning deep learning with TensorFlow.

While exploring the Kaggle dataset of tweets used to train models that predict whether a tweet is about a real disaster, I tried a couple of ways of visualizing the text data.

Python Bar Chart

Part of the dataset is a column for the tweet keyword. I created a sorted bar chart to display the top keywords. 'Fatalities' is the top keyword by count, with a number of other keywords very close behind.

# Plot the 20 most frequent tweet keywords
import matplotlib.pyplot as plt

top_keywords = keyword_df.head(20)
plt.bar(top_keywords['keyword'], top_keywords['count'], color='green')
plt.xticks(rotation=90)
plt.ylabel('Count')
plt.title('Top keywords')
plt.show()
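
The chart code assumes a keyword_df table with 'keyword' and 'count' columns, sorted from most to least frequent. The post doesn't show how that table was built, so here is a minimal sketch of one way to do it with pandas. The tweets DataFrame and its sample rows are hypothetical stand-ins for the Kaggle training data, not values from the dataset.

```python
import pandas as pd

# Hypothetical stand-in for the Kaggle training data; only the
# 'keyword' column matters for this step.
tweets = pd.DataFrame({
    "keyword": ["fatalities", "deluge", "fatalities", None,
                "armageddon", "deluge", "fatalities"],
})

# Count each keyword (missing values are dropped by default), then turn
# the result into a two-column table sorted from most to least frequent.
keyword_df = (
    tweets["keyword"]
    .value_counts()
    .rename_axis("keyword")
    .reset_index(name="count")
)

print(keyword_df)
```

With the table in this shape, head(20) returns the 20 most frequent keywords, which is what the bar chart plots.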

Python WordCloud

In Python, I created a word cloud of the keywords.

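# Create word cloud visual of keywords
from wordcloud import WordCloud

# Map each keyword to its count. keyword_df is assumed to hold one row per
# keyword (as in the bar chart), so value_counts() on it would give every
# word a frequency of 1.
word_frequencies = dict(zip(keyword_df['keyword'], keyword_df['count']))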

# Generate the word cloud with frequencies
wordcloud = WordCloud(width=800, height=400, background_color='white')
wordcloud.generate_from_frequencies(word_frequencies)

# Display the word cloud using matplotlib
plt.figure(figsize=(10, 5))
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis('off')
plt.show()

This created a neat visual in which the size of each word reflects how often that keyword appears.

Tableau Word Cloud and TreeMap

Finally, I explored visualizing the keywords in Tableau. Because the top keywords have counts that are very close together, the words came out at nearly the same size, and the word cloud looked more like a list of words.

I followed this video in creating the visual.

https://www.youtube.com/watch?v=UHOMH5DTq14

Changing the Tableau chart into a TreeMap made the differences a little easier to see, but it still isn't as helpful as I would like.
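
One rough way to check in advance whether a word cloud will show much contrast is to measure how flat the top counts are. The sketch below assumes word size scales roughly with count and uses hypothetical numbers, not the dataset's; relative_spread is an illustrative helper, not part of any library.

```python
def relative_spread(counts):
    """Return (max - min) / max for a list of positive counts."""
    highest, lowest = max(counts), min(counts)
    return (highest - lowest) / highest


# Hypothetical counts, not taken from the dataset.
close_counts = [45, 44, 42, 41, 40]
varied_counts = [45, 30, 18, 9, 3]

print(round(relative_spread(close_counts), 2))   # 0.11
print(round(relative_spread(varied_counts), 2))  # 0.93
```

A spread near 0 means the smallest word would be almost as large as the biggest one, which matches the list-like look described above. A bar chart, which shows each count on an axis, may be the clearer choice in that case.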

You can view the Kaggle notebook, where the Python charts are used as part of an NLP classification model.
