Lorena Gongang
Tech journey with Lorena


Sentiment analysis on streaming Twitter data using Kafka, Spark Structured Streaming & Python (Part 3)

Lorena Gongang · Mar 21, 2022

Table of contents

  • Goal
  • Roadmap
  • Let's jump into it.
  • Conclusion

Before jumping into this article, make sure you read the other parts of this project: Part 1 and Part 2.


Goal

The goal of this article is to visualise the results of the sentiment analysis done in the previous parts.



Roadmap

As shown in the previous article of this project, we stored the real-time data in MongoDB Atlas after running sentiment analysis on it. During my research, I found that MongoDB Atlas includes a visualization feature, and I decided to explore it.

Let's jump into it.

The tweets are ingested with Kafka, processed with Spark Structured Streaming, and then stored in MongoDB Atlas. Note that I chose Bitcoin as the search term for the tweets. The tweets end up in the database in the following form, each with its associated sentiment:

Capture d’écran 2022-03-18 à 22.17.44.png

I then visualize them by clicking on "visualize your data". This opens a platform that is clear and easy to use, where you can drag and drop the fields you want to visualize. I built a simple chart showing the number of positive, negative and neutral tweets. The result looks like this:

Enregistrement de l’écran 2022-03-18 à 221244-high.gif
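To make it concrete what the chart is counting, here is a minimal sketch in Python. It assumes each stored tweet document has a field named "sentiment" holding "positive", "negative" or "neutral"; the field name and the sample documents are hypothetical, so adapt them to your own collection. The `pipeline` variable shows what an equivalent MongoDB aggregation could look like, using the standard `$group` and `$sum` operators.

```python
from collections import Counter

# Hypothetical sample documents; the real field names in the collection may differ.
sample_tweets = [
    {"text": "Bitcoin is going up!", "sentiment": "positive"},
    {"text": "Bitcoin crashed again", "sentiment": "negative"},
    {"text": "Bitcoin price today", "sentiment": "neutral"},
    {"text": "Loving Bitcoin", "sentiment": "positive"},
]


def count_by_sentiment(tweets, field="sentiment"):
    """Return the number of tweets per sentiment label.

    Documents that do not have the field are skipped.
    """
    return Counter(doc[field] for doc in tweets if field in doc)


# A comparable aggregation pipeline that could be run server-side,
# assuming the same (hypothetical) "sentiment" field name.
pipeline = [
    {"$group": {"_id": "$sentiment", "count": {"$sum": 1}}},
    {"$sort": {"count": -1}},
]

counts = count_by_sentiment(sample_tweets)
print(dict(counts))
```

With pymongo, the same pipeline could be passed to `collection.aggregate(pipeline)`, but the chart builder does this grouping for you when you drop the sentiment field onto the chart.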


Conclusion

To conclude, in my opinion MongoDB Atlas is a good tool for storing data and building charts.

However, I noticed a limitation with the refresh interval. Since I am working with real-time data, this matters: the refresh intervals MongoDB offers start at 20 seconds. This means the charts do not update as soon as new data arrives; they have to wait for the next refresh, up to 20 seconds later.
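As a rough illustration of what this interval means for freshness, the sketch below computes how long a newly stored tweet waits before the next refresh. It is a simplification of my own: it assumes refreshes happen exactly at multiples of the interval (0 s, 20 s, 40 s, ...) and it ignores the time spent in Kafka, Spark and the database write. The function name `display_delay` is hypothetical.

```python
import math


def display_delay(arrival_s: float, refresh_interval_s: float = 20.0) -> float:
    """Seconds between a tweet being stored and the next chart refresh.

    Assumes refreshes occur at t = 0, interval, 2 * interval, ...
    (a simplification; the real schedule may differ).
    """
    if refresh_interval_s <= 0:
        raise ValueError("refresh_interval_s must be positive")
    # Time of the first refresh at or after the arrival time.
    next_refresh = math.ceil(arrival_s / refresh_interval_s) * refresh_interval_s
    return next_refresh - arrival_s


# A tweet stored 1 second after a refresh waits 19 seconds to appear.
print(display_delay(1.0))   # 19.0
# A tweet stored 25 seconds in waits for the refresh at 40 seconds.
print(display_delay(25.0))  # 15.0
```

Under these assumptions, the delay added by the chart is always below 20 seconds, and about 10 seconds on average if tweets arrive evenly over time.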

I may need to test other visualisation tools to compare them in terms of latency.
