University of Waterloo

Graduate Research Student
September 2022 - Present, Waterloo, ON, Canada

I am a research student affiliated with the Waterloo Analytics and Optimization Lab (WANOPT) and the Critical ML Lab. My research focuses on Machine Learning, Spatiotemporal Data Analysis, and Optimization.

Socio Cognitive Lab

Research Associate
December 2020 - May 2022, Dhaka, Bangladesh

A data science lab devoted to social media mining, opinion mining, and social computing, directed by Dr. Saddam Hossain.

My responsibilities:

  • Identified and formulated research problems and scope
  • Performed data collection, preparation, and analysis
  • Designed, developed, and evaluated Machine Learning models


Peer-reviewed Journal Papers

[J.1] Ahmed Shahriar Sakib, Md. Saddam Hossain Mukta, Fariha Rowshan Huda, A.K.M. Najmul Islam, Tohedul Islam and Mohammed Eunus Ali, “Identifying Insomnia from Social Media Posts: Psycholinguistic Analyses of User Tweets”, Journal of Medical Internet Research (JMIR), 23(12):e27613, Dec 2021. [IF-7.08] DOI:10.2196/27613

Summary : The purpose of this research is to build an insomnia prediction model from users’ psycholinguistic patterns, i.e., word usage, semantics, and their Big5 personality traits as derived from tweets. Using Twitter’s advanced search, users from six countries where English is the first language were collected based on tweet keywords (e.g., “insomnia”, “sleepless”) and divided into two groups: insomniac and non-insomniac (YES and NO). Two psycholinguistic tools, LIWC and Empath, were used to build psycholinguistic profiles of the users from their word choices and the semantic relationships between the words of their tweets. To find the relationship between a user’s personality traits and insomnia, the IBM Watson Personality Insights API was used. Feature selection was performed via Fisher’s linear discriminant analysis using IBM SPSS. Moreover, to capture the contextual relationships between words in a sentence, BERT word-embedding vectors were generated using Sentence Transformers. Finally, a double-weighted ensemble classification model was developed to predict insomnia from both the psycholinguistic and the personality traits derived from user tweets.
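As a rough illustration of the final ensemble step, the sketch below combines per-model probabilities with a plain weighted soft vote. It is not the paper's double-weighted scheme, whose details are not given here; the function name, the example weights, and the 0.5 threshold are all assumptions made for illustration.

```python
def weighted_soft_vote(probabilities, weights, threshold=0.5):
    """Combine per-model P(insomnia) estimates into one score and a YES/NO label.

    probabilities: one estimate in [0, 1] per base model (hypothetical inputs).
    weights: one non-negative weight per base model, e.g. validation accuracy.
    """
    if len(probabilities) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Weighted average of the individual model probabilities.
    score = sum(p * w for p, w in zip(probabilities, weights)) / total
    label = "YES" if score >= threshold else "NO"
    return score, label


# Hypothetical example: a psycholinguistic model and a personality model.
score, label = weighted_soft_vote([0.8, 0.4], [3, 1])
print(round(score, 2), label)
```

Weighting by a validation metric lets the stronger feature group dominate the vote without discarding the weaker one.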

Peer-reviewed Conference Papers

[C.1] Md. Saddam Hossain Mukta, Ahmed Shahriar Sakib, Md. Adnanul Islam, Mohammed Eunus Ali, Mohiuddin Ahmed and Mumshad Ahamed Rifat, “Friends’ Influence Driven Users’ Value Change Prediction from Social Media Usage”, 2021 International Conference on Social Computing, Behavioral-Cultural Modeling & Prediction and Behavior Representation in Modeling and Simulation (SBP-BRiMS 2021)

Summary : In this study, we show that we can predict the value change of a person by considering both the influence of her friends and her social media usage. This is the first work in the literature that relates the influence of social media friends to the human value dynamics of a user. We propose a Bounded Confidence Model (BCM) based value dynamics model, built from 275 different ego networks on Facebook, that predicts how social influence may persuade a person to change her value over time. We optimize the proposed model’s hyperparameters with a particle swarm optimization based tuning technique by applying regression.
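To make the bounded confidence idea concrete, here is a minimal sketch of a generic BCM update for one ego and her friends. It is not the paper's exact formulation: the confidence bound `epsilon`, the convergence rate `mu`, the value scores, and the choice to hold friends' values fixed are all assumptions made for illustration. In the paper's setting, parameters like these are the kind of hyperparameters that particle swarm optimization would tune.

```python
def bcm_step(ego_value, friend_values, epsilon, mu):
    """One bounded-confidence update of the ego's value score.

    Only friends whose value lies within `epsilon` of the ego's value
    exert influence; the ego moves a fraction `mu` toward their mean.
    """
    close = [v for v in friend_values if abs(v - ego_value) <= epsilon]
    if not close:
        return ego_value  # nobody is within the confidence bound
    mean_close = sum(close) / len(close)
    return ego_value + mu * (mean_close - ego_value)


def simulate(ego_value, friend_values, epsilon=0.2, mu=0.5, steps=5):
    """Repeat the update, holding friends' values fixed (a simplification)."""
    trajectory = [ego_value]
    for _ in range(steps):
        ego_value = bcm_step(ego_value, friend_values, epsilon, mu)
        trajectory.append(ego_value)
    return trajectory


# Hypothetical value scores in [0, 1]: the friend at 0.9 is too far to matter.
print(simulate(0.5, [0.6, 0.9]))
```

In this example the ego drifts toward 0.6 and never reaches the distant friend at 0.9, which is the behavior that distinguishes a bounded confidence model from simple averaging.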


Thesis : Can We Predict Insomnia From Tweets?

Summary : Social media has become a popular platform for people to share their thoughts, opinions, and beliefs, which often play a vital role in analyzing their personality and mental traits. This research presents the hypothesis that insomnia can be predicted by analyzing tweets from users of Twitter (~1800 users, 6 million+ tweets), a commonly used social networking site. Insomnia is a sleep disorder characterized by difficulty falling and/or staying asleep, which ultimately leads to exhaustion, low vitality, trouble concentrating, aggravation of mind, and diminished productivity at work or school.

After collecting, filtering, and pre-processing the tweets, psycholinguistic features were extracted using LIWC2007 and Empath. The feature selection process involved two different approaches: i) regsubsets from the R leaps package for the LIWC feature set and ii) Fisher’s discriminant analysis using IBM SPSS for the Empath feature set. After the discriminant analysis, the Empath feature set did not show an acceptable correlation with the target variable (Insomnia Yes or No).

Finally, classic machine learning models (i.e., Random Forest, Naive Bayes, J48, PART) were applied in Weka to the LIWC feature set and target variable to predict insomnia. Among them, Random Forest performed better than the rest of the models, with an accuracy of approximately 70%. This study aims to raise awareness and provide a prognosis about insomnia before it turns into a health-hazardous form.

Details about this work can be found here - PDF.