Utkarsh Lal
Research & Projects
Research Paper: Fractal Dimensions and Machine Learning for the Detection of Parkinson’s Disease in resting-state EEG.
DOI: https://doi.org/10.21203/rs.3.rs-3270985/v1
Status: PUBLISHED in the Neural Computing and Applications Journal (IF: 6.0).
Supervision: Professor Arjun CV (PhD scholar at Technological University (TU) Dublin) and Dr. Luca Longo (Founder, AI and Cognitive Load Lab, TU Dublin)
​
I conducted a comparative analysis of window segmentation, machine learning, and fractal dimension techniques and proposed a novel model that achieves over 97% accuracy in detecting Parkinson’s Disease from EEG. I also employed Explainable AI methods to make the model more interpretable, visualizing the feature importances produced by the classification models as topographic plots of the brain. These plots identified the motor cortex as having higher importance in differentiating between healthy controls and Parkinson’s Disease patients. Furthermore, the proposed model showed robust performance in detecting Parkinson’s Disease in patients under varying levels of medication.
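As a rough illustration of how a fractal dimension feature can be computed per window, here is a minimal sketch using Higuchi's method. The choice of Higuchi's estimator, the `k_max` value, and the window sizes are assumptions made for this example; the text above does not say which estimators or segmentation settings the paper compared.

```python
import math


def sliding_windows(signal, size, step):
    """Split a sequence into overlapping windows of `size` samples."""
    return [signal[s:s + size] for s in range(0, len(signal) - size + 1, step)]


def higuchi_fd(signal, k_max=8):
    """Estimate the Higuchi fractal dimension of a 1-D sequence."""
    n = len(signal)
    xs, ys = [], []
    for k in range(1, k_max + 1):
        lengths = []
        for m in range(k):
            steps = (n - 1 - m) // k  # increments available in this sub-series
            if steps < 1:
                continue
            dist = sum(abs(signal[m + i * k] - signal[m + (i - 1) * k])
                       for i in range(1, steps + 1))
            # Higuchi's normalisation for the sub-series length
            lengths.append(dist * (n - 1) / (steps * k) / k)
        if lengths and sum(lengths) > 0:
            xs.append(math.log(1.0 / k))
            ys.append(math.log(sum(lengths) / len(lengths)))
    if len(xs) < 2:
        raise ValueError("signal is too short or constant")
    # Least-squares slope of log L(k) against log(1/k) is the estimate
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


def windowed_fd(signal, size, step, k_max=8):
    """One fractal dimension value per window, usable as a classifier feature."""
    return [higuchi_fd(w, k_max) for w in sliding_windows(signal, size, step)]
```

A straight line yields a dimension of 1 and white noise a value near 2, which gives a quick sanity check before the features are passed to a classifier.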
Research Paper: Leveraging Singular Value Decomposition Entropy and Machine Learning for Alzheimer’s and Frontotemporal Dementia Detection using EEG.
Status: Further refining the paper.
DOI: https://doi.org/10.36227/techrxiv.23992554.v2
Supervision: Prof. Arjun CV, Dr. Luca Longo
​
Frontotemporal Dementia (FTD) is often misdiagnosed as Alzheimer’s Disease (AD), so there is a need for automated techniques that accurately differentiate between the two. In this study, I built a model using SVD Entropy, sliding window segmentation, and Extreme Gradient Boosting that detects and differentiates between Alzheimer’s and Frontotemporal Dementia with 90-95% average accuracy. I employed Explainable AI (XAI) to identify brain regions with higher degeneration by interpreting the results of the classification models, and I conducted a comparative performance analysis of several feature extraction measures and machine learning algorithms. This paper is currently in its second round of review at IEEE Access and received funding from Science Foundation Ireland for publication costs.
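The feature extraction step described above can be sketched as follows. This is an illustrative example assuming NumPy is available; the embedding `order`, `delay`, and window settings are placeholder values rather than the ones used in the paper, and the gradient boosting classifier is left out.

```python
import numpy as np


def svd_entropy(window, order=3, delay=1):
    """Shannon entropy (in bits) of the normalised singular values of a
    time-delay embedding of `window`."""
    x = np.asarray(window, dtype=float)
    n_rows = len(x) - (order - 1) * delay
    # Each row is [x[t], x[t + delay], ..., x[t + (order - 1) * delay]]
    embedded = np.array([x[i:i + (order - 1) * delay + 1:delay]
                         for i in range(n_rows)])
    singular = np.linalg.svd(embedded, compute_uv=False)
    total = singular.sum()
    if total == 0:
        return 0.0
    p = singular / total
    p = p[p > 0]  # drop zero singular values so log2 is defined
    return float(-(p * np.log2(p)).sum())


def windowed_svd_entropy(signal, size, step, order=3, delay=1):
    """One SVD entropy value per sliding window of the signal."""
    return [svd_entropy(signal[s:s + size], order, delay)
            for s in range(0, len(signal) - size + 1, step)]
```

With `order=3` the entropy lies between 0 (a fully predictable window) and log2(3) (no dominant component), so each EEG channel contributes one bounded feature per window.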
Client - Reliance JIO
Oct, 2022 - Present
​
-
Developed a SARIMAX model to forecast sales revenue and quantity for products on the JioMart e-commerce platform.
-
Decomposed the time series to analyze trends and seasonality. Created exogenous variables to account for increased user interaction during festive seasons.
-
Employed Custom Exponential Smoothing for handling seasonality and computed city-wise product popularities for JioMart.
-
Conducted hypothesis testing using the Augmented Dickey-Fuller test to check that the preprocessed series was stationary.
-
Leveraged the pmdarima Python package to build and tune the SARIMAX model.
-
Optimized the model and implemented mechanisms for logging and auditing.
-
Stack - Python, pmdarima, statsmodels
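To illustrate the exogenous-variable step from the list above, here is a minimal sketch of a festive-season indicator built with the standard library. The date ranges and the helper names `festive_flag` and `build_exog` are hypothetical placeholders, not the calendar or code used in the project.

```python
from datetime import date, timedelta

# Hypothetical festive windows as inclusive (start, end) pairs; placeholders only.
FESTIVE_WINDOWS = [
    (date(2022, 10, 20), date(2022, 10, 26)),
]


def festive_flag(day, windows=FESTIVE_WINDOWS):
    """Return 1 if `day` falls inside any festive window, else 0."""
    return int(any(start <= day <= end for start, end in windows))


def build_exog(start, n_days, windows=FESTIVE_WINDOWS):
    """Build a daily (date, flag) series aligned with the sales series."""
    days = [start + timedelta(days=i) for i in range(n_days)]
    return [(d, festive_flag(d, windows)) for d in days]
```

A column like this, aligned day by day with the sales series, is the kind of input a SARIMAX model accepts as an exogenous regressor; the same flags have to be supplied for the forecast horizon as well.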
​
Facial Emotion Recognition using CNN
-
Conducted this project as a part of the Applied Data Science Program at MIT Professional Education.
-
Crafted a TensorFlow-based Convolutional Neural Network (CNN) model to classify four emotion states (happy, sad, neutral, surprised) within an image dataset.
-
Leveraged transfer learning by integrating and fine-tuning pre-trained models such as VGG16, ResNet V2, and EfficientNet.
-
Published a comprehensive report detailing the comparative performance analysis of all the implemented models.
Client - Kimberly-Clark
Jan - Jul, 2021
​
Built a web application for scraping product reviews from Amazon.com and Shopee, paired with a sentiment analysis model.
Applied Lemmatization and Negation Handling in text pre-processing and utilized TF-IDF and n-grams for feature extraction.
Developed a POC for Topic Modelling using Latent Dirichlet Allocation to identify the most significant features in negative reviews and deployed the web app on Azure using Flask and Redis.
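The feature extraction described above (TF-IDF over n-grams) can be sketched in plain Python. This is an illustration only: it uses whitespace tokenization and an unsmoothed `log(N / df)` weighting, which differs from the smoothed default in scikit-learn, and the sample reviews in the usage note are invented.

```python
import math
from collections import Counter


def ngrams(tokens, n_max=2):
    """Return all n-grams of length 1..n_max as space-joined strings."""
    out = []
    for n in range(1, n_max + 1):
        out.extend(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return out


def tfidf(docs, n_max=2):
    """Map each document to a {term: tf-idf weight} dictionary."""
    term_lists = [ngrams(d.lower().split(), n_max) for d in docs]
    # Document frequency: number of documents containing each term
    df = Counter(t for terms in term_lists for t in set(terms))
    n_docs = len(docs)
    vectors = []
    for terms in term_lists:
        tf = Counter(terms)
        vectors.append({t: (c / len(terms)) * math.log(n_docs / df[t])
                        for t, c in tf.items()})
    return vectors
```

For example, `tfidf(["not good product", "good product", "bad smell"])` keeps the bigram "not good" as its own feature, which is what lets a downstream classifier tell it apart from "good".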
Research Paper: Ensemble Temporal feature extraction and Machine Learning for Classification of Sleep Stages from Telemetry PSG Data
DOI: https://doi.org/10.3390/brainsci13081201
Status: PUBLISHED in the Brain Sciences Journal (IF: 3.4)
Supervision: Professor Suhas Mathavu and Professor Anitha Hoblidar (Department of Electronics and Communications Engineering, Manipal Institute of Technology)
​
I implemented an ensemble feature extraction method using Power Spectral Density, Higuchi Fractal Dimension, Detrended Fluctuation Analysis, SVD Entropy, and Permutation Entropy, coupled with statistical measures (standard deviation, kurtosis, skewness, and mean), to extract salient features of different sleep stages from Polysomnography data. Electromyography (EMG), Electrooculography (EOG), and Electroencephalography (EEG) biosignals were used. I compared several machine learning models to determine the optimal pipeline for distinguishing between five different sleep stage configurations. The final pipeline achieved 90-97% accuracy across all configurations. This paper was published in the Brain Sciences journal.
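Two of the feature families named above, Permutation Entropy and the statistical moments, are simple enough to sketch with the standard library. The `order` and `delay` defaults are assumptions for illustration, and the kurtosis here is the plain (non-excess) population value; the paper's exact settings are not stated in this summary.

```python
import math
from collections import Counter


def permutation_entropy(signal, order=3, delay=1, normalize=True):
    """Entropy of the ordinal patterns found in `signal`."""
    n_patterns = len(signal) - (order - 1) * delay
    patterns = Counter()
    for i in range(n_patterns):
        window = [signal[i + j * delay] for j in range(order)]
        # The ordinal pattern is the index order that sorts the window
        patterns[tuple(sorted(range(order), key=window.__getitem__))] += 1
    h = -sum((c / n_patterns) * math.log2(c / n_patterns)
             for c in patterns.values())
    return h / math.log2(math.factorial(order)) if normalize else h


def moments(signal):
    """Return (mean, std, skewness, kurtosis) using population formulas.
    Assumes the signal is not constant, otherwise std is zero."""
    n = len(signal)
    mean = sum(signal) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in signal) / n)
    skew = sum((x - mean) ** 3 for x in signal) / n / std ** 3
    kurt = sum((x - mean) ** 4 for x in signal) / n / std ** 4
    return mean, std, skew, kurt
```

Concatenating values like these for each EEG, EOG, and EMG epoch gives one feature vector per epoch for the classifier.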
Research Paper: Effective Negation Handling Approach for Sentiment Classification using Synsets in the WordNet lexical database.
Status: PUBLISHED
Presented the paper at the 2022 First International Conference on Electrical, Electronics, Information and Communication Technologies (ICEEICT)
Honor: BEST PAPER
Supervision: Prof. Priya Kamath, MIT Manipal
​
I created a novel First Sentiment Word (FSW) Negation Replacement Algorithm based on the antonymy relations in the WordNet lexical database, which is available through the Natural Language Toolkit (NLTK) library. I used WordNet synsets to replace sentiment words with their antonyms without inverting the polarity of the sentence. I experimented with different combinations of Negation Handling, Lemmatization, and Stemming to determine the optimal preprocessing technique. Afterward, I applied TF-IDF and extracted unigram and bigram features. After training machine learning models on these features, average accuracy increased by 5-10% with Negation Handling. I presented the paper at the ICEEICT (IEEE) 2022 conference and received the Best Paper recognition.
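One plausible reading of the replacement step is sketched below: the first sentiment word after a negation is swapped for its antonym and the negation is dropped, so "not good" becomes "bad". This is an assumption about how the algorithm works, not the published implementation, and the small `ANTONYMS` table is a hand-written stand-in for WordNet antonym lookups.

```python
NEGATIONS = {"not", "no", "never"}

# Hypothetical antonym table standing in for WordNet lookups.
ANTONYMS = {"good": "bad", "bad": "good", "like": "dislike", "happy": "unhappy"}


def replace_first_sentiment_word(tokens, antonyms=ANTONYMS, negations=NEGATIONS):
    """Replace the first sentiment word after a negation with its antonym
    and remove the negation, keeping the sentence polarity unchanged."""
    out = []
    pending = None  # position in `out` of a negation awaiting a sentiment word
    for tok in tokens:
        low = tok.lower()
        if low in negations:
            pending = len(out)
            out.append(tok)
        elif pending is not None and low in antonyms:
            del out[pending]            # drop the negation word
            out.append(antonyms[low])   # insert the antonym instead
            pending = None
        else:
            out.append(tok)
    return out
```

If no sentiment word follows a negation, the tokens are returned unchanged, so the negation remains available to later n-gram features.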
Client - Reliance JIO
Oct, 2022 - Present
​
-
Constructed the end-to-end Machine Learning pipeline for predicting the popularity of items in the Tira catalog (Tirabeauty.com).
-
Leveraged regression algorithms such as XGBoost and Huber Regressor to identify the relevance of click events to a unique product purchase.
-
Served as the project owner and the point of contact for all stakeholders in the Tira product space, collaborating with multiple product managers.
-
Analyzed user interaction through events such as add-to-cart actions, wishlist additions, completed orders, and clicks.
-
Implemented an inverse logarithmic recency bias function for data augmentation, assigning higher weights to recent product interactions and lower weights to older ones.
-
Ranked products based on popularity and user interaction to enhance the Search recommendations.
Stack - Python, SQL, BigQuery, Scikit-learn, PyCaret
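The recency weighting in the list above could take a form like the following sketch. The exact formula is not given here, so `1 / (1 + ln(1 + age_days))` is an assumed example of an inverse logarithmic decay, and the function names and event format are hypothetical.

```python
import math


def recency_weight(age_days):
    """Assumed inverse-logarithmic decay: 1.0 for today, slowly shrinking with age."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    return 1.0 / (1.0 + math.log1p(age_days))


def popularity_scores(events):
    """Sum recency weights per product.
    `events` is an iterable of (product_id, age_days) interaction records."""
    scores = {}
    for product_id, age_days in events:
        scores[product_id] = scores.get(product_id, 0.0) + recency_weight(age_days)
    return scores


def rank_products(events):
    """Product ids ordered from most to least popular."""
    scores = popularity_scores(events)
    return sorted(scores, key=scores.get, reverse=True)
```

Under this weighting, two interactions from today outrank three from a hundred days ago, which is the behaviour the recency bias is meant to produce.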
-
Conducted this project as a part of the Applied Data Science Program at MIT Professional Education.
-
Applied and fine-tuned Random Forest, XGBoost, Logistic Regression, and KNN models to predict loan defaults for a bank.
-
Handled severely imbalanced datasets using the Synthetic Minority Oversampling Technique (SMOTE).
-
Conducted extensive univariate and multivariate exploratory data analysis, providing an in-depth report with data-driven insights and client recommendations.
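The core idea behind SMOTE, mentioned in the list above, is to create synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. The sketch below shows that idea in plain Python; it is a simplified illustration rather than the library implementation used in the project, and `k` and `seed` are arbitrary example values.

```python
import random


def smote(minority, n_new, k=3, seed=0):
    """Generate `n_new` synthetic points from a list of minority-class tuples.
    Requires at least two minority points."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        # k nearest minority neighbours by squared Euclidean distance
        neighbours = sorted(
            others,
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to mate
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, mate)))
    return synthetic
```

Oversampling of this kind is normally applied to the training split only, so that synthetic points do not leak into evaluation.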
-
Built a churn prediction model using Call Detail Record (CDR) data of a telecom organization on Google Cloud's Vertex AI.
-
Created a streaming data pipeline handling 50,000+ rows per second of real-time data utilizing Spark Streaming.
-
Employed Recursive Feature Elimination and combined Random Forest and XGBoost in a soft-voting ensemble classifier.
-
Constructed batch Python data pipelines in Google Cloud Functions and DataProc. Orchestrated the pipelines using Apache Airflow and Cloud Composer. Performed extensive feature engineering using SQL in Google BigQuery.