Welcome To My Website!
Glad to see you here!

About

With more than 3 years of hands-on experience developing and deploying ML models, I have cultivated a strong passion for driving innovation and delivering impactful solutions. My expertise spans Machine Learning, Deep Learning, NLP, Transformers, LLMs, and Generative AI. I have comprehensive knowledge of end-to-end AI deployment, from system setup to monitoring and maintenance, on platforms such as AWS and GCP. Fueled by creativity and passion, I thrive in challenging environments and enjoy collaborating on creative solutions. I am always open to connecting and exploring potential collaborations. Let's discuss how we can create a positive impact together.

TECHNICAL SKILLS: Python, SQL, R, MATLAB, Scikit-Learn, Keras, TensorFlow, PyTorch, Hugging Face, LangChain, Flask, AWS, GCP.

Projects

(Click each tile to learn more)

Highlights

  • Vice President & Academic Committee Head (former President) of the Machine Learning Club at SJSU.
  • Recipient of a prestigious academic scholarship from the SJSU Alumni Association.
  • Received a patent from Intellectual Property India for an innovative orthopedic product, the Nylon Fabricated Bone Immobilizer, built using rapid prototyping.
  • Awarded Best Outgoing Student of the Year 2016 during my bachelor's degree.

Discover what my research professor had to say about me: Recommendation Letter

Education

San Jose State University
MS Data Analytics

Linkoping University
MS Biomedical Engineering

Anna University
BE Biomedical Engineering

Work Experience

  • June 2016 - July 2017

    Cognizant Technology Solutions, India
    Computer Programmer

  • May 2019 - Jan 2020

    Integrum AB, Sweden
    Graduate Intern

  • Jan 2020 - Present

    San Jose State University
    Graduate Research Assistant

  • NEXT
    BIG
    THING!

Message Me

Feel free to message me about career opportunities in Data Analytics or Data Science. You can also email me at nivedha.balakrishnan@sjsu.edu

Discover the Undiscovered
Identification of Thrombin Inhibitor from diverse habitats using ML

Tools and Techniques used: Python, PyTorch, Huggingface Transformers, Flan T5 transformer model, Prompt Engineering, LLM Fine Tuning.

GitHub Link

Abstract

Thrombin is a key enzyme involved in the development and progression of many cardiovascular diseases. Direct thrombin inhibitors (DTIs), with their minimal off-target effects and immediacy of action, have greatly improved the treatment of these diseases. However, the risk of bleeding, pharmacokinetic issues, and thrombotic complications remain major concerns. To make the DTI discovery pipeline more effective, we developed a two-stage machine learning pipeline that identifies and ranks peptide sequences by their thrombin inhibitory potential. The positive dataset for our model consisted of thrombin inhibitor peptides and their binding affinities (KI) curated from published literature, and the negative dataset consisted of peptides with no known thrombin inhibitory or related activity. The first stage of the model identified thrombin inhibitory sequences with an MCC of 83.6%; the second stage, which covers a range of eight orders of magnitude in KI values, predicted the binding affinity of new sequences with a log RMSE of 1.114. These models also revealed physicochemical and structural characteristics that are hidden but unique to thrombin inhibitor peptides. Using the model, we classified more than 10 million peptides from diverse habitats and identified unique short peptide sequences (fewer than 15 amino acids) of interest based on their predicted KI. Based on the binding energies of the peptides' interactions with thrombin, we identified a promising set of DTI candidates. The prediction pipeline is available at the webserver: Website Link
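
The two metrics quoted in the abstract, and the filter-then-rank structure of the pipeline, can be illustrated with a minimal sketch. It is not the project's code: `is_inhibitor` and `predict_log_ki` are hypothetical stand-ins for the stage 1 classifier and the stage 2 regressor, and "log RMSE" is assumed here to mean RMSE computed on log10-transformed KI values.

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0

def log_rmse(true_ki, pred_ki):
    """RMSE between log10(true KI) and log10(predicted KI) (assumed definition)."""
    errors = [(math.log10(t) - math.log10(p)) ** 2 for t, p in zip(true_ki, pred_ki)]
    return math.sqrt(sum(errors) / len(errors))

def rank_candidates(peptides, is_inhibitor, predict_log_ki):
    """Stage 1 keeps predicted inhibitors; stage 2 scores them.

    Both callables are hypothetical placeholders for trained models.
    A lower KI means tighter binding, so the lowest predictions rank first.
    """
    hits = [p for p in peptides if is_inhibitor(p)]
    return sorted(hits, key=predict_log_ki)
```

Sorting on the stage 2 output only after the stage 1 filter means the regressor is applied only to sequences the classifier accepts, which mirrors the two-stage description above.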

Phase 1

Modeling

Results

Summarization
Insights for Enhanced Service and Satisfaction

Tools and Techniques used: Python, PyTorch, Huggingface Transformers, Flan T5 transformer model, Prompt Engineering, LLM Fine Tuning.

GitHub Link

Project Description

In today's fast-paced business landscape, effective communication between customer service representatives and clients is paramount for building strong relationships and ensuring customer satisfaction. The "Conversational Analysis for Customer Service Enhancement" project aims to harness the power of Natural Language Processing (NLP) to extract valuable insights from customer service interactions. By summarizing conversations and analyzing them, this project seeks to provide actionable intelligence that can elevate customer service strategies, enhance client experiences, and drive overall business success.
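
As a rough illustration of the summarization step, the sketch below formats a conversation into an instruction-style prompt and passes it to a Flan-T5 model through Hugging Face Transformers. The prompt wording, the `google/flan-t5-base` checkpoint, and the function names are assumptions made for this example; the project's actual prompts and fine-tuned weights are not shown here.

```python
def build_summary_prompt(turns):
    """Format (speaker, utterance) pairs into an instruction-style prompt."""
    dialogue = "\n".join(f"{speaker}: {text.strip()}" for speaker, text in turns)
    return f"Summarize the following conversation.\n\n{dialogue}\n\nSummary:"

def summarize(turns, model_name="google/flan-t5-base", max_new_tokens=64):
    """Generate a summary; requires `transformers` and `torch` to be installed."""
    # Imported lazily so the prompt helper works without the heavy dependencies.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    inputs = tokenizer(build_summary_prompt(turns), return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Keeping prompt construction separate from generation makes it easy to try different instructions without reloading the model.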

Importance: Strategic Insights and Customer Retention

  • Enhance Efficiency: The automatic summarization of conversations enables quick identification of crucial information, reducing the time spent manually reviewing interactions. This efficiency boost enables customer service teams to respond promptly to inquiries and issues.
  • Improve Strategy: Patterns and trends uncovered through conversational analysis provide strategic direction. Companies can optimize resources, address recurring concerns, and align their offerings with customer demands.
  • Retain Customers: Addressing customer concerns promptly and accurately cultivates trust and loyalty. By addressing issues proactively, businesses can mitigate potential churn and foster long-lasting relationships.
  • Personalize Experiences: Insights gained from analyzing interactions enable tailored responses and personalized customer experiences. This personal touch boosts engagement and customer satisfaction.

Modeling

A Transformers Insight
Discovering Sentiments in Dating App Reviews

Tools and Techniques used: Python, NLP, Vader Sentiment Analysis, Huggingface Transformers, RoBERTa transformer model, LangChain, OpenAI GPT 3.5 Model.

GitHub Link

Importance of Relationships

  • Reduced Loneliness and Stress: Romantic relationships offer emotional support during challenging times, effectively reducing stress levels and enhancing overall mental well-being.
  • Emotional Connection: Partners provide companionship, emotional understanding, and a sense of belonging, reducing feelings of loneliness and isolation.
  • Personal Growth: Healthy relationships encourage personal development, as partners motivate each other to pursue their aspirations and goals.
  • Enhanced Self-Esteem: Positive affirmations and validation from a partner contribute to increased self-esteem and self-worth.
  • Happiness and Well-being: A strong romantic bond contributes to increased happiness, life satisfaction, and improved overall physical health.

Problem Statement

  • The advent of COVID-19 dramatically altered the dynamics of relationships and social connections, leading to increased reliance on virtual platforms for interaction.
  • Dating apps emerged as essential tools for facilitating connections in a world where physical meetings were limited, offering a unique avenue for people to meet and interact.
  • Enhancing the quality of these digital interactions becomes imperative as individuals seek meaningful connections and genuine experiences through dating platforms.
  • Analyzing Google app reviews not only offers insights into user perceptions and experiences but also aids in identifying areas of improvement crucial for optimizing user satisfaction and app performance.

AIM

The aim of this project is to contribute towards enhancing the potential for individuals to find meaningful connections by improving the overall quality of dating apps.

GOAL

  • Identify the strengths and weaknesses of dating apps through comprehensive analysis.
  • Identify prevalent issues faced by users to facilitate targeted improvements.
  • Uncover trends and shifts in user behavior and preferences, particularly in the context of the COVID-19 pandemic.
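
One way a sentiment model could be checked against review ratings is sketched below. The ±0.05 cut-offs on the VADER compound score follow the convention in the VADER documentation; the star-to-label mapping and the `agreement` helper are assumptions for illustration and may differ from the evaluation actually used in this project.

```python
def vader_label(compound, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to a sentiment label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"

def rating_label(stars):
    """Hypothetical reference label derived from a 1-5 star rating."""
    if stars >= 4:
        return "positive"
    if stars <= 2:
        return "negative"
    return "neutral"

def agreement(predicted, reference):
    """Fraction of reviews where the predicted label matches the reference."""
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)
```

The same `agreement` function can score labels from any model, which allows VADER and a transformer classifier to be compared on equal terms.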

Modeling

Results

1. Model Evaluation

BUMBLE

HINGE

MATCH

TINDER

2. Rating Over Time

3. Extract Insights using Q&A Bot

BUMBLE

HINGE

MATCH

TINDER


Dribbling Data
A Database Management Project Analyzing NBA

Problem Statement

In today's data-driven world, the use of database management systems has become increasingly important for businesses and organizations to store and manage their data effectively. The National Basketball Association (NBA) is no exception, as it generates vast amounts of data each season, including player statistics, team rankings, and game scores. To make informed decisions, it is important to build high-performing database management systems that can handle large amounts of data and support efficient querying for analysis.

Goal

This project aims to build and compare the performance of two popular database management systems, MySQL and MongoDB, using JMeter with NBA data. The project also involves analyzing NBA stats to gain useful insights from the data. The first part involves building the databases and populating them with NBA data. The second part involves testing the performance of the databases using JMeter. The final part involves analyzing the NBA stats to gain insights into player performance, team performance, and trends in the NBA.
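
A comparison of this kind could be summarized from JMeter's CSV results file, which records an `elapsed` time in milliseconds and a sampler `label` for each request. The sketch below averages latency per label; the function name and the idea of using one label per database are assumptions for illustration, not the project's test plan.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

def mean_latency_by_label(jtl_csv_text):
    """Average the `elapsed` column (milliseconds) for each sampler label."""
    samples = defaultdict(list)
    for row in csv.DictReader(io.StringIO(jtl_csv_text)):
        samples[row["label"]].append(int(row["elapsed"]))
    return {label: mean(values) for label, values in samples.items()}
```

If the MySQL and MongoDB samplers are given distinct labels in the test plan, the returned dictionary holds one mean latency per database, which can be compared directly.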

Data Collection

The data for this project was collected from the NBA website through web scraping, resulting in more than 60,000 games, 30 teams, 4.5 players, and their statistics. The collected data was then distributed among 16 CSVs, each containing specific information related to the NBA, including game scores, team rankings, player statistics, and more. The data spans from the year 1946 to 2020, covering over seven decades of NBA history.

Workflow

The figure above illustrates the flow of data in the database systems. The data is processed in the database and imported into a cloud-based database management system, which provides several benefits, such as improved scalability, availability, and security. Once the data is in the cloud, it is connected to the visualization tools for further analysis.

Performance Comparison between MySQL and MongoDB

The graph above shows that MongoDB performs better than MySQL.

Analysis of NBA Stats

Key Inferences

Here are some insights from the above analysis:

  • The chance of winning is higher for the home team than for the away team.
  • There is a moderate-to-high correlation between a team's winning rate and its three-point efficiency, free throw percentage, and rebound percentage.
  • Teams with both good offensive and defensive strategies have a high winning percentage.
  • There is no significant correlation between team salary and team performance, or between player salary and player performance.
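
The kind of calculation behind these inferences can be sketched in a few lines. The functions below compute a Pearson correlation coefficient and a home-team win rate; the names and input shapes are assumptions for illustration, and the project's own analysis may have used different tooling.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def home_win_rate(games):
    """Share of games won by the home team.

    `games` is an iterable of (home_points, away_points) tuples.
    """
    games = list(games)
    return sum(home > away for home, away in games) / len(games)
```

For example, passing each team's winning rate and three-point percentage to `pearson` gives a value between -1 and 1, where values near 0 indicate little linear relationship.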

Future Work

  • Streamlining the data.