Discover the Undiscovered
Identification of Thrombin Inhibitor from diverse habitats using ML
Tools and Techniques used: Python, PyTorch, Huggingface Transformers, Flan T5 transformer model, Prompt Engineering, LLM Fine Tuning.
GitHub Link
Abstract
Thrombin is a key enzyme involved in the development and progression of many cardiovascular diseases. Direct thrombin inhibitors (DTIs), with their minimal off-target effects and immediacy of action, have greatly improved the treatment of these diseases. However, the risk of bleeding, pharmacokinetic issues, and thrombotic complications remain major concerns. To make the DTI discovery pipeline more effective, we developed a two-stage machine learning pipeline that identifies and ranks peptide sequences by their thrombin inhibitory potential. The positive dataset for our model consisted of thrombin inhibitor peptides and their binding affinities (KI) curated from published literature; the negative dataset consisted of peptides with no known thrombin inhibitory or related activity. The first stage of the model identified thrombin inhibitory sequences with an MCC of 83.6%; the second stage, which covers a range of eight orders of magnitude in KI values, predicted the binding affinity of new sequences with a log RMSE of 1.114. These models also revealed physicochemical and structural characteristics that are hidden but unique to thrombin inhibitor peptides. Using the model, we classified more than 10 million peptides from diverse habitats and identified unique short peptide sequences (fewer than 15 aa) of interest based on their predicted KI. Based on the binding energies of the peptide-thrombin interaction, we identified a promising set of DTI candidates. The prediction pipeline is available at the webserver: Website Link
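The two metrics quoted in the abstract can be illustrated with a minimal Python sketch. It is not the project's evaluation code: the function names are hypothetical, and the use of base-10 logarithms for the "log RMSE" is an assumption, since the abstract does not state the base.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """Matthews correlation coefficient for binary labels.

    Labels are assumed to be 1 (inhibitor) and 0 (non-inhibitor).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def log_rmse(ki_true, ki_pred):
    """RMSE computed on log10-transformed KI values (base 10 is an assumption)."""
    squared = [
        (math.log10(t) - math.log10(p)) ** 2 for t, p in zip(ki_true, ki_pred)
    ]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Made-up values, for illustration only.
    print(matthews_corrcoef([1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 1]))
    print(log_rmse([1e-9, 1e-6], [1e-8, 1e-6]))
```

Working in log space is what lets a single error figure cover KI values that span many orders of magnitude: an error of 1.0 in this sketch means the prediction is off by a factor of ten.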
Phase 1
Modeling
Results
Summarization
Insights for Enhanced Service and Satisfaction
Tools and Techniques used: Python, PyTorch, Huggingface Transformers, Flan T5 transformer model, Prompt Engineering, LLM Fine Tuning.
GitHub Link
Project Description
In today's fast-paced business landscape, effective communication between customer service representatives and clients is paramount for building strong relationships and ensuring customer satisfaction. The "Conversational Analysis for Customer Service Enhancement" project aims to harness the power of Natural Language Processing (NLP) to extract valuable insights from customer service interactions. By summarizing conversations and analyzing them, this project seeks to provide actionable intelligence that can elevate customer service strategies, enhance client experiences, and drive overall business success.
Importance: Strategic Insights and Customer Retention
- Enhance Efficiency: The automatic summarization of conversations enables quick identification of crucial information, reducing the time spent manually reviewing interactions. This efficiency boost enables customer service teams to respond promptly to inquiries and issues.
- Improve Strategy: Patterns and trends uncovered through conversational analysis provide strategic direction. Companies can optimize resources, address recurring concerns, and align their offerings with customer demands.
- Retain Customers: Addressing customer concerns promptly and accurately cultivates trust and loyalty. By addressing issues proactively, businesses can mitigate potential churn and foster long-lasting relationships.
- Personalize Experiences: Insights gained from analyzing interactions enable tailored responses and personalized customer experiences. This personal touch boosts engagement and customer satisfaction.
Modeling
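As a rough illustration of the prompt engineering step listed among the techniques, the sketch below builds a zero-shot or few-shot summarization prompt in plain Python. The function name, the instruction wording, and the sample dialogue are all hypothetical; the project's actual prompts are not shown on this page. Passing the resulting string to a Flan-T5 model is deliberately left out of the sketch.

```python
def build_summary_prompt(dialogue, examples=()):
    """Build a summarization prompt for an instruction-tuned model.

    `examples` is an optional sequence of (dialogue, summary) pairs that
    are prepended as worked examples (few-shot prompting). With no
    examples the result is a zero-shot prompt.
    """
    instruction = "Summarize the following conversation."
    blocks = []
    for example_dialogue, example_summary in examples:
        blocks.append(
            f"{instruction}\n\n{example_dialogue}\n\nSummary: {example_summary}"
        )
    # The final block ends with an open "Summary:" for the model to complete.
    blocks.append(f"{instruction}\n\n{dialogue}\n\nSummary:")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    # Made-up conversation, for illustration only.
    conversation = (
        "Customer: My order arrived damaged.\n"
        "Agent: I am sorry to hear that. I will send a replacement today."
    )
    print(build_summary_prompt(conversation))
```

Keeping prompt construction in one small function makes it easy to compare zero-shot and few-shot variants before deciding whether fine-tuning is needed.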
A Transformers Insight
Discovering Sentiments in Dating App Reviews
Tools and Techniques used: Python, NLP, Vader Sentiment Analysis, Huggingface Transformers, RoBERTa transformer model, LangChain, OpenAI GPT 3.5 Model.
GitHub Link
Importance of Relationships
- Reduced Loneliness and Stress: Romantic relationships offer emotional support during challenging times, effectively reducing stress levels and enhancing overall mental well-being.
- Emotional Connection: Partners provide companionship, emotional understanding, and a sense of belonging, reducing feelings of loneliness and isolation.
- Personal Growth: Healthy relationships encourage personal development and growth by motivating each other's aspirations and goals.
- Enhanced Self-Esteem: Positive affirmations and validation from a partner contribute to increased self-esteem and self-worth.
- Happiness and Well-being: A strong romantic bond contributes to increased happiness, life satisfaction, and improved overall physical health.
Problem Statement
- The advent of COVID-19 dramatically altered the dynamics of relationships and social connections, leading to increased reliance on virtual platforms for interaction.
- Dating apps emerged as essential tools for facilitating connections in a world where physical meetings were limited, offering a unique avenue for people to meet and interact.
- Enhancing the quality of these digital interactions becomes imperative as individuals seek meaningful connections and genuine experiences through dating platforms.
- Analyzing Google app reviews not only offers insights into user perceptions and experiences but also aids in identifying areas of improvement crucial for optimizing user satisfaction and app performance.
AIM
The aim of this project is to contribute towards enhancing the potential for individuals to find meaningful connections by improving the overall quality of dating apps.
GOAL
- Identify the strengths and weaknesses of dating apps through comprehensive analysis.
- Identify prevalent issues faced by users to facilitate targeted improvements.
- Uncover trends and shifts in user behavior and preferences, particularly in the context of the COVID-19 pandemic.
Modelling
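One way to evaluate a sentiment scorer on app reviews is to compare its labels with the star ratings that accompany each review. The sketch below assumes VADER-style compound scores in the range -1 to 1 (for example, the `compound` value returned by `SentimentIntensityAnalyzer().polarity_scores(text)` in the `vaderSentiment` package) and uses the conventional ±0.05 cut-offs. The mapping from stars to labels and all function names are assumptions for illustration, not the project's actual evaluation.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a VADER-style compound score to a sentiment label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def label_from_stars(stars):
    """Map a 1-5 star rating to a label (this mapping is an assumption)."""
    if stars >= 4:
        return "positive"
    if stars <= 2:
        return "negative"
    return "neutral"


def agreement(compounds, stars):
    """Fraction of reviews where the score-based and star-based labels match."""
    pairs = list(zip(compounds, stars))
    matches = sum(
        1 for c, s in pairs if label_from_compound(c) == label_from_stars(s)
    )
    return matches / len(pairs)


if __name__ == "__main__":
    # Made-up scores and ratings, for illustration only.
    print(agreement([0.8, -0.6, 0.0, 0.4], [5, 1, 3, 2]))
```

The same agreement function can be applied to labels from a transformer classifier, which gives a like-for-like comparison between models on each app's reviews.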
Results
1. Model Evaluation
BUMBLE
HINGE
MATCH
TINDER
2. Rating Over time
3. Extract Insights using Q&A Bot
BUMBLE
HINGE
MATCH
TINDER
Dribbling Data
A Database Management Project Analyzing NBA
Problem Statement
In today's data-driven world, the use of database management systems has become increasingly important for businesses and organizations to store and manage their data effectively. The National Basketball Association (NBA) is no exception, as it generates vast amounts of data each season, including player statistics, team rankings, and game scores. To make informed decisions, it is important to build high-performing database management systems that can handle large amounts of data and support efficient querying for analysis.
Goal
This project aims to build and compare the performance of two popular database management systems, MySQL and MongoDB, using JMeter with NBA data. The project also involves analyzing NBA stats to gain useful insights from the data. The first part involves building the databases and populating them with NBA data. The second part involves testing the performance of the databases using JMeter. The final part involves analyzing the NBA stats to gain insights into player performance, team performance, and trends in the NBA.
Data Collection
The data for this project was collected from the NBA website through web scraping, resulting in more than 60,000 games, 30 teams, 4.5 players, and their statistics. The collected data was then distributed among 16 CSVs, each containing specific information related to the NBA, including game scores, team rankings, player statistics, and more. The data spans from the year 1946 to 2020, covering over seven decades of NBA history.
Workflow
The figure above illustrates the flow of data in the database systems. The data is processed in the database and then imported into a cloud-based database management system, which improves scalability, availability, and security. Once the data is in the cloud, it is connected to visualization tools for further analysis.
Performance Comparison between MySQL and MongoDB
The graph above shows that MongoDB outperforms MySQL.
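The project measured performance with JMeter. As a plain-Python stand-in (not JMeter, and not the project's test plan), the sketch below shows the kind of summary a load test reports: mean, median, and 95th-percentile latency plus throughput. The function names are hypothetical, and `run_query` is a placeholder for any callable that executes one query against MySQL or MongoDB.

```python
import statistics
import time


def time_query(run_query, repeats=50):
    """Call `run_query` repeatedly; return per-call latencies in milliseconds."""
    latencies = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def summarize(latencies_ms):
    """Summarize latencies: mean, median, nearest-rank p95, and throughput."""
    ordered = sorted(latencies_ms)
    n = len(ordered)
    # Nearest-rank 95th percentile, computed with integer arithmetic.
    p95_index = max(0, (95 * n + 99) // 100 - 1)
    total_ms = sum(ordered)
    return {
        "mean_ms": statistics.mean(ordered),
        "median_ms": statistics.median(ordered),
        "p95_ms": ordered[p95_index],
        # Sequential throughput; undefined when the total time is zero.
        "throughput_per_s": 1000.0 * n / total_ms if total_ms > 0 else float("inf"),
    }


if __name__ == "__main__":
    # A trivial placeholder workload instead of a real database query.
    print(summarize(time_query(lambda: sum(range(1000)), repeats=20)))
```

Running the same workload through both databases and comparing the two summaries gives a first-order view of which system responds faster; a real load test would also vary concurrency, which this sequential sketch does not.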
Analysis of NBA Stats
Key Inferences
The analysis above suggests the following:
- Teams are more likely to win at home than away.
- A team's win rate shows a moderate to high correlation with its three-point efficiency, free throw percentage, and rebound percentage.
- Teams with both good offensive and good defensive strategies have a high winning percentage.
- Neither team salary nor player salary correlates significantly with performance.
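The inferences above rest on two simple computations: a home win rate and a correlation coefficient between win rate and a team statistic. A minimal sketch, assuming Pearson correlation (the write-up does not name the measure) and using hypothetical function names and made-up numbers:

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def home_win_rate(games):
    """Fraction of games won by the home team.

    `games` is an iterable of (home_points, away_points) pairs.
    """
    games = list(games)
    wins = sum(1 for home, away in games if home > away)
    return wins / len(games)


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    three_point_pct = [0.33, 0.35, 0.36, 0.38, 0.40]
    win_rate = [0.40, 0.48, 0.50, 0.58, 0.65]
    print(pearson_r(three_point_pct, win_rate))
    print(home_win_rate([(100, 90), (95, 99), (110, 108), (101, 100)]))
```

A coefficient near 0 for salary versus performance and a clearly positive one for shooting efficiency versus win rate would correspond to the pattern described in the list.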
Future Work