My name defines a lot of who I am. I like to behave like a star in the solar system: with intensity, brightness, and authenticity. I learned to be a Scientist when I graduated in Biology, discovering my critical, researching, and questioning side. I learned to be a Data Scientist in my postgraduate studies, where I realized that I could use my scientist side to look at analyses with different eyes and obtain important information through data.
This project implements a sentiment analysis pipeline using Retrieval-Augmented Generation (RAG) powered by Large Language Models (LLMs) and a vector database (ChromaDB).
Analysis of factors influencing customer churn for a telecommunications company and identification of factors leading to customer retention.
This project applies machine learning techniques to predict the likelihood of customer default based on credit data.
A detailed analysis using statistical and visual techniques to uncover patterns and insights in the food delivery market.
Classifying customer sentiment across airlines using NLP and Machine Learning models, with insights into the impact of reviews on each airline's Net Promoter Score (NPS).
A trend analysis of ENEM exam questions, visualizing word proximity to classify questions as physics, chemistry, or biology with a Random Forest model.
In this post, I share practical study tips that helped me balance work, study, and growth. Hope it inspires you to keep going, even when it’s tough!
This analogy compares two students to bias and variance in machine learning. It explains how models, like students, need to balance learning from training data while generalizing to new challenges.
Parameters are learned during model training, while hyperparameters are set beforehand to guide the learning process. Just like an athlete needs both well-developed muscles and a solid training plan, a model needs tuned parameters and well-chosen hyperparameters to perform well.
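As a rough illustration of that distinction (a minimal sketch, not code from the post itself), the snippet below fits a single slope with gradient descent. The function name `fit_slope`, the toy data, and the chosen values are all assumptions made for the example: `w` is the parameter the model learns, while `learning_rate` and `epochs` are hyperparameters fixed before training starts.

```python
def fit_slope(xs, ys, learning_rate, epochs):
    """Fit y = w * x by gradient descent on the mean squared error."""
    w = 0.0  # parameter: starts at an arbitrary value and is learned from data
    n = len(xs)
    for _ in range(epochs):  # epochs: hyperparameter, chosen before training
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad  # learning_rate: hyperparameter
    return w


# Hypothetical toy data that follows y = 2x exactly.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

w = fit_slope(xs, ys, learning_rate=0.05, epochs=200)
print(round(w, 4))
```

Changing `learning_rate` or `epochs` changes how training proceeds, but only `w` is adjusted by the training loop itself.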
Here, I discuss some common approaches to transform categorical variables into numerical ones, such as mapping strings to numbers, Label Encoding, and One Hot Encoding, along with practical examples. To check out more publications like these, visit my LinkedIn.
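The three encoding approaches named in that post can be sketched in plain Python. This is an illustrative example with made-up data, not the code from the post; in practice, libraries such as pandas or scikit-learn are commonly used for the same job.

```python
# Hypothetical example data: a single categorical column.
sizes = ["small", "large", "medium", "small"]

# 1) Manual mapping: the numeric order is chosen by hand,
#    which suits categories with a natural ranking.
size_order = {"small": 0, "medium": 1, "large": 2}
mapped = [size_order[s] for s in sizes]

# 2) Label encoding: each distinct category gets an integer,
#    here assigned in alphabetical order.
categories = sorted(set(sizes))
labels = {category: index for index, category in enumerate(categories)}
label_encoded = [labels[s] for s in sizes]

# 3) One-hot encoding: one binary column per category,
#    so no artificial ordering is implied.
one_hot = [[1 if s == c else 0 for c in categories] for s in sizes]

print(mapped)
print(label_encoded)
print(one_hot)
```

Manual mapping and label encoding both produce a single numeric column, which can suggest an order between categories; one-hot encoding avoids that at the cost of extra columns.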
This was my first scientific paper published in a journal. Through this experience, I enhanced my skills in data visualization using R and creating statistical graphics.