This project walks through every stage of the lifecycle of a Machine Learning project. The result is an API, deployed on Render, for querying the records of a Steam (video games) platform database. The API also exposes a video game recommendation model based on cosine similarity.
The project is divided into two parts:
Part I: Data Engineering. Starting from scratch, this part moves quickly through the work of a Data Engineer: collecting and extracting data from files, then processing, transforming, and modeling it.
Part II: Machine Learning. A model is built and trained on the cleaned data under certain conditions. The result is a video game recommendation system for Steam users, built with MLOps techniques so that the model and the API are scalable, reproducible, and maintainable.
↓ Tech stack
[Image: tech stack]
OBJECTIVES
1- Data Transformations: read and clean the dataset, removing unnecessary columns to optimize performance. Data maturity is low: the data is nested, types are raw, and there are no automated processes for adding new products, among other issues.
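As an illustration of the nested-data problem, here is a minimal sketch that flattens a list nested inside each record and casts a raw string field to an integer. It uses only the standard library; the field names (`user_id`, `items`, `item_id`, `playtime_forever`) and the helper `flatten_records` are assumptions made for this example, not the project's actual schema or code.

```python
def flatten_records(rows, nested_key, keep):
    """Return one flat dict per nested item, copying the `keep` fields from the parent row."""
    flat = []
    for row in rows:
        parent = {key: row.get(key) for key in keep}
        # A missing or empty nested list simply contributes no rows.
        for item in row.get(nested_key) or []:
            flat.append({**parent, **item})
    return flat


# Hypothetical raw records: one row per user, with a nested list of owned items.
raw = [
    {
        "user_id": "u1",
        "profile_url": "unused-column",
        "items": [
            {"item_id": "10", "playtime_forever": "120"},
            {"item_id": "20", "playtime_forever": "0"},
        ],
    },
    {"user_id": "u2", "profile_url": "unused-column", "items": None},
]

# Keep only the columns we need; `profile_url` is dropped on purpose.
flat = flatten_records(raw, nested_key="items", keep=["user_id"])

# Cast the raw string type to a proper integer.
for record in flat:
    record["playtime_forever"] = int(record["playtime_forever"])
```

With pandas, `DataFrame.explode` and `pandas.json_normalize` can play a similar role on larger files.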
2- Feature Engineering: perform sentiment analysis on user reviews and store the result in a new column, 'sentiment_analysis'.
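To make the idea concrete, here is a toy sketch of how a 'sentiment_analysis' column could be derived from review text. It is a deliberately simple word-list scorer, not the sentiment method used in the project. The word lists, the 0/1/2 encoding (negative/neutral/positive), and the sample reviews are all assumptions for illustration.

```python
import re

# Hypothetical word lists; a real project would rely on a trained model or an NLP library.
POSITIVE = {"good", "great", "fun", "love", "amazing", "excellent"}
NEGATIVE = {"bad", "boring", "hate", "awful", "broken", "terrible"}


def sentiment_label(review):
    """Return 0 (negative), 1 (neutral) or 2 (positive); this encoding is an assumption."""
    if not review:
        return 1  # a missing or empty review is treated as neutral
    words = re.findall(r"[a-z']+", review.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return 2
    if score < 0:
        return 0
    return 1


# Hypothetical review rows.
reviews = [
    {"item_id": "10", "review": "Great game, I love it"},
    {"item_id": "20", "review": "Boring and broken"},
    {"item_id": "30", "review": ""},
]

# Add the new column to every row.
for row in reviews:
    row["sentiment_analysis"] = sentiment_label(row["review"])
```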
3- Exploratory Data Analysis (EDA): explore and visualize the data to gain valuable insights.
4- Machine Learning Model: develop a recommendation system based on cosine similarity. Input: a product ID. Output: a list of 5 recommended games similar to the one entered.
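Cosine similarity scores two feature vectors a and b as (a · b) / (|a| |b|), so games with similar feature profiles score close to 1. The sketch below shows one way the product-ID-in, five-games-out behavior could work, using only the standard library. The feature columns (one-hot genres), the toy catalogue, and the function names are assumptions for illustration; the project's real features are not specified here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def recommend(product_id, features, top_n=5):
    """Return the IDs of the `top_n` games most similar to `product_id`."""
    if product_id not in features:
        raise KeyError(f"Unknown product ID: {product_id}")
    target = features[product_id]
    scored = [
        (other_id, cosine_similarity(target, vector))
        for other_id, vector in features.items()
        if other_id != product_id  # never recommend the game itself
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [other_id for other_id, _ in scored[:top_n]]


# Hypothetical catalogue: columns are [action, rpg, strategy, indie].
features = {
    "10": [1, 0, 0, 0],
    "20": [1, 0, 0, 1],
    "30": [1, 1, 0, 0],
    "40": [0, 0, 1, 0],
    "50": [0, 1, 0, 0],
    "60": [1, 1, 0, 1],
    "70": [0, 0, 1, 1],
}

recommended = recommend("10", features)
```

On a full catalogue, computing the similarities once and storing the top 5 per game keeps each lookup cheap.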
5- API Development: implement an API with FastAPI that allows querying data and recommendations.
6- Deployment: deploy the API on a web service to make it publicly accessible.