This page stores work samples that serve as cheat sheets. The notebooks collected here are therefore kept simple and can be adapted to the use case at hand.
Overall, the work samples cover my skill sets, including:
- delivering meaningful data-driven insights to support business goals,
- automating data processing (with Python),
- data analysis (tabular, time series, text/NLP, and image),
- descriptive and inferential statistical analysis,
- GIS or spatial data analysis,
- data visualization and dashboard development,
- Machine Learning modeling (regression, classification, clustering, dimensionality reduction, time series forecasting, recommender engine),
- Deep Learning or Artificial Intelligence (regression and classification with MLP, image classification with CNN, time series forecasting with LSTM, text classification with LSTM),
- web application development,
- developing APIs,
- Large Language Models (LLMs),
- diffusion models (image generation), etc.

Highlights:

| Task Group | Tasks | Description | Notebook/Repo |
| --- | --- | --- | --- |
| Large Language Model (LLM) | Retrieval-Augmented Generation (RAG) | Developed a RAG pipeline to enhance LLMs with custom documents, with a Streamlit chatbot as the UI. | article, repository |
| Deep Learning | Image classification with CNN, Multi-label classification | Developed a CNN image classification model to recognize images of buildings, forests, glaciers, mountains, seas, and streets. | CNN |
| Deep Learning | Time series forecasting with vanilla LSTM, stacked LSTM, bidirectional LSTM, CNN LSTM, and Conv LSTM | Forecast carbon monoxide emissions using LSTM and performed time series analysis. | Time series |
| Deep Learning | Text Classification with Dense, LSTM, Bi-LSTM, GRU, CNN, CNN + GRU | Developed a text classification model to classify tweets into four emotions: joy, sadness, anger, and fear. | Text classification |
| Supervised Learning | Supervised Learning for Remote Sensing | Predicted the spatial distribution of land cover using Remote Sensing/satellite data. Published the result on a web app. | ML + Remote sensing, web app |
| NLP | NLP and Sentiment Analysis | Performed NLP analysis and text regression for sentiment analysis. | article, part 1, part 2 |

Others:

| Task Group | Tasks | Description | Notebook/Repo |
| --- | --- | --- | --- |
| Supervised Learning | Regression | Predicted house prices with various regression algorithms. | Regression |
| Supervised Learning | Binary classification | Predicted passenger survival on the Titanic using various classification algorithms. | Binary Classification |
| Supervised Learning | Binary classification (with probability) | Predicted the probability of high traffic, evaluated with AUC, accuracy, and F1-score. | Binary Classification |
| Supervised Learning | Multi-class classification | Predicted household poverty as a multi-class classification problem. | Multi-class Classification |
| Supervised Learning | Imbalanced classification | Predicted whether an employee was a best performer as an imbalanced classification task. | Imbalanced |
| Supervised Learning | Bayesian Optimization: bayes_opt or fmin | Compared bayes_opt and fmin for Bayesian optimization in hyperparameter tuning. | Bayesian Optimization |
| Supervised Learning | Supervised Learning for Remote Sensing | Predicted the spatial distribution of land cover using Remote Sensing/satellite data. Published the result on a web app. | ML + Remote sensing, web app |
| AutoML | AutoML for Regression | Predicted house prices with various AutoML regression algorithms. | Part 1, Part 2 |
| AutoML | AutoML for Classification | Predicted household poverty classes using AutoML classification algorithms. | Part 1, Part 2 |
| Unsupervised Learning | Clustering | Segmented customers using k-means and hierarchical clustering. | k-means, hierarchical clustering |
| Unsupervised Learning | Geo-spatial clustering and point pattern analysis | Analyzed spatial patterns (point/polygon pattern analysis, Spatially Constrained Hierarchical Clustering, etc.) of e-commerce customers in Brazil. | Geo-spatial clustering |
| Unsupervised Learning | Dimensionality reduction: PCA with SageMaker (upcoming) | Performed PCA on an environmental variables dataset. | PCA |
| Unsupervised Learning | Anomaly detection: Random Cut Forest with SageMaker | Performed anomaly detection on a daily climate dataset and deployed the model using SageMaker. | Random Cut Forest |
| Time series forecasting | Time series forecasting with SARIMAX | Forecast ATM cash over time. | “not yet published” |
| Deep Learning | Image classification with CNN, Multi-label classification | Developed a CNN image classification model to recognize images of buildings, forests, glaciers, mountains, seas, and streets. | CNN |
| Deep Learning | Time series forecasting with LSTM | Forecast carbon monoxide emissions using LSTM and performed time series analysis. | Time series |
| Deep Learning | Text Classification with Dense, LSTM, Bi-LSTM, GRU, CNN, CNN + GRU | Developed a text classification model to classify tweets into four emotions: joy, sadness, anger, and fear. | Text classification |
| NLP | NLP and Sentiment Analysis | Performed NLP analysis and text regression for sentiment analysis. | article, part 1, part 2 |
| Inferential Statistics | Inferential Statistics, hypothesis testing, etc. | | “not yet published” |
| Dashboard | Shiny Dashboard | Visualized daily COVID-19 cases in a dashboard. | Shiny Dashboard |
| Dashboard | Tableau Dashboard | Visualized a spatiotemporal analysis of house prices. | “not yet published” (upcoming) |
| Web application | Streamlit | Streamlit as the chatbot interface for a RAG or LLM application. | https://github.com/rendy-k/LLM-RAG |
| API | FastAPI | | “not yet published” |
| SageMaker | SageMaker: classification | Developed and deployed a loan default probability classifier using AWS SageMaker. | classification |
| SageMaker | SageMaker: invoke model | Developed an API to invoke a deployed Machine Learning model. | invoke model |
| SageMaker | Multi-model deployment with SageMaker | Deployed multiple models on an AWS instance. | multi-model deployment |
| SageMaker | Recommender system | Built and deployed a recommender system that recommends anime titles using the AWS Factorization Machines algorithm. | recommender system |
| SageMaker | Time series forecasting | Built and deployed DeepAR to forecast the time series of New Delhi daily weather. | DeepAR |
| Large Language Model (LLM) | Retrieval-Augmented Generation (RAG) | Developed a RAG pipeline to enhance LLMs with custom documents, with a Streamlit chatbot as the UI. | article, repository |

In the “Notebook/Repo” column, the links point to where the notebooks or repositories are stored. Entries marked “not yet published” have no link: those notebooks come from professional work and exist only on a local computer. They have not yet been adapted and published.

