Data Science, Geographic Information System (GIS), Machine Learning, Remote Sensing

My Work Samples Portfolio

This page stores work samples that double as cheat sheets. The notebooks collected here are therefore kept simple and can be adapted to the use case at hand.

Overall, the work samples cover my skill sets, including:

  • delivering meaningful data-driven insights to support business goals,
  • automating data processing (with Python),
  • data analysis (tabular, time series, text/NLP, and image),
  • descriptive and inferential statistical analysis,
  • GIS or spatial data analysis,
  • data visualization and dashboard development,
  • Machine Learning modeling (regression, classification, clustering, dimensionality reduction, time series forecasting, recommender engine),
  • Deep Learning or Artificial Intelligence (regression and classification with MLP, image classification with CNN, time series forecasting with LSTM, text classification with LSTM),
  • web application development,
  • developing APIs,
  • Large Language Models (LLM),
  • Diffusion (image generation), etc.

Highlights:

Task Group | Tasks | Description | Notebook/Repo
Large Language Model (LLM) | Retrieval-Augmented Generation (RAG) | Developed RAG to enhance LLMs with custom documents, with a Streamlit chatbot as the UI. | article, repository
Deep Learning | Image classification with CNN, Multi-label classification | Developed an image classification model using CNN to recognize buildings, forest, glacier, mountain, sea, and street images. | CNN
Deep Learning | Time series forecasting with vanilla LSTM, stacked LSTM, bidirectional LSTM, CNN LSTM, and Conv LSTM | Forecast carbon monoxide emissions using LSTM and performed time series analysis. | Time series
Deep Learning | Text Classification with Dense, LSTM, Bi-LSTM, GRU, CNN, CNN + GRU | Developed a text classification model to classify tweets into 4 emotions: joy, sadness, anger, and fear. | Text classification
Supervised Learning | Supervised Learning for Remote Sensing | Predicted the spatial distribution of land cover using Remote Sensing/satellite data. Published the result on a web app. | ML + Remote sensing, web app
NLP | NLP and Sentiment Analysis | Performed NLP analysis and text regression for sentiment analysis. | article, part 1, part 2
Table 1 Favorite notebooks

Others:

Task Group | Tasks | Description | Notebook/Repo
Supervised Learning | Regression | Predicted house prices with various regression algorithms. | Regression
Supervised Learning | Binary classification | Predicted survival on the Titanic using various classification algorithms. | Binary Classification
Supervised Learning | Binary classification (with probability) | Predicted the probability of high traffic, evaluated with AUC, accuracy, and F1-score. | Binary Classification
Supervised Learning | Multi-class classification | Predicted household poverty as a multi-class classification problem. | Multi-class Classification
Supervised Learning | Imbalanced classification | Predicted whether an employee was a best performer as an imbalanced classification task. | Imbalanced
Supervised Learning | Bayesian Optimization: bayes_opt or fmin | Compared bayes_opt and fmin for Bayesian optimization in hyperparameter tuning. | Bayesian Optimization
Supervised Learning | Supervised Learning for Remote Sensing | Predicted the spatial distribution of land cover using Remote Sensing/satellite data. Published the result on a web app. | ML + Remote sensing, web app
AutoML | AutoML for Regression | Predicted house prices with various AutoML regression algorithms. | Part 1, Part 2
AutoML | AutoML for Classification | Predicted household poverty classes using AutoML classification algorithms. | Part 1, Part 2
Unsupervised Learning | Clustering | Segmented customers using k-means and hierarchical clustering. | k-means, hierarchical clustering
Unsupervised Learning | Geo-spatial clustering and point pattern analysis | Spatial pattern analysis (point/polygon pattern analysis, Spatially Constrained Hierarchical Clustering, etc.) of e-commerce customers in Brazil. | Geo-spatial clustering
Unsupervised Learning | Dimensionality reduction: PCA with SageMaker (upcoming) | Performed PCA on an environmental variables dataset. | PCA
Unsupervised Learning | Anomaly detection: Random Cut Forest with SageMaker | Performed anomaly detection on a daily climate dataset and deployed the model using SageMaker. | Random Cut Forest
Time series forecasting | Time series forecasting with SARIMAX | Forecast ATM cash over time. | “not yet published”
Deep Learning | Image classification with CNN, Multi-label classification | Developed an image classification model using CNN to recognize buildings, forest, glacier, mountain, sea, and street images. | CNN
Deep Learning | Time series forecasting with LSTM | Forecast carbon monoxide emissions using LSTM and performed time series analysis. | Time series
Deep Learning | Text Classification with Dense, LSTM, Bi-LSTM, GRU, CNN, CNN + GRU | Developed a text classification model to classify tweets into 4 emotions: joy, sadness, anger, and fear. | Text classification
NLP | NLP and Sentiment Analysis | Performed NLP analysis and text regression for sentiment analysis. | article, part 1, part 2
Inferential Statistics | Inferential Statistics, hypothesis testing, etc. | | “not yet published”
Dashboard | Shiny Dashboard | Visualized daily COVID cases in a dashboard. | Shiny Dashboard
Dashboard | Tableau Dashboard | Visualized a spatiotemporal analysis of house prices. | “not yet published” (upcoming)
Web application | Streamlit | Streamlit as the chatbot interface for a RAG or LLM application. | https://github.com/rendy-k/LLM-RAG
API | FastAPI | | “not yet published”
SageMaker | SageMaker: classification | Developed and deployed a loan default probability classifier using AWS SageMaker. | classification
SageMaker | SageMaker: invoke model | Developed an API to invoke a deployed Machine Learning model. | invoke model
SageMaker | Multi-model deployment with SageMaker | Deployed multiple models on an AWS instance. | multi-model deployment
SageMaker | Recommender system | Built and deployed a recommender system to recommend anime titles using AWS Factorization Machines. | recommender system
SageMaker | Time series forecasting | Built and deployed DeepAR to forecast the time series of New Delhi daily weather. | Deep AR
Large Language Model (LLM) | Retrieval-Augmented Generation (RAG) | Developed RAG to enhance LLMs with custom documents, with a Streamlit chatbot as the UI. | article, repository
Table 2 Notebook collection

In the “Notebook/Repo” column, the links point to where the notebooks or repositories are stored. Some entries have no link and are marked “not yet published”. These notebooks exist only on a local computer for professional work and have not yet been adapted for publication.

Geographic Information System (GIS)

WebGIS Without Coding

WebGIS is an interactive digital map. Unlike a conventional printed map, a digital map such as Google Maps lets users pan, zoom in, zoom out, and choose which layers to show.

WebGIS can be built with or without coding skills. Building it with code requires several programming languages, whereas building it without code requires only ArcGIS Online or Google Maps. This article shows the no-code approach. The following map was made using Google Maps.

Continue reading “WebGIS Without Coding”
Geographic Information System (GIS)

ArcPy Cursor for GIS

Attribute tables in GIS data, such as those of feature classes, can be analyzed with Python ArcPy using a tool called a cursor. This article is the second part of an earlier article. There are three types of cursor: search, update, and insert. A search cursor is for summarizing the rows in the attribute table by average, sum, and others. Below is an example script that uses a search cursor to print and calculate the total area of polygons whose area is less than or equal to 20 ha.
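
As a minimal sketch of the idea (not the script from the full article), the summing logic can be written against any iterable of rows, with ArcPy supplying the rows through `arcpy.da.SearchCursor`. The function names, the field tokens chosen, and the assumption that the feature class uses a projected coordinate system in meters (so that area in m² divided by 10,000 gives hectares) are all illustrative assumptions.

```python
def sum_small_polygons(rows, max_area_ha=20.0):
    """Print and sum the areas of polygons no larger than max_area_ha.

    rows: any iterable of (object_id, area_ha) pairs.
    """
    total_ha = 0.0
    for object_id, area_ha in rows:
        if area_ha <= max_area_ha:
            print(object_id, area_ha)
            total_ha += area_ha
    return total_ha


def read_areas_ha(feature_class):
    """Yield (object_id, area_ha) pairs from a feature class.

    Assumption: the data is in a projected coordinate system with meter
    units, so SHAPE@AREA is in square meters. Requires an ArcGIS install.
    """
    import arcpy  # imported here so the summing logic runs without ArcGIS

    with arcpy.da.SearchCursor(feature_class, ["OID@", "SHAPE@AREA"]) as cursor:
        for object_id, area_m2 in cursor:
            yield object_id, area_m2 / 10_000.0


# Hypothetical usage with ArcGIS available:
# total = sum_small_polygons(read_areas_ha(r"C:\data\landuse.gdb\parcels"))
```

Keeping the cursor in its own generator means the filtering and summing can be checked with plain tuples before it is pointed at real data.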

Continue reading “ArcPy Cursor for GIS”
Geographic Information System (GIS)

Model Builder, Batch Files, and Python for GIS

Working with GIS is fun. Preparing, analyzing, visualizing, and drawing conclusions from geospatial data is interesting for those who have a passion for it. But repetitive, monotonous technical work can be boring or even frustrating, especially when a heavy workload must be completed in a short time, and human errors can creep in. Repeated work, however, can be automated using Model Builder, batch files, or Python. This article focuses on geospatial data processing.
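
To give a rough feel for the Python route, here is a small sketch that applies one processing step to every matching file in a folder. The helper name `batch_process`, the `*.shp` pattern, and the placeholder processing step are assumptions for illustration; a real workflow would call a geoprocessing tool where the placeholder is.

```python
from pathlib import Path


def batch_process(folder, pattern, process):
    """Apply `process` to every file in `folder` matching `pattern`.

    Returns a dict mapping file name to the result, or to an error
    message so that one bad file does not stop the whole batch.
    """
    results = {}
    for path in sorted(Path(folder).glob(pattern)):
        try:
            results[path.name] = process(path)
        except Exception as exc:
            results[path.name] = f"failed: {exc}"
    return results


def file_size_bytes(path):
    """Placeholder step: a real job would clip, reproject, or export here."""
    return path.stat().st_size


# Hypothetical usage:
# report = batch_process("data/shapefiles", "*.shp", file_size_bytes)
```

Collecting failures in the report, instead of stopping at the first one, suits long unattended runs where the errors are reviewed at the end.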

Continue reading “Model Builder, Batch Files, and Python for GIS”