Programme outline
Learning objectives
- Prepare data by cleaning, enriching, and validating it through statistical approaches.
- Analyse data with multiple linear regression and logistic regression, and evaluate model performance.
- Develop visualisations and dashboards to effectively communicate key analysis results and insights.
Day 1
- Data Science Lifecycle (from business problem to data-driven decision)
- Data Preparation – Data cleaning – Extracting text, converting to numeric, splitting text, removing extra spaces, combining cell values, dealing with dates
- Data Preparation – Data enrichment – Performing a VLOOKUP-style lookup using a dictionary
- Data Preparation – Data validation – Sample vs population, measures of central tendency, measures of dispersion, moments of distribution, normal distribution, Quantile-Quantile (Q-Q) plot, correlation analysis, t-test
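As an illustration of the Day 1 data enrichment topic, the sketch below shows one way a dictionary can stand in for a spreadsheet lookup. It assumes Python, which the outline does not name, and every identifier and data value in it (`category_by_code`, `orders`, `enrich`) is hypothetical and chosen only for the example.

```python
# Hypothetical lookup table: product code -> category (illustrative data only).
category_by_code = {"A100": "Stationery", "B200": "Electronics"}

# Hypothetical records to enrich.
orders = [
    {"order_id": 1, "code": "A100"},
    {"order_id": 2, "code": "B200"},
    {"order_id": 3, "code": "C300"},  # no match in the lookup table
]


def enrich(rows, lookup, key, new_field, default="Unknown"):
    """Return copies of rows with new_field filled from lookup[row[key]].

    Rows whose key has no match receive `default`, much like wrapping a
    spreadsheet lookup in an error handler.
    """
    return [{**row, new_field: lookup.get(row[key], default)} for row in rows]


enriched = enrich(orders, category_by_code, "code", "category")
for row in enriched:
    print(row)
```

Using `dict.get` with a default keeps unmatched rows in the result instead of raising an error, so they can be reviewed during data validation.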
Day 2
- Data Science Modelling – Regression – Multiple linear regression, interpretation of regression results.
- Data Science Modelling – Regression – Multicollinearity, correlation matrix, variance inflation factor (VIF), evaluation metrics using MSE, RMSE and MAE
- Data Science Modelling – Classification – Logistic regression, evaluation metrics using confusion matrix, accuracy, precision, recall, F1 score, true negative rate, false positive rate, ROC curve and AUC value.
- Data Visualisation, Business Decision – Creating visuals and dashboards from an Excel table, steps to data storytelling with dashboards.
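The evaluation metrics named for Day 2 can be written out from their standard definitions. The following is a minimal sketch, assuming Python and binary labels coded as 0 and 1. The function names `regression_errors` and `classification_metrics` are hypothetical, and the sketch is not taken from the course material.

```python
import math


def regression_errors(actual, predicted):
    """Compute MSE, RMSE and MAE for paired actual and predicted values."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    n = len(residuals)
    mse = sum(r * r for r in residuals) / n
    mae = sum(abs(r) for r in residuals) / n
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae}


def classification_metrics(actual, predicted):
    """Compute confusion-matrix counts and derived rates for 0/1 labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)

    def ratio(num, den):
        # Return 0.0 for an empty denominator (a convention assumed here).
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "tp": tp, "tn": tn, "fp": fp, "fn": fn,
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "true_negative_rate": ratio(tn, tn + fp),
        "false_positive_rate": ratio(fp, fp + tn),
    }


# Illustrative values only, not course data.
reg = regression_errors([3, 5, 2], [2, 5, 4])
clf = classification_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
print(reg)
print(clf)
```

The ROC curve and AUC value are omitted because they require predicted probabilities evaluated across many thresholds rather than a single set of predicted labels.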
Day 3
- Project consultation
- Project presentation
Mode of assessment
- Assignment
- Project