Programme Outline
Learning Objectives and Structure
- Perform the statistical validation tasks expected of a junior data scientist or statistical researcher
- Validate a dataset statistically to support data analysis and modelling
- Explore data distributions and identify statistical anomalies
- Compute statistical indicators for the dataset
- Check the normality assumption for a data series
- Perform statistical tests and validation on a dataset
- Conduct further statistical analysis
- Understand healthcare case studies shared by SingHealth faculty members to gain insights into real-world scenarios.
- Utilise curated public healthcare datasets to perform hands-on activities and assignments, fostering practical experience and understanding of the subject matter.
Day 1
- Overview of Data Science Pipeline
- What is Data Validation and Statistical Analysis?
- Importance of Statistics in Data Validation for Machine Learning
- Requirements prior to the Data Validation phase
- Understand how data scientists leverage Data Validation and Statistics
- Basics of Statistics and Hypothesis Testing
Day 2
- Statistical Tests on Dataset Characteristics
- Probability and Expectations
- Central Tendencies and Dispersion
- Central Limit Theorem
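
As a rough illustration of the Day 2 topics on central tendencies and dispersion, the sketch below computes a few summary indicators with Python's standard `statistics` module and flags anomalies with the common 1.5 × IQR rule. The readings are made-up values, not taken from any course dataset, and the choice of the IQR rule is an assumption for illustration only.

```python
import statistics

# Hypothetical systolic blood pressure readings (mmHg), invented for illustration.
readings = [118, 122, 125, 130, 121, 119, 124, 180]

# Central tendency
mean = statistics.mean(readings)
median = statistics.median(readings)

# Dispersion
stdev = statistics.stdev(readings)  # sample standard deviation (n - 1 denominator)
q1, q2, q3 = statistics.quantiles(readings, n=4)  # quartiles, default "exclusive" method
iqr = q3 - q1

# Flag values outside the 1.5 * IQR fences as potential anomalies.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in readings if x < lower_fence or x > upper_fence]

print(f"mean={mean}, median={median}, stdev={stdev:.2f}, IQR={iqr}")
print("potential anomalies:", outliers)
```

Here the single extreme reading pulls the mean well above the median, which is one reason both indicators are usually reported together.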
Day 3
- Understanding Dataset characteristics or differences
- Parametric Tests
- Introduction to Interval and Ratio Data
- Central Limit Theorem
- z-test
- t-test
- Parametric ANOVA (Interval Data or F-Test)
- Understanding relationships between dataset variables
- Spearman's rank correlation coefficient
- Pearson's correlation coefficient (r)
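
To make the difference between the two correlation measures listed for Day 3 concrete, here is a minimal pure-Python sketch. The helper names (`pearson_r`, `ranks`, `spearman_r`) and the sample data are hypothetical. In practice a statistics library would normally be used. Spearman's coefficient is computed here as Pearson's coefficient applied to ranks, with tied values given their average rank.

```python
import math

def pearson_r(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def spearman_r(x, y):
    """Spearman correlation: Pearson correlation of the ranked data."""
    return pearson_r(ranks(x), ranks(y))

# Hypothetical monotonic but non-linear relationship (y = x ** 3).
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]

print("Pearson :", round(pearson_r(x, y), 4))
print("Spearman:", round(spearman_r(x, y), 4))
```

Because the relationship is perfectly monotonic but not linear, the rank-based coefficient reaches 1 while Pearson's coefficient stays below 1.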
Day 4
- Understanding Dataset characteristics or differences
- Non-Parametric Tests
- Introduction to Ordinal Data
- Non-Parametric ANOVA (Ranked Data – Friedman Test)
- Introduction to Categorical Data Analysis
- Goodness of Fit – Chi-Square (Categorical Data)
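
The chi-square goodness-of-fit test named above can be sketched in a few lines of plain Python. The observed counts are invented for illustration, and the null hypothesis of equal proportions across categories is an assumption of this example. The statistic is compared against the standard tabulated critical value for 3 degrees of freedom at the 5% significance level (7.815) rather than computing a p-value.

```python
# Hypothetical observed counts for four categories (e.g. visits per clinic), invented data.
observed = [18, 22, 30, 30]

# Null hypothesis (assumed for this example): all categories are equally likely.
total = sum(observed)
expected = [total / len(observed)] * len(observed)

# Chi-square statistic: sum of (observed - expected)^2 / expected over categories.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

degrees_of_freedom = len(observed) - 1
critical_value = 7.815  # tabulated chi-square critical value, df = 3, alpha = 0.05

reject = chi2 > critical_value
print(f"chi2={chi2:.2f}, df={degrees_of_freedom}, reject H0: {reject}")
```

With these counts the statistic falls below the critical value, so the data give no evidence at the 5% level against equal proportions.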
Day 5 – Project Consultation / Project Presentation
Project Consultation
Each group of participants will present the progress of their projects and have the opportunity to ask questions and clarify any doubts pertaining to their projects.
Project Presentation
Each group of participants will showcase their work and respond to questions during a Q&A session.