Applied Data Analysis Curriculum (Graduate Level)

Focus: 20% Theory | 80% Practice (with tools & real datasets)

Module 1: Introduction & Basics

Theory (Short):

  • What is data analysis?

  • Data types: structured, unstructured, categorical, numeric

  • Data analysis vs. data science

  • Data-driven decision-making process

Practice:

  • Importing datasets (CSV, Excel, JSON)

  • Exploring datasets in Excel (sorting, filtering, pivot tables, charts)

  • Hands-on with Python (Pandas) for dataset preview

  • Overview of data analytics lifecycle

Tools: Excel, Python (Pandas)
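
Illustrative sketch for the import-and-preview practice items above. It uses only the Python standard library so it runs without installs; in the course itself the same steps would typically be done with Pandas (pandas.read_csv, pandas.read_json, DataFrame.head). The sample data and the helper names load_csv, load_json, and preview are assumptions made for this example.

```python
import csv
import io
import json

# Hypothetical sample data, inlined so the sketch runs without any files.
CSV_TEXT = """region,units,price
North,12,9.50
South,7,11.00
East,,10.25
"""
JSON_TEXT = '[{"region": "West", "units": 5, "price": 8.75}]'


def load_csv(text):
    """Parse CSV text into a list of row dictionaries (all values are strings)."""
    return list(csv.DictReader(io.StringIO(text)))


def load_json(text):
    """Parse a JSON array of records into a list of dictionaries."""
    return json.loads(text)


def preview(rows, n=2):
    """Return the column names, the row count, and the first n rows."""
    columns = list(rows[0].keys()) if rows else []
    return {"columns": columns, "row_count": len(rows), "head": rows[:n]}


csv_rows = load_csv(CSV_TEXT)
json_rows = load_json(JSON_TEXT)
summary = preview(csv_rows)
print(summary["columns"], summary["row_count"])
for row in summary["head"]:
    print(row)
```

Note that the CSV reader returns every value as a string (including the empty units cell for East), while the JSON loader keeps numeric types. Spotting that kind of difference is one purpose of a first preview.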

Module 2: Data Collection & Management

Practice:

  • Collect survey data with Google Forms and export the responses to Excel

  • Import data from SQL databases

  • Use APIs and web scraping (Python requests/BeautifulSoup)

  • Manage large datasets in Power Query

Tools: MS Excel (Power Query), SQL, Python (requests, BeautifulSoup), Google Forms and Sheets

Outcome: Students can gather and manage structured and unstructured data efficiently.
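
Illustrative sketch for the "Import data from SQL databases" item above. An in-memory SQLite database stands in for whatever SQL server the course uses; the responses table, its columns, and the helper names are hypothetical.

```python
import sqlite3

# In-memory database standing in for a course SQL server (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE responses (id INTEGER PRIMARY KEY, department TEXT, score REAL)"
)
conn.executemany(
    "INSERT INTO responses (department, score) VALUES (?, ?)",
    [("Sales", 4.0), ("Sales", 3.0), ("Support", 5.0)],
)
conn.commit()


def fetch_rows(connection, min_score):
    """Return rows at or above min_score as dictionaries, using a parameterized query."""
    cursor = connection.execute(
        "SELECT department, score FROM responses WHERE score >= ? ORDER BY id",
        (min_score,),
    )
    names = [col[0] for col in cursor.description]
    return [dict(zip(names, row)) for row in cursor.fetchall()]


def average_by_department(connection):
    """Aggregate in SQL so only the summary crosses into Python."""
    cursor = connection.execute(
        "SELECT department, AVG(score) FROM responses "
        "GROUP BY department ORDER BY department"
    )
    return dict(cursor.fetchall())


rows = fetch_rows(conn, 4.0)
averages = average_by_department(conn)
print(rows)
print(averages)
```

The query passes min_score as a bound parameter rather than formatting it into the SQL string. That habit is worth building early because it avoids SQL injection.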

Module 3: Data Cleaning & Preprocessing

Practice:

  • Handle missing values (Excel formulas, Python .fillna())

  • Remove duplicates and handle outliers

  • Standardize (z-score) and normalize (min-max) numeric data

  • Encode categorical variables

  • Real-world dataset cleaning

Tools: Excel, Python (Pandas, NumPy), OpenRefine

Outcome: Students can prepare high-quality, analysis-ready datasets.
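
Illustrative sketch of the cleaning steps listed above: duplicate removal, mean imputation, min-max normalization, and one-hot encoding. It is written in plain Python to show the logic; in the course these steps would normally be done with Pandas (for example DataFrame.drop_duplicates and DataFrame.fillna). The records and function names are assumptions made for this example, and mean imputation is only one of several reasonable strategies.

```python
from statistics import mean

# Hypothetical raw records: one duplicate row and one missing age.
raw = [
    {"id": 1, "age": 30.0, "plan": "basic"},
    {"id": 2, "age": None, "plan": "pro"},
    {"id": 2, "age": None, "plan": "pro"},
    {"id": 3, "age": 50.0, "plan": "basic"},
]


def drop_duplicates(rows):
    """Keep the first occurrence of each fully identical row."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def fill_missing_with_mean(rows, column):
    """Replace None in a numeric column with the mean of the observed values."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = mean(observed)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]


def min_max_scale(rows, column):
    """Rescale a numeric column to the range [0, 1]."""
    values = [r[column] for r in rows]
    low, high = min(values), max(values)
    span = (high - low) or 1.0  # avoid division by zero for constant columns
    return [{**r, column: (r[column] - low) / span} for r in rows]


def one_hot(rows, column):
    """Replace a categorical column with one 0/1 indicator column per category."""
    categories = sorted({r[column] for r in rows})
    encoded = []
    for r in rows:
        new_row = {k: v for k, v in r.items() if k != column}
        for cat in categories:
            new_row[f"{column}_{cat}"] = int(r[column] == cat)
        encoded.append(new_row)
    return encoded


deduped = drop_duplicates(raw)
filled = fill_missing_with_mean(deduped, "age")
scaled = min_max_scale(filled, "age")
clean = one_hot(scaled, "plan")
for row in clean:
    print(row)
```

The order matters in this sketch: duplicates are removed before imputation so that the repeated row does not affect later steps, and scaling happens after imputation so that no missing values reach the arithmetic.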

Module 4: Exploratory Data Analysis (EDA)

Practice:

  • Descriptive statistics (mean, variance, standard deviation)

  • Visualizations: histograms, scatter plots, pivot charts

  • Correlation and heatmaps in Python (Seaborn, Matplotlib)

  • Interactive EDA using Tableau and Power BI

Tools: Excel, Python, Tableau, Power BI

Outcome: Students can explore datasets and uncover insights visually and statistically.
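
Illustrative sketch of the descriptive statistics and correlation items above, using the standard-library statistics module. The Pearson coefficient is computed by hand to show the formula behind a correlation heatmap cell. The spend and units figures are made-up sample data, not course data.

```python
from math import sqrt
from statistics import mean, stdev, variance

# Hypothetical paired observations (advertising spend vs. units sold).
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
units = [2.0, 4.0, 5.0, 4.0, 5.0]


def describe(values):
    """Sample mean, sample variance (n - 1 denominator), and standard deviation."""
    return {
        "mean": mean(values),
        "variance": variance(values),
        "std": stdev(values),
    }


def pearson_r(x, y):
    """Pearson correlation: co-deviation divided by the product of the spreads."""
    mx, my = mean(x), mean(y)
    co_deviation = sum((a - mx) * (b - my) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mx) ** 2 for a in x))
    spread_y = sqrt(sum((b - my) ** 2 for b in y))
    return co_deviation / (spread_x * spread_y)


spend_stats = describe(spend)
units_stats = describe(units)
r = pearson_r(spend, units)
print(spend_stats)
print(units_stats)
print(round(r, 4))
```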

Module 5: Statistical & Predictive Analysis

Theory (Minimal):

  • Sampling, probability, and hypothesis testing

  • Regression fundamentals

Practice:

  • Run t-tests and regressions in Excel (Data Analysis ToolPak)

  • Multiple regression in Python (Statsmodels, scikit-learn)

  • Logistic regression for classification problems

Tools: Excel, R (Basic), Python (Statsmodels, scikit-learn)

Outcome: Students can build simple predictive models and interpret statistical results.
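
Illustrative sketch of the regression fundamentals above: a one-predictor ordinary least squares fit computed from the closed-form formulas, with R-squared as the fit measure. This is a teaching sketch rather than the course's tooling (the module lists Excel, Statsmodels, and scikit-learn for real work), and the data values are invented for the example.

```python
from statistics import mean

# Hypothetical data: x is the predictor, y is the response.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]


def fit_simple_ols(xs, ys):
    """Least-squares slope and intercept for the model y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in y explained by the fitted line."""
    my = mean(ys)
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(xs, ys))
    ss_tot = sum((b - my) ** 2 for b in ys)
    return 1.0 - ss_res / ss_tot


slope, intercept = fit_simple_ols(x, y)
r2 = r_squared(x, y, slope, intercept)
prediction = intercept + slope * 6.0  # predicted y for a new x value of 6
print(round(slope, 3), round(intercept, 3), round(r2, 3), round(prediction, 3))
```

Interpreting the result is the point of the exercise: the slope is the expected change in y per one-unit change in x, and R-squared says how much of the variation the line accounts for.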

Module 6: Advanced Data Analysis

Practice:

  • Clustering (K-Means in Python & Excel add-ins)

  • Time series forecasting in Excel (Forecast Sheet) & Python (ARIMA)

  • Dimensionality reduction (PCA in Python)

Tools: Excel (Forecast Sheet), Python (scikit-learn, Statsmodels)

Outcome: Students can apply advanced modeling techniques for business insights.
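
Illustrative sketch of the clustering item above: a minimal k-means loop (assign each point to its nearest centroid, then move each centroid to the mean of its members) in plain Python. It assumes fixed starting centroids so the result is reproducible. Library implementations such as scikit-learn's KMeans add smarter initialization and other refinements that are not shown here. The points are invented for the example.

```python
from math import dist

# Hypothetical 2-D points forming two visually separate groups.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]


def assign(pts, centroids):
    """Index of the nearest centroid for each point (Euclidean distance)."""
    return [
        min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        for p in pts
    ]


def update(pts, labels, centroids):
    """Move each centroid to the mean of the points assigned to it."""
    moved = []
    for i, old in enumerate(centroids):
        members = [p for p, label in zip(pts, labels) if label == i]
        if not members:
            moved.append(old)  # keep an empty cluster's centroid where it was
            continue
        moved.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
    return moved


def k_means(pts, start_centroids, max_iter=100):
    """Alternate assignment and update steps until the centroids stop moving."""
    centroids = [tuple(c) for c in start_centroids]
    labels = assign(pts, centroids)
    for _ in range(max_iter):
        moved = update(pts, labels, centroids)
        if moved == centroids:
            break
        centroids = moved
        labels = assign(pts, centroids)
    return labels, centroids


# Assumption: start from the first and fourth points instead of a random draw.
labels, centroids = k_means(points, [points[0], points[3]])
print(labels)
print(centroids)
```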

Module 7: Machine Learning for Data Analysis

Practice:

  • Build predictive models (Random Forest, XGBoost)

  • Use Excel add-ins for modeling (XLSTAT for machine learning, Solver for optimization)

  • Evaluate models with confusion matrix and ROC curve

Tools: Python (scikit-learn, XGBoost), Excel (XLSTAT, Solver), Orange

Outcome: Students can build and evaluate machine learning models for data-driven prediction.
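
Illustrative sketch of the model evaluation item above: a confusion matrix at a fixed threshold, then ROC points and the area under the curve computed by hand. In the course these would normally come from scikit-learn (sklearn.metrics.confusion_matrix and sklearn.metrics.roc_curve). The labels, scores, and the 0.5 threshold are made-up values for the example.

```python
# Hypothetical true labels (1 = positive) and model scores for six cases.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]


def confusion_matrix(truth, predicted):
    """Count true/false positives and negatives for binary labels."""
    pairs = list(zip(truth, predicted))
    return {
        "tp": sum(1 for t, p in pairs if t == 1 and p == 1),
        "fp": sum(1 for t, p in pairs if t == 0 and p == 1),
        "fn": sum(1 for t, p in pairs if t == 1 and p == 0),
        "tn": sum(1 for t, p in pairs if t == 0 and p == 0),
    }


def roc_points(truth, model_scores):
    """(FPR, TPR) pairs obtained by thresholding at each distinct score."""
    positives = sum(truth)
    negatives = len(truth) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(model_scores), reverse=True):
        predicted = [1 if s >= threshold else 0 for s in model_scores]
        cm = confusion_matrix(truth, predicted)
        points.append((cm["fp"] / negatives, cm["tp"] / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x2 - x1) * (y1 + y2) / 2.0
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    )


y_pred = [1 if s >= 0.5 else 0 for s in scores]
cm = confusion_matrix(y_true, y_pred)
curve = roc_points(y_true, scores)
area = auc(curve)
print(cm)
print(round(area, 4))
```

The confusion matrix describes one operating point, while the ROC curve summarizes every threshold at once. Comparing the two helps students see why a single accuracy figure can mislead.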

Module 8: Data Visualization & Storytelling

Practice:

  • Build dashboards in Excel (Pivot dashboards)

  • Create Tableau and Power BI dashboards

  • Data storytelling principles: choosing the right chart for each dataset

  • Visual presentation of findings

Tools: Excel, Tableau, Power BI, Python (Plotly, Seaborn)

Outcome: Students can create professional reports and dashboards that communicate insights effectively.

Module 9: Real-World Applications

Practical Domains:

  • Business Analytics: Sales forecasting and KPI reporting

  • Healthcare Analytics: Patient data cleaning and trends

  • Social Media Analytics: Data collection via APIs (X, formerly Twitter; Facebook)

  • Research Analytics: Survey data cleaning and visualization

Outcome: Students apply techniques to real industry datasets.

Module 10: Capstone Project

Practice:

  • Choose a dataset (Business, Finance, Healthcare, or Social Data)

  • Apply full cycle: Collection → Cleaning → Analysis → Visualization → Reporting

  • Present results as:

    • Written Report (Word/LaTeX)

    • Interactive Dashboard (Excel, Tableau, Power BI)

    • Oral Presentation with Slides

Outcome: Graduates produce a complete, end-to-end analytical project, from raw data to presented findings.

Advanced Data Analysis (Power BI, Looker Studio, Tableau, KNIME)

1. Introduction to Business Intelligence (BI)

  • What is Business Intelligence and why it matters

  • Overview of Power BI, Tableau, Looker Studio, and KNIME

  • Understanding ETL (Extract, Transform, Load)

  • Data pipelines and data flow concepts

  • Key metrics and KPIs in business analytics
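
Illustrative sketch of the ETL and KPI concepts in this unit, in plain Python. Raw rows are extracted from a CSV export, transformed (incomplete rows dropped, types cast, revenue derived), loaded into an in-memory SQLite table, and then summarized as a revenue-per-region KPI. The order data, the sales table, and the function names are hypothetical; BI tools such as Power Query or KNIME perform the same stages through their own interfaces.

```python
import csv
import io
import sqlite3

# Hypothetical order export, inlined so the sketch needs no external files.
SOURCE_CSV = """order_id,region,quantity,unit_price
1,North,2,10.00
2,South,1,25.50
3,North,,10.00
4,South,3,25.50
"""


def extract(text):
    """Extract: read raw rows from the source as dictionaries of strings."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with a missing quantity, cast types, derive revenue."""
    cleaned = []
    for row in rows:
        if not row["quantity"]:
            continue
        quantity = int(row["quantity"])
        unit_price = float(row["unit_price"])
        cleaned.append((int(row["order_id"]), row["region"], quantity * unit_price))
    return cleaned


def load(records):
    """Load: write the cleaned records into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (order_id INTEGER, region TEXT, revenue REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", records)
    conn.commit()
    return conn


warehouse = load(transform(extract(SOURCE_CSV)))
# KPI: total revenue per region, computed where the data lives.
revenue_by_region = dict(
    warehouse.execute(
        "SELECT region, SUM(revenue) FROM sales GROUP BY region ORDER BY region"
    ).fetchall()
)
print(revenue_by_region)
```

Keeping the three stages as separate functions mirrors how a pipeline is usually drawn: each stage can be inspected, tested, or replaced without touching the others.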

2. Advanced Power BI

  • Connecting Power BI to multiple data sources

  • Advanced Power Query and M language

  • Data modeling and relationships

  • DAX (Data Analysis Expressions) formulas and calculations

  • Custom visuals and interactive reports

  • Power BI service: publishing, sharing, and scheduling refresh

  • Case Study: Sales & Profit Dashboard

3. Tableau for Data Storytelling

  • Tableau interface and data connection setup

  • Working with dimensions and measures

  • Calculated fields and parameters

  • Building interactive dashboards with filters and actions

  • Advanced charting (maps, boxplots, treemaps, story points)

  • Publishing to Tableau Public and Tableau Cloud (formerly Tableau Online)

  • Case Study: Customer Retention Dashboard

4. Looker Studio (formerly Google Data Studio)

  • Connecting data sources (Google Analytics, Sheets, BigQuery)

  • Creating metrics and blending data

  • Designing custom visualizations

  • Setting filters, controls, and drill-downs

  • Sharing reports via links and scheduled email delivery

  • Case Study: Marketing Performance Dashboard

5. KNIME Analytics Platform

  • Introduction to KNIME and its visual workflow interface

  • Data import, cleaning, and transformation nodes

  • Machine learning workflows (classification, clustering, regression)

  • Text mining and sentiment analysis using KNIME nodes

  • Workflow automation and scheduling

  • Case Study: Predicting Customer Churn using KNIME

6. Advanced Data Integration & Automation

  • Integrating BI tools with databases and APIs

  • Connecting BI tools with Python/R scripts

  • Scheduling data refresh and workflow automation

  • Cloud integration (Google Cloud, Azure, AWS)

7. Dashboard Optimization & Data Governance

  • Optimizing dashboards for speed and usability

  • Applying color theory and design principles

  • Data privacy and governance in analytics

  • Understanding GDPR and compliance basics

8. Final Capstone Project

  • Choose a real dataset from finance, marketing, or operations

  • Develop a complete BI solution using one or more tools

  • Prepare an interactive dashboard

  • Present analytical insights and recommendations