Welcome to Introduction to Data Analytics! This course will guide you through the essential techniques for working with data, equipping you with skills used by data experts across industries. You’ll explore how to clean and preprocess data using Python libraries like Pandas and NumPy, laying the groundwork for effective data analysis.

Introduction to Data Analytics

Recommended experience
Intermediate level
Learners should be familiar with Linear Algebra and Optimisation, Probability and Statistics, and Introduction to Programming.
What you'll learn
Apply data preprocessing techniques using Python libraries like Pandas and NumPy to clean, transform, and prepare datasets for analysis.
Use exploratory data analysis (EDA) and machine learning (ML) algorithms to identify patterns and trends and to solve real-world data problems through regression, classification, and clustering techniques.
Evaluate model performance using appropriate metrics and visualise insights through data visualisation tools to effectively communicate findings.
Details to know
153 assignments
There are 12 modules in this course
In this module, the learners will be introduced to the course and its syllabus, setting the foundation for their learning journey. The course's introductory video will provide them with insights into the valuable skills and knowledge they can expect to gain throughout the duration of this course. Additionally, the syllabus reading will comprehensively outline essential course components, including course values, assessment criteria, grading system, schedule, details of live sessions, and a recommended reading list that will enhance the learner’s understanding of the course concepts. Moreover, this module offers the learners the opportunity to connect with fellow learners as they participate in a discussion prompt designed to facilitate introductions and exchanges within the course community.
What's included
3 videos, 1 reading, 1 discussion prompt
3 videos• Total 6 minutes
- Meet Your Instructor - Prof. Seetha Parameswaran• 2 minutes
- Meet Your Instructor - Prof. Aneesh Chivukula• 1 minute
- Course Introductory Video• 3 minutes
1 reading• Total 10 minutes
- Course Overview• 10 minutes
1 discussion prompt• Total 10 minutes
- Know Your Peers• 10 minutes
This module provides a comprehensive introduction to data analytics, covering its definition, importance, key components, and industry applications. Students will learn to apply the four types of data analytics (descriptive, diagnostic, predictive, and prescriptive) to solve business problems and make data-driven decisions. They will also analyse real-world use cases, challenges, and future trends in data analytics across various domains. Additionally, the students will gain an understanding of structured, unstructured, semi-structured, quantitative, and qualitative data from primary, secondary, internal, and external sources, and learn how to apply this knowledge to data analytics projects.
What's included
17 videos, 4 readings, 16 assignments, 1 discussion prompt
17 videos• Total 155 minutes
- Definition of Data Analytics• 7 minutes
- The Importance of Data Analytics• 6 minutes
- Key Components • 6 minutes
- Descriptive Analytics• 6 minutes
- Diagnostic Analytics• 6 minutes
- Predictive Analytics• 6 minutes
- Prescriptive Analytics• 6 minutes
- Industry Applications • 7 minutes
- Challenges in Data Analytics• 6 minutes
- Structured Data• 6 minutes
- Unstructured Data• 6 minutes
- Semi-Structured Data• 7 minutes
- Quantitative Data• 7 minutes
- Qualitative Data• 7 minutes
- Primary and Secondary Data Sources• 6 minutes
- Internal and External Data Sources• 6 minutes
- Recording of Introduction to Data Analytics: Week 1 - Live Session on 24-11-08 18:34:57 [53:02]• 53 minutes
4 readings• Total 60 minutes
- Essential Reading: Data Analytics Process• 15 minutes
- Essential Reading: Skills Required and Tools and Technologies Used in Data Analytics• 15 minutes
- Essential Reading: Use Cases and Applications of Data Analytics• 15 minutes
- Essential Reading: Examples of Data and Data Sources• 15 minutes
16 assignments• Total 51 minutes
- Definition of Data Analytics• 6 minutes
- The Importance of Data Analytics• 3 minutes
- Key Components • 3 minutes
- Descriptive Analytics• 3 minutes
- Diagnostic Analytics• 3 minutes
- Predictive Analytics• 3 minutes
- Prescriptive Analytics• 3 minutes
- Industry Applications • 3 minutes
- Challenges in Data Analytics• 3 minutes
- Structured Data• 3 minutes
- Unstructured Data• 3 minutes
- Semi-Structured Data• 3 minutes
- Quantitative Data• 3 minutes
- Qualitative Data• 3 minutes
- Primary and Secondary Data Sources• 3 minutes
- Internal and External Data Sources• 3 minutes
1 discussion prompt• Total 20 minutes
- Industry Applications of Data Analytics• 20 minutes
This module focuses on essential Python concepts and techniques for data analytics. The module introduces basic Python concepts, such as the Python interpreter, Jupyter Notebook, input/output, and indentation, enabling students to start developing Python programs for data analytics. Students will learn to apply Python scalar types, objects, attributes, methods, and operators to create and manipulate data structures. They will also apply control statements and iterations, such as conditional statements and loops, to control the flow of execution and process data efficiently. The module covers the use of regular and lambda functions to create reusable and modular code. Additionally, students will learn to apply file-handling techniques to read from and write to files, facilitating data persistence and external data processing. By the end of this module, students will have the necessary Python skills to perform data manipulation, analysis, and processing tasks.
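As a rough illustration of the topics listed above (a regular function, a lambda function, a loop with a conditional, and file reading and writing), here is a minimal sketch. The sample readings, the file name, and the 70-degree threshold are made up for this example and do not come from the course material.

```python
import os
import tempfile


def celsius_to_fahrenheit(celsius):
    """Regular function: convert a temperature from Celsius to Fahrenheit."""
    return celsius * 9 / 5 + 32


# Lambda function: a one-line anonymous function, used here as a sort key.
by_reading = lambda pair: pair[1]

# Hypothetical sample data, invented for this illustration.
readings = [("mon", 21.5), ("tue", 19.0), ("wed", 23.0)]

# Writing files: one comma-separated record per line.
path = os.path.join(tempfile.mkdtemp(), "readings.csv")
with open(path, "w") as handle:
    for day, celsius in readings:
        handle.write(f"{day},{celsius}\n")

# Reading files: loop over the lines and apply a conditional to each value.
converted = []
with open(path) as handle:
    for line in handle:
        day, value = line.strip().split(",")
        fahrenheit = celsius_to_fahrenheit(float(value))
        label = "warm" if fahrenheit >= 70 else "cool"
        converted.append((day, fahrenheit, label))

# The lambda picks the tuple with the largest reading.
warmest = max(readings, key=by_reading)
```

The `with` statement closes each file automatically, which is why no explicit `close()` call appears.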
What's included
22 videos, 6 readings, 17 assignments, 1 discussion prompt
22 videos• Total 182 minutes
- Python Interpreter• 7 minutes
- Jupyter Python• 6 minutes
- Input and Print• 6 minutes
- Indentations• 6 minutes
- Lesson 1 Demo• 4 minutes
- Python Scalar Types • 6 minutes
- Objects • 5 minutes
- Attributes• 6 minutes
- Methods• 5 minutes
- Operators• 6 minutes
- Lesson 2 Demo• 12 minutes
- Conditional Statement• 6 minutes
- Nested Conditional Statement• 5 minutes
- For and While Loops• 6 minutes
- Lesson 3 Demo• 9 minutes
- Regular Functions• 7 minutes
- Lambda Functions• 7 minutes
- Lesson 4 Demo• 4 minutes
- Reading Files• 6 minutes
- Writing Files• 6 minutes
- Lesson 5 Demo• 3 minutes
- Recording of Introduction to Data Analytics: Week 2 - Live Session on 24-11-15 18:33:55 [56:11]• 56 minutes
6 readings• Total 65 minutes
- Essential Reading: Indentations in Python• 15 minutes
- Essential Reading: Operator Precedence and Indentation in Python• 10 minutes
- Essential Reading: Control Statements and Iterations in Python• 10 minutes
- Essential Reading: Handling Functions• 10 minutes
- Essential Reading: Handling Files• 10 minutes
- Python Notebooks used for Demos• 10 minutes
17 assignments• Total 108 minutes
- Python Interpreter• 3 minutes
- Jupyter Python• 3 minutes
- Input and Print• 3 minutes
- Indentations• 3 minutes
- Python Scalar Types • 3 minutes
- Objects • 3 minutes
- Attributes• 3 minutes
- Methods• 3 minutes
- Operators• 3 minutes
- Conditional Statement• 3 minutes
- Nested Conditional Statement• 3 minutes
- For and While Loops• 3 minutes
- Regular Functions• 3 minutes
- Lambda Functions• 3 minutes
- Reading Files• 3 minutes
- Writing Files• 3 minutes
- Modules 1 and 2 - Graded Quiz• 60 minutes
1 discussion prompt• Total 30 minutes
- Real-World Use of Python Libraries in Data Analytics • 30 minutes
This module explores essential data structures in Python, covering both immutable and mutable types and the powerful NumPy library. Students will learn to apply tuples and strings, along with their methods, to store and manipulate fixed data. They will also apply lists, dictionaries, and sets, as well as their respective methods and operations, to handle changeable data effectively. The module introduces NumPy, enabling students to create, manipulate, and perform arithmetic operations on NumPy arrays using built-in functions. By the end of this module, students will have a solid understanding of Python data structures and NumPy, equipping them with the necessary tools for efficient data manipulation and numerical computations in data analytics tasks.
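The difference between immutable and mutable structures, and the kind of array arithmetic NumPy provides, can be sketched as follows. All values are made up for illustration, and the sketch assumes NumPy is installed.

```python
import numpy as np

# Immutable structures: tuples and strings cannot be changed in place.
point = (3, 4)
name = "analytics"
initial = name[0]            # indexing a string returns a new string

# Mutable structures: lists, dictionaries, and sets can be updated in place.
scores = [72, 85, 90]
scores.append(64)
counts = {"pass": 3}
counts["fail"] = 1
tags = {"a", "b"} | {"b", "c"}   # set union removes the duplicate "b"

# NumPy arrays: elementwise arithmetic, slicing, and built-in functions.
matrix = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])
doubled = matrix * 2                 # arithmetic applies to every element
first_column = matrix[:, 0]          # slice: all rows, column 0
column_means = matrix.mean(axis=0)   # mean of each column

# Broadcasting: the 1-D vector of column means is stretched across both rows.
centered = matrix - column_means
```

Attempting `point[0] = 5` would raise a `TypeError`, which is the practical meaning of "immutable" here.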
What's included
19 videos, 4 readings, 15 assignments, 1 discussion prompt
19 videos• Total 185 minutes
- Tuple• 7 minutes
- Tuple Methods • 5 minutes
- Strings• 6 minutes
- Accessing Strings• 6 minutes
- Lesson 1 Demo• 6 minutes
- Lists • 5 minutes
- Slicing List• 6 minutes
- List Methods• 7 minutes
- Dictionary• 7 minutes
- Set• 6 minutes
- Set Operations• 6 minutes
- Lesson 2 Demo• 11 minutes
- NumPy Arrays• 7 minutes
- NumPy Data Types• 8 minutes
- Arithmetic with NumPy• 6 minutes
- Indexing and Slicing Arrays• 6 minutes
- NumPy Functions• 7 minutes
- Lesson 3 Demo• 12 minutes
- Recording of Introduction to Data Analytics: Week 3 - Live Session on 24-11-22 18:33:05 [01:49]• 62 minutes
4 readings• Total 55 minutes
- Essential Reading: Immutable Data Structures• 15 minutes
- Essential Reading: Mutable Data Structures• 15 minutes
- Essential Reading: NumPy Library• 15 minutes
- Python Notebooks used for Demos• 10 minutes
15 assignments• Total 45 minutes
- Tuple• 3 minutes
- Tuple Methods• 3 minutes
- Strings• 3 minutes
- Accessing Strings• 3 minutes
- Lists • 3 minutes
- Slicing List• 3 minutes
- List Methods• 3 minutes
- Dictionary• 3 minutes
- Set - Practice Quiz• 3 minutes
- Set Operations• 3 minutes
- NumPy Arrays• 3 minutes
- NumPy Data Types• 3 minutes
- Arithmetic with NumPy• 3 minutes
- Indexing and Slicing Arrays• 3 minutes
- NumPy Functions• 3 minutes
1 discussion prompt• Total 30 minutes
- Using Broadcasting to Simplify Complex Computations• 30 minutes
This module focuses on exploratory data analysis (EDA) and visualisation using the Pandas library and Matplotlib in Python. Students will learn to apply Pandas to create, manipulate, and perform operations on Series and DataFrame objects, enabling efficient data analysis and preprocessing. They will conduct EDA to identify patterns, trends, and relationships in the data. Additionally, students will apply Matplotlib to create informative and visually appealing plots to effectively communicate insights derived from EDA. By the end of this module, students will have the skills to perform comprehensive exploratory data analysis and create meaningful visualisations using Python.
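A minimal sketch of the Series and DataFrame operations and descriptive statistics named above might look like this. The toy sales table is invented for the example, and the sketch assumes pandas is installed.

```python
import pandas as pd

# Hypothetical toy dataset, made up for this illustration.
df = pd.DataFrame({
    "region": ["north", "south", "north", "east", "south"],
    "units": [10, 15, 12, 7, 20],
    "revenue": [100.0, 160.0, 115.0, 70.0, 210.0],
})

# Selection: a single column of a DataFrame is a Series.
units = df["units"]

# Filtering: keep only the rows where a boolean condition holds.
high_volume = df[df["units"] > 10]

# Descriptive statistics for numerical data (count, mean, std, quartiles).
numeric_summary = df[["units", "revenue"]].describe()

# Descriptive statistics for categorical data: frequency of each category.
region_counts = df["region"].value_counts()

# Bivariate analysis: Pearson correlation between two numeric columns.
units_revenue_corr = df["units"].corr(df["revenue"])
```

With Matplotlib installed, `df.plot.scatter(x="units", y="revenue")` would draw the corresponding scatter plot; it is left out of the block so the sketch runs without a display.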
What's included
19 videos, 4 readings, 16 assignments, 1 discussion prompt
19 videos• Total 192 minutes
- Series • 8 minutes
- DataFrame• 6 minutes
- Indexing a DataFrame• 4 minutes
- Selection in a DataFrame• 5 minutes
- Filtering a DataFrame• 5 minutes
- Operations on a DataFrame• 4 minutes
- Lesson 1 Demo• 16 minutes
- Descriptive Statistics for Numerical Data• 8 minutes
- Descriptive Statistics for Categorical Data• 8 minutes
- Data Relationship: Correlation and Covariance• 6 minutes
- Univariate Analysis• 4 minutes
- Bivariate Analysis• 4 minutes
- Lesson 2 Demo• 12 minutes
- Scatter Plots• 8 minutes
- Line Plots• 8 minutes
- Bar Plots• 9 minutes
- Histograms• 7 minutes
- Lesson 3 Demo• 15 minutes
- Recording of Introduction to Data Analytics: Week 4 - Live Session on 24-11-29 18:34:54 [54:04]• 54 minutes
4 readings• Total 55 minutes
- Essential Reading: Pandas Library• 15 minutes
- Essential Reading: EDA• 15 minutes
- Essential Reading: EDA Visualisation Using Matplotlib• 15 minutes
- Python Notebooks used for Demos• 10 minutes
16 assignments• Total 105 minutes
- Series • 3 minutes
- DataFrame• 3 minutes
- Indexing a DataFrame• 3 minutes
- Selection in a DataFrame• 3 minutes
- Filtering a DataFrame• 3 minutes
- Operations on a DataFrame• 3 minutes
- Descriptive Statistics for Numerical Data• 3 minutes
- Descriptive Statistics for Categorical Data• 3 minutes
- Data Relationship: Correlation and Covariance• 3 minutes
- Univariate Analysis• 3 minutes
- Bivariate Analysis• 3 minutes
- Scatter Plots• 3 minutes
- Line Plots• 3 minutes
- Bar Plots• 3 minutes
- Histograms• 3 minutes
- Modules 3 and 4 - Graded Quiz• 60 minutes
1 discussion prompt• Total 20 minutes
- Data Cleaning and Preprocessing Challenges• 20 minutes
This module focuses on data preprocessing techniques essential for preparing data for analysis. Students will learn to apply methods for reading and writing data in text format while identifying and addressing data quality issues. They will handle missing data by filtering out or filling in missing values and applying various data transformation techniques such as removing duplicates, mapping, replacing values, discretisation, outlier detection and filtering, and encoding categorical variables. Additionally, students will apply data aggregation techniques, including grouping, aggregation and combining functions, to summarise and analyse data. By the end of this module, students will have the skills to preprocess and clean datasets effectively, ensuring data quality and readiness for further analysis.
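The preprocessing steps listed above can be sketched end to end with pandas. The table, the fill values, the bin edges, and the 1.5 x IQR outlier rule are assumptions chosen for this illustration; the course may use different data and thresholds.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical raw data with a duplicate row, missing values, and an outlier.
raw = pd.DataFrame({
    "city": ["Pune", "Pune", "Delhi", "Delhi", "Goa", "Goa"],
    "tier": ["S", "S", "L", "M", None, "M"],
    "price": [100.0, 100.0, np.nan, 120.0, 110.0, 900.0],
})

# Removing duplicates, then filling in missing data.
clean = raw.drop_duplicates().copy()
clean["price"] = clean["price"].fillna(clean["price"].median())
clean["tier"] = clean["tier"].fillna("M")

# Transforming data using mapping, then discretisation into labelled bins.
clean["tier_code"] = clean["tier"].map({"S": 0, "M": 1, "L": 2})
clean["price_band"] = pd.cut(clean["price"], bins=[0, 105, 150, np.inf],
                             labels=["low", "mid", "high"])

# Detecting and filtering outliers with the 1.5 * IQR rule (one common choice).
q1 = clean["price"].quantile(0.25)
q3 = clean["price"].quantile(0.75)
lower, upper = q1 - 1.5 * (q3 - q1), q3 + 1.5 * (q3 - q1)
filtered = clean[(clean["price"] >= lower) & (clean["price"] <= upper)]

# Encoding categorical data as one indicator column per category.
encoded = pd.get_dummies(filtered["city"])

# Split-apply-combine: group by city, aggregate, combine into one table.
summary = filtered.groupby("city")["price"].agg(["mean", "count"])

# Writing to and reading from text (CSV) format, here via an in-memory buffer.
csv_text = filtered.to_csv(index=False)
reloaded = pd.read_csv(io.StringIO(csv_text))
```

Filling with the median rather than the mean is a deliberate choice in this sketch: the median is not pulled upward by the 900.0 outlier.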
What's included
21 videos, 6 readings, 17 assignments, 1 discussion prompt
21 videos• Total 186 minutes
- Reading Data from Text Format• 6 minutes
- Writing Data to Text Format• 7 minutes
- Data Quality Issues• 7 minutes
- Lesson 1 Demo• 8 minutes
- Filtering out Missing Data• 7 minutes
- Filling in Missing Data• 7 minutes
- Lesson 2 Demo• 5 minutes
- Removing Duplicates• 5 minutes
- Transforming Data Using Mapping• 6 minutes
- Replacing Values• 5 minutes
- Discretisation and Binning• 5 minutes
- Encoding Categorical Data• 5 minutes
- Detecting Outliers• 6 minutes
- Filtering Outliers• 5 minutes
- Lesson 3 Demo• 16 minutes
- Split - Apply - Combine• 5 minutes
- Split Step• 4 minutes
- Apply Step• 5 minutes
- Combine Step• 4 minutes
- Lesson 4 Demo• 6 minutes
- Recording of Introduction to Data Analytics: Week 5 - Live Session on 24-12-06 18:30:38 [03:43]• 64 minutes
6 readings• Total 80 minutes
- Essential Reading: Data Quality • 15 minutes
- Essential Reading: Handling Missing Data• 15 minutes
- Essential Reading: Data Transformations• 15 minutes
- Essential Reading: Data Aggregation• 15 minutes
- SGA 1: Pre-Read• 10 minutes
- Python Notebooks used for Demos• 10 minutes
17 assignments• Total 408 minutes
- Reading Data from Text Format• 3 minutes
- Writing Data to Text Format• 3 minutes
- Data Quality Issues• 3 minutes
- Filtering out Missing Data• 3 minutes
- Filling in Missing Data• 3 minutes
- Removing Duplicates• 3 minutes
- Transforming Data Using Mapping• 3 minutes
- Replacing Values• 3 minutes
- Discretisation and Binning• 3 minutes
- Encoding Categorical Data• 3 minutes
- Detecting Outliers• 3 minutes
- Filtering Outliers• 3 minutes
- Split - Apply - Combine• 3 minutes
- Split Step• 3 minutes
- Apply Step• 3 minutes
- Combine Step• 3 minutes
- SGA 1: Data Loading, Cleaning, and Exploration• 360 minutes
1 discussion prompt• Total 20 minutes
- Data Transformation Techniques in Marketing Analytics• 20 minutes
This module focuses on advanced data preprocessing techniques for handling large and complex datasets. Students will learn to apply data reduction techniques, including dimensionality reduction, numerosity reduction, and sampling methods, to reduce the size and complexity of datasets while preserving important information. They will also apply feature selection techniques, such as filter methods, wrapper methods, and embedded methods, to identify and select the most relevant features for data analysis. Additionally, students will explore feature extraction techniques, including Principal Component Analysis (PCA) and Covariance Analysis, to transform and extract new, informative features from the original dataset. By the end of this module, students will have the skills to effectively preprocess and optimise datasets for improved performance and insights in data analysis tasks.
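To make the link between covariance analysis and PCA concrete, the following sketch computes principal components directly from the covariance matrix with NumPy. The synthetic data (three features, one nearly redundant) and the choice of two components are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 samples, 3 features; feature 2 nearly copies feature 0.
base = rng.normal(size=(200, 2))
X = np.column_stack([
    base[:, 0],
    base[:, 1],
    base[:, 0] + 0.05 * rng.normal(size=200),
])

# Step 1: centre each feature on zero.
X_centered = X - X.mean(axis=0)

# Step 2: covariance analysis, giving a 3 x 3 feature covariance matrix.
cov = np.cov(X_centered, rowvar=False)

# Step 3: eigen-decomposition (eigh is intended for symmetric matrices).
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# Step 4: sort components by decreasing variance.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]
explained_ratio = eigenvalues / eigenvalues.sum()

# Step 5: project onto the top k components to reduce dimensionality.
k = 2
X_reduced = X_centered @ eigenvectors[:, :k]
```

Because one feature is almost a copy of another, the first two components carry nearly all of the variance, so dropping the third loses very little information.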
What's included
13 videos, 3 readings, 14 assignments, 1 discussion prompt, 1 ungraded lab
13 videos• Total 99 minutes
- Dimensionality Reduction• 8 minutes
- Numerosity Reduction• 9 minutes
- Sampling Methods• 5 minutes
- Filter Methods• 6 minutes
- Correlation Based Filters• 15 minutes
- Entropy-Based Filters• 5 minutes
- Wrapper Methods• 7 minutes
- Forward Selection• 7 minutes
- Backward Elimination• 7 minutes
- Embedded Methods• 6 minutes
- Mutual Information• 10 minutes
- Covariance Analysis• 6 minutes
- Principal Component Analysis• 7 minutes
3 readings• Total 170 minutes
- Essential Reading: Data Reduction• 50 minutes
- Essential Reading: Feature Selection• 60 minutes
- Essential Reading: Feature Extraction• 60 minutes
14 assignments• Total 138 minutes
- Dimensionality Reduction• 6 minutes
- Numerosity Reduction• 6 minutes
- Sampling Methods• 6 minutes
- Filter Methods• 6 minutes
- Correlation Based Filters• 6 minutes
- Entropy-Based Filters• 6 minutes
- Wrapper Methods• 6 minutes
- Forward Selection• 6 minutes
- Backward Elimination• 6 minutes
- Embedded Methods• 6 minutes
- Mutual Information• 6 minutes
- Covariance Analysis• 6 minutes
- Principal Component Analysis• 6 minutes
- Modules 5 and 6 - Graded Quiz• 60 minutes
1 discussion prompt• Total 30 minutes
- Balancing Data Reduction and Model Performance in Real-World Applications• 30 minutes
1 ungraded lab• Total 60 minutes
- Practice Lab: ML Engineering• 60 minutes
This module focuses on regression analysis, a fundamental technique in predictive modeling and data analysis. Students will learn to apply linear regression techniques, including univariate and multivariate linear models, to analyse and model the relationship between dependent and independent variables in real-world applications. They will also apply model fitting techniques, such as gradient descent, and evaluate regression models using appropriate metrics to select the best-performing model for a given dataset. Additionally, students will explore nonlinear regression techniques, including smoothing methods, regularised models, robust regression, and nonlinear models, to capture and model complex, nonlinear relationships between variables. By the end of this module, students will have the skills to effectively apply regression techniques to solve real-world problems and make data-driven predictions.
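As a sketch of model fitting and evaluation for a univariate linear model, the code below fits the same line twice, once with ordinary least squares and once with gradient descent on the mean squared error, then computes two common metrics. The synthetic data, learning rate, and iteration count are assumptions picked for this example.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical data: y = 3x + 2 plus a little Gaussian noise.
x = rng.uniform(0, 1, size=100)
y = 3.0 * x + 2.0 + rng.normal(scale=0.1, size=100)

# Design matrix with a column of ones so the intercept is learned as a weight.
X = np.column_stack([np.ones_like(x), x])

# Ordinary least squares: the closed-form reference solution.
w_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# Gradient descent on the mean squared error (MSE).
w = np.zeros(2)
learning_rate = 0.1
for _ in range(5000):
    residual = X @ w - y
    gradient = 2 * X.T @ residual / len(y)   # derivative of MSE w.r.t. w
    w -= learning_rate * gradient

# Evaluation metrics: MSE and the coefficient of determination (R squared).
predictions = X @ w
mse = np.mean((y - predictions) ** 2)
r_squared = 1 - np.sum((y - predictions) ** 2) / np.sum((y - y.mean()) ** 2)
```

On a problem this small both methods should agree to several decimal places; gradient descent becomes the practical option when the dataset is too large for a direct solve.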
What's included
12 videos, 3 readings, 10 assignments, 1 discussion prompt, 1 ungraded lab
12 videos• Total 172 minutes
- Applications• 6 minutes
- Simple Linear Regression• 3 minutes
- Ordinary Least Squares Regression• 3 minutes
- Linear Models• 5 minutes
- Gradient Descent• 8 minutes
- Evaluation Metrics• 6 minutes
- Model Selection in Regression• 6 minutes
- Smoothing Methods• 5 minutes
- Regularised Models• 7 minutes
- Nonlinear Models• 5 minutes
- Recording of Introduction to Data Analytics: Week 6 - Live Session on 24-12-18 19:34:23 [56:22]• 56 minutes
- Recording of Introduction to Data Analytics: Week 7 - Live Session on 24-12-20 18:33:01 [02:08]• 62 minutes
3 readings• Total 180 minutes
- Essential Reading: Linear Regression• 60 minutes
- Essential Reading: Regression Fit• 60 minutes
- Essential Reading: Nonlinear Regression• 60 minutes
10 assignments• Total 57 minutes
- Applications• 6 minutes
- Simple Linear Regression• 6 minutes
- Ordinary Least Squares Regression• 6 minutes
- Linear Models• 6 minutes
- Gradient Descent• 6 minutes
- Evaluation Metrics• 6 minutes
- Model Selection in Regression• 6 minutes
- Smoothing Methods• 6 minutes
- Regularised Models• 3 minutes
- Nonlinear Models• 6 minutes
1 discussion prompt• Total 30 minutes
- Selecting and Evaluating Regression Models for Optimal Performance• 30 minutes
1 ungraded lab• Total 60 minutes
- Practice Lab: Time Series• 60 minutes
This module focuses on classification techniques, specifically rule-based and parameter-based models. Students will learn to apply decision trees to solve binary and multiclass classification problems and evaluate the performance of these models. They will explore decision tree induction algorithms, considering design issues and measures of impurity, as well as random forests, to build effective and interpretable models. Students will also apply model selection techniques, such as cross-validation, and address overfitting issues to optimise decision tree models and visualise decision boundaries. Additionally, they will learn to apply two parameter-based models, logistic regression and discriminant analysis, to solve classification problems and evaluate their performance. By the end of this module, students will have the skills to effectively apply classification techniques to real-world problems and make data-driven predictions.
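Measures of impurity drive how a decision tree chooses its splits. The sketch below computes Gini impurity and entropy for a set of labels and picks the threshold that minimises the weighted Gini impurity of the two child nodes. The tiny "hours studied" dataset and the helper names are hypothetical, and real induction algorithms add stopping rules and pruning that are omitted here.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy of the class distribution, in bits."""
    n = len(labels)
    return -sum((count / n) * log2(count / n)
                for count in Counter(labels).values())


def weighted_gini(values, labels, threshold):
    """Weighted Gini impurity of the two children of the split value <= threshold."""
    left = [label for v, label in zip(values, labels) if v <= threshold]
    right = [label for v, label in zip(values, labels) if v > threshold]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


# Hypothetical data: hours studied and whether the learner passed.
hours = [1, 2, 3, 4, 5, 6]
passed = ["no", "no", "no", "yes", "yes", "yes"]

# Candidate thresholds: midpoints between consecutive sorted values.
ordered = sorted(hours)
candidates = [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]

# The best split is the one whose children are purest.
best_threshold = min(candidates, key=lambda t: weighted_gini(hours, passed, t))
```

Here the classes separate cleanly, so the best split leaves both children with zero impurity; on noisy data the minimum is above zero and the tree keeps splitting.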
What's included
17 videos, 4 readings, 17 assignments, 1 discussion prompt, 1 ungraded lab
17 videos• Total 144 minutes
- Applications• 5 minutes
- Binary Classification • 5 minutes
- Multiclass Classification• 5 minutes
- Building Decision Trees - Part 1• 5 minutes
- Building Decision Trees - Part 2• 2 minutes
- Design Issues• 5 minutes
- Measures of Impurity - Part 1• 4 minutes
- Measures of Impurity - Part 2• 4 minutes
- Cross-Validation• 6 minutes
- Overfitting• 5 minutes
- Random Forests• 5 minutes
- Decision Boundaries• 9 minutes
- Logistic Regression• 4 minutes
- Discriminant Analysis• 4 minutes
- Classifier’s Performance Evaluation - Part 1• 8 minutes
- Classifier’s Performance Evaluation - Part 2• 5 minutes
- Recording of Introduction to Data Analytics: Week 8 - Live Session on 24-12-27 18:31:54 [01:17]• 61 minutes
4 readings• Total 240 minutes
- Essential Reading: Rule Based Models• 60 minutes
- Essential Reading: Decision Tree Induction Algorithms• 60 minutes
- Essential Reading: Model Selection in Decision Trees• 60 minutes
- Essential Reading: Parameter Based Models• 60 minutes
17 assignments• Total 156 minutes
- Applications• 6 minutes
- Binary Classification • 6 minutes
- Multiclass Classification• 6 minutes
- Building Decision Trees - Part 1• 6 minutes
- Building Decision Trees - Part 2• 6 minutes
- Design Issues• 6 minutes
- Measures of Impurity - Part 1• 6 minutes
- Measures of Impurity - Part 2• 6 minutes
- Cross-Validation• 6 minutes
- Overfitting• 6 minutes
- Random Forests• 6 minutes
- Decision Boundaries• 6 minutes
- Logistic Regression• 6 minutes
- Discriminant Analysis• 6 minutes
- Classifier’s Performance Evaluation - Part 1• 6 minutes
- Classifier’s Performance Evaluation - Part 2• 6 minutes
- Modules 7 and 8 - Graded Quiz• 60 minutes
1 discussion prompt• Total 30 minutes
- Addressing Overfitting in Real-World Machine Learning Applications• 30 minutes
1 ungraded lab• Total 60 minutes
- Practice Lab: Model Optimization• 60 minutes
This module focuses on unsupervised learning techniques for clustering, which aim to discover natural groupings and patterns in data without prior knowledge of class labels. Students will learn to apply partitional clustering techniques, specifically the k-Means algorithm, considering similarity measures, distance matrices, and cluster goodness evaluation. They will also explore hierarchical clustering methods, both bottom-up agglomerative and top-down divisive, to create nested clusters and analyse data at different levels of granularity. Additionally, students will apply cluster validation techniques, including external and internal indices, to assess the quality of clustering results and determine the optimal number of clusters for a given dataset. By the end of this module, students will have the skills to effectively apply clustering techniques to real-world problems and gain insights from unlabeled data.
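The k-Means algorithm mentioned above alternates between assigning points to the nearest centroid and recomputing each centroid. A minimal NumPy sketch follows, using Euclidean distance and the within-cluster sum of squared errors as a simple internal measure of cluster goodness. The function name, the two synthetic blobs, and the fixed iteration count are assumptions for illustration.

```python
import numpy as np


def k_means(points, k, n_iterations=20, seed=0):
    """Basic k-Means with Euclidean distance; returns labels, centroids, SSE."""
    rng = np.random.default_rng(seed)
    # Initialise centroids as k distinct points chosen at random.
    centroids = points[rng.choice(len(points), size=k, replace=False)]

    for _ in range(n_iterations):
        # Assignment step: distance from every point to every centroid.
        distances = np.linalg.norm(
            points[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Update step: mean of each cluster (keep old centroid if empty).
        centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j)
            else centroids[j]
            for j in range(k)
        ])

    # Final assignment against the final centroids.
    distances = np.linalg.norm(
        points[:, None, :] - centroids[None, :, :], axis=2)
    labels = distances.argmin(axis=1)
    # Internal index: within-cluster sum of squared errors (lower is tighter).
    sse = float(((points - centroids[labels]) ** 2).sum())
    return labels, centroids, sse


# Hypothetical data: two well-separated blobs of 30 points each.
rng = np.random.default_rng(1)
blob_a = rng.normal(loc=0.0, scale=0.3, size=(30, 2))
blob_b = rng.normal(loc=5.0, scale=0.3, size=(30, 2))
points = np.vstack([blob_a, blob_b])

labels, centroids, sse = k_means(points, k=2)
```

Running the function for several values of `k` and comparing the resulting SSE values is one simple way to explore how many clusters a dataset supports.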
What's included
13 videos, 4 readings, 13 assignments, 1 discussion prompt
13 videos• Total 52 minutes
- Applications• 3 minutes
- Types of Clusters• 3 minutes
- Types of Clustering Algorithms• 3 minutes
- Similarity Measures• 6 minutes
- Distance Matrix• 4 minutes
- k-Means Algorithm• 5 minutes
- Fuzzy C-Means Algorithm• 6 minutes
- Bottom-Up Agglomerative Methods• 4 minutes
- Top-Down Divisive Methods• 4 minutes
- Distance Measures in Hierarchical Methods• 2 minutes
- Aspects of Cluster Validation• 3 minutes
- External Indices• 5 minutes
- Internal Indices• 3 minutes
4 readings• Total 160 minutes
- Essential Reading: Partitional Clustering• 60 minutes
- Essential Reading: Hierarchical Clustering• 60 minutes
- Essential Reading: Cluster Validation• 30 minutes
- Practice Lab: Model Deployment• 10 minutes
13 assignments• Total 78 minutes
- Applications• 6 minutes
- Types of Clusters• 6 minutes
- Types of Clustering Algorithms• 6 minutes
- Similarity Measures• 6 minutes
- Distance Matrix• 6 minutes
- k-Means Algorithm• 6 minutes
- Fuzzy C-Means Algorithm• 6 minutes
- Bottom-Up Agglomerative Methods• 6 minutes
- Top-Down Divisive Methods• 6 minutes
- Distance Measures in Hierarchical Methods• 6 minutes
- Aspects of Cluster Validation• 6 minutes
- External Indices• 6 minutes
- Internal Indices • 6 minutes
1 discussion prompt• Total 30 minutes
- Selecting the Most Effective Clustering Algorithm for Real-World Datasets• 30 minutes
This module focuses on privacy, fairness, and security in data analytics. Students will learn about risk assessment and threat modeling in the practical use of data analytics. Privacy-preserving data mechanisms for model privacy will be surveyed, and the attack strategies and defense mechanisms of model security will be emphasized. Notions of AI fairness and algorithmic bias will be covered at the pre-processing, in-processing, and post-processing stages of data analytics. Cost-sensitive classification and machine learning will be discussed as ways to assess model fairness. Model security will be formalized under frameworks of adversarial data mining for game-theory-based AI, with applications in the cyber kill chain for cybersecurity. Adversarial example games will be summarized for specific targets in terms of adversarial capability, ability, and goals. An adversarial risk analysis of the game theories and the associated optimization trade-offs will be presented in the setup of binary, multiclass, and multilabel classification. The relation between adversarial and robust data mining for classifier design will be motivated with respect to the robustness properties that analytics models satisfy under defense mechanisms such as semi-supervised machine learning, adversarial training and learning, empirical risk minimization, and mistake-bound frameworks for adversarial classification. By the end of this module, students will have the skills to effectively apply data analytics techniques to real-world problems and gain insights in a safe, secure, and transparent manner.
What's included
16 videos, 4 readings, 17 assignments, 1 discussion prompt, 1 ungraded lab
16 videos• Total 148 minutes
- Data Privacy• 8 minutes
- Model Privacy• 9 minutes
- Privacy Enhancing Strategies• 7 minutes
- Data Fairness• 4 minutes
- Model Fairness• 6 minutes
- Algorithmic Fairness• 7 minutes
- Model Security - Part 1• 7 minutes
- Model Security - Part 2• 6 minutes
- Cost-Sensitive Classification• 6 minutes
- Cost-Sensitive Learning• 9 minutes
- Adversarial Data Mining - Part 1 • 9 minutes
- Adversarial Data Mining - Part 2• 13 minutes
- Robust Data Mining - Part 1 • 8 minutes
- Robust Data Mining - Part 2• 6 minutes
- Adversarial and Robust Data Mining• 8 minutes
- Recording of Introduction to Data Analytics: Week 10 - Live Session on 25-01-10 18:31:32 [38:57]• 36 minutes
4 readings• Total 105 minutes
- Essential Reading: Analytics Privacy• 30 minutes
- Essential Reading: Analytics Fairness• 45 minutes
- Essential Reading: Analytics Security• 20 minutes
- SGA 2: Pre-Read• 10 minutes
17 assignments• Total 750 minutes
- Data Privacy• 6 minutes
- Model Privacy• 6 minutes
- Privacy Enhancing Strategies• 6 minutes
- Data Fairness• 6 minutes
- Model Fairness• 6 minutes
- Algorithmic Fairness• 6 minutes
- Model Security - Part 1• 6 minutes
- Model Security - Part 2• 6 minutes
- Cost-Sensitive Classification• 6 minutes
- Cost-Sensitive Learning• 6 minutes
- Adversarial Data Mining - Part 1 • 6 minutes
- Adversarial Data Mining - Part 2• 6 minutes
- Robust Data Mining - Part 1 • 6 minutes
- Robust Data Mining - Part 2• 6 minutes
- Adversarial and Robust Data Mining• 6 minutes
- Modules 9 and 10 - Graded Quiz• 60 minutes
- SGA 2: Feature Engineering and Model Development• 600 minutes
1 discussion prompt• Total 30 minutes
- Ensuring Privacy While Maximising Data Utility in Analytics• 30 minutes
1 ungraded lab• Total 60 minutes
- Practice Lab: Neural Networks• 60 minutes
What's included
1 assignment
1 assignment• Total 30 minutes
- Comprehensive Final Examination• 30 minutes
Offered by

Birla Institute of Technology & Science, Pilani (BITS Pilani) is one of only ten private universities in India to be recognised as an Institute of Eminence by the Ministry of Human Resource Development, Government of India. It has been consistently ranked high by both governmental and private ranking agencies for its innovative processes and capabilities, which have enabled it to impart quality education and emerge as the best private science and engineering institute in India. BITS Pilani has four campuses, in Pilani, Goa, Hyderabad, and Dubai, and has been offering bachelor's, master's, and certificate programmes for over 58 years, helping to launch the careers of over 1,00,000 professionals.