Welcome to Data Wrangling for Business. This course will cover data wrangling principles and techniques for business. Key topics include data extraction, profiling, cleansing, integration, transformation, and automating data processes for business purposes. In the course, you will apply principles and techniques using data transformation tools, programming languages, and data process automation tools. The course offers you an opportunity to learn how to embed appropriate communication mechanisms for collaboration to identify and resolve real-world data challenges revealed in datasets and business processes, creating business value in today’s disparate computing and dynamic business environment.
In this module, you will learn about the structure of relational databases and how to use SQL queries for information retrieval, focusing on single-row and group functions. In the next module, you will build upon this foundation by exploring data manipulation and data joining techniques.
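As a rough illustration of the distinction this module draws between single-row and group functions, here is a minimal sketch using Python's built-in sqlite3 module. The `employees` table, its columns, and its rows are hypothetical and are not taken from the course materials; the course's own database system may differ.

```python
import sqlite3

# Hypothetical in-memory table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, dept TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("ana", "Sales", 50000.0), ("ben", "Sales", 70000.0), ("cy", "IT", 90000.0)],
)

# Single-row functions return one result for every input row.
single_row = conn.execute(
    "SELECT UPPER(name), LENGTH(name) FROM employees ORDER BY name"
).fetchall()

# Group functions collapse many rows into one result per group.
grouped = conn.execute(
    "SELECT dept, COUNT(*), AVG(salary) FROM employees "
    "GROUP BY dept ORDER BY dept"
).fetchall()
conn.close()

print(single_row)  # three rows in, three rows out
print(grouped)     # three rows in, one row per department out
```

The single-row query returns as many rows as the table holds, while the grouped query returns one row per department.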
What's included
7 videos, 16 readings, 1 assignment
7 videos•Total 20 minutes
Course Overview•3 minutes
Meet Your Faculty•2 minutes
Relational Database Data Structure•3 minutes
Unique Values•3 minutes
Constraints•5 minutes
Working with Text Values•2 minutes
TIMESTAMP•3 minutes
16 readings•Total 67 minutes
Course Introduction•2 minutes
Syllabus - Data Wrangling for Business•10 minutes
Academic Integrity•1 minute
Data Wrangling Key Questions & Steps•2 minutes
Relational Databases•1 minute
Go to Canvas•1 minute
Key SQL Concepts•5 minutes
Overview of Predefined Functions•2 minutes
Single-Row vs. Group Functions•10 minutes
Overview of Single-Row Functions•2 minutes
Uses of Single-Row Functions•4 minutes
Single-Row Function Example•10 minutes
Overview of Group Functions•2 minutes
Uses of Group Functions•4 minutes
Group Function Example•10 minutes
Go to Canvas•1 minute
1 assignment•Total 30 minutes
Module 1 Quiz•30 minutes
Managing and Retrieving Data with SQL
Module 2•3 hours to complete
Module details
The module also highlights how effective data manipulation and joining contribute to the broader goals of data wrangling and preparation, ensuring that data is both well-organized and ready for analysis.
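The data manipulation and joining operations named in this module's readings can be sketched with Python's built-in sqlite3 module. The `customers` and `orders` tables and their contents are hypothetical. Only inner and left joins are shown, and the transaction behavior is that of sqlite3's default settings, which may differ from the database used in the course.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)

# Data manipulation: INSERT and UPDATE, then COMMIT to make them permanent.
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ana"), (2, "Ben")])
conn.execute("INSERT INTO orders VALUES (10, 1, 25.0)")
conn.execute("UPDATE orders SET total = 30.0 WHERE id = 10")
conn.commit()

# Transaction control: a DELETE that is rolled back leaves the data unchanged.
conn.execute("DELETE FROM customers WHERE id = 2")
conn.rollback()

# Inner join: only customers that have a matching order.
inner = conn.execute(
    "SELECT c.name, o.total FROM customers c "
    "INNER JOIN orders o ON o.customer_id = c.id ORDER BY c.id"
).fetchall()

# Left join: every customer, with NULL (None) where no order matches.
left = conn.execute(
    "SELECT c.name, o.total FROM customers c "
    "LEFT JOIN orders o ON o.customer_id = c.id ORDER BY c.id"
).fetchall()
conn.close()

print(inner)
print(left)
```

Comparing the two results shows the practical difference: the left join keeps the customer without an order, which the inner join drops.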
What's included
14 readings, 1 assignment
14 readings•Total 113 minutes
Data Manipulation Overview•3 minutes
INSERT•4 minutes
Update•5 minutes
Delete•10 minutes
TCL Core Operations•10 minutes
Summary Table•10 minutes
Data Joining Overview•10 minutes
Data Joining Types Overview•10 minutes
Inner Join•10 minutes
Left Join•10 minutes
Right Join•10 minutes
Full Outer Join•10 minutes
Cross Join•10 minutes
Go to Canvas•1 minute
1 assignment•Total 45 minutes
Module 2 Quiz•45 minutes
Data Profiling and Discovery
Module 3•2 hours to complete
Module details
In this module, you will learn how to explore datasets using Python. You’ll practice techniques to inspect dataset structure (rows, columns, and data types), and detect missing, invalid, or inconsistent data. You will also learn how to generate descriptive statistics and distribution summaries, as well as interpret profiling results to guide data cleansing and improve overall data quality.
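The profiling steps described above can be sketched in plain Python. This text does not name the libraries the course uses, so this example relies only on the standard library; the `rows` dataset and its columns are hypothetical.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical dataset: a list of records with one missing value,
# one inconsistent label ("west"), and one duplicated row.
rows = [
    {"id": 1, "region": "East", "sales": 120.0},
    {"id": 2, "region": "west", "sales": None},
    {"id": 3, "region": "East", "sales": 80.0},
    {"id": 3, "region": "East", "sales": 80.0},
]

# Structure: number of rows and columns, and the types seen in each column.
columns = list(rows[0].keys())
shape = (len(rows), len(columns))
types = {col: {type(r[col]).__name__ for r in rows} for col in columns}

# Content: missing values per column.
missing = {col: sum(1 for r in rows if r[col] is None) for col in columns}

# Descriptive statistics over the non-missing numeric values.
sales = [r["sales"] for r in rows if r["sales"] is not None]
summary = {
    "min": min(sales),
    "max": max(sales),
    "mean": mean(sales),
    "median": median(sales),
}

# Frequency distribution of a categorical column.
region_counts = Counter(r["region"] for r in rows)

# Exact duplicate rows (records that appear more than once).
row_counts = Counter(tuple(r.items()) for r in rows)
duplicates = [row for row, n in row_counts.items() if n > 1]

print(shape, types, missing, summary, region_counts, duplicates, sep="\n")
```

Each result points to a cleansing decision: the missing `sales` value, the lowercase `west` label, and the repeated row are all candidates for correction.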
What's included
2 videos, 16 readings, 1 assignment
2 videos•Total 6 minutes
Data Profiling•3 minutes
Data Profiling Example•3 minutes
16 readings•Total 82 minutes
Go to Canvas•1 minute
Data Profiling Overview•10 minutes
Discovering Data Structure Overview•2 minutes
Rows and Columns•7 minutes
Data Type•7 minutes
Non-Null Entries•7 minutes
Discovering Data Structure Example•10 minutes
Discovering Data Content Overview•3 minutes
Summary Statistics•4 minutes
Descriptive Statistics•4 minutes
Frequency Distribution•4 minutes
Missing Values•3 minutes
Duplicate Data•3 minutes
Incorrect or Ambiguous Data•4 minutes
Discovering Data Content Example•10 minutes
Data Profiling•3 minutes
1 assignment•Total 30 minutes
Module 3 Quiz•30 minutes
Data Cleansing
Module 4•1 hour to complete
Module details
By the end of this module, you will be able to apply practical data cleansing techniques to improve data quality and make your analysis more accurate and trustworthy.
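As a minimal sketch of what such cleansing techniques can look like, the following standard-library Python example handles inconsistent formats, type conversion, duplicates, missing values, and outliers. The `raw` records, the median imputation, and the 1000 cutoff are all assumptions chosen for illustration, not the course's prescribed method.

```python
from statistics import median

# Hypothetical raw records with typical quality issues.
raw = [
    {"customer": " Ana ", "state": "ma", "amount": "100"},
    {"customer": "Ben", "state": "MA", "amount": None},
    {"customer": "Ben", "state": "MA", "amount": None},
    {"customer": "Cy", "state": "NY", "amount": "120"},
    {"customer": "Di", "state": "ny", "amount": "9000"},
]

# 1. Inconsistent formats and data types: trim, upper-case, convert to float.
cleaned = [
    {
        "customer": r["customer"].strip(),
        "state": r["state"].upper(),
        "amount": float(r["amount"]) if r["amount"] is not None else None,
    }
    for r in raw
]

# 2. Duplicates: keep the first occurrence of each identical record.
seen = set()
deduped = []
for r in cleaned:
    key = tuple(r.items())
    if key not in seen:
        seen.add(key)
        deduped.append(r)

# 3. Missing values: impute with the median of the observed amounts
#    (one possible strategy; dropping the row is another).
observed = [r["amount"] for r in deduped if r["amount"] is not None]
fill_value = median(observed)
for r in deduped:
    if r["amount"] is None:
        r["amount"] = fill_value

# 4. Outliers: filter with a hypothetical business rule (amount <= 1000).
MAX_AMOUNT = 1000.0
final = [r for r in deduped if r["amount"] <= MAX_AMOUNT]

print(final)
```

The order of the steps matters: standardizing formats first lets the duplicate check treat `ma` and `MA` as the same value.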
What's included
2 videos, 12 readings, 1 assignment
2 videos•Total 3 minutes
Data Cleansing•1 minute
Data Cleansing to Handle Outliers•2 minutes
12 readings•Total 32 minutes
Data Cleansing Overview•2 minutes
Go to Canvas•1 minute
Overview of Types of Data Issues•2 minutes
Missing Values•3 minutes
Duplicates•3 minutes
Inconsistent Formats•3 minutes
Outliers and Anomalies•3 minutes
Incorrect or Invalid Values•3 minutes
Data Type Conversions•3 minutes
Filtering•3 minutes
Data Cleansing Example•2 minutes
Practice: Data Profiling and Cleansing•4 minutes
1 assignment•Total 30 minutes
Module 4 Quiz•30 minutes
Data Transformation
Module 5•2 hours to complete
Module details
In this module, you will learn how to transform raw data into clean, structured datasets that are ready to be linked with other relevant data for enrichment. To perform these tasks, you will continue working with Python; in upcoming modules, you will also explore tools such as Alteryx to enhance your data preparation and enrichment workflow.
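A small standard-library sketch of these transformation and enrichment ideas follows. It converts a data type, converts a unit, maps coded values, splits a field, joins to a lookup table, and handles a conversion error. The `orders` and `regions` data, the `transform` function, and the status codes are hypothetical.

```python
# Hypothetical source records and lookup data.
orders = [
    {"order_id": "1", "customer": "Ana Lopez", "weight_lb": "2.0",
     "status": "S", "region_id": 10},
    {"order_id": "2", "customer": "Ben Wu", "weight_lb": "n/a",
     "status": "P", "region_id": 20},
]
regions = {10: "East", 20: "West"}              # enrichment lookup
STATUS_LABELS = {"S": "Shipped", "P": "Pending"}  # value mapping
LB_TO_KG = 0.45359237                            # pounds to kilograms


def transform(order):
    """Return a transformed, enriched copy of one order record."""
    # Splitting: separate the full name into first and last name.
    first_name, last_name = order["customer"].split(" ", 1)

    # Type and unit conversion, with error handling for bad input.
    try:
        weight_kg = round(float(order["weight_lb"]) * LB_TO_KG, 3)
    except ValueError:
        weight_kg = None  # flag the value as unknown instead of failing

    return {
        "order_id": int(order["order_id"]),                        # type conversion
        "first_name": first_name,
        "last_name": last_name,
        "weight_kg": weight_kg,                                    # derived value
        "status": STATUS_LABELS.get(order["status"], "Unknown"),  # mapping
        "region": regions.get(order["region_id"]),                # join / enrichment
    }


result = [transform(o) for o in orders]
print(result)
```

Catching `ValueError` around the numeric conversion keeps one malformed value from stopping the whole batch, at the cost of producing a missing value that later steps must handle.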
What's included
2 videos, 12 readings, 1 assignment
2 videos•Total 5 minutes
Data Transformation•2 minutes
Data Enrichment Example•3 minutes
12 readings•Total 82 minutes
Data Transformation Overview•1 minute
Converting Data Types•4 minutes
Converting Units of Measurement•4 minutes
Mapping Data Values•10 minutes
Splitting Data Values•10 minutes
Data Enrichment Overview•2 minutes
Data Unions•10 minutes
Data Joins•10 minutes
Derivation of New Values•10 minutes
Errors and Exceptions Overview•10 minutes
Handling Errors and Exceptions•10 minutes
Go to Canvas•1 minute
1 assignment•Total 30 minutes
Module 5 Quiz•30 minutes
Data Integration
Module 6•3 hours to complete
Module details
This module emphasizes techniques for gathering, integrating, and transforming data from diverse sources. Hands-on exercises focus on automating data extraction from webpages and processing textual data, enabling the conversion of raw, unstructured information into structured, analyzable formats. By applying these methods, participants learn to create unified datasets that are ready for deeper analysis and the generation of meaningful insights.
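The idea of turning webpage content into structured, analyzable data can be sketched as follows. The module's readings cover Beautiful Soup and NLTK; to stay dependency-free, this sketch substitutes the standard library's `html.parser` and a regular-expression tokenizer. The HTML string, the `HeadingExtractor` class, and the tiny `STOPWORDS` set are hypothetical, and lemmatization is not shown.

```python
import re
from html.parser import HTMLParser


class HeadingExtractor(HTMLParser):
    """Collect the text inside every <h2> element."""

    def __init__(self):
        super().__init__()
        self.in_h2 = False
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self.in_h2 = True
            self.headings.append("")

    def handle_endtag(self, tag):
        if tag == "h2":
            self.in_h2 = False

    def handle_data(self, data):
        if self.in_h2:
            self.headings[-1] += data


# Hypothetical page content; a real scraper would download this first.
html = (
    "<html><body><h2>The Prices are Rising</h2><p>ignored</p>"
    "<h2>Sales of the Quarter</h2></body></html>"
)
parser = HeadingExtractor()
parser.feed(html)
parser.close()

# A tiny illustrative stopword list, not a complete one.
STOPWORDS = {"the", "are", "of", "a", "an"}


def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


# Structured output: one record per heading, with noise words removed.
structured = [
    {"heading": h, "tokens": [t for t in tokenize(h) if t not in STOPWORDS]}
    for h in parser.headings
]
print(structured)
```

The result is a list of records rather than raw markup, which is the kind of structured form that later analysis steps can consume.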
What's included
1 video, 17 readings, 1 assignment
1 video•Total 2 minutes
Do Businesses Need to Automate Web Data Collection?•2 minutes
17 readings•Total 137 minutes
What is Data Integration?•3 minutes
What is Unstructured Text Data?•3 minutes
Common Transformation Techniques•4 minutes
Go to Canvas•1 minute
Web Scraping Overview•2 minutes
Go to Canvas•1 minute
Beautiful Soup Overview•3 minutes
Other Beautiful Soup Find Methods•3 minutes
Go to Canvas•1 minute
Beautiful Soup Function Example•90 minutes
Go to Canvas•1 minute
Transforming Unstructured Text Data•10 minutes
Natural Language Toolkit (NLTK)•3 minutes
Tokenization: Breaking Text Into Words•3 minutes
Removing Stopwords to Reduce Noise•4 minutes
Normalizing Words with Lemmatization•3 minutes
Result of Text Transformation•2 minutes
1 assignment•Total 30 minutes
Module 6 Quiz•30 minutes
Data Wrangling Automation
Module 7•2 hours to complete
Module details
In this module, you will learn how to use Alteryx, an industry automation tool, to automate data transformation and data integration. This skill will help you streamline and speed up data processing in your professional career.
Founded in 1898, Northeastern is a global research university with a distinctive, experience-driven approach to education and discovery. The university is a leader in experiential learning, powered by the world’s most far-reaching cooperative education program. The spirit of collaboration guides a use-inspired research enterprise focused on solving global challenges in health, security, and sustainability.
When will I have access to the lectures and assignments?
To access the course materials and assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. It also means that you will not be able to purchase a Certificate experience.
What will I get if I purchase the Certificate?
When you purchase a Certificate you get access to all course materials, including graded assignments. Upon completing the course, your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.
Is financial aid available?
Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you’ll find a link to apply on the description page.