Course Summary
Credit Type:
Course
ACE ID:
IBM-0026
Organization:
Length:
52 weeks (156 hours in total)
Dates Offered:
Credit Recommendation & Competencies
Level | Credits (SH) | Subject
Lower-Division Baccalaureate | 3 | Introduction to Python
Lower-Division Baccalaureate | 3 | Introduction to Database Systems
Upper-Division Baccalaureate | 3 | Advanced SQL Programming
Upper-Division Baccalaureate | 3 | Data Mining
Description

Objective:

This course is offered through Coursera, which is an ACE Authorized Instructional Platform.

The course objective is for students to immerse themselves in the role of a data engineer and acquire the essential skills needed to work with various tools and databases to design, deploy, and manage structured and unstructured data.

By the end of this Professional Certificate, students will be able to explain and perform the critical tasks required in a data engineering role. Learners will use the Python programming language and Linux/UNIX shell scripts to extract, transform and load (ETL) data, work with Relational Databases (RDBMS), query data using SQL statements, and use NoSQL databases and unstructured data. Learners will be introduced to Big Data, work with Big Data engines like Hadoop and Spark, and gain experience creating Data Warehouses and utilizing Business Intelligence tools to analyze and extract insights.
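
For illustration only, a minimal sketch of the kind of extract, transform, and load (ETL) step practiced in the Python and shell-scripting modules is shown below; the file names and the "price" column are hypothetical and not taken from the course materials.

  import csv

  def extract(path):
      # Read raw rows from a (hypothetical) source CSV file
      with open(path, newline="") as f:
          return list(csv.DictReader(f))

  def transform(rows):
      # Drop incomplete rows and normalize the price column to a float
      cleaned = []
      for row in rows:
          if row.get("price"):
              row["price"] = float(row["price"])
              cleaned.append(row)
      return cleaned

  def load(rows, path):
      # Write the transformed rows to a (hypothetical) target CSV file
      with open(path, "w", newline="") as f:
          writer = csv.DictWriter(f, fieldnames=rows[0].keys())
          writer.writeheader()
          writer.writerows(rows)

  if __name__ == "__main__":
      load(transform(extract("source.csv")), "target.csv")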

Each module includes numerous hands-on labs and projects for applying the concepts and skills learned. The program culminates in a Capstone Project that brings all of these skills together to develop and implement an entire data platform, with various data repositories and pipelines, to address a realistic data analytics problem.

This program does not require any prior data engineering or programming experience.

Learning Outcomes:

  • Describe data engineering and its function(s)
  • Describe and differentiate between the roles and responsibilities of Data Engineers, Data Scientists, Data Analysts, Business Analysts, and Business Intelligence Analysts
  • Describe the different entities that form a modern data ecosystem
  • Explain what big data is, how it impacts the collection, monitoring, storage, analysis, and reporting of data, and identify some common big data processing tools
  • Describe the elements of a data engineering ecosystem, including data, data repositories, data integration platforms, data pipelines, languages, and BI and reporting tools
  • Explain the characteristics and use of some of the programming, querying, and scripting languages
  • Explain the use of Data Integration Platforms and how they relate to data pipelines and the ETL and ELT processes
  • Describe what RDBMSes and NoSQL databases are and give examples of their use
  • List and describe the most common use cases for MongoDB
  • Describe Apache Cassandra and explain how it fits in the NoSQL space
  • Demonstrate skill in retrieving SQL query results and analyzing data
  • Describe Apache Spark application submission, including use of Spark’s unified interface, ‘spark-submit’; describe and apply options for submitting applications; identify techniques for managing external application dependencies; and list the benefits of the Spark Shell (a minimal sketch follows below)
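
As a rough illustration of the last outcome, a minimal PySpark application of the kind submitted with ‘spark-submit’ might look like the following sketch; the file name, table name, and columns are hypothetical and not drawn from the course labs.

  # Save as etl_app.py and run with, for example:
  #   spark-submit --master local[2] etl_app.py
  from pyspark.sql import SparkSession

  spark = SparkSession.builder.appName("etl_app").getOrCreate()

  # Read a (hypothetical) CSV file into a DataFrame and register it for SQL queries
  df = spark.read.csv("sales.csv", header=True, inferSchema=True)
  df.createOrReplaceTempView("sales")

  # Retrieve and analyze query results with Spark SQL
  totals = spark.sql("SELECT region, SUM(amount) AS total FROM sales GROUP BY region")
  totals.show()

  spark.stop()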

General Topics:

  • Introduction to Data Engineering
  • Python for Data Science, AI & Development
  • Python Project for Data Engineering
  • Introduction to Relational Databases (RDBMS)
  • Databases and SQL for Data Science with Python
  • Introduction to NoSQL Databases
  • Introduction to Big Data with Spark and Hadoop
  • Data Engineering and Machine Learning using Spark
  • Hands-on Introduction to Linux Commands and Shell Scripting
  • ETL and Data Pipelines with Shell, Airflow and Kafka
  • Getting Started with Data Warehousing and BI Analytics
Instruction & Assessment

Instructional Strategies:

  • Audio Visual Materials
  • Case Studies
  • Lectures
  • Practical Exercises

Methods of Assessment:

  • Other
  • Quizzes
  • Peer-reviewed projects graded with rubrics

Minimum Passing Score:

70%
Supplemental Materials