Course

Course Summary
Credit Type:
Course
ACE ID:
STAT-0055
Organization:
Location:
Online
Length:
4 weeks
Dates Offered:
Credit Recommendation & Competencies
Level                          Credits (SH)   Subject
Upper-Division Baccalaureate   2              Natural Language Processing
Description

Objective:

This course introduces students to the essential techniques of natural language processing (NLP) and text mining with Python. It covers how to apply unsupervised and supervised modeling techniques to text, and it devotes considerable attention to the data preparation and data handling methods required to transform unstructured text into a form that can be mined.

Learning Outcomes:

  • Use string functions.
  • Preprocess data via stemming, lemmatization, tokenization, N-gram creation, and stop-word removal.
  • Develop models to classify documents.
  • Specify regular expressions.
  • Characterize documents by term frequency-inverse document frequency (TF-IDF) vectors.
  • Perform topic modeling and text summarization.
  • Leverage pre-trained models for part-of-speech (POS) tagging and named entity recognition (NER).
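
As a rough illustration of several of the outcomes above (regular expressions, tokenization, stop-word removal, N-gram creation, and TF-IDF vectors), here is a minimal sketch in Python using only the standard library. It is not taken from the course materials: the toy corpus, the stop-word list, and all function names are hypothetical, and the TF-IDF weighting shown (length-normalized term frequency times log(N / df)) is one common unsmoothed variant among several.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus and stop-word list, chosen only for illustration.
DOCS = [
    "The cat sat on the mat.",
    "The dog chased the cat.",
    "Dogs and cats are pets.",
]
STOP_WORDS = {"the", "on", "and", "are"}


def tokenize(text):
    """Lowercase the text and extract runs of letters with a regular expression."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def ngrams(tokens, n):
    """Return the contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tf_idf(corpus_tokens):
    """Compute one {term: weight} dict per document.

    Assumes tf = count / document length and idf = log(N / df);
    libraries often use smoothed variants of both.
    """
    n_docs = len(corpus_tokens)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in corpus_tokens for term in set(tokens))
    vectors = []
    for tokens in corpus_tokens:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


corpus = [remove_stop_words(tokenize(d)) for d in DOCS]
bigrams = ngrams(corpus[0], 2)
vectors = tf_idf(corpus)
```

In this sketch, a term that occurs in only one document (such as "sat") receives a higher weight than one shared across documents (such as "cat"), which is the intuition behind characterizing documents by TF-IDF. Stemming, lemmatization, POS tagging, and NER typically rely on third-party libraries and are not shown here.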

General Topics:

  • Introduction and Data Preparation
  • Predictive Models for Text
  • Retrieval and Clustering of Documents
  • Information Extraction
Instruction & Assessment

Instructional Strategies:

  • Discussion
  • Lectures
  • Practical Exercises
  • Textbook Readings

Methods of Assessment:

  • Other: capstone case study project, graded practical exercises, and discussion forums

Minimum Passing Score:

80%
Supplemental Materials