Course

Course Summary
Credit Type:
Course
ACE ID:
STAT-0048
Organization:
Location:
Online
Length:
4 weeks
Dates Offered:
Credit Recommendation & Competencies
Level:
Lower-Division Baccalaureate
Credits (SH):
2
Subject:
Natural Language Processing
Description

Objective:

This course introduces students to the essential techniques of natural language processing (NLP) and text mining with Python. It covers how to apply unsupervised and supervised modeling techniques to text, and it devotes considerable attention to the data preparation and data handling methods required to transform unstructured text into a form that can be mined.

Learning Outcomes:

  • Specify regular expressions.
  • Use string functions.
  • Preprocess data via stemming, lemmatization, tokenization, N-gram creation, and stop-word removal.
  • Leverage pre-trained models for part-of-speech (POS) tagging and named entity recognition (NER).
  • Characterize documents by term frequency-inverse document frequency (TF-IDF) vectors.
  • Perform topic modeling and text summarization.
  • Develop models to classify documents.
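
Several of the outcomes above (regular expressions, tokenization, stop-word removal, N-gram creation, and TF-IDF vectors) can be illustrated with a minimal sketch that uses only the Python standard library. The toy corpus, the stop-word list, and the function names below are assumptions made for illustration and are not taken from the course materials. The TF-IDF weighting shown (relative term frequency times log(N / document frequency)) is one common variant among several. Stemming and lemmatization are left out because they usually rely on a library such as NLTK or spaCy.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus and stop-word list, chosen only for illustration.
DOCS = [
    "The cat sat on the mat.",
    "The dog chased the cat.",
    "Dogs and cats are pets.",
]
STOP_WORDS = {"the", "on", "and", "are"}


def tokenize(text):
    """Lowercase the text and extract runs of letters with a regular expression."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def ngrams(tokens, n):
    """Return all contiguous n-token sequences as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tf_idf(docs_tokens):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    n_docs = len(docs_tokens)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for tokens in docs_tokens for term in set(tokens))
    vectors = []
    for tokens in docs_tokens:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors


tokenized = [remove_stop_words(tokenize(d)) for d in DOCS]
bigrams = ngrams(tokenized[0], 2)
vectors = tf_idf(tokenized)
```

In this sketch, a term that occurs in fewer documents receives a higher weight than an equally frequent term that occurs in many, which is the property that makes TF-IDF vectors useful for characterizing documents.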

General Topics:

  • Introduction and Data Preparation
  • Predictive Models for Text
  • Retrieval and Clustering of Documents
  • Information Extraction
Instruction & Assessment

Instructional Strategies:

  • Discussion
  • Lectures
  • Practical Exercises
  • Textbook readings

Methods of Assessment:

  • Other: graded practical exercises and discussion forums

Minimum Passing Score:

73%
Supplemental Materials