Week 7 - Natural Language Processing, Part II

This week you will...

  • start digging into a variety of model architectures used to train models that understand context in a sequence. Over the last couple of weeks you first looked at tokenizing words to get numeric values from them, and then at using embeddings to group words of similar meaning based on how they were labelled. This gave you a good, but rough, sentiment analysis -- words such as 'fun' and 'entertaining' might show up in a positive movie review, and 'boring' and 'dull' in a negative one. But sentiment is also determined by the sequence in which words appear: 'not fun', for example, means the opposite of 'fun'. One way to capture word order is shown in the first sketch after this list.

  • learn about using natural language processing (NLP) models for prediction. Given a body of text, you can predict the word most likely to follow a given word or phrase, and then repeat that process again and again. With that in mind, you'll build a text generator: trained on texts that mimic the style of Master Yoda from Star Wars, it can produce sentences that sound similar to his. The generation loop is outlined in the second sketch after this list.
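
To make the sequence idea concrete, here is a minimal sketch of a Keras sentiment classifier that feeds embeddings into a recurrent layer, so word order (e.g. 'not fun') can influence the prediction. The vocabulary size, sequence length, and layer sizes below are illustrative placeholders, not the course's exact architecture:

```python
import tensorflow as tf

# Placeholder sizes -- adjust to match your tokenizer and data.
VOCAB_SIZE = 10000
EMBEDDING_DIM = 16

model = tf.keras.Sequential([
    # Embedding layer: maps each token ID to a dense vector, as in previous weeks.
    tf.keras.layers.Embedding(VOCAB_SIZE, EMBEDDING_DIM),
    # A bidirectional LSTM reads the embedded tokens in order (and in reverse),
    # so a phrase like 'not fun' is processed as a sequence, not a bag of words.
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32)),
    tf.keras.layers.Dense(24, activation='relu'),
    # One sigmoid unit: the probability that the review is positive.
    tf.keras.layers.Dense(1, activation='sigmoid'),
])

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()
```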
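
And here is a hedged sketch of the generation loop behind such a text generator, assuming a word-level Keras Tokenizer and a model trained on 'pre'-padded n-gram sequences whose last token is the label (a common setup for this kind of exercise); the function and variable names are illustrative, not the notebook's exact code:

```python
import numpy as np
from tensorflow.keras.preprocessing.sequence import pad_sequences

def generate_text(model, tokenizer, seed_text, next_words=10, max_sequence_len=20):
    """Repeatedly predict the most likely next word and append it to the seed."""
    for _ in range(next_words):
        # Encode the current text and pad it the same way as the training sequences.
        token_list = tokenizer.texts_to_sequences([seed_text])[0]
        padded = pad_sequences([token_list], maxlen=max_sequence_len - 1, padding='pre')
        # Take the highest-probability word from the model's softmax output.
        predicted_id = int(np.argmax(model.predict(padded, verbose=0), axis=-1)[0])
        for word, index in tokenizer.word_index.items():
            if index == predicted_id:
                seed_text += ' ' + word
                break
    return seed_text

# Hypothetical usage, once `model` is trained and `tokenizer` is fitted:
# print(generate_text(model, tokenizer, 'strong with the force', next_words=8))
```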

Learning Resources

Until next week you should...

  • complete week 1 and week 2 of the course Sequences, Time Series and Prediction

  • complete the Exercise 1 assignment in this notebook (Exercise 2 in the same notebook is for next week).

  • decide on an evaluation metric for your project task and evaluate your baseline model with it (see the sketch after this list).

  • document your baseline model's evaluation results, and the metric(s) you used, in your project repository.
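
If your project task is a classification problem, scikit-learn's built-in helpers are one minimal starting point for the metric decision. The labels below are made-up placeholders; for imbalanced classes, the F1 score is usually more informative than plain accuracy:

```python
from sklearn.metrics import accuracy_score, f1_score, classification_report

# Made-up placeholder labels -- replace with your test-set labels
# and your baseline model's predictions.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1]

print('accuracy:', accuracy_score(y_true, y_pred))
print('F1 score:', f1_score(y_true, y_pred))
print(classification_report(y_true, y_pred))
```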
