Week 8 - Natural Language Processing, Part II
This week you will...
start digging into a variety of model architectures that are used to train models to understand context in sequences. Over the last couple of weeks you first looked at tokenizing words to obtain numeric values for them, and then at using embeddings to group words of similar meaning based on how they were labelled. This gave you a good, but rough, sentiment analysis -- words such as 'fun' and 'entertaining' might show up in a positive movie review, and 'boring' and 'dull' might show up in a negative one. But sentiment can also be determined by the sequence in which words appear. For example, you could have 'not fun', which of course is the opposite of 'fun'. A minimal sketch of such a sequence-aware model follows this list.
learn about using natural language processing (NLP) models for predictions. Given a body of words, you could conceivably predict the word most likely to follow a given word or phrase, and once you've done that, do it again, and again. With that in mind, you'll build a text generator. It is trained on texts that mimic the style of Master Yoda from Star Wars and can be used to produce sentences that sound similar to those of Yoda (see the second sketch below).
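To make the first point concrete, here is a minimal sketch of a sequence-aware sentiment model. It assumes TensorFlow/Keras (the course's framework is not stated here), and the tiny corpus, layer sizes, and epoch count are illustrative placeholders only. Because the LSTM reads tokens in order, a phrase like 'not fun' can be scored differently from 'fun' on its own:

```python
# A minimal sketch, assuming TensorFlow/Keras; corpus and hyperparameters
# are illustrative placeholders, not the course's exact setup.
import tensorflow as tf

sentences = ["this movie was fun", "really entertaining film",
             "not fun at all", "boring and dull"]
labels = tf.constant([1, 1, 0, 0])  # 1 = positive, 0 = negative

# Tokenize: map each word to an integer, padding to a fixed sequence length.
vectorize = tf.keras.layers.TextVectorization(output_sequence_length=5)
vectorize.adapt(sentences)
sequences = vectorize(tf.constant(sentences))

model = tf.keras.Sequential([
    # Embedding: each word id becomes a dense vector of similar-meaning words.
    tf.keras.layers.Embedding(input_dim=vectorize.vocabulary_size(),
                              output_dim=16),
    # The LSTM consumes the vectors in order, so word order affects the output.
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(16)),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # sentiment score in [0, 1]
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(sequences, labels, epochs=10, verbose=0)
```

And for the second point, a minimal sketch of next-word prediction for text generation, under the same assumptions: every prefix of a training sentence becomes an input and its next word the label, and generation repeatedly appends the most likely next word. The seed phrases below are placeholders, not the course's actual Yoda training data:

```python
# A minimal sketch, assuming TensorFlow/Keras; the corpus is a placeholder.
import numpy as np
import tensorflow as tf

corpus = ["strong is the force", "patience you must have",
          "do or do not there is no try"]

tokenizer = tf.keras.preprocessing.text.Tokenizer()
tokenizer.fit_on_texts(corpus)
vocab_size = len(tokenizer.word_index) + 1

# Build n-gram prefixes: "strong is", "strong is the", "strong is the force", ...
sequences = []
for line in corpus:
    tokens = tokenizer.texts_to_sequences([line])[0]
    for i in range(2, len(tokens) + 1):
        sequences.append(tokens[:i])

max_len = max(len(s) for s in sequences)
padded = tf.keras.preprocessing.sequence.pad_sequences(sequences, maxlen=max_len)
xs, ys = padded[:, :-1], padded[:, -1]  # last token is the prediction target

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, 16),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(vocab_size, activation="softmax"),  # one score per word
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.fit(xs, ys, epochs=200, verbose=0)

# Generate: predict the most likely next word, append it, and repeat.
seed = "patience you"
for _ in range(3):
    tokens = tokenizer.texts_to_sequences([seed])[0]
    padded_seed = tf.keras.preprocessing.sequence.pad_sequences(
        [tokens], maxlen=max_len - 1)
    next_id = int(np.argmax(model.predict(padded_seed, verbose=0)))
    seed += " " + tokenizer.index_word.get(next_id, "")
print(seed)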
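```

With a larger corpus in Yoda's style, the same prefix-to-next-word setup produces the text generator described above; picking the argmax at every step is the simplest decoding choice, and sampling from the softmax instead yields more varied output.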
Learning Resources
Will be provided here soon.
Until next week you should...
prepare questions for the instructor team about problems in your project or about potential improvements you are considering.
complete the second milestone, i.e. the definition of an evaluation metric and the estimation of a baseline model, by Sunday, before next week's feedback session! Follow the instructions given in the template repository. We will review your work via the link to your repository provided in the Google Sheet that contains the current list of projects.