Week 6 - Natural Language Processing, Part I

  • understand why tokenizing a text matters when training a neural network on text, for example to do sentiment analysis. Tokenization is the process of converting text into numeric values, with each number representing a word or a character.

  • learn about embeddings, which map text tokens to vectors in a high-dimensional space. With embeddings and labelled examples, these vectors can be tuned so that words with similar meanings point in similar directions in the vector space. This begins the process of training a neural network to understand sentiment in text. You will start with movie reviews, training a neural network on texts that are labelled 'positive' or 'negative' and determining which words in a sentence drive those meanings.
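
To make the two ideas above concrete, here is a minimal sketch in plain Python, not the TensorFlow/Keras API used in the course. It builds a word-level vocabulary, turns sentences into padded integer sequences, and looks the integers up in a randomly initialised embedding table. The function names (build_vocab, texts_to_sequences, pad, init_embeddings, cosine_similarity), the "<PAD>" and "<OOV>" tokens, and the three example reviews are all assumptions made for illustration. The embedding vectors are untrained, so their similarities carry no meaning yet; in a real model they would be tuned during training on the labelled reviews.

```python
import math
import random

PAD, OOV = "<PAD>", "<OOV>"  # hypothetical special tokens for this sketch


def build_vocab(texts):
    """Assign every distinct lowercase word an integer index.

    Index 0 is reserved for padding, index 1 for out-of-vocabulary words.
    """
    vocab = {PAD: 0, OOV: 1}
    for text in texts:
        for word in text.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab


def texts_to_sequences(texts, vocab):
    """Replace each word by its index; unknown words map to the OOV index."""
    return [[vocab.get(w, vocab[OOV]) for w in t.lower().split()] for t in texts]


def pad(sequences, maxlen):
    """Truncate or right-pad each sequence with 0 so all have length maxlen."""
    return [s[:maxlen] + [0] * (maxlen - len(s)) for s in sequences]


def init_embeddings(vocab_size, dim, seed=0):
    """Create one small random vector per token index (untrained embeddings)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.05, 0.05) for _ in range(dim)] for _ in range(vocab_size)]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Made-up example reviews, for illustration only.
reviews = ["I loved this movie", "I hated this movie", "loved it"]

vocab = build_vocab(reviews)
sequences = texts_to_sequences(reviews, vocab)
padded = pad(sequences, maxlen=4)
print(padded)  # [[2, 3, 4, 5], [2, 6, 4, 5], [3, 7, 0, 0]]

# Words not seen when building the vocabulary fall back to the OOV index 1.
print(texts_to_sequences(["I loved that film"], vocab))  # [[2, 3, 1, 1]]

# Embedding lookup: each token index selects one row of the table.
table = init_embeddings(vocab_size=len(vocab), dim=4)
embedded_review = [table[i] for i in padded[0]]  # 4 tokens x 4 dimensions

# Untrained vectors: this similarity is arbitrary until the table is trained.
print(cosine_similarity(table[vocab["loved"]], table[vocab["hated"]]))
```

In Keras, the corresponding steps are typically handled by a text vectorization or tokenizer utility followed by an Embedding layer, whose weights play the role of the table above and are updated by backpropagation.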

Learning Resources

  • complete week 3 and week 4 of the course Natural Language Processing in TensorFlow

  • complete the exercise assignment in this notebook

  • consider a baseline model or a baseline comparison for your project task according to the instructions given here.

  • document your decision according to the instructions given in the link above

