This section is currently being rewritten. More details on the GAN course will follow here soon.
For this week you should have gone through the lectures of week 1 of the first Coursera course on NLP, including the quiz and the assignment. https://www.coursera.org/learn/classification-vector-spaces-in-nlp/home/week/1
Let's start and dive into the fascinating world of NLP.
As an introduction, we will look at how NLP was done a few years ago, so that in the following courses we can appreciate how natural language processing has changed for the better.
As a start we will do positive and negative sentiment analysis on Twitter tweets. Analyzing tweets is a huge topic in NLP: hedge funds use tweets to try to predict stock price movements, and political campaign managers analyze tweets to see how the sentiment about their candidate evolves. So we dive right into the heart of one of the core uses of NLP.
In this week we will count, for each word in the dictionary, how often it appears in positive and in negative tweets. We will use that dictionary to produce the input to a simple logistic regression model and train it. So we won't use the words themselves as the input to the model, which is what we will do in later weeks. See the course videos for more details.
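The counting idea can be sketched in a few lines of plain Python. This is only an illustration of the feature-extraction step, not the course's assignment code; the function names and the tiny example tweets are made up:

```python
from collections import defaultdict

def build_freqs(tweets, labels):
    """Count how often each word appears in positive (1) and negative (0) tweets."""
    freqs = defaultdict(int)
    for tweet, label in zip(tweets, labels):
        for word in tweet.lower().split():
            freqs[(word, label)] += 1
    return freqs

def extract_features(tweet, freqs):
    """Map a tweet to [bias, sum of positive counts, sum of negative counts]."""
    words = tweet.lower().split()
    pos = sum(freqs[(w, 1)] for w in words)
    neg = sum(freqs[(w, 0)] for w in words)
    return [1.0, float(pos), float(neg)]

# Toy labeled tweets, invented for illustration.
tweets = ["happy great day", "sad bad day", "great happy fun", "bad terrible loss"]
labels = [1, 0, 1, 0]
freqs = build_freqs(tweets, labels)
print(extract_features("happy great", freqs))  # → [1.0, 4.0, 0.0]
```

A logistic regression model is then trained on these three-dimensional feature vectors instead of on the raw words.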
For the next week you should go through all the course videos, the assignment and the quiz of week 2 of course 1 in the NLP specialization. Take notes and notice if you have any questions about the material. In the next meeting we will discuss these.
https://www.coursera.org/learn/classification-vector-spaces-in-nlp/home/week/2
See you next week!
This is a brief overview of use cases of NLP. The goal is to show you what is possible with current NLP techniques and to inspire you to use some of these applications yourself. This guide does not attempt to be comprehensive, so if you know of other interesting applications, we would be happy to hear about them.
Take one or more text documents and create a summary that represents the most important/relevant information from the original text. These summaries can either be “generic” (a general overview of the original text) or “query relevant” (a summary that only focuses on the text that is relevant to a picked topic). The summarisation process is either extractive (directly reproducing parts of the source text word-for-word) or abstractive (forming an internal semantic representation of the original content and using this to write the summary from scratch).
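To make the extractive idea concrete, here is a deliberately naive frequency-based extractive summariser: it scores each sentence by how frequent its words are in the whole text and quotes the top sentence verbatim. This is a sketch for illustration only, not a production technique; abstractive summarisation needs a trained language model and cannot be shown this simply.

```python
import re
from collections import Counter

def extractive_summary(text, n_sentences=1):
    """Rank sentences by summed word frequency and keep the top ones, in order."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(re.findall(r"\w+", text.lower()))
    def score(s):
        return sum(freq[w] for w in re.findall(r"\w+", s.lower()))
    chosen = set(sorted(sentences, key=score, reverse=True)[:n_sentences])
    return " ".join(s for s in sentences if s in chosen)

text = "NLP is fun. NLP models read text. Cats sleep."
print(extractive_summary(text))  # → NLP models read text.
```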
You can play around with a demo bot for this:
These are tools that can answer questions asked in normal (natural) language.
It replies based on either text it saw during training or on some text you provide to it at the same time as asking the question. As with text summarisation, the answering process can either be extractive (directly quoting the source text) or abstractive (writing the answer based on an internal semantic representation of the original content).
The task of automatically extracting structured information from text documents.
Information extraction facilitates further computation on previously unstructured data. There are two main types of information extraction: named entity recognition and relation extraction.
Named entity recognition allows you to identify all entities of a predefined category (e.g. Extract all cities; or extract all company names).
Relation extraction builds on top of named entity recognition. In addition to finding the entities, it allows you to detect the semantic relationships between them (e.g. Extract all countries and their capital cities; or extract all companies and the year they were founded in).
Here is a demo website where you can enter your text and see which entities are extracted.
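As a toy illustration of how pattern-based relation extraction can look, the snippet below finds "X is the capital of Y" pairs with a regular expression. Real systems use trained models rather than hand-written patterns; the pattern and example sentences here are made up:

```python
import re

def extract_capital_relations(text):
    """Find (city, country) pairs matching '<City> is the capital of <Country>'."""
    pattern = r"([A-Z][a-z]+) is the capital of ([A-Z][a-z]+)"
    return re.findall(pattern, text)

text = "Paris is the capital of France. Berlin is the capital of Germany."
print(extract_capital_relations(text))  # → [('Paris', 'France'), ('Berlin', 'Germany')]
```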
Normal chatbots can hold conversations, answer your questions and carry out simple tasks (e.g. changing a setting in your account, placing an order or scheduling a meeting for you).
The process of sorting pieces of text into one or more predefined categories. Examples of how this can be used include:
Text sentiment classification;
Spam filters;
Determining whether the author is making a claim or not - as the first step in fact-checking;
Analysing trends in social media monitoring.
Translate from one language to another or let your text be rewritten.
Check out:
Describe what you are trying to achieve, and let the AI draft the code for you (e.g. HTML, CSS, SQL queries, and Linux commands).
At present, the tools that can do this are imperfect and can only really be used to write a first draft that you would then need to review.
We will talk about the general outline of the course, the final project, and other organizational matters. You will also get the chance to introduce yourself briefly and to get to know your fellow students.
For the next week we will finally dive into the material. You should go through all the course videos, the assignment and the quiz of week 1 of course 1 in the NLP specialization.
https://www.coursera.org/learn/classification-vector-spaces-in-nlp/home/week/1
For this week you should have gone through the lectures of week 2 of the first Coursera course on NLP, including the quiz and the assignment. https://www.coursera.org/learn/classification-vector-spaces-in-nlp/home/week/2
This week is very similar to the first, but instead of logistic regression we will use Naive Bayes.
If you are not familiar with this machine learning algorithm, these videos will give you a head start, since in the Coursera course Naive Bayes is only covered on the fly.
At the end you will be able to test your Naive Bayes model with your own tweets or others that you source from the internet.
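A minimal Naive Bayes sentiment classifier can be sketched as follows. It mirrors the log-likelihood-ratio-with-Laplace-smoothing idea from the course videos, but the code and the toy tweets are my own illustration (class priors are ignored, assuming balanced classes):

```python
import math
from collections import Counter

def train_naive_bayes(tweets, labels):
    """Return log P(w|pos) - log P(w|neg) per word, with Laplace smoothing."""
    pos = Counter(w for t, y in zip(tweets, labels) if y == 1 for w in t.lower().split())
    neg = Counter(w for t, y in zip(tweets, labels) if y == 0 for w in t.lower().split())
    vocab = set(pos) | set(neg)
    n_pos, n_neg, v = sum(pos.values()), sum(neg.values()), len(vocab)
    return {w: math.log((pos[w] + 1) / (n_pos + v)) - math.log((neg[w] + 1) / (n_neg + v))
            for w in vocab}

def predict(tweet, loglikelihood):
    """Predict 1 (positive) if the summed log-likelihood ratio is above zero."""
    score = sum(loglikelihood.get(w, 0.0) for w in tweet.lower().split())
    return 1 if score > 0 else 0

# Toy labeled tweets, invented for illustration.
tweets = ["happy great day", "sad bad day", "great fun", "terrible loss"]
labels = [1, 0, 1, 0]
ll = train_naive_bayes(tweets, labels)
print(predict("what a great happy day", ll))  # → 1
```

Words never seen in training contribute nothing to the score, which is why `loglikelihood.get(w, 0.0)` defaults to zero.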
For the next week you should go through all the course videos, the assignment and the quiz of week 3 of course 1 in the NLP specialization. Take notes and notice if you have any questions about the material. In the next meeting we will discuss these.
https://www.coursera.org/learn/classification-vector-spaces-in-nlp/home/week/3
Have fun and see you next week.
For this week you should have gone through the lectures of week 4 of the first Coursera course on NLP and the assignment. https://www.coursera.org/learn/classification-vector-spaces-in-nlp/home/week/4
In this week we learn about one of the most important applications of NLP: translating from one language to another.
For this week you should have gone through the lectures of week 3 of the first Coursera course on NLP and the assignment. https://www.coursera.org/learn/classification-vector-spaces-in-nlp/home/week/3
In this week we dive into representing our words as word vectors. Remember that any ML algorithm requires its input in mathematical form, i.e. as numbers. In the last two weeks we associated two numbers with each tweet - one for positive sentiment and one for negative sentiment - and fed that vector into our models. From now on we will try a different approach that is independent of sentiment analysis and thus more general.
For that we encode each word as a vector. Thus we need a dictionary mapping each word to its corresponding vector. In general we do not know what the best vector to represent a word is, so we have to learn that as well. Luckily, there are a lot of pretrained word embeddings available online, and we can normally use one of those.
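Once words are vectors, similarity between words becomes a simple computation. Below is a sketch using cosine similarity with invented three-dimensional embeddings; real pretrained embeddings (e.g. GloVe or word2vec) have hundreds of dimensions and are learned, not hand-written:

```python
import math

def cosine_similarity(u, v):
    """cos similarity = (u . v) / (||u|| * ||v||); 1.0 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Toy 3-dimensional embeddings, invented for illustration.
embeddings = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}
print(cosine_similarity(embeddings["king"], embeddings["queen"]))  # close to 1.0
print(cosine_similarity(embeddings["king"], embeddings["apple"]))  # much lower
```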
For the next week you should go through all the course videos, the assignment and the quiz of week 4 of course 1 in the NLP specialization. Take notes and notice if you have any questions about the material. In the next meeting we will discuss these.
https://www.coursera.org/learn/classification-vector-spaces-in-nlp/home/week/4
To quote the NLP tutor: "And remember to have fun".
See you next week!
Register yourself in the Opencampus Mattermost Chat
Register yourself in Coursera and for the Natural Language Processing Specialization, and enroll at least in the first course Natural Language Processing with Classification and Vector Spaces.
Welcome to the second course, Natural Language Processing with Probabilistic Models! Congratulations on making it this far!
Let us start with some words on what probabilistic models are. These are models based on the principle: given these words, what is the most likely next word? So a pretty reasonable approach.
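A bigram count model is the simplest concrete instance of this principle: for each word, count which word follows it most often. This is a toy sketch with made-up sentences; real language models are far more sophisticated:

```python
from collections import Counter, defaultdict

def train_bigrams(corpus):
    """For each word, count which words follow it across all sentences."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for w1, w2 in zip(words, words[1:]):
            counts[w1][w2] += 1
    return counts

def most_likely_next(word, counts):
    """Return the most frequent successor of the given word."""
    return counts[word.lower()].most_common(1)[0][0]

corpus = ["i like nlp", "i like pizza", "i like nlp a lot"]
print(most_likely_next("like", train_bigrams(corpus)))  # → nlp
```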
In this week we learn about autocorrect, minimum edit distance, and dynamic programming. At the end you will build your own spellchecker to correct misspelled words!
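The dynamic-programming idea behind minimum edit distance can be sketched as follows. The cost convention (insert and delete cost 1, replace cost 2) matches the one used in the course videos, but this is an illustration, not the assignment code:

```python
def min_edit_distance(source, target, ins_cost=1, del_cost=1, rep_cost=2):
    """Compute edit distance with a (m+1) x (n+1) dynamic-programming table."""
    m, n = len(source), len(target)
    D = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):          # deleting every source prefix
        D[i][0] = i * del_cost
    for j in range(1, n + 1):          # inserting every target prefix
        D[0][j] = j * ins_cost
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            rep = 0 if source[i - 1] == target[j - 1] else rep_cost
            D[i][j] = min(D[i - 1][j] + del_cost,      # delete
                          D[i][j - 1] + ins_cost,      # insert
                          D[i - 1][j - 1] + rep)       # replace or match
    return D[m][n]

print(min_edit_distance("play", "stay"))  # → 4 (two replacements at cost 2 each)
```

A spellchecker then generates candidate corrections and ranks them by this distance (and by word probability).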