Difficulty: beginner
Estimated Time: 20 minutes

Natural Language Processing covers a range of tasks, such as:

  • Part of speech tagging
  • Word segmentation
  • Named entity recognition
  • Machine translation
  • Question answering
  • Sentiment analysis
  • Topic segmentation and recognition
  • Natural language generation

Another such task is classifying text based on its content. In this scenario you will learn how to use the Bag of Words and tf-idf models to perform this task.

Text Classification

A text classification task starts with providing a training set, made up of documents and their categories (labels), to a machine learning algorithm. Once the model is trained, it can be used to categorize new examples.
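The train-then-categorize workflow can be sketched in a few lines of plain Python. This is only a toy illustration, not the model used in this scenario: the documents, the labels, and the `train` and `predict` helpers are all made up for the example, and the "model" simply counts which words were seen under which label.

```python
from collections import Counter, defaultdict

# Hypothetical toy training set: documents and their categories (labels).
train_docs = [
    "the team won the match",
    "a great goal in the final match",
    "the new phone has a fast processor",
    "this laptop has a great screen",
]
train_labels = ["sport", "sport", "tech", "tech"]


def train(docs, labels):
    """Count how often each word appears under each label."""
    word_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc.lower().split())
    return word_counts


def predict(model, doc):
    """Return the label whose training words overlap most with the document."""
    words = doc.lower().split()
    scores = {
        label: sum(counts[word] for word in words)
        for label, counts in model.items()
    }
    return max(scores, key=scores.get)


# Train once, then categorize examples the model has not seen before.
model = train(train_docs, train_labels)
print(predict(model, "new phone with a fast screen"))  # tech
print(predict(model, "who won the final"))             # sport
```

A real classifier would weigh words far more carefully, but the shape is the same: labeled documents go in during training, and a label comes out for each new document.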

Representing text adds some complexity when framing a machine learning problem. A dataset usually takes the form of rows (data points), each described by a fixed set of features (columns), and raw text does not come in that shape.

[Figure: text classification features]

In our case every document is a data point and its category is the label, but what would the features be?
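One common answer, and the idea behind the Bag of Words model named in the introduction, is to treat each distinct word as a feature and its count in the document as the value. Below is a minimal sketch in plain Python, assuming simple whitespace tokenization and a made-up two-document corpus; `to_features` is a hypothetical helper, not part of any library.

```python
from collections import Counter

# Hypothetical two-document corpus.
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
]

# Vocabulary: every distinct word in the corpus, sorted so the
# feature columns always appear in the same order.
vocabulary = sorted({word for doc in docs for word in doc.lower().split()})


def to_features(doc):
    """Turn one document into a row of word counts, one per vocabulary word."""
    counts = Counter(doc.lower().split())
    return [counts[word] for word in vocabulary]


rows = [to_features(doc) for doc in docs]

print(vocabulary)  # ['cat', 'chased', 'dog', 'mat', 'on', 'sat', 'the']
for row in rows:
    print(row)
# [1, 0, 0, 1, 1, 1, 2]
# [1, 1, 1, 0, 0, 0, 2]
```

Each document is now a row of numbers with one column per word, which is the rows-and-features shape a machine learning algorithm expects.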