Text Classification tasks starts with providing training set: documents and categories (labels) to the Machine Learning algorithm. After the model is trained it can be used to categorize new examples.
Text representation brings some complexity when forming machine learning problem. Usually the dataset has the form of rows organized into features.
In our case every document is a data point, label is a category, but what would features be?