I have an assignment on the Twitter API using Python. The idea is to collect tweets and clean the text by removing URLs, retweet markers, and so on. I have it partially done, but I need help with the following:
1-Write a function to tokenize the cleaned text. Remove common stop words
and tokens fewer than 3 characters long from the resulting tokens. The
function’s output should be the filtered set of tokens.
2-Write a function to perform stemming on the cleaned tokens.
3-Also write a separate function to perform lemmatization on the cleaned
tokens (not the stemmed tokens).
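
The three functions requested above could be sketched roughly as follows. This is a minimal sketch under several assumptions, not a definitive solution: the names `tokenize`, `stem_tokens`, `lemmatize_tokens`, and `STOP_WORDS` are hypothetical, the stop-word list is a small illustrative sample rather than a complete one, and the stemming and lemmatization helpers assume the NLTK library is installed (the lemmatizer also assumes the WordNet corpus has been fetched with `nltk.download("wordnet")`). The NLTK imports sit inside the functions so the tokenizer can run on its own without NLTK.

```python
import re

# Assumption: a small illustrative stop-word sample. A fuller list, such as
# nltk.corpus.stopwords.words("english"), could be passed in instead.
STOP_WORDS = frozenset({
    "the", "and", "for", "that", "this", "with", "are", "was", "were",
    "have", "has", "had", "not", "but", "you", "your", "from", "they",
    "their", "will", "would", "can", "all", "about", "what", "when",
})


def tokenize(cleaned_text, stop_words=STOP_WORDS, min_length=3):
    """Split cleaned text into lowercase tokens, then drop stop words
    and tokens shorter than min_length characters."""
    # Runs of letters and digits become tokens; punctuation is discarded.
    tokens = re.findall(r"[a-z0-9]+", cleaned_text.lower())
    return [t for t in tokens if len(t) >= min_length and t not in stop_words]


def stem_tokens(tokens):
    """Return the Porter stem of each cleaned token."""
    # Imported here so tokenize() works even if NLTK is not installed.
    from nltk.stem import PorterStemmer

    stemmer = PorterStemmer()
    return [stemmer.stem(t) for t in tokens]


def lemmatize_tokens(tokens):
    """Return the WordNet lemma of each cleaned (not stemmed) token."""
    # Requires the WordNet corpus: nltk.download("wordnet")
    from nltk.stem import WordNetLemmatizer

    lemmatizer = WordNetLemmatizer()
    # lemmatize() treats each token as a noun unless a part of speech is given.
    return [lemmatizer.lemmatize(t) for t in tokens]
```

With this sketch, `tokenize("The birds are singing in the trees and it is amazing")` returns `['birds', 'singing', 'trees', 'amazing']`. That same list would then be passed separately to `stem_tokens` and to `lemmatize_tokens`, so lemmatization works on the cleaned tokens rather than on the stems. Stemming strips suffixes by rule and can produce strings that are not real words, whereas lemmatization maps tokens to dictionary forms.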