• Chapter 1 outlines the tidy text format and the unnest_tokens() function. It also introduces the gutenbergr and janeaustenr packages, which provide useful literary text datasets that we’ll use throughout this book.
  • Chapter 2 shows how to perform sentiment analysis on a tidy text dataset, using the sentiments dataset from tidytext and inner_join() from dplyr.
  • Chapter 3 describes the tf-idf statistic (term frequency times inverse document frequency), a quantity used for identifying terms that are especially important to a particular document.
  • Chapter 4 introduces n-grams and how to analyze word networks in text using the widyr and ggraph packages.
  • Chapter 5 introduces methods for tidying document-term matrices and corpus objects from the tm and quanteda packages, as well as for casting tidy text datasets into those formats.
  • Chapter 6 explores the concept of topic modeling, and uses the tidy() method to interpret and visualize the output of the topicmodels package.
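
For orientation, the tf-idf statistic mentioned for Chapter 3 can be written out as below. This is a sketch of one common definition, not necessarily the exact form the book uses; the symbols n_{t,d} (count of term t in document d), N (number of documents in the collection), and N_t (number of documents containing t) are notation assumed here for illustration.

```latex
% Assumed notation: n_{t,d} = count of term t in document d,
% N = total number of documents, N_t = number of documents containing t.
\[
  \mathrm{tf}(t, d) = \frac{n_{t,d}}{\sum_{k} n_{k,d}}, \qquad
  \mathrm{idf}(t) = \ln\!\left(\frac{N}{N_t}\right), \qquad
  \text{tf-idf}(t, d) = \mathrm{tf}(t, d) \cdot \mathrm{idf}(t)
\]
```

Under this definition, a term that appears in every document has idf equal to ln(1) = 0, so its tf-idf is zero regardless of how often it occurs.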
