tokens_segment(), which works on tokens objects in the same way as corpus_segment() does on corpus objects (#902).
%>% can now be used with quanteda without needing to attach magrittr (or, as many users apparently believe, the entire tidyverse).
corpus_segment() now behaves more logically and flexibly, and is clearly differentiated from corpus_reshape() in terms of its functionality. Its documentation is also vastly improved. (#908)
data_dictionary_LSD2015, the Lexicoder Sentiment 2015 dictionary (#963).
tokens_lookup() and dfm_lookup() (#960).
head.corpus(), tail.corpus() provide fast subsetting of the first or last documents in a corpus. (#952)
purrr::map() to dfm() (#928).
regex2fixed() and associated functions.
textstat_collocations.tokens() caused by “documents” containing only "" as tokens. (#940)
cbind.dfm() when features shared a name starting with quanteda_options("base_featname") (#946)
quanteda_options(). (#966)
summary.corpus() now generates a special data.frame, which has its own print method, rather than requiring verbose = FALSE to suppress output (#926).
textstat_collocations() is now multi-threaded.
head.dfm(), tail.dfm() now behave consistently with base R methods for matrix, with the added argument nfeature. Previously, these methods printed the subset and invisibly returned it. Now, they simply return the subset. (#952)
textmodel_lsa() for Latent Semantic Analysis.
tokens_select() has a new window argument, permitting selection within an asymmetric window around the pattern of selection. (#521)
tokens_replace() now allows token types to be substituted directly and quickly.
textmodel_affinity() adds functionality to fit the Perry and Benoit (2017) class affinity model.
spacy_parse method for corpus objects. Also restored quanteda methods for spacyr spacy_parsed objects.
textmodel_nb() (#1010), and made output quantities from the fitted NB model regular matrix objects instead of Matrix classes.
tokens_group() is now significantly faster.
The tokenize() function and all methods associated with the tokenizedTexts object types have been removed.
tokens_keep(), dfm_keep(), and fcm_keep(). (#1037)
textmodel_NB() has been replaced by textmodel_nb().
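
A few of the entries in this list can be illustrated with a short sketch. It assumes a recent quanteda installation. The sample texts and the object names (txt, toks, sel, swapped, kept, first) are made up for illustration and do not come from the package, and argument names may differ slightly between quanteda versions.

```r
library(quanteda)

# Hypothetical two-document example; the texts are illustrative only.
txt <- c(d1 = "The quick brown fox jumps over the lazy dog.",
         d2 = "A good result. A bad result.")
toks <- tokens(txt, remove_punct = TRUE)

# Asymmetric window: keep the match plus 1 token before and 2 tokens after it.
sel <- tokens_select(toks, pattern = "fox", window = c(1, 2))

# Direct substitution of one token type for another.
swapped <- tokens_replace(toks, pattern = "quick", replacement = "fast")

# tokens_keep() retains only the tokens matching the pattern.
kept <- tokens_keep(toks, pattern = c("fox", "dog"))

# head() on a corpus returns the first n documents as a corpus.
first <- head(corpus(txt), 1)

print(sel)
print(swapped)
print(kept)
print(ndoc(first))
```

The window is given as c(before, after), so unequal values make the selection asymmetric around each match.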