NEWS.md

- "id", "sentence_id", "date", "word_count" or "texts" will not be accepted even when numeric, to avoid duplicate column names down the line. A clear error message is issued to alert users.
- order() calls on data.frames where needed to avoid CRAN complaints.
- pkgdown website.
- sento_corpus() function that did not always order input correctly by date.
- summary.sento_measures(); the first one prevented printing of document-level weighting schemes, the second one did not remove NAs when averaging over correlations.
- 1970-01-01 is considered day zero).
- plot.sento_measures() function as it distorts graphs of time series with values far away from zero.
- print.sento_corpus() now shows when the corpus is multi-lingual.
- print.sento_corpus().
- warning() calls to message() calls to be more kind to the user.
- corpus object from quanteda >= v2.0.
- "TF"-inspired weights for within-document aggregation except for "TFIDF", and made this option return the same sentiment scores as when using the quanteda package (see the example on https://sentometrics-research.com/sentometrics/articles/examples/sentiment.html).
- compute_sentiment() function.
- as.data.table.sento_corpus(), as.data.frame.sento_corpus(), and as.data.frame.sento_measures().
- plot.attributions() to guarantee the same plotting behaviour after an update of the ggplot2 package that gave buggy output for the geom_area() layer.
- measures_global() into the aggregate.sento_measures() function, adding a do.global argument to enact it.
- peakdates() and peakdocs() functions.
- sento_app() function) in a separate sole-purpose package sentometrics.app (see https://github.com/sborms/sentometrics.app).
- data.table package from Depends to Imports (see https://github.com/Rdatatable/data.table/issues/3076).
- merge.sentiment() function anymore, and modified the merging to give for instance a simple column binding of sentiment methods when all else is equal.
- how argument in the compute_sentiment() function.
- list_valence_shifters.
- do.normalize option to the weights_beta() and weights_exponential() functions.
- do.inverse option to the weights_exponential() function and associated do.inverseExp argument in the ctr_agg() function.
- "squareRootCounts" into "proportionalSquareRoot", "invertedExponential" into "inverseExponential", and "invertedUShaped" into "inverseUShaped".
- compute_sentiment() function can now also do a sentence-level calculation using the bigrams valence shifting approach.
- measures_update(), subset.sento_measures(), as.sentiment(), as.sento_measures(), as.data.table.sentiment(), corpus_summarize(), sento_app(), and aggregate.sento_measures().
- quanteda developers regarding their new corpus object.
- sento_xyz() function into the name of the function (e.g., the sento_measures() function now gives a sento_measures object instead of a sentomeasures object).
- aggregate.sento_measures() (previously measures_merge()) function to take the mean instead of the sum in a particular case.
- get_hows() function for an overview).
- do.sentence argument in the compute_sentiment() function).
- sento_corpus object to do a multi-language sentiment computation (applying different lexicons to texts written in different languages).
- compute_sentiment() function to also take tm SimpleCorpus and VCorpus objects.
- tm and NLP packages to Suggests.
- peakdates().
- peakdocs() function and added a peakdates() function to properly handle the entire functionality of extracting peaks.
- sentiment_bind(), and to_sentiment().
- sentolexicons object.
- lag = 1 in the ctr_agg() function, and set weights to 1 by default for n = 1 in the weights_beta() function.
- abind package from Imports.
- zoo package from Imports, by replacing the single occurrence of the zoo::na.locf() function by the fill_NAs() helper function (written in Rcpp).
- quanteda::docvars() replacement method to a sentocorpus object.
- "x" output element from a sentomodel object (for large samples, this became too memory consuming).
- "howWithin" output element from a sentomeasures object, and simplified a sentiment object into a data.table directly instead of a list.
- do.shrinkage.x argument in the ctr_model() function to a vector argument.
- do.lags argument to the attributions() function, to be able to circumvent the most time-consuming part of the computation.
- sento_measures() function on the uniqueness of the names within and across the lexicons, features and time weighting schemes.
- measures_merge() function that made full merging not possible.
- n argument in the peakdocs() function can now also be specified as a quantile.
- nCore argument in the compute_sentiment() and ctr_agg() functions to 1.
- compute_sentiment.sentocorpus() function as a sentiment object, and modified the aggregate() function to aggregate.sentiment().
- weights_beta(), get_dates(), get_dimensions(), get_measures(), and get_loss_data().
- to_global() to measures_global(), perform_agg() to aggregate(), almons() to weights_almon(), exponentials() to weights_exponential(), setup_lexicons() to sento_lexicons(), retrieve_attributions() to attributions(), plot_attributions() to plot.attributions().
- ctr_merge() function, so that all merge parameters have to be passed on directly to the measures_merge() function.
- center and scale arguments in the scale() function.
- dateBefore and dateAfter arguments to the measures_fill() function, and dropped NA option of its fill argument.
- "beta" time aggregation option (see associated weights_beta() function).
- "attribWeights" element of output sentomeasures object in required measures_xyz() functions.
- "lags") to the attributions() function, and corrected some edge cases.
- lambdas argument to the ctr_model() function, directly passed on to the glmnet::glmnet() function if used.
- do.combine argument in measures_delete() and measures_select() functions to simplify.
- covr to Suggests.
- compute_sentiment() function, by writing part of the code in Rcpp relying on RcppParallel (added to Imports); there are now three approaches to computing sentiment (unigrams, bigrams and clusters).
- dfm argument in the compute_sentiment() and ctr_agg() functions by a tokens argument, and altered the input and behaviour of the nCore argument in these same two functions.
- quanteda package to the stringi package for more direct tokenization.
- list_lexicons and list_valence_shifters built-in word lists by keeping only unigrams, and included the same trimming procedure in the sento_lexicons() function.
- "t" to the list_valence_shifters built-in word list, and reset values of the "y" column from 2 to 1.8 and from 0.5 to 0.2.
- epu built-in dataset with the newest available series, up to July 2018.
- list_valence_shifters[["en"]].
- compute_sentiment() function.
- print() generic for a sentomeasures object.
- "tf-idf" option for within-document aggregation in the ctr_agg() function.
- sento_lexicons() function outputs a sentolexicons object, which the compute_sentiment() function specifically requires as an input; a sentolexicons object also includes a "[" class-preserving extractor function.
- attributions() function outputs an attributions object; the plot_attributions() function is therefore replaced by the plot() generic.
- perform_MCS() function, but the output of the get_loss_data() function can easily be used as an input to the MCSprocedure() function from the MCS package (discarded from Imports).
- parallel and doParallel packages to Suggests, as only needed (if enacted) in the sento_model() function.
- ggthemes from Imports.
- measures_delete(), nmeasures(), nobs(), and to_sentocorpus().
- xyz_measures() to measures_xyz(), extract_peakdocs() to peakdocs().
- do.normalizeAlm argument in the ctr_agg() function (but kept in the almons() function).
- almons() function to be consistent with Ardia et al. (IJF, 2019) paper.
- lexicons to list_lexicons, and valence to list_valence_shifters.
- stats element of a sentomeasures object is now also updated in measures_fill().
- "_eng" to "_en" in list_lexicons and list_valence_shifters objects, to be in accordance with two-letter ISO language naming.
- "valence_language" naming to "language" in list_valence_shifters object.
- compute_sentiment() function now also accepts a quanteda corpus object and a character vector.
- add_features() function now also accepts a quanteda corpus object.
- nCore argument to the compute_sentiment(), ctr_agg(), and ctr_model() functions to allow for (more straightforward) parallelized computations, and omitted the do.parallel argument in the ctr_model() function.
- do.difference argument to the ctr_model() function and expanded the use of the already existing oos argument.
- ggplot2 and foreach to Imports.
- to_global().
- tolower = FALSE of quanteda::dfm() constructor in compute_sentiment().
- intercept argument in ctr_model() to do.intercept for consistency.
- sento_corpus() and add_features().
- diff(), extract_peakdocs(), and subset_measures().
- sentimentr.
- incluce_valence() helper function).
- "proportionalPol").
- dfm argument in ctr_agg().
- select_measures(), but toSelect argument expanded.
- to_global() changed (see vignette).
- add_features(): regex and non-binary (between 0 and 1) allowed.