themis (development version)

  • Added a new article explaining how over_ratio and under_ratio work (#141).

  • All upsampling steps gain an indicator_column argument. When set, a logical column is added to the baked data marking rows added by the step (TRUE) vs rows from the original data (FALSE). For step_rose(), all rows are TRUE since ROSE generates a fully synthetic dataset (#58).

  • step_rose() and rose() now have improved documentation for minority_prop, clarifying that it controls the proportion of synthetic observations from the minority class, and how it differs from over_ratio (#144).

  • Added a standalone rose() function as a thin wrapper around ROSE::ROSE(), making it consistent with the other algorithms in the package that expose a direct implementation alongside their recipe step (#195).

  • step_nearmiss() and step_tomek() gain a distance_with argument to control which variables are used for distance calculations. This allows the steps to be used when non-numeric predictor variables are present in the data (#166).

  • step_adasyn(), step_bsmote(), step_nearmiss(), step_smote(), and step_smotenc() now document the minimum number of observations needed to perform the algorithm (#104).

  • All step_*() functions now correctly handle 0 and 1 row inputs in bake() (#160).

  • adasyn(), bsmote(), nearmiss(), smote(), and tomek() now correctly attribute errors from non-numeric columns to the user-facing function (#181).

  • smotenc() now only suppresses the specific benign warning from gower::gower_topn() about variables with zero range, rather than all warnings (#182).

  • bsmote() now correctly passes the all_neighbors argument to the underlying implementation (#176).

  • step_bsmote() now works correctly when there is only a single predictor (#151).

  • step_downsample() and step_upsample() now correctly handle NA values in the outcome variable instead of erroring (#177).
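
The over_ratio and under_ratio arguments mentioned above can be illustrated with a little arithmetic. The following is a minimal base R sketch, not the package's implementation: the helper names upsample_targets() and downsample_targets() and the class counts are hypothetical, and the use of floor() for rounding is an assumption. It reads over_ratio as the minority-to-majority ratio that smaller classes are sampled up to, and under_ratio as the majority-to-minority ratio that larger classes are sampled down to.

```r
# Hypothetical class counts for a three-level outcome (illustrative only).
counts <- c(majority = 1000, mid = 400, rare = 50)

# Assumed reading of over_ratio: classes smaller than
# over_ratio * (largest class) are sampled up to that target;
# classes already at or above the target are left alone.
upsample_targets <- function(counts, over_ratio = 1) {
  target <- floor(max(counts) * over_ratio)
  pmax(counts, target)
}

# Assumed reading of under_ratio: classes larger than
# under_ratio * (smallest class) are sampled down to that target;
# classes already at or below the target are left alone.
downsample_targets <- function(counts, under_ratio = 1) {
  target <- floor(min(counts) * under_ratio)
  pmin(counts, target)
}

up_half <- upsample_targets(counts, over_ratio = 0.5)
down_two <- downsample_targets(counts, under_ratio = 2)

print(up_half)
print(down_two)
```

With these counts, over_ratio = 0.5 gives a target of 500 rows for the two smaller classes, and under_ratio = 2 gives a target of 100 rows for the two larger classes. The default value of 1 would bring every class to the size of the largest class (upsampling) or the smallest class (downsampling).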

themis 1.0.3

CRAN release: 2025-01-22

Improvements

  • Calling ?tidy.step_*() now sends you to the documentation for step_*() where the outcome is documented. (#142)

  • Documentation now correctly specifies majority-to-minority and minority-to-majority. (#143, #110)

  • Documentation for tidy methods for all steps has been improved to describe the return value more accurately. (#148)

  • All messages, warnings, and errors have been translated to use the {cli} package. (#153, #155)

themis 1.0.2

CRAN release: 2023-08-14

Improvements

  • Many internal changes to improve consistency and slight speed increases.

themis 1.0.1

CRAN release: 2023-04-14

Improvements

  • Fixed bug where some upsampling functions would error if no upsampling was needed. (#119)

  • Steps with tunable arguments now have those arguments listed in the documentation.

themis 1.0.0

CRAN release: 2022-07-02

themis 0.2.2

CRAN release: 2022-05-11

  • Added tomek(), rewritten to apply to multiple classes. This removes the need for the unbalanced package, which has been dropped as a dependency.

themis 0.2.1

CRAN release: 2022-04-13

themis 0.2.0

CRAN release: 2022-03-30

New steps

Improvements and Other Changes

  • Exported the nearmiss() function to users.
  • Update examples to no longer use iris or okc data sets.
  • All recipe steps now officially support empty selections, to be more aligned with dplyr and other packages that use tidyselect. (#55)

Bug fixes

  • step_rose() now correctly allows you to use character variables. (#26)
  • step_tomek() now ignores non-predictor variables when appropriate. (#51)
  • Fix bug where wrong ordering of columns caused error in smote(). (#76)

themis 0.1.4

CRAN release: 2021-06-12

themis 0.1.3

CRAN release: 2020-11-12

  • Steps that use nearest neighbors now give cleaner errors.

themis 0.1.2

CRAN release: 2020-08-14

  • Tunable steps now work properly with the tune package.
  • Steps now retain the original factor level ordering. (#22)
  • Oversampling steps now ignore non-predictor variables when appropriate. (#20)

themis 0.1.1

CRAN release: 2020-05-17

themis 0.1.0

CRAN release: 2020-01-13

  • Added a NEWS.md file to track changes to the package.