
<!-- README.md is generated from README.Rmd. Please edit that file -->

# rank <a href="https://selkamand.github.io/rank/"><img src="man/figures/logo.png" align="right" height="138" alt="rank website" /></a>

<!-- badges: start -->

[![Lifecycle:
experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)
[![CRAN
status](https://www.r-pkg.org/badges/version/rank)](https://CRAN.R-project.org/package=rank)
[![R-CMD-check](https://github.com/selkamand/rank/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/selkamand/rank/actions/workflows/R-CMD-check.yaml)
[![Codecov test
coverage](https://codecov.io/gh/selkamand/rank/graph/badge.svg)](https://app.codecov.io/gh/selkamand/rank)
![GitHub Issues or Pull
Requests](https://img.shields.io/github/issues-closed/selkamand/rank)
[![code
size](https://img.shields.io/github/languages/code-size/selkamand/rank.svg)](https://github.com/selkamand/rank)
![GitHub last
commit](https://img.shields.io/github/last-commit/selkamand/rank)
[![Dependencies](https://tinyverse.netlify.app/badge/rank)](https://cran.r-project.org/package=rank)
[![](http://cranlogs.r-pkg.org/badges/last-month/rank)](https://cran.r-project.org/package=rank)
[![](http://cranlogs.r-pkg.org/badges/grand-total/rank)](https://cran.r-project.org/package=rank)
<!-- badges: end -->

Rank provides a customizable alternative to the built-in `rank()`
function. The package offers the following features:

1.  **Frequency-based ranking of categorical variables**: choose whether
    to rank based on alphabetic order or element frequency.

2.  **Control over sorting order**: use `desc = TRUE` to rank in
    descending rather than ascending order.

## Installation

To install **rank** from CRAN run:

``` r
install.packages("rank")
```

You can install the development version of rank like so:

``` r
# install.packages('remotes')
remotes::install_github("selkamand/rank")
```

## Usage

### Categorical Input

``` r
library(rank)

fruits <- c("Apple", "Orange", "Apple", "Pear", "Orange")

# rank alphabetically
smartrank(fruits)
#> [1] 1.5 3.5 1.5 5.0 3.5

# rank based on frequency
smartrank(fruits, sort_by = "frequency")
#> [1] 2.5 4.5 2.5 1.0 4.5

# rank based on descending order of frequency
smartrank(fruits, sort_by = "frequency", desc = TRUE)
#> [1] 3.5 1.5 3.5 5.0 1.5
```
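
To make the frequency option less opaque, here is a rough base-R sketch that reproduces the ranks shown above without the package. It is not the package's implementation: `freq_rank_sketch()` is a hypothetical helper, and the tie-breaking rule (equally frequent categories ordered alphabetically, reversed when descending) is an assumption inferred from the example output.

``` r
# Hypothetical helper: a base-R approximation of frequency-based ranking.
# Assumption: categories with equal counts are ordered alphabetically
# (reversed when desc = TRUE), as the example output above suggests.
freq_rank_sketch <- function(x, desc = FALSE) {
  counts <- table(x)  # count each category
  # Order categories by count, then by name
  lev <- names(counts)[order(as.integer(counts), names(counts))]
  if (desc) lev <- rev(lev)  # most frequent category first
  rank(match(x, lev))        # elements of the same category share the mean rank
}

fruits <- c("Apple", "Orange", "Apple", "Pear", "Orange")
asc_ranks  <- freq_rank_sketch(fruits)
desc_ranks <- freq_rank_sketch(fruits, desc = TRUE)
```

Mapping each element to the position of its category and then calling `rank()` is what gives repeated categories a shared, averaged rank.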

### Numeric Input

``` r
# rank numerically
smartrank(c(1, 3, 2))
#> [1] 1 3 2

# rank numerically based on descending order
smartrank(c(1, 3, 2), desc = TRUE)
#> [1] 3 1 2
```
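
For this numeric input, the values above coincide with what base R's `rank()` returns, and the descending case can be reproduced by negating the input. The snippet is only a base-R cross-check for this particular vector, not a description of how `smartrank()` works internally.

``` r
x <- c(1, 3, 2)

asc  <- rank(x)   # ascending ranks
desc <- rank(-x)  # negating the input flips the order
```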

### Sorting By Rank

We can use `order()` to sort vectors based on their ranks. For example,
we can sort the `fruits` vector based on the frequency of each element.

``` r
fruits <- c("Apple", "Orange", "Apple", "Pear", "Orange")
ranks <- smartrank(fruits, sort_by = "frequency")
fruits[order(ranks)]
#> [1] "Pear"   "Apple"  "Apple"  "Orange" "Orange"
```

### Ranking and reordering by priority values

`rank_by_priority()` assigns the top ranks (1, 2, ...) to the specified
values, in the order given, while all remaining values share a single
tied rank after them.  
`reorder_by_priority()` uses those ranks to move priority values to the
front of the vector.

``` r
# Prioritise D first, then C; A and B tie for the remaining rank
rank_by_priority(c("A", "B", "C", "D"), priority_values = c("D", "C"))
#> [1] 3.5 3.5 2.0 1.0

# Reorder so priorities come first
reorder_by_priority(c("A", "B", "C", "D"), priority_values = c("D", "C"))
#> [1] "D" "C" "A" "B"
```
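
One way to picture what these two functions do is with base R's `match()`. The sketch below is an illustration under that assumption, not the package's actual code; `priority_pos` and the other variable names are introduced here for the example.

``` r
x <- c("A", "B", "C", "D")
priority_values <- c("D", "C")

# Position of each element in the priority list (NA if it is not a priority value)
priority_pos <- match(x, priority_values)

# Non-priority elements all get the same position, after every priority value
priority_pos[is.na(priority_pos)] <- length(priority_values) + 1

priority_ranks <- rank(priority_pos)  # tied elements share the mean rank
reordered <- x[order(priority_pos)]   # tied elements keep their original order
```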

### Stratified / hierarchical ranking

`rank_stratified()` computes a single combined rank across all columns
of a data frame, where each column is ranked within groups defined by
all previous columns. This produces a true hierarchical ordering.

``` r
data <- data.frame(
  gender = c("male", "male", "male", "male", "female", "female", "male", "female"),
  pet    = c("cat", "cat", "magpie", "magpie", "giraffe", "cat", "giraffe", "cat")
)

# Hierarchical ranking:
# 1. Rank gender (globally, by frequency)
# 2. Within each gender, rank pet by within-gender frequency
r <- rank_stratified(
  data,
  sort_by = c("frequency", "frequency"),
  desc    = TRUE
)

data[order(r), ]
#>   gender     pet
#> 3   male  magpie
#> 4   male  magpie
#> 1   male     cat
#> 2   male     cat
#> 7   male giraffe
#> 6 female     cat
#> 8 female     cat
#> 5 female giraffe
```

`rank_stratified()` can be used to arrange data frames based on one or
more columns, while maintaining complete control over how each column
contributes to the final row order.

#### Base R

For example, we can sort the following data frame by the frequency of
each fruit, breaking ties by the alphabetical order of the picker.

``` r
data <- data.frame(
  fruits = c("Apple", "Orange", "Apple", "Pear", "Orange"),
  picker = c("Elizabeth", "Damian", "Bob", "Cameron", "Alice")
)

# rank_stratified():
# 1. Rank fruits by frequency (globally)
# 2. Within each fruit, rank pickers alphabetically
strat_ranks <- rank_stratified(
  data,
  cols = c("fruits", "picker"),
  sort_by = c("frequency", "alphabetical"),
  desc = c(TRUE, FALSE)
)

data[order(strat_ranks), ]
#>   fruits    picker
#> 5 Orange     Alice
#> 2 Orange    Damian
#> 3  Apple       Bob
#> 1  Apple Elizabeth
#> 4   Pear   Cameron
```
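
For intuition, the same row order can be reproduced in base R by building one sort key per column and passing the keys to `order()` from most to least significant. This is a sketch, not how `rank_stratified()` is implemented; the variable names are illustrative, and the reverse-alphabetical tie-break between equally frequent fruits is an assumption read off the output above.

``` r
data <- data.frame(
  fruits = c("Apple", "Orange", "Apple", "Pear", "Orange"),
  picker = c("Elizabeth", "Damian", "Bob", "Cameron", "Alice"),
  stringsAsFactors = FALSE
)

# Key 1: fruits from most to least frequent
counts <- table(data$fruits)
fruit_levels <- rev(names(counts)[order(as.integer(counts), names(counts))])
fruit_key <- match(data$fruits, fruit_levels)

# Key 2: picker name, which only matters among rows with the same fruit key
row_order <- order(fruit_key, data$picker)

# Invert the permutation to get one rank per row (this data has no tied rows)
sketch_ranks <- order(row_order)

sorted <- data[row_order, ]
```

Because later keys are consulted only when earlier keys are equal, each column is effectively ordered within the groups defined by the columns before it.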

#### Tidyverse Integration

An equivalent way to hierarchically sort data frames is to pass the
ranks to dplyr's `arrange()` function.

``` r
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

arrange(
  data,
  rank_stratified(
    data,
    cols = c("fruits", "picker"),
    sort_by = c("frequency", "alphabetical"),
    desc = c(TRUE, FALSE)
  )
)
#>   fruits    picker
#> 1 Orange     Alice
#> 2 Orange    Damian
#> 3  Apple       Bob
#> 4  Apple Elizabeth
#> 5   Pear   Cameron
```

## Contributing

See
[CONTRIBUTING.md](https://github.com/selkamand/rank/blob/main/CONTRIBUTING.md).
