Prune group sizes
prune_groups(
  x,
  max_group_size = 5,
  dedup = TRUE,
  discardby = "length",
  prefer = NULL,
  quiet = FALSE
)
x: A DNAbin or DNAStringSet object.
max_group_size: The maximum number of sequences with the same taxonomic annotation to keep.
dedup: Whether sequences with an identical taxonomic name and an identical nucleotide sequence should be discarded first.
discardby: How sequences should be discarded from groups larger than max_group_size. Options are "length" (the default), which discards sequences from shortest to longest until the group is reduced to max_group_size, and "random", which randomly picks sequences to discard until the group is reduced to max_group_size.
prefer: A vector of sequence names to prefer when subsampling groups with discardby = "random", or when breaking ties between sequences of the same length with discardby = "length". For instance, high-quality in-house sequences.
quiet: Whether progress messages should be suppressed rather than printed to the console.
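The behaviour described for dedup = TRUE and discardby = "length" can be illustrated with a minimal base R sketch. This is not the package's implementation: prune_sketch is a hypothetical helper written for this example, it works on a plain named character vector rather than a DNAbin or DNAStringSet object, and it ignores prefer and quiet. Names are assumed to hold the taxonomic annotation and values the nucleotide sequence.

```r
# Hypothetical helper for illustration only; not the package's code.
# `seqs` is a named character vector: names are taxonomic annotations,
# values are nucleotide sequences.
prune_sketch <- function(seqs, max_group_size = 5, dedup = TRUE) {
  groups <- names(seqs)

  if (dedup) {
    # Drop entries that repeat both the annotation and the bases
    keep <- !duplicated(paste(groups, seqs, sep = "\t"))
    seqs <- seqs[keep]
    groups <- groups[keep]
  }

  # Sort longest first, so the shortest sequences are the ones discarded
  ord <- order(nchar(seqs), decreasing = TRUE)
  seqs <- seqs[ord]
  groups <- groups[ord]

  # Rank each sequence within its group and keep the top max_group_size
  rank_in_group <- ave(seq_along(seqs), groups, FUN = seq_along)
  seqs[rank_in_group <= max_group_size]
}

seqs <- c(
  "Genus speciesA" = "ACGTACGT",
  "Genus speciesA" = "ACGTACGT",  # exact duplicate, removed by dedup
  "Genus speciesA" = "ACGT",      # shortest in its group, discarded
  "Genus speciesA" = "ACGTAC",
  "Genus speciesB" = "ACG"        # group already within the limit
)

pruned <- prune_sketch(seqs, max_group_size = 2)
print(pruned)
```

In this toy input the first group holds three distinct sequences after deduplication, so with max_group_size = 2 only its shortest member is dropped, while the single-member group is left unchanged.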