Probabilities of sharing a rank as a function of sequence identity

lca_probs(
  x,
  method = "mbed",
  k = 5,
  nstart = 20,
  ranks = c("kingdom", "phylum", "class", "order", "family", "genus", "species"),
  delim = ";"
)

Arguments

x

A DNAbin object, or an object coercible to DNAbin.

method

The method used to compute the distance matrix. "mbed" computes the distance from each sequence to a subset of 'seed' sequences, using the method outlined in Blackshields et al. (2010); this scales well to large datasets. Alternatively, "kdist" computes the full n * n distance matrix.

k

integer giving the k-mer size used to generate the input matrix for k-means clustering.
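
To make the role of k concrete, the sketch below counts overlapping k-mers in a single sequence using base R only. It is an illustration of what a k-mer count is, not the code this function runs internally; the helper name count_kmers and the example sequence are made up for this example.

```r
# Hypothetical helper (not part of the package): count the overlapping
# k-mers of length k in one DNA string.
count_kmers <- function(dna, k = 5) {
  n <- nchar(dna)
  if (n < k) return(table(character(0)))
  starts <- seq_len(n - k + 1)
  # substring() is vectorised over start/stop, giving one k-mer per position
  kmers <- substring(dna, starts, starts + k - 1)
  table(kmers)
}

# Made-up 10-base sequence; with k = 5 it contains 10 - 5 + 1 = 6 k-mers.
counts <- count_kmers("ACGTACGTAC", k = 5)
counts
```

Stacking one such count vector per sequence gives a numeric matrix of the kind that k-means clustering can take as input. Each extra base in k multiplies the number of possible k-mers by four, so the matrix widens quickly as k grows.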

nstart

Value passed to the nstart argument of kmeans. Higher values increase computation time but can improve clustering accuracy considerably.
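
The effect of nstart can be seen by calling stats::kmeans directly. The sketch below uses a small random matrix as a stand-in for a k-mer count matrix; the data and object names are made up, and this is not how the function prepares its input.

```r
set.seed(1)

# Toy data: two groups of 20 points in two dimensions (made-up values
# standing in for rows of a k-mer count matrix).
m <- rbind(
  matrix(rnorm(40, mean = 0), ncol = 2),
  matrix(rnorm(40, mean = 5), ncol = 2)
)

# nstart = 1 runs the algorithm from a single random set of centres;
# nstart = 20 runs it from 20 random sets and keeps the best solution.
fit_one  <- kmeans(m, centers = 2, nstart = 1)
fit_many <- kmeans(m, centers = 2, nstart = 20)

# Lower total within-cluster sum of squares means a tighter clustering.
c(one = fit_one$tot.withinss, many = fit_many$tot.withinss)
```

On well-separated data such as this, a single start is usually enough. The extra starts matter more when clusters overlap and the algorithm can settle in a poor local optimum.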

ranks

The taxonomic ranks currently assigned to the names, listed in the order in which they appear.

delim

The delimiter used between ranks
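
As an illustration of how ranks and delim relate to each other, the sketch below splits a delimited lineage string into named ranks and finds the lowest rank that two lineages share. It assumes the lineage is a plain delimited string with one field per rank; the actual name format expected by the function may differ (for example, it may carry an accession prefix). The helper lowest_shared_rank and both lineages are hypothetical.

```r
ranks <- c("kingdom", "phylum", "class", "order", "family", "genus", "species")
delim <- ";"

# Made-up lineage strings, one field per rank.
lin_a <- "Animalia;Arthropoda;Insecta;Diptera;Drosophilidae;Drosophila;Drosophila melanogaster"
lin_b <- "Animalia;Arthropoda;Insecta;Diptera;Drosophilidae;Scaptomyza;Scaptomyza flava"

# Split on the literal delimiter and label each field with its rank.
parts <- strsplit(lin_a, delim, fixed = TRUE)[[1]]
names(parts) <- ranks
parts[["genus"]]

# Hypothetical helper: the lowest rank at which two lineages still agree.
lowest_shared_rank <- function(a, b, ranks, delim = ";") {
  pa <- strsplit(a, delim, fixed = TRUE)[[1]]
  pb <- strsplit(b, delim, fixed = TRUE)[[1]]
  n <- min(length(pa), length(pb), length(ranks))
  same <- pa[seq_len(n)] == pb[seq_len(n)]
  # Number of leading ranks that agree before the first mismatch
  depth <- if (all(same)) n else which(!same)[1] - 1
  if (depth == 0) NA_character_ else ranks[depth]
}

shared <- lowest_shared_rank(lin_a, lin_b, ranks, delim)
shared
```

Here the two lineages agree down to family and diverge at genus, so the lowest shared rank is family.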