This function is intended to be used alongside a hierarchical classifier such as IDTAXA or RDP to assign additional species-level matches. It is designed to be a more flexible version of dada2's assignSpecies function.
blast_assign_species(
query,
db,
type = "blastn",
identity = 97,
coverage = 95,
evalue = 1e+06,
max_target_seqs = 5,
max_hsp = 5,
ranks = c("Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species"),
delim = ";",
args = NULL,
quiet = FALSE,
remove_db_gaps = TRUE
)
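As a rough illustration of the signature above, a call that spells out a few of the defaults might look like the following sketch. The two file paths are hypothetical placeholders, and the guard assumes the package providing blast_assign_species is already attached and that BLAST is installed; nothing is assumed here about the structure of the returned object.

```r
# Hypothetical input files; replace with your own query and reference FASTA files
query_path <- "query_asvs.fasta"
db_path <- "reference_db.fasta"

# Only attempt the search if the function and both input files are available
if (exists("blast_assign_species", mode = "function") &&
    file.exists(query_path) && file.exists(db_path)) {
  species_hits <- blast_assign_species(
    query = query_path,
    db = db_path,
    type = "blastn",
    identity = 97,        # minimum percent identity
    coverage = 95,        # minimum percent query coverage
    max_target_seqs = 5,  # keep several hits per query, not just one
    ranks = c("Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species"),
    delim = ";"
  )
  print(head(species_hits))
}
```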
query: (Required) Query sequence. Accepts a DNAbin object, DNAStringSet object, character string, or filepath.
db: (Required) Reference sequences to search against. Accepts a DNAbin object, DNAStringSet object, character string, or filepath. If a DNAbin, DNAStringSet, or character string is provided, a temporary FASTA file is used to construct the BLAST database.
type: (Required) Type of search to conduct; default is 'blastn'.
identity: (Required) Minimum percent identity cutoff. Note that this is calculated using all alignments for each query-subject match.
coverage: (Required) Minimum percent query coverage cutoff. Note that this is calculated using all alignments for each query-subject match.
evalue: (Required) Maximum expect value (E) for saving hits.
max_target_seqs: (Required) Number of aligned sequences to keep. Even if you only want the single top hit, keep this higher so that the calculations perform properly.
max_hsp: (Required) Maximum number of HSPs (alignments) to keep for any single query-subject pair.
ranks: (Required) The taxonomic ranks contained in the FASTA headers.
delim: (Required) The delimiter between taxonomic ranks in the FASTA headers.
args: (Optional) Extra arguments passed to BLAST.
quiet: (Optional) Whether to suppress progress messages printed to the console; default is FALSE.
remove_db_gaps: Whether gaps should be removed from the FASTA file used for the database. Note that makeblastdb can fail if there are too many gaps in the sequence.
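To make the roles of the identity and coverage cutoffs and of the ranks and delim arguments more concrete, the base R sketch below filters a toy table of hits and splits a delimited header into named ranks. The column names (pident, qcovs), the header layout, the toy values, and the helper split_ranks are all assumptions made for illustration; this is not the function's internal implementation.

```r
# Toy BLAST-style hits; column names and values are made up for illustration
hits <- data.frame(
  query = c("asv1", "asv1", "asv2"),
  subject = c(
    "Animalia;Arthropoda;Insecta;Diptera;Tephritidae;Bactrocera;Bactrocera tryoni",
    "Animalia;Arthropoda;Insecta;Diptera;Tephritidae;Bactrocera;Bactrocera dorsalis",
    "Animalia;Arthropoda;Insecta;Diptera;Drosophilidae;Drosophila;Drosophila suzukii"
  ),
  pident = c(99.1, 96.2, 98.0),  # percent identity
  qcovs = c(100, 100, 90),       # percent query coverage
  stringsAsFactors = FALSE
)

ranks <- c("Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species")
delim <- ";"

# Keep only hits that meet both cutoffs (values match the defaults in the usage)
keep <- hits$pident >= 97 & hits$qcovs >= 95
filtered <- hits[keep, , drop = FALSE]

# Hypothetical helper: split each delimited header into one column per rank
split_ranks <- function(headers, ranks, delim) {
  parts <- strsplit(headers, delim, fixed = TRUE)
  tax <- do.call(rbind, lapply(parts, function(p) p[seq_along(ranks)]))
  colnames(tax) <- ranks
  as.data.frame(tax, stringsAsFactors = FALSE)
}

tax <- split_ranks(filtered$subject, ranks, delim)
print(cbind(filtered["query"], tax[c("Genus", "Species")]))
```

In this toy table only the first hit passes both cutoffs: the second falls below the identity threshold and the third below the coverage threshold.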