WebBLAST

A Julia interface to the NCBI BLAST web API.

Exported functions

webblast

webblast(sequence::String, threshold::Float64=0.005, provider::Symbol=:ncbi, cached=false)

Calls the API of the given provider to search for a protein sequence and returns all hits within an E-value threshold.

sequence The sequence to search for.

threshold Threshold for the E-value of hits to be returned. Default is 0.005.

provider A provider for a BLAST REST API, e.g. NCBI/EBI BLAST. Default is :ncbi (EBI is yet to be implemented). :ncbi searches for the sequence in the PDB.

cached Whether results are cached, which is useful during development, for example. Default is false.

webblast(sequence::Array{AminoAcid,1}, threshold::Float64=0.005, provider::Symbol=:ncbi, cached=false)

Same as above, but takes an Array of BioSeq AminoAcid values as input.

Types

webblast returns an Array of Hit values. Hit and the nested Hsp type are defined as:


type Hit
    hit_num::Int
    id::String
    def::String
    accession::String
    len::Int

    hsps::Array{Hsp,1}
end

type Hsp
    hsp_num::Int
    bitScore::Float64
    evalue::Float64
    queryFrom::Int
    queryTo::Int
    queryFrame::Int
    hitFrame::Int
    identity::Int
    positive::Int
    gaps::Int
    alignLen::Int

    qseq::Array{AminoAcid,1}
    qseq_str::String

    hseq::Array{AminoAcid,1}
    midline::Array{AminoAcid,1}
end

These types are a 1:1 adaptation of the hits returned in the XML result of the NCBI BLAST web API.
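As a rough illustration of how these fields relate, the sketch below picks the lowest E-value HSP of each hit, keeps hits within a threshold, and computes percent identity. MiniHsp, MiniHit, best_hsp and percent_identity are hypothetical stand-ins that carry only a subset of the fields above; they are not part of WebBLAST. The sample values are invented, and the code uses the current struct keyword rather than the older type keyword shown above.

```julia
# Hypothetical, reduced stand-ins for Hsp and Hit (not the package's own types).
struct MiniHsp
    evalue::Float64
    identity::Int
    alignLen::Int
end

struct MiniHit
    accession::String
    hsps::Vector{MiniHsp}
end

# Percent identity of one HSP: identical positions over alignment length.
percent_identity(hsp::MiniHsp) = 100 * hsp.identity / hsp.alignLen

# The HSP with the lowest E-value of a hit.
best_hsp(hit::MiniHit) = hit.hsps[argmin([h.evalue for h in hit.hsps])]

# Invented sample data, for illustration only.
hits = [
    MiniHit("HYPO_A", [MiniHsp(1e-30, 90, 100), MiniHsp(0.5, 10, 40)]),
    MiniHit("HYPO_B", [MiniHsp(0.01, 30, 120)]),
]

threshold = 0.005

# Keep only hits whose best HSP is within the E-value threshold.
significant = filter(hit -> best_hsp(hit).evalue <= threshold, hits)

for hit in significant
    hsp = best_hsp(hit)
    println(hit.accession, ": E-value ", hsp.evalue,
            ", identity ", percent_identity(hsp), "%")
end
```

With the real types, the same field names (evalue, identity, alignLen, hsps) would presumably be read from the Array that webblast returns. That is an assumption based on the definitions above, not something tested against the package.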

fastarepresentation

fastarepresentation(hit::Hit)

Returns a FastaIO style representation of the hit.

Use case

First we load a FASTA file with FastaIO:

using FastaIO
fasta = readall(FastaReader("examples/fasta/il4.fasta"))

We can then use a sequence from the FASTA file to search BLAST. We choose the second sequence:

using BiomolecularStructures.WebBLAST
results = webblast(fasta[2][2])

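FastaIO reads a FASTA file into an array in which each sequence is stored as a tuple of (description::String, sequence::String). That is why we use fasta[2][2] to get the second sequence: the first index selects the second entry, and the second index selects its sequence string.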
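The indexing can be tried without a file by building an array with the same layout by hand. This is only a sketch: both entries below are invented placeholders, not the contents of il4.fasta.

```julia
# Hand-built array in the (description, sequence) layout described above.
# Both entries are made-up placeholders.
fasta = [
    ("first placeholder entry", "MKTAYIAK"),
    ("second placeholder entry", "MGLTSQLL"),
]

# fasta[2] is the second entry as a tuple; destructure it into its two parts.
description, sequence = fasta[2]

println(description)
println(fasta[2][2] == sequence)   # fasta[2][2] is the sequence string
```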