Download NCBI Genomes
Genomes can be downloaded from NCBI and are stored as a .zip file in an Flink field in the database.
Make sure that the Flink field is present in the wanted table before starting the download.
See Create new Flink field for how to create the field and then add it to the table view.
-
To download genomes for specific records, first select those records in the grid.
-
To open the Download NCBI Genomes tool, under Molecular Tools, in the Genomes group, click Download NCBI Genomes.
-
The NCBI Genomes Download window opens.
-
Select the data source: choose one of the two data sources, set the options listed below, and click the Next button:
-
Records to be included:
-
Highlighted records: Selected records in the grid only.
-
All records: All records in the current result set. Note that if a search for specific records was done and returned 10 records, then 'all' means those 10 records.
-
List of fields that should be queried in NCBI (default is strains only)
-
Maximum number of genomes to be retrieved per query (default is 100)
-
API key from NCBI
-
If you do not have your own API key from NCBI, click on the text Get API key to retrieve one. Paste the newly retrieved key in the text box.
-
If you already have your own API key from NCBI, enter it in the text box.
-
Note: it is also possible to leave the text box empty. It will work but will be slightly slower.
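The speed difference comes from NCBI's request limits: the public E-utilities interface allows about 3 requests per second without an API key and about 10 per second with one. The sketch below only illustrates that effect and is not how BioloMICS is implemented. The function names `build_esearch_url` and `min_delay_seconds`, the choice of the `assembly` database, and the strain number in the usage note are assumptions made for this example.

```python
from typing import Optional
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term: str, api_key: Optional[str] = None, retmax: int = 100) -> str:
    """Build a search URL; the api_key parameter is only added when a key is given."""
    params = {"db": "assembly", "term": term, "retmax": retmax}
    if api_key:
        params["api_key"] = api_key
    return ESEARCH_URL + "?" + urlencode(params)


def min_delay_seconds(api_key: Optional[str] = None) -> float:
    """Minimum pause between requests: 10 per second with a key, 3 per second without."""
    return 1 / 10 if api_key else 1 / 3
```

For example, `build_esearch_url("CBS 123.45")` produces a URL without an `api_key` parameter, and a client that respects the limits would then wait at least `min_delay_seconds()` between requests.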
-
Select the field(s) containing terms to be used for queries on NCBI. The strain name can be used, but so can any other text field containing strain numbers that are published together with the genomes on NCBI. When a text field contains multiple collection numbers that are split by a given separator, add this separator in the last column for the selected field. In this case we will select the Other collection number field and separate the data with a semicolon (;). Then click Next.
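To make the role of the separator concrete, here is a minimal sketch of how one field value could be turned into individual query terms. It is an illustration only: the function name `split_terms` and the example collection numbers are hypothetical, and the actual splitting done by BioloMICS may differ in details such as whitespace handling.

```python
from typing import List


def split_terms(field_value: str, separator: str = ";") -> List[str]:
    """Split a field value into query terms, dropping empty entries and extra spaces."""
    if not separator:
        # No separator configured: the whole value is a single term.
        term = field_value.strip()
        return [term] if term else []
    return [part.strip() for part in field_value.split(separator) if part.strip()]
```

With a semicolon as separator, a value such as `"CBS 123.45; ATCC 9876"` yields two terms, each of which would be queried separately.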
-
Select the field in which to store the downloaded genomes. All Flink fields in the currently opened table are listed; check the one that should store the Genomes zip file. Then click Next.
-
Start searching NCBI
-
Click on the Start button to start the search for matching genomes. Note that when a large number of records is selected in the grid, this step can take some time to process.
-
When all available genomes are found and displayed, check the one(s) to be imported into the database. Then click on Next.
-
Decide if the results should replace or add to the existing data in the selected (Genomes) field.
-
Replace existing genomes: If the (Genomes) field already contains data, that data will be removed and the new data will be saved.
-
Keep existing genomes and append new ones: If the (Genomes) field already contains data, that data will be kept and the new data will be added. Then click Next.
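The difference between the two options can be sketched as follows. This is only a model of the described behavior, with genomes represented as a list of file names; the function name `store_genomes` and the file names are hypothetical, and whether BioloMICS skips duplicates when appending is not stated here, so the sketch does not attempt it.

```python
from typing import List


def store_genomes(existing: List[str], downloaded: List[str], replace: bool) -> List[str]:
    """Return the new field content for the 'replace' or 'keep and append' option."""
    if replace:
        # Replace existing genomes: old content is discarded.
        return list(downloaded)
    # Keep existing genomes and append new ones.
    return list(existing) + list(downloaded)
```

For a field that already holds `["old.fna"]`, downloading `["new.fna"]` gives `["new.fna"]` with the replace option and `["old.fna", "new.fna"]` with the append option.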
-
All the found and checked genomes will now be downloaded and saved in the (Genomes) field. Click Finish to close the popup window.
-
A search can then be done to find all the records that have a genome, and the KMER genome clustering tool can be used to classify the genome data.
This movie shows how to download NCBI genomes and run KMER genome clustering in BioloMICS.
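As background on what a k-mer comparison involves, the sketch below breaks two sequences into overlapping substrings of length k and compares the resulting sets with a Jaccard distance. This is a generic illustration, not the algorithm used by the KMER genome clustering tool, whose available algorithms are not described in this section; the function names are hypothetical.

```python
from typing import Set


def kmer_set(sequence: str, k: int) -> Set[str]:
    """Collect all overlapping substrings of length k from a sequence."""
    seq = sequence.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """Return 1 minus the shared fraction of k-mers; 0.0 means identical sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)
```

Computing such a distance for every pair of genomes gives a distance matrix, which is the kind of input a clustering tree is built from.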
1. Database structure for genomes (0:10)
2. Select records in grid (0:46)
3. Download NCBI genomes tool (0:54)
a. Select data source (1:00)
b. Include fields with terms to search for in NCBI (1:35)
c. Select field to store genomes (1:47)
d. Start searching NCBI (1:53)
e. Download options (2:31)
f. Downloading genomes (2:39)
g. Finish (2:44)
4. Search for all records with genome linked (2:57)
5. KMER Genome Clustering (3:14)
a. Select data source (3:24)
b. Select field containing genomes (3:35)
c. Select KMER size, Sampling, and the preferred algorithms (3:40)
d. Comparing genomes (4:13)
6. Matrix and Clustering tree (4:22)