
General comments

 
Polyphasic identification and classification modules are available and permit a wide range of comparisons.
 
Several two- or three-dimensional graphical displays are available to represent the results of similarity-based or correlation-based comparison methods combined with clustering techniques. For more information on the algorithms and the compilation of local similarity coefficients, see the descriptions of the different types of fields/characters and the Computation of global similarity coefficient section. Comparison methods can be used singly or in combination to take advantage of the strengths of each technique.
 
For example, one can first perform a preliminary selection of records based on a basic or an advanced query, then compare or identify an unknown against the selected set of records, then analyze by agglomerative clustering some of the groups obtained in the previous steps and draw a dendrogram or a 3D display.
 
In identification procedures, pairwise similarity comparisons of an unknown against a series of OTUs (operational taxonomic units) produce ordered lists in which the best-matching OTUs appear first.
Interpretation of such lists is usually straightforward. Problems can arise when several candidates have similar or identical similarity coefficients. This can be alleviated by showing the characters that differ and letting the users/experts decide for themselves which option is most likely.
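
The ranking step can be illustrated with a minimal Python sketch. It assumes binary characters and a simple matching coefficient, which is only one possible similarity measure and not necessarily the one BioloMICS computes. The names `simple_matching`, `identify` and the OTU data are hypothetical.

```python
def simple_matching(a, b):
    """Fraction of characters with identical states (assumed coefficient)."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def identify(unknown, references):
    """Return (name, similarity, differing character indices), best match first."""
    ranked = []
    for name, states in references.items():
        sim = simple_matching(unknown, states)
        diffs = [i for i, (x, y) in enumerate(zip(unknown, states)) if x != y]
        ranked.append((name, sim, diffs))
    # Highest similarity first; ties are broken alphabetically by name.
    ranked.sort(key=lambda r: (-r[1], r[0]))
    return ranked


# Hypothetical reference records: one list of character states per OTU.
references = {
    "OTU_A": [1, 1, 0, 1, 0],
    "OTU_B": [1, 0, 0, 1, 1],
    "OTU_C": [0, 0, 1, 0, 1],
}
unknown = [1, 1, 0, 1, 1]

ranking = identify(unknown, references)
for name, sim, diffs in ranking:
    print(f"{name}: similarity={sim:.2f}, differing characters={diffs}")
```

In this example OTU_A and OTU_B tie at the top of the list, and the lists of differing character indices are what an expert would inspect to choose between them.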
 
In (phenetic) classification procedures, similarity comparisons between the members of a given group produce square similarity or distance matrices that can be difficult to interpret or summarize, especially when the number of OTUs is large. Many clustering methods are available to group items properly. This is not the place to discuss those methods in detail, but both divisive and agglomerative clustering are used in our software solution.
 
In divisive clustering (K-means or Virtual Centre Analysis), all objects initially belong to a single cluster, which is gradually broken down into smaller clusters. The division process stops when a number of groups decided a priori is obtained (note: other stopping criteria can be applied). This a priori decision is rather subjective and difficult to make. Groupings are never definitive: objects can be clustered together at a certain level, then be separated and (possibly) later regrouped with some members of the original group. This property is a major advantage of divisive over agglomerative methods, particularly when the number of objects to be clustered is large (see below). Divisive clustering is computationally intensive and is not widely used, because it provides no graphical representation of the relative positions of the clusters or of the objects within them.
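
The regrouping behaviour described above can be seen in a plain K-means sketch. This is a generic textbook version, not the BioloMICS implementation: the function `kmeans`, the naive choice of the first `k` points as starting centres, and the sample points are all assumptions made for illustration.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def kmeans(points, k, iterations=20):
    """Assign points to k groups; k is the a priori number of groups."""
    centres = [list(p) for p in points[:k]]  # naive deterministic start
    assignment = [None] * len(points)
    for _ in range(iterations):
        new_assignment = [
            min(range(k), key=lambda c: euclidean(p, centres[c])) for p in points
        ]
        if new_assignment == assignment:
            break  # no object changed group: stop
        assignment = new_assignment
        # Move each centre to the mean of its current members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centres[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centres


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
assignment, centres = kmeans(points, k=2)
print(assignment)
```

With this starting configuration the point (0, 1) is first placed with the distant points and is moved to the nearby group on the next pass, which is the kind of reassignment an agglomerative method cannot perform once a merge has been made.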
 
There are also ordination methods such as Principal Coordinate Analysis (PCoA). For small groups of OTUs, we use one of the numerous agglomerative clustering methods implemented in our BioloMICS software solution (UPGMA, UPGMC, WPGMA, WPGMC, Single and Complete Linkages, Ward, Lance & Williams flexible method, and Neighbor Joining). These agglomerative methods can be combined with both divisive and ordination methods to take advantage of the properties of each, which allows more accurate grouping and greater flexibility.
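
As an illustration of one of the listed methods, here is a minimal UPGMA (average linkage) sketch working directly on a distance matrix. It is a generic version written for readability, not the BioloMICS code; the function name `upgma`, the nested-label output and the example matrix are assumptions.

```python
def upgma(names, dist):
    """Agglomerate with average linkage and return the list of merges.

    names labels the rows of dist, a symmetric matrix (list of lists).
    Each merge is (left label, right label, distance at which they join).
    """
    n = len(names)
    clusters = {i: (names[i], 1) for i in range(n)}  # id -> (label, size)
    d = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}
    merges = []
    next_id = n
    while len(clusters) > 1:
        # Pick the closest pair of current clusters.
        (a, b), height = min(d.items(), key=lambda kv: kv[1])
        (la, na), (lb, nb) = clusters.pop(a), clusters.pop(b)
        # UPGMA update: size-weighted average of the two old distances.
        new_d = {}
        for c in clusters:
            dac = d[(min(a, c), max(a, c))]
            dbc = d[(min(b, c), max(b, c))]
            new_d[(c, next_id)] = (na * dac + nb * dbc) / (na + nb)
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        d.update(new_d)
        clusters[next_id] = (f"({la},{lb})", na + nb)
        merges.append((la, lb, height))
        next_id += 1
    return merges


names = ["A", "B", "C", "D"]
dist = [
    [0, 2, 6, 10],
    [2, 0, 6, 10],
    [6, 6, 0, 8],
    [10, 10, 8, 0],
]
merges = upgma(names, dist)
for left, right, height in merges:
    print(f"{left} + {right} at distance {height:.2f}")
```

The list of merges with their joining distances is the information a dendrogram displays. Other linkage rules in the list above differ mainly in how the distance from a new cluster to the remaining ones is updated.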
 
Another type of comparison that is implemented is what we call "Functional analysis". This method groups objects/records and their characters/fields at the same time using agglomerative clustering methods. Objects are compared with one another on the basis of a selection of characters, and the resulting distance matrix is represented as a phenotypic tree using one of the available agglomerative clustering methods. The characters are first normalized and reduced (note that DNA or protein sequences cannot be used in this method), then correlated with one another to obtain a correlation matrix, which is also displayed as a tree using one of the available agglomerative clustering methods. The result is finally displayed as a double tree. The first tree is vertical and shows the relationships between the objects/records. The second tree is horizontal and shows the groups of characters/fields that are positively (or negatively, depending on the display method used: "positive" or "negative" clustering) correlated. Between the two trees, a colored "heat map" shows the states of the characters/fields for the different objects/records. With this method, it is easier to isolate characters or groups of characters that are associated with some groups of objects. It is also possible to infer possible relations between different types of criteria, for example between pathogenicity and a given physiological feature or the activity of a gene on a microarray.
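
The character side of this analysis can be sketched as follows. The sketch assumes that "normalized and reduced" means centering each character on its mean and scaling it to unit standard deviation, and that Pearson correlation is used. Both are assumptions, as is the conversion of correlations to distances with `1 - r` for "positive" clustering. The function names and the data table are hypothetical.

```python
from statistics import mean, pstdev


def standardize(column):
    """Center on the mean and scale to unit (population) standard deviation."""
    m, s = mean(column), pstdev(column)
    return [(x - m) / s for x in column]  # assumes the character is not constant


def correlation_matrix(table):
    """Pearson correlations between characters.

    table: rows are objects/records, columns are characters/fields.
    """
    cols = [standardize(list(c)) for c in zip(*table)]
    n = len(table)
    # For standardized columns, Pearson r is the mean product of the z-scores.
    return [[sum(x * y for x, y in zip(ci, cj)) / n for cj in cols] for ci in cols]


# Hypothetical data: 4 objects described by 3 numeric characters.
table = [
    [1, 2, 9],
    [2, 4, 7],
    [3, 6, 4],
    [4, 8, 1],
]
corr = correlation_matrix(table)

# Assumed conversion for "positive" clustering: strongly positively
# correlated characters get a small distance.
char_dist = [[1 - r for r in row] for row in corr]
for row in corr:
    print([round(r, 3) for r in row])
```

A matrix such as `char_dist` could then be passed to any agglomerative method to build the character tree, while the object tree is built from the object-to-object distances.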
 
In the analyses mode, several options are available:
 
  • Identification: This view allows one or more records to be compared against another set of records coming from the same table or from any other table.
    Note that if the records to be compared do not come from the same table, only fields equivalent to fields of the current table will be used for the comparisons.
 
  • Agglomerative clustering: This view allows three or more records to be clustered using one of the available hierarchical clustering methods (UPGMA, UPGMC, WPGMA, WPGMC, Single and Complete Linkages, Ward, Lance & Williams flexible method, and Neighbor Joining). Records may come from different tables.
    Note that if the records to be compared do not come from the same table, only fields equivalent to fields of the current table will be used for the comparisons.
     
  • Multiple Dimensional Scaling analysis
     
  • Virtual Center Analysis (divisive clustering) – will be implemented in the near future
     
  • Functional analysis – will be implemented in the near future