C - Algorithms
Three algorithms can be used to handle C characters: the Symmetric Proportional (SYMP), the Symmetric Strict (SYMS) and the Asymmetric (ASYM) algorithms.
All three algorithms first perform a basic comparison of the states of the character. For each state, the following values can be reported (report file):
Reference state (value) \ Source state (value) | ? (0) | Yes/No (1) | Yes (2) | No (3)
? (0)                                          | NaN   | NaN        | NaN     | NaN
Yes/No (1)                                     | NaN   | 1          | 1       | 1
Yes (2)                                        | NaN   | 1          | 1       | 0
No (3)                                         | NaN   | 1          | 0       | 1
NaN means "Not a Number". This is a value that cannot be computed.
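The basic comparison above can be read as a lookup table indexed by the coded values of the two states. The following is a minimal sketch of that idea, not the software's actual implementation; the names SYM_TABLE and compare_state are hypothetical and only serve the illustration.

```python
import math

NAN = math.nan

# State codes as given in the table: 0 = ?, 1 = Yes/No, 2 = Yes, 3 = No.
# SYM_TABLE[reference][source] is the value reported for one state.
SYM_TABLE = [
    [NAN, NAN, NAN, NAN],  # reference ?
    [NAN, 1.0, 1.0, 1.0],  # reference Yes/No
    [NAN, 1.0, 1.0, 0.0],  # reference Yes
    [NAN, 1.0, 0.0, 1.0],  # reference No
]


def compare_state(reference, source):
    """Return 1.0 (match), 0.0 (mismatch) or NaN (cannot be computed)."""
    return SYM_TABLE[reference][source]


print(compare_state(2, 3))  # reference Yes against source No
print(compare_state(0, 2))  # reference ? against source Yes
```

In this table, swapping reference and source never changes the result, which is what makes the basic comparison symmetrical.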
SYMP (Symmetric proportional):
When comparing two OTUs with the SYMP algorithm, the similarity coefficient (Sm) will be:
Sm = Matching states / Total states accounted
At least one state has to be encoded for the character to be taken into account. Unknown states are not counted.
SYMP produces comparisons that are symmetrical (Sab=Sba), but not necessarily transitive.
The final Sm is proportional to the number of states in common. Suitable for classification purposes.
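The SYMP rule can be sketched as follows. This is an illustration under stated assumptions, not the software's own code: a state is assumed to be "accounted" when its basic comparison value is not NaN, and Sm is assumed to be NaN when no state is accounted. The names SYM_TABLE and symp are hypothetical.

```python
import math

NAN = math.nan

# Basic comparison table, SYM_TABLE[reference][source].
# State codes: 0 = ?, 1 = Yes/No, 2 = Yes, 3 = No.
SYM_TABLE = [
    [NAN, NAN, NAN, NAN],
    [NAN, 1.0, 1.0, 1.0],
    [NAN, 1.0, 1.0, 0.0],
    [NAN, 1.0, 0.0, 1.0],
]


def symp(reference_states, source_states):
    """Sm = matching states / total states accounted (assumed NaN if none)."""
    if len(reference_states) != len(source_states):
        raise ValueError("both OTUs must encode the same number of states")
    values = [SYM_TABLE[r][s] for r, s in zip(reference_states, source_states)]
    # Unknown states give NaN and are left out of the count.
    accounted = [v for v in values if not math.isnan(v)]
    if not accounted:
        return NAN
    return sum(accounted) / len(accounted)


# Four states: one match, one mismatch, one unknown, one Yes/No match.
# Two of the three accounted states match.
print(symp([2, 3, 2, 1], [2, 2, 0, 3]))
```

Because the unknown state is dropped from both the numerator and the denominator, it neither raises nor lowers the coefficient.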
SYMS (Symmetric Strict):
When comparing two OTUs with the SYMS algorithm, the similarity coefficient (Sm) will be:
Sm = 0, if at least one of the states differs between the two OTUs under comparison.
Sm = 1, if none of the states differ between the two OTUs under comparison.
At least one state has to be encoded for the character to be taken into account. Unknown states are not counted.
SYMS produces comparisons that are symmetrical (Sab=Sba), but not necessarily transitive.
The final Sm is not proportional to the number of states in common. Suitable for classification purposes.
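A minimal sketch of the SYMS rule, under the same assumptions as for the basic comparison: states whose comparison value is NaN are skipped, and Sm is assumed to be NaN when no state can be accounted. The names SYM_TABLE and syms are hypothetical and not taken from the software.

```python
import math

NAN = math.nan

# Basic comparison table, SYM_TABLE[reference][source].
# State codes: 0 = ?, 1 = Yes/No, 2 = Yes, 3 = No.
SYM_TABLE = [
    [NAN, NAN, NAN, NAN],
    [NAN, 1.0, 1.0, 1.0],
    [NAN, 1.0, 1.0, 0.0],
    [NAN, 1.0, 0.0, 1.0],
]


def syms(reference_states, source_states):
    """Sm = 0 if any accounted state differs, 1 if none differ."""
    if len(reference_states) != len(source_states):
        raise ValueError("both OTUs must encode the same number of states")
    values = [SYM_TABLE[r][s] for r, s in zip(reference_states, source_states)]
    accounted = [v for v in values if not math.isnan(v)]
    if not accounted:
        return NAN  # assumption: nothing encoded, so nothing to compute
    return 0.0 if any(v == 0.0 for v in accounted) else 1.0


print(syms([2, 3, 2], [2, 3, 0]))  # unknown state ignored, the rest agree
print(syms([2, 3, 2], [2, 2, 2]))  # a single differing state is enough for 0
```

Unlike SYMP, one mismatch among many matching states already brings the coefficient down to 0.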
ASYM (Asymmetric):
When comparing two OTUs with the ASYM algorithm, the similarity coefficient (Sm) will be:
Reference state (value) \ Source state (value) | ? (0) | Yes/No (1) | Yes (2) | No (3)
? (0)                                          | NaN   | NaN        | NaN     | NaN
Yes/No (1)                                     | NaN   | 1          | 1       | 1
Yes (2)                                        | NaN   | 1          | 1       | 1
No (3)                                         | NaN   | 1          | 0       | 1
NaN means “Not a Number”. This is a value that cannot be computed.
Sm = 0, if at least one of the states differs between the two OTUs under comparison.
Sm = 1, if none of the states differ between the two OTUs under comparison.
At least one state has to be encoded for the character to be taken into account. Unknown states are not counted.
ASYM produces comparisons that are not symmetrical (Sab<>Sba) and not necessarily transitive.
The final Sm is not proportional to the number of states in common.
Suitable for identification purposes.
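The asymmetry can be illustrated by applying the strict rule to the ASYM table, where the result depends on which OTU is the reference and which is the source. This is only a sketch under assumptions: NaN states are skipped, Sm is assumed to be NaN when no state is accounted, and the names ASYM_TABLE and asym are hypothetical.

```python
import math

NAN = math.nan

# ASYM comparison table, ASYM_TABLE[reference][source].
# State codes: 0 = ?, 1 = Yes/No, 2 = Yes, 3 = No.
ASYM_TABLE = [
    [NAN, NAN, NAN, NAN],  # reference ?
    [NAN, 1.0, 1.0, 1.0],  # reference Yes/No
    [NAN, 1.0, 1.0, 1.0],  # reference Yes
    [NAN, 1.0, 0.0, 1.0],  # reference No
]


def asym(reference_states, source_states):
    """Sm = 0 if any accounted state differs per ASYM_TABLE, else 1."""
    if len(reference_states) != len(source_states):
        raise ValueError("both OTUs must encode the same number of states")
    values = [ASYM_TABLE[r][s] for r, s in zip(reference_states, source_states)]
    accounted = [v for v in values if not math.isnan(v)]
    if not accounted:
        return NAN  # assumption: nothing encoded, so nothing to compute
    return 0.0 if any(v == 0.0 for v in accounted) else 1.0


a = [2, 2]  # OTU a: Yes, Yes
b = [2, 3]  # OTU b: Yes, No
print(asym(a, b))  # a as reference, b as source
print(asym(b, a))  # b as reference, a as source
```

With these two OTUs, the coefficient is 1 when a is the reference and 0 when b is the reference, which is the Sab<>Sba behaviour described above.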