::FAQ
1. How do I interpret a database search result? What are d’ and ROC?
2. What d’ value is good enough?
| 1. How do I interpret a database search result? What are d’ and ROC? | |
| The XProteo search engine uses signal detection theory to evaluate search results. It reports a result quality index, d’ (d prime), and a ROC (receiver operating characteristic) curve as measures of the quality of a database search result. d’ is the normalized distance, in units of standard deviation, between the distribution of the assumed correct candidate protein(s) and the distribution of randomly matched proteins. The ROC curve shows the relationship between the probability of correct identification (Hit prob.) and the false alarm probability (FA prob.). A good search result is characterized by a large d’ value and a ROC curve that bends toward the upper left corner. | |
| 2. What d’ value is good enough? | |
| It depends. Although a larger d’ corresponds to a more confident protein identification, it is difficult to set a single threshold for certainty of identification. XProteo currently uses d’ = 4 as the threshold. At d’ = 4 the hit probability is about 0.99 at a false alarm rate of 0.05. In practice you may choose your own threshold. You may accept a smaller d’ value for an exploratory project, where you do not want to miss any protein. Conversely, you may set a higher d’ threshold for a confirmatory project, where you want to guard against false alarms (false positives). To help you decide, XProteo estimates the hit probability (Hit prob.) and false alarm probability (FA prob.) in the form of a ROC curve, and reports the hit probability at a given false alarm rate (PFA=xx, where xx is a user-adjustable value). XProteo also calculates the 95% confidence interval for the estimated d’ (shown in the help balloon when you place the mouse cursor over a d’ value). |
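To make the relationship between d’, Hit prob. and FA prob. concrete, here is a minimal Python sketch. It assumes the textbook equal-variance Gaussian model of signal detection theory, in which both distributions are normal with the same standard deviation; XProteo's actual estimation procedure is not described here and may differ. The function names `d_prime`, `hit_probability` and `roc_curve` are illustrative and are not part of XProteo.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
STD_NORMAL = NormalDist()


def d_prime(mean_correct: float, mean_random: float, sigma: float) -> float:
    """Distance between the two distribution means in units of a shared
    standard deviation (assumption: both distributions have the same sigma)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (mean_correct - mean_random) / sigma


def hit_probability(dprime: float, false_alarm: float) -> float:
    """Hit probability at a given false alarm probability (0 < false_alarm < 1)."""
    # The false alarm probability fixes the decision criterion on the
    # random-match distribution; shifting it by d' gives the hit probability.
    return STD_NORMAL.cdf(dprime + STD_NORMAL.inv_cdf(false_alarm))


def roc_curve(dprime: float, points: int = 99) -> list[tuple[float, float]]:
    """(FA prob., Hit prob.) pairs at evenly spaced false alarm probabilities."""
    rates = [(i + 1) / (points + 1) for i in range(points)]
    return [(fa, hit_probability(dprime, fa)) for fa in rates]


if __name__ == "__main__":
    # Under this model, d' = 4 at a false alarm rate of 0.05 gives a hit
    # probability of about 0.99, in line with the figure quoted above.
    print(f"{hit_probability(4.0, 0.05):.3f}")
```

Under the same assumption, lowering the d’ value passed to `roc_curve` pulls the curve away from the upper left corner toward the diagonal, which is why a small d’ indicates a weaker search result.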