Nowadays it is difficult for many people to get a job, and I am also one of the unemployed. Such people have enough time to spend reading my poor writing, so this vignette is made for such unemployed people. Please, please me! Please employ me (heart shout)!

Graphical User Interface (GUI)

Execute the R function `BayesianFROC::fit_GUI_simple()` or `BayesianFROC::fit_GUI()`.

Either provides a graphical user interface which lets the reader know what this package is without this poorly written vignette. Do not read this bothersome (:'-D) vignette, but execute the above!

In this vignette, we provide Bayesian models for FROC analysis. Our work was inspired by a classical 1989 paper of Dev Chakraborty, who treated the maximum likelihood case. In classical FROC analysis, modality comparison is discussed in terms of ANOVA. In our new methods, we use a hyper-parameter representing AUC. The author hopes to take up some of these applications in radiology.

- What is FROC? **Data** involving **hits** and **false alarms**
- Modeling
- What is modality\({}^{\dagger}\) **comparison**?
- Hierarchical model to **compare modalities** in the case of MRMC\({}^{\dagger \dagger}\)
- Bayesian **ANOVA** via *Bayes factor*
- Appendix

\({}^{\dagger}\) *Modality* means a treatment or an imaging method such as MRI, CT, PET, etc.

\({}^{\dagger \dagger}\) *MRMC* is a traditional abbreviation for multiple reader and multiple *case*, where *case* means modality in this context. Note that *case* means image (radiograph) in ordinary contexts.

FROC analysis provides an evaluation of observer performance. Fitting an FROC model, we obtain an AUC representing how well a doctor can detect lesions in radiographs (images).

If images were taken by different methods, MRI, CT, PET, … (called

*modalities*), then the question arises:

\[\color{green}{\textit{which imaging method is best to detect signal such as lesion or nodule? }}\]

For example, if the AUC of MRI is *higher* than that of CT, then we may say that MRI is *better* than CT at detecting lesions. This is the modality comparison issue. So, we want to obtain, e.g., one of the following orderings:

\[\color{red}{\textit{MRI > PET > CT >.. }}\] or

\[\color{red}{\textit{PET > CT > MRI >.. }}\]

or

\[\color{red}{\textit{CT > MRI > PET >.. }}\] or ….

Another approach is to test the following null hypothesis \(H_0\):

\[ H_0: \color{red}{\textit{All modalities have the same ability to find nodules.}} \]

Radiographs are associated with the notion of *modality*, and there are two cases.

An example of modality is the imaging method: MRI, CT, PET, etc. Which imaging method is better at detecting lesions? This is one context of the modality comparison issue.

If radiographs of patients are grouped into a treatment group (case) and an untreated group (control), then in this context modality means the **efficacy** of treatment. If the AUC of the treatment images is greater than that of the untreated images, then we may say that the treatment is effective.

Anyway, the aim of FROC analysis is to compare modalities.

In the following sections, we explain what the following table means and how it arises.

Confidence Level | No. of Hits | No. of False alarms |
---|---|---|
5 = definitely present | \(H_{5}\) | \(F_{5}\) |
4 = probably present | \(H_{4}\) | \(F_{4}\) |
3 = questionable | \(H_{3}\) | \(F_{3}\) |
2 = subtle | \(H_{2}\) | \(F_{2}\) |
1 = very subtle | \(H_{1}\) | \(F_{1}\) |

where \(H_1,H_2,...,H_c,...,H_5\) and \(F_1,F_2,...,F_c,...,F_5\) are non-negative integers, representing \(\text{H}\)its and \(\text{F}\)alse alarms.

- hit = True Positive = TP
- false alarm = False Positive = FP

In the next section, we give the definitions of \(H_c\) and \(F_c\).

From the next section on, please keep the notations \(H_c,F_c\) in mind.

- \(\color{red}{\text{Images}}\), which contain shadows indicating signal or noise. An image is also called a **case** in other books. **Signal** means a **nodule**, **lesion**, diseased region, etc. **Non-diseased** images and **diseased** images are prepared; the former contain no signals and the latter contain at least one nodule (signal).
- \(\color{red}{\text{Readers}}\), who try to detect lesions from shadows in radiographs. A reader is also called an **observer**.
  - A reader identifies his suspicious locations from the shadows in each image, with his confidence level.
  - Given an image, the researcher asks the reader where lesions exist.
  - If the reader thinks "Oh! Definitely a lesion!", then he marks the location with the number 5. And if he thinks "What the hell, is it another lesion? Subtle, but … a lesion?", then he marks a low confidence number, e.g., 2 or 1.
  - When the reader has marked all his suspicious locations, the researcher gives the reader a new image.
  - Go back to the localization and rating steps.
  - If the reader thinks there is *one more* lesion, then he can answer *several* suspicious locations for each image, so the answer is *not* dichotomous; this differs from ROC analysis.
- \(\color{red}{\text{A Gold Standard of Diagnosis}}\), identifying the true lesion locations for all images. One can classify each reader's suspicious location as **TP** (true positive) or **FP** (false positive), according to whether the reader's location is *correct* within some tolerance.

In this package, we assume that each image has specified true lesion locations, evaluated based on the gold standard, and that the reader does not know these locations.

If the user's dataset is not in the above format but in the Jafroc xlsx format, then use `BayesianFROC::convertFromJafroc()`.

Radiologists try to detect lesions from radiographs. In the FROC task, there are radiographs, readers, and a researcher who knows the truth (the *gold standard*).

The **reader** marks his suspicious locations with his confidence level.\({}^{\dagger}\)

- In each image, *mark the locations* which he thinks to be lesions.
- Also, write a number indicating his *confidence level* as follows:
  - **5** = definitely lesion
  - **4** = probably lesion
  - **3** = hmm … *probably* lesion
  - **2** = subtle
  - **1** = very subtle

By the *gold standard* of diagnosis, the researcher evaluates the reader's answers: if the reader correctly marked a suspicious location for some true lesion location, then it is called a hit (true positive). If not, then the researcher records such a mis-marked location as a false alarm (false positive). Doing this over all radiographs, we obtain FROC data.

\({}^{\dagger}\) If the reader thinks **one more** lesion exists in a **single** image, then he can mark **one more** location in that image. This differs from *ROC* analysis, which allows each reader only a **dichotomous** answer on whether a lesion exists.

Suppose that \(\color{red}{\text{two red circles}}\) indicate true lesions, i.e., signals, which should be detected by the reader. (I am not a doctor; they may **not** be lesions. This is only a **fictitious** assumption.) Of course, such red circles do not exist when the reader tries to find lesions.

The reader does the following two things:

- **Localize** his *suspicious* lesions.
- **Rate** his **confidence** level from the following numbers:
  - **5** = *definitely* lesion
  - **4** = *probably* lesion
  - **3** = hmm … *probably* lesion
  - **2** = subtle
  - **1** = *very* subtle

In the figure, each yellow triangle means that the reader thinks it is a lesion, with his confidence level denoted by the inner number.

- Each yellow triangle indicates a reader's suspicious location, and one of them (with confidence number 5) is very near one of the red circles indicating true lesion locations; thus we may count it as a *hit with confidence level 5*.

The other marks are false alarms, because they are far from the true lesion locations denoted by the two red circles.

Summarizing the above,

Confidence Level | No. of Hits | No. of False alarms |
---|---|---|
5 = definitely present | \(H_{5}\color{red}{=1}\) | \(F_{5}\color{red}{=0}\) |
4 = probably present | \(H_{4}\color{red}{=0}\) | \(F_{4}\color{red}{=0}\) |
3 = equivocal | \(H_{3}\color{red}{=0}\) | \(F_{3}\color{red}{=1}\) |
2 = subtle | \(H_{2}\color{red}{=0}\) | \(F_{2}\color{red}{=0}\) |
1 = very subtle | \(H_{1}\color{red}{=0}\) | \(F_{1}\color{red}{=2}\) |

Note that this is done for only *one* image; in an actual trial there are *many images*, so the numbers of hits and false alarms become larger.

Summarizing for the **1st** image as in the above example:

Confidence Level | No. of Hits | No. of False alarms |
---|---|---|
5 = definitely present | \(H_{5}\color{red}{=1}\) | \(F_{5}\color{red}{=0}\) |
4 = probably present | \(H_{4}\color{red}{=0}\) | \(F_{4}\color{red}{=0}\) |
3 = equivocal | \(H_{3}\color{red}{=0}\) | \(F_{3}\color{red}{=1}\) |
2 = subtle | \(H_{2}\color{red}{=0}\) | \(F_{2}\color{red}{=0}\) |
1 = very subtle | \(H_{1}\color{red}{=0}\) | \(F_{1}\color{red}{=2}\) |

Summarizing for the **2nd** image (not shown above):

Confidence Level | No. of Hits | No. of False alarms |
---|---|---|
5 = definitely present | \(H_{5}\color{red}{=0}\) | \(F_{5}\color{red}{=0}\) |
4 = probably present | \(H_{4}\color{red}{=1}\) | \(F_{4}\color{red}{=1}\) |
3 = equivocal | \(H_{3}\color{red}{=0}\) | \(F_{3}\color{red}{=0}\) |
2 = subtle | \(H_{2}\color{red}{=1}\) | \(F_{2}\color{red}{=1}\) |
1 = very subtle | \(H_{1}\color{red}{=0}\) | \(F_{1}\color{red}{=1}\) |

Summarizing for the **3rd** image (not shown above):

Confidence Level | No. of Hits | No. of False alarms |
---|---|---|
5 = definitely present | \(H_{5}\color{red}{=0}\) | \(F_{5}\color{red}{=0}\) |
4 = probably present | \(H_{4}\color{red}{=0}\) | \(F_{4}\color{red}{=3}\) |
3 = equivocal | \(H_{3}\color{red}{=1}\) | \(F_{3}\color{red}{=1}\) |
2 = subtle | \(H_{2}\color{red}{=0}\) | \(F_{2}\color{red}{=3}\) |
1 = very subtle | \(H_{1}\color{red}{=0}\) | \(F_{1}\color{red}{=3}\) |

Summarizing for the **4th** image, for the **5th** image, for the **6th** image, …, and so on, until finally summarizing for the \(N_I\)-th image (the last image), we finish the FROC task.

Summing over \(\color{red}{\textit{ all }}\) tables, i.e., summing (pooling) over *all* images, we get, e.g., the following FROC data:

Confidence Level | No. of Hits | No. of False alarms |
---|---|---|
5 = definitely present | \(H_{5}\color{red}{=1+0+0 \cdots =112}\) | \(F_{5}\color{red}{=0+0+0 \cdots =21}\) |
4 = probably present | \(H_{4}\color{red}{=0+1+0 \cdots =56}\) | \(F_{4}\color{red}{=0+1+3 \cdots =33}\) |
3 = equivocal | \(H_{3}\color{red}{=0+0+1 \cdots =44}\) | \(F_{3}\color{red}{=1+0+1 \cdots =65}\) |
2 = subtle | \(H_{2}\color{red}{=0+1+0 \cdots = 32}\) | \(F_{2}\color{red}{=0+1+3 \cdots =78}\) |
1 = very subtle | \(H_{1}\color{red}{= 0+0+0 \cdots =11}\) | \(F_{1}\color{red}{=2+1+3 \cdots =122}\) |

Consequently, we obtain the pooled FROC data:

Confidence Level | No. of Hits | No. of False alarms |
---|---|---|
5 = definitely present | \(H_{5}\color{red}{=112}\) | \(F_{5}\color{red}{=21}\) |
4 = probably present | \(H_{4}\color{red}{=56}\) | \(F_{4}\color{red}{=33}\) |
3 = equivocal | \(H_{3}\color{red}{=44}\) | \(F_{3}\color{red}{=65}\) |
2 = subtle | \(H_{2}\color{red}{=32}\) | \(F_{2}\color{red}{=78}\) |
1 = very subtle | \(H_{1}\color{red}{=11}\) | \(F_{1}\color{red}{=122}\) |
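The pooling step can be sketched in base R; the matrices below simply transcribe the example counts from the three per-image tables above, and `colSums()` performs the pooling:

```r
# Hits and false alarms per image; columns ordered c = 5, 4, 3, 2, 1
# (values transcribed from the three example tables above).
H_img <- rbind(img1 = c(1, 0, 0, 0, 0),
               img2 = c(0, 1, 0, 1, 0),
               img3 = c(0, 0, 1, 0, 0))
F_img <- rbind(img1 = c(0, 0, 1, 0, 2),
               img2 = c(0, 1, 0, 1, 1),
               img3 = c(0, 3, 1, 3, 3))

colSums(H_img)  # pooled hits  H_5, ..., H_1 over the first three images
colSums(F_img)  # pooled false alarms F_5, ..., F_1
```

Pooling the remaining \(N_I - 3\) images in the same way yields the totals in the table above.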


Note that the following monotonicity condition is not strictly required. However, generally speaking, a high confidence level would generate a larger number of hits than a low confidence level.

Above we used 5 confidence levels, but in the following example we use 3.

```
#1) Build data for a single reader and single modality case.
dataList <- list(
c=c(3,2,1), # c is ignored, can omit.
h=c(97,32,31),
f=c(1,14,74),
NL=259,
NI=57,
C=3
)
# where,
# c denotes confidence level, each components indicates that
# 3 = Definitely lesion,
# 2 = subtle,
# 1 = very subtle
# h denotes number of hits
# (True Positives: TP) for each confidence level,
# f denotes number of false alarms
# (False Positives: FP) for each confidence level,
# NL denotes number of lesions (signal),
# NI denotes number of images,
BayesianFROC::viewdata(dataList)
#>
#> * Number of Lesions: 259
#>
#> * Number of Images : 57
#>
#>
#> . Confidence.Level False.Positives True.Positives
#> ------------------- ----------------- ---------------- ---------------
#> Obviouly present 3 1 97
#> Relatively obvious 2 14 32
#> Subtle 1 74 31
#>
#>
#> * Higher number of confidence level indicates reader's higher confidence. In your case, the number 3 is the most high confidence level, i.e., we may say that confidence level 3 means that "definitely lesion is present "
```

To create or import datasets, the following functions are available:

`dataset_creator_new_version()`

`create_dataset()`

`convertFromJafroc()`

The third function, `convertFromJafroc()`, converts the Jafroc format into the author's format.

There are two cases, that is,

- single reader and single modality,
- multiple reader and multiple modality.

In order to build up the theory of FROC analysis in a simple manner, we first explain the former case.

It might be difficult for some readers, but do not worry if you cannot understand it, since it is all my responsibility.

In **conventional** statistical notation, the parameter of a model is denoted by \(\theta \in \mathbb{R}^n\) and a likelihood by \(f(y|\theta)\) for future data \(y\).

In the FROC model, we may write \[\theta = (\mu,\sigma, z_1; dz_1,dz_2,...,dz_{C-1}) \in \mathbb{R}^{C+2}.\]

In the following we show what these notations mean.

If the number of confidence levels is five (\(C=5\)), then the dataset for FROC analysis is as follows:

Confidence Level | No. of Hits | No. of False alarms |
---|---|---|
5 = definitely present | \(H_{5}\) | \(F_{5}\) |
4 = probably present | \(H_{4}\) | \(F_{4}\) |
3 = equivocal | \(H_{3}\) | \(F_{3}\) |
2 = subtle | \(H_{2}\) | \(F_{2}\) |
1 = very subtle | \(H_{1}\) | \(F_{1}\) |

In the above table, \(H_1,H_2,...,H_5\) and \(F_1,F_2,...,F_5\) are non-negative integers. The confidence level is sometimes called a *rating* in other books.

Note that, in the conventional likelihood notation \(f(y|\theta)\), the data \(y\) are given as follows:

\[y=(H_1,H_2,...,H_C;F_1,F_2,...,F_C;N_L,N_I), \] where \(H_c \in \mathbb{N}\) is the number of hits with the \(c\)-th confidence level, \(F_c \in \mathbb{N}\) is the number of false alarms with the \(c\)-th confidence level, \(N_L\) is the number of lesions (signals), and \(N_I\) is the number of images (trials).

Let \((\theta,\theta')=(\theta_1,\theta_2,\theta_3,\dots;\theta'_1,\theta'_2,\dots)\) denote model parameters.

Then it is natural to assume that

\[ H_c \sim \text{Binomial}(p_c(\theta),N_L),\\ F_c \sim \text{Poisson}(q_c(\theta)).\\ \] To determine the hit rate \(p_c(\theta)\) and the false alarm rate \(q_c(\theta)\), we use two density functions, denoted \(P(\cdot|\theta')\) and \(Q(\cdot|\theta')\), such that

\[ p_c(\theta) = \int_{\theta_c}^{\theta_{c+1}}P(x|\theta')dx,\\ q_c(\theta) = \int_{\theta_c}^{\theta_{c+1}}Q(x|\theta')dx. \]

Note that the upper and lower limits of integration are common to both the hit rate and the false alarm rate. In the next section we shall determine these functions \(P\) and \(Q\). We call \(P\) the signal distribution and \(Q\) the noise distribution.

Note that a random variable distributed according to \(P\) or \(Q\) is traditionally called a latent variable in this context.

In the next section, we will see that \(P\) is a Gaussian and \(Q(z) = N_I \times \frac{d \log \Phi(z)}{dz}\), where \(\Phi\) is the cumulative distribution function of the standard Gaussian and \(N_I\) is the number of images (radiographs).

Of course, these are not determined a priori, and the process of deducing them is not precisely logical. Thus, I do not like the following explanation from my mathematical standpoint. But …

Anyway, I do not use the conventional notation \(\theta\), but instead use \((\mu,\sigma, z_1, z_2,...,z_{C})\), in the following manner:

\[ p_c = \int_{z_c}^{z_{c+1}}P(x|\mu,\sigma)dx,\\ q_c = \int_{z_c}^{z_{c+1}}Q(x)dx. \]

Or more concretely,

\[ p_c = \int_{z_c}^{z_{c+1}}\text{Gaussian}(x|\mu,\sigma)dx,\\ q_c = \int_{z_c}^{z_{c+1}}N_I \times \frac{d \log \Phi(z)}{dz}dz. \]
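As a numerical sketch in base R (all parameter values below are made up for illustration), both formulas can be evaluated with `pnorm`, using `Inf` as the upper limit for the highest confidence level:

```r
# Hypothetical parameter values (C = 3 confidence levels).
mu <- 1; sigma <- 2
z  <- c(0.5, 1.0, 1.8)            # thresholds z_1 < z_2 < z_3
C  <- length(z); NI <- 57         # NI = number of images

# Hit rates: p_c = Phi((z_{c+1} - mu)/sigma) - Phi((z_c - mu)/sigma),
# with p_C = 1 - Phi((z_C - mu)/sigma) for the highest level.
p <- c(pnorm((z[-1] - mu)/sigma) - pnorm((z[-C] - mu)/sigma),
       1 - pnorm((z[C] - mu)/sigma))

# False alarm rates: q_c = N_I * [log Phi(z_{c+1}) - log Phi(z_c)],
# with upper limit Inf (so log Phi = 0) for the highest level.
q <- NI * diff(log(pnorm(c(z, Inf))))

sum(p)   # strictly less than 1; the remainder is the "no mark" probability
```

Note that `q` agrees with \((\lambda_c-\lambda_{c+1})N_I\) for \(\lambda_c=-\log\Phi(z_c)\), the relation derived later in this vignette.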

In the following, we shall see how these hit rates and false alarm rates are determined. Of course, the Gaussian and the differential logarithmic Gaussian can be replaced, and such changes generate various FROC models. The author did not change these two distributions, since the author's health condition is very, very bad. Really, a few years later, I will be dead. This R script will be my last work.

The FROC model is not well fitted to some families of data, and then we need to change the above two distributions. Please find which datasets are not fitted well via the GUI `fit_GUI_simple()` or `fit_GUI()`.

Statistical modeling here amounts to calculating the probability of observing data \((H_c,F_c) \in \mathbb{N} \times\mathbb{N}, c=1,2,...,5\), given a number of lesions \(N_L \in \mathbb{N}\) and a number of images \(N_I\in \mathbb{N}\).

FROC data are very simple: the data are hits \(H_c\) and false alarms \(F_c\).

First we focus on the *false positive fraction* per image. In traditional statistics, it is natural to assume that false alarms are distributed according to a Poisson distribution, that is,

\[ \frac{F_c+F_{c+1}+...+F_C}{N_I} \sim \text{Poisson}(\lambda_c),\] where \(\lambda_c\) is a non-negative real number representing a false alarm rate. Seeing the left-hand side, someone might wonder "what the hell is this quantity?", but I am afraid it is a famous quantity in ROC and FROC analysis, called the False Positive Fraction (FPF) per image. So, please love it.

However, the false positive fraction, that is, the left-hand side, is *not* a natural number; thus, **instead of the above**, we assume the following:

\[F_c+F_{c+1}+...+F_C \sim \text{Poisson}(\lambda_cN_I)\]

or equivalently (see Appendix),

\[F_{c } \sim \text{Poisson} ( (\lambda _{c} -\lambda _{c+1} )\times N_{I} ),\]

where \(\lambda_c\) is a non-negative number.

Especially,

\[F_{C } \sim \text{Poisson} ( (\lambda _{C} -0 )\times N_{I} ),\]

that is, \(\lambda_{C+1}=0\).
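The equivalence can be checked directly: assuming the \(F_c\) are independent, the additivity of independent Poisson variables makes the sum telescope,

\[ F_c+F_{c+1}+\dots+F_C \sim \text{Poisson}\Bigl( \sum_{k=c}^{C}(\lambda _{k} -\lambda _{k+1} )\times N_{I}\Bigr) = \text{Poisson}(\lambda_c N_I), \]

since \(\lambda_{C+1}=0\).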

We may say that \(\lambda_c\) is the false alarm rate per image generating the \(c\)-th FPF, or False Positives per Image (FPI).

Next, we shall model the hits. Because each hit is generated by some lesion, we can count the number of hits per lesion. We denote the number of hits at the \(l^{\text{th}}\) lesion with the \(c^{\text{th}}\) confidence level by \(H_{c,l}\).

It is natural to assume the following Bernoulli assumption: \[ H_{c,l}\sim \text{Bernoulli} ( p_{c} ), \] where \(p_c\) is a hit rate.

Summing over the subscript \(l\) yields \[ H_{c } \sim \text{Binomial} ( p_{c}, N_{L} ). \] This is a statistical model for hits.

To determine the rates \(p_c\) and \(\lambda_c\), we use two latent random variables, denoted \(Y\) and \(W\). The word *latent* indicates that they can be neither observed nor measured in an FROC trial.

We may think of the latent Gaussian random variable as something like a biomarker. The biomarker plays two roles here: one is the same as the confidence level, and the other is as the latent variable.

The author first considered using two distributions, one associated with the lesions and the other with the images. But now I think that is wrong, or redundant.

Anyway, assume the following signal distribution for the latent random variable \(Y\):

\[ Y \sim \text{Normal}(\mu,\sigma ^2) \] with thresholds \(z_1 < z_2 < ... < z_{c} < ... < z_C\), where \(z_c \in \mathbb{R}\).

Let us denote the density function of the Gaussian with mean \(m\) and variance \(v\) by \(N_{m,v}(\cdot)\), that is:

\[ N_{m,v}(x)=\frac{1}{\sqrt{2\pi v}}\exp\frac{-(x-m)^2}{2v}. \]

Then we let this latent Gaussian variable determine the hit rate \(p_c\) in the following manner:

\[p_{c}=\text{Prob}( z_c < Y <z_{c+1} )\\ = \int ^{z_{c+1 }}_{z_{c}} N_{\mu,\sigma^2}(z)dz\\ =\Phi (\frac{z_{c +1}-\mu}{\sigma})-\Phi (\frac{z_{c}-\mu}{\sigma}). \]

In particular, we use the following definition for the highest confidence level:

\[p_{C}= 1-\Phi (\frac{z_{C}-\mu}{\sigma}), \]

instead of

\[p_{C}= \Phi (\frac{z_{C +1}-\mu}{\sigma}) -\Phi (\frac{z_{C}-\mu}{\sigma}), \] with \(z_{C+1}\) theoretically infinite but numerically a very large number. Do not be confused: our model does not use \(z_{C+1}\), which would cause the divergent transition issue; see the Appendix for more detail. I know, the reader thinks "what the hell, the author is very stupid and this explanation very redundant". But I am afraid that the author really is a stupid, dirty old man with no money and no hair, and this special treatment of the highest confidence level is very important to avoid the divergent transition issue causing bias in MCMC sampling, as pointed out by Michael Betancourt (betanalpha).

Why do we define the hit rate as above? *Intuitively*, this definition says that each lesion (signal) is assigned some latent variable \(Y\), and if it *falls* into the interval \(z_c < Y <z_{c+1}\), then the reader marks it as *suspicious* with confidence \(c\). Note that the latent variable merely decides the rate, not the number of hits. Thus, if we consider the latent Gaussian variables to be i.i.d. and associated with lesions, then a value falling between \(z_c\) and \(z_{c+1}\) does not deterministically generate a hit; rather, it generates a hit with probability \(p_c\).

Next we shall introduce the differential logarithmic Gaussian distribution to determine the false alarm rate \(\lambda_c\).

Recall that false alarms are distributed as follows:

\[F_{c } \sim \text{Poisson} ( (\lambda _{c} -\lambda _{c+1} )\times N_{I} ),\]

and thus, setting \(q_c:=(\lambda _{c} -\lambda _{c+1} )\times N_{I}\), we obtain

\[F_{c } \sim \text{Poisson} ( q_c).\]

OK, then what do we have to do next?

We introduce a new latent random variable \(W\) such that

\[ q_c =\text{Prob}( z_c < W <z_{c+1} ). \]

We assume that \(W\) is defined so that the following relation holds;

\[ q_c = N_I \times \int ^{z_{c+1 }}_{z_{c}}d \log \Phi ( z ), \] and especially for the highest confidence level \(C\), we define \[ q_C = N_I \times \int ^{\infty}_{z_{C}}d \log \Phi ( z ). \] Note that in the programming phase, we never use a number \(z_{C+1}\), which is ideally infinity but numerically a very large number that causes the divergent transition issue.
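For completeness, the integral can be evaluated in closed form:

\[ q_c = N_I \times \int ^{z_{c+1 }}_{z_{c}}d \log \Phi ( z ) = N_I \bigl( \log \Phi(z_{c+1}) - \log \Phi(z_{c}) \bigr), \]

and comparing this term by term with \(q_c=(\lambda _{c} -\lambda _{c+1})N_I\) gives \(\lambda_c = -\log \Phi(z_c)\), the same relation derived heuristically later in this vignette.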

Thus, we may say that the false alarm rate is determined by the density function \(N_I \times\frac{ d \log \Phi ( z )}{dz}\) (not a probability density), which we shall call the *differential logarithmic Gaussian distribution with scale* \(N_I\), or briefly the logarithmic Gaussian. Consequently, we obtain the Bayesian model:

\[H_{c } \sim \text{Binomial} ( p_{c}, N_{L} ),\] \[F_{c } \sim \text{Poisson} (q_c),\] \[q_c = N_I \times \int ^{z_{c+1 }}_{z_{c}}d \log \Phi ( z ),\] \[p_{c} = \int ^{z_{c+1 }}_{z_{c}} N_{m,v}(z)dz\] In this model, \(z_{c},c=1,\cdots,C\), \(\mu\), and \(\sigma\) are the parameters to be estimated.

Or equivalently,

\[H_{c } \sim \text{Binomial} ( p_{c}, N_{L} ),\] \[F_{c } \sim \text{Poisson} ( (\lambda _{c} -\lambda _{c+1} )\times N_{I} ),\] \[\lambda _{c} = - \log \Phi ( z_{c } ),\] \[p_{c} =\Phi (\frac{z_{c +1}-\mu}{\sigma})-\Phi (\frac{z_{c}-\mu}{\sigma}). \] In this model, \(z_{c},c=1,\cdots,C\), \(\mu\), and \(\sigma\) are the parameters to be estimated.

Thus we can say that the classical bi-normal assumption is redundant in the FROC trial; we need only two distributions, one being the signal distribution and the other the logarithmic Gaussian. Classical FROC theory uses the bi-normal assumption, i.e., Gaussians for both the noise and signal distributions, as above. But that noise distribution is wrong, or redundant. As shown, we have to use the logarithmic Gaussian for the noise distribution.

Note that \(N_I \times\log \Phi ( z )\) does not depend on the model parameters. Thus it is completely fixed, and we can observe how the signal distribution varies according to the data. So, while classical signal detection theory plots two Gaussians, in FROC theory we should plot two different distributions: the Gaussian and the logarithmic Gaussian.

Of course, the logarithmic Gaussian is not strictly required; we may change the distributions if necessary. The author found that for some datasets the fit is very bad; for such data, we should replace the Gaussian or the logarithmic Gaussian with a more suitable distribution.

We shall show how the logarithmic Gaussian appears. This is a heuristic approach from Dev Chakraborty's 1989 paper; we shall follow his logic, slowly and step by step.

Let \(X\) be a real-valued random variable distributed according to the standard Gaussian, that is,

\[ X \sim \text{Normal}(0,1), \]

and call this the noise distribution.

Intuitively, we consider that if \(X\) falls between \(z_c<X<z_{c+1}\), then it generates one false alarm with confidence level \(c\).

Using the noise distribution, it is natural to think that the probability that the number of false positives with confidence level at least \(c\) is nonzero is

\[\text{Prob}( z_c < X ) =1- \Phi (z_c). \]

On the other hand, under the Poisson assumption, it can be calculated (see Appendix) that

\[\mathbb{P} \bigg\{ \frac{F_c+F_{c+1}+...+F_C}{N_I} \neq 0 \bigg\} = 1 - \mathbb{P} \bigg\{ \frac{F_c+F_{c+1}+...+F_C}{N_I} =0 \bigg\} \\ =1-e^{-\lambda_c}.\]

Now, we assume that

\[\text{Prob}( z_c < X ) =\mathbb{P} \bigg\{ \frac{F_c+F_{c+1}+...+F_C}{N_I} \neq 0 \bigg\} . \]

Then, we obtain \[ \lambda_c = - \log \Phi (z_c). \]

Consequently, we can rewrite the Bayesian model:

\[H_{c } \sim \text{Binomial} ( p_{c}, N_{L} ),\] \[F_{c } \sim \text{Poisson} ( (\lambda _{c} -\lambda _{c+1} )\times N_{I} ),\] \[\lambda _{c} = - \log \Phi ( z_{c } ),\] \[p_{c} =\Phi (\frac{z_{c +1}-\mu}{\sigma})-\Phi (\frac{z_{c}-\mu}{\sigma}). \] In this model, \(z_{c},c=1,\cdots,C\), \(\mu\), and \(\sigma\) are the parameters to be estimated.

Thresholds \(z_1, z_2, ...., z_C\) should satisfy the monotonicity condition \(z_1 < z_2 < .... < z_C\).

To do so, we use the priors \[z_2 - z_1 \sim \text{Uniform}(0,\infty) \\ z_3 - z_2 \sim \text{Uniform}(0,\infty) \\ :\\ :\\ z_C - z_{C-1} \sim \text{Uniform}(0,\infty) \\ \] where \(\text{Uniform}(0,\infty)\) means an improper prior whose support is the interval \((0,\infty)\).

We introduce new parameters: \[ dz_c:=z_{c+1}-z_{c}, \] for \(c=1,2,\cdots,C-1\). In practical modeling, we no longer use the previous section's parameters \(z_2,z_3,\cdots, z_C\), using \(dz_1,dz_2,\cdots, dz_{C-1}\) instead. To include the order constraint in the Bayesian model, we assume a non-regular uniform prior for each \(dz_c\): \[ dz_c \sim \text{Uniform}(0, \infty), \] where \(\text{Uniform}(0,\infty)\) indicates a flat improper prior whose integral is not one and whose support is the interval \((0,\infty)\). These priors are equivalent to assuming monotonicity of the thresholds: \[ z_{1}\leq z_{2}\leq z_{3} \leq \dots \leq z_{C}.\]
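A tiny base-R sketch of this reparameterization (with hypothetical values) shows that monotonicity holds by construction:

```r
# Recover ordered thresholds z from z_1 and positive increments dz.
z1 <- -0.8
dz <- c(0.4, 0.7, 0.2, 1.1)    # dz_c > 0 for c = 1, ..., C-1
z  <- z1 + cumsum(c(0, dz))    # z_1 < z_2 < ... < z_C (here C = 5)
z
all(diff(z) > 0)               # always TRUE when every dz_c > 0
```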

\[\begin{eqnarray*} H_{c } & \sim &\text{Binomial} ( p_{c}, N_{L} ), \text{ for $c=1,2,...,C$.}\\ F_{c } & \sim &\text{Poisson}( (\lambda _{c} -\lambda _{c+1} )\times N_{I} ), \text{ for $c=1,2,...,C-1$.}\\ F_{C} & \sim & \text{Poisson}( (\lambda _{C} - 0)\times N_I),\\ \lambda _{c}& =& - \log \Phi ( z_{c } ),\text{ for $c=1,2,...,C$.}\\ p_{c} &=&\Phi (\frac{z_{c +1}-\mu}{\sigma})-\Phi (\frac{z_{c}-\mu}{\sigma}), \text{ for $c=1,2,...,C-1$.}\\ p_C & =& 1-\Phi (\frac{z_{C}-\mu}{\sigma}),\\ dz_c=z_{c+1}-z_{c} &\sim& \text{Uniform}(0,\infty), \text{ for $c=1,2,...,C-1$.}\\ \mu &\sim& \text{Uniform}(-\infty,\infty),\\ \sigma &\sim& \text{Uniform}(0,\infty).\\ \end{eqnarray*}\] Our model has parameters \(z_{1}, dz_1,dz_2,\cdots, dz_{C-1}\), \(\mu\), and \(\sigma\). The notation \(\text{Uniform}(-\infty,\infty)\) means the improper uniform distribution whose support is the unbounded interval \((-\infty,\infty)\).

The last three equations are priors.

It might be redundant, but since the reader's background cannot be inferred, I write down the following redundant description. In the case of \(C=5\), the model can be written without the subscript \(c\), i.e., without abbreviation, as follows. I hope it helps the reader.

\[\begin{eqnarray*} H_{1 } & \sim &\text{Binomial} ( p_{1}, N_{L} ) \\ H_{2 } & \sim &\text{Binomial} ( p_{2}, N_{L} ) \\ H_{3 } & \sim &\text{Binomial} ( p_{3}, N_{L} ) \\ H_{4 } & \sim &\text{Binomial} ( p_{4}, N_{L} ) \\ H_{5 } & \sim &\text{Binomial} ( p_{5}, N_{L} ) \\ F_{1 } & \sim &\text{Poisson}( (\lambda _{1} -\lambda _{2} )\times N_{I} ), \\ F_{2 } & \sim &\text{Poisson}( (\lambda _{2} -\lambda _{3} )\times N_{I} ), \\ F_{3 } & \sim &\text{Poisson}( (\lambda _{3} -\lambda _{4} )\times N_{I} ), \\ F_{4 } & \sim &\text{Poisson}( (\lambda _{4} -\lambda _{5} )\times N_{I} ), \\ F_{5 } & \sim &\text{Poisson}( (\lambda _{5} -0 )\times N_{I} ), \text{Be careful !!:'-D}\\ \lambda _{1}& =& - \log \Phi ( z_{1 } ),\\ \lambda _{2}& =& - \log \Phi ( z_{2 } ),\\ \lambda _{3}& =& - \log \Phi ( z_{3 } ),\\ \lambda _{4}& =& - \log \Phi ( z_{4 } ),\\ \lambda _{5}& =& - \log \Phi ( z_{5 } ),\\ p_{1} &:=&\Phi (\frac{z_{2}-\mu}{\sigma})-\Phi (\frac{z_{1}-\mu}{\sigma}), \\ p_{2} &:=&\Phi (\frac{z_{3}-\mu}{\sigma})-\Phi (\frac{z_{2}-\mu}{\sigma}), \\ p_{3} &:=&\Phi (\frac{z_{4}-\mu}{\sigma})-\Phi (\frac{z_{3}-\mu}{\sigma}), \\ p_{4} &:=&\Phi (\frac{z_{5}-\mu}{\sigma})-\Phi (\frac{z_{4}-\mu}{\sigma}), \\ p_5 &:=& 1-\Phi (\frac{z_{5}-\mu}{\sigma}),\text{Be careful !!:'-D}\\ dz_1=z_{2}-z_{1} &\sim& \text{Uniform}(0,\infty), \\ dz_2=z_{3}-z_{2} &\sim& \text{Uniform}(0,\infty), \\ dz_3=z_{4}-z_{3} &\sim& \text{Uniform}(0,\infty), \\ dz_4=z_{5}-z_{4} &\sim& \text{Uniform}(0,\infty), \\ \mu &\sim& \text{Uniform}(-\infty,\infty),\\ \sigma &\sim& \text{Uniform}(0,\infty).\\ \end{eqnarray*}\] Our model has parameters \(z_{1}, dz_1,dz_2,\cdots, dz_{C-1}\), \(\mu\), and \(\sigma\). The notation \(\text{Uniform}(-\infty,\infty)\) means the improper uniform distribution whose support is the unbounded interval \((-\infty,\infty)\).
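As a sanity check of the generative structure above, here is a base-R sketch that simulates one synthetic FROC dataset; all parameter values are hypothetical, chosen only for illustration:

```r
# A minimal sketch (base R only): simulate one synthetic FROC dataset
# from the model above, with hypothetical parameter values.
set.seed(1)
mu <- 1.2; sigma <- 1.0
z  <- c(-0.5, 0.2, 0.9, 1.6, 2.4)   # thresholds z_1 < ... < z_5
C  <- length(z); NL <- 259; NI <- 57

lambda <- -log(pnorm(z))             # lambda_c = -log Phi(z_c)
p <- c(pnorm((z[-1] - mu)/sigma) - pnorm((z[-C] - mu)/sigma),
       1 - pnorm((z[C] - mu)/sigma))             # p_1, ..., p_5

h <- rbinom(C, NL, p)                            # H_c ~ Binomial(p_c, N_L)
f <- rpois(C, (lambda - c(lambda[-1], 0)) * NI)  # F_c ~ Poisson((lambda_c - lambda_{c+1}) N_I)
rbind(h, f)                                      # rows: hits, false alarms; columns: c = 1, ..., 5
```

Note that \(\sum_c p_c = 1-\Phi((z_1-\mu)/\sigma) < 1\); the remainder is the probability that a lesion is not marked at all.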

This is a basic example which shows how to fit a model to data `dataList` for a single reader and single modality.

Note that in the following R script, the list object `dataList` represents the following FROC data:

Confidence Level | No. of Hits | No. of False alarms |
---|---|---|
3 = definitely present | 97 | 1 |
2 = equivocal | 32 | 14 |
1 = questionable | 31 | 74 |

```
#1) Build Data
# For a single reader and single modality case.
dataList <- list(
c=c(3,2,1), # c is ignored, can omit.
h=c(97,32,31), # Changing these number, please enjoy fitting
f=c(1,14,74),
NL=259,
NI=57,
C=3
)
# where,
# c denotes confidence level, each components indicates that
# 3 = Definitely lesion,
# 2 = subtle,
# 1 = very subtle
# h denotes number of hits
# (True Positives: TP) for each confidence level,
# f denotes number of false alarms
# (False Positives: FP) for each confidence level,
# NL denotes number of lesions (signal),
# NI denotes number of images,
#2) Fit the FROC model.
fit <- BayesianFROC::fit_Bayesian_FROC(dataList)
#3) Validation of Fit
# via calculation of the p-value of the chi square goodness of fit, which is
# calculated by integrating with predictive posterior measure.
p_value_of_the_Bayesian_sense_for_chi_square_goodness_of_fit(fit)
```

In the last R script, `p_value_of_the_Bayesian_sense_for_chi_square_goodness_of_fit(fit)` calculates the p-value of the chi-square goodness of fit in a Bayesian manner, so we shall briefly recall how this p-value is calculated (for more details, see Gelman's book *Bayesian Data Analysis*).

In the following, we use general notation. Let \(y_{\text{obs}}\) be the observed data and \(f(y|\theta)\) a model (likelihood) for a future dataset \(y\). We denote the prior and the posterior distribution by \(\pi(\theta)\) and \(\pi(\theta|y) \propto f(y|\theta)\pi(\theta)\), respectively. The posterior predictive distribution is defined by \(p(y|y_{\text{obs}}) := \int f(y|\theta)\pi(\theta|y_{\text{obs}}) d\theta\).

In our case, the data \(y\) is a pair of hits and false alarms, that is, \(y=(H_1,H_2, \dots, H_C; F_1,F_2, \dots, F_C)\), and \(\theta = (z_1,dz_1,dz_2,\dots,dz_{C-1},\mu, \sigma)\). We define the following \(\chi^2\) discrepancy (goodness-of-fit statistic) to assess how well the model fits the data:

\[ T(y,\theta) := \sum _{c=1,\cdots,C} \biggr( \frac{\bigl\{H_c-N_L\times p_c \bigr\}^2}{N_L\times p_c}+ \frac{\bigl\{F_c-(\lambda _{c} -\lambda _{c+1} )\times N_{I}\bigr\}^2}{(\lambda _{c} -\lambda _{c+1} )\times N_{I} }\biggr). \]

for a single reader and a single modality, and

\[ T(y,\theta) := \sum _{r=1, \cdots,R} \sum _{m=1,\cdots,M} \sum _{c=1,\cdots,C} \biggr( \frac{\bigl\{H_{c,m,r}-N_L\times p_{c,m,r}\bigr\}^2}{N_L\times p_{c,m,r}}+ \frac{\bigl\{F_{c,m,r}-(\lambda _{c} -\lambda _{c+1} )\times N_{L}\bigr\}^2}{(\lambda _{c} -\lambda _{c+1} )\times N_{L} }\biggr). \]

for multiple readers and multiple modalities, where the false positive fraction is taken per lesion instead of per image. Of course, it would be very easy to implement a per-image hierarchical Bayesian model; the author is kind of goofing off by implementing only the per-lesion model.

Note that \(p_c\) and \(\lambda _{c}\) depend on \(\theta\). In this statistic, the number of degrees of freedom is \(C-2\).

In classical frequentist methods, the parameter \(\theta\) is a fixed estimate, e.g., the maximum likelihood estimator; however, in the Bayesian context the parameter is not deterministic, so, by integrating with respect to the posterior predictive distribution, we can get the posterior predictive \(\chi^2\) value and its p value. Let \(y_{\text{obs}}\) be the observed data. Then the posterior predictive p value is defined by

\[ p \text{ value of $y_{\text{obs}}$ in Theory} = \int \int dy d\theta I_{T(y,\theta) > T(y_{\text{obs}},\theta)}f(y|\theta)\pi(\theta|y_{\text{obs}}) \]

In the following, we show how to calculate this double integral. Suppose that \(\theta _1, \theta_2,\cdots,\theta_N\) are samples drawn from the posterior distribution via Hamiltonian Monte Carlo simulation; then we obtain a sequence of models (likelihoods) \(f(y|\theta_1),f(y|\theta_2),\cdots, f(y|\theta_N)\). We draw samples \(y_1,y_2,\cdots, y_N\) so that each \(y_i\) is a sample from the distribution whose density function is \(f(y|\theta_i)\); i.e., in conventional notation, we may write

\[ y_1 \sim f(\cdot|\theta_1),\\ y_2 \sim f(\cdot|\theta_2),\\ \cdots,\\ y_N \sim f(\cdot|\theta_N). \] These are the desired samples; that is, we can interpret the samples \(y_1,y_2,\cdots, y_N\) as draws from the posterior predictive distribution. Using the law of large numbers, we can approximate the double integral for the p value by

\[ p \text{ value approximation to calculate numerically} =\frac{1}{N} \sum I_{T(y_i,\theta_i) > T(y_{\text{obs}},\theta_i)} \]
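To make this recipe concrete, here is a minimal toy example in Python: a single Poisson observation, with draws from its conjugate flat-prior posterior standing in for the MCMC output. Everything here is illustrative and made up for this sketch; it is not the package's FROC model:

```python
import math
import random

rng = random.Random(0)

def poisson_draw(mean, rng):
    # Knuth's algorithm for Poisson sampling
    L, k, prod = math.exp(-mean), 0, rng.random()
    while prod > L:
        k += 1
        prod *= rng.random()
    return k

def T(y, theta):
    # chi-square discrepancy for one Poisson observation
    return (y - theta) ** 2 / theta

y_obs = 12
# stand-in for MCMC output: for Poisson data with a flat prior, the
# posterior of theta given y_obs is Gamma(shape = y_obs + 1, rate = 1)
theta_draws = [rng.gammavariate(y_obs + 1, 1.0) for _ in range(5000)]

hits = 0
for theta in theta_draws:
    y_rep = poisson_draw(theta, rng)          # y_i ~ f(.|theta_i)
    if T(y_rep, theta) > T(y_obs, theta):     # the indicator in the sum
        hits += 1
p_value = hits / len(theta_draws)
```

Since the observed value here is perfectly consistent with the toy model, the resulting posterior predictive p value lands near 0.5, i.e., no evidence of misfit.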

…oh ache,.. ache, ache, my body is now very bad,,, I hate prurigo nodularis.

I was exposed to some surfactants made by Flower King (fictitious name); then my health got bad, and many, many, very many prurigo nodularis lesions appeared over my whole body, and I have struggled with them for 1 year and 17 months. I want to say: to live safely, bio surfactants are very important, OK? Bio!! Not petroleum. Please make and develop bio surfactants for the future of life.

In this section, we show the distribution that determines the false alarm rate.

Recall that the hit rate is determined by the signal distribution, that is,

\[p_{c}=\text{Prob}( z_c < Y <z_{c+1} )\\ =\Phi (\frac{z_{c +1}-\mu}{\sigma})-\Phi (\frac{z_{c}-\mu}{\sigma}). \] where

\[ Y \sim \text{Normal}(\mu,\sigma ^2). \]

Now, the false alarms are distributed as follows:

\[F_{c } \sim \text{Poisson} ( (\lambda _{c} -\lambda _{c+1} )\times N_{I} ),\]

and thus the rate is \(q_c:=(\lambda _{c} -\lambda _{c+1} )\times N_{I}\).

What we want is to define a random variable \(W\) such that

\[ q_c =\text{Prob}( z_c < W <z_{c+1} ). \]

Using \[\lambda _{c} = - \log \Phi ( z_{c } ),\] we obtain

\[ q_c = (\log \Phi ( z_{c+1 } ) - \log \Phi ( z_{c } )) \times N_I. \] Thus we obtain

\[ q_c = N_I \times \int ^{z_{c+1 }}_{z_{c}}d \log \Phi ( z ) \]

Thus, we may say that the false alarm rate is determined by the probability density function \(N_I \times\frac{ d \log \Phi ( z )}{dz}\), which we shall call the *differential logarithmic Gaussian distribution with scale* \(N_I\), or briefly the logarithmic Gaussian. Thus we can say that the classical bi-normal assumption is redundant in the FROC trial: we need only two distributions, one being the signal distribution and the other the logarithmic Gaussian. The classical theory uses the canonical Gaussian for the noise, but that is wrong, or at least redundant. As shown, we have to use the logarithmic Gaussian for the noise distribution.
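That the closed form \(q_c = (\log \Phi(z_{c+1}) - \log \Phi(z_c)) \times N_I\) really is the integral of the density \(N_I\,\phi(z)/\Phi(z)\) can be checked numerically. A minimal Python sketch (illustrative only; the threshold values are made up):

```python
import math

def Phi(z):
    # standard Gaussian CDF
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def phi(z):
    # standard Gaussian density
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

NI, z_lo, z_hi = 57, -0.5, 0.5

# closed form: q_c = (log Phi(z_{c+1}) - log Phi(z_c)) * N_I
q_closed = (math.log(Phi(z_hi)) - math.log(Phi(z_lo))) * NI

# midpoint-rule integral of the density N_I * d(log Phi)/dz = N_I * phi(z)/Phi(z)
n = 100000
h = (z_hi - z_lo) / n
q_numeric = sum(NI * phi(z_lo + (i + 0.5) * h) / Phi(z_lo + (i + 0.5) * h)
                for i in range(n)) * h
```

The two quantities agree to high precision, which is just the fundamental theorem of calculus applied to \(\log \Phi\).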

Note that \(N_I \times\log \Phi ( z )\) is independent of the model parameters. Thus it is perfectly fixed, and we can observe how the signal distribution varies according to the data. So, in classical signal detection theory two Gaussians are plotted, but in FROC theory we should plot two different distributions: one is the Gaussian and the other is the logarithmic Gaussian.

So, we use not only the bi-normal Gaussian but also the logarithmic Gaussian.

Note that multiple reader and multiple \(\color{red}{\text{"case"}}\) (MRMC) actually means multiple reader and multiple \(\color{red}{\text{"modality"}}\)!!

So, I hate the word MRMC; it should be \(\color{red}{\text{"MRMM"}}\)!!

The following table shows the data format for *2* readers, *2* modalities and *3* confidence levels.

Confidence Level | Modality ID | Reader ID | Number of Hits | Number of False alarms |
---|---|---|---|---|
3 = definitely present | 1 | 1 | \(H_{3,1,1}\) | \(F_{3,1,1}\) |
2 = equivocal | 1 | 1 | \(H_{2,1,1}\) | \(F_{2,1,1}\) |
1 = questionable | 1 | 1 | \(H_{1,1,1}\) | \(F_{1,1,1}\) |
3 = definitely present | 1 | 2 | \(H_{3,1,2}\) | \(F_{3,1,2}\) |
2 = equivocal | 1 | 2 | \(H_{2,1,2}\) | \(F_{2,1,2}\) |
1 = questionable | 1 | 2 | \(H_{1,1,2}\) | \(F_{1,1,2}\) |
3 = definitely present | 2 | 1 | \(H_{3,2,1}\) | \(F_{3,2,1}\) |
2 = equivocal | 2 | 1 | \(H_{2,2,1}\) | \(F_{2,2,1}\) |
1 = questionable | 2 | 1 | \(H_{1,2,1}\) | \(F_{1,2,1}\) |
3 = definitely present | 2 | 2 | \(H_{3,2,2}\) | \(F_{3,2,2}\) |
2 = equivocal | 2 | 2 | \(H_{2,2,2}\) | \(F_{2,2,2}\) |
1 = questionable | 2 | 2 | \(H_{1,2,2}\) | \(F_{1,2,2}\) |

where each component \(H\) and \(F\) is a non-negative integer. In the multi-index notation, for example, \(H_{3,2,1}\) means the number of hits of the \(1\)-st reader over all images taken by the \(2\)-nd modality with the reader's confidence level \(3\).

This package includes example data (e.g., the object `BayesianFROC::dd` used below is an MRMC dataset), and the following functions in this package create MRMC datasets:

`dataset_creator_new_version()`

`create_dataset()`

`convertFromJafroc()`

The third function, `convertFromJafroc()`, converts data from the Jafroc formulation to the author's formulation.

```
BayesianFROC::viewdata(BayesianFROC::dd,head.only = TRUE)
#>
#> * Number of Lesions: 142
#>
#> * Number of Images is not assigned, not required to esimate.
#>
#> ----- Interpretation of the Table ------
#>
#> * The first row shows that the number of hits 50 and the number of false alarms 0 with the 1-th reader under the imaging method ID 1 with his confidence level 5.
#>
#>
#> ModalityID ReaderID Confidence.Level False.Positives True.Positives
#> ----------- --------- ----------------- ---------------- ---------------
#> 1 1 5 0 50
#> 1 1 4 4 30
#> 1 1 3 20 11
#> 1 1 2 29 5
#> 1 1 1 21 1
#> 1 2 5 0 15
#> 1 2 4 0 29
#> 1 2 3 6 29
#> 1 2 2 15 1
#> 1 2 1 22 0
#>
#> * We show the head part of data, i.e., first 10 rows are shown.
#>
#> * To show all rows of data, use viewdata(dataList = Your data, head.only = FALSE)
```

Sorry to change the subject, but I have pains in my arms, brain, and chest, and my whole body is itchy, especially my face.

So, back to the statistical modeling, ah, .. OK. Next, we shall show the hierarchical Bayesian model to compare modalities, i.e., to compare imaging methods such as MRI, CT, PET, etc.; in another context, modality could mean the efficacy of treatments. Anyway, in the following we will show how to build the model.

Mathematical philosophy says that the expression of an equation is not important; the properties of, and relations between, notions are more important. The hierarchical model explained below is very complex, and its complex representation can hide the essence of the modeling. To avoid this, the author first briefly shows the essence, or idea, of the hierarchical modeling.

In this brief section we use the notation \(f(y|\theta)\) for a likelihood function. That is, the data is \(y\) and the model parameter is denoted by \(\theta\).

In the MRMC data format, the hits and false alarms are counted for each reader and each modality; thus the data has the form \(y_{m,r}\), representing the hits and false alarms for the \(m\)-th modality and the \(r\)-th reader. So, we estimate model parameters \(\theta_{m,r}\) representing the AUC for the \(m\)-th modality and the \(r\)-th reader. To obtain AUCs \(\theta_{m}\) depending on the modality only, we use a hyper-parameter, such as

\[ \theta_{m,r} \sim \text{Normal}( \theta_m, \text{variance}).\]

So, the parameter \(\theta_m\) depends only on the \(m\)-th modality. Using the posterior of \(\theta_m\), we obtain the following basic characteristics between the \(m\)-th and the \(m'\)-th modalities:

- \(\mathbb{P}(\theta_m > \theta_{m'} )\)
- \(\theta_m - \theta_{m'}\)

where the former is the probability of the event that the AUC of the \(m\)-th modality is greater than that of the \(m'\)-th, and the latter is the difference of the AUCs between the \(m\)-th and the \(m'\)-th modalities.

The author thinks it is better to show the model itself without further explanation, since it is now obvious:

\[\begin{eqnarray*} H_{c,m,r} & \sim &\text{Binomial }( p_{c,m,r}, N_L ),\\ F_{c,m,r} &\sim& \text{Poisson }( ( \lambda _{c} - \lambda _{c+1})N_L ),\\ \lambda _{c}& =& - \log \Phi (z_{c }),\\ p_{c,m,r} &:=&\Phi (\frac{z_{c +1}-\mu_{m,r}}{\sigma_{m,r}})-\Phi (\frac{z_{c}-\mu_{m,r}}{\sigma_{m,r}}), \\ \color{red}{p_C} & =& 1-\Phi (\frac{z_{C}-\mu_{m,r}}{\sigma_{m,r}}),\\ \color{red}{F_{C,m,r}} & \sim &\text{Poisson } ( (\lambda _{C} - 0)N_L),\\ A_{m,r}&:=&\Phi (\frac{\mu_{m,r}/\sigma_{m,r}}{\sqrt{(1/\sigma_{m,r})^2+1}}), \\ A_{m,r}&\sim&\text{Normal} (A_{m},\sigma_{r}^2), \\ dz_c&:=&z_{c+1}-z_{c},\\ dz_c, \sigma_{m,r} &\sim& \text{Uniform}(0,\infty),\\ z_{c} &\sim& \text{Uniform}( -\infty,100000),\\ A_{m} &\sim& \text{Uniform}(0,1).\\ \end{eqnarray*}\] Our new model has parameters \(z_{1}, dz_1,dz_2,\cdots, dz_{C-1}\), \(A_{m}\), \(\sigma_{r}\), \(\mu_{m,r}\), and \(\sigma_{m,r}\).

- Prepare **data**
  - Create by your hands: `dataset_creator_new_version()` or `create_dataset()`
  - Convert from an Excel file in the Jafroc or Rjafroc format: `convertFromJafroc()`
- **Fitting**: `fit_Bayesian_FROC()`
  - Estimates of your FROC model
  - \(\color{red}{\textit{ Comparison of Modalities }}\) (pairwise)
- Draw **Curves**: `DrawCurves()`
  - FROC curve
  - AFROC curve
  - Cumulative false positives and cumulative true positives (FPF, TPF)
- **Test** the null hypothesis that all modalities are the same, via the Bayes factor
  - Fit a null model, creating a `stanfit` object `fitHo`
  - Fit an alternative model, creating a `stanfit` object `fitH1`
  - Evaluate the Bayes factor with these two `stanfit` objects `fitHo` and `fitH1`.

Prepare data

```
#An example dataset for the case of Multiple readers and Multiple Modalities.
dat <- BayesianFROC::dataList.Chakra.Web
```

Fitting

Draw Curves

Recall that, for the \(C=5\)-step rating, FROC data is the collection of pairs \((H_c,F_c) \in \mathbb{N}^2, c=1,2,...,5\), of hits \(H_c\) and false alarms \(F_c\) for the \(c\)-th confidence level, together with the number of lesions \(N_L\) and the number of images \(N_I\).

We define \(C=5\) points as follows: \[ (x_{c},y_{c}):=( \sum_{c\leq c' \leq C } F_{c^\prime}/N_{I}, \sum_{c\leq c' \leq C } H_{c^\prime}/N_{L}) \in \mathbb{R}^2, \] where \(c\) is the \(c^{\text{th}}\) confidence level, and we call these points the pairs of false positive fraction (FPF) per image and true positive fraction (TPF) per lesion.

Some readers may think: what the hell? The author is stupid. I am afraid the author actually is a stupid dirty old man with prurigo nodularis. Anyway, I show another description of FPF and TPF as follows. Suppose the three-step rating case:

Confidence Level | Number of Hits | Number of False alarms |
---|---|---|
3 = definitely present | \(H_{3}=97\) | \(F_{3}=1\) |
2 = equivocal | \(H_{2}=32\) | \(F_{2}=14\) |
1 = questionable | \(H_{1}=31\) | \(F_{1}=74\) |

In this case, the pairs of FPF and TPF \((x_{c},y_{c})\) are given as follows:

\(C=3\) | | |
---|---|---|
\((x_{1},y_{1})=\) | \(\biggl( \frac{F_{3}+F_2+F_1}{N_I},\) | \(\frac{H_{3}+H_2+H_1}{N_L}\biggr)\) |
\((x_{2},y_{2})=\) | \(\biggl(\frac{F_{3}+F_2}{N_I},\) | \(\frac{H_{3}+H_2}{N_L}\biggr)\) |
\((x_{3},y_{3})=\) | \(\biggl(\frac{F_{3}}{N_I},\) | \(\frac{H_{3}}{N_L}\biggr)\) |

Plotting these \(C\) points in the \((x,y)\)-plane, we can visualize the FROC data. Connecting these points by lines, we obtain the so-called *empirical FROC curve*.

In the following figure, the three circles mean \((x_{c},y_{c}),c=1,2,3\), and the curve is the FROC curve defined later.

I hate the words FPF and TPF. I want to use the words *cumulative false positives* (CFP) *per image* and *cumulative true positives* (CTP) *per lesion*. In my R programs, I use CFP and CTP rather than FPF and TPF.

To tell the truth, I like the phrase cumulative false positives per image rather than FPF. It is my original term, but its meaning is obvious and clear compared to FPF. I also like the phrase cumulative true positives per lesion. The abbreviations of these two terms are CFP and CTP in my paper.
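Computing the CFP-CTP (equivalently FPF-TPF) points from the three-step table above is just a pair of cumulative sums; a minimal Python sketch (illustrative only, not part of the package):

```python
# data from the three-step table: confidence c = 3, 2, 1 (most to least confident)
H = {3: 97, 2: 32, 1: 31}   # hits per confidence level
F = {3: 1, 2: 14, 1: 74}    # false alarms per confidence level
NL, NI, C = 259, 57, 3      # number of lesions, images, confidence levels

points = []
for c in range(1, C + 1):
    x_c = sum(F[k] for k in range(c, C + 1)) / NI   # cumulative false positives per image
    y_c = sum(H[k] for k in range(c, C + 1)) / NL   # cumulative hits per lesion
    points.append((x_c, y_c))
```

The point for \(c=1\) accumulates all levels, so it is the rightmost, topmost point of the empirical FROC curve.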

Suppose that the dataset is the following \(C=5\) case:

Confidence Level | No. of Hits | No. of False alarms |
---|---|---|
5 = definitely present | \(H_{5}\) | \(F_{5}\) |
4 = probably present | \(H_{4}\) | \(F_{4}\) |
3 = questionable | \(H_{3}\) | \(F_{3}\) |
2 = subtle | \(H_{2}\) | \(F_{2}\) |
1 = very subtle | \(H_{1}\) | \(F_{1}\) |

In this case, the pairs of FPF and TPF \((x_{c},y_{c})\) are given as follows:

\(C=5\) | | |
---|---|---|
\((x_{1},y_{1})=\) | \(\biggl( \frac{F_5+F_4+F_3+F_2+F_1}{N_I},\) | \(\frac{H_5+H_4+H_3+H_2+H_1}{N_L}\biggr)\) |
\((x_{2},y_{2})=\) | \(\biggl( \frac{F_5+F_4+F_3+F_2}{N_I},\) | \(\frac{H_5+H_4+H_3+H_2}{N_L}\biggr)\) |
\((x_{3},y_{3})=\) | \(\biggl( \frac{F_5+F_4+F_3}{N_I},\) | \(\frac{H_5+H_4+H_3}{N_L}\biggr)\) |
\((x_{4},y_{4})=\) | \(\biggl( \frac{F_5+F_4}{N_I},\) | \(\frac{H_5+H_4}{N_L}\biggr)\) |
\((x_{5},y_{5})=\) | \(\biggl( \frac{F_5}{N_I},\) | \(\frac{H_5}{N_L}\biggr)\) |

Is this redundant? Ha,… my headache will kill me,… bye bye, pretty crowd! I have got some aches, and one of them is like hemorrhoids. What the hell! :'-D

For our purposes, such as comparison of modalities or fitting the FROC curve, the following FPF *per lesion*, instead of per image, also makes sense.

\[ (x_{c},y_{c}):=( \sum_{c\leq c' \leq C } F_{c^\prime}/N_{L}, \sum_{c\leq c' \leq C } H_{c^\prime}/N_{L}) \in \mathbb{R}^2, \]

The replacement of per image by per lesion is not essential: if the reader executes `fit_GUI()`, then the reader can observe that changing the number of images does not change the FROC curve or the AFROC curve. This invariance of the curve shape implies that, for the modality comparison issue, we can use FPF per lesion. The fitting with FPF per lesion is also implemented in this package.

So the AUCs per image and per lesion coincide.

In classical FROC analysis, one always works with FPF per image; the author wishes to work with FPF per lesion, since the number of images is a redundant parameter. Of course, the noise and signal distributions are affected by the number of images, but such latent structure does not matter for the modality comparison.

We shall give a definition of the FROC curve.

- The FROC curve is an alternative notion to the ROC curve.
- Unfortunately, the area under the FROC curve is **not finite**, thus we deform the FROC curve so that the area under the curve is **finite**. The deformation gives us the notion of the AFROC curve, whose area under the curve is finite.

To define the FROC curve, we first define the *pre* FROC curve.

Define

\(y(t):=\mathbb{P}[Y>t]\) and \(x(t):=\mathbb{P}[X>t]\).

Then we obtain the *pre* *FROC* *curve*:

\[ \mathbb{R} \longrightarrow \mathbb{R}^2 \\ t \mapsto (x(t),y(t)). \]

To calculate \(y(t):=\mathbb{P}[Y>t]\) and \(x(t):=\mathbb{P}[X>t]\) more explicitly, we recall that

\[X \sim \text{Normal}(0,1), \\ Y \sim \text{Normal}(\mu,\sigma^2),\] which is sometimes called the bi-normal assumption. Note that if \(\mu\) is far from the mean of \(X\) (that is, 0), then the observer performance will be greater, since the noise and the signal are well separated from each other.

Please keep in mind that the parameters \(\mu,\sigma^2\) are part of the parameters of our model, which should be estimated.

Since \(\frac{Y -\mu}{\sigma}\) is distributed as the standard Gaussian, we can calculate the probability of the event that \(\frac{Y -\mu}{\sigma} > \frac{t -\mu}{\sigma}\), and it follows that

\[y(t):=\mathbb{P}[Y>t] = \mathbb{P}[\frac{Y -\mu}{\sigma} > \frac{t -\mu}{\sigma}] = 1-\Phi( \frac{t -\mu}{\sigma}) \]

On the other hand, \[x(t):=\mathbb{P}[X>t] = 1-\Phi( t),\] thus \[t = t(x) = \Phi ^{-1} (1-x(t) ).\] Substituting this, we get \[y(t) = 1-\Phi( \frac{t -\mu}{\sigma}) = 1-\Phi( \frac{ \Phi ^{-1} (1-x(t) ) -\mu}{\sigma})\]
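This substitution is easy to check numerically: evaluating \(y\) directly via \(t\), or via \(x\) through \(\Phi^{-1}(1-x)\), must give the same number. A minimal Python sketch (illustrative only; the bisection-based `Phi_inv` is a toy, and \(\mu,\sigma,t\) are made-up values):

```python
import math

def Phi(z):
    # standard Gaussian CDF
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def Phi_inv(p, lo=-10.0, hi=10.0):
    # toy inverse CDF by bisection (illustrative, not production-grade)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if Phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

mu, sigma, t = 1.0, 0.8, 0.3
x = 1.0 - Phi(t)                                       # x(t) = P[X > t]
y_direct = 1.0 - Phi((t - mu) / sigma)                 # y(t) = P[Y > t]
y_from_x = 1.0 - Phi((Phi_inv(1.0 - x) - mu) / sigma)  # y expressed through x
```

Both routes give the same \(y\), which is the content of the substitution \(t = \Phi^{-1}(1-x(t))\).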

In the FROC data, hits and false alarms are counted for each confidence level.

We introduced the notations \(N_L\), \(N_I\), \(H_c\), \(F_c\), \(C\). In the R console, these notations are represented by `NL, NI, h, f, C`.

If \(C=5\), then the dataset for FROC analysis is as follows:

Confidence Level | No. of Hits | No. of False alarms |
---|---|---|
5 = definitely present | \(H_{5}\) | \(F_{5}\) |
4 = probably present | \(H_{4}\) | \(F_{4}\) |
3 = equivocal | \(H_{3}\) | \(F_{3}\) |
2 = probably absent | \(H_{2}\) | \(F_{2}\) |
1 = questionable | \(H_{1}\) | \(F_{1}\) |

We divide the real line by thresholds \(z_1 < z_2 < .... < z_C\). Combining these thresholds with the parameters \(\mu,\sigma^2\) of the bi-normal assumption, we obtain our model parameters. Other quantities, for example \(\lambda_c\), can be deduced from the thresholds by \(\lambda _c = -\log \Phi({z_c})\).

We consider the hit rates to be generated by \[p_{c} := \Phi (\frac{z_{c +1}-\mu}{\sigma})-\Phi (\frac{z_{c}-\mu}{\sigma})=\mathbb{P}[z_c <Y_l<z_{c+1}], \]

We also define the sequence \(\lambda_1 > \lambda_2 > .... > \lambda_C\) such that \[F_c+...+F_C \sim \text{Poisson}(\lambda_cN_I),\] where \(N_I\) is the number of images. This assumption is called the Poisson assumption in the FROC context. The following conditions combine the bi-normal assumption and the Poisson assumption:

\[ \mathbb{P}[X < z_1]= \mathbb{P}[F_1+F_2+F_3+...+F_C =0]\\ \mathbb{P}[X < z_2]= \mathbb{P}[F_2+F_3+...+F_C =0]\\ ...\\ \mathbb{P}[X < z_c]= \mathbb{P}[F_c+...+F_C =0]\\ ...\\ \mathbb{P}[X< z_C]= \mathbb{P}[ F_C =0]\\ \] which is equivalent to

\[\Phi(z_c) = e^{-\lambda_c}\]

for all \(c\).

Or equivalently,

\[1-x(z_c) = e^{-\lambda_c}\]

Thus we can get a correspondence between Poisson rate \(\lambda\) and the threshold \(z\).

Recall that the FROC curve's parameter \(t\) ranges over the values of the Gaussian random variable; writing \(z\) instead of \(t\), the curve is \[y(t) =y(z)= 1-\Phi( \frac{t -\mu}{\sigma}) = 1-\Phi( \frac{ \Phi ^{-1} (1-x(t) ) -\mu}{\sigma})\\ = 1-\Phi( \frac{ \Phi ^{-1} (1-x(z) ) -\mu}{\sigma})\\ = 1-\Phi( \frac{ \Phi ^{-1} (e^{-\lambda} ) -\mu}{\sigma})\\ \] In FROC analysis, the parameter of the FROC curve is taken to be \(\lambda\).

For the last equality, we used the relation \(1-x(z) = e^{-\lambda}\), which is the analogy \({}^{\dagger}\) (see Appendix) of \(1-x(z_c) = e^{-\lambda_c}\).

From the above arguments, we obtain the following definition of the *FROC curve*:

\[ \mathbb{R} \longrightarrow \mathbb{R}^2 \\ \lambda \mapsto (\lambda,y(\lambda) ) \]

where \(y(\lambda) = 1-\Phi( \frac{ \Phi ^{-1} (e^{-\lambda} ) -\mu}{\sigma})\) and \(\mu,\sigma\) are parameters representing the mean and standard deviation of the signal distribution.

\[ y^{\text{FROC}}(\lambda):= 1-\Phi( \frac{ \Phi ^{-1} (e^{-\lambda} ) -\mu}{\sigma})\\ x^{\text{FROC}}(\lambda) := \lambda \]

Note that the FROC curve can be interpreted as the curve of the expectation of the pairs of TPF and FPF, where we use the two abbreviations FPF = false positive fraction and TPF = true positive fraction. These words are widely used in ROC theory, and thus we omit their definitions.

In mathematical philosophy, the expression is not important. The most important thing about a mathematical expression is its property. So, in the case of the definition of the FROC curve, the above equations representing the FROC curve are not important, so the reader may forget them!! I also forget the expressions. However, we should never forget the property that the FROC curve is the expected pair of the FPF and TPF. That is, we can write

\[ \mathbb{E}[ \sum _{ c \leq c' \leq C } H_{c'}/N_L ] = y^{\text{FROC}}(\lambda_c) \\ \mathbb{E}[ \sum _{ c \leq c' \leq C} F_{c'}/N_I ] = x^{\text{FROC}}(\lambda_c) \\ \]

for all \(c=1,2,...,C\).

In fact,

\[\begin{eqnarray*} (x^{\text{FROC}}(\lambda_c),y^{\text{FROC}}(\lambda_c))&:=&\bigl(\lambda_c , \mathbb{P} \bigl\{ Y > z_c \bigr\} \bigr),\\ &=&\bigl(\lambda_c , \sum_{c\leq c' \leq C } \mathbb{P} \bigl\{ z_{c^\prime+1}> Y > z_{c^\prime} \bigr\} \bigr),\\ &=&\bigl(\mathbb E [ \sum_{c\leq c' \leq C } F_{c^\prime}]/N_I , \mathbb E [\sum_{c\leq c' \leq C } H_{c^\prime}]/N_L \bigr),\\ \end{eqnarray*}\]

where \(Y\) is a random variable distributed by the signal distribution in the bi-normal assumption.
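This expectation property can be checked by simulation: averaging the cumulative hit fractions over many replications of \(H_c \sim \text{Binomial}(p_c, N_L)\) recovers \(\mathbb{P}\{Y > z_c\}\). A minimal Python sketch (illustrative only; the thresholds and parameters are made-up values):

```python
import math
import random

def Phi(z):
    # standard Gaussian CDF
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

rng = random.Random(2)
z, mu, sigma, NL = [-0.5, 0.5, 1.5], 1.0, 1.0, 259
C = len(z)
zext = z + [float("inf")]  # z_{C+1} := +infinity
p = [Phi((zext[c + 1] - mu) / sigma) - Phi((zext[c] - mu) / sigma)
     for c in range(C)]

# average the cumulative hit fraction sum_{c' >= c} H_{c'} / N_L over replications
reps = 2000
avg = [0.0] * C
for _ in range(reps):
    H = [sum(rng.random() < p[c] for _ in range(NL)) for c in range(C)]
    for c in range(C):
        avg[c] += sum(H[c:]) / NL / reps

expected = [1.0 - Phi((z[c] - mu) / sigma) for c in range(C)]  # P[Y > z_c]
```

Up to Monte Carlo noise, `avg[c]` matches `expected[c]` for every confidence level, which is exactly the displayed equality for the \(y\)-coordinate.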

Consequently, we can say that

\[\color{red}{\textit{FROC curve is an expected curve for the FPF and TPF. }}\]

I think this equality is more important than the explicit expression in the definition of the FROC curve.

From this formula, we can obtain the interpretation of the FROC curves as expected pairs of cumulative false positives per image and cumulative hits per lesion. Therefore, under this interpretation, we can evaluate how our model fits the data by plotting the cumulative false positives per image and cumulative hits per lesion calculated from an actual dataset. More precisely, recall that we defined the pairs of FPF and TPF as follows: \[\begin{eqnarray*} (x_{c},y_{c}):=( \sum_{c\leq c' \leq C } F_{c^\prime}/N_{I}, \sum_{c\leq c' \leq C } H_{c^\prime}/N_{L}) \in \mathbb{R}^2,\\ \end{eqnarray*}\] where \(c\) is the \(c^{\text{th}}\) confidence level, and we also call these points the cumulative false positive-cumulative true positive (CFP-CTP) points over \((N_I, N_L)\). If the model is correct, then the FROC curve passes through the expected CFP-CTP points and provides insight into \[\color{blue}{\textit{ how the model makes sense as a model. }}\]

Note that the AUC of the FROC curve is infinite. So, in this package, AUC means the AUC of the AFROC curve defined next.

\(\Phi()\) denotes the cumulative distribution function of the Gaussian distribution with mean 0 and variance 1. \(\Phi^{-1}()\) is the inverse mapping of \(\Phi()\).

Let \((x,y)\) be a Cartesian coordinate system on a plane. Then, for any non-negative real number \(\lambda\), the FROC curve is defined as \[ (x(\lambda),y(\lambda)) := \bigl(\lambda , 1-\Phi(b\Phi ^{-1}(\exp(-\lambda)) -a ) \bigr) \in \mathbb{R}^2 \] Or equivalently,

\[ (x(\lambda),y(\lambda)) := \bigl(\lambda , \text{Prob} \bigr\{ Y > \Phi ^{-1}(\exp(-\lambda)) \bigl\} \bigr) \in \mathbb{R}^2. \]

where \(Y\) is distributed by the signal distribution, \(a =\mu/\sigma\) and \(b=1/\sigma\).

It is easy to see that these two definitions are the same.

Let \((\xi, \eta)\) be a Cartesian coordinate system for the square \(0\leq \xi \leq 1, 0\leq \eta \leq 1\). Then, for any non-negative real number \(\lambda\), we define an AFROC curve as \[(\xi(\lambda), \eta(\lambda)) := \bigl(1-\exp(-\lambda) , 1-\Phi(b\Phi ^{-1}(\exp(-\lambda)) -a ) \bigr) \in \mathbb{R}^2. \]

Beginners in FROC analysis should skip this section, but keep in mind the following rough shape of the area under the AFROC curve:

\[\text{AUC} = \Phi\biggr(\frac{a}{\sqrt{1+b^2}}\biggr).\]

There is no need to remember its precise form. To tell the truth, I have forgotten it now \(( \check{} \omega \check{})\) and will never recall it again. Even while programming this package, I always forgot it. So I never force the reader to remember it, since I have perfectly forgotten it! In this section, we prove the above equation. To tell the truth, the author did this calculation 2 years after making this package. So, there is no need to understand this section.

The author thinks this section will let the reader go to bed saying "I'll never be back." Even the Terminator will never be back to this section. To tell the truth, the author also thinks "I'll never be back." Anyway, we should be the Terminator or Batman (I like Michael Keaton) to understand this section. Note that I am not of the Spider-Man generation, but I think Spider-Man is good too. Ha,.. what am I saying … I do not know,… please leave me alone.

Note that Terminator 2 is great, but I am afraid that 3, 4, 5,… are not so,… James Cameron is the best!

On the other hand, the 1st season of Dark Angel is the best, but the 2nd season is not so,…

Anyway, in the following, the author provides a lullaby in which the calculation of the area under the AFROC curve is shown. If this section alone cannot put the reader to sleep, then I suggest the following heavy book; so heavy that now my aching arms cannot hold it, ha,… aches make my life poor. Anyway, for a similar calculation, we refer to the following:

- Dev Chakraborty, *Observer Performance Methods for Diagnostic Imaging: Foundations, Modeling, and Applications with R-Based Examples*, pp. 116-119.

In order to spare the reader additional labor, we have been careful to give all calculations, which may seem redundant to some. In order to calculate the AUC in a simple manner, we need the notion of convolution.

Let \(f\) and \(g\) be two real-valued functions defined on the real numbers. Then the convolution of \(f\) and \(g\) is defined by

\[ (f\otimes g)(x):=\int_{\mathbb{R}}f(x-y)g(y)dy. \] The Gaussian density has a simple algebraic structure with respect to convolution.

Let us denote the density function of the Gaussian with mean \(m\) and variance \(v\) by \(N_{m,v}()\), that is:

\[ N_{m,v}(x)=\frac{1}{\sqrt{2\pi v}}\exp\frac{-(x-m)^2}{2v}. \]

Then

\[ N_{m,v}\otimes N_{m',v'}=N_{m+m',v+v'}. \]

My mathematical heart gets excited, ha,.. algebraization (not in the Alexander Grothendieck manner, but,..) is,… ha, beautiful. I want to marry this formula.

We also use the following trivial relation;

\[ N_{0,v}(x)=\frac{1}{\sqrt{v}}N_{0,1}(\frac{x}{\sqrt{v}}). \]

Recall that we defined the AFROC curve, for any non-negative real number \(\lambda\), as follows: \[(x(\lambda), y(\lambda)) := \bigl(1-\exp(-\lambda) , 1-\Phi(b\Phi ^{-1}(\exp(-\lambda)) -a ) \bigr) \in \mathbb{R}^2. \]

The AUC is the following Lebesgue–Stieltjes integral:

\[ \int y(\lambda) dx(\lambda). \]

Since \(\lambda\) moves from \(0\) to \(\infty\), \(x(\lambda) =1-\exp(-\lambda)\) moves from \(x=0\) to \(x=1\), so

\[ \int y(\lambda) dx(\lambda) = \int_{x=0}^1 \biggl\{ 1-\Phi(b\Phi ^{-1}( 1- x ) -a ) \biggr\} dx.\]

Set \(\xi:=\Phi ^{-1}( 1- x )\); then \(N_{0,1}(\xi)d\xi=-dx\), and hence the integral becomes

\[\int_{\xi=-\infty}^{\xi=+\infty} \biggl\{ 1-\Phi(b\xi -a ) \biggr\}N_{0,1}( \xi )d\xi,\]

and immediately,

\[1-\int_{\xi=-\infty}^{\xi=+\infty} \biggl\{ \Phi(b\xi -a ) \biggr\}N_{0,1}( \xi )d\xi,\] so,

\[1-\int_{\xi=-\infty}^{\xi=+\infty} \biggl\{ \int^{\eta=b\xi -a }_{\eta=-\infty}N_{0,1}(\eta) d\eta \biggr\}N_{0,1}( \xi )d\xi.\] Using the coordinate transformation \((X,Y):=(\xi,\eta-b\xi)\) with Jacobian \(\partial (\xi,\eta)/\partial (X,Y)=1\),

\[1-\int_{X=-\infty}^{X=+\infty} \biggl\{ \int^{Y= -a }_{Y=-\infty}N_{0,1}(Y-bX) dY \biggr\}N_{0,1}( X )dX.\] To apply the convolution formula, we rewrite the integrand as \[1-\int_{X=-\infty}^{X=+\infty} \biggl\{ \int^{Y= -a }_{Y=-\infty}N_{0,1}(b[Y/b-X]) dY \biggr\}N_{0,1}( X )dX,\] and

\[1-\int_{X=-\infty}^{X=+\infty} \biggl\{ \int^{Y= -a }_{Y=-\infty}\frac{1}{b}N_{0,1/b^2}(Y/b-X) dY \biggr\}N_{0,1}( X )dX.\] We can rewrite this in the notation of convolution:

\[1- \int^{Y= -a }_{Y=-\infty}\frac{1}{b}(N_{0,1/b^2}\otimes N_{0,1})( Y/b ) dY.\]

By the convolution formula, \[1- \int^{Y= -a }_{Y=-\infty}\frac{1}{b}N_{0,1+1/b^2}(Y/b) dY,\]

that is,

\[1- \int^{Y= -a }_{Y=-\infty}\frac{1}{b}N_{0,\frac{1+b^2}{b^2}}(Y/b) dY.\] Using the trivial relation of two Gaussians,

\[1- \int^{Y= -a }_{Y=-\infty}\frac{1}{b} \sqrt{\frac{b^2}{1+b^2}}N_{0,1}(\sqrt{\frac{b^2}{1+b^2}}Y/b) dY,\] that is, \[1- \int^{Y= -a }_{Y=-\infty} \sqrt{\frac{1}{1+b^2}}N_{0,1}(\sqrt{\frac{1}{1+b^2}}Y) dY.\] Set \(Z:=\sqrt{\frac{1}{1+b^2}}Y\); then \[1- \int^{Z= \frac{-a}{\sqrt{1+b^2}}}_{Z=-\infty}N_{0,1}(Z) dZ.\]

Using the definition of \(\Phi()\),

\[1- \Phi\biggr(\frac{-a}{\sqrt{1+b^2}}\biggr)\]

Using the symmetry \(1-\Phi(-x)=\Phi(x)\), we finally obtain the desired expression of the AUC, that is, the area under the AFROC curve (not to be confused with the FROC curve): \[\text{AUC} = \Phi\biggr(\frac{a}{\sqrt{1+b^2}}\biggr).\]
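The closed form can be checked against the defining integral \(\int_0^1 \{1-\Phi(b\Phi^{-1}(1-x)-a)\}\,dx\) from the start of this derivation. A minimal Python sketch (illustrative only; the bisection-based `Phi_inv` is a toy, and the values of \(a,b\) are made up):

```python
import math

def Phi(z):
    # standard Gaussian CDF
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def Phi_inv(p, lo=-10.0, hi=10.0):
    # toy inverse CDF by bisection (illustrative only)
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if Phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

a, b = 1.2, 0.9

# midpoint-rule integral of y over x in (0, 1):
# AUC = int_0^1 { 1 - Phi(b * Phi^{-1}(1 - x) - a) } dx
n = 20000
h = 1.0 / n
auc_numeric = sum((1.0 - Phi(b * Phi_inv(1.0 - (i + 0.5) * h) - a)) * h
                  for i in range(n))

auc_closed = Phi(a / math.sqrt(1.0 + b * b))
```

The numerical integral matches \(\Phi(a/\sqrt{1+b^2})\), as the lullaby above concluded.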

So, my lullaby is finished. Has everyone gone, or fallen asleep? I also want to sleep; here there is freedom of expression, no reviewer, ha. Is there someone? Help,… Anyway, good night!

Let \(f\) and \(g\) be two real-valued functions defined on the real numbers. Then the convolution of \(f\) and \(g\) is defined by

\[ (f\otimes g)(x):=\int_{\mathbb{R}}f(x-y)g(y)dy. \] The Gaussian density has a simple algebraic structure with respect to convolution.

Recall that we denoted the density function of Gaussian with mean \(m\) and variance \(v\) by \(N_{m,v}()\), that is:

\[ N_{m,v}(x)=\frac{1}{\sqrt{2\pi v}}\exp\frac{-(x-m)^2}{2v}. \]

In the calculation of AUC, we used the following;

\[ N_{m,v}\otimes N_{m',v'}=N_{m+m',v+v'}. \]

In this section, we shall prove this formula.

Let \((\Omega, \frak{B},\mathbb{P})\) be a probability space and let \(X\) be a real-valued random variable defined on \(\Omega\). Then, for any \(x\in \mathbb{R}\), the distribution function \(F\) is defined by

\[ F(x) := \mathbb{P}(X<x). \]

For any Borel subset \(E\) of \(\mathbb{R}\), we define the push-forward measure \(\mathbb{P}^X\) by

\[ \mathbb{P}^X(E) := \mathbb{P}(X\in E), \]

which is the measure on the target space of \(X\) induced by \(\mathbb{P}\).

We define \[\widehat{F}(\xi) := \int^\infty_{-\infty} e^{\sqrt{-1}\xi x}dF(x)=\int^\infty_{-\infty} e^{\sqrt{-1}\xi x}\mathbb{P}^X(dx)\] When \(F=N_{m,v}\), we denote \(\widehat{F}\) by \(\widehat{F}_{m,v}\).

I am tired,.. I stop here,…

UNDER CONSTRUCTION

In this section, we explain where this equality comes from. In the previous sections we used the discrete rating \(1,2,...,c,...,C\); now we shall consider the **continuous** rating case. Since continuous rating contains discrete rating, once we obtain \(1-x(z) = e^{-\lambda}\) for the continuous rating, we can use it in the discrete rating case. We shall denote a reader's continuous rating by \(\gamma \in \mathbb{R}\) instead of \(c =1,2,...,C\) for the discrete case. For example, \(F_\gamma\) means the number of false positives with confidence level greater than \(\gamma\); similarly, \(H_\gamma\) means the number of hits with confidence level greater than \(\gamma\). So, if we denote the cumulative false positive fraction with confidence level greater than \(\gamma\) by \(f_\gamma\), then we may write the cumulative false positives per image in the continuous rating case as

\[ f_\gamma = \int_\gamma ^\infty F_{\gamma'} d \gamma'/N_I, \] which is an analog of the discrete case: \[ \frac{F_c+F_{c+1}+...+F_C}{N_I}, \] where \(N_I\) denotes the number of images.

The analog of \[F_c+...+F_C \sim \text{Poisson}(\lambda_cN_I), \]

is

\[ \int_\gamma ^\infty F_{\gamma'} d \gamma'\sim \text{Poisson}(\lambda _\gamma N_I). \]

The analog of the following relations of the discrete case,

\[ \mathbb{P}[X < z_1]= \mathbb{P}[F_1+F_2+F_3+...+F_C =0],\\ \mathbb{P}[X < z_2]= \mathbb{P}[F_2+F_3...+F_C =0],\\ .....\\ \mathbb{P}[X < z_c]= \mathbb{P}[F_c+...+F_C =0],\\ ...\\ \mathbb{P}[X< z_C]= \mathbb{P}[ F_C =0],\\ \]

is

\[ \mathbb{P}[X < z_ \gamma] = \mathbb{P} \left[\int_\gamma ^\infty F_{\gamma'} d \gamma' =0 \right], \]

or equivalently

\[ \mathbb{P}[X \geq z_ \gamma] = 1- \mathbb{P} \left[\int_\gamma ^\infty F_{\gamma'} d \gamma' =0 \right]. \] Since the integral is Poisson distributed (with mean \(\lambda_\gamma\) per image), the probability that it vanishes on a single image is \(e^{-\lambda_\gamma}\), so we obtain

\[\Phi(z_ \gamma) = e^{-\lambda_ \gamma}\]

for all \(\gamma\), and hence

\[\Phi(z) = e^{-\lambda},\]

which agrees with the relation \(\lambda _{c} = - \log \Phi (z_{c })\) of the model sections below. In the previous notation, where \(x(z)=1-\Phi(z)\), it is

\[1-x(z) = e^{-\lambda}.\]
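The correspondence between a threshold \(z\) and its false alarm rate \(\lambda\), namely the relation \(\lambda _{c} = - \log \Phi (z_{c })\) used in the model sections below, can be checked numerically in R (the value of \(z\) is illustrative):

```
# Phi is pnorm(); going from a threshold z to the rate lambda and back.
z <- 0.5
lambda <- -log(pnorm(z))   # lambda = -log Phi(z), so Phi(z) = exp(-lambda)
qnorm(exp(-lambda))        # recovers z = 0.5
```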

**Reference**: *“A Default Bayesian Hypothesis Test for ANOVA Designs”*

Ruud Wetzels, Raoul P. P. P. Grasman, and Eric-Jan Wagenmakers

We can test the following null hypothesis for FROC data in the case of multiple readers and multiple modalities:

\[ H_0: \text{All modalities are the same.} \]

- Make a null hypothesis \(H_0\) and its alternative \(H_1\).
- Make two models: \(\mathcal{M}_0\), constructed under \(H_0\), and \(\mathcal{M}_1\), constructed under \(H_1\).
- Select a model based on WAIC or the Bayes factor.
- If \(\mathcal{M}_1\) is selected, then the null hypothesis \(H_0\) is rejected.

Note that p values are not used.

E.g.,

- Under \(H_0\) there are many models \(\mathcal{M}_0\), \(\mathcal{M}_0'\), \(\mathcal{M}_0''\), \(\mathcal{M}_0'''\), …

Similarly,

- Under \(H_1\) there are many models \(\mathcal{M}_1\), \(\mathcal{M}_1'\), \(\mathcal{M}_1''\), \(\mathcal{M}_1'''\), …

but, as you know, the frequentist statistical test is also problematic:

- the p value monotonically decreases as the sample size grows.

In this package, there are two hierarchical Bayesian models. One is constructed under the null hypothesis that all modalities are the same, and the other is constructed under the alternative hypothesis that at least one modality differs from the others. We call the former the *null hypothesis model*, or simply the *null model*, and the latter the *alternative model*. By comparing the null model and the alternative model via the Bayes factor, we can test the null hypothesis.

The alternative model is the same as the model already shown in the previous section.

In the following we show the null model.

\[\begin{eqnarray*} H_{c,m,r} & \sim &\text{Binomial }( p_{c,m,r}, N_L ),\\ F_{c,m,r} &\sim& \text{Poisson }( ( \lambda _{c} - \lambda _{c+1})N_L ),\\ \lambda _{c}& =& - \log \Phi (z_{c }),\\ p_{c,m,r} &:=&\Phi (\frac{z_{c +1}-\mu_{m,r}}{\sigma_{m,r}})-\Phi (\frac{z_{c}-\mu_{m,r}}{\sigma_{m,r}}), \\ p_{C,m,r} & =& 1-\Phi (\frac{z_{C}-\mu_{m,r}}{\sigma_{m,r}}),\\ F_{C,m,r} & \sim &\text{Poisson } ( (\lambda _{C} - 0)N_L),\\ A_{m,r}&:=&\Phi (\frac{\mu_{m,r}/\sigma_{m,r}}{\sqrt{(1/\sigma_{m,r})^2+1}}), \\ A_{m,r}&\sim&\text{Normal} ( \color{red}{A},\sigma_{r}^2), \\ dz_c&:=&z_{c+1}-z_{c},\\ dz_c, \sigma_{m,r} &\sim& \text{Uniform}(0,\infty),\\ z_{c} &\sim& \text{Uniform}( -\infty,100000),\\ \color{red}{A} &\sim& \text{Uniform}(0,1).\\ \end{eqnarray*}\] Our null model has parameters \(z_{1}, dz_1,dz_2,\cdots, dz_{C-1}\), \(A\), \(\sigma_{r}\), \(\mu_{m,r}\), and \(\sigma_{m,r}\).

The R scripts for fitting the null model.

The following model is the same as in the previous sections, so we reprint it here:

\[\begin{eqnarray*} H_{c,m,r} & \sim &\text{Binomial }( p_{c,m,r}, N_L ),\\ F_{c,m,r} &\sim& \text{Poisson }( ( \lambda _{c} - \lambda _{c+1})N_L ),\\ \lambda _{c}& =& - \log \Phi (z_{c }),\\ p_{c,m,r} &:=&\Phi (\frac{z_{c +1}-\mu_{m,r}}{\sigma_{m,r}})-\Phi (\frac{z_{c}-\mu_{m,r}}{\sigma_{m,r}}), \\ p_{C,m,r} & =& 1-\Phi (\frac{z_{C}-\mu_{m,r}}{\sigma_{m,r}}),\\ F_{C,m,r} & \sim &\text{Poisson } ( (\lambda _{C} - 0)N_L),\\ A_{m,r}&:=&\Phi (\frac{\mu_{m,r}/\sigma_{m,r}}{\sqrt{(1/\sigma_{m,r})^2+1}}), \\ A_{m,r}&\sim&\text{Normal} ( \color{red}{ A_{m} },\sigma_{r}^2), \\ dz_c&:=&z_{c+1}-z_{c},\\ dz_c, \sigma_{m,r} &\sim& \text{Uniform}(0,\infty),\\ z_{c} &\sim& \text{Uniform}( -\infty,100000),\\ \color{red}{ A_{m} } &\sim& \text{Uniform}(0,1).\\ \end{eqnarray*}\] Our alternative model has parameters \(z_{1}, dz_1,dz_2,\cdots, dz_{C-1}\), \(A_{m}\), \(\sigma_{r}\), \(\mu_{m,r}\), and \(\sigma_{m,r}\).

The parameter \(\color{red}{ A_{m} }\) is used in the alternative model instead of the parameter \(\color{red}{ A}\) of the null hypothesis model.

In the following, we provide R scripts which implement the Bayesian ANOVA to test the null hypothesis that all modalities are the same.

```
# An example dataset for the case of multiple readers and multiple modalities.
data <- BayesianFROC::dataList.Chakra.Web

# Test the null hypothesis.
# ite is the number of MCMC iterations used to create the two stanfit objects:
# one for the null model and the other for the alternative model.
Test_Null_Hypothesis_that_all_modalities_are_same(data, ite = 1111)
```

Internally, this code runs functions of the package `bridgesampling`.

R often crashes when this code runs, sorry… I do not know why; perhaps the computation is too heavy.

Let `fitH0` and `fitH1` be `stanfit` objects. The following code compares these two objects by the Bayes factor.

```
# Null model
# method = "warp3" selects the WARP-III algorithm instead of "normal".
# silent = TRUE suppresses the progress messages.
H0 <- bridgesampling::bridge_sampler(fitH0, method = "normal", silent = TRUE)
print(H0)

# Alternative model
H1 <- bridgesampling::bridge_sampler(fitH1, method = "normal", silent = TRUE)
print(H1)

# Test the null hypothesis that all modalities are the same.
BF10 <- bridgesampling::bf(H1, H0)
print(BF10)
# The larger this Bayes factor is, the stronger the evidence against H0.
```

Our modeling begins from the hits and false alarms obtained by reducing the primitive FROC data; the latter is more informative, and the so-called figure of merit (FOM) is calculated with that more informative dataset instead of the reduced dataset that the author uses.

I have to say our hierarchical model is a toy. The starting point of this model is problematic, since the data of hits and false alarms are obtained by reducing the information of the FROC trials. The Jafroc data format contains more precise information, and the notion of figure of merit is obtained by using this precise information. Thus, without changing our modeling starting point, our model has to be called a toy. However, this is the first attempt in the world to analyze FROC data with a Bayesian model, so I think it is sufficient. This is only a hobby; I am not a statistician nor a radiologist, and I hold no academic post. I am only an unemployed old man. Good luck for future research and for my future. …I have an ache in my arms, my right arm, so I can no longer write the R scripts…

Talts, S., Betancourt, M., Simpson, D., Vehtari, A., and Gelman, A. (2018). Validating Bayesian Inference Algorithms with Simulation-Based Calibration. arXiv preprint arXiv:1804.06788

SBC is a method to detect bias in MCMC sampling. If we specify the prior of a model, then the SBC algorithm returns a pseudo-uniform distribution; if this distribution is exactly uniform, then we cannot conclude that the MCMC sampling is biased, and if it is far from uniformity, then we reject the hypothesis that the MCMC sampling has no bias. On 15 July 2019, the author implemented SBC using rstan’s function `sbc` for the case of a single reader and a single modality.

Example code for SBC

```
fit <- Simulation_Based_Calibration_single_reader_single_modality_via_rstan_sbc()

# It is best to specify the bins argument yourself:
plot(fit, bins = 10)
plot(fit, bins = 100)
plot(fit, bins = 200)
```

Executing the above R code draws a histogram for each model parameter. From these histograms we can see whether our model's MCMC sampling is correct.

Let \(\hat{\theta} = \int \theta \pi(\theta|D_0)d\theta\), where \(\pi(\theta|D_0)\) is the posterior for the given data \(D_0\) and model parameter \(\theta\). Then we calculate the chi square in the following manner:

\[\chi^2(D_0) = \chi^2(D_0| \hat{\theta}).\]

It is very easy to calculate, and its value is very compatible with the data plot and the estimated FROC curve, thus I trust this value.

I did not find such a quantity in any book, so it might not be so good from a theoretical perspective, but I use it since its calculation is very easy from a fitted model object of class `stanfit`.

```
# 1) Build the data for the single reader and single modality case.
dat <- list(
  c = c(3, 2, 1),    # Confidence levels (ignored by the fitting)
  h = c(97, 32, 31), # Number of hits for each confidence level
  f = c(1, 14, 74),  # Number of false alarms for each confidence level
  NL = 259,          # Number of lesions
  NI = 57,           # Number of images
  C = 3              # Number of confidence levels
)
# where c denotes the confidence level; each component indicates that
#   3 = definitely lesion,
#   2 = subtle,
#   1 = very subtle,
# that is, a higher number indicates a higher confidence level.
# h denotes the number of hits (True Positives: TP) for each confidence level,
# f denotes the number of false alarms (False Positives: FP) for each
# confidence level, NL denotes the number of lesions, and NI denotes the
# number of images.

# 2) Fit and draw FROC and AFROC curves.
fit <- fit_Bayesian_FROC(dat, DrawCurve = TRUE)

# 3) Extract the chi square statistic (not a p value).
chi_square_goodness_of_fit <- fit@chisquare

# 4) Print the chi square statistic (not a p value).
print(chi_square_goodness_of_fit)
```

Posterior predictive p value.

First, we outline the *p value* of the *chi square goodness of fit statistic* in the Bayesian version. In the frequentist context, p values are calculated with data \(D\) and an estimated parameter \(\theta\), such as the maximum likelihood estimate (MLE) \(\hat{\theta}_{\text{MLE}}(D)\), so we may write \(\chi^2(D)=\chi^2(D|\hat{\theta}_{\text{MLE}}(D))\). However, in the Bayesian context the model parameter is not deterministic, so p values also depend on the MCMC samples of the model parameter. To obtain the p value in a deterministic way in the Bayesian context, we integrate the p value with respect to the posterior predictive distribution. The posterior predictive distribution \(\pi(D|D_0)\) for given data \(D_0\) is defined by the posterior mean of the likelihood;

\[\pi(D|D_0) := \int f(D|\theta)\pi(\theta|D_0)d\theta \times \text{const}\]

where \(f(D|\theta)\) denotes the likelihood and \(\pi(\theta|D_0)\) denotes the posterior. So, \(\pi(D|D_0)\) is a probability measure on the set of all data-sets \(\{ D \mid D\text{ is data} \}\) under the given data \(D_0\).

There are three possible chi squares that depend only on the given data \(D_0\):

\(\chi^2(D_0) = \int \chi^2(D|\theta) f(D|\theta)\pi(\theta|D_0)d\theta dD\)

\(\chi^2(D_0) = \int \chi^2(D_0|\theta) \pi(\theta|D_0)d\theta\)

\(\chi^2(D_0) = \chi^2(D_0|\int \theta \pi(\theta|D_0)d\theta )\)

I use the first and the third quantities.

I am not sure of their precise distinction. If someone is familiar with it, please tell me :-)

For details of the p value, please see Gelman’s book “*Bayesian Data Analysis*”.

The package *BayesianFROC* implements the calculation of the posterior predictive p value of the chi square goodness of fit statistic for a single reader and single modality FROC model.

In this vignette, we explain how to run the function which calculates the posterior predictive p value.

1. Prepare data in the case of a single reader and a single modality.
2. Get a *fitted model object* of class `stanfitExtended` by fitting a model to the data via the function `fit_Bayesian_FROC()` in this package.
3. Calculate the posterior predictive p value via `p_value_of_the_Bayesian_sense_for_chi_square_goodness_of_fit()`.

So all we have to do is simple, that is, merely run two functions:

`fit_Bayesian_FROC()`

`p_value_of_the_Bayesian_sense_for_chi_square_goodness_of_fit()`

The resulting object contains the p value.

Using the pipe operator `%>%` from the package *magrittr*, the whole code can be written as follows;

```
# 1) Prepare a dataset
dat <- BayesianFROC::dataList.Chakra.1
# 2) Fitting
fit <- BayesianFROC::fit_Bayesian_FROC(dat)
# 3) Calculation of the P value
p_value_of_the_Bayesian_sense_for_chi_square_goodness_of_fit(fit)
```

The function `p_value_of_the_Bayesian_sense_for_chi_square_goodness_of_fit()` prints:

- the p value
- the chi squares

In the following, we shall explain what the printed result in the R console means when we execute the above code to evaluate the p value.

```
|the 7984-th | 6.440| 12.00|TRUE |
|the 7985-th | 0.857| 13.00|TRUE |
|the 7986-th | 1.670| 13.50|TRUE |
|the 7987-th | 3.880| 12.20|TRUE |
|the 7988-th | 1.300| 16.20|TRUE |
|the 7989-th | 6.190| 11.40|TRUE |
|the 7990-th | 10.900| 12.60|TRUE |
|the 7991-th | 3.360| 12.80|TRUE |
|the 7992-th | 5.110| 14.50|TRUE |
|the 7993-th | 5.530| 12.90|TRUE |
|the 7994-th | 4.430| 13.00|TRUE |
|the 7995-th | 3.570| 14.70|TRUE |
|the 7996-th | 9.890| 17.70|TRUE |
|the 7997-th | 72.200| 9.40|FALSE |
|the 7998-th | 2.140| 10.40|TRUE |
|the 7999-th | 7.750| 9.95|TRUE |
|the 8000-th | 3.310| 15.20|TRUE |
* Note that the posterior predictive p value is a rate of TRUE in the right column in the above table.
The p value of the posterior predictive measure for the chi square discrepancy.
0.916375
```

The last number, `0.916375`, is the desired p value of the chi square in the Bayesian context according to Gelman’s book.

- The first column means the **ID** of the **sample** of the *Hamiltonian Monte Carlo simulation*, excluding the warm-up period.
- The second column means the value of the chi square computed with the *observed* data and the corresponding MCMC sample.
- The third column means the value of the chi square computed with *replicated* data and the corresponding MCMC sample.
- The fourth column means the truth of the inequality: the second column < the third column. These inequalities are required for calculating the p value in the Bayesian sense according to Gelman’s book.
- More `TRUE` in the fourth column means a better fit of the model to the data.
- The p value is the proportion of `TRUE` in the fourth column.

Running the above R script, a graphics device appears and shows the replicated FROC data drawn from the posterior predictive distribution. These samples are used to calculate the p value in the Bayesian sense; we need samples from the *posterior predictive distribution*.

Note that the p value depends only on the data. So odd data cause a small p value, which rejects the *null hypothesis* that the fitting is good.

For the dataset `dataList.Chakra.1`: intuitively, all data points drawn as circles are on the FROC curve, thus we may say the fitting is \(\color{red}{\textit{ good }}\). \(\color{red}{\textit{The p value is 0.925. }}\)

For the dataset `BayesianFROC::dataList.High`: intuitively, two data points drawn as circles are *not* on the FROC curve, thus we may say the fitting is \(\color{red}{\textit{ bad }}\). \(\color{red}{\textit{The p value is 0.669, }}\) which is smaller than the above 0.925, indicating the fitting is *not* as good as above.

\(\color{blue}{\textit{The above two examples show that p value is }} \color{red}{\textit{ compatible }} \color{blue}{\textit{with our intuition. }}\)

```
# 1) Build data for the single reader and single modality case.
dataList <- list(
  c = c(3, 2, 1),    # Confidence levels (ignored, can be omitted)
  h = c(97, 32, 31),
  f = c(1, 14, 74),
  NL = 259,
  NI = 57,
  C = 3
)
# where c denotes the confidence level; each component indicates that
#   3 = definitely lesion,
#   2 = subtle,
#   1 = very subtle.
# h denotes the number of hits (True Positives: TP) for each confidence level,
# f denotes the number of false alarms (False Positives: FP) for each
# confidence level, NL denotes the number of lesions (signals), and
# NI denotes the number of images.

# Make a fitted model object of class stanfitExtended, inherited from stanfit.
fit <- fit_Bayesian_FROC(dataList)

# Calculate the p value, obtained as an integral with respect to the
# posterior predictive measure.
p_value_of_the_Bayesian_sense_for_chi_square_goodness_of_fit(fit)
```

In the following, we shall use general notation. Let \(y_{\text{obs}}\) be the observed data and \(f(y|\theta)\) be a model (likelihood) for a future data \(y\). We write the prior and the posterior distribution as \(\pi(\theta)\) and \(\pi(\theta|y) \propto f(y|\theta)\pi(\theta)\). The posterior predictive distribution is defined by \(p(y|y_{\text{obs}}) := \int f(y|\theta)\pi(\theta|y_{\text{obs}}) d\theta\).

In our case, the data \(y\) is a pair of hits and false alarms, that is, \(y=(H_1,H_2, \dots, H_C; F_1,F_2, \dots, F_C)\), and \(\theta = (z_1,dz_1,dz_2,\dots,dz_{C-1},\mu, \sigma)\). We shall define the \(\chi^2\) discrepancy (goodness of fit statistic) to validate that the model fits the data:

\[ T(y,\theta) := \sum _{c=1,\cdots,C} \biggl( \frac{[H_c-N_L\times p_c]^2}{N_L\times p_c}+ \frac{[F_c-(\lambda _{c} -\lambda _{c+1} )\times N_{I}]^2}{(\lambda _{c} -\lambda _{c+1} )\times N_{I} }\biggr) \]

for single reader and single modality case.

\[ T(y,\theta) := \sum _{r=1}^R \sum _{m=1}^M \sum _{c=1}^C \biggl( \frac{[H_{c,m,r}-N_L\times p_{c,m,r}]^2}{N_L\times p_{c,m,r}}+ \frac{[F_{c,m,r}-(\lambda _{c} -\lambda _{c+1} )\times N_{L}]^2}{(\lambda _{c} -\lambda _{c+1} )\times N_{L} }\biggr) \]

for the multiple reader and multiple modality (MRMC) case, where the false positives are counted per lesion instead of per image.

Note that \(p_c\) and \(\lambda _{c}\) depend on \(\theta\). In this statistic, the number of degrees of freedom is \(C-2\).
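As a concrete illustration, the discrepancy for the single reader and single modality case can be written as a small R function (a sketch, not the package's internal implementation; the hit rates and false alarm rates below are hypothetical):

```
# T(y, theta): chi square discrepancy for the single reader, single
# modality case. lambda is given for c = 1, ..., C, with lambda_{C+1} = 0.
chi_square_discrepancy <- function(h, f, p, lambda, NL, NI) {
  dl <- lambda - c(lambda[-1], 0)          # lambda_c - lambda_{c+1}
  sum((h - NL * p)^2 / (NL * p) +
      (f - dl * NI)^2 / (dl * NI))
}

# Example: if the data exactly equal their model expectations,
# the discrepancy is zero.
p      <- c(0.2, 0.3, 0.4)   # hypothetical hit rates
lambda <- c(0.9, 0.5, 0.2)   # hypothetical cumulative false alarm rates
NL <- 100; NI <- 50
chi_square_discrepancy(h = NL * p, f = (lambda - c(lambda[-1], 0)) * NI,
                       p = p, lambda = lambda, NL = NL, NI = NI)  # 0
```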

In classical frequentist methods, the parameter \(\theta\) is a fixed estimate, e.g., the maximum likelihood estimator; however, in the Bayesian context the parameter is not deterministic, so by integrating with respect to the posterior predictive distribution, we can get the posterior predictive \(\chi^2\) value and its p value. Let \(y_{\text{obs}}\) be the observed data. Then the posterior predictive p value is defined by

\[ p \text{ value of $y_{\text{obs}}$} = \int \int dy d\theta I_{T(y,\theta) > T(y_{\text{obs}},\theta)}f(y|\theta)\pi(\theta|y_{\text{obs}}) \]

In the following, we show how to calculate this double integral. Suppose that \(\theta _1, \theta_2,\cdots,\theta_N\) are samples drawn from the posterior distribution via Hamiltonian Monte Carlo simulation; then we obtain a sequence of models (likelihoods) \(f(y|\theta_1),f(y|\theta_2),\cdots, f(y|\theta_N)\). We draw samples \(y_1,y_2,\cdots, y_N\) such that each \(y_i\) is a sample from the distribution whose density function is \(f(y|\theta_i)\), i.e., in conventional notation, we may write:

\[ y_1 \sim f(\cdot|\theta_1),\\ y_2 \sim f(\cdot|\theta_2),\\ \cdots,\\ y_N \sim f(\cdot|\theta_N). \]

These are the desired samples; that is, we can interpret the samples \(y_1,y_2,\cdots, y_N\) as draws from the posterior predictive distribution. Using the law of large numbers, we can approximate the double integral of the p value by

\[ p \text{ value} =\frac{1}{N} \sum_{i=1}^{N} I_{T(y_i,\theta_i) > T(y_{\text{obs}},\theta_i)}. \]
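The sum above can be sketched in R with a toy conjugate model (a Poisson likelihood with a Gamma prior, so that posterior draws are available in closed form; this is an illustration, not the package's implementation):

```
set.seed(1)
y_obs <- 7                                  # one observed count (illustrative)
N     <- 10000                              # number of posterior draws
theta <- rgamma(N, shape = 1 + y_obs, rate = 1 + 1)  # posterior of the Poisson rate
y_rep <- rpois(N, theta)                    # y_i ~ f( . | theta_i)
T_disc <- function(y, th) (y - th)^2 / th   # a simple chi square discrepancy
p_value <- mean(T_disc(y_rep, theta) > T_disc(y_obs, theta))
p_value  # typically near 0.5, since the toy model fits its own data well
```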

I implemented a check of whether the estimates contain bias. Currently, I exclude any dataset whose fitting does not pass the R-hat criterion, so I have to say the bias is calculated for the converged cases only, which is not suitable, and I will remove this exclusion. However, the program shows the convergence rate, and almost all fitting procedures converged, so the exclusion does not have a strong effect on the bias calculation.

For the user it is sufficient to know only one function: `validation.dataset_srsc_for_different_NI_NL()`. This function replicates many data-sets from known distributions which the user specifies before execution. In FROC data, the numbers of lesions and images correspond to the sample size. So a larger number of images gives a smaller bias, which can be confirmed by the function `validation.dataset_srsc_for_different_NI_NL()`.

In the following, the important arguments of `validation.dataset_srsc_for_different_NI_NL()` are shown.

- `NLvector`: a vector whose components are numbers of lesions. We fit models to data replicated with these fixed numbers of lesions. For example, if `NLvector = c(100, 1000)`, then we first replicate various hits and false alarms under the assumption of 100 lesions and evaluate the bias; next we do the same with 1000 lesions instead.
- `ratio`: the ratio of images to lesions. It is bothersome to input both the images and the lesions, so the function is constructed so that it is sufficient to input only one of them; that is, if the number of lesions is specified, then the number of images is automatically generated to satisfy this `ratio`.
- `replicate.datset`: for a fixed number of lesions and images, the data-sets of hits and false alarms are replicated, and the number of replicated data-sets is specified by this variable.
- `mean.truth`: the mean of the signal distribution.
- `sd.truth`: the standard deviation of the signal distribution.
- `z.truth`: the thresholds of the bi-normal assumption.
- `ite`: the number of Hamiltonian Monte Carlo iterations used in each fitting of the replicated data-sets.

Each number in the table is the error of a parameter, i.e., the mean over all replications of the estimate minus the true parameter. We can see that as the numbers of images and lesions become larger, these errors tend to zero. That is, in FROC models, the numbers of lesions and images correspond to the sample size, so if these values are larger, then the hits and false alarms also become larger, and this leads to smaller biases.

The following table shows output of `validation.dataset_srsc_for_different_NI_NL()`.

| Name of parameter | 1st model | 2nd model | 3rd model | 4th model |
|---|---|---|---|---|
| Number of images | 200.0000000 | 20000.0000000 | 2.00000e+06 | 2.0000e+08 |
| Number of lesions | 100.0000000 | 10000.0000000 | 1.00000e+06 | 1.0000e+08 |
| z[1] | 0.1055186 | 0.0103358 | 1.01020e-03 | 1.0720e-04 |
| z[2] | 0.1956183 | 0.0132404 | 1.20850e-03 | 1.3030e-04 |
| z[3] | 1.1307551 | 0.0309471 | 2.52580e-03 | 2.9210e-04 |
| mean of signal | -0.4843169 | -0.0463222 | -4.95890e-03 | -4.2730e-04 |
| sd of signal | 3.8036830 | 0.1160011 | 1.07218e-02 | 1.0843e-03 |
| AUC | -0.0348429 | -0.0041814 | -4.49200e-04 | -4.0100e-05 |
| dz[1] | 0.0900996 | 0.0029046 | 1.98300e-04 | 2.3100e-05 |
| dz[2] | 0.9351368 | 0.0177067 | 1.31730e-03 | 1.6170e-04 |
| p[1] | -0.0384759 | -0.0019671 | -1.90400e-04 | -1.9200e-05 |
| p[2] | -0.0119591 | -0.0015342 | -1.72000e-04 | -1.5200e-05 |
| p[3] | -0.0121365 | -0.0026324 | -2.75400e-04 | -2.5300e-05 |
| lambda[1] | -0.0384759 | -0.0019671 | -1.90400e-04 | -1.9200e-05 |
| lambda[2] | -0.0119591 | -0.0015342 | -1.72000e-04 | -1.5200e-05 |
| lambda[3] | -0.0121365 | -0.0026324 | -2.75400e-04 | -2.5300e-05 |

This is a statistical test for the null hypothesis that the MCMC sampling contains no bias.

When the SBC algorithm runs, a pseudo-uniform distribution is obtained.

If it is exactly the uniform distribution, then we cannot say that the MCMC contains bias.

If the pseudo-uniform distribution differs from uniformity, then we reject the null hypothesis, and we may consider that the MCMC samples contain bias.

- **modality**: imaging methods such as MRI, CT, PET, …, etc.
- **image**: a radiograph, containing signal and noise. In the radiological context, a signal is a nodule or lesion.
- **lesion**: a signal which should be detected by the reader; it is also called a nodule, diseased case, etc.
- **hit**: True Positive (TP).
- **false alarm**: False Positive (FP).
- *FPF*: an abbreviation of False Positive Fraction, which is the cumulative sum of false positives per lesion or per image.
- *TPF*: an abbreviation of True Positive Fraction, which is the cumulative sum of true positives per lesion.
- **CFP**: an abbreviation of Cumulative False Positives per image (or per lesion), which is the author’s word; it is customarily called False Positive Fraction or False Positives per Image (FPI) in the FROC context.
- **CTP**: an abbreviation of Cumulative True Positives per lesion; note that each signal is a lesion, thus this cannot be counted per image.
- **FROC**: an abbreviation of Free-response ROC. The difference between ROC and FROC is that ROC allows only a dichotomous answer, whether the image is diseased or not. On the other hand, in an FROC task, each reader can give multiple answers for a single image. These multiple answers cause multiple false alarms for each single image. Thus the upper bound of false alarms is infinity, which causes the FROC curve to be unbounded.
- **FROC curve**: provides an alternative to the ROC curve, so the interpretation of the FROC curve is the same as that of the ROC curve, that is, if the FROC curve is more upwardly convex, then the observer performance is better. This curve fits the FPF and TPF points, that is, it is the expected locus of the FPFs and TPFs.
- **AFROC curve**: an abbreviation of Alternative FROC curve. Since the FROC curve is not bounded, the area under the FROC curve is infinite and thus cannot be defined. This indefiniteness of the AUC for FROC curves leads us to the AFROC curve, which is contained in a bounded region, and it provides an AUC for the FROC trial.
- **Bi-normal assumption**: two Gaussian distributions are used to specify the hit rate and the false alarm rate.
- **Poisson assumption**: determines the false alarm rate; it requires a compatibility condition with the bi-normal assumption.
- **AUC**: the area under the AFROC curve.
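The distinction between the unbounded FROC curve and the bounded AFROC curve can be sketched in R under the model's bi-normal and Poisson assumptions (the values of \(\mu\) and \(\sigma\) are illustrative):

```
mu <- 1; sigma <- 1                  # illustrative signal distribution
z  <- seq(-2, 4, length.out = 200)   # a grid of thresholds
lambda <- -log(pnorm(z))             # expected false positives per image
tpf    <- 1 - pnorm((z - mu) / sigma)  # true positive fraction
# FROC curve: (lambda, tpf); lambda is unbounded as z decreases.
# AFROC curve: (1 - exp(-lambda), tpf) = (1 - pnorm(z), tpf),
# which stays inside the unit square, so its AUC is well defined.
plot(1 - exp(-lambda), tpf, type = "l",
     xlim = c(0, 1), ylim = c(0, 1),
     xlab = "1 - exp(-lambda)", ylab = "TPF", main = "AFROC curve (sketch)")
```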

| Basic words | Truth = positive | Truth = negative |
|---|---|---|
| Reader’s positive | TP | FP |
| Reader’s negative | FN | TN |

We only focus on TP and FP, thus we call them **hit** and **false alarm**, respectively:

| Basic words | Truth = positive | Truth = negative |
|---|---|---|
| Reader’s positive | hit | false alarm |
| Reader’s negative | FN | TN |

The number of hits is denoted by `h` and the number of false alarms by `f` in the R console, respectively. In TeX, the number of hits is denoted by \(H\) and the number of false alarms by \(F\).

In this section, I show the most important things.

I worked at some company, and there, as part of the work, someone made me use a washing soap whose components are

- Alkyl ether sulfate sodium ester (AES),
- Alkyl benzyl dimethyl ammonium salt
- and others.

Then, when I was exposed to it (for 4 hours; 1 year and 18 months ago), I felt an irritating stimulus over my whole body, with shock-like pains. Then my body became bad; especially, my skin shows my pain as prurigo nodularis. Now I am prescribed a steroid of rank *very strong*, protopic, and Bilanoa tablets.

I regret using it, since in my private life I have never used such things.

The reader should not use such things. They affected all of my body: brain, eyes, teeth, legs, hands, breathing, elbows, fingers, anus, karasuma, scalp, blood vessels, thinning hair, etc.

The irritating stimulus was very strong in the first month. Especially in the brain, the stimulus continued for 10 months. Now, 1 year and 18 months have passed since the exposure, but my body is still very bad.

The traces of prurigo nodularis make me very sad.

So, we should not use any petroleum-based detergent; it acts as an environmental hormone and has a very strong effect on our bodies.

The most important thing is *bio-surfactants*. We should establish the technology of bio-surfactants.

From May 2017 to August 2017, the author made these Bayesian FROC models.

For details, please see my preprint on arXiv \((';\omega {}; ')\):

“*Bayesian Models for free-response receiver operating characteristic analysis*”

Note that we use the following definition

\[p_{C}= 1-\Phi (\frac{z_{C}-\mu}{\sigma}), \]

instead of

\[p_{C}= \Phi (\frac{z_{C +1}-\mu}{\sigma}) -\Phi (\frac{z_{C}-\mu}{\sigma}), \] with a theoretically infinite but numerically very large number \(z_{C+1}\).

One might think that this note is redundant, but it is very important for avoiding the divergent transition issue, which is a bias in MCMC sampling. The author first made a Stan file with the latter definition; then divergent transitions occurred. To avoid divergent transitions, centering is one known method, but in my case I do not think centering is the point. Now, my package implements the former definition.

In our model, the author writes down the highest confidence level separately, because otherwise it causes the divergent transition issue. In the past, I wrote the model with the assumption that the highest threshold \(z_{C+1}\) is a very large number (theoretically, infinity) whose prior is a uniform distribution with very large support; then divergent transitions occurred in almost all MCMC iterations. Also, I use the `target +=` formulation to avoid Jacobian warnings; such warnings were difficult for me to resolve, or even to understand what they say or how to overcome them.

One may think my models are very complex. In fact, many people will dislike my explanation and my models. One reason my model is so complex is that the FROC statistical model itself is very complex and cannot be understood without the reader’s effort. Furthermore, the description of my model was changed to avoid the divergent transition issue, which makes it even more complex. Further, I had to overcome the Jacobian issues, whose formulation is not intuitive for me.

I am very tired of explaining my model, and my ability has also disappeared :’-D

These models were made two years ago, from May 2017 to August 2017.

I briefly explain the divergent transitions in my model. From a theoretical perspective, it is natural to introduce the highest decision threshold as a very large number, which is a parameter of the model. Unfortunately, it is distributed by a uniform distribution with very large support. Once we introduce such a theoretically infinite parameter in a numerical program, it causes the divergent transition issue.

- WAIC or Bayes factor approach
- My hierarchical model does not have a finite WAIC; why? I have to show it, but… I am tired.

If \(X \sim \text{Poisson}(\lambda_X)\) and \(Y \sim \text{Poisson}(\lambda_Y)\) are independent, then \(X+Y \sim \text{Poisson}(\lambda_X + \lambda_Y)\). We used this relation above in the false alarm context.

If a random variable \(N\) satisfies \(N \sim \text{Poisson}(\lambda)\), then \(\mathbb{P}(N=0)= e^{-\lambda}\) (redundant?).
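Both Poisson facts are easy to check numerically in R (a quick sketch; the rates are illustrative):

```
set.seed(1)
x <- rpois(1e5, lambda = 2)
y <- rpois(1e5, lambda = 3)
mean(x + y)                        # close to 2 + 3 = 5 (sum of independent Poissons)
var(x + y)                         # also close to 5, as expected for Poisson(5)
mean(rpois(1e5, lambda = 2) == 0)  # close to exp(-2), about 0.135
```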