The main function defines the long options in the global g_longOpts
array. The g_longOpts and g_longEnd global variables are not declared
for general use in a header file, but are declared in class-specific headers
when needed. Currently that's only scenario/scenario.ih.
int main(int argc, char **argv)
try
{
    Arg const &arg = Arg::initialize("a:B:l:C:D:hoP:R:S:t:vV",
                        g_longOpts, g_longEnd, argc, argv);
    arg.versionHelp(usage, Icmake::version, arg.option('o') ? 0 : 1);

    Options::instance();            // construct the Options object
    Simulator simulator;            // construct the simulator
    simulator.run();                // and run it.
}
The simulations themselves are controlled by a Simulator, having only one
public member: run, performing the simulations.
Simulator is constructed by main. Its constructor determines the
source of the analysis specifications (setAnalysisSource).
The class defines the following data members:
    bool d_next = false;                        // true if 'run()' should run
                                                // an analysis
    uint16_t d_lineNr = 1;                      // updated by 'fileAnalysis'
    std::ifstream d_ifstream;                   // multiple analysis specs
    std::string (Simulator::*d_nextSpecs)();    // ptr to function handling
                                                // the (next) analysis spec.
If the command-line option -o was specified then analysis specifications
may be provided as command-line arguments, handled by the member
cmdLineAnalysis.
Otherwise a specification file must be provided as the first command-line
argument. This file is read until a line beginning with analysis: is
encountered. That line may be followed by specifications, which are read by
the member fileAnalysis.
In both cases the Simulator's member d_next is set to true.
Specifications provided in the analysis specification file (or on
the command line when -o was specified) are appended to a specification
string, separated by newline characters. The member fileAnalysis then
returns these specifications.
The member d_nextSpecs is set to either cmdLineAnalysis or
fileAnalysis. Eventually these members set d_next to false, ending
the analyses.
If d_next is true, run uses the analysis-specific modifications to
initialize a stream which is read by the Analysis object performing the
next analysis. Refer to section 2.3 for a
description of how the Analysis class modifies default parameter settings.
The run member performs all analyses. At construction time d_next may
be set to true, indicating that an analysis must be performed:
void Simulator::run()
{
    while (d_next)
    {
        uint16_t lineNr = d_lineNr;
                                    // read the next analysis specs
        string spec = (this->*d_nextSpecs)();
        emsg.setCount(0);
        Analysis analysis{ istringstream{ spec }, lineNr };
        analysis.run();             // run the simulation
    }
}
If a simulation must be performed then the non-default specifications are
provided by the member to which d_nextSpecs points.
The actual simulation is then performed by an Analysis object,
which also has a run public member performing the actual analysis.
Analysis class objects handle one simulation analysis. Since multiple
analyses are performed independently from each other, each Analysis object
initializes its own error count (class Error) and option handling (class
Options).
As options may specify the name of the file containing the
analysis parameters, Analysis objects also define a configuration file
object (ConfFile).
In simrisc's scientific context simulation parameters are also known as
`scenarios'. Scenarios contain information like the number of iterations to
perform and the number of cases to simulate. The analysis's Scenario
object is initialized by Analysis's constructor, and is used by its
run member.
Figure 2 provides an overview of Analysis's data members.
The class Analysis uses the classes Scenario, Loop, ConfFile,
Options and Error, which are covered in separate sections of this
technical manual.
When the Analysis object is constructed it constructs its own Scenario
object. That object receives the modifications that are specific for the
current analysis (e.g., that were specified following an analysis: line in
an analysis-specification file). When the Scenario object (cf. section
SCENARIO) returns specifications (e.g., using its lines members),
it returns the modified specifications if available. Modified
specifications must be complete, as they replace the corresponding
specifications in the used (standard) configuration file.
The run member may immediately end if errors were encountered in the
specifications of the Scenario and/or ConfFile parameters.
One of the options defines the base directory into which the output is
written. The member requireBase checks whether the base directory exists,
and if not creates it. If that fails the program ends, showing the message
Cannot create base directory ...
where ... is replaced by the name of the base directory provided by the
Options object.
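The check performed by requireBase might be sketched as follows. This is an
illustration using std::filesystem, not simrisc's actual code: the
free-function form, the std::string parameter and the bool return value are
assumptions. Where the real member ends the program, this sketch returns false
and leaves termination to the caller.

```cpp
#include <cassert>
#include <filesystem>
#include <iostream>
#include <string>
#include <system_error>

// Hypothetical free-function variant of requireBase: returns true if
// 'base' is an existing directory when the function returns.
bool requireBase(std::string const &base)
{
    namespace fs = std::filesystem;

    assert(!base.empty());              // a base directory must be named

    std::error_code ec;
    if (fs::is_directory(base, ec))     // already available: done
        return true;

    fs::create_directories(base, ec);   // otherwise try to create it
    if (ec || !fs::is_directory(base, ec))
    {
        std::cerr << "Cannot create base directory " << base << '\n';
        return false;
    }
    return true;
}
```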
If all's well, the actual simulation is then performed by a Loop object.
Scenario class objects are defined by Analysis class objects and
contain parameter values of the simulation to perform. For each separate
analysis a new Scenario object is constructed.
Most of its members are accessors: const members returning values of
scenario parameters.
Some parameter values are stored in the Scenario object itself. Refer to
the simrisc(1) manual page for a description of their default values:
d_iterations, the number of iterations to perform (defaults to 1);
d_labels, a vector of strings containing the values of label:
lines that may be specified in analysis: sections to briefly
describe the essence of simulations;
d_nWomen, the number of cases to simulate (defaults to 100). This
name was used in the original versions of the simrisc program, and may be
changed to d_nCases in future versions. For now it is kept;
d_seed, the initial seed for the fixed-seed or increasing-seed
random number generator. See the Random section for details about
how random values are generated;
d_spread, when set to true some (otherwise fixed) configuration
parameters may show random fluctuations. Cf. chapter 5 for
details.
Configuration parameters start with identifying names, like costs: or
screeningRounds:. Those names are then followed by the parameter's
specifications. Those specifications are made available by the members
value, returning the value of a numeric parameter;
lines, containing iterators to the lines in a ConfFile identified
by the identifying name;
find, returning the iterator to a single line identified by the
identifying name.
Loop is the `workhorse' class. It performs the simulation as
specified by the Scenario object which is passed to its objects when they
are constructed. The class Loop uses many other classes, most of which
encapsulate parameters specified in the configuration file. Those classes read
configuration file parameters and prepare these parameters for their use by
Loop.
The constructor of a Loop object defines the following objects:
d_costs: a Costs object providing various costs
parameters (cf. section 3.1). This object does not provide the
costs associated with various modalities. Those costs are directly
stored at the modalities themselves;
d_densities: a Densities object providing the breast-densities
parameters (cf. section 3.2);
d_modalities: a Modalities object providing the details about
which screening modalities are used in a specific simulation
(cf. section 3.3);
d_screening: a Screening object providing screening parameters
(cf. section 2.5.5);
d_tumor: a Tumor object handling the simulation related to the
occurrence of tumors (cf. section 3.5);
d_tumorInfo: a TumorInfo object providing tumor-related
configuration parameters (like incidence, growth and survival
parameters, cf. section 3.6).
It also defines a Random object (d_random), generating random numbers
(cf. section 4.1).
The iterate member performs the requested number of iterations (scenario:
iterations). At each iteration:
TumorInfo (cf. section 3.6) parameters may be
refreshed;
Loop's counters are reset:
d_totalCost, the sum of all simulated costs over all
screening rounds;
d_sumDeathAge, the sum of all simulated dying ages over all
screening rounds;
d_nIntervals, the vector storing the number of simulated interval
cancers per screening round;
d_nRoundFP, the vector storing the number of false positive
diagnoses per screening round;
d_nRoundFN, the vector storing the number of false negative
diagnoses per screening round;
d_nDetections, the vector storing the number of self-detected
cancers per screening round;
d_roundCost, the vector storing the total costs per screening
round;
d_modalities, the counters maintained by the used Modalities
are reset.
genCases;
genCases member itself).
genCases performs one complete iteration over all screening
rounds for all simulated cases. The option nCases may be used to simulate
a specific number of cases. When nCases is specified only the data of the
final case are written to file. By default as many cases as specified at the
`nWomen' parameter (stored in the Scenario object) are simulated, and the
data of all those simulated cases are written to file. Analyses up to a
specific number of cases, or a single simulation using a preset death-age (see
option --death-age in the simrisc(1) man-page) can also be
performed. The number of cases to simulate is determined in the nCases
member.
For each simulated case:
(d_caseCost), the tumor-caused death age
(d_deathAge), the screening round in which a tumor is detected
(d_roundDetected), and the indicator indicating whether the tumor has
been self-detected (d_selfDetected);
d_indices for the
different screening ages (the index computed for the first screening
age is stored at the first element). For each age a randomly
determined proportion is located in the cumulative proportions of the
bi-rads categories of its age group. The thus determined bi-rads
category is stored in d_indices. E.g., for age 54 these bi-rads
proportions could be used:
                  agegroup   bi-rads1  bi-rads2  bi-rads3  bi-rads4
breastDensities:  50 - 60    0.08      0.50      0.37      0.05
resulting in category 3 if the randomly selected proportion is 0.70
(the cumulative proportions being 0.08, 0.58, 0.95 and 1.00);
PRESENT. The
case's state changes during the simulation and the case's simulation
may end during the pre-screening phase, the screening phase, or the
post-screening phase, because of the case's natural death, because of
self-detection of a tumor or because of a tumor that's detected during
a screening cycle;
--nCases or
--death-age was specified: only the data of the requested single
simulated case).
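The lookup of a bi-rads category in the cumulative proportions, described in
the list above, can be sketched as follows. The function biRadsCategory is
hypothetical. Returning a 1-based category and treating a proportion that is
exactly equal to a cumulative boundary as belonging to the lower category are
assumptions made for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Returns the 1-based bi-rads category whose cumulative proportion is the
// first one that is at least equal to 'proportion' (a value in [0, 1]).
std::size_t biRadsCategory(std::array<double, 4> const &proportions,
                           double proportion)
{
    assert(0 <= proportion && proportion <= 1);

    double cumulative = 0;
    for (std::size_t idx = 0; idx != proportions.size(); ++idx)
    {
        cumulative += proportions[idx];
        if (proportion <= cumulative)
            return idx + 1;
    }
    return proportions.size();      // guards against rounding in the sum
}
```

Using the proportions 0.08, 0.50, 0.37 and 0.05 of the 50 - 60 age group, a
randomly selected proportion of 0.70 lies between the cumulative proportions
0.58 and 0.95, so category 3 is returned.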
if (Nscr > 0 && (naturalDeathAge < 1st screening age || (tumor present
&& tumor.selfDetectAge() < 1st screening age)))
This results in a needlessly complex implementation of the pre-screening
phase. It's much simpler to use the complement of this expression, skipping the
pre-screening phase if the complementary condition is true. The pre-screening
phase is therefore skipped if the following condition holds true:
not (Nscr > 0 && (naturalDeathAge < 1st screening age || (tumor present
&& tumor.selfDetectAge() < 1st screening age)))
The expression can be simplified using De Morgan's rule
!(a && b) == !a || !b:
not (Nscr > 0) or
not (
naturalDeathAge < 1st screening age or
(tumor present and tumor.selfDetectAge() < 1st screening age)
)
Consequently, pre-screening is skipped if there are no screening rounds
(not (Nscr > 0)) and also if the following condition holds true:
not (
naturalDeathAge < 1st screening age or
(tumor present and tumor.selfDetectAge() < 1st screening age)
)
Distributing the not-operator over the terms of the above condition, and
applying De Morgan's rule !(a || b) == !a && !b we get:
naturalDeathAge >= 1st screening age and
not (
tumor present and
tumor.selfDetectAge() < 1st screening age
)
Applying De Morgan's rule once more this finally results in:
naturalDeathAge >= 1st screening age and
(
not tumor present or
tumor.selfDetectAge() >= 1st screening age
)
Thus, pre-screening is skipped if the above condition holds true.
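The derivation can be checked mechanically by comparing both forms for all
sixteen combinations of their four boolean terms. In the following sketch the
function and parameter names are hypothetical: rounds stands for Nscr > 0,
deathBeforeFirst for naturalDeathAge < 1st screening age, and
selfDetectBeforeFirst for tumor.selfDetectAge() < 1st screening age.

```cpp
#include <cassert>

// The original program's condition: the pre-screening phase is performed.
bool runsPreScreening(bool rounds, bool deathBeforeFirst,
                      bool tumor, bool selfDetectBeforeFirst)
{
    return rounds && (deathBeforeFirst || (tumor && selfDetectBeforeFirst));
}

// The simplified complement: the pre-screening phase is skipped.
bool skipsPreScreening(bool rounds, bool deathBeforeFirst,
                       bool tumor, bool selfDetectBeforeFirst)
{
    return !rounds ||
           (!deathBeforeFirst && (!tumor || !selfDetectBeforeFirst));
}

// Verifies that the two conditions are each other's complement for
// every combination of the four boolean terms.
void checkEquivalence()
{
    for (unsigned mask = 0; mask != 16; ++mask)
    {
        bool rounds = mask & 1;
        bool death = mask & 2;
        bool tumor = mask & 4;
        bool selfDetect = mask & 8;

        assert(skipsPreScreening(rounds, death, tumor, selfDetect) ==
               !runsPreScreening(rounds, death, tumor, selfDetect));
    }
}
```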
loop/pretumordeath.cc) the case's simulation
ends, setting its status to LEFT_PRE.
If at this point there also happens to be an existing tumor then the
tumor's characteristics are determined as well (by Tumor::characteristics,
cf. section 3.5).
Loop::characteristics may be called during the pre-screening
phase and during the post-screening phase.
In both phases the tumor is self-detected, the tumor characteristics are
determined (Tumor::characteristics, Tumor::setDeathAge, cf. sections
3.5.2 and 3.5.3), and the treatment costs are
determined using the tumor's induced death age and the tumor's diameter
(Costs::treatment, cf. section 3.1).
If the case's natural death occurs before the tumor would have caused the
case's death then the case leaves the pre-screening or post-screening
simulation with status (respectively) LEFT_PRE and
LEFT_POST. Otherwise death was caused by the tumor and the case leaves
the pre-screening or post-screening simulation with status (respectively)
TUMOR_PRE and TUMOR_POST.
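The selection of the status with which a case leaves the pre-screening or
post-screening simulation can be summarized in a small sketch. The status
names are those used in this manual. The Phase type and the function
leaveStatus are hypothetical, and representing ages as double values is an
assumption.

```cpp
#include <cassert>

enum class Status
{
    PRESENT,
    LEFT_PRE, LEFT_DURING, LEFT_POST,
    TUMOR_PRE, TUMOR_DURING, TUMOR_POST
};

enum class Phase { PRE, POST };         // hypothetical helper type

// Natural death before the tumor-caused death: LEFT_..., else TUMOR_...
Status leaveStatus(Phase phase, double naturalDeathAge, double tumorDeathAge)
{
    assert(naturalDeathAge >= 0 && tumorDeathAge >= 0);

    bool natural = naturalDeathAge < tumorDeathAge;

    if (phase == Phase::PRE)
        return natural ? Status::LEFT_PRE : Status::TUMOR_PRE;

    return natural ? Status::LEFT_POST : Status::TUMOR_POST;
}
```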
As long as the case simulation has not ended (i.e., the case's state is
PRESENT) a screening is performed for each of the screening rounds defined
by the Screening object (cf. section 3.4), initialized in
Loop's constructor.
At each screening round two actions are performed:
PRESENT and the screening phase
ends for that case (see the next section for details);
Loop::leaving. In the
original program this is determined as follows:
Converting this condition, then the case leaves the simulation if
If at this point the case hasn't left the simulation, then
Loop::intervalCancer);
LEFT_DURING as the
case's natural death age has caused the case to leave the simulation.
However, a tumor might still have developed at that point, and if so
the tumor's characteristics are determined at the case's natural death
age (cf. Tumor::characteristics, section 3.5).
Loop::screen
simulates a screening at a given screening age.
At this point the modalities are considered. Each of the modalities configured for the current screening age is considered in turn. They are considered in their order of specification in the configuration file. E.g., when specifying
screeningRound: 50 Mammo MRI
screeningRound: 52 MRI Mammo
then Mammo is considered before MRI at screening age 50, and MRI is
considered before Mammo at screening age 52.
Modalities are made available by the Modalities member (d_modalities,
cf. section 3.3). Its use member returns the
information of all modalities that have been configured for the current
screening round.
Whether a configured modality is actually going to be used at a particular
screening round is determined by chance. In the configuration file the
parameter attendanceRate defines the probability that a case attends a
screening round. If the next random value drawn from the uniform random
distribution exceeds the configured attendance rate then the screening round
for that modality is skipped for the current case.
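The attendance decision can be sketched as follows. The functions attends and
drawAttendance are hypothetical, and std::mt19937 merely stands in for
simrisc's own Random class (cf. section 4.1).

```cpp
#include <cassert>
#include <random>

// A case attends unless the uniform random value exceeds the configured
// attendance rate.
bool attends(double attendanceRate, double uniformValue)
{
    assert(0 <= attendanceRate && attendanceRate <= 1);
    return !(uniformValue > attendanceRate);
}

// Draws the next value from the uniform [0, 1) distribution and decides.
bool drawAttendance(double attendanceRate, std::mt19937 &engine)
{
    std::uniform_real_distribution<double> uniform{ 0, 1 };
    return attends(attendanceRate, uniform(engine));
}
```

With an attendance rate of 0.8 a random value of 0.5 results in attendance,
whereas a random value of 0.9 causes the screening for that modality to be
skipped.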
If a case attends a screening round then the screening round's costs are
determined (cf. section 3.1) and are added to the costs so far
(cf. Loop::addCost): the costs are added to the case's accumulated cost
and to the accumulated costs of the current screening round.
In addition, if a tumor exists at a screening round then the tumor's characteristics are determined for the current screening age.
Two factors determine whether a tumor may be detected or whether its detection may be a false positive. One factor (factor-1) is the (apparent) presence of a tumor, the other factor (factor-2) is whether the screening round's age is at least equal to the age that the tumor can be detected.
If both factors are true, then the tumor may be detected. Otherwise there may be a false positive tumor detection.
Maybe detecting the tumor (maybe a false negative conclusion):
The member maybeDetect (cf. Loop::maybeDetect) is used to decide
whether a tumor may be found during the screening. A false negative screening
result is obtained if a random value exceeds the current modality's
sensitivity. The sensitivities of the various modalities are returned by the
ModBase::sensitivity member, which in turn calls the virtual member
vSensitivity, returning the value computed by the overriding vSensitivity
member of the modality that is actually used.
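The relation between ModBase::sensitivity and vSensitivity follows the
non-virtual interface pattern, sketched below. Only the names ModBase,
sensitivity and vSensitivity are taken from this manual. The age parameter,
the derived class FixedMod and the function falseNegative are hypothetical and
merely illustrate the mechanism.

```cpp
#include <cassert>

class ModBase
{
    public:
        virtual ~ModBase() = default;

        // public non-virtual interface (the parameter is an assumption)
        double sensitivity(double age) const
        {
            double ret = vSensitivity(age);
            assert(0 <= ret && ret <= 1);
            return ret;
        }

    private:
        // overridden by the classes of the actual modalities
        virtual double vSensitivity(double age) const = 0;
};

// Hypothetical modality having an age-independent sensitivity.
class FixedMod: public ModBase
{
    double d_sensitivity;

    public:
        explicit FixedMod(double sensitivity)
        :
            d_sensitivity(sensitivity)
        {}

    private:
        double vSensitivity(double) const override
        {
            return d_sensitivity;
        }
};

// A false negative is obtained if the random value exceeds the
// modality's sensitivity.
bool falseNegative(ModBase const &modality, double age, double uniformValue)
{
    return uniformValue > modality.sensitivity(age);
}
```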
If a false negative result isn't obtained then a tumor was detected: its treatment costs are added to the accumulated costs and the dying age because of the tumor is set to the age of the current screening round.
If the natural dying age is earlier than the dying age caused by the
cancer, then the case leaves the simulation (using status LEFT_DURING).
Otherwise the case leaves the simulation using status TUMOR_DURING.
Maybe incorrectly detecting the tumor (maybe a false positive conclusion):
Once a tumor has apparently been observed it may in fact not exist, in which
case a false positive observation was made. The member maybeFalsePositive
handles this situation.
The specificity of the used modality, given the current screening age, is
compared to a random value from the uniform random distribution. If the
generated random value exceeds the modality's specificity then the simulation
has encountered a false positive tumor detection. The numbers of false
positive decisions for the modality and for the screening round are
incremented and addCost is called, passing it the biopsy costs at the
current screening age.
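The false positive decision can be sketched in the same style. The struct
RoundCounters and the free function below are hypothetical simplifications:
the real member also updates the modality's counter and adds the costs via
addCost, which is reduced here to a single cost accumulator.

```cpp
#include <cassert>

// Hypothetical per-round bookkeeping used by this sketch only.
struct RoundCounters
{
    unsigned nFalsePositives = 0;
    double cost = 0;
};

// Returns true, updating the counters, if the random value exceeds the
// modality's specificity (i.e., at a false positive detection).
bool maybeFalsePositive(RoundCounters &round, double specificity,
                        double biopsyCost, double uniformValue)
{
    assert(0 <= specificity && specificity <= 1);

    if (!(uniformValue > specificity))
        return false;

    ++round.nFalsePositives;
    round.cost += biopsyCost;       // stand-in for calling addCost
    return true;
}
```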
If there is a tumor and the tumor's self-detection age is before the case's natural death then the tumor characteristics and treatment costs are determined (cf. section 2.5.4).
On the other hand, if there is no tumor or if the tumor's self-detection age would have been after the case's natural death then the case leaves the simulation at the case's natural death age. In this latter case (although there is a tumor, it hasn't caused death) the tumor's characteristics are determined as well (cf. section 3.5.2).