Chapter 6 Troubleshooting PMwG errors
6.1 Writing your log-likelihood function: Tips, errors and check list
1. The parameter specified does not exist
The parameter name is not specified to be estimated i.e. it is not in the parameter names argument or it is misspelled. Make sure pars
vector contains the same parameter names you have included in your log-likelihood function and it is the same length. Do not rely on the log likelihood function to throw an error in this case.
(e.g.x[‘b’]
)
2. All non-continuous data frame variables must be a factor
.
Data frame variables should be factors
unless the variable is a continuous variable e.g. response time.
If you pass character
variables to if
statements and/or for
loops in your log likelihood function, errors will not occur, however, your log likelihood estimate will be incorrect. For example,
avoid using character strings like data$condition == “easy”
. If you must use a character string, be sure to convert the string to a factor with as.factor
.
3. Spelling errors or mismatched column name references
Correctly reference data frame column names in your log likelihood function e.g. data$RT
!= data$rt
.
4. When initialising a vector of parameter values - values are not filling in properly
E.g. When a vector for b for all the values across the data set to be used, but there are NAs filling it somewhere.
5. Make sure operations are done on the right scale.
6. Data frame variables are scaled appropriately for the model
Check your variables are correctly scaled and in the correct units. For example, with the LBA, response times must be in seconds rather than milliseconds.
7. The log-likelihood is printed/outputted at the end of function
Make sure your log-likelihood function prints an estimate at the end of the function and the estimate is correctly obtained e.g. sum the log-likelihood estimates for all trials/rows.
8. Sampling error occurs
When sampling, the generated columns are not outputted
9. When executing functions row by row (i.e. trial-wise), index MUST be included
If writing a trial-wise/row-wise function (e.g. if
statement, for
loop), index i
must be specified.
if (data$condition == “easy”) # Incorrect reference when iterating over variable
if (data$condition[i] == “easy”) # Include i index
10. Changing parameter values changes the log-likelihood estimate
A simple check to run on your log-likelihood function is to modify your parameter values and observe the change to log-likelihood estimate. Then check if changing parameter values which rely on conditions actually change the log-likelihood estimate.
11. Make sure you have the latest version of the PMwG Samplers package
checkVersion(“pmwg”)
12. Number of iterations for ’burn-in` stage of sampler
We suggest running burn-in
for few iterations and particles first. This will give you a sense of a) whether the sampler is working as intended (see troubleshooting/checks for what parameter chains should look like), b) the number of iterations & particles needed to achieve the target acceptance rate, as well as the appropriate epsilon value. The acceptance rate is generally very high for the first few iterations ( > 100) and then declines. After the initial short run, you can check and optimise the number of particles to be used (and balance with epsilon), so the acceptance rate is close to 30% on a longer, full run. We recommend you start with epsilon = .5 to increase efficiency, then adjust as needed.
NOTE: Overall, we aim for ~30% acceptance rate of particles. High acceptance rates may be inaccurate if burn-in runs for few iterations. Low acceptance rates are inefficient and may fail to create an efficient distribution for the sampling stage.
13. My model is taking a lifetime to run If you’re running your models on a laptop and the run time is unreasonable, if possible, run your models on a computing grid with multiple CPU cores.