Note that the time and event variables are named ‘time’ and ‘cen’ in both the model and the data cohort. These names must match when we come to make our comparison.
We should also check that, in the data cohort, categorical variables are of class factor and dummy encoded, and that continuous variables are of class numeric. The structure of the data cohort, as printed by str(), is:
'data.frame': 362 obs. of 7 variables:
$ ResecM : Factor w/ 2 levels "0","1": 1 1 1 2 2 2 2 1 2 2 ...
$ LymphN : Factor w/ 2 levels "0","1": 2 1 2 1 2 1 2 1 1 1 ...
$ Diff_Status: Factor w/ 3 levels "0","1","2": 3 1 2 2 1 1 1 2 3 2 ...
$ treat : chr "GEMCAP" "GEMCAP" "GEMCAP" "GEMCAP" ...
$ time : num 8.71 49.28 6.74 21.91 23.55 ...
$ cen : int 1 1 1 1 1 0 0 1 0 0 ...
$ PostOpCA199: num 1.38 0 1.82 1.13 5.86 ...
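The same class check can be done programmatically rather than by reading the printed structure. The following is a minimal sketch on a small made-up data frame (toy and its values are hypothetical stand-ins, not the real cohort); it lists any column that is not a factor, numeric, or integer.

```r
# Toy stand-in for the data cohort (hypothetical values, not the real data)
toy <- data.frame(
  LymphN      = factor(c("0", "1", "1")),
  treat       = c("GEMCAP", "GEMCAP", "GEMCAP"),
  time        = c(8.71, 49.28, 6.74),
  PostOpCA199 = c(1.38, 0, 1.82),
  stringsAsFactors = FALSE
)

# First class of each column, as a named character vector
col_class <- vapply(toy, function(x) class(x)[1], character(1))

# Names of columns that are neither factor, numeric, nor integer
needs_attention <- names(col_class)[!col_class %in% c("factor", "numeric", "integer")]
needs_attention
```

Here only the character column treat is flagged. Whether a flagged column actually needs converting depends on whether it is used as a covariate in the model.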
If a variable is not of the correct class, it can be converted:
espac4_gemcap$LymphN <- as.factor(espac4_gemcap$LymphN)
espac4_gemcap$PostOpCA199 <- as.numeric(espac4_gemcap$PostOpCA199)
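One caveat when converting: if a continuous variable has been read in as a factor, as.numeric() returns the internal level codes rather than the recorded values. The sketch below uses a made-up vector (x is hypothetical) to show the difference; converting through as.character() first recovers the values.

```r
# Hypothetical continuous measurements that were read in as a factor
x <- factor(c("1.38", "0", "1.82"))

# Direct conversion returns the integer level codes, not the measurements
codes <- as.numeric(x)

# Converting via character recovers the recorded values
values <- as.numeric(as.character(x))

codes
values
```

If the variable is stored as character or integer, a direct call to as.numeric() is sufficient.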
It is also important to ensure that every variable in the model is present in the comparison dataset, and that the names match exactly:
names(flsm$data$m[,2:5]) %in% names(espac4_gemcap)
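The %in% check returns one TRUE or FALSE per model variable. When there are many variables it can be easier to list the missing names directly. This is a minimal sketch using made-up name vectors (model_vars and cohort_vars are hypothetical; in practice they would come from the model and the data cohort as above).

```r
# Hypothetical variable names, for illustration only
model_vars  <- c("ResecM", "LymphN", "Diff_Status", "PostOpCA199")
cohort_vars <- c("ResecM", "LymphN", "Diff_Status", "treat", "time", "cen")

# Single TRUE/FALSE answer: are all model variables present in the cohort?
all_present <- all(model_vars %in% cohort_vars)

# Names of model variables that are absent from the cohort
missing_vars <- setdiff(model_vars, cohort_vars)

all_present
missing_vars
```

In this toy case PostOpCA199 is deliberately left out of cohort_vars, so it is reported as missing. Because the comparison is by exact name, differences in case or spelling are also reported as missing.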
Now that we have checked that the correct variables are in our dataset and that they are encoded correctly, we are ready to use the psc package to compare two treatments!