I have a data frame, shown below, with participant IDs and a clinical diagnosis column: the two types of clinical diagnosis are coded 1 and 2, and control participants are coded 0.
I also have a subclass column produced by the MatchIt package: each participant with a clinical diagnosis (the treated group) has been assigned a control participant by nearest neighbor matching, and matched participants share the same subclass value.
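# Column names must be syntactically valid, so no spaces
df <- data.frame(
  ID = c(1, 2, 3, 4, 5, 6, 7, 8),
  Clinical_Diagnosis = c(1, 1, 2, 2, 0, 0, 0, 0),
  Subclass = c(1, 2, 3, 4, 1, 2, 3, 4)
)
# Print the rows themselves (summary() would show per-column statistics instead)
print(df, row.names = FALSE)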
 ID Clinical_Diagnosis Subclass
  1                  1        1
  2                  1        2
  3                  2        3
  4                  2        4
  5                  0        1
  6                  0        2
  7                  0        3
  8                  0        4
I would like to create a simple variable called match_group in which all participants with Clinical_Diagnosis == 1 and their respective matched control participants (in this case ID = 5 and 6, which share subclasses 1 and 2) are assigned a value of 1, and all participants with Clinical_Diagnosis == 2 and their respective matched control participants are assigned a value of 2.
I have tried running two separate matching procedures and creating a variable based on which matching run each participant came from. However, when using full and exact matching with a larger dataset, I end up with duplicated participants. I can identify these, but I would rather avoid them, so I want to derive match_group from a single MatchIt run.
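To make the desired result concrete, here is a minimal base R sketch of one way match_group could be derived from the subclass column. It assumes that controls are always coded 0 and that each subclass contains treated participants with only one diagnosis code; with full matching that second assumption may not hold and should be checked first. Rows with a missing Subclass (unmatched participants) would need separate handling.

```r
# Example data from the question
df <- data.frame(
  ID = 1:8,
  Clinical_Diagnosis = c(1, 1, 2, 2, 0, 0, 0, 0),
  Subclass = c(1, 2, 3, 4, 1, 2, 3, 4)
)

# Within each subclass, take the largest diagnosis code.
# Controls are coded 0, so (under the assumptions above) this is the
# diagnosis of the treated participant(s) in that subclass.
df$match_group <- ave(df$Clinical_Diagnosis, df$Subclass, FUN = max)

print(df, row.names = FALSE)
```

With the example data, participants 1, 2, 5 and 6 end up in match_group 1, and participants 3, 4, 7 and 8 in match_group 2.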