R strsplit: ignore commas inside parentheses

I’m working on a survey, and many of the written answers contain several categories separated by commas. I have used strsplit successfully to separate them, like this:

library(stringr)  # provides str_trim()
sss6 <- str_trim(unlist(strsplit(aiprm$step_do_you_anticipate, split=",")))

This works for strings like the following, so I can count each category correctly in order to make visualizations.

Grammar, None of the above, Grammar, Subject matter expertise, Grammar, Subject matter expertise, Bias, Grammar, Subject matter expertise, Bias, Fact-checking
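
For reference, the split-and-count step described above can be sketched in base R as follows. This is only a minimal illustration: the variable names `x`, `parts`, and `counts` are hypothetical, the sample string is copied from the line above rather than read from the real survey column, and `trimws()` is used in place of `str_trim()` so that no package is needed.

```r
# Sample answer string copied from the example above (hypothetical variable name)
x <- "Grammar, None of the above, Grammar, Subject matter expertise, Grammar, Subject matter expertise, Bias, Grammar, Subject matter expertise, Bias, Fact-checking"

# Split on every comma, flatten the list, and strip surrounding whitespace
parts <- trimws(unlist(strsplit(x, split = ",")))

# Tally how often each category occurs
counts <- table(parts)
print(sort(counts, decreasing = TRUE))
```

The resulting table can be passed to a plotting function such as `barplot(counts)`.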

The problem now is that I have text with parentheses that contain commas, and I would like the commas inside the parentheses “()” to be ignored. Here is an example of that.

Ad copy, JavaScript code, headlines, compelling copy, commercial ideas, Ad copy, Title & meta description, Idea generation (topics, headlines), Code, Idea generation (topics, headlines), Ad copy, Idea generation (topics, headlines)

Is there any way to tell the strsplit() function not to split on the commas that are inside parentheses? The main problem is (topics, headlines).

Thanks!
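
One possible approach, sketched here under the assumption that parentheses are never nested and are always closed, is to pass `strsplit()` a Perl-style regular expression with a negative lookahead. The pattern splits on a comma only when the text after it does not reach a closing parenthesis before any other parenthesis. The variable names `x` and `parts` are hypothetical, and the sample string is copied from the example above.

```r
# Sample answer string copied from the example above (hypothetical variable name)
x <- "Ad copy, JavaScript code, headlines, compelling copy, commercial ideas, Ad copy, Title & meta description, Idea generation (topics, headlines), Code, Idea generation (topics, headlines), Ad copy, Idea generation (topics, headlines)"

# ",\\s*"            a comma plus any whitespace after it
# "(?![^()]*\\))"    not followed by non-parenthesis characters and then ")",
#                    i.e. the comma is not inside an open "(...)" group
# Assumption: parentheses are balanced and not nested.
parts <- trimws(unlist(strsplit(x, split = ",\\s*(?![^()]*\\))", perl = TRUE)))

print(parts)
print(table(parts))
```

With this pattern, "Idea generation (topics, headlines)" should stay together as a single category. If the survey data can contain nested or unbalanced parentheses, this pattern is not sufficient and a small parser would be a safer choice.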

    Is the problematic string in parentheses always “(topics, headlines)” or will it change? Also, do you want to keep the problematic part in parentheses or do you want to remove it?
