Parsing column and splitting it into tabular form

倾然丶 夕夏残阳落幕 submitted on 2020-08-10 19:13:33

Question


This is my sample output table. I want to parse it into the form shown in the image I will post.

     dput(head(tbl2,20))
structure(list(PMCID = c("PMC7362563", "PMC7362563", "PMC7362563", 
"PMC7362563", "PMC7362563", "PMC7362563", "PMC7362563", "PMC7362563", 
"PMC7362563", "PMC7362563", "PMC7362563", "PMC7362563", "PMC7362563", 
"PMC7362563", "PMC7362563", "PMC7362563", "PMC7362563", "PMC7362563", 
"PMC7362563", "PMC7362563"), table = c("Table 1", "Table 1", 
"Table 1", "Table 1", "Table 1", "Table 1", "Table 1", "Table 1", 
"Table 1", "Table 2", "Table 2", "Table 2", "Table 2", "Table 2", 
"Table 2", "Table 2", "Table 2", "Table 2", "Table 2", "Table 2"
), row = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 1L, 2L, 3L, 4L, 
5L, 6L, 7L, 8L, 9L, 10L, 11L), text = c("subheading=Achieved CR; Best overall response, n (%)=CR; Glasdegib + LDACN = 78=15 (19.2); LDAC aloneN = 38=1 (2.6)", 
"subheading=Did not achieve CR; Best overall response, n (%)=CRi; Glasdegib + LDACN = 78=4 (5.1); LDAC aloneN = 38=1 (2.6)", 
"subheading=Did not achieve CR; Best overall response, n (%)=PR; Glasdegib + LDACN = 78=5 (6.4); LDAC aloneN = 38=0", 
"subheading=Did not achieve CR; Best overall response, n (%)=PRi; Glasdegib + LDACN = 78=2 (2.6); LDAC aloneN = 38=0", 
"subheading=Did not achieve CR; Best overall response, n (%)=MLFS; Glasdegib + LDACN = 78=2 (2.6); LDAC aloneN = 38=0", 
"subheading=Did not achieve CR; Best overall response, n (%)=MR; Glasdegib + LDACN = 78=4 (5.1); LDAC aloneN = 38=4 (10.5)", 
"subheading=Did not achieve CR; Best overall response, n (%)=SD; Glasdegib + LDACN = 78=14 (17.9); LDAC aloneN = 38=9 (23.7)", 
"subheading=Did not achieve CR; Best overall response, n (%)=Treatment failure; Glasdegib + LDACN = 78=9 (11.5); LDAC aloneN = 38=7 (18.4)", 
"subheading=Did not achieve CR; Best overall response, n (%)=Not evaluable; Glasdegib + LDACN = 78=23 (29.5); LDAC aloneN = 38=16 (42.1)", 
"subheading=Age (years), n (%); Characteristic=45–64; Achieved CR: Glasdegib + LDACN = 15=0; Achieved CR: LDAC aloneN = 1=0; Did not achieve CR: Glasdegib + LDACN = 63=1 (1.6); Did not achieve CR: LDAC aloneN = 37=1 (2.7)", 
"subheading=Age (years), n (%); Characteristic=≥ 65; Achieved CR: Glasdegib + LDACN = 15=15 (100); Achieved CR: LDAC aloneN = 1=1 (100); Did not achieve CR: Glasdegib + LDACN = 63=62 (98.4); Did not achieve CR: LDAC aloneN = 37=36 (97.3)", 
"subheading=Age (years), n (%); Characteristic=Median (range); Achieved CR: Glasdegib + LDACN = 15=74 (65–87); Achieved CR: LDAC aloneN = 1=78 (78–78); Did not achieve CR: Glasdegib + LDACN = 63=77 (64–92); Did not achieve CR: LDAC aloneN = 37=76 (58–83)", 
"subheading=Sex, n (%); Characteristic=Female; Achieved CR: Glasdegib + LDACN = 15=5 (33.3); Achieved CR: LDAC aloneN = 1=1 (100); Did not achieve CR: Glasdegib + LDACN = 63=14 (22.2); Did not achieve CR: LDAC aloneN = 37=14 (37.8)", 
"subheading=Sex, n (%); Characteristic=Male; Achieved CR: Glasdegib + LDACN = 15=10 (66.7); Achieved CR: LDAC aloneN = 1=0; Did not achieve CR: Glasdegib + LDACN = 63=49 (77.8); Did not achieve CR: LDAC aloneN = 37=23 (62.2)", 
"subheading=ECOG PS, n (%); Characteristic=0; Achieved CR: Glasdegib + LDACN = 15=0; Achieved CR: LDAC aloneN = 1=1 (100); Did not achieve CR: Glasdegib + LDACN = 63=10 (15.9); Did not achieve CR: LDAC aloneN = 37=2 (5.4)", 
"subheading=ECOG PS, n (%); Characteristic=1; Achieved CR: Glasdegib + LDACN = 15=5 (33.3); Achieved CR: LDAC aloneN = 1=0; Did not achieve CR: Glasdegib + LDACN = 63=21 (33.3); Did not achieve CR: LDAC aloneN = 37=17 (45.9)", 
"subheading=ECOG PS, n (%); Characteristic=2; Achieved CR: Glasdegib + LDACN = 15=10 (66.7); Achieved CR: LDAC aloneN = 1=0; Did not achieve CR: Glasdegib + LDACN = 63=31 (49.2); Did not achieve CR: LDAC aloneN = 37=18 (48.6)", 
"subheading=ECOG PS, n (%); Characteristic=Not reported; Achieved CR: Glasdegib + LDACN = 15=0; Achieved CR: LDAC aloneN = 1=0; Did not achieve CR: Glasdegib + LDACN = 63=1 (1.6); Did not achieve CR: LDAC aloneN = 37=0", 
"subheading=Cytogenetic risk, n (%); Characteristic=Good/intermediate risk; Achieved CR: Glasdegib + LDACN = 15=12 (80.0); Achieved CR: LDAC aloneN = 1=0; Did not achieve CR: Glasdegib + LDACN = 63=41 (65.1); Did not achieve CR: LDAC aloneN = 37=22 (59.5)", 
"subheading=Cytogenetic risk, n (%); Characteristic=Poor risk; Achieved CR: Glasdegib + LDACN = 15=3 (20.0); Achieved CR: LDAC aloneN = 1=1 (100); Did not achieve CR: Glasdegib + LDACN = 63=22 (34.9); Did not achieve CR: LDAC aloneN = 37=15 (40.5)"
)), row.names = c(NA, -20L), class = c("tbl_df", "tbl", "data.frame"
))

So I want to get this data frame back into the original Table 1 and Table 2 form.

I tried splitting the columns without much success; apparently I am not able to define a pattern that handles the "=" and ":" separators in all the tables.

Questions:

  1. How can I split the text column of the data frame into the tabular form shown in my figure?
  2. After separating the column, I would like to write each table into a folder named after its PMCID. If a paper has 3 tables, then all 3 must be written as separate files in the same folder.
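
One possible approach to both points, as a minimal base R sketch rather than a definitive solution. It assumes that every cell in text has the form "header=value", that cells are separated by "; ", and that the value itself never contains "=", so the last "=" in each cell is the real separator (headers such as "Glasdegib + LDACN = 78" contain one of their own). The ":" inside the Table 2 headers is left untouched as part of the column name. The function names parse_row, rebuild_table and write_tables, the tbl2_demo data frame (three rows copied from the dput output above) and the CSV file naming are all made up for illustration.

```r
# Split one text string into a named character vector (header -> value).
# The greedy "(.*)=" makes the split happen at the LAST "=" of each cell.
parse_row <- function(txt) {
  pieces <- strsplit(txt, "; ", fixed = TRUE)[[1]]
  keys <- sub("^(.*)=(.*)$", "\\1", pieces)
  vals <- sub("^(.*)=(.*)$", "\\2", pieces)
  setNames(vals, trimws(keys))
}

# Rebuild one table: one output row per text string, one column per header.
rebuild_table <- function(texts) {
  rows <- lapply(texts, parse_row)
  cols <- unique(unlist(lapply(rows, names)))
  cells <- lapply(rows, function(r) unname(r[cols]))  # NA where a header is absent
  df <- as.data.frame(do.call(rbind, cells), stringsAsFactors = FALSE)
  names(df) <- cols
  df
}

# Write every (PMCID, table) group to <out_dir>/<PMCID>/<table>.csv.
write_tables <- function(tbl, out_dir = ".") {
  groups <- split(tbl, list(tbl$PMCID, tbl$table), drop = TRUE)
  for (g in groups) {
    g <- g[order(g$row), ]
    folder <- file.path(out_dir, g$PMCID[1])
    dir.create(folder, recursive = TRUE, showWarnings = FALSE)
    path <- file.path(folder, paste0(gsub(" ", "_", g$table[1]), ".csv"))
    write.csv(rebuild_table(g$text), path, row.names = FALSE)
  }
  invisible(NULL)
}

# Small demo: three rows taken from the dput output above.
tbl2_demo <- data.frame(
  PMCID = "PMC7362563",
  table = c("Table 1", "Table 1", "Table 2"),
  row   = c(1L, 2L, 4L),
  text  = c(
    "subheading=Achieved CR; Best overall response, n (%)=CR; Glasdegib + LDACN = 78=15 (19.2); LDAC aloneN = 38=1 (2.6)",
    "subheading=Did not achieve CR; Best overall response, n (%)=CRi; Glasdegib + LDACN = 78=4 (5.1); LDAC aloneN = 38=1 (2.6)",
    "subheading=Sex, n (%); Characteristic=Female; Achieved CR: Glasdegib + LDACN = 15=5 (33.3); Achieved CR: LDAC aloneN = 1=1 (100); Did not achieve CR: Glasdegib + LDACN = 63=14 (22.2); Did not achieve CR: LDAC aloneN = 37=14 (37.8)"
  ),
  stringsAsFactors = FALSE
)

print(rebuild_table(tbl2_demo$text[1:2]))
write_tables(tbl2_demo, out_dir = file.path(tempdir(), "tables"))
```

With the full data, write_tables(tbl2, out_dir = "some/folder") would be the equivalent call. If a value can itself contain "=" or "; ", the splitting rule above would need to be adjusted.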

Any suggestion or help would be highly appreciated.

Source: https://stackoverflow.com/questions/63171300/parsing-column-and-splitting-it-into-tabular-form
