Question
I am having significant trouble using the tabulizer package in R to extract text from tables. The issue is that the tables have a very odd structure (merged cells)...
I am trying to extract the section of the table that is highlighted in red. The numbers at the top of the highlighted portion are the days of the month. For each day, I need to record the values for "Row1" to "Row5". However, when I use the extract_tables function I get the following table (only a small portion)...
For some reason, days 1, 2 and 3 are being squished into a single cell. Has anyone else run into this issue using tabulizer? I would specify the coordinates of the table that I am trying to extract, but the position of the table changes with each PDF document. I also cannot specify the region manually because I am trying to automate the process. I can't upload the PDF document to Dropbox and post the link here because I am on my work computer. I can post it tonight if anyone wants to try this particular example. Any help or resources are very much appreciated!
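To make the symptom concrete, here is a minimal base R sketch of one possible coordinate-free workaround: post-processing the extracted matrix by splitting the squished cells on whitespace. The matrix `raw` is a hypothetical stand-in for one element of the list returned by extract_tables, with invented values rather than the real PDF contents, and `split_row` is a helper name made up for this illustration. It assumes the merged values are separated by whitespace and that every row splits into the same number of values, which may not hold for the actual documents.

```r
# Hypothetical stand-in for one character matrix returned by extract_tables();
# the values are invented for illustration, not taken from the actual PDF.
raw <- matrix(
  c("",     "1 2 3",    "4",
    "Row1", "10 20 30", "40",
    "Row2", "11 21 31", "41"),
  nrow = 3, byrow = TRUE
)

# Hypothetical helper: split each non-empty cell of a row on whitespace
# and flatten the pieces into a single character vector.
split_row <- function(cells) {
  cells <- trimws(cells)
  unlist(strsplit(cells[nzchar(cells)], "\\s+"))
}

# Header row (days of the month), dropping the label column.
days <- split_row(raw[1, -1])

# Data rows: split each one the same way. This relies on every row
# yielding the same number of values so that apply() returns a matrix.
values <- t(apply(raw[-1, -1, drop = FALSE], 1, split_row))
dimnames(values) <- list(raw[-1, 1], days)

print(values)
```

With this layout, `values["Row1", "2"]` looks up the Row1 entry for day 2. If some rows split into a different number of pieces than the header, the assumption above is violated and the cells would need to be aligned some other way.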
Source: https://stackoverflow.com/questions/60571187/extracting-text-from-a-table-in-r