Notepad++ regex reorder text a bit more complex

若如初见. Submitted on 2019-12-23 05:11:42

Question


To keep things simple, I originally asked a reduced version of this question and got an answer that works correctly: regex to reorder text with MS Word 2010 or Notepad++

Unfortunately, when I tried to apply what I learned to the real data, I could not figure it out.

It's a lexicon, so each entry (÷H1, ÷H2, ÷H3, and so on) has one or more "KJV Occurrences". For example, "÷H1" has only 1 word: "abidan".
But "÷H2" has 2 words: "abida" and "abidah".
And "÷H3" has 3 words: "abijah", "abiah" and "abia".

(Some entries have as many as 100 words, and I need to account for all of them as shown at the bottom, tab-delimited, so I can then open the text file in MS Excel.)

This is how I have it:

÷H1
אֲבִידָן
’ăbı̂ydân
BDB Definition:
Abidan = “my father is judge”
1) a prince (ruler) of Benjamin
Part of Speech: noun proper masculine
A Related Word by BDB/Strong’s Number: from H1 and H1777
Total KJV Occurrences: 5
abidan, 5
Num_1:11, Num_2:22, Num_7:60, Num_7:65, Num_10:24

÷H2
אֲבִידָע
’ăbı̂ydâ‛
BDB Definition:
Abida or Abidah = “my father knows”
1) fourth son of Midian and grandson of Abraham by his wife Keturah (after Sarah died)
Part of Speech: noun proper masculine
A Related Word by BDB/Strong’s Number: from H1 and H3045
Total KJV Occurrences: 2
abida, 1
1Ch_1:33
abidah, 1
Gen_25:4

÷H3
אֲבִיָּהוּ / אֲבִיָּה
’ăbı̂yâh / ’ăbı̂yâhû
BDB Definition:
Abia or Abiah or Abijah = “Jehovah is (my) father”
1) king of Judah, son and successor of Rehoboam
2) second son of Samuel
3) son of Jeroboam the first, king of Israel
4) son of Becher, a Benjamite
5) head of a priestly house (one of the 24 Levite groups)
6) head of a priestly house (after the exile)
7) wife of Hezron
8) mother of Hezekiah (compare H21)
Part of Speech: noun proper masculine
A Related Word by BDB/Strong’s Number: from H1 and H3050
Total KJV Occurrences: 25
abijah, 20
1Ki_14:1, 1Ch_24:10, 2Ch_11:20, 2Ch_11:22, 2Ch_12:16, 2Ch_13:1-4 (4), 2Ch_13:15, 2Ch_13:17, 2Ch_13:19-22 (4), 2Ch_29:1 (2), Neh_10:7, Neh_12:4, Neh_12:17
abiah, 4
1Sa_8:2, 1Ch_2:24, 1Ch_6:28, 1Ch_7:8
abia, 1
1Ch_3:10


This is how I need it:

÷H1 TABDELIMITED אֲבִידָן TABDELIMITED ’ăbı̂ydân TABDELIMITED BDB Definition: Abidan = “my father is judge”. 1) a prince (ruler) of Benjamin. TABDELIMITED Part of Speech: noun proper masculine. TABDELIMITED A Related Word by BDB/Strong’s Number: from H1 and H1777. TABDELIMITED Total KJV Occurrences: 5 TABDELIMITED abidan TABDELIMITED , 5 TABDELIMITED Num_1:11, Num_2:22, Num_7:60, Num_7:65, Num_10:24

÷H2 TABDELIMITED אֲבִידָע TABDELIMITED ’ăbı̂ydâ‛ TABDELIMITED BDB Definition: Abida or Abidah = “my father knows”. 1) fourth son of Midian and grandson of Abraham by his wife Keturah (after Sarah died). TABDELIMITED Part of Speech: noun proper masculine. TABDELIMITED A Related Word by BDB/Strong’s Number: from H1 and H3045. TABDELIMITED Total KJV Occurrences: 2 TABDELIMITED abida TABDELIMITED , 1 TABDELIMITED 1Ch_1:33

÷H2 TABDELIMITED אֲבִידָע TABDELIMITED ’ăbı̂ydâ‛ TABDELIMITED BDB Definition: Abida or Abidah = “my father knows”. 1) fourth son of Midian and grandson of Abraham by his wife Keturah (after Sarah died). TABDELIMITED Part of Speech: noun proper masculine. TABDELIMITED A Related Word by BDB/Strong’s Number: from H1 and H3045. TABDELIMITED Total KJV Occurrences: 2 TABDELIMITED abidah TABDELIMITED , 1 TABDELIMITED Gen_25:4

÷H3 TABDELIMITED אֲבִיָּהוּ / אֲבִיָּה TABDELIMITED ’ăbı̂yâh / ’ăbı̂yâhû TABDELIMITED BDB Definition: Abia or Abiah or Abijah = “Jehovah is (my) father”. 1) king of Judah, son and successor of Rehoboam. 2) second son of Samuel. 3) son of Jeroboam the first, king of Israel. 4) son of Becher, a Benjamite. 5) head of a priestly house (one of the 24 Levite groups). 6) head of a priestly house (after the exile). 7) wife of Hezron. 8) mother of Hezekiah (compare H21). TABDELIMITED Part of Speech: noun proper masculine. TABDELIMITED A Related Word by BDB/Strong’s Number: from H1 and H3050. TABDELIMITED Total KJV Occurrences: 25 TABDELIMITED abijah TABDELIMITED , 20 TABDELIMITED 1Ki_14:1, 1Ch_24:10, 2Ch_11:20, 2Ch_11:22, 2Ch_12:16, 2Ch_13:1-4 (4), 2Ch_13:15, 2Ch_13:17, 2Ch_13:19-22 (4), 2Ch_29:1 (2), Neh_10:7, Neh_12:4, Neh_12:17

÷H3 TABDELIMITED אֲבִיָּהוּ / אֲבִיָּה TABDELIMITED ’ăbı̂yâh / ’ăbı̂yâhû TABDELIMITED BDB Definition: Abia or Abiah or Abijah = “Jehovah is (my) father”. 1) king of Judah, son and successor of Rehoboam. 2) second son of Samuel. 3) son of Jeroboam the first, king of Israel. 4) son of Becher, a Benjamite. 5) head of a priestly house (one of the 24 Levite groups). 6) head of a priestly house (after the exile). 7) wife of Hezron. 8) mother of Hezekiah (compare H21). TABDELIMITED Part of Speech: noun proper masculine. TABDELIMITED A Related Word by BDB/Strong’s Number: from H1 and H3050. TABDELIMITED Total KJV Occurrences: 25 TABDELIMITED abiah TABDELIMITED , 4 TABDELIMITED 1Sa_8:2, 1Ch_2:24, 1Ch_6:28, 1Ch_7:8

÷H3 TABDELIMITED אֲבִיָּהוּ / אֲבִיָּה TABDELIMITED ’ăbı̂yâh / ’ăbı̂yâhû TABDELIMITED BDB Definition: Abia or Abiah or Abijah = “Jehovah is (my) father”. 1) king of Judah, son and successor of Rehoboam. 2) second son of Samuel. 3) son of Jeroboam the first, king of Israel. 4) son of Becher, a Benjamite. 5) head of a priestly house (one of the 24 Levite groups). 6) head of a priestly house (after the exile). 7) wife of Hezron. 8) mother of Hezekiah (compare H21). TABDELIMITED Part of Speech: noun proper masculine. TABDELIMITED A Related Word by BDB/Strong’s Number: from H1 and H3050. TABDELIMITED Total KJV Occurrences: 25 TABDELIMITED abia TABDELIMITED , 1 TABDELIMITED 1Ch_3:10

I need it tab-delimited so I can open it in MS Excel, with one row per word for each entry (÷H1, ÷H2, ÷H3, ...), so that the words I emphasized in bold/italic (abidan, abida, abidah, ...) end up in column H of my Excel spreadsheet.

Thanks! Alex


Answer 1:


Huh, this one is much more complex than your previous question... I will use <space> to indicate a space character and not the literal "space" word.

Steps:

  1. Find: (Definition:)[\r\n]{1,2}|[\r\n]{1,2}(?=\d+\))

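    Replace by: $1<space>

    This brings each definition onto a single line. $1 keeps the captured "Definition:" text; it is empty when the second alternative matches, so only the space is inserted there.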

  2. Find:

    (÷[^\r\n]+)[\r\n]{1,2}([^\r\n]+)[\r\n]{1,2}([^\r\n]+)[\r\n]{1,2}([^\r\n]+)[\r\n]{1,2}([^\r\n]+)[\r\n]{1,2}([^\r\n]+)[\r\n]{1,2}([^\r\n]+)[\r\n]{1,2}([^\r\n,]+),([^\r\n]+)[\r\n]{1,2}([^\r\n]+)

    Replace by: $1\t$2\t$3\t$4\t$5\t$6\t$7\t$8\t$9\t$10

  3. Find:

    ((÷(?![^\r\n]+[\r\n]{1,2}÷)(?:[^\t]+\t){7})[^\r\n]+[\r\n]{1,2})([^\r\n,]+),([^\r\n]+)[\r\n]{1,2}([^\r\n]+)

    Replace by: $1$2$3\t$4\t$5

Repeat step 3 (Replace All) until it finds no more matches. Each pass flattens one more word per entry.
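For anyone who would rather script this than click Replace All repeatedly, the three steps above can be sketched in Python's `re` module. This is only a sketch under stated assumptions: the function name `flatten_lexicon` and the `SAMPLE` text are illustrative and not part of the answer; Python writes group references as `\g<1>` rather than `$1`; and step 1 is written so that the captured `Definition:` text is kept and only the line break becomes a space, which is my reading of the intended replacement.

```python
import re

NL = r"[\r\n]{1,2}"   # one line break (LF, CR or CRLF), as written in the answer
LINE = r"([^\r\n]+)"  # one whole non-empty line, captured

# Step 1: join "BDB Definition:" and the numbered senses onto one line.
STEP1 = re.compile(r"(Definition:)" + NL + "|" + NL + r"(?=\d+\))")

# Step 2: header line (group 1), six more lines (groups 2-7),
# then "word, count" (groups 8 and 9) and the reference line (group 10).
STEP2 = re.compile(
    r"(÷[^\r\n]+)" + (NL + LINE) * 6 + NL + r"([^\r\n,]+),([^\r\n]+)" + NL + LINE
)

# Step 3: a flattened row that is NOT followed by a new entry, plus the next
# "word, count" / references pair. Group 2 holds the first seven columns.
STEP3 = re.compile(
    r"((÷(?![^\r\n]+[\r\n]{1,2}÷)(?:[^\t]+\t){7})[^\r\n]+[\r\n]{1,2})"
    r"([^\r\n,]+),([^\r\n]+)[\r\n]{1,2}([^\r\n]+)"
)


def flatten_lexicon(text: str) -> str:
    """Apply steps 1-3, repeating step 3 until nothing changes (hypothetical helper)."""
    # Keep "Definition:" when group 1 matched, otherwise insert only a space.
    text = STEP1.sub(lambda m: (m.group(1) or "") + " ", text)
    # Ten captured fields joined by real tab characters.
    text = STEP2.sub("\t".join(rf"\g<{i}>" for i in range(1, 11)), text)
    while True:
        text, count = STEP3.subn("\\g<1>\\g<2>\\g<3>\t\\g<4>\t\\g<5>", text)
        if count == 0:
            return text


# Two entries copied from the question, separated by a single blank line.
SAMPLE = """\
÷H1
אֲבִידָן
’ăbı̂ydân
BDB Definition:
Abidan = “my father is judge”
1) a prince (ruler) of Benjamin
Part of Speech: noun proper masculine
A Related Word by BDB/Strong’s Number: from H1 and H1777
Total KJV Occurrences: 5
abidan, 5
Num_1:11, Num_2:22, Num_7:60, Num_7:65, Num_10:24

÷H2
אֲבִידָע
’ăbı̂ydâ‛
BDB Definition:
Abida or Abidah = “my father knows”
1) fourth son of Midian and grandson of Abraham by his wife Keturah (after Sarah died)
Part of Speech: noun proper masculine
A Related Word by BDB/Strong’s Number: from H1 and H3045
Total KJV Occurrences: 2
abida, 1
1Ch_1:33
abidah, 1
Gen_25:4
"""

if __name__ == "__main__":
    print(flatten_lexicon(SAMPLE))
```

As in the Notepad++ version, the comma after each word is dropped and the count keeps its leading space, and the definition lines are joined with spaces rather than the ". " shown in the question's target layout.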

EDIT: If you just want ÷H1 followed by the bold words, you could try:

  1. Find: ^(÷[^\r\n]+)[\s\S]+?Total KJV Occurrences: \d+

    Replace by: $1

  2. Find: (÷[^\r\n]+)[\r\n]{1,2}([^,]+),[^\r\n]+[\r\n]{1,2}[^\r\n]+

    Replace by: $1\t$2

  3. Find:

    ((÷[^\s]+)[^\r\n]+[\r\n]{1,2}(?![\r\n]*÷))([^,]+),[^\r\n]+[\r\n]{1,2}[^\r\n]+

    Replace by: $1$2\t$3

Repeat step 3 (Replace All) until it finds no more matches.
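The shorter variant from the EDIT can be sketched the same way. Again this is an illustration, not part of the answer: `words_per_entry` and `WORDS_SAMPLE` are hypothetical names, the sample abbreviates two entries from the question, and `re.MULTILINE` is added because Python's `^` only matches at line starts when that flag is set.

```python
import re

# Step 1: keep the header line, drop everything up to "Total KJV Occurrences: N".
HEAD = re.compile(r"^(÷[^\r\n]+)[\s\S]+?Total KJV Occurrences: \d+", re.MULTILINE)

# Step 2: header + first "word, count" line + its reference line.
FIRST_WORD = re.compile(r"(÷[^\r\n]+)[\r\n]{1,2}([^,]+),[^\r\n]+[\r\n]{1,2}[^\r\n]+")

# Step 3: a finished row not followed by a new entry, then the next word pair.
NEXT_WORD = re.compile(
    r"((÷[^\s]+)[^\r\n]+[\r\n]{1,2}(?![\r\n]*÷))([^,]+),[^\r\n]+[\r\n]{1,2}[^\r\n]+"
)


def words_per_entry(text: str) -> str:
    """Reduce each entry to rows of '<entry id>TAB<word>' (hypothetical helper)."""
    text = HEAD.sub(r"\g<1>", text)
    text = FIRST_WORD.sub("\\g<1>\t\\g<2>", text)
    while True:  # one extra word per entry is handled on each pass
        text, count = NEXT_WORD.subn("\\g<1>\\g<2>\t\\g<3>", text)
        if count == 0:
            return text


# Abbreviated entries based on the question's data (definitions and references shortened).
WORDS_SAMPLE = """\
÷H1
אֲבִידָן
’ăbı̂ydân
BDB Definition:
Abidan = “my father is judge”
1) a prince (ruler) of Benjamin
Part of Speech: noun proper masculine
A Related Word by BDB/Strong’s Number: from H1 and H1777
Total KJV Occurrences: 5
abidan, 5
Num_1:11, Num_2:22

÷H3
אֲבִיָּהוּ / אֲבִיָּה
’ăbı̂yâh / ’ăbı̂yâhû
BDB Definition:
Abia or Abiah or Abijah = “Jehovah is (my) father”
1) king of Judah, son and successor of Rehoboam
2) second son of Samuel
Part of Speech: noun proper masculine
A Related Word by BDB/Strong’s Number: from H1 and H3050
Total KJV Occurrences: 25
abijah, 20
1Ki_14:1, 1Ch_24:10, 2Ch_13:1-4 (4)
abiah, 4
1Sa_8:2, 1Ch_2:24
abia, 1
1Ch_3:10
"""

if __name__ == "__main__":
    print(words_per_entry(WORDS_SAMPLE))
```

On this sample the result is one row for ÷H1 (abidan) and three rows for ÷H3 (abijah, abiah, abia), each with the entry id and the word separated by a tab.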



Source: https://stackoverflow.com/questions/18890244/notepad-regex-reorder-text-a-bit-more-complex
