How to parse a LaTeX file

Submitted by 十年热恋 on 2019-12-22 10:30:27

Question


I just finished writing a summary for calculus in LaTeX.

The main problem now is that the files contain many things I don't really need right now.

The .tex files contain many definitions and theorems that I need to learn by heart.

Each definition has its own environment in the .tex file, so every definition starts with:

\begin{definition}

and ends with

\end{definition}

The same goes for theorems.

I need to write something that extracts whatever is inside the \begin{}...\end{}.

For example, a list called A built from this file:

\begin{document}

\begin{center}
\begin{definition} Hello WOrld! \end{definition}
\begin{example}A+B \end{example}
\begin{theorem} Tre Capre \end{theorem}
\begin{definition} Hello WOrld2! \end{definition}
\end{center}
\end{document}

should contain: [[\begin{definition} Hello WOrld! \end{definition}],[\begin{theorem} Tre Capre \end{theorem}],[\begin{definition} Hello WOrld2! \end{definition}]]

Looking on this site, I found that I can use regular expressions:

for i in range(5):
    x = i+1
    raw = open('tex/chapter' + str(x) + '.tex')
    A = []
    for line in raw:
        A.append(re.match(r'(\begin{definition})://.*\.(\end{definition})$', line))
print(A)

but the output is just None, and I don't really know why.

Edit:

import re


for i in range(5):
    x = i+1
    raw = open('tex/chapter' + str(x) + '.tex')
    A = re.findall(r'\\begin{definition}(.*?)\\end{definition}', raw.read())
    print(A)

The output is the following:

[]
[]
[]
[]
[]

Answer 1:


From what I understand of the question, you just want the definitions from the LaTeX file. You can use findall to get them directly:

A = re.findall(r'{definition}(.*?)\\end{definition}', raw.read())

Note the use of the non-greedy .*? so that each match stops at the first \end{definition} instead of running on to the last one.
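
One possible reason for the empty lists in the edited question is that the definitions span several lines: by default "." does not match a newline, so .*? cannot cross a line break. The following is only a sketch under that assumption (the real chapter files are not shown). It adds re.DOTALL, escapes the braces, and uses a backreference so that definitions and theorems are both collected, as the question asks. The names ENV_PATTERN and extract_environments are made up for this illustration, and nested environments of the same name are not handled.

```python
import re

# Match \begin{definition}...\end{definition} or \begin{theorem}...\end{theorem}.
# \1 forces the closing name to equal the opening one.
# re.DOTALL lets "." match newlines, so multi-line blocks are found too.
ENV_PATTERN = re.compile(
    r'\\begin\{(definition|theorem)\}(.*?)\\end\{\1\}',
    re.DOTALL,
)

def extract_environments(tex_source):
    """Return the full text of each definition/theorem block, in document order."""
    return [m.group(0) for m in ENV_PATTERN.finditer(tex_source)]

# Sample input based on the question; the theorem is split over two lines
# on purpose to show the effect of re.DOTALL.
sample = r"""\begin{document}
\begin{center}
\begin{definition} Hello WOrld! \end{definition}
\begin{example}A+B \end{example}
\begin{theorem} Tre
Capre \end{theorem}
\begin{definition} Hello WOrld2! \end{definition}
\end{center}
\end{document}"""

blocks = extract_environments(sample)
for block in blocks:
    print(block)
```

To run this on the real files, the same function could be called on raw.read() inside the loop from the question. Using m.group(2) instead of m.group(0) would return only the text between the \begin and \end markers.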



Source: https://stackoverflow.com/questions/30752351/how-to-parse-latex-file
