Question
I am trying to use regular expressions in Sublime Text 3 to remove all the content between two strings in an XML file.
Suppose this is my content:
<Body name="ground">
<mass>0</mass>
<mass_center> 0 0 0</mass_center>
<inertia_xx>0</inertia_xx>
<inertia_yy>0</inertia_yy>
<inertia_zz>0</inertia_zz>
<inertia_xy>0</inertia_xy>
<inertia_xz>0</inertia_xz>
<inertia_yz>0</inertia_yz>
<!--Joint that connects this body with the parent body.-->
<Joint />
<VisibleObject>
<!--Set of geometry files and associated attributes, allow .vtp, .stl, .obj-->
<GeometrySet>
<objects />
<groups />
</GeometrySet>
<!--Three scale factors for display purposes: scaleX scaleY scaleZ-->
<scale_factors> 1 1 1</scale_factors>
<!--transform relative to owner specified as 3 rotations (rad) followed by 3 translations rX rY rZ tx ty tz-->
<transform> -0 0 -0 0 0 0</transform>
<!--Whether to show a coordinate frame-->
<show_axes>false</show_axes>
<!--Display Pref. 0:Hide 1:Wire 3:Flat 4:Shaded Can be overriden for individual geometries-->
<display_preference>4</display_preference>
</VisibleObject>
<WrapObjectSet>
<objects />
<groups />
</WrapObjectSet>
</Body>
Now suppose I want to remove all the content between <VisibleObject> and </VisibleObject>, leaving only:
<Body name="ground">
<mass>0</mass>
<mass_center> 0 0 0</mass_center>
<inertia_xx>0</inertia_xx>
<inertia_yy>0</inertia_yy>
<inertia_zz>0</inertia_zz>
<inertia_xy>0</inertia_xy>
<inertia_xz>0</inertia_xz>
<inertia_yz>0</inertia_yz>
<!--Joint that connects this body with the parent body.-->
<Joint />
<VisibleObject>
</VisibleObject>
<WrapObjectSet>
<objects />
<groups />
</WrapObjectSet>
</Body>
There are a few similar threads and problems, but none of the solutions in them seem to work particularly well (or at all) for this problem.
Any help would be most appreciated.
Answer 1:
[Image: screenshot of the Sublime Text window]
Open the panel via Find, then Replace, and make sure the leftmost option in the panel (the regular expression toggle) is enabled.
Answer 2:
Sublime appears to use PCRE, according to this page.
That means you should be able to use the tricks PCRE offers, chiefly negative look-ahead, which can speed up matching considerably.
The regex I recommend is:
<VisibleObject>(?:[^<]*(?!</VisibleObject).)+</VisibleObject>
Essentially, the negative look-ahead ensures that whenever a < is present (namely at the start of a tag), it is not the start of the closing </VisibleObject>.
The . is needed so that the engine can backtrack one character when the negative look-ahead sees the closing tag.
You will need to use <VisibleObject></VisibleObject> as the replacement.
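To check the pattern outside the editor, here is a minimal sketch using Python's re module rather than Sublime's own engine. The pattern and replacement are copied from this answer; the helper name strip_visible_object, the shortened sample, and the re.DOTALL flag are assumptions of the sketch. In Python, . does not match a newline by default, and the backtracking step described above needs . to consume the newline that sits just before </VisibleObject>, hence the flag. If the pattern finds nothing in Sublime when the closing tag is on its own line, prefixing it with (?s) may be worth trying for the same reason.

```python
import re

# Pattern copied from the answer. re.DOTALL is an assumption of this sketch:
# it lets "." consume the newline that precedes the closing tag.
PATTERN = re.compile(
    r"<VisibleObject>(?:[^<]*(?!</VisibleObject).)+</VisibleObject>",
    re.DOTALL,
)

# Replacement suggested in the answer.
REPLACEMENT = "<VisibleObject></VisibleObject>"


def strip_visible_object(xml_text: str) -> str:
    """Empty every <VisibleObject> element (hypothetical helper name)."""
    return PATTERN.sub(REPLACEMENT, xml_text)


# Shortened version of the question's XML, for illustration only.
SAMPLE = """<Body name="ground">
<Joint />
<VisibleObject>
<scale_factors> 1 1 1</scale_factors>
<show_axes>false</show_axes>
</VisibleObject>
<WrapObjectSet>
<objects />
</WrapObjectSet>
</Body>"""

if __name__ == "__main__":
    print(strip_visible_object(SAMPLE))
```

Note that this replacement puts the opening and closing tags on one line. To get them on separate lines, as in the desired output in the question, add a line break between the two tags in the replacement.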
Source: https://stackoverflow.com/questions/38960282/remove-all-content-between-two-strings-using-regular-expressions