regex | 易学教程

Parsing Interview Text

Question: I have a text file of a presidential debate. Eventually, I want to parse the text into a dataframe where each row is a statement, with one column for the speaker's name and another column for the statement. For example, "Bob Smith: Hi Steve. How are you doing? Steve Brown: Hi Bob. I'm doing well!" would become:

  name         text
1 Bob Smith    Hi Steve. How are you doing?
2 Steve Brown  Hi Bob. I'm doing well!

Question: How do I split the statements from the names? I tried splitting on the colon: data
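One possible approach, sketched under the assumption that every speaker label is a run of capitalized words immediately followed by a colon (the question's own attempt is cut off, so this is not the asker's code). The names `SPEAKER` and `split_statements` are hypothetical.

```python
import re

# Assumption: a speaker label is one or more capitalized words
# directly followed by a colon, e.g. "Bob Smith:".
SPEAKER = re.compile(r"\b([A-Z][a-z]+(?: [A-Z][a-z]+)*):\s*")


def split_statements(transcript):
    """Return a list of (name, text) pairs, one per statement."""
    # With a capturing group, re.split keeps the speaker names:
    # [prefix, name1, text1, name2, text2, ...]
    parts = SPEAKER.split(transcript)
    names = parts[1::2]
    texts = parts[2::2]
    return [(name, text.strip()) for name, text in zip(names, texts)]


rows = split_statements(
    "Bob Smith: Hi Steve. How are you doing? Steve Brown: Hi Bob. I'm doing well!"
)
for name, text in rows:
    print(name, "|", text)
```

If pandas is available, `pandas.DataFrame(rows, columns=["name", "text"])` turns the pairs into the two-column dataframe described above. Names containing initials, titles, or apostrophes would need a looser pattern.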

python regex to find accented words

Question: Please, I need help. I have a problem finding accented words in a Spanish text. I have to search a large text for the first paragraph that starts with the words 'Nombre vernáculo'. For example, the text looks like: "Nombre vernáculo registrado en la zona de ..." But accented words are not recognized by my Python script. I've tried: re.compile('/(?<!\p{L})(vern[áa]culo*)(?!\p{L})/') re.compile(r'Nombre vern[a\xc3\xa1]culo\.', re.UNICODE) re.compile ('[A-Z][a-záéíóúñ]+') \p{Lu}]
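A minimal sketch of one way to do this in Python 3, where `str` patterns already match Unicode text. It assumes each paragraph sits on its own line and that the accent may arrive either precomposed or as a combining mark, which is why the text is normalized first. Python's built-in `re` module does not support `\p{L}`, and the surrounding `/` characters in the first attempt would be matched literally. The names `PATTERN` and `first_vernacular_paragraph` are hypothetical.

```python
import re
import unicodedata

# Assumption: a "paragraph" is a single line of the input text.
PATTERN = re.compile(r"^Nombre vern[aá]culo\b.*$", re.MULTILINE)


def first_vernacular_paragraph(text):
    """Return the first line starting with 'Nombre vernáculo', or None."""
    # NFC turns "a" + combining acute accent into the single character "á".
    text = unicodedata.normalize("NFC", text)
    match = PATTERN.search(text)
    return match.group(0) if match else None


sample = (
    "Introducción general.\n"
    "Nombre vernáculo registrado en la zona de ...\n"
    "Otro párrafo."
)
print(first_vernacular_paragraph(sample))
```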

Calculate the string length in sed

Question: I was forced to calculate the string length in sed. The string is always a nonempty sequence of a's. sed -n ':c /a/! be; s/^a/1/; s/0a/1/; s/1a/2/; s/2a/3/; s/3a/4/; s/4a/5/; s/5a/6/; s/6a/7/; s/7a/8/; s/8a/9/; s/9a/a0/; /a/ bc; :e p' It's quite long :) So now I wonder whether it is possible to rewrite this script more concisely using the y command or another sed command. I know that it is better to use awk or another tool, but that is not the question here. Note that the sed script basically simulates

RegEx to find credit card numbers with embedded spaces

Question: We currently have a content compliance rule in place whereby we monitor anything that contains a credit card number with no spaces (e.g. 5100080000000000). What we need is a regex that also picks up credit card numbers entered with a space every 4 digits (e.g. 5100 0800 0000 0000). We've been looking at alternative regexes but have not yet found one that works for both scenarios mentioned above. The regex we currently use is below: ^((4\d{3})|(5[1-5]\d{2})|(6011)|(34\d{1})|(37\d{1}))-?\d{4}-?\d{4}
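A hedged sketch of a pattern that accepts both forms, written in Python for illustration. It is not a completion of the truncated regex above: it assumes 16-digit numbers only, reuses the 4xxx, 51xx to 55xx, and 6011 prefixes from that regex, and leaves out the 34/37 prefixes because those cards use a different length and grouping. The names `CARD` and `find_card_numbers` are hypothetical.

```python
import re

# Assumption: 16 digits in four groups of four, each group optionally
# preceded by a single space or hyphen.
CARD = re.compile(r"\b(?:4\d{3}|5[1-5]\d{2}|6011)(?:[ -]?\d{4}){3}\b")


def find_card_numbers(text):
    """Return every substring that looks like a 16-digit card number."""
    return CARD.findall(text)


found = find_card_numbers(
    "plain 5100080000000000, spaced 5100 0800 0000 0000, other 1234 5678 9012 3456"
)
print(found)
```

Because the pattern is unanchored, it scans free text. For validating a single field instead, `CARD.fullmatch(value)` would be the stricter check.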

Python Regex for Words & single space

Question: I am using re.sub to forcibly convert a "bad" string into a "valid" string via regex. I am struggling to create the right regex that will parse a string and "remove the bad parts". Specifically, I would like to force a string to be all alphabetical, allowing a single space between words. Any characters that violate this rule should be replaced with ''. This includes multiple spaces. Any help would be appreciated! import re list_of_strings = ["3He2l2lo Wo45rld!",
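One way to get this behavior is two passes rather than a single regex: first drop everything that is not an ASCII letter or a space, then collapse runs of spaces. This is a sketch under the assumption that "alphabetical" means A to Z in either case. Only the first list item comes from the question; the second is a made-up test input, and the name `clean` is hypothetical.

```python
import re


def clean(s):
    """Keep ASCII letters only, with single spaces between words."""
    # Pass 1: remove every character that is neither a letter nor a space.
    letters_and_spaces = re.sub(r"[^A-Za-z ]+", "", s)
    # Pass 2: collapse runs of spaces and trim both ends.
    return re.sub(r" {2,}", " ", letters_and_spaces).strip()


list_of_strings = ["3He2l2lo Wo45rld!", "  Hi   there 42 "]  # second item is hypothetical
cleaned = [clean(s) for s in list_of_strings]
print(cleaned)
```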

Aspose.Words v21.2, a Word document development and processing tool, released (with new feature demos)

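Aspose.Words for .NET is an advanced Word document processing API for performing a wide range of document management and manipulation tasks. The API supports generating, modifying, converting, rendering, and printing documents without directly using Microsoft Word in cross-platform applications. The February 2021 update is here: Aspose.Words for .NET has been updated to v21.2. The main features are as follows. An API was implemented to manipulate the theme properties of the Font object. An option was added to update the CreatedTime property on save. SaveOptions was extended with the new CustomTimeZoneInfo option. The FindReplaceOptions class was extended with the new SmartParagraphBreakReplacement option. Loading documents from an IStream object in COM applications is now supported.

You can download Aspose.Words for .NET v21.2 to try it out.

Update details (key, summary, category):

WORDSNET-21363  Support dynamically adding combo box and drop-down list items for the LINQ Reporting Engine  New feature
WORDSNET-6146  Allow extracting visible plain text from OLE objects  New feature
WORDSNET11848  Add save options to mimic, or not mimic, MS Word behavior for the creation, modification, and print dates  New feature
WORDSNET-6125  Add an option to export images in the document as SVG when saving to HTML  New feature
WORDSNET-10148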

How to query text-nodes from DOM, find markdown-patterns, replace matches with HTML-markup and replace the original text-node with the new content?

Question: Markdown-like functionality for tooltips. Problem: Using vanilla JavaScript, I want to change this: <div> <p> Hello [world]{big round planet we live on}, how is it [going]{verb that means walking}? </p> <p> It is [fine]{a word that expresses gratitude}. </p> </div> To this: <div> <p> Hello <mark data-toggle="tooltip" data-placement="top" title="big round planet we live on">world</mark>, how is it <mark data-toggle="tooltip" data-placement="top" title="verb that means walking">going</mark>? </p> <p>
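The question asks for vanilla JavaScript and DOM text-node handling, neither of which is shown here. The sketch below covers only the pattern-matching step, in Python, to illustrate how a `[word]{tooltip}` pair could be captured and rewritten. The names `TOOLTIP`, `_to_mark`, and `render_tooltips` are hypothetical, and the sketch assumes that neither part contains its own closing bracket or brace.

```python
import html
import re

# Assumption: "[word]{tooltip}" with no nested brackets or braces.
TOOLTIP = re.compile(r"\[([^\]]+)\]\{([^}]+)\}")


def _to_mark(match):
    """Build the <mark> element for one [word]{tooltip} match."""
    word, tip = match.group(1), match.group(2)
    return (
        '<mark data-toggle="tooltip" data-placement="top" '
        f'title="{html.escape(tip, quote=True)}">{html.escape(word)}</mark>'
    )


def render_tooltips(text):
    """Replace every [word]{tooltip} pair in a plain-text string."""
    return TOOLTIP.sub(_to_mark, text)


rendered = render_tooltips("Hello [world]{big round planet we live on}, ok")
print(rendered)
```

The same regular expression should carry over to JavaScript, where the replacement would be applied to each text node's content rather than to a whole string.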

Parse measurements (multiple dimensions) from a given string in Python 3

Question: I'm aware of this post and this library, but they didn't help me with the specific cases below. How can I parse measurements from strings like these? "Square 10 x 3 x 5 mm" "Round 23/22; 24,9 x 12,2 x 12,3" "Square 10x2" "Straight 10x2mm" I'm looking for a Python package or some other way to get results like the following: >>> a = amazing_parser.parse("Square 10 x 3 x 5 mm") >>> print(a) 10 x 3 x 5 mm Likewise: >>> a = amazing_parser.parse("Round 23/22; 24,9x12,2") >>> print(a) 24,9 x 12,2 I
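A regex-based sketch that handles the four sample strings above. It is not the `amazing_parser` package from the question, which is a placeholder name. It assumes dimensions are separated by "x", decimals use a comma or a period, and the optional unit is one of mm, cm, or m. The names `NUM`, `MEASURE`, and `parse_measurement` are hypothetical.

```python
import re

# Assumption: numbers may use "," or "." as the decimal separator.
NUM = r"\d+(?:[.,]\d+)?"
# Two or more numbers joined by "x", then an optional unit (assumed list).
MEASURE = re.compile(rf"({NUM}(?:\s*x\s*{NUM})+)\s*(mm|cm|m)?\b", re.IGNORECASE)


def parse_measurement(s):
    """Return the first 'A x B [x C ...] [unit]' found in s, or None."""
    match = MEASURE.search(s)
    if match is None:
        return None
    # Normalize spacing around each "x".
    dims = re.split(r"\s*x\s*", match.group(1), flags=re.IGNORECASE)
    result = " x ".join(dims)
    if match.group(2):
        result += " " + match.group(2)
    return result


for text in ["Square 10 x 3 x 5 mm", "Round 23/22; 24,9x12,2", "Straight 10x2mm"]:
    print(parse_measurement(text))
```

A lone ratio such as "23/22" is skipped because the pattern requires at least one "x" between numbers.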