regex

Removing markup links in text

怎甘沉沦 submitted on 2021-02-19 05:34:41
Question: I'm cleaning some text from Reddit. When you include a link in a Reddit self-text, you do so like this: [the text you read](https://website.com/to/go/to). I'd like to use a regex to remove the hyperlink (e.g. https://website.com/to/go/to) but keep the text you read. Here is another example: [the podcast list](https://www.reddit.com/r/datascience/wiki/podcasts). I'd like to keep: the podcast list. How can I do this with Python's re library? What is the appropriate regex? Answer 1: I have created
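The answer is cut off in this listing, so the following is only a minimal sketch of one common approach, not the original answerer's solution. It assumes that neither the link text nor the URL contains nested brackets or parentheses; the names MD_LINK and strip_markdown_links are made up for illustration.

```python
import re

# Matches [link text](url); group 1 captures the link text.
# Assumption: no nested brackets in the text and no parentheses in the URL.
MD_LINK = re.compile(r"\[([^\]]+)\]\([^)]*\)")

def strip_markdown_links(text: str) -> str:
    """Replace every [text](url) with just the text."""
    return MD_LINK.sub(r"\1", text)

sample = "See [the podcast list](https://www.reddit.com/r/datascience/wiki/podcasts) here."
print(strip_markdown_links(sample))
```

The backreference \1 in the replacement keeps only what the first group captured, which is why the URL disappears while the visible text stays.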

Non-capturing optional URL elements in Django

删除回忆录丶 submitted on 2021-02-19 05:34:37
Question: I'm using Django and would like to match the URLs domain.com/w and domain.com/words. I have a configuration line of the form: url(r'^w(ords)?$', 'app_name.views.view_words'). view_words takes only one parameter (request), but it seems that Django captures the (ords) part of the regular expression and passes it to the view. When I remove (ords) from the regex and access domain.com/w, it works properly. The Django documentation and similar StackOverflow questions cover how to capture
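The snippet ends before any answer, so as a hedged illustration only: a non-capturing group, written (?:...), makes the optional part match without producing a captured value. The sketch below uses only Python's re module (no Django) to show the difference between the two patterns; the variable names are illustrative.

```python
import re

# Capturing group: the optional "ords" becomes group 1.
capturing = re.compile(r"^w(ords)?$")
# Non-capturing group: same matching behaviour, but no group is recorded.
non_capturing = re.compile(r"^w(?:ords)?$")

for path in ("w", "words"):
    print(path, capturing.match(path).groups(), non_capturing.match(path).groups())
```

With the non-capturing form, both paths match and groups() is empty, so a URL resolver that forwards captured groups would have nothing extra to pass to the view.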

Strip tag from text (in React JS)

你说的曾经没有我的故事 submitted on 2021-02-19 05:33:54
Question: I have the whole HTML code in the variable cleanHTML and I need to strip specific tags from the text. let cleanHTML = document.documentElement.outerHTML; I want to go from this: <span class="remove-me">please</span> <span class="remove-me">me too</span> <span class="remove-me">and me</span> to this: please me too and me. I'm trying to do it with: var list = cleanHTML.getElementsByClassName("remove-me"); var i; for (i = 0; i < list.length; i++) { list[i] = list[i].innerHTML; } But I'm getting an error from React
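Note that outerHTML is a string, so it has no getElementsByClassName method; the tags have to be removed either on the DOM itself or by string processing. The question is about JavaScript, but since this listing is filed under regex, here is a rough Python sketch of the string-processing route. It assumes the spans are not nested and carry exactly the attribute class="remove-me"; a real HTML parser is more robust than a regex for anything less regular. The names SPAN and unwrap_remove_me are hypothetical.

```python
import re

# Lazily match the content of each <span class="remove-me">...</span>.
# Assumption: spans are not nested and the attribute is written exactly like this.
SPAN = re.compile(r'<span class="remove-me">(.*?)</span>', re.DOTALL)

def unwrap_remove_me(html: str) -> str:
    """Drop the wrapping span tags but keep their inner content."""
    return SPAN.sub(r"\1", html)

html = ('<span class="remove-me">please</span> '
        '<span class="remove-me">me too</span> '
        '<span class="remove-me">and me</span>')
print(unwrap_remove_me(html))
```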

Regex for matching all words before a specific character

亡梦爱人 submitted on 2021-02-19 04:57:06
Question: I need to extract all the words in a string before a specific character, in this example a colon (:). For example: String temp = "root/naming-will-look-like-this:1.0.0-SNAP"; From the string above I would like to return: "root" "naming" "will" "look" "like" "this". I'm not great at regular expressions, and I've come up with this so far: \w+(?=:) which only returns the one word directly preceding the colon ("this"). How can I retrieve all the words before it? Thanks in advance. Answer 1: You can use a
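The answer is truncated here, and the question is about Java, so the following Python sketch only illustrates two possible ways to get the desired result; it is not the original answer. The variable names are illustrative.

```python
import re

temp = "root/naming-will-look-like-this:1.0.0-SNAP"

# Option 1: cut the string at the first colon, then collect every word.
before_colon = temp.split(":", 1)[0]
words = re.findall(r"\w+", before_colon)

# Option 2: a single regex. The lookahead requires that a colon still
# appears somewhere after the word, with only non-colon characters between.
words_lookahead = re.findall(r"\w+(?=[^:]*:)", temp)

print(words)
print(words_lookahead)
```

The two options differ when the input has several colons: option 1 stops at the first colon, while option 2 keeps every word that has any colon after it.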

How to replace spaces with %20 in <img> tags

可紊 submitted on 2021-02-19 04:40:06
Question: I would like to replace all spaces in the image tags of an HTML text. Example: <img src="photo 1.jpg" /> to <img src="photo%201.jpg"/>. I didn't find a solution with preg_replace, but it may be a simple regexp line. Thanks! Edit: Sorry guys, my description was not very clear. I have a full HTML page and I only want to replace inside the img tags. I can't use urlencode here as it encodes the other stuff as well. Answer 1: The space is represented by %20 in the URL, but there are other chars
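The question asks about PHP's preg_replace; the Python sketch below only illustrates the general idea of a regex substitution with a callback, restricted to the src attribute of img tags. It assumes double-quoted src values and handles spaces only, not the other characters the answer alludes to. SRC_ATTR and encode_img_spaces are hypothetical names.

```python
import re

# Group 1: everything from "<img" up to and including src="
# Group 2: the attribute value
# Group 3: the closing quote
# Assumption: src values are double-quoted.
SRC_ATTR = re.compile(r'(<img\b[^>]*?\bsrc=")([^"]*)(")', re.IGNORECASE)

def encode_img_spaces(html: str) -> str:
    """Replace spaces with %20 inside img src values only."""
    def fix(match: re.Match) -> str:
        return match.group(1) + match.group(2).replace(" ", "%20") + match.group(3)
    return SRC_ATTR.sub(fix, html)

print(encode_img_spaces('<p>a b</p><img src="photo 1.jpg" />'))
```

Text outside the img tags, such as the space in the paragraph above, is left untouched because the pattern never matches there.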

Hands-on | Syncing MySQL Binlog to HDFS with Canal

[亡魂溺海] submitted on 2021-02-19 04:02:42
An earlier introductory article, "Approaches to Syncing MySQL Binlog to HDFS", briefly covered several ways to sync MySQL to HDFS in real time. This article records a concrete approach that uses canal to sync MySQL to HDFS. Source: http://bigdatadecode.club/MysqlToHDFSWithCanal.html

canal server deployment: In canal, each MySQL instance has its own configuration file. The file is placed in a folder under the conf directory, and the name of that folder identifies the MySQL instance. The structure looks like this:

-rwxr-xr-x 1 dc user 2645 Jul 18 14:25 canal.properties
-rwxr-xr-x 1 dc user 2521 Jul 17 18:31 canal.properties.bak
-rwxr-xr-x 1 dc user 3045 Jul 17 18:31 logback.xml
drwxr-xr-x 2 dc user 4096 Jul 17 18:38 spring
drwxr-xr-x 2 dc user 4096 Jul 19 11:55 trans1

trans1 represents one MySQL instance; this folder contains an instance.properties file
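To make the layout concrete, here is a small sketch that lists the instance names found under a conf directory by looking for subfolders that contain an instance.properties file. This is an illustration of the directory convention described above, not part of canal itself; the function name list_canal_instances and the path passed to it are assumptions.

```python
from pathlib import Path

def list_canal_instances(conf_dir: str) -> list[str]:
    """Return the names of subfolders that hold an instance.properties file.

    Assumption: a folder counts as an instance only if that file is present.
    """
    conf = Path(conf_dir)
    return sorted(
        entry.name
        for entry in conf.iterdir()
        if entry.is_dir() and (entry / "instance.properties").is_file()
    )

# Hypothetical usage; adjust the path to the actual canal installation:
# print(list_canal_instances("/path/to/canal/conf"))
```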

Split string by all spaces except those in parentheses

对着背影说爱祢 submitted on 2021-02-19 03:59:20
Question: I'm trying to split text like the following on spaces: var line = "Text (what is)|what's a story|fable called|named|about {Search}|{Title}" but I want it to ignore the spaces within parentheses. This should produce an array with: var words = ["Text", "(what is)|what's", "a", "story|fable", "called|named|about", "{Search}|{Title}"]; I know this should involve some sort of regex with line.match(). Bonus points if the regex removes the parentheses. I know that word.replace() would get rid of them
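The question targets JavaScript's line.match(); the Python sketch below shows one way the same idea could be expressed, matching tokens instead of splitting. It assumes parentheses are balanced and not nested. The names TOKEN, words and stripped are illustrative.

```python
import re

line = "Text (what is)|what's a story|fable called|named|about {Search}|{Title}"

# A token is a run of either a whole parenthesised group (which may contain
# spaces) or a single non-space character.
# Assumption: parentheses are balanced and not nested.
TOKEN = re.compile(r"(?:\([^)]*\)|\S)+")

words = TOKEN.findall(line)
print(words)

# Bonus: drop the parentheses afterwards with plain string replacement.
stripped = [w.replace("(", "").replace(")", "") for w in words]
print(stripped)
```

Listing the parenthesised alternative first matters: at an opening parenthesis the regex consumes the whole group before it can fall back to single non-space characters.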

Anyone know a good regex to remove extra whitespace? [duplicate]

我是研究僧i submitted on 2021-02-19 03:47:07
Question: Possible duplicate: Substitute multiple whitespace with single whitespace in Python. I'm trying to figure out how to write a regex that, given the string "hi     this       is   a  test", turns it into "hi this is a test", where the whitespace is normalized to just one space. Any ideas? Thanks so much. Answer 1: import re; re.sub(r"\s+", " ", string) Answer 2: Does it need to be a regex? I'd just use new_string = " ".join(re.split(r'\s+', old_string.strip()))
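A runnable sketch of both ideas from the answers, with one small variation: the second function uses str.split() with no arguments instead of re.split, since it already splits on any run of whitespace and drops leading and trailing whitespace. The function names are illustrative.

```python
import re

def normalize_whitespace(text: str) -> str:
    """Collapse every run of whitespace into one space, then trim the ends."""
    return re.sub(r"\s+", " ", text).strip()

def normalize_whitespace_split(text: str) -> str:
    """Same result without a regex: split on whitespace runs and rejoin."""
    return " ".join(text.split())

sample = "hi     this       is   a  test"
print(normalize_whitespace(sample))
print(normalize_whitespace_split(sample))
```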