punctuation

Splitting String in regex with - as one word

余生颓废 submitted on 2021-01-28 14:37:03
Question: I am trying to split a sentence into regex groups of 32 characters each. The sentence is split after the complete word if the 32nd character is a letter inside a word. When my input is a sentence that contains "-", the hyphenated word gets split too. This is the regex I am using: (\b.{1,32}\b\W?)
Input string: Half Bone-in Spiral int with dark Packd Smithfield Half Bone-in Spiral Ham with Glaze Pack
Resulting groups:
    Half Bone-in Spiral int with
    dark Packd Smithfield Half Bone-
    in Spiral Ham with Glaze Pack
In above
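One way to keep hyphenated words such as Bone-in whole is to break only at whitespace rather than at \b, because \b also matches on either side of a hyphen. The sketch below is in Python and is not from the original thread; the function name split_chunks and the lookaround-based pattern are assumptions made for illustration, and it treats 32 as an upper limit per chunk.

```python
import re

def split_chunks(text, width=32):
    """Split text into chunks of at most `width` characters, breaking only at whitespace."""
    # (?<!\S): the chunk starts at the beginning of the text or right after whitespace.
    # (?!\S):  the chunk ends at the end of the text or right before whitespace.
    # A hyphen is not whitespace, so "Bone-in" is never cut in two.
    # Caveat: a single word longer than `width` fits in no chunk and is skipped.
    pattern = r"(?<!\S).{1,%d}(?!\S)" % width
    chunks = (chunk.strip() for chunk in re.findall(pattern, text))
    return [chunk for chunk in chunks if chunk]

text = ("Half Bone-in Spiral int with dark Packd Smithfield "
        "Half Bone-in Spiral Ham with Glaze Pack")
for chunk in split_chunks(text):
    print(chunk)
```

If a regex is not a hard requirement, the standard library's textwrap.wrap with break_on_hyphens=False performs a similar whitespace-only wrap.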

Python removing punctuation from unicode string except apostrophe

末鹿安然 submitted on 2020-05-24 21:54:30
Question: I found several topics on this, and I found this solution:
    sentence=re.sub(ur"[^\P{P}'|-]+",'',sentence)
This should remove all punctuation except ', but the problem is that it also strips everything else from the sentence. Example:
    >>> sentence="warhol's art used many types of media, including hand drawing, painting, printmaking, photography, silk screening, sculpture, film, and music."
    >>> sentence=re.sub(ur"[^\P{P}']+",'',sentence)
    >>> print sentence
    '
Of course, what I want is to keep the sentence
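A likely cause is that the standard re module does not support Unicode property classes such as \P{P}; that syntax belongs to the third-party regex package. The following Python 3 sketch, which is not from the original thread, sidesteps regular expressions and filters on unicodedata categories instead. The function name strip_punctuation and its keep parameter are illustrative assumptions.

```python
import unicodedata

def strip_punctuation(text, keep="'"):
    """Remove Unicode punctuation from text, except characters listed in `keep`."""
    # Every Unicode punctuation category starts with "P" (Pc, Pd, Ps, Pe, Pi, Pf, Po).
    return "".join(
        ch for ch in text
        if ch in keep or not unicodedata.category(ch).startswith("P")
    )

sentence = ("warhol's art used many types of media, including hand drawing, "
            "painting, printmaking, photography, silk screening, sculpture, "
            "film, and music.")
print(strip_punctuation(sentence))
```

Characters in the symbol categories (for example $ or +) are not punctuation in Unicode terms, so this sketch leaves them in place.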

In Erlang, when do I use ; or , or .?

折月煮酒 submitted on 2020-04-29 05:26:11
Question: I've been trying to learn Erlang and have been running into some problems with ending lines in functions and case statements. Namely, when do I use a semicolon, comma, or period inside my functions or case statements? I've gotten stuff to work, but I don't really understand why and was looking for a little more information.
Answer 1: Use a comma at the end of a line of normal code. Use a semicolon at the end of a case statement, if statement, etc. The last case or if statement doesn't have anything at the

Are there character collections for all international full stop punctuations?

青春壹個敷衍的年華 submitted on 2020-01-22 19:41:29
Question: I am trying to parse UTF-8 strings into "bite-sized" segments. For example, I would like to break a text down into "sentences". Is there a comprehensive collection of characters (or a regex) that corresponds to sentence endings in all languages? I'm looking for something that would capture the Latin period, exclamation mark, and question mark, the Chinese and Japanese full stop, etc. Something similar for the equivalent of a comma would be great too.
Answer 1: I haven’t encountered any
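To make the question concrete, here is a minimal Python sketch of splitting on a hand-picked set of sentence-final marks. It is not from the original thread, and the set is deliberately small and not comprehensive; the names SENTENCE_END and split_sentences are assumptions for illustration. Unicode does define a Sentence_Terminal character property, but Python's standard library does not expose it, so the characters are listed by hand.

```python
import re

# Illustrative, NOT comprehensive: a few common sentence-final marks.
SENTENCE_END = (
    ".!?"      # Latin full stop, exclamation mark, question mark
    "\u3002"   # ideographic full stop (Chinese, Japanese)
    "\uFF01"   # fullwidth exclamation mark
    "\uFF1F"   # fullwidth question mark
    "\u0964"   # Devanagari danda
    "\u06D4"   # Arabic full stop
)

_ESCAPED = re.escape(SENTENCE_END)
# A run of non-terminators, followed by any terminators that close it.
_SENTENCE_RE = re.compile("[^" + _ESCAPED + "]+[" + _ESCAPED + "]*")

def split_sentences(text):
    """Split text after sentence-final marks; abbreviations like "St." are not handled."""
    parts = (m.group().strip() for m in _SENTENCE_RE.finditer(text))
    return [p for p in parts if p]

for sentence in split_sentences("Hello world. How are you? 你好。今天好吗？"):
    print(sentence)
```

A character set alone cannot tell a sentence-ending period from one inside an abbreviation or a number, which is why real segmenters add rules on top of such a list.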

Strip punctuation in an address field in PHP

半世苍凉 submitted on 2020-01-16 03:17:24
Question: Hey all. I'm having some trouble stripping punctuation out of an address field. Basically I want to take things like:
    1234 Apple St. N.
and turn them into:
    1234 Apple St N
A period is really the only piece of punctuation I can envision, but I suppose I'd really want to strip EVERYTHING out. Can somebody help me here? Nothing I do works... argh!
Answer 1: You can use preg_replace to get the desired result. FYI, \w is shorthand for [a-zA-Z0-9_].
    $newAddress = preg_replace('/[^\w
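
The general idea of deleting every character that is neither a word character nor whitespace can be sketched as follows. This is a Python illustration, not the PHP from the answer, whose pattern is cut off above; the function name strip_address_punctuation and the exact character class are assumptions.

```python
import re

def strip_address_punctuation(address):
    """Delete every character that is neither a word character nor whitespace."""
    # re.ASCII makes \w mean [a-zA-Z0-9_], the shorthand described in the answer.
    # Drop the flag to keep accented letters in addresses.
    return re.sub(r"[^\w\s]", "", address, flags=re.ASCII)

print(strip_address_punctuation("1234 Apple St. N."))  # 1234 Apple St N
```

Because \w includes the underscore, underscores survive this substitution; hyphens and # signs do not, which may or may not be wanted for unit numbers.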