Titlecasing a string with exceptions

Submitted by 孤者浪人 on 2019-12-27 17:06:31

Question


Is there a standard way in Python to titlecase a string (i.e. each word starts with an uppercase character and all remaining cased characters are lowercase) while leaving small words such as and, in, and of in lowercase?


Answer 1:


There are a few problems with this. If you use split and join, some white space characters will be ignored. The built-in capitalize and title methods do not ignore white space.

>>> 'There     is a way'.title()
'There     Is A Way'

If a sentence starts with an article, you do not want the first word of a title in lowercase.

Keeping these in mind:

import re

def title_except(s, exceptions):
    # Splitting on a single space keeps runs of spaces as empty strings,
    # so the join below restores the original spacing.
    word_list = re.split(' ', s)
    # Always capitalize the first word, even if it is an exception.
    final = [word_list[0].capitalize()]
    for word in word_list[1:]:
        final.append(word if word in exceptions else word.capitalize())
    return " ".join(final)

articles = ['a', 'an', 'of', 'the', 'is']
print(title_except('there is a    way', articles))
# There is a    Way
print(title_except('a whim   of an elephant', articles))
# A Whim   of an Elephant
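The function above only splits on the space character, so tabs and newlines are treated as part of a word. As a rough sketch of one alternative (the name title_except_ws is made up for illustration, it is not from the answer), re.sub can rewrite each run of non-whitespace in place and leave every whitespace character untouched:

```python
import re

def title_except_ws(s, exceptions):
    """Capitalize words not in `exceptions`, leaving all whitespace as-is."""
    first = True  # the first word is capitalized even if it is an exception

    def fix(match):
        nonlocal first
        word = match.group(0)
        if first:
            first = False
            return word.capitalize()
        return word if word in exceptions else word.capitalize()

    # \S+ matches each run of non-whitespace; everything between the
    # matches (spaces, tabs, newlines) is copied through unchanged.
    return re.sub(r'\S+', fix, s)

articles = ['a', 'an', 'of', 'the', 'is']
print(title_except_ws('a whim   of an elephant', articles))
# A Whim   of an Elephant
```

Like the original, this compares words case-sensitively and capitalize() lowercases the rest of each word, so it is only a starting point.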



Answer 2:


Use the titlecase.py module! Works only for English.

>>> from titlecase import titlecase
>>> titlecase('i am a foobar bazbar')
'I Am a Foobar Bazbar'

GitHub: https://github.com/ppannuto/python-titlecase




Answer 3:


There are these methods:

>>> mytext = u'i am a foobar bazbar'
>>> print(mytext.capitalize())
I am a foobar bazbar
>>> print(mytext.title())
I Am A Foobar Bazbar

There's no option to keep articles in lowercase. You'd have to code that yourself, probably by using a list of the articles you want to keep in lowercase.




Answer 4:


Stuart Colville has made a Python port of a Perl script written by John Gruber. It converts strings into title case but avoids capitalizing small words, based on rules from the New York Times Manual of Style, and it also caters for several special cases.

Some of the cleverness of these scripts:

  • they do not capitalize small words like if, in, of, on, etc., and will un-capitalize them if they’re erroneously capitalized in the input.

  • the scripts assume that words with capitalized letters other than the first character are already correctly capitalized. This means they will leave a word like “iTunes” alone, rather than mangling it into “ITunes” or, worse, “Itunes”.

  • they skip over any words with in-line dots; “example.com” and “del.icio.us” will remain lowercase.

  • they have hard-coded hacks specifically to deal with odd cases, like “AT&T” and “Q&A”, both of which contain small words (at and a) which normally should be lowercase.

  • The first and last word of the title are always capitalized, so input such as “Nothing to be afraid of” will be turned into “Nothing to Be Afraid Of”.

  • A small word after a colon will be capitalized.

You can download it here.
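To make the rules above concrete, here is a heavily simplified sketch of a few of them. It is not the actual script: the function name simple_titlecase and the SMALL_WORDS list are assumptions chosen for illustration, runs of whitespace are collapsed, and small words followed by punctuation (such as "of,") are not recognized.

```python
# Hypothetical short list; the real scripts use their own, longer one.
SMALL_WORDS = {'a', 'an', 'and', 'as', 'at', 'but', 'by', 'for', 'if',
               'in', 'of', 'on', 'or', 'the', 'to', 'via', 'vs'}

def simple_titlecase(text):
    words = text.split()  # note: collapses runs of whitespace
    last = len(words) - 1
    result = []
    after_colon = False
    for i, word in enumerate(words):
        # First word, last word, and any word after a colon are capitalized
        # even when they are small words.
        forced = i == 0 or i == last or after_colon
        after_colon = word.endswith(':')
        if '.' in word.strip('.'):
            new = word                      # in-line dot: leave untouched
        elif word.lower() in SMALL_WORDS and not forced:
            new = word.lower()              # small word: force lowercase
        elif any(c.isupper() for c in word[1:]):
            new = word                      # e.g. iTunes, AT&T: trust the input
        else:
            new = word[0].upper() + word[1:]
        result.append(new)
    return ' '.join(result)

print(simple_titlecase('Nothing to be afraid of'))
# Nothing to Be Afraid Of
print(simple_titlecase('notes: a primer on AT&T'))
# Notes: A Primer on AT&T
```

For real use the ported module is the better choice, since it covers many more special cases than this sketch.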




Answer 5:


capitalize (word)

This should do. I get it differently.

>>> mytext = u'i am a foobar bazbar'
>>> mytext.capitalize()
u'I am a foobar bazbar'
>>>

OK, as said in the reply above, you have to write a custom capitalize:

mytext = u'i am a foobar bazbar'

def xcaptilize(word):
    skipList = ['a', 'an', 'the', 'am']
    if word not in skipList:
        return word.capitalize()
    return word

k = mytext.split(" ")
l = map(xcaptilize, k)
print(" ".join(l))

This outputs

I am a Foobar Bazbar



Answer 6:


Python 2.7's title method has a flaw in it.

value.title()

will return Carpenter'S Assistant when value is Carpenter's Assistant

The best solution is probably the one from @BioGeek using titlecase from Stuart Colville, which is the same solution proposed by @Etienne.
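If pulling in a module is not an option, the Python documentation for str.title describes a regular-expression workaround for the apostrophe problem. A minimal sketch along those lines (the function name title_keep_apostrophes is made up here, and the pattern only covers ASCII letters):

```python
import re

def title_keep_apostrophes(s):
    # Treat letters plus an optional apostrophe suffix as a single word,
    # so the letter after the apostrophe is not seen as a new word start.
    return re.sub(r"[A-Za-z]+('[A-Za-z]+)?",
                  lambda m: m.group(0).capitalize(), s)

print("carpenter's assistant".title())
# Carpenter'S Assistant
print(title_keep_apostrophes("carpenter's assistant"))
# Carpenter's Assistant
```

This fixes only the apostrophe behaviour; it still capitalizes every word, including articles.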




Answer 7:


not_these = ['a', 'the', 'of']
thestring = 'the secret of a disappointed programmer'
print(' '.join(word
               if word in not_these
               else word.title()
               for word in thestring.capitalize().split(' ')))
"""Output:
The Secret of a Disappointed Programmer
"""

Because the string is capitalized first, the leading word ("The") no longer matches the lowercase entry in not_these, so the title always starts with a capitalized word.




Answer 8:


One-liner using a list comprehension and the ternary operator

result = " ".join([word.title() if word not in "the a on in of an".split() else word for word in "Wow, a python one liner for titles".split(" ")])
print(result)
# Wow, a Python One Liner For Titles

Breakdown:

for word in "Wow, a python one liner for titles".split(" ") splits the string into a list and starts a for loop (inside the list comprehension)

word.title() if word not in "the a on in of an".split() else word uses the native method title() to title-case the word if it is not in the list of small words. The small-word string is split into a list so that the test is a whole-word match rather than a substring match.

" ".join joins the list elements with a separator of " " (a space)



Source: https://stackoverflow.com/questions/3728655/titlecasing-a-string-with-exceptions
