Is there a library in Python that can convert words (mainly names) to Arpabet phonetic transcription?
BARBELS -> B AA1 R B AH0 L Z
BARBEQUE -> B AA1 R B IH0 K Y UW2
Using nltk with the cmudict corpus installed:
import nltk

# Keys are lowercase words; each value is a list of pronunciations.
arpabet = nltk.corpus.cmudict.dict()
for word in ('barbels', 'barbeque', 'barbequed', 'barbequeing', 'barbeques'):
    print(arpabet[word])
yields
[['B', 'AA1', 'R', 'B', 'AH0', 'L', 'Z']]
[['B', 'AA1', 'R', 'B', 'IH0', 'K', 'Y', 'UW2']]
[['B', 'AA1', 'R', 'B', 'IH0', 'K', 'Y', 'UW2', 'D']]
[['B', 'AA1', 'R', 'B', 'IH0', 'K', 'Y', 'UW2', 'IH0', 'NG']]
[['B', 'AA1', 'R', 'B', 'IH0', 'K', 'Y', 'UW2', 'Z']]
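Since the question is mostly about names, note that many names will not be in the dictionary, and indexing with a missing word raises a KeyError. The keys are also lowercase. Below is a minimal sketch of a lookup wrapper that handles both cases; `to_arpabet` is a hypothetical helper name, and `sample` is a tiny hand-built stand-in for the real `nltk.corpus.cmudict.dict()` so the snippet runs without the corpus installed.

```python
def to_arpabet(word, pron_dict):
    """Return the first listed pronunciation of `word` as a
    space-separated Arpabet string, or None if the word is absent.

    `pron_dict` is assumed to have the same shape as
    nltk.corpus.cmudict.dict(): lowercase word -> list of phoneme lists.
    """
    prons = pron_dict.get(word.lower())
    if not prons:
        return None
    return " ".join(prons[0])


# Stand-in for the real dictionary (hypothetical, two entries only).
sample = {
    "barbels": [["B", "AA1", "R", "B", "AH0", "L", "Z"]],
    "barbeque": [["B", "AA1", "R", "B", "IH0", "K", "Y", "UW2"]],
}

print(to_arpabet("BARBELS", sample))   # B AA1 R B AH0 L Z
print(to_arpabet("xyzzy", sample))     # None
```

With the corpus installed, passing `nltk.corpus.cmudict.dict()` in place of `sample` should work the same way. Words with several pronunciations return only the first one here; that is a design choice for the sketch, not something the corpus requires.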
To install the cmudict corpus, type the following in the Python interpreter:
>>> import nltk
>>> nltk.download()
Then use the GUI that opens to install
corpora > cmudict
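If a GUI is not available (for example on a server), the corpus can also be fetched by name. The following is a rough sketch: `ensure_cmudict` is a hypothetical helper name, and nltk is imported inside the function only so that the definition itself does not require nltk to be present.

```python
def ensure_cmudict():
    """Download the cmudict corpus only if it is not already installed."""
    import nltk  # imported lazily; assumes nltk is installed when called

    try:
        # Raises LookupError if the corpus is not in nltk's data path.
        nltk.data.find("corpora/cmudict")
    except LookupError:
        # Non-interactive download of just this corpus (needs network access).
        nltk.download("cmudict")
```

Calling `ensure_cmudict()` once before `nltk.corpus.cmudict.dict()` should be enough.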