Temporary removal of HTML from string for Google Translate API to reduce cost

Submitted by 南笙酒味 on 2019-12-31 04:18:05

Question


I have to translate some details using the Google Translate API, which we pay for. The details contain HTML, and Google charges per character. I don't want to send the complete content, only the English text with the HTML removed. I can strip HTML tags and entities using PHP functions, but after translation I have to put the translated text back inside the HTML tags so that it displays properly. The content will also include CSS.

Example:

<strong>This is a test</strong><br /> &nbsp; <custom tag>This is a test</custom tag><br />

After translation to Spanish I need:

<strong>Translated content </strong><br /> &nbsp; <p>Translated content </p><br />

How can I preserve the HTML formatting without sending the HTML to the API?


Answer 1:


Haha, I had that problem too, but it was a while ago...

I think the problem was that, due to the nature of translation, some parts of a sentence were swapped, so at first I could not fit the tags back in at the same positions. But I think there was a way to get some metadata from the translation process that shows which parts of the sentence moved to a new position and what their content was. I know I solved it in the end, but I can't recall how :(

If every word ends up in the same place after translation, you could first split the string on whitespace or HTML tags into an array, remember where each HTML tag was, and reapply the tags after translation.
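
The split-and-reapply idea can be sketched roughly as follows. This is a minimal illustration in Python, not the code I actually used back then. The names `split_markup`, `translate_html` and `billable_chars` are made up for this example, and `translate` stands for whatever function wraps the paid API call. It assumes each run of text between tags can be translated on its own, which is exactly where the swapped-sentence-parts problem shows up when a sentence is broken up by inline tags. It also treats the contents of `<style>` or `<script>` elements as ordinary text, so those would need to be filtered out separately.

```python
import re

# Matches one HTML tag, or one character entity such as &nbsp; or &#160;.
# The capturing group makes re.split() keep the markup in its result.
MARKUP = re.compile(
    r'(<[^>]+>|&(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)'
)


def split_markup(html):
    """Return (is_markup, chunk) pairs in document order."""
    parts = []
    # With one capturing group, re.split() alternates text, markup, text, ...
    # so odd indexes are always markup.
    for i, chunk in enumerate(MARKUP.split(html)):
        if chunk:
            parts.append((i % 2 == 1, chunk))
    return parts


def translate_html(html, translate):
    """Translate only the text chunks; markup and whitespace stay untouched.

    `translate` is a hypothetical callable (str -> str) wrapping the API.
    """
    out = []
    for is_markup, chunk in split_markup(html):
        core = chunk.strip()
        if is_markup or not core:
            out.append(chunk)
            continue
        # Keep leading and trailing whitespace out of the billed request.
        lead = chunk[:len(chunk) - len(chunk.lstrip())]
        trail = chunk[len(chunk.rstrip()):]
        out.append(lead + translate(core) + trail)
    return "".join(out)


def billable_chars(html):
    """Count the characters that would actually be sent for translation."""
    return sum(len(chunk.strip())
               for is_markup, chunk in split_markup(html)
               if not is_markup)


if __name__ == "__main__":
    # Stand-in translator so the sketch runs without calling any API.
    fake = lambda s: {"This is a test": "Esto es una prueba"}.get(s, s)
    sample = ('<strong>This is a test</strong><br /> &nbsp; '
              '<custom tag>This is a test</custom tag><br />')
    print(translate_html(sample, fake))
    print(billable_chars(sample), "of", len(sample), "characters sent")
```

In a real run you would probably collect all the text chunks first and send them in one batched request instead of one call per chunk, if the API you use supports that.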




Answer 2:


Add the Google Translate service to your website and add notes on which words not to translate.

https://translate.google.com/manager/



Source: https://stackoverflow.com/questions/11541908/temporary-removal-of-html-from-string-for-google-translate-api-to-reduce-cost
