Google Translate API outputs HTML entities

Submitted by 谁说胖子不能爱 on 2019-12-08 18:46:49

Question


ENGLISH: Sale ID prefix is a required field

FRENCH: Vente préfixe d&#39;ID est un champ obligatoire

Is there a way to have Google Translate NOT output the HTML entity and instead output the actual character (')?

CODE: (SEE translateTo)

#!/usr/bin/php
<?php
$languages = array('english' => 'en', 'spanish' => 'es', 'indonesia' => 'id', 'french' => 'fr', 'italian' => 'it', 'dutch' => 'nl', 'portugues' => 'pt', 'arabic' => 'ar');

fwrite(STDOUT, "Please enter file: ");
$file = trim(fgets(STDIN));

//Run until user kills it
while(true)
{
    fwrite(STDOUT, "Please enter key: ");
    $key = trim(fgets(STDIN));

    fwrite(STDOUT, "Please enter english value: ");
    $value = trim(fgets(STDIN));

    foreach($languages as $folder=>$code)
    {
        $path = dirname(__FILE__).'/../../application/language/'.$folder.'/'.$file;
        $translatedValue = translateTo($value, $code);

        $current_file_contents = file_get_contents($path);

        //If we have already translated, update it
        if (preg_match("/['\"]{1}{$key}['\"]{1}/",$current_file_contents))
        {
            $find_existing_translation = "/(\[['\"]{1})({$key}['\"]{1}[^=]+=[ ]*['\"]{1})([^'\"]+)(['\"]{1};)/";
            $new_file_contents = preg_replace($find_existing_translation, '${1}${2}'.$translatedValue.'${4}', $current_file_contents);
            file_put_contents($path, $new_file_contents);
        }
        else //We haven't translated: Add
        {
            $pair = "\$lang['$key'] = '$translatedValue';";
            file_put_contents($path, str_replace('?>', "$pair\n?>", $current_file_contents));
        }
    }


    fwrite(STDOUT, "Quit? (y/n): ");
    $quit = strtolower(trim(fgets(STDIN)));

    if ($quit == 'y' || $quit == 'yes')
    {
        exit(0);
    }
}

function translateTo($value, $language_key)
{
    if ($language_key == 'en')
    {
        return $value;
    }

    $api_key = 'MY_API_KEY';
    $value = urlencode($value);

    $url ="https://www.googleapis.com/language/translate/v2?key=$api_key&q=$value&source=en&target=$language_key";

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
    $body = curl_exec($ch);
    curl_close($ch);

    $json = json_decode($body);

    return $json->data->translations[0]->translatedText;
}
?>

Answer 1:


According to the Google Translate documentation, you can choose the format of the text to be translated (see format in the query parameters). The format defaults to HTML if not specified.

Set this query parameter to text to indicate that you are sending plain text; Google will likely return the translated text in the same format it received.

So your PHP code could become:

$baseUrl = "https://www.googleapis.com/language/translate/v2";
$params = "?key=$api_key&q=$value&source=en&target=$language_key&format=text";
$ch = curl_init();
// PHP concatenates strings with ".", not "+"
curl_setopt($ch, CURLOPT_URL, $baseUrl . $params);
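
As a variation on the snippet above, here is a minimal sketch that builds the request URL with http_build_query, so the text is URL-encoded in one place and format=text is always sent. The helper name buildTranslateUrl and its parameter names are hypothetical and not part of the original script; the endpoint and query parameters are the ones already used in the question.

```php
<?php
// Hypothetical helper (not in the original script): builds the v2 request URL.
function buildTranslateUrl(string $apiKey, string $text, string $target, string $source = 'en', string $format = 'text'): string
{
    $baseUrl = 'https://www.googleapis.com/language/translate/v2';

    // http_build_query() URL-encodes every value, so no separate urlencode() call is needed.
    // The separator is passed explicitly so the result does not depend on php.ini settings.
    $query = http_build_query([
        'key'    => $apiKey,
        'q'      => $text,
        'source' => $source,
        'target' => $target,
        'format' => $format, // "text" asks for plain text instead of the default HTML
    ], '', '&');

    return $baseUrl . '?' . $query;
}

echo buildTranslateUrl('MY_API_KEY', 'Sale ID prefix is a required field', 'fr'), PHP_EOL;
```

Inside translateTo, the returned string could be handed to curl_setopt($ch, CURLOPT_URL, ...) in place of the hand-assembled $url.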



Answer 2:


If you use the Google Translate client library for Python, pass format_ (with a trailing underscore) to the translate method, not format.




Answer 3:


If you set format to text, the API treats the markup itself as text, so content inside HTML tags (such as attribute values) is translated as well. Assume your input is:

This is a <a href="https://example.com/path">link</a>

Then example and path will be translated as well, which breaks the link.

To avoid this and still fix your problem, keep format set to html and unescape the text you receive back from Google Translate. In PHP you can use html_entity_decode.
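
A minimal sketch of that unescaping step, using the question's French sentence with the apostrophe written as the entity &#39; (an assumed example of what the API returns in HTML mode). ENT_QUOTES is passed explicitly because, before PHP 8.1, the default flags of html_entity_decode leave single-quote entities undecoded.

```php
<?php
// Assumed sample of the translatedText field when format is left at its HTML default.
$translatedText = 'Vente préfixe d&#39;ID est un champ obligatoire';

// ENT_QUOTES makes sure single-quote entities such as &#39; are decoded too.
$decoded = html_entity_decode($translatedText, ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo $decoded, PHP_EOL;
```

In the question's translateTo function, the same call could wrap $json->data->translations[0]->translatedText before it is returned.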



Source: https://stackoverflow.com/questions/26851419/google-translate-api-outputs-html-entities
