Question
I already have the following
[attr]POFILE merge=merge-po-files
locale/*.po POFILE
in .gitattributes, and I'd like merging of branches to work correctly when the same localization file (e.g. locale/en.po) has been modified in parallel branches. I'm currently using the following merge driver:
#!/bin/bash
# git merge driver for .PO files (gettext localizations)
# Install:
# git config merge.merge-po-files.driver "./bin/merge-po-files %A %O %B"
LOCAL="${1}._LOCAL_"
BASE="${2}._BASE_"
REMOTE="${3}._REMOTE_"
# rename to bit more meaningful filenames to get better conflict results
cp "${1}" "$LOCAL"
cp "${2}" "$BASE"
cp "${3}" "$REMOTE"
# merge files and overwrite local file with the result
msgcat "$LOCAL" "$BASE" "$REMOTE" -o "${1}" || exit 1
# cleanup
rm -f "$LOCAL" "$BASE" "$REMOTE"
# check if merge has conflicts
grep -qF '#-#-#-#-#' "${1}" && exit 1
# if we get here, merge is successful
exit 0
However, msgcat is too dumb, and this is not a true three-way merge.
For example, if I have
BASE version
msgid "foo"
msgstr "foo"
LOCAL version
msgid "foo"
msgstr "bar"
REMOTE version
msgid "foo"
msgstr "foo"
I'll end up with a conflict. However, a true three-way merge driver would output the correct merge:
msgid "foo"
msgstr "bar"
Note that I cannot simply add --use-first to msgcat because the REMOTE could contain the updated translation. In addition, if BASE, LOCAL and REMOTE are all different, I still want a conflict, because that really would be a conflict.
What do I need to change to make this work? Bonus points for a less insane conflict marker than '#-#-#-#-#', if possible.
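To make the expected behaviour concrete, here is a minimal sketch of the three-way rule applied to a single msgstr value. The function name three_way_pick is hypothetical and only illustrates the decision table; it is not a PO parser and not a replacement for the driver.

```shell
#!/bin/bash
# three_way_pick BASE LOCAL REMOTE  (hypothetical helper, illustration only)
# Prints the merged value and returns 0, or returns 1 on a real conflict.
three_way_pick() {
    local base=$1 loc=$2 remote=$3
    if [[ $loc == "$remote" ]]; then
        printf '%s\n' "$loc"        # both sides agree (or nothing changed)
    elif [[ $loc == "$base" ]]; then
        printf '%s\n' "$remote"     # only REMOTE changed
    elif [[ $remote == "$base" ]]; then
        printf '%s\n' "$loc"        # only LOCAL changed
    else
        return 1                    # both changed, and differently
    fi
}

# The example from the question: BASE=foo, LOCAL=bar, REMOTE=foo
three_way_pick "foo" "bar" "foo"    # prints: bar
```

With all three values different (for instance foo, bar, baz) the function prints nothing and returns 1, which is the case where a conflict is actually wanted.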
Answer 1:
Taking some inspiration from Mikko's answer, we've added a full-fledged 3-way merger to the git-whistles Ruby gem.
It doesn't rely on git-merge or on rewriting strings with Perl, and only manipulates PO files with Gettext tools.
Here's the code (MIT licensed):
#!/bin/bash
#
# Three-way merge driver for PO files
#
set -e
# failure handler
on_error() {
  local parent_lineno="$1"
  local message="$2"
  local code="${3:-1}"
  if [[ -n "$message" ]] ; then
    echo "Error on or near line ${parent_lineno}: ${message}; exiting with status ${code}"
  else
    echo "Error on or near line ${parent_lineno}; exiting with status ${code}"
  fi
  exit 255
}
trap 'on_error ${LINENO}' ERR
# given a file, find the path that matches its contents
show_file() {
  hash=$(git hash-object "${1}")
  # cut -b54- strips "<mode> blob <40-char SHA-1><TAB>" from each ls-tree line
  git ls-tree -r HEAD | grep -F "$hash" | cut -b54-
}
# wraps msgmerge with default options
function m_msgmerge() {
  msgmerge --force-po --quiet --no-fuzzy-matching "$@"
}
# wraps msgcat with default options
function m_msgcat() {
  msgcat --force-po "$@"
}
# removes the "graveyard strings" from the input
function strip_graveyard() {
  sed -e '/^#~/d'
}
# select messages with a conflict marker
# pass -v to invert the selection
function grep_conflicts() {
  msggrep "$@" --msgstr -F -e '#-#-#' -
}
# select messages from $1 that are also in $2 but whose contents have changed
function extract_changes() {
  msgcat -o - "$1" "$2" \
    | grep_conflicts \
    | m_msgmerge -o - "$1" - \
    | strip_graveyard
}
# arguments: base (%O), current (%A), other (%B)
BASE=$1
LOCAL=$2
REMOTE=$3
OUTPUT=$LOCAL
TEMP=`mktemp /tmp/merge-po.XXXX`
echo "Using custom PO merge driver (`show_file ${LOCAL}`; $TEMP)"
# Extract the PO header from the current branch (top of file until first empty line)
sed -e '/^$/q' < $LOCAL > ${TEMP}.header
# clean input files
msguniq --force-po -o ${TEMP}.base --unique ${BASE}
msguniq --force-po -o ${TEMP}.local --unique ${LOCAL}
msguniq --force-po -o ${TEMP}.remote --unique ${REMOTE}
# messages changed on local
extract_changes ${TEMP}.local ${TEMP}.base > ${TEMP}.local-changes
# messages changed on remote
extract_changes ${TEMP}.remote ${TEMP}.base > ${TEMP}.remote-changes
# unchanged messages
m_msgcat -o - ${TEMP}.base ${TEMP}.local ${TEMP}.remote \
| grep_conflicts -v \
> ${TEMP}.unchanged
# messages changed on both local and remote (conflicts)
m_msgcat -o - ${TEMP}.remote-changes ${TEMP}.local-changes \
| grep_conflicts \
> ${TEMP}.conflicts
# messages changed on local, not on remote; and vice-versa
m_msgcat -o ${TEMP}.local-only --unique ${TEMP}.local-changes ${TEMP}.conflicts
m_msgcat -o ${TEMP}.remote-only --unique ${TEMP}.remote-changes ${TEMP}.conflicts
# the big merge
m_msgcat -o ${TEMP}.merge1 ${TEMP}.unchanged ${TEMP}.conflicts ${TEMP}.local-only ${TEMP}.remote-only
# create a template to filter messages actually needed (those on local and remote)
m_msgcat -o - ${TEMP}.local ${TEMP}.remote \
| m_msgmerge -o ${TEMP}.merge2 ${TEMP}.merge1 -
# final merge, adds saved header
m_msgcat -o ${TEMP}.merge3 --use-first ${TEMP}.header ${TEMP}.merge2
# produce output file (overwrites input LOCAL file)
cat ${TEMP}.merge3 > $OUTPUT
# check for conflicts
if grep '#-#' $OUTPUT > /dev/null ; then
  echo "Conflict(s) detected"
  echo " between ${TEMP}.local and ${TEMP}.remote"
  exit 1
fi
rm -f ${TEMP}*
exit 0
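A note on registering this script: it reads its arguments as base, current, other, so unlike the driver in the question it would need to be configured with %O %A %B rather than %A %O %B. A minimal sketch, assuming the script is saved as ./bin/merge-po-files (a hypothetical path borrowed from the question) and that .gitattributes still refers to the merge-po-files driver name:

```shell
# Assumption: the script above is saved as ./bin/merge-po-files
# %O = common ancestor, %A = current version, %B = other branch's version
git config merge.merge-po-files.name "three-way PO merge driver"
git config merge.merge-po-files.driver "./bin/merge-po-files %O %A %B"
```

Git expects the driver to leave the merge result in the file passed as %A, which is what the script does when it overwrites $LOCAL.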
Answer 2:
Here's a somewhat complex example driver that seems to output a correct merge, although the result may contain some translations that should have been deleted by the local or remote version.
Nothing should be missing, so this driver just adds some extra clutter in some cases.
This version uses the native gettext conflict marker, which looks like #-#-#-#-#, combined with the fuzzy flag instead of normal git conflict markers.
The driver is a bit ugly in order to work around bugs (or features) in msgcat and msguniq:
#!/bin/bash
# git merge driver for .PO files
# Copyright (c) Mikko Rantalainen <mikko.rantalainen@peda.net>, 2013
# License: MIT
ORIG_HASH=$(git hash-object "${1}")
WORKFILE=$(git ls-tree -r HEAD | grep -F "$ORIG_HASH" | cut -b54-)
echo "Using custom merge driver for $WORKFILE..."
LOCAL="${1}._LOCAL_"
BASE="${2}._BASE_"
REMOTE="${3}._REMOTE_"
LOCAL_ONELINE="$LOCAL""ONELINE_"
BASE_ONELINE="$BASE""ONELINE_"
REMOTE_ONELINE="$REMOTE""ONELINE_"
OUTPUT="$LOCAL""OUTPUT_"
MERGED="$LOCAL""MERGED_"
MERGED2="$LOCAL""MERGED2_"
TEMPLATE1="$LOCAL""TEMPLATE1_"
TEMPLATE2="$LOCAL""TEMPLATE2_"
FALLBACK_OBSOLETE="$LOCAL""FALLBACK_OBSOLETE_"
# standardize the input files for regexping
# default to UTF-8 in case charset is still the placeholder "CHARSET"
cat "${1}" | perl -npe 's!(^"Content-Type: text/plain; charset=)(CHARSET)(\\n"$)!$1UTF-8$3!' | msgcat --no-wrap --sort-output - > "$LOCAL"
cat "${2}" | perl -npe 's!(^"Content-Type: text/plain; charset=)(CHARSET)(\\n"$)!$1UTF-8$3!' | msgcat --no-wrap --sort-output - > "$BASE"
cat "${3}" | perl -npe 's!(^"Content-Type: text/plain; charset=)(CHARSET)(\\n"$)!$1UTF-8$3!' | msgcat --no-wrap --sort-output - > "$REMOTE"
# convert each definition to single line presentation
# extra fill is required to make sure that git separates each conflict
perl -npe 'BEGIN {$/ = "\n\n"}; s/#\n$/\n/s; s/#/##/sg; s/\n/#n/sg; s/#n$/\n/sg; s/#n$/\n/sg; $_.="#fill#\n" x 4' "$LOCAL" > "$LOCAL_ONELINE"
perl -npe 'BEGIN {$/ = "\n\n"}; s/#\n$/\n/s; s/#/##/sg; s/\n/#n/sg; s/#n$/\n/sg; s/#n$/\n/sg; $_.="#fill#\n" x 4' "$BASE" > "$BASE_ONELINE"
perl -npe 'BEGIN {$/ = "\n\n"}; s/#\n$/\n/s; s/#/##/sg; s/\n/#n/sg; s/#n$/\n/sg; s/#n$/\n/sg; $_.="#fill#\n" x 4' "$REMOTE" > "$REMOTE_ONELINE"
# merge files using normal git merge machinery
git merge-file -p --union -L "Current (working directory)" -L "Base (common ancestor)" -L "Incoming (applied changeset)" "$LOCAL_ONELINE" "$BASE_ONELINE" "$REMOTE_ONELINE" > "$MERGED"
MERGESTATUS=$?
# remove possibly duplicated headers (workaround msguniq bug http://comments.gmane.org/gmane.comp.gnu.gettext.bugs/96)
cat "$MERGED" | perl -npe 'BEGIN {$/ = "\n\n"}; s/^([^\n]+#nmsgid ""#nmsgstr ""#n.*?\n)([^\n]+#nmsgid ""#nmsgstr ""#n.*?\n)+/$1/gs' > "$MERGED2"
# remove lines that have totally empty msgstr
# and convert back to normal PO file representation
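# (decode '##' and '#n' in one pass so that an escaped '#' followed by a literal 'n' does not become a newline)
cat "$MERGED2" | grep -v '#nmsgstr ""$' | grep -v '^#fill#$' | perl -npe 's/#([#n])/$1 eq "n" ? "\n" : "#"/ge' > "$MERGED"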
# run the output through msguniq to merge conflicts gettext style
# msguniq seems to have a bug that causes empty output if zero msgids
# are found after the header. Expected output would be the header...
# Workaround the bug by adding an empty obsolete fallback msgid
# that will be automatically removed by msguniq
cat > "$FALLBACK_OBSOLETE" << 'EOF'
#~ msgid "obsolete fallback"
#~ msgstr ""
EOF
cat "$MERGED" "$FALLBACK_OBSOLETE" | msguniq --no-wrap --sort-output > "$MERGED2"
# create a hacked template from default merge between 3 versions
# we do this to try to preserve original file ordering
msgcat --use-first "$LOCAL" "$REMOTE" "$BASE" > "$TEMPLATE1"
msghack --empty "$TEMPLATE1" > "$TEMPLATE2"
msgmerge --silent --no-wrap --no-fuzzy-matching "$MERGED2" "$TEMPLATE2" > "$OUTPUT"
# show some results to stdout
if grep -q '#-#-#-#-#' "$OUTPUT"
then
  FUZZY=$(cat "$OUTPUT" | msgattrib --only-fuzzy --no-obsolete --color | perl -npe 'BEGIN{ undef $/; }; s/^.*?msgid "".*?\n\n//s')
  if test -n "$FUZZY"
  then
    echo "-------------------------------"
    echo "Fuzzy translations after merge:"
    echo "-------------------------------"
    echo "$FUZZY"
    echo "-------------------------------"
  fi
fi
# git merge driver must overwrite the first parameter with output
mv "$OUTPUT" "${1}"
# cleanup
rm -f "$LOCAL" "$BASE" "$REMOTE" "$LOCAL_ONELINE" "$BASE_ONELINE" "$REMOTE_ONELINE" "$MERGED" "$MERGED2" "$TEMPLATE1" "$TEMPLATE2" "$FALLBACK_OBSOLETE"
# return conflict if merge has conflicts according to msgcat/msguniq
grep -q '#-#-#-#-#' "${1}" && exit 1
# otherwise, return git merge status
exit $MERGESTATUS
# Steps to install this driver:
# (1) Edit ".git/config" in your repository directory
# (2) Add following section:
#
# [merge "merge-po-files"]
# name = merge po-files driver
# driver = ./bin/merge-po-files %A %O %B
# recursive = binary
#
# or
#
# git config merge.merge-po-files.driver "./bin/merge-po-files %A %O %B"
#
# The file ".gitattributes" will point git to use this merge driver.
Short explanation about this driver:
- It converts the regular PO file format to a single-line format where each line is one translation entry.
- Then it uses regular git merge-file --union to do the merge, and after the merge the resulting single-line format is converted back to the regular PO file format. The actual conflict resolution is done after this using msguniq.
- Finally, it merges the resulting file with a template generated by regular msgcat from the original input files, to restore possibly lost metadata.
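The single-line encoding used in the first two steps can be sketched in pure bash. The function names encode_entry and decode_entry are hypothetical (the driver itself does this with Perl one-liners), and the sketch only covers the escaping scheme, where '#' becomes '##' and a newline becomes '#n'; it leaves out the "#fill#" padding lines and the merge itself.

```shell
#!/bin/bash
# Hypothetical helpers illustrating the escaping scheme; not part of the driver.

# Encode one PO entry as a single line: '#' -> '##', then newline -> '#n'.
encode_entry() {
    local s=$1
    s=${s//\#/##}
    s=${s//$'\n'/#n}
    printf '%s\n' "$s"
}

# Decode in one left-to-right pass, so that an escaped '#' followed by a
# literal 'n' is not mistaken for an encoded newline.
decode_entry() {
    local s=$1 out='' c next
    local i=0
    while (( i < ${#s} )); do
        c=${s:i:1}
        if [[ $c == '#' ]]; then
            next=${s:i+1:1}
            if [[ $next == 'n' ]]; then out+=$'\n'; else out+='#'; fi
            i=$(( i + 2 ))
        else
            out+=$c
            i=$(( i + 1 ))
        fi
    done
    printf '%s\n' "$out"
}

entry=$'#: src/a.c:1\nmsgid "item #number"\nmsgstr "x"'
encoded=$(encode_entry "$entry")
printf '%s\n' "$encoded"
decode_entry "$encoded"
```

Because every entry ends up on exactly one line, a line-based merge such as git merge-file can only ever keep or drop a translation entry as a whole.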
Warning: this driver will use msgcat --no-wrap on the .PO file and will force UTF-8 encoding if the actual encoding is not specified.
If you want to use this merge driver but always inspect the results, change the final exit $MERGESTATUS to exit 1.
After getting a merge conflict from this driver, the best method for fixing the conflict is to open the conflicting file with virtaal and select Navigation: Incomplete.
I find this UI a pretty nice tool for fixing the conflict.
Answer 3:
Here's an example driver that does a correct text-based diff with conflict markers in the correct places. However, in case of a conflict, git mergetool is sure to mess up the results, so this is not really good. If you want to fix conflicting merges using just a text editor, then this should be fine:
#!/bin/bash
# git merge driver for .PO files
# Copyright (c) Mikko Rantalainen <mikko.rantalainen@peda.net>, 2013
# License: MIT
LOCAL="${1}._LOCAL_"
BASE="${2}._BASE_"
REMOTE="${3}._REMOTE_"
MERGED="${1}._MERGED_"
OUTPUT="$LOCAL""OUTPUT_"
LOCAL_ONELINE="$LOCAL""ONELINE_"
BASE_ONELINE="$BASE""ONELINE_"
REMOTE_ONELINE="$REMOTE""ONELINE_"
# standardize the input files for regexping
msgcat --no-wrap --strict --sort-output "${1}" > "$LOCAL"
msgcat --no-wrap --strict --sort-output "${2}" > "$BASE"
msgcat --no-wrap --strict --sort-output "${3}" > "$REMOTE"
# convert each definition to single line presentation
# extra fill is required to make sure that git separates each conflict
perl -npe 'BEGIN {$/ = "#\n"}; s/#\n$/\n/s; s/#/##/sg; s/\n/#n/sg; s/#n$/\n/sg; s/#n$/\n/sg; $_.="#fill#\n" x 4' "$LOCAL" > "$LOCAL_ONELINE"
perl -npe 'BEGIN {$/ = "#\n"}; s/#\n$/\n/s; s/#/##/sg; s/\n/#n/sg; s/#n$/\n/sg; s/#n$/\n/sg; $_.="#fill#\n" x 4' "$BASE" > "$BASE_ONELINE"
perl -npe 'BEGIN {$/ = "#\n"}; s/#\n$/\n/s; s/#/##/sg; s/\n/#n/sg; s/#n$/\n/sg; s/#n$/\n/sg; $_.="#fill#\n" x 4' "$REMOTE" > "$REMOTE_ONELINE"
# merge files using normal git merge machinery
git merge-file -p -L "Current (working directory)" -L "Base (common ancestor)" -L "Incoming (another change)" "$LOCAL_ONELINE" "$BASE_ONELINE" "$REMOTE_ONELINE" > "$MERGED"
MERGESTATUS=$?
# convert back to normal PO file representation
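# (decode '##' and '#n' in one pass so that an escaped '#' followed by a literal 'n' does not become a newline)
cat "$MERGED" | grep -v '^#fill#$' | perl -npe 's/#([#n])/$1 eq "n" ? "\n" : "#"/ge' > "$OUTPUT"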
# git merge driver must overwrite the first parameter with output
mv "$OUTPUT" "${1}"
# cleanup
rm -f "$LOCAL" "$BASE" "$REMOTE" "$LOCAL_ONELINE" "$BASE_ONELINE" "$REMOTE_ONELINE" "$MERGED"
exit $MERGESTATUS
# Steps to install this driver:
# (1) Edit ".git/config" in your repository directory
# (2) Add following section:
#
# [merge "merge-po-files"]
# name = merge po-files driver
# driver = ./bin/merge-po-files %A %O %B
# recursive = binary
#
# or
#
# git config merge.merge-po-files.driver "./bin/merge-po-files %A %O %B"
#
# The file ".gitattributes" will point git to use this merge driver.
Short explanation about this driver: it converts the regular PO file format to a single-line format where each line is one translation entry. Then it uses regular git merge-file to do the merge, and after the merge the resulting single-line format is converted back to the regular PO file format. Warning: this driver will use msgcat --sort-output on the .PO file, so if you want your PO files in some specific order, this may not be the tool for you.
Source: https://stackoverflow.com/questions/16214067/wheres-the-3-way-git-merge-driver-for-po-gettext-files