How to replace a string in a whole Git history?

Posted by 人走茶凉 on 2020-12-02 14:48:16

Question


I have one of my passwords committed in what are probably a few files in my Git repo. Is there a way to automatically replace this password with some other string throughout the whole history, so that no trace of it remains? Ideally, I could write a simple bash script that takes the string to find and the string to replace it with and does all the work itself, something like:

./replaceStringInWholeGitHistory.sh "my_password" "xxxxxxxx"

Edit: this question is not a duplicate of that one, because I am asking about replacing strings without removing whole files.


Answer 1:


First, find all the files that could contain the password. Suppose the password is abc123 and the branch is master. You may need to exclude files in which abc123 appears only as an ordinary string rather than as the password.

git log -S "abc123" master --name-only --pretty=format: | sort -u

Then replace "abc123" with "******". Suppose one of the files is foo/bar.txt.

git filter-branch --tree-filter "if [ -f foo/bar.txt ]; then sed -i 's/abc123/******/g' foo/bar.txt; fi"

Finally, force push master to the remote repository if it exists.

git push origin -f master:master

I ran a simple test and it worked, but I'm not sure it fits your case. You need to handle all the affected files in all branches. As for tags, you may have to delete all the old ones and create new ones.
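After the rewrite, the old commits are usually still reachable in the local repository through the backup refs that git filter-branch stores under refs/original/ and through the reflog, so the password has not yet disappeared locally. A minimal sketch of a cleanup step, assuming those backups are no longer needed (the function name purge_old_history is hypothetical, not part of Git):

```shell
#!/bin/bash
# Hypothetical helper: drop the backups left behind by git filter-branch
# and prune the objects that are no longer reachable.
purge_old_history() {
  # Delete every backup ref that filter-branch created under refs/original/
  git for-each-ref --format='%(refname)' refs/original/ |
    while read -r ref; do
      git update-ref -d "$ref"
    done
  # Expire all reflog entries so they no longer keep old commits alive
  git reflog expire --expire=now --all
  # Remove unreachable objects immediately
  git gc --quiet --prune=now
}
```

Running git log --all -S "abc123" --oneline afterwards should print nothing if no reachable commit still adds or removes the string. This cleanup is local only; copies on the remote and in other clones are not affected by it.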




Answer 2:


First of all, I'd like to thank ElpieKay, who posted the core commands of my solution; I have only automated them.

So, I finally have the script I wanted. I split it into several pieces that depend on each other but can also be used as independent scripts. It looks like this:

censorStringsInWholeGitHistory.sh:

#!/bin/bash
# arguments are the strings to censor

for string in "$@"
do
  echo ""
  echo "================ Censoring string \"$string\": ================"
  ~/replaceStringInWholeGitHistory.sh "$string" "********"
done

usage:

~/censorStringsInWholeGitHistory.sh "my_password1" "my_password2" "some_f_word"

replaceStringInWholeGitHistory.sh:

#!/bin/bash
# $1 - string to find
# $2 - string to replace with

for branch in $(git for-each-ref --format='%(refname:short)' refs/heads/); do
  echo ""
  echo ">>> Replacing strings in branch $branch:"
  echo ""
  ~/replaceStringInBranch.sh "$branch" "$1" "$2"
done

usage:

~/replaceStringInWholeGitHistory.sh "my_password" "********"

replaceStringInBranch.sh:

#!/bin/bash
# $1 - branch
# $2 - string to find
# $3 - string to replace with

git checkout "$1"
for file in $(~/findFilesContainingStringInBranch.sh "$2"); do
  echo "          Filtering file $file:"
  ~/changeStringsInFileInCurrentBranch.sh "$file" "$2" "$3"
done

usage:

~/replaceStringInBranch.sh master "my_password" "********"

findFilesContainingStringInBranch.sh:

#!/bin/bash

# $1 - string to find
# $2 - branch name or nothing (current branch in that case)

git log -S "$1" $2 --name-only --pretty=format: -- | sort -u

usage:

~/findFilesContainingStringInBranch.sh "my_password" master

changeStringsInFileInCurrentBranch.sh:

#!/bin/bash

# $1 - file name
# $2 - string to find
# $3 - string to replace with

git filter-branch -f --tree-filter "if [ -f '$1' ]; then sed -i 's/$2/$3/g' '$1'; fi"

usage:

~/changeStringsInFileInCurrentBranch.sh "abc.txt" "my_password" "********"
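Note that the sed expression in changeStringsInFileInCurrentBranch.sh treats the string to find as a regular expression and uses / as its delimiter, so a password containing characters such as /, ., * or & would be misinterpreted. A minimal sketch of two escaping helpers that could be applied to the arguments first (both function names are hypothetical and not part of the scripts above):

```shell
#!/bin/bash
# Hypothetical helpers (not part of the original scripts).

# Escape the characters that are special in a basic regular expression,
# plus the / delimiter, so that the string is matched literally.
escape_sed_pattern() {
  printf '%s\n' "$1" | sed 's|[][\\/.*^$]|\\&|g'
}

# Escape the characters that are special on the replacement side: \ / &
escape_sed_replacement() {
  printf '%s\n' "$1" | sed 's|[\\/&]|\\&|g'
}
```

For example, sed -i "s/$(escape_sed_pattern "$2")/$(escape_sed_replacement "$3")/g" "$1" would then replace the string literally. This only covers sed's own syntax; strings containing newlines, and the shell quoting inside the --tree-filter argument, still need separate care.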

I keep all of these scripts in my home folder, which this version requires in order to work. I'm not sure that's the best option, but for now I can't find a better one. Of course, every script has to be executable, which can be achieved with chmod +x ~/myscript.sh.

My script is probably not optimal, and for big repos it will take a very long time to run, but it works :)

And, at the very end, we can push our censored repo to any remote with:

git push <remote> -f --all

Edit: important hint from ElpieKay:

Don't forget to delete and recreate tags that you have pushed. They are still pointing to the old commits that may contain your password.

Maybe I'll improve my script in the future to do this automatically.
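git filter-branch can also rewrite tags itself: with --tag-name-filter cat the rewritten tags keep their names, and -- --all applies the filter to every ref instead of only the current branch. A sketch under those assumptions (the function name replace_in_all_refs is hypothetical; FILTER_BRANCH_SQUELCH_WARNING only suppresses the startup warning printed by newer Git versions):

```shell
#!/bin/bash
# Hypothetical helper: rewrite all branches and tags in a single pass.
# $1 - file name
# $2 - string to find (interpreted by sed as a regular expression)
# $3 - string to replace with
replace_in_all_refs() {
  local file=$1 find=$2 replace=$3
  FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -f \
    --tree-filter "if [ -f '$file' ]; then sed -i 's/$find/$replace/g' '$file'; fi" \
    --tag-name-filter cat \
    -- --all
}
```

Usage would look like replace_in_all_refs "abc.txt" "my_password" "********". According to the git-filter-branch manual, signatures on rewritten signed tags are stripped, and tags that were already pushed still have to be force-pushed or recreated on the remote.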




Answer 3:


git filter-repo --replace-text

Git 2.25 man git-filter-branch already clearly recommends using git filter-repo instead of git filter-branch, so here we go.

Installation: https://superuser.com/questions/1563034/how-do-you-install-git-filter-repo/1589985#1589985

python3 -m pip install --user git-filter-repo

and then use:

echo 'my_password==>xxxxxxxx' > replace.txt
git filter-repo --replace-text replace.txt

or, equivalently, with Bash process substitution:

git filter-repo --replace-text <(echo 'my_password==>xxxxxxxx')

Tested with this simple test repository: https://github.com/cirosantilli/test-git-filter-repository and replacement strings:

d1==>asdf
d2==>qwer

The above acts on all branches by default (so invasive!!!). To act only on selected branches, pass --refs (see: git filter-repo: can it be used on a specific branch?), e.g.:

--refs HEAD
--refs refs/heads/master

The --replace-text option is documented at: https://github.com/newren/git-filter-repo/blob/7b3e714b94a6e5b9f478cb981c7f560ef3f36506/Documentation/git-filter-repo.txt#L155

--replace-text <expressions_file>::

A file with expressions that, if found, will be replaced. By default, each expression is treated as literal text, but regex: and glob: prefixes are supported. You can end the line with ==> and some replacement text to choose a replacement choice other than the default of ***REMOVED***.
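To get the script-like interface from the question, the expressions file can be generated from command-line arguments. A minimal sketch, assuming the arguments come in find/replace pairs (the function name write_expressions is hypothetical, not part of git-filter-repo):

```shell
#!/bin/bash
# Hypothetical helper: print one "needle==>replacement" line
# for each pair of arguments.
write_expressions() {
  while [ "$#" -ge 2 ]; do
    printf '%s==>%s\n' "$1" "$2"
    shift 2
  done
}
```

For example, write_expressions "my_password" "xxxxxxxx" > replace.txt followed by git filter-repo --replace-text replace.txt. The sketch does no escaping, so a string that starts with regex: or glob:, or that itself contains ==>, would not be taken literally.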

Of course, once you've pushed a password publicly, it is already too late and you will have to change the password, so I wouldn't even bother with the replacement in that case. See also: Remove sensitive files and their commits from Git history

This seems to be the same question: How to substitute text from files in git history?

Tested on git-filter-repo ac039ecc095d.



Source: https://stackoverflow.com/questions/46950829/how-to-replace-a-string-in-a-whole-git-history
