git-hash https://www.e-learn.cn/tag/git-hash zh-hans
Difference between PowerShell's echo and CMD's echo https://www.e-learn.cn/topic/4052150 <span>Difference between PowerShell&#039;s echo and CMD&#039;s echo</span> <span><span lang="" about="/user/231" typeof="schema:Person" property="schema:name" datatype="">安稳与你</span></span> <span>2021-02-05 06:45:46</span>
<div class="field field--name-body field--type-text-with-summary field--label-hidden field--item"><h3>Question</h3><br />
<p>I get the following in PowerShell:</p>
<pre><code>D:\&gt; echo "Apple Pie" | git hash-object --stdin
157cb7be4778a9cfad23b6fb514e364522167053
D:\&gt; "Apple Pie" | git hash-object --stdin
157cb7be4778a9cfad23b6fb514e364522167053
</code></pre>
<p>but in CMD.exe:</p>
<pre><code>C:\&gt;echo "Apple Pie" | git hash-object --stdin
bb3918d5053fea31fc9a58fae1e5bdeabe3ec647
</code></pre>
<p>In a PluralSight video, I see a different value from what seems to be a Mac console:</p>
<p></p>
<p>What is the exact value piped from <code>echo</code> in each case?</p>
<p>I get a different hash if I go to one of those online SHA1 generators and enter the string <code>Apple Pie</code>. From those I get:</p>
<p><code>8d69b7365f59237d3fb052c1f2f15ea33457fe51</code></p>
<br /><h3>Answer 1:</h3><br />
<p>As far as I understand:</p>
<p>In CMD,</p>
<pre><code>echo Apple Pie|git hash-object --stdin
</code></pre>
<p>returns the same thing as the following in PowerShell:</p>
<pre><code>"Apple Pie" | git hash-object --stdin
</code></pre>
<p>That is to say:</p>
<pre><code>157cb7be4778a9cfad23b6fb514e364522167053
</code></pre>
<p>@Mofi seems to be right; you can reproduce the CMD result in PowerShell using:</p>
<pre><code>'"Apple Pie" ' | git hash-object --stdin
</code></pre>
<p>To explain the macOS result: the hash <code>157cb7be4778a9cfad23b6fb514e364522167053</code> comes from hashing the characters <code>'Apple Pie\r\n'</code> (with a carriage return and a line feed), whereas a Mac or Linux command line pipes <code>'Apple Pie\n'</code> (line feed only).</p>
<p>If you want to test this: put <code>Apple Pie</code> followed by a line break in a text file, save it with Windows line endings (CR+LF), and run <code>git hash-object yourfile.txt</code>. Then save it with Linux line endings (LF) and test again; you will find your two hashes.</p>
<hr /><p><strong><em>The part about \r\n.</em></strong></p>
<p><code>"Apple Pie" | where {$_.length -eq 9}</code> shows that the string is exactly 9 characters long.</p>
<p>As I understand it, that is because in this case the pipe connects two PowerShell commands, so it transmits an object. When the pipe connects PowerShell to an external EXE, a \r\n is appended. Here is a way to test that with a small exe file written in C#:</p>
<pre><code>using System;

namespace C_Param
{
    class Program
    {
        static void Main(string[] args)
        {
            // Read everything piped to stdin, then print each character as a hex byte.
            string line = Console.In.ReadToEnd();
            foreach (char character in line)
            {
                Console.WriteLine(String.Format("{0:X2}", Convert.ToByte(character)));
            }
        }
    }
}
</code></pre>
<p>The result in a PowerShell console is:</p>
<pre><code>"Apple Pie" | .\C_Param.exe
41
70
70
6C
65
20
50
69
65
0D
0A
</code></pre>
<p>The result in a CMD console is:</p>
<pre><code>echo "Apple Pie" | .\C_Param.exe
22
41
70
70
6C
65
20
50
69
65
22
20
0D
0A
</code></pre>
<p>QED?</p>
<br /><br /><p>Source: <code>https://stackoverflow.com/questions/61153475/difference-between-powershells-echo-and-cmds-echo</code></p></div>
<div class="field field--name-field-tags field--type-entity-reference field--label-above"> <div class="field--label">Tags</div> <div class="field--items"> <div class="field--item"><a href="/tag/powershell" hreflang="zh-hans">powershell</a></div> <div class="field--item"><a href="/tag/cmd" hreflang="zh-hans">cmd</a></div> <div class="field--item"><a href="/tag/hash" hreflang="zh-hans">hash</a></div> <div class="field--item"><a href="/tag/sha1" hreflang="zh-hans">sha1</a></div> <div class="field--item"><a href="/tag/git-hash" hreflang="zh-hans">git-hash</a></div> </div> </div>
Thu, 04 Feb 2021 22:45:46
+0000 安稳与你 4052150 at https://www.e-learn.cn
How does git compute file hashes? https://www.e-learn.cn/topic/2526775 <span>How does git compute file hashes?</span> <span><span lang="" about="/user/142" typeof="schema:Person" property="schema:name" datatype="">别说谁变了你拦得住时间么</span></span> <span>2019-12-17 03:27:47</span>
<div class="field field--name-body field--type-text-with-summary field--label-hidden field--item"><h3>Question</h3><br />
<p>The SHA1 hashes stored in the tree objects (as returned by <code>git ls-tree</code>) do not match the SHA1 hashes of the file content (as returned by <code>sha1sum</code>):</p>
<pre><code>$ git cat-file blob 4716ca912495c805b94a88ef6dc3fb4aff46bf3c | sha1sum
de20247992af0f949ae8df4fa9a37e4a03d7063e  -
</code></pre>
<p>How does git compute file hashes? Does it compress the content before computing the hash?</p>
<br /><h3>Answer 1:</h3><br />
<blockquote> <p>Git prefixes the object with "blob ", followed by the length (as a human-readable integer), followed by a NUL character</p> </blockquote>
<p><code>$ echo -e 'blob 14\0Hello, World!' | shasum<br />8ab686eafeb1f44702738c8b0f24f2567c36da6d</code></p>
<p>Source: http://alblue.bandlem.com/2011/08/git-tip-of-week-objects.html</p>
<br /><br /><br /><h3>Answer 2:</h3><br />
<p>I am only expanding on the answer by <code>@Leif Gruenwoldt</code> and detailing what is in the reference provided by <code>@Leif Gruenwoldt</code>.</p>
<p><strong>Do It Yourself..</strong></p>
<blockquote> <ul><li>Step 1. Create an empty text document (name does not matter) in your repository</li>
<li>Step 2. Stage and commit the document</li>
<li>Step 3. Identify the hash of the blob by executing <code>git ls-tree HEAD</code></li>
<li>Step 4. Find the blob's hash to be <code>e69de29bb2d1d6434b8b29ae775ad8c2e48c5391</code></li>
<li>Step 5. Snap out of your surprise and read below</li> </ul></blockquote>
<p><strong>How does Git compute its blob hashes</strong></p>
<pre><code>    Blob Hash (SHA1) = SHA1("blob " + &lt;size_of_file&gt; + "\0" + &lt;contents_of_file&gt;)
</code></pre>
<p>The text <code>blob⎵</code> is a constant prefix, and <code>\0</code> is also constant: it is the <code>NUL</code> character. The <code>&lt;size_of_file&gt;</code> (in bytes, written as a decimal number) and <code>&lt;contents_of_file&gt;</code> vary depending on the file.</p>
<p>See: What is the file format of a git commit object?</p>
<p>And that's all, folks!</p>
<p><strong>But wait!</strong> Did you notice that the <code>&lt;filename&gt;</code> is not a parameter used for the hash computation? Two files have the same hash if their contents are the same, regardless of their names or of the date and time they were created. This is one of the reasons Git handles moves and renames better than other version control systems.</p>
<p><strong>Do It Yourself (Ext)</strong></p>
<blockquote> <ul><li>Step 6. Create another empty file with a different <code>filename</code> in the same directory</li>
<li>Step 7. Compare the hashes of both your files.</li> </ul></blockquote>
<p><strong>Note:</strong></p>
<p>The link does not mention how the <code>tree</code> object is hashed. I am not certain of the algorithm and parameters; however, from my observation it probably computes a hash based on all the <code>blobs</code> and <code>trees</code> (probably their hashes) it contains.</p>
<br /><br /><br /><h3>Answer 3:</h3><br />
<p><strong><code>git hash-object</code></strong></p>
<p>This is a quick way to verify your test method:</p>
<pre><code>s='abc'
printf "$s" | git hash-object --stdin
printf "blob $(printf "$s" | wc -c)\0$s" | sha1sum
</code></pre>
<p>Output:</p>
<pre><code>f2ba8f84ab5c1bce84a7b441cb1959cfc7093b7f
f2ba8f84ab5c1bce84a7b441cb1959cfc7093b7f  -
</code></pre>
<p>where <code>sha1sum</code> is in GNU Coreutils.</p>
<p>Then it comes down to understanding the format of each object type. We have already covered the trivial <code>blob</code>; here are the others:</p>
<ul><li>commit: What is the file format of a git commit object?</li>
<li>tree: What is the internal format of a git tree object?</li>
<li>tag: How is a Git Tag Object SHA1 Created?</li> </ul>
<br /><br /><br /><h3>Answer 4:</h3><br />
<p>Based on Leif Gruenwoldt's answer, here is a shell function that substitutes for <code>git hash-object</code>:</p>
<pre class="lang-bash prettyprint-override"><code>git-hash-object () { # substitute when the `git` command is not available
    local type=blob
    [ "$1" = "-t" ] &amp;&amp; shift &amp;&amp; type=$1 &amp;&amp; shift
    # depending on eol/autocrlf settings, you may want to substitute CRLFs by LFs
    # by using `perl -pe 's/\r$//g'` instead of `cat` in the next 2 commands
    local size=$(cat "$1" | wc -c | sed 's/ .*$//')
    ( echo -en "$type $size\0"; cat "$1" ) | sha1sum | sed 's/ .*$//'
}
</code></pre>
<p>Test:</p>
<pre class="lang-bash prettyprint-override"><code>$ echo 'Hello, World!' &gt; test.txt
$ git hash-object test.txt
8ab686eafeb1f44702738c8b0f24f2567c36da6d
$ git-hash-object test.txt
8ab686eafeb1f44702738c8b0f24f2567c36da6d
</code></pre>
<br /><br /><br /><h3>Answer 5:</h3><br />
<p>I needed this for some unit tests in Python 3, so I thought I'd leave it here.</p>
<pre><code>import hashlib

def git_blob_hash(data):
    # Accept str for convenience; Git hashes raw bytes.
    if isinstance(data, str):
        data = data.encode()
    # Prepend the "blob &lt;size&gt;\0" header before hashing.
    data = b'blob ' + str(len(data)).encode() + b'\0' + data
    h = hashlib.sha1()
    h.update(data)
    return h.hexdigest()
</code></pre>
<p>I stick to <code>\n</code> line endings everywhere, but in some circumstances Git might also change your line endings before calculating this hash, so you may need a <code>.replace('\r\n', '\n')</code> in there too.</p>
<br /><br /><p>Source: <code>https://stackoverflow.com/questions/7225313/how-does-git-compute-file-hashes</code></p></div>
<div class="field field--name-field-tags field--type-entity-reference field--label-above"> <div class="field--label">Tags</div> <div class="field--items"> <div class="field--item"><a href="/tag/git" hreflang="zh-hans">git</a></div> <div class="field--item"><a href="/tag/hash" hreflang="zh-hans">hash</a></div> <div class="field--item"><a href="/tag/sha1" hreflang="zh-hans">sha1</a></div> <div class="field--item"><a href="/tag/checksum" hreflang="zh-hans">checksum</a></div> <div class="field--item"><a href="/tag/git-hash" hreflang="zh-hans">git-hash</a></div> </div> </div>
Mon, 16 Dec 2019 19:27:47 +0000 别说谁变了你拦得住时间么 2526775 at https://www.e-learn.cn
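<p>The blob formula from the answers above carries over to the other object types by swapping the type name in the header. The following is a minimal Python sketch of that idea, assuming a SHA-1 repository and input bytes that are already exactly what Git stores (no line-ending conversion is applied). The function name <code>git_object_hash</code> is hypothetical and not part of Git.</p>

```python
import hashlib


def git_object_hash(data: bytes, obj_type: str = "blob") -> str:
    """Return the SHA-1 object ID Git would assign to `data`.

    Assumption: `data` is the exact byte content Git stores, so any
    CRLF/LF conversion must already have happened before this call.
    """
    # Header is "<type> <size in decimal>" followed by a single NUL byte.
    header = f"{obj_type} {len(data)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


if __name__ == "__main__":
    # Inputs taken from the answers above.
    print(git_object_hash(b""))                 # the empty file from "Do It Yourself"
    print(git_object_hash(b"Hello, World!\n"))  # echo 'Hello, World!'
    print(git_object_hash(b"abc"))              # printf abc
```

<p>If the sketch is right, the three printed values should match the <code>e69de29b...</code>, <code>8ab686ea...</code> and <code>f2ba8f84...</code> hashes quoted in the answers. Passing the type as a parameter is only a convenience here; the contents of commit, tree and tag objects have their own formats, which this sketch does not build.</p>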
Git objects SHA-1 are file contents or file names? https://www.e-learn.cn/topic/1913153 <span>Git objects SHA-1 are file contents or file names?</span> <span><span lang="" about="/user/90" typeof="schema:Person" property="schema:name" datatype="">生来就可爱ヽ(ⅴ<●)</span></span> <span>2019-12-07 10:01:32</span> <div class="field field--name-body field--type-text-with-summary field--label-hidden field--item"><h3>Question</h3><br /><p>I am confused with how a file's actual contents are stored in .git.</p> <p>For e.g. 
<code>Version 1</code> is the actual text content in <code>test.txt</code>. When I commit it to the repo (first commit), Git returns a SHA-1 for that file, which is located in <code>.git\objects\0c\15af113a95643d7c244332b0e0b287184cd049</code>.</p>
<p>When I open the file <code>15af113a95643d7c244332b0e0b287184cd049</code> in a text editor, it's all garbage, something like this:</p>
<p><code>x+)JMU074f040031QÐKÏ,ÉLÏË/Je¨}ºõw[Éœ„ÇR­ ñ·Î}úyGª*±8#³¨,1%&gt;9?¯$5¯D¯¤¢„áôÏ3%³þú&gt;š~}Ž÷*ë²-¶ç¡êÊòR“KâKòãs+‹sô</code></p>
<p>But I'm not sure whether this garbage is the encrypted form of the text <code>Version 1</code>, or whether that text is represented by the SHA-1 <code>15af113a95643d7c244332b0e0b287184cd049</code>.</p>
<br /><h3>Answer 1:</h3><br />
<p>The correct answer to the question in the subject line:</p>
<blockquote> <p>Git objects SHA-1 are file contents or file names?</p> </blockquote>
<p>is probably "neither", since you were referring to the contents of the loose object file rather than the original file; and even if you were referring to the original file, that's still not quite right.</p>
<p>A <em>loose object</em>, in Git, is a plain file. The name of the file is constructed from the object's hash ID. The object's hash ID, in turn, is constructed by computing a hash of the object's contents <em>with a prefix header attached</em>.</p>
<p>The prefixed header depends on the object type. There are four types: <code>blob</code>, <code>commit</code>, <code>tag</code>, and <code>tree</code>. The header is a zero-terminated byte string composed of the type name as an ASCII (or equivalently, UTF-8) byte string, followed by a space, followed by a decimal representation of the size of the object in bytes, followed by an ASCII NUL (<code>b'\x00'</code> in Python, if you prefer modern Python notation, or <code>'\0'</code> if you prefer C).</p>
<p>After the header come the actual object contents. So, for a file containing the byte string <code>b'hello\n'</code>, the data to be hashed consist of <code>b'blob 6\0hello\n'</code>:</p>
<pre><code>$ echo 'hello' | git hash-object -t blob --stdin
ce013625030ba8dba906f756967f9e9ca394464a
$ python3
[...]
&gt;&gt;&gt; import hashlib
&gt;&gt;&gt; s = b'blob 6\0hello\n'
&gt;&gt;&gt; hashlib.sha1(s).hexdigest()
'ce013625030ba8dba906f756967f9e9ca394464a'
</code></pre>
<p>Hence, the file name that would be used to store this file is (derived from) <code>ce013625030ba8dba906f756967f9e9ca394464a</code>. As a loose object, it becomes <code>.git/objects/ce/013625030ba8dba906f756967f9e9ca394464a</code>.</p>
<p>The <em>contents</em> of that file, however, are the zlib-compressed form of <code>b'blob 6\0hello\n'</code> (apparently with <code>level=1</code>: the default is currently 6 and the result does not match at that level. It's not clear whether Git's zlib deflate exactly matches Python's, but using level 1 did work here):</p>
<pre><code>$ echo 'hello' | git hash-object -w -t blob --stdin
ce013625030ba8dba906f756967f9e9ca394464a
$ vis .git/objects/ce/013625030ba8dba906f756967f9e9ca394464a
x\^AK\M-J\M-IOR0c\M-HH\M-M\M-I\M-I\M-g\^B\000\^]\M-E\^D\^T$
</code></pre>
<p>(note that the final <code>$</code> is the shell prompt again; now back to Python3)</p>
<pre><code>&gt;&gt;&gt; import zlib
&gt;&gt;&gt; zlib.compress(s, 1)
b'x\x01K\xca\xc9OR0c\xc8H\xcd\xc9\xc9\xe7\x02\x00\x1d\xc5\x04\x14'
&gt;&gt;&gt; import vis
&gt;&gt;&gt; print(vis.vis(zlib.compress(s, 1)))
x\^AK\M-J\M-IOR0c\M-HH\M-M\M-I\M-I\M-g\^B\^@\^]\M-E\^D\^T
</code></pre>
<p>where <code>vis.py</code> is:</p>
<pre><code>def vischr(byte):
    "encode characters the way vis(1) does by default"
    if byte in b' \t\n':
        return chr(byte)
    # control chars: \^X; del: \^?
    if byte &lt; 32 or byte == 127:
        return r'\^' + chr(byte ^ 64)
    # printable characters, 32..126
    if byte &lt; 128:
        return chr(byte)
    # meta characters: prefix with \M^ or \M-
    byte -= 128
    if byte &lt; 32 or byte == 127:
        return r'\M^' + chr(byte ^ 64)
    return r'\M-' + chr(byte)

def vis(bytestr):
    "same as vis(1)"
    return ''.join(vischr(c) for c in bytestr)
</code></pre>
<p>(<code>vis</code> produces an invertible but printable encoding of binary files; it was my 1993-ish answer to problems with <code>cat -v</code>).</p>
<p>Note that the <em>names of files</em> stored in a Git repository (under a commit) appear only as <em>path name components</em> stored in individual <code>tree</code> objects. Computing the hash ID of a tree object is nontrivial; I have Python code that does this in my public "scripts" repository under githash.py.</p>
<br /><br /><br /><h3>Answer 2:</h3><br />
<p>Git Magic mentions:</p>
<blockquote> <p>By the way, the files within .git/objects are compressed with zlib so you should not stare at them directly. Filter them through <code>zpipe -d</code>, or type (using git cat-file):</p> </blockquote>
<pre><code>$ git cat-file -p 0c15af113a95643d7c244332b0e0b287184cd049
</code></pre>
<p>(<code>git cat-file</code> takes the object's hash ID, not the path of the loose object file.)</p>
<p>With <code>zpipe</code>:</p>
<pre><code>$ ./zpipe -d &lt; .git/objects/0c/15af113a95643d7c244332b0e0b287184cd049
</code></pre>
<p>Note: for zpipe, I had to compile zpipe.c first:</p>
<pre><code>sudo apt-get install zlib1g-dev
cd /usr/share/doc/zlib1g-dev/examples
sudo gunzip zpipe.c.gz
sudo gcc -o zpipe zpipe.c -lz
</code></pre>
<p>Then:</p>
<pre><code>$ /usr/share/doc/zlib1g-dev/examples/zpipe -d &lt; .git/objects/&lt;xx&gt;/&lt;rest-of-hash&gt;
</code></pre>
<p>You will get a result like:</p>
<pre><code>vonc@VONCAVN7:/mnt/d/git/seec$ /usr/share/doc/zlib1g-dev/examples/zpipe -d &lt; .git/objects/0d/b6225927ef60e21138a9762c41ea0db714ca0d
blob 2142 &lt;full content there...&gt;
</code></pre>
<p>You see a header composed of the type and content size, followed by the actual content.</p>
<p>See "Understanding Git Internals" from Jeff Kunkle, slide 8, for an illustration of a blob's actual content:</p>
<p></p>
<br /><br /><p>Source: <code>https://stackoverflow.com/questions/44475891/git-objects-sha-1-are-file-contents-or-file-names</code></p></div>
<div class="field field--name-field-tags field--type-entity-reference field--label-above"> <div class="field--label">Tags</div> <div class="field--items"> <div class="field--item"><a href="/tag/git" hreflang="zh-hans">git</a></div> <div class="field--item"><a href="/tag/git-hash" hreflang="zh-hans">git-hash</a></div> </div> </div>
Sat, 07 Dec 2019 02:01:32 +0000 生来就可爱ヽ(ⅴ<●) 1913153 at https://www.e-learn.cn
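<p>As an alternative to compiling <code>zpipe</code>, a loose object can also be inflated with Python's standard <code>zlib</code> module. The following is a minimal sketch, assuming the object is stored loose (not inside a pack file) and the repository uses SHA-1 IDs; <code>parse_loose_object</code> and <code>read_loose_object</code> are hypothetical helper names, not Git commands.</p>

```python
import zlib
from pathlib import Path


def parse_loose_object(compressed: bytes):
    """Inflate a loose object and split it into (type, size, content)."""
    raw = zlib.decompress(compressed)
    # The header "<type> <size>" ends at the first NUL byte.
    header, _, content = raw.partition(b"\0")
    obj_type, _, size = header.partition(b" ")
    return obj_type.decode("ascii"), int(size), content


def read_loose_object(git_dir: str, object_id: str):
    """Read <git_dir>/objects/<first 2 hex digits>/<remaining digits>."""
    path = Path(git_dir) / "objects" / object_id[:2] / object_id[2:]
    return parse_loose_object(path.read_bytes())


if __name__ == "__main__":
    # Round trip on the b'blob 6\0hello\n' example from the answer above.
    sample = zlib.compress(b"blob 6\0hello\n", 1)
    print(parse_loose_object(sample))  # ('blob', 6, b'hello\n')
```

<p>Splitting at the first NUL byte mirrors the header layout described in the first answer, so the same function works for <code>commit</code>, <code>tree</code> and <code>tag</code> objects, although their content then needs its own parsing.</p>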
https://www.e-learn.cn/topic/1728342 <span>Git objects SHA-1 are file contents or file names?</span> <span><span lang="" about="/user/204" typeof="schema:Person" property="schema:name" datatype="">徘徊边缘</span></span> <span>2019-12-05 17:02:14</span> <div class="field field--name-body field--type-text-with-summary field--label-hidden field--item"><div class="alert alert-danger" role="alert"> <p>I am confused with how a file's actual contents are stored in .git.</p> <p>For e.g. <code>Version 1</code> is the actual text content in <code>test.txt</code>. When I commit (first commit) it to the repo, git returns a SHA-1 for that file which is located in <code>.git\objects\0c\15af113a95643d7c244332b0e0b287184cd049</code>. </p> <p>When I open the file <code>15af113a95643d7c244332b0e0b287184cd049</code> in a text editor, it's all garbage, something like this </p> <p><code>x+)JMU074f040031QÐKÏ,ÉLÏË/Je¨}ºõw[Éœ„ÇR­ ñ·Î}úyGª*±8#³¨,1%&gt;9?¯$5¯D¯¤¢„áôÏ3%³þú&gt;š~}Ž÷*ë²-¶ç¡êÊòR“KâKòãs+‹sô</code> </p> <p>But I'm not sure whether this garbage represents the encrypted form of the text <code>Version 1</code> or it's represented by the SHA-1 <code>15af113a95643d7c244332b0e0b287184cd049</code>.</p> </div><div class="panel panel-info"><div class="panel-heading"></div><div class="panel-body"> <p>The correct answer to the question in the subject line:</p> <blockquote> <p>Git objects SHA-1 are file contents or file names?</p> </blockquote> <p>is probably "neither", since you were referring to the contents of the loose object file, rather than the original file—and even if you were referring to the original file, that's still not quite right.</p> <p>A <em>loose object</em>, in Git, is a plain file. The name of the file is constructed from the object's hash ID. The object's hash ID, in turn, is constructed by computing a hash of the object's contents <em>with a prefix header attached</em>.</p> <p>The prefixed header depends on the object type. 
There are four types: <code>blob</code>, <code>commit</code>, <code>tag</code>, and <code>tree</code>. The header consists of the a zero-terminated byte string composed of the type name as an ASCII (or equivalently, UTF-8) byte string, followed by a space, followed by a decimalized representation of the size of the object in bytes, followed by an ASCII NUL (<code>b'\x00'</code> in Python, if you prefer modern Python notation, or <code>'\0'</code> if you prefer C).</p> <p>After the header come the actual object contents. So, for a file containing the byte string <code>b'hello\n'</code>, the data to be hashed consist of <code>b'blob 6\0hello\n</code>:</p> <pre><code>$ echo 'hello' | git hash-object -t blob --stdin ce013625030ba8dba906f756967f9e9ca394464a $ python3 [...] &gt;&gt;&gt; import hashlib &gt;&gt;&gt; s = b'blob 6\0hello\n' &gt;&gt;&gt; hashlib.sha1(s).hexdigest() 'ce013625030ba8dba906f756967f9e9ca394464a' </code></pre> <p>Hence, the file name that would be used to store this file is (derived from) <code>ce013625030ba8dba906f756967f9e9ca394464a</code>. 
As a loose object, it becomes <code>.git/objects/ce/013625030ba8dba906f756967f9e9ca394464a</code>.</p> <p>The <em>contents</em> of that file, however, are the zlib-compressed form of <code>b'blob 6\0hello\n'</code> (apparently with <code>level=1</code>: the default is currently 6 and the result does not match at that level; it's not clear whether Git's zlib deflate exactly matches Python's, but using level 1 did work here):</p> <pre><code>$ echo 'hello' | git hash-object -w -t blob --stdin
ce013625030ba8dba906f756967f9e9ca394464a
$ vis .git/objects/ce/013625030ba8dba906f756967f9e9ca394464a
x\^AK\M-J\M-IOR0c\M-HH\M-M\M-I\M-I\M-g\^B\000\^]\M-E\^D\^T$ </code></pre> <p>(note that the final <code>$</code> is the shell prompt again; now back to Python 3)</p> <pre><code>&gt;&gt;&gt; import zlib
&gt;&gt;&gt; zlib.compress(s, 1)
b'x\x01K\xca\xc9OR0c\xc8H\xcd\xc9\xc9\xe7\x02\x00\x1d\xc5\x04\x14'
&gt;&gt;&gt; import vis
&gt;&gt;&gt; print(vis.vis(zlib.compress(s, 1)))
x\^AK\M-J\M-IOR0c\M-HH\M-M\M-I\M-I\M-g\^B\^@\^]\M-E\^D\^T
</code></pre> <p>where <code>vis.py</code> is:</p> <pre><code>def vischr(byte):
    "encode characters the way vis(1) does by default"
    if byte in b' \t\n':
        return chr(byte)
    # control chars: \^X; del: \^?
    if byte &lt; 32 or byte == 127:
        return r'\^' + chr(byte ^ 64)
    # printable characters, 32..126
    if byte &lt; 128:
        return chr(byte)
    # meta characters: prefix with \M^ or \M-
    byte -= 128
    if byte &lt; 32 or byte == 127:
        return r'\M^' + chr(byte ^ 64)
    return r'\M-' + chr(byte)

def vis(bytestr):
    "same as vis(1)"
    return ''.join(vischr(c) for c in bytestr)
</code></pre> <p>(<code>vis</code> produces an invertible but printable encoding of binary files; it was my 1993-ish answer to problems with <code>cat -v</code>).</p> <p>Note that the <em>names of files</em> stored in a Git repository (under a commit) appear only as <em>path name components</em> stored in individual <code>tree</code> objects. 
Computing the hash ID of a tree object is nontrivial; I have Python code that does this in my public "scripts" repository under <a href="https://github.com/chris3torek/scripts/blob/master/githash.py" rel="nofollow">githash.py</a>.</p> </div></div><div class="panel panel-info"><div class="panel-heading"></div><div class="panel-body"> <p><a href="http://www-cs-students.stanford.edu/~blynn/gitmagic/ch08.html" rel="nofollow">Git Magic</a> mentions:</p> <blockquote> <p>By the way, the files within .git/objects are compressed with zlib so you should not stare at them directly. Filter them through <code>zpipe -d</code>, or type (using <a href="https://git-scm.com/docs/git-cat-file" rel="nofollow"><code>git cat-file</code></a>):</p> </blockquote> <p>Note that <code>git cat-file</code> takes the object's hash ID (directory name plus file name), not the path of the loose object file:</p> <pre><code>$ git cat-file -p 0c15af113a95643d7c244332b0e0b287184cd049
</code></pre> <p>With <code>zpipe</code>:</p> <pre><code>$ ./zpipe -d &lt; .git/objects/0c/15af113a95643d7c244332b0e0b287184cd049
</code></pre> <p>Note: for zpipe, I had to compile <a href="https://github.com/madler/zlib/blob/master/examples/zpipe.c" rel="nofollow"><code>zpipe.c</code></a> first:</p> <pre><code>sudo apt-get install zlib1g-dev
cd /usr/share/doc/zlib1g-dev/examples
sudo gunzip zpipe.c.gz
sudo gcc -o zpipe zpipe.c -lz
</code></pre> <p>Then:</p> <pre><code>$ /usr/share/doc/zlib1g-dev/examples/zpipe -d &lt;
</code></pre> <p>You will get a result like:</p> <pre><code>vonc@VONCAVN7:/mnt/d/git/seec$ /usr/share/doc/zlib1g-dev/examples/zpipe -d &lt; .git/objects/0d/b6225927ef60e21138a9762c41ea0db714ca0d
blob 2142 &lt;full content there...&gt;
</code></pre> <p>You see a header composed of the type and content size, followed by the actual content.</p> <p>See "<a href="https://www.slideshare.net/JeffKunkle/understanding-git" rel="nofollow">Understanding Git Internals</a>" from Jeff Kunkle, slide 8, for an illustration of a blob's actual content:</p> <p><a href="https://i.stack.imgur.com/IJUMg.png" 
rel="nofollow"><p></p><p></p><img class="b-lazy" data-src="https://www.eimg.top/images/2020/03/17/39eda053220630dca42ebffdd838345f.png" data-original="https://www.eimg.top/images/2020/03/17/39eda053220630dca42ebffdd838345f.png" src="" /><p></p><p></p></a></p> </div></div><div class="alert alert-warning" role="alert"><p>Source: <code>https://stackoverflow.com/questions/44475891/git-objects-sha-1-are-file-contents-or-file-names</code></p></div></div> <div class="field field--name-field-tags field--type-entity-reference field--label-above"> <div class="field--label">Tags</div> <div class="field--items"> <div class="field--item"><a href="/tag/git" hreflang="zh-hans">git</a></div> <div class="field--item"><a href="/tag/git-hash" hreflang="zh-hans">git-hash</a></div> </div> </div> Thu, 05 Dec 2019 09:02:14 +0000 徘徊边缘 1728342 at https://www.e-learn.cn Telling if a Git commit is a Merge/Revert commit https://www.e-learn.cn/topic/580763 <span>Telling if a Git commit is a Merge/Revert commit</span> <span><span lang="" about="/user/182" typeof="schema:Person" property="schema:name" datatype="">南笙酒味</span></span> <span>2019-11-29 02:50:45</span> <div class="field field--name-body field--type-text-with-summary field--label-hidden field--item"><div class="alert alert-danger" role="alert"> <p>I am writing a script that needs to check whether a particular commit is a merge or revert commit, and I am wondering if there is a git trick for that.</p> <p>What I have come up with so far (and I definitely don't want to depend on the commit message here) is to check <code>HASH^2</code> and see whether I get an error. Is there a better way?</p> </div><div class="panel panel-info"><div class="panel-heading"></div><div class="panel-body"> <p>Figuring out if something is a merge is easy. A merge is any commit with more than one parent. 
To check for that, you can do, for example</p> <pre><code>$ git cat-file -p $commit_id
</code></pre> <p>If there's more than one <code>parent</code> line in the output, you found a merge.</p> <p>For reverts it's not as easy. Generally, reverts are just normal commits that happen to apply the diff of a previous commit in reverse, effectively removing the changes that commit introduced. They're not special otherwise.</p> <p>If a revert was created with <code>git revert $commit</code>, then git usually generates a commit message indicating the revert and which commit it reverted. However, it's quite possible to do reverts in other ways, or to just change the commit message of a commit generated by <code>git revert</code>.</p> <p>Looking for those generated revert commit messages might already be a good enough heuristic for what you're trying to achieve. If not, you'd have to actually look through other commits, comparing their diffs against each other, looking to see whether one is the exact reverse operation of another. But even that isn't a good solution. Often enough, reverts are slightly different from just the reverse of the commit they're reverting, for example to accommodate code changes that happened between the commit and the revert.</p> </div></div><div class="panel panel-info"><div class="panel-heading">Dave</div><div class="panel-body"> <p>The following command will dump out <strong>only</strong> the parent hashes. Less filtering needed...</p> <p><code>git show --no-patch --format="%P" &lt;commit hash&gt;</code> </p> </div></div><div class="panel panel-info"><div class="panel-heading"></div><div class="panel-body"> <p>The answer using <code>git cat-file</code> is using a git <strong>"plumbing"</strong> command, which is generally better for building scripts as the output format is not likely to change. 
The ones using <code>git show</code> and <code>git rev-parse</code> may need to change over time as they are using <a href="https://stackoverflow.com/questions/6976473/what-does-the-term-porcelain-mean-in-git" rel="nofollow"><strong>porcelain</strong></a> commands.</p> <p>The bash function I've been using for a long time uses <code>git rev-list</code>:</p> <pre><code>gitismerge () {
    local sha="$1"
    msha=$(git rev-list -1 --merges "${sha}~1..${sha}")
    [ -z "$msha" ] &amp;&amp; return 1
    return 0
}
</code></pre> <p>The list of porcelain/plumbing commands can be found in the docs for the top level <a href="https://git-scm.com/docs/git" rel="nofollow"> git</a> command.</p> <p>This code uses <a href="https://git-scm.com/docs/git-rev-list" rel="nofollow">git-rev-list</a> with a specific <a href="https://git-scm.com/docs/gitrevisions" rel="nofollow">gitrevisions</a> query <code>${sha}~1..${sha}</code> in a way that prints the SHA itself if it is a merge commit (that is, if it has a second parent), or nothing if it is not.</p> <p>Specifically, <code>SHA~1..SHA</code> means <em>include commits that are reachable from SHA but exclude those that are reachable from SHA~1, which is the first parent of SHA</em>.</p> <p>The results are stored in <code>$msha</code> and tested for emptiness using bash <code>[ -z "$msha" ]</code>, failing (returning 1) if empty, or passing (returning 0) if non-empty.</p> </div></div><div class="panel panel-info"><div class="panel-heading"></div><div class="panel-body"> <p>One way to test for a merge commit:</p> <pre><code>$ test -z $(git rev-parse --verify $commit^2 2&gt; /dev/null) || echo MERGE COMMIT
</code></pre> <p>As for git revert commits, I agree with <a href="/users/54157/rafl" rel="nofollow">@rafl</a> that the most realistic approach is to look for the revert message boilerplate in the commit message; if someone changed it, detecting the revert would be very involved.</p> </div></div><div class="panel panel-info"><div class="panel-heading"></div><div 
class="panel-body"> <p>Easy way to test for a merge commit:</p> <pre class="lang-sh prettyprint-override"><code>git show --summary HEAD | grep -q ^Merge:
</code></pre> <p>This will return 0 for merge commits, 1 for non-merge commits. Replace HEAD with the commit you want to test.</p> <p>Example usage:</p> <pre><code>if git show --summary some-branch | grep -q ^Merge: ; then
    echo "some-branch is a merge"
fi
</code></pre> </div></div><div class="panel panel-info"><div class="panel-heading"></div><div class="panel-body"> <p>Yet another way to find a commit's parents:</p> <pre><code>git show -s --pretty=%p &lt;commit&gt;
</code></pre> <p>Use <code>%P</code> for the full hash. This prints how many parents <code>HEAD</code> has:</p> <pre><code>git show -s --pretty=%p HEAD | wc -w
</code></pre> </div></div><div class="alert alert-warning" role="alert"><p>Source: <code>https://stackoverflow.com/questions/3824050/telling-if-a-git-commit-is-a-merge-revert-commit</code></p></div></div> <div class="field field--name-field-tags field--type-entity-reference field--label-above"> <div class="field--label">Tags</div> <div class="field--items"> <div class="field--item"><a href="/tag/git" hreflang="zh-hans">git</a></div> <div class="field--item"><a href="/tag/git-merge" hreflang="zh-hans">git-merge</a></div> <div class="field--item"><a href="/tag/git-commit" hreflang="zh-hans">git-commit</a></div> <div class="field--item"><a href="/tag/git-hash" hreflang="zh-hans">git-hash</a></div> </div> </div> Thu, 28 Nov 2019 18:50:45 +0000 南笙酒味 580763 at https://www.e-learn.cn How does Git create unique commit hashes, mainly the first few characters? 
https://www.e-learn.cn/topic/580482 <span>How does Git create unique commit hashes, mainly the first few characters?</span> <span><span lang="" about="/user/164" typeof="schema:Person" property="schema:name" datatype="">烈酒焚心</span></span> <span>2019-11-29 02:48:50</span> <div class="field field--name-body field--type-text-with-summary field--label-hidden field--item"><h3>Question</h3><br /><p>I find it hard to wrap my head around how Git creates fully unique hashes that aren't allowed to be the same even in the first 4 characters. I'm able to refer to commits in Git Bash using only the first four characters. Is it specifically decided in the algorithm that the first characters are "ultra"-unique and will never conflict with other similar hashes, or does the algorithm generate every part of the hash in the same way?</p> <br /><h3>Answer 1:</h3><br /><p>Git uses the following information to generate the SHA-1:</p> <ul><li>The source tree of the commit (which unravels to all the subtrees and blobs) </li> <li>The parent commit SHA-1 </li> <li>The author info (with timestamp)</li> <li>The committer info (yes, those are different; also with timestamp)</li> <li>The commit message</li> </ul><p>(For the complete explanation, look here.)</p> <p>Git <strong>does NOT</strong> guarantee that the first 4 characters will be unique. In chapter 7 of the Pro Git Book it is written:</p> <blockquote> <p>Git can figure out a short, unique abbreviation for your SHA-1 values. If you pass --abbrev-commit to the git log command, the output will use shorter values but keep them unique; it defaults to using seven characters but makes them longer if necessary to keep the SHA-1 unambiguous:</p> </blockquote> <p>So Git just makes the abbreviation <strong>as long as necessary</strong> to remain unique. 
They even note that:</p> <blockquote> <p>Generally, eight to ten characters are more than enough to be unique within a project.</p> <p>As an example, the Linux kernel, which is a pretty large project with over 450k commits and 3.6 million objects, has no two objects whose SHA-1s overlap more than the first 11 characters.</p> </blockquote> <p>So in fact they just depend on the great <strong>improbability</strong> of two objects having the exact same (first X characters of a) SHA.</p> <br /><br /><br /><h3>Answer 2:</h3><br /><p>Apr. 2017: Beware that after the whole shattered.io episode (where a SHA1 collision was achieved by Google), the 20-byte format won't be there forever.</p> <p>A first step for that is to replace <code>unsigned char sha1[20]</code>, which is hard-coded all over the Git codebase, with a generic object whose definition might change in the future (SHA2?, Blake2, ...)</p> <p>See commit e86ab2c (21 Feb 2017) by brian m. carlson (bk2204). </p> <blockquote> <p>Convert the remaining uses of <code>unsigned char [20]</code> to <code>struct object_id</code>.</p> </blockquote> <p>That is an example of an ongoing effort started with commit 5f7817c (13 Mar 2015) by brian m. carlson (bk2204), for v2.5.0-rc0, in cache.h:</p> <pre><code>/* The length in bytes and in hex digits of an object name (SHA-1 value). */
#define GIT_SHA1_RAWSZ 20
#define GIT_SHA1_HEXSZ (2 * GIT_SHA1_RAWSZ)

struct object_id {
    unsigned char hash[GIT_SHA1_RAWSZ];
};
</code></pre> <p>And don't forget that, even with SHA1, the first 4 characters are no longer enough to guarantee uniqueness, as I explain in "How much of a git sha is generally considered necessary to uniquely identify a change in a given codebase?".</p> <hr /><p><strong>Update Dec. 
2017</strong> with Git 2.16 (Q1 2018): this effort to support an alternative SHA is underway: see "Why doesn't Git use more modern SHA?".</p> <p>You will be able to use another hash: SHA1 is no longer the only one for Git.</p> <p><strong>Update 2018-2019</strong>: the choice has been made in Git 2.19+: <strong>SHA-256</strong>.<br /> See "hash-function-transition".</p> <p>This is not yet active (meaning git 2.21 is still using SHA1), but work is under way to support SHA-256 in the future.</p> <br /><br /><p>Source: <code>https://stackoverflow.com/questions/34764195/how-does-git-create-unique-commit-hashes-mainly-the-first-few-characters</code></p></div> <div class="field field--name-field-tags field--type-entity-reference field--label-above"> <div class="field--label">Tags</div> <div class="field--items"> <div class="field--item"><a href="/tag/git" hreflang="zh-hans">git</a></div> <div class="field--item"><a href="/tag/algorithm" hreflang="zh-hans">algorithm</a></div> <div class="field--item"><a href="/tag/hash" hreflang="zh-hans">hash</a></div> <div class="field--item"><a href="/tag/git-hash" hreflang="zh-hans">git-hash</a></div> </div> </div> Thu, 28 Nov 2019 18:48:50 +0000 烈酒焚心 580482 at https://www.e-learn.cn Get the current git hash in a Python script https://www.e-learn.cn/topic/515457 <span>Get the current git hash in a Python script</span> <span><span lang="" about="/user/183" typeof="schema:Person" property="schema:name" datatype="">北城余情</span></span> <span>2019-11-28 16:01:53</span> <div class="field field--name-body field--type-text-with-summary field--label-hidden field--item"><div class="alert alert-danger" role="alert"> <p>I would like to include the current git hash in the output of a Python script (as the <em>version number</em> of the code that generated that output).</p> <p>How can I access the current git hash in my Python script?</p> </div><div class="panel panel-info"><div class="panel-heading">Greg Hewgill</div><div class="panel-body"> <p>The 
<a href="https://www.kernel.org/pub/software/scm/git/docs/git-describe.html" rel="nofollow"><code>git describe</code></a> command is a good way of creating a human-presentable "version number" of the code. From the examples in the documentation:</p> <blockquote> <p>With something like git.git current tree, I get:</p> <pre class="lang-none prettyprint-override"><code>[torvalds@g5 git]$ git describe parent
v1.0.4-14-g2414721
</code></pre> <p>i.e. the current head of my "parent" branch is based on v1.0.4, but since it has a few commits on top of that, describe has added the number of additional commits ("14") and an abbreviated object name for the commit itself ("2414721") at the end.</p> </blockquote> <p>From within Python, you can do something like the following:</p> <pre><code>import subprocess

# check_output returns bytes on Python 3; decode to get a str
label = subprocess.check_output(["git", "describe"]).strip().decode()
</code></pre> </div></div><div class="panel panel-info"><div class="panel-heading">guaka</div><div class="panel-body"> <p>No need to hack around getting data from the <code>git</code> command yourself. <a href="http://gitpython.readthedocs.io/en/stable/" rel="nofollow">GitPython</a> is a very nice way to do this and a lot of other <code>git</code> stuff. 
It even has "best effort" support for Windows.</p> <p>After <code>pip install gitpython</code> you can do</p> <pre><code>import git

repo = git.Repo(search_parent_directories=True)
sha = repo.head.object.hexsha
</code></pre> </div></div><div class="panel panel-info"><div class="panel-heading">Yuji 'Tomita' Tomita</div><div class="panel-body"> <p><a href="https://stackoverflow.com/questions/949314/how-to-retrieve-the-hash-for-the-current-commit-in-git" rel="nofollow">This post</a> contains the command, <a href="https://stackoverflow.com/a/14989911/3357935" rel="nofollow">Greg's answer</a> contains the subprocess command.</p> <pre><code>import subprocess

def get_git_revision_hash():
    # decode the bytes output and drop the trailing newline
    return subprocess.check_output(['git', 'rev-parse', 'HEAD']).decode('ascii').strip()

def get_git_revision_short_hash():
    return subprocess.check_output(['git', 'rev-parse', '--short', 'HEAD']).decode('ascii').strip()
</code></pre> </div></div><div class="panel panel-info"><div class="panel-heading"></div><div class="panel-body"> <p><code>numpy</code> has a nice looking <a href="https://github.com/numpy/numpy/blob/master/setup.py#L70-L92" rel="nofollow">multi-platform routine</a> in its <code>setup.py</code>:</p> <pre><code>import os
import subprocess

# Return the git revision as a string
def git_version():
    def _minimal_ext_cmd(cmd):
        # construct minimal environment
        env = {}
        for k in ['SYSTEMROOT', 'PATH']:
            v = os.environ.get(k)
            if v is not None:
                env[k] = v
        # LANGUAGE is used on win32
        env['LANGUAGE'] = 'C'
        env['LANG'] = 'C'
        env['LC_ALL'] = 'C'
        out = subprocess.Popen(cmd, stdout = subprocess.PIPE, env=env).communicate()[0]
        return out

    try:
        out = _minimal_ext_cmd(['git', 'rev-parse', 'HEAD'])
        GIT_REVISION = out.strip().decode('ascii')
    except OSError:
        GIT_REVISION = "Unknown"

    return GIT_REVISION
</code></pre> </div></div><div class="panel panel-info"><div class="panel-heading"></div><div class="panel-body"> <p>If subprocess isn't portable and you don't want to install a package to do something this simple, you can also do this.</p> <pre 
class="lang-py prettyprint-override"><code>import pathlib

def get_git_revision(base_path):
    git_dir = pathlib.Path(base_path) / '.git'
    with (git_dir / 'HEAD').open('r') as head:
        head_line = head.readline().strip()

    # Detached HEAD: the file holds the commit hash itself, not a ref
    if not head_line.startswith('ref:'):
        return head_line

    # Otherwise HEAD looks like "ref: refs/heads/&lt;branch&gt;"; read that ref file
    # (this does not handle refs that exist only in .git/packed-refs)
    ref = head_line.split(' ')[-1]
    with (git_dir / ref).open('r') as git_hash:
        return git_hash.readline().strip()
</code></pre> <p>I've only tested this on my repos but it seems to work pretty consistently.</p> </div></div><div class="alert alert-warning" role="alert"><p>Source: <code>https://stackoverflow.com/questions/14989858/get-the-current-git-hash-in-a-python-script</code></p></div></div> <div class="field field--name-field-tags field--type-entity-reference field--label-above"> <div class="field--label">Tags</div> <div class="field--items"> <div class="field--item"><a href="/tag/python" hreflang="zh-hans">python</a></div> <div class="field--item"><a href="/tag/git" hreflang="zh-hans">git</a></div> <div class="field--item"><a href="/tag/git-hash" hreflang="zh-hans">git-hash</a></div> </div> </div> Thu, 28 Nov 2019 08:01:53 +0000 北城余情 515457 at https://www.e-learn.cn