How to force consistent line endings in Git commits with cross-platform compatibility

Submitted by 时光怂恿深爱的人放手 on 2020-01-21 19:39:50

Question


I am having issues with merge conflicts caused by line endings while working with someone who uses a different OS. I work on Windows and my colleague is on Mac. When he pushes his changes, files he hasn't worked on sometimes show up in the diff as changed, because every line in them now ends with ^M. This has led to merge conflicts. I read the following in the Git docs:

Git can handle this by auto-converting CRLF line endings into LF when you add a file to the index, and vice versa when it checks out code onto your filesystem. You can turn on this functionality with the core.autocrlf setting. If you’re on a Windows machine, set it to true — this converts LF endings into CRLF when you check out code:

$ git config --global core.autocrlf true

If you’re on a Linux or macOS system that uses LF line endings, then you don’t want Git to automatically convert them when you check out files; however, if a file with CRLF endings accidentally gets introduced, then you may want Git to fix it. You can tell Git to convert CRLF to LF on commit but not the other way around by setting core.autocrlf to input:

$ git config --global core.autocrlf input

This setup should leave you with CRLF endings in Windows checkouts, but LF endings on macOS and Linux systems and in the repository.

This makes sense, but I am still unclear on how the files are actually committed in the repo. For example, if he creates a file on his system, it will have all LF line endings, correct? So when he commits, I presume those line endings are retained as-is. When I pull, my autocrlf being true will check them out with CRLF line endings, as far as I understand. (I get the warnings warning: LF will be replaced by CRLF in <file x>; The file will have its original line endings in your working directory)

A couple questions about this: when the warning says "working directory", what is that referring to? Also, when I then make changes, or create other files, all of which have the CRLF line endings and commit+push, are they stored in the repo as CRLF or LF?

I imagine the ideal is to have the repo strip anything but LF every time a commit is made; is this what happens? What's going on under the hood, and how can we force this to behave consistently?


Answer 1:


autocrlf is widely considered to be broken. The modern way to handle line endings is with a .gitattributes file. GitHub has a great tutorial about how to use it.
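As a rough illustration, a starter .gitattributes could look something like the sketch below. The file patterns (*.sh, *.bat, *.png) are assumptions chosen for the example, not something taken from the question, so adjust them to whatever your repository actually contains.

```shell
# Write a starter .gitattributes at the repository root.
# The file patterns below are illustrative assumptions.
cat > .gitattributes <<'EOF'
# Let Git detect text files and store them with LF in the repository.
* text=auto

# Files that must keep a specific ending in the working tree.
*.sh  text eol=lf
*.bat text eol=crlf

# Never apply line-ending conversion to binary files.
*.png binary
EOF
```

Because the file is committed along with the code, both the Windows and the Mac checkout follow the same rules regardless of each person's local config. If the repository already contains files with mixed endings, running `git add --renormalize .` (Git 2.16 or later) and committing the result should rewrite them according to these rules.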




Answer 2:


Q1 Enforcing consistent line endings

Q2 Enforcing at commit as well as checkout (comment)

I'll divide this into 2 parts: Practice and Principle

Practice

Expansion of @code-apprentice's suggestion

  1. Strictly avoid autocrlf: see why autocrlf is always wrong, and the core Git developers arguing about how ill-thought-out autocrlf is. Note particularly that the implementor is annoyed at the critic but doesn't deny the criticism.
  2. Religiously use .gitattributes instead.
  3. Use safecrlf=true (the core.safecrlf setting) to enforce commit cleanliness. safecrlf is the answer to your Q2: a file that would change on a check-in/check-out round trip errors out at the check-in stage itself.
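Point 3 could be tried out roughly as follows. This is only a sketch in a throwaway repository: the temporary directory, the file name crlf.txt and the `* text eol=lf` rule are all assumptions made up for the demonstration.

```shell
# Sketch: enable the round-trip check in a throwaway repository.
# The temporary directory and the file name are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Per-repository setting; add --global to apply it to every repository.
git config core.safecrlf true

# Declare every file as text that is stored and checked out with LF.
printf '* text eol=lf\n' > .gitattributes

# A file with CRLF endings would come back from a check-in/check-out
# round trip with LF endings, so Git is expected to refuse it here.
printf 'one\r\ntwo\r\n' > crlf.txt
if git add crlf.txt 2>/dev/null; then
  echo "added"
else
  echo "rejected"
fi
```

The point of failing at `git add` is that the problem surfaces on the machine that introduced the bad endings, instead of later as a noisy diff on someone else's checkout.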

When a new repo is init-ed:
Go through the output of ls -lR and, for each file, choose its type: text, binary, or ignore (i.e. put it in .gitignore)
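For a larger tree, reading ls -lR by eye gets tedious, so one possible shortcut is to count files by extension first and then classify each extension. The helper name list_extensions is made up for this sketch.

```shell
# list_extensions DIR: count the files under DIR by extension,
# skipping anything inside a .git directory.
# (The function name is hypothetical, chosen for this sketch.)
list_extensions() {
  find "$1" -type f ! -path '*/.git/*' \
    | sed -n 's/.*\.\([^./]*\)$/\1/p' \
    | sort | uniq -c | sort -rn
}

# Inventory of the current directory, most common extension first.
list_extensions .
```

Files without any dot in their name (Makefile, LICENSE and the like) are not listed, and dotfiles such as .gitignore show up under their own name, so those still need a manual look.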

Debugging:
Use git-check-attr to check that attribute matching and computation are as desired
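A small sketch of that debugging step, again in a throwaway repository. The patterns and the paths build.sh, logo.png and notes.txt are hypothetical.

```shell
# Sketch in a throwaway repository; patterns and paths are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
printf '* text=auto\n*.sh text eol=lf\n*.png binary\n' > .gitattributes

# Ask Git which values of "text" and "eol" apply to each path.
# The paths need not exist; only the patterns are evaluated.
git check-attr text eol -- build.sh logo.png notes.txt

# List every attribute that is set for a single path.
git check-attr -a -- logo.png
```

Each output line has the form `<path>: <attribute>: <value>`, which makes it easy to spot a pattern that matched more (or less) than intended. For files that are already tracked, `git ls-files --eol` additionally reports which line endings are in the index and in the working tree.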

Principle

Data Store

We may treat git as a data-store loosely analogous to how a USB drive is one.

We say the drive is working if the stuff we put in comes out the same; otherwise it's corrupted. Likewise, if the file we commit comes out the same on checkout, the repo is fine; otherwise something is borked. The key question is:

What does "same" mean?

It's non-trivial because we implicitly apply different standards of "sameness" in different contexts!

Binary Files

  • A binary file is a sequence of bytes
  • Preserving that sequence faithfully amounts to reproducing the file

Text Files

...are different

  • A text file consists of a sequence of lines, each a sequence of «printable characters». Let's leave the notion of a printable char unspecified, other than to say: no CR, no LF!
  • How these lines are separated (or terminated) is again unspecified
  • Symbolically:
    type Line = [Char]
    type File = [Line]
  • Expanding on the 1st unspecified part gives us ASCII, the Latins, Unicode, etc. Not relevant to this question
  • Expanding on the 2nd is what distinguishes Windows, *nix, etc. JFTR, this kind of file may be little known to the younger generation, but it also exists. It is particularly useful to remember that the notion "sequence of lines" can be imposed at many different levels.

    When judging sameness, we don't care how the unspecified parts are represented
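To make the two standards of sameness concrete, here is a small sketch with two files that hold the same lines but use different terminators. The file names are invented for the example.

```shell
# Two files holding the same two lines, terminated differently.
dir=$(mktemp -d)
printf 'alpha\nbeta\n'     > "$dir/unix.txt"
printf 'alpha\r\nbeta\r\n' > "$dir/windows.txt"

# Binary sameness: compared byte for byte, the files differ.
if cmp -s "$dir/unix.txt" "$dir/windows.txt"; then
  echo "same bytes"
else
  echo "different bytes"
fi

# Text sameness: with every CR stripped, the sequences of lines match.
if tr -d '\r' < "$dir/windows.txt" | cmp -s - "$dir/unix.txt"; then
  echo "same lines"
else
  echo "different lines"
fi
```

This prints "different bytes" followed by "same lines": as a binary file the two are distinct, while as a [Line] they are the same file.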

To return to our

USB drive analogy

When I copy foo.txt from Windows to Linux I expect the contents to be invariant. However, I'm quite satisfied if H:foo.txt changes to /media/name/Transcend/foo.txt. In fact, it would be more than a bit annoying if the Windows-isms came through untranslated, or vice versa.

Far-fetched?? ¡¡Think again!!

IOW, thanks to splendid folks like Theodore Ts'o we take it for granted that Linux can read a Windows file (system). This happens because a non-trivial amount of

  • abstraction matching
  • abstraction hiding

happens under the hood.

Back to Git

We therefore expect that a file checked in to Git is the same as the one that's checked out... at a different time... and on a different OS!

The catch is that the notion of same is sufficiently non-trivial that git needs some help from us in achieving that "sameness" to our satisfaction... That help is called .gitattributes!



Source: https://stackoverflow.com/questions/57960566/how-to-force-consistent-line-endings-in-git-commits-with-cross-platform-compatib
