Mercurial sparse checkout

一世执手 提交于 2019-12-11 02:18:34

问题


In this F8 conference video(starting 8:40) from 2015 they speak about the advantages of using Mercurial and a single repository across facebook.

How does this work in practice? Using Mercurial, can i checkout a subdirectory (live in SVN)? If so, how? Do i need a facebook-mercurial-extension for this

P.S.: I only found answers like this or this from 2010 on SO where i am not sure if the answers still apply with all the efforts FB put into it.


回答1:


From your question it is not clear if you are looking for a workflow (the monorepo vs multiple repos debate) or for performance and scaling for a huge code base.

For the workflow, I suggest googling for monorepo. It has its pros and cons, you need to understand your situation and current workflow to decide. For the performance and scaling, keep reading.

The idea of remotefilelog is not to checkout a subdirectory (as you mention), the idea is to checkout everything. In order to do that in an efficient way, you need two extensions actively developed by Facebook:

  • remotefilelog. This gives you something conceptually similar to a shallow clone. This reduces hg clone and hg pull time.
  • fsmonitor (previously called hgwatchman, it is now part of mercurial core). This dramatically reduces time of local operations such as hg status. Note that fsmonitor is independent from remotefilelog. You can start experimenting with this, since it doesn't require any setup on the server side.

With a recent mercurial (which I strongly suggest) you can shave off the additional startup time of the Python interpreter using CommandServer + CHg.

Some additional notes:

  • I tested extensively fsmonitor. It works very well, on huge repos the time of hg status is reduced from 10 secs to less than 1 sec (and the majority of this 1 sec is Python startup time, see above for CHg). If your repository is really huge, you might need to fine tune some inotify kernel parameters (or the equivalent on MacOSX). The fsmonitor documentation has all the information you need.
  • I didn't test remotefilelog, although I read everything I found about it and I am sure it works. Depending on how development is done (everybody has always Internet connectivity or not, the organization has its own master repo or not) there can be a caveat: it partially transforms the decentralized hg into a centralized VCS like svn: some operations that normally can be done offline (for example: hg log and the first hg update to a changeset in the past) will now require connectivity to the master repository.
  • Before considering remotefilelog, I used extensively the largefiles extension on a huge repo. It has the same drawbacks than remotefilelog and some confusing corner cases for users that want to use hg just to get things done without taking the time to understand how it works. If I were to manage another huge repo, I would use remotefilelog instead than largefiles, although their use case is not really the same.
  • Mercurial has also support for subrepositories (doc1, doc2). The problem is that it changes the behavior of hg depending on where you are in the source tree. Again, if the developers don't care about really understanding how hg works, it will be just too confusing.

Additional information:

  • Facebook Engineering blog post
  • scaling mercurial wiki, although not completely up to date
  • just by googling mercurial facebook.



回答2:


i am not sure if the answers still apply with all the efforts FB put into it

(Early 2017) The answers in the questions linked still apply (because they occasionally get updated) but note that you will have to read all the comments and answers.

remotefilelog essentially allows on demand shallow clones (so you don't fetch the history for everything for all time) but you still fetch the essential metadata for, and checkout across, all the directories of the repo at the desired revision.

Using Mercurial, can i checkout a subdirectory (li[k]e in SVN)? If so, how?

https://stackoverflow.com/a/40355673/7836056 discusses how you might use third party extensions to allow narrow/sparse checkouts (Facebook's sparse.py) or narrow clones (Google's NarrowHG) with Mercurial thus only "creating" a single directory from within the main repository (albeit with radically different tradeoffs).

(Note phrasing matters: "sparse checkout" means a very specific action when referring to distributed version control in a manner that doesn't exist when using it to refer to centralised version control)



来源:https://stackoverflow.com/questions/39317634/mercurial-sparse-checkout

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!