How do you organise multiple git repositories, so that all of them are backed up together?

Date: 2023-01-15 09:36:48

With SVN, I had a single big repository that I kept on a server and checked out on a few machines. This was a pretty good backup system, and allowed me to easily work on any of the machines. I could check out a specific project, commit, and have it update the 'master' project, or I could check out the entire thing.

Now, I have a bunch of git repositories, for various projects, several of which are on github. I also have the SVN repository I mentioned, imported via the git-svn command.

Basically, I like having all my code (not just projects, but random snippets and scripts, some things like my CV, articles I've written, websites I've made and so on) in one big repository I can easily clone onto remote machines, or memory-sticks/harddrives as backup.

The problem is that it's a private repository, and git doesn't allow checking out a specific folder (one that I could push to github as a separate project, while having the changes appear in both the master repo and the sub-repos).

I could use the git submodule system, but it doesn't act how I want it to (submodules are pointers to other repositories, and don't really contain the actual code, so they are useless for backup).

Currently I have a folder of git repos (for example, ~/code_projects/proj1/.git/ and ~/code_projects/proj2/.git/). After making changes to proj1 I do git push github, then I copy the files into ~/Documents/code/python/projects/proj1/ and do a single commit (instead of the numerous ones in the individual repos). Then I do git push backupdrive1, git push mymemorystick, etc.

So, the question: How do you organise your personal code and projects with git repositories, and keep them synced and backed up?

6 answers

#1


75  

I would strongly advise against putting unrelated data in a given Git repository. The overhead of creating new repositories is quite low, and that is a feature that makes it possible to keep different lineages completely separate.

Fighting that idea means ending up with unnecessarily tangled history, which renders administration more difficult and--more importantly--"archeology" tools less useful because of the resulting dilution. Also, as you mentioned, Git assumes that the "unit of cloning" is the repository, and practically has to do so because of its distributed nature.

One solution is to keep every project/package/etc. as its own bare repository (i.e., without working tree) under a blessed hierarchy, like:

/repos/a.git
/repos/b.git
/repos/c.git

Once a few conventions have been established, it becomes trivial to apply administrative operations (backup, packing, web publishing) to the complete hierarchy, which serves a role not entirely dissimilar to "monolithic" SVN repositories. Working with these repositories also becomes somewhat similar to SVN workflows, with the addition that one can use local commits and branches:

svn checkout   --> git clone
svn update     --> git pull
svn commit     --> git push
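To make the point about hierarchy-wide administration more concrete, here is a minimal sketch of a backup loop over such a hierarchy. It is only an illustration under assumptions: the REPOS and BACKUP locations are made-up scratch directories standing in for /repos and a backup drive, and mirroring with git clone --mirror is one possible choice, not something the answer prescribes.

```shell
#!/bin/sh
# Sketch: mirror every bare repository under one hierarchy to a backup location.
# REPOS and BACKUP are hypothetical; a scratch directory stands in for /repos.
set -e
work=$(mktemp -d)
REPOS="$work/repos"
BACKUP="$work/backup"
mkdir -p "$REPOS" "$BACKUP"

# Create a few empty bare repositories, as in the layout above.
for name in a b c; do
    git init -q --bare "$REPOS/$name.git"
done

# One loop applies the same operation to the complete hierarchy.
for repo in "$REPOS"/*.git; do
    target="$BACKUP/$(basename "$repo")"
    if [ -d "$target" ]; then
        # Existing mirror: fetch all refs and prune ones deleted at the source.
        git -C "$target" remote update --prune >/dev/null
    else
        # First run: a mirror clone copies every branch and tag.
        git clone -q --mirror "$repo" "$target"
    fi
done
ls "$BACKUP"
```

Running the same loop again later would take the update branch instead of cloning, so it could be scheduled as a periodic job.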

You can have multiple remotes in each working clone, for the ease of synchronizing between the multiple parties:

$ cd ~/dev
$ git clone /repos/foo.git       # or the one from github, ...
$ cd foo
$ git remote add github ...
$ git remote add memorystick ...

You can then fetch/pull from each of the "sources", work and commit locally, and then push ("backup") to each of these remotes when you are ready with something like (note how that pushes the same commits and history to each of the remotes!):

$ for remote in origin github memorystick; do git push "$remote"; done
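As a variation on the loop above, the remote names do not have to be hard-coded, since git remote lists them. The following is a sketch under assumptions: the scratch directories stand in for the real remotes (github, a memory stick, and so on), and the commit identity is a placeholder.

```shell
#!/bin/sh
# Sketch: push the current branch to every remote configured in a clone.
# All paths are scratch directories standing in for real remotes.
set -e
work=$(mktemp -d)
git init -q --bare "$work/origin.git"
git init -q --bare "$work/memorystick.git"

git init -q "$work/foo"
cd "$work/foo"
git remote add origin "$work/origin.git"
git remote add memorystick "$work/memorystick.git"

echo "hello" > README
git add README
git -c user.name="Example" -c user.email="example@example.com" commit -q -m "Initial commit"

# `git remote` prints one remote name per line, so no names are hard-coded.
for remote in $(git remote); do
    # HEAD pushes the current branch to the branch of the same name.
    git push -q "$remote" HEAD
done
```

Every remote receives the same commit, which is the point made in the parenthetical note above.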

The easiest way to turn an existing working repository ~/dev/foo into such a bare repository is probably:

$ cd ~/dev
$ git clone --bare foo /repos/foo.git
$ mv foo foo.old
$ git clone /repos/foo.git

which is mostly equivalent to an svn import, but does not throw the existing, "local" history away.

Note: submodules are a mechanism to include shared related lineages, so I indeed wouldn't consider them an appropriate tool for the problem you are trying to solve.

#2


28  

I want to add to Damien's answer where he recommends:

$ for remote in origin github memorystick; do git push "$remote"; done

You can set up a special remote that pushes to all the individual real remotes with one command; I found it at http://marc.info/?l=git&m=116231242118202&w=2:

So for "git push" (where it makes sense to push the same branches multiple times), you can actually do what I do:

  • .git/config contains:

    [remote "all"]
    url = master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6
    url = login.osdl.org:linux-2.6.git
    
  • and now git push all master will push the "master" branch to both
    of those remote repositories.
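The same kind of "all" remote can also be built from the command line rather than by editing .git/config by hand. This is a minimal sketch, assuming scratch directories in place of the kernel.org URLs quoted above; the repository names and the commit identity are placeholders.

```shell
#!/bin/sh
# Sketch: build a combined "all" remote with two url entries.
set -e
work=$(mktemp -d)
git init -q --bare "$work/first.git"
git init -q --bare "$work/second.git"

git init -q "$work/proj"
cd "$work/proj"
echo "hello" > README
git add README
git -c user.name="Example" -c user.email="example@example.com" commit -q -m "Initial commit"

# The first URL creates the remote; --add appends a second url line to it.
git remote add all "$work/first.git"
git remote set-url --add all "$work/second.git"

# Show the resulting url entries stored in .git/config.
git config --get-all remote.all.url

# One push now updates both repositories.
git push -q all HEAD
```

As far as I know, fetching from such a remote only consults the first URL, so it is best treated as a push-only convenience.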

You can also save yourself typing the URLs twice by using the construction:

[url "<actual url base>"]
    insteadOf = <other url base>
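A small sketch of how that insteadOf rewriting behaves in practice, under assumptions: the "backup:" prefix, the remote name, and the scratch paths are all invented for illustration, and the setting is made in the repository's own config rather than globally.

```shell
#!/bin/sh
# Sketch: define a short URL prefix with insteadOf (repo-local config).
# "backup:" and the scratch path are made-up names for illustration.
set -e
work=$(mktemp -d)
mkdir "$work/drive"
git init -q --bare "$work/drive/foo.git"

git init -q "$work/proj"
cd "$work/proj"

# Any URL starting with "backup:" is rewritten to start with the real base.
git config url."$work/drive/".insteadOf "backup:"

# The remote is stored with the short form...
git remote add memorystick backup:foo.git
git config --get remote.memorystick.url

# ...and get-url prints it with the insteadOf rewrite applied.
git remote get-url memorystick
```

If the drive later mounts somewhere else, only the single url.<base>.insteadOf entry would need to change, not every remote that uses the prefix.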

#3


4  

I haven't tried nesting git repositories yet because I haven't run into a situation where I need to. From what I've read on the #git channel, git seems to get confused by nested repositories, i.e. when you run git-init inside a git repository. The only way to manage a nested git structure is to use either git-submodule or Android's repo utility.

As for the backup responsibility you're describing, I say delegate it... I usually put the "origin" repository for each project on a network drive at work that is backed up regularly by the IT techs, using their backup strategy of choice. It is simple and I don't have to worry about it. ;)

#4


3  

I am also curious about suggested ways to handle this, and will describe the current setup that I use (with SVN). I have basically created a repository that contains a mini-filesystem hierarchy including its own bin and lib dirs. There is a script in the root of this tree that sets up your environment by adding these bin, lib, and other dirs to the proper environment variables. So the root directory essentially looks like:

./bin/            # prepended to $PATH
./lib/            # prepended to $LD_LIBRARY_PATH
./lib/python/     # prepended to $PYTHONPATH
./setup_env.bash  # sets up the environment

Now inside /bin and /lib there are the multiple projects and their corresponding libraries. I know this isn't a standard project, but it is very easy for someone else in my group to check out the repo, run the 'setup_env.bash' script and have the most up-to-date versions of all of the projects locally in their checkout. They don't have to worry about installing/updating /usr/bin or /usr/lib, and it keeps it simple to have multiple checkouts and a very localized environment per checkout. Someone can also just rm the entire repository and not worry about uninstalling any programs.
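The script itself is not shown, so the following is only a guess at what a setup_env.bash along these lines might contain. It assumes the script is sourced from the root of the checkout; the ENV_ROOT variable is a name introduced here for illustration.

```shell
# setup_env.bash (hypothetical sketch; the original script is not shown).
# Source it from the root of the checkout:  . ./setup_env.bash

# Root of the checkout; defaults to the current directory if not preset.
ENV_ROOT="${ENV_ROOT:-$(pwd)}"

# Prepend each directory, keeping any existing value after it.
export PATH="$ENV_ROOT/bin${PATH:+:$PATH}"
export LD_LIBRARY_PATH="$ENV_ROOT/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
export PYTHONPATH="$ENV_ROOT/lib/python${PYTHONPATH:+:$PYTHONPATH}"
```

Because the variables are only exported into the shell that sources the file, each checkout can have its own environment, and deleting the checkout leaves nothing installed elsewhere.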

This is working fine for us, and I'm not sure if we'll change it. The problem with this is that there are many projects in this one big repository. Is there a git/Hg/bzr standard way of creating an environment like this and breaking out the projects into their own repositories?

#5


2  

What about using mr for managing your multiple Git repos at once:

The mr(1) command can checkout, update, or perform other actions on a set of repositories as if they were one combined repository. It supports any combination of subversion, git, cvs, mercurial, bzr, darcs, vcsh, fossil and veracity repositories, and support for other revision control systems can easily be added. [...]

It is extremely configurable via simple shell scripting. Some examples of things it can do include:

[...]

  • When updating a git repository, pull from two different upstreams and merge the two together.

  • Run several repository updates in parallel, greatly speeding up the update process.

  • Remember actions that failed due to a laptop being offline, so they can be retried when it comes back online.

#6


1  

There is another method for having nested git repos, but it doesn't solve the problem you're after. Still, for others who are looking for the solution I was after:

In the top-level git repo, just list the folder containing the nested git repo in .gitignore. This makes it easy to have two separate (but nested!) git repos.
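A minimal sketch of that arrangement, assuming made-up directory names ("outer" and "nested") and a placeholder commit identity:

```shell
#!/bin/sh
# Sketch: an outer repository that ignores a nested, independent repository.
# "outer" and "nested" are made-up names.
set -e
work=$(mktemp -d)
git init -q "$work/outer"
cd "$work/outer"

# Tell the outer repository to ignore the nested directory.
echo "nested/" > .gitignore
git add .gitignore
git -c user.name="Example" -c user.email="example@example.com" commit -q -m "Ignore nested repo"

# The nested repository has its own .git directory and its own history.
git init -q nested
echo "data" > nested/file.txt
git -C nested add file.txt
git -C nested -c user.name="Example" -c user.email="example@example.com" commit -q -m "Nested commit"

# The outer repository sees nothing to commit; the nested one has its own log.
git status --porcelain
git -C nested log --oneline
```

Note that cloning or pushing the outer repository does not carry the nested one along, which is why this does not address the backup question.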
