Merge two Git repositories without breaking file history

I need to merge two Git repositories into a brand new, third repository. I've found many descriptions of how to do this using a subtree merge (for example Jakub Narębski's answer on How do you merge two Git repositories?) and following those instructions mostly works, except that when I commit the subtree merge all of the files from the old repositories are recorded as newly added files. I can see the commit history from the old repositories when I do git log, but if I do git log <file> it shows only one commit for that file: the subtree merge. Judging from the comments on the above answer, I'm not alone in seeing this problem, but I've found no published solutions for it.

Is there any way to merge repositories and leave individual file history intact?


It turns out that the answer is much simpler if you're simply trying to glue two repositories together and make it look like it was that way all along rather than manage an external dependency. You simply need to add remotes to your old repos, merge them to your new master, move the files and folders to a subdirectory, commit the move, and repeat for all additional repos. Submodules, subtree merges, and fancy rebases are intended to solve a slightly different problem and aren't suitable for what I was trying to do.

Here's an example PowerShell script to glue two repositories together:

# Assume the current directory is where we want the new repository to be created
# Create the new repository
git init

# Before we do a merge, we have to have an initial commit, so we'll make a dummy commit
dir > deleteme.txt
git add .
git commit -m "Initial dummy commit"

# Add a remote for and fetch the old repo
git remote add -f old_a <OldA repo URL>

# Merge the files from old_a/master into new/master
git merge old_a/master --allow-unrelated-histories

# Clean up our dummy file because we don't need it any more
git rm deleteme.txt
git commit -m "Clean up initial file"

# Move the old_a repo files and folders into a subdirectory so they don't collide with the other repo coming later
mkdir old_a
dir -exclude old_a | %{git mv $_.Name old_a}

# Commit the move
git commit -m "Move old_a files into subdir"

# Do the same thing for old_b
git remote add -f old_b <OldB repo URL>
git merge old_b/master --allow-unrelated-histories
mkdir old_b
dir -exclude old_a,old_b | %{git mv $_.Name old_b}
git commit -m "Move old_b files into subdir"

Obviously you could instead merge old_b into old_a (which becomes the new combined repo) if you'd rather do that – modify the script to suit.
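
For anyone not on PowerShell, here is a rough POSIX shell sketch of the same gluing procedure, run against throwaway repositories so it can be tried safely. The temp directory, the demo identity, and the file names are all made up for illustration, and it swaps the dummy-file commit for an empty commit (git commit --allow-empty), which is my substitution rather than part of the script above.

```shell
#!/bin/sh
set -eu

# Hypothetical sandbox: everything lives under a throwaway temp directory.
work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Build two tiny stand-ins for the old repositories, two commits each.
for name in old_a old_b; do
    git init -q "$work/$name"
    cd "$work/$name"
    git symbolic-ref HEAD refs/heads/master   # force the branch name used below
    echo "first" > "$name.txt"
    git add .
    git commit -q -m "$name: first"
    echo "second" >> "$name.txt"
    git commit -q -am "$name: second"
done

# Create the new repository with an empty initial commit to merge into.
git init -q "$work/new"
cd "$work/new"
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "Initial empty commit"

for name in old_a old_b; do
    # Add a remote for the old repo and fetch it in one step.
    git remote add -f "$name" "$work/$name"
    git merge -q --allow-unrelated-histories -m "Merge $name" "$name/master"

    # Move every tracked top-level entry into its own subdirectory,
    # skipping the subdirectories this loop creates.
    mkdir "$name"
    git ls-tree --name-only HEAD | while IFS= read -r entry; do
        case "$entry" in
            old_a|old_b) ;;
            *) git mv "$entry" "$name/" ;;
        esac
    done
    git commit -q -m "Move $name files into subdir"
done
```

Afterwards, running git log --follow -- old_a/old_a.txt inside the new repository should list the original commits from old_a as well as the move, since --follow tracks the file across the rename.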

If you want to bring over in-progress feature branches as well, use this:

# Bring over a feature branch from one of the old repos
git checkout -b feature-in-progress
git merge -s recursive -Xsubtree=old_a old_a/feature-in-progress

That's the only non-obvious part of the process - that's not a subtree merge, but rather an argument to the normal recursive merge that tells Git that we renamed the target and that helps Git line everything up correctly.
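
To see what -Xsubtree does in isolation, here is a small shell sketch using a throwaway stand-in for old_a that has an unmerged feature branch. The temp directory, the demo identity, and app.txt are assumptions for the demo, and the initial commit is an empty one rather than the dummy file used above.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)   # hypothetical sandbox directory
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in for old_a: master plus a feature branch that edits app.txt.
git init -q "$work/old_a"
cd "$work/old_a"
git symbolic-ref HEAD refs/heads/master
printf 'line 1\nline 2\nline 3\n' > app.txt
git add app.txt
git commit -q -m "old_a: initial"
git checkout -q -b feature-in-progress
echo "feature work" >> app.txt
git commit -q -am "old_a: feature work"
git checkout -q master

# New repo that already has old_a/master merged and moved under old_a/.
git init -q "$work/new"
cd "$work/new"
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "Initial empty commit"
git remote add -f old_a "$work/old_a"
git merge -q --allow-unrelated-histories -m "Merge old_a" old_a/master
mkdir old_a
git mv app.txt old_a/
git commit -q -m "Move old_a files into subdir"

# Bring the feature branch over. -Xsubtree=old_a tells the merge that
# the incoming tree belongs under old_a/ in our tree.
git checkout -q -b feature-in-progress
git merge -q -s recursive -Xsubtree=old_a \
    -m "Merge old_a feature branch" old_a/feature-in-progress
```

The feature branch's change should land in old_a/app.txt rather than creating a new app.txt at the top level.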

I wrote up a slightly more detailed explanation here.


Here's a way that doesn't rewrite any history, so all commit IDs will remain valid. The end-result is that the second repo's files will end up in a subdirectory.

  • Add the second repo as a remote:

    cd firstgitrepo/
    git remote add secondrepo username@servername:andsoon
    
  • Make sure that you've downloaded all of the secondrepo's commits:

    git fetch secondrepo
    
  • Create a local branch from the second repo's branch:

    git branch branchfromsecondrepo secondrepo/master
    
  • Move all its files into a subdirectory:

    git checkout branchfromsecondrepo
    mkdir subdir/
    git ls-tree -z --name-only HEAD | xargs -0 -I {} git mv {} subdir/
    git commit -m "Moved files to subdir/"
    
  • Merge the second branch into the first repo's master branch:

    git checkout master
    git merge --allow-unrelated-histories branchfromsecondrepo
    
  • Your repository will have more than one root commit, but that shouldn't pose a problem.
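
The steps above can be strung together into one script. This is only a sketch against two throwaway repositories: the temp directory, the demo identity, and the file names are invented for the demo, and the local path stands in for the username@servername:andsoon URL.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)   # hypothetical sandbox directory
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Two small stand-in repositories with two commits each.
for name in first second; do
    git init -q "$work/$name"
    cd "$work/$name"
    git symbolic-ref HEAD refs/heads/master
    echo "v1" > "$name.txt"
    git add .
    git commit -q -m "$name: v1"
    echo "v2" >> "$name.txt"
    git commit -q -am "$name: v2"
done

cd "$work/first"

# Add the second repo as a remote and download its commits.
git remote add secondrepo "$work/second"
git fetch -q secondrepo

# Create a local branch from the second repo's master and move its files.
git branch branchfromsecondrepo secondrepo/master
git checkout -q branchfromsecondrepo
mkdir subdir
git ls-tree -z --name-only HEAD | xargs -0 -I {} git mv {} subdir/
git commit -q -m "Moved files to subdir/"

# Merge the moved branch into the first repo's master.
git checkout -q master
git merge -q --allow-unrelated-histories \
    -m "Merge secondrepo into subdir/" branchfromsecondrepo

# List the root commits; there is one per original repository.
git rev-list --max-parents=0 HEAD
```

No existing commit is rewritten here; the only new commits are the move and the merge.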


Have a look at using

    git rebase --root --preserve-merges --onto

to link two histories early on in their lives. (Git 2.34 removed --preserve-merges; on current versions use --rebase-merges instead.)

If you have paths that overlap, fix them up with

    git filter-branch --index-filter

When you use log, ensure you "find copies harder" with

    git log -CC

That way you will find any movements of files in the path.
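
On the point about finding file movements in the log: a quick way to check it is to compare a plain path-limited log with one that follows renames. The sketch below uses git log --follow, which is not the option named above but is the documented way to continue a single file's history across a rename. The temp directory, demo identity, and notes.txt are assumptions for the demo.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)   # hypothetical sandbox directory
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$work/repo"
cd "$work/repo"

# Two commits touching the file at its original path...
echo one > notes.txt
git add notes.txt
git commit -q -m "Add notes"
echo two >> notes.txt
git commit -q -am "Extend notes"

# ...then one commit that only moves it into a subdirectory.
mkdir subdir
git mv notes.txt subdir/
git commit -q -m "Move notes into subdir"

# Count the commits each form of log reports for the new path.
plain=$(git log --oneline -- subdir/notes.txt | wc -l | tr -d ' ')
followed=$(git log --oneline --follow -- subdir/notes.txt | wc -l | tr -d ' ')
echo "plain: $plain, with --follow: $followed"
```

The plain log stops at the move because the path did not exist before it, which is the same symptom the question describes.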
