Our Git repositories started out as parts of a single monster SVN repository, where each project had its own tree, like so:

    project1/branches
            /tags
            /trunk
    project2/branches
            /tags
            /trunk

Obviously, it was easy to move files from one project to another with svn mv. But in Git, each project is in its own repository, and today I was asked to move a subdirectory from project2 to project1. I did this:

    $ git clone project2
    $ cd project2
    $ git filter-branch --subdirectory-filter deeply/buried/java/source/directory/A -- --all
    $ git remote rm origin  # so I don't accidentally the repo ;-)
    $ mkdir -p deeply/buried/different/java/source/directory/B
    $ for f in *.java; do
    >   git mv "$f" deeply/buried/different/java/source/directory/B
    > done
    $ git commit -m "moved files to new subdirectory"
    $ cd ..
    $
    $ git clone project1
    $ cd project1
    $ git remote add p2 ../project2
    $ git fetch p2
    $ git branch p2 remotes/p2/master
    $ git merge p2 # --allow-unrelated-histories for git 2.9
    $ git remote rm p2
    $ git push

But that seems pretty convoluted. Is there a better way to do this sort of thing in general? Or have I taken the right approach?

Note that this involves merging the history into an existing repository, rather than simply creating a new stand-alone repository from part of another one (as in a previous question).
Answer #1

If your history is sane, you can take the commits out as patches and apply them in the new repository:

    cd repository
    git log --pretty=email --patch-with-stat --reverse --full-index --binary -- path/to/file_or_folder > patch
    cd ../another_repository
    git am < ../repository/patch

Or in one line:
git log --pretty=email --patch-with-stat --reverse -- path/to/file_or_folder | (cd /path/to/new_repository && git am)
(Taken from Exherbo's docs)
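To make the patch round trip concrete, here is a minimal sketch that builds two throwaway repositories and replays the history of one directory from the first into the second. Everything named here (src, dst, lib/, a.txt, other.txt, the demo identity) is made up for illustration, and the empty initial commit in dst is just my assumption to give git am a branch to build on.

```shell
#!/bin/sh
# Sketch: replay the history of one directory into another repository.
# All names (src, dst, lib/, a.txt, other.txt) are hypothetical.
set -e
work=$(mktemp -d)
# Throwaway identity so commits work without any global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$work/src"
git init -q "$work/dst"

# Source repository: two commits touching lib/, one unrelated commit.
cd "$work/src"
mkdir lib
echo one > lib/a.txt
git add lib/a.txt
git commit -q -m "add a"
echo two >> lib/a.txt
git commit -q -am "extend a"
echo x > other.txt
git add other.txt
git commit -q -m "unrelated change"

# Export only the commits touching lib/, oldest first, as an mbox file.
git log --pretty=email --patch-with-stat --reverse --full-index --binary \
    -- lib > "$work/lib.patch"

# Target repository: apply the mbox on top of an existing commit.
cd "$work/dst"
git commit -q --allow-empty -m "initial commit"
git am -q < "$work/lib.patch"

# List the commits that arrived for lib/.
git log --oneline -- lib
```

Only the commits that touched lib/ are replayed, so other.txt never shows up in dst. Merge commits in the source history are not handled by this sketch.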
Answer #2

Yep, hitting on the --subdirectory-filter of filter-branch was key. The fact that you used it essentially proves there is no easier way: you had no choice but to rewrite history, because you wanted to end up with only a (renamed) subset of the files, and this by definition changes the hashes. Since none of the standard commands (e.g. pull) rewrite history, you could not have used them to accomplish this.

You could refine the details, of course (some of your cloning and branching wasn't strictly necessary), but the overall approach is good! It's a shame it's complicated, but of course, the point of git isn't to make it easy to rewrite history.

Answer #3

Having tried various approaches to moving a file or folder from one Git repository to another, the only one that seems to work reliably is outlined below.

It involves cloning the repository you want to move the file or folder from, moving that file or folder to the root, rewriting Git history, cloning the target repository, and pulling the file or folder with its history directly into the target repository.
Stage one

- Make a copy of repository A, as the following steps make major changes to this copy which you should not push!

    git clone --branch <branch> --origin origin --progress \
      -v <git repository A url>
    # eg. git clone --branch master --origin origin --progress \
    #   -v https://username@giturl/scm/projects/myprojects.git
    # (assuming myprojects is the repository you want to copy from)

- cd into it

    cd <git repository A directory>
    # eg. cd /c/Working/GIT/myprojects

- Remove the link to the original repository to avoid accidentally making any remote changes (for example, by pushing)

    git remote rm origin

- Go through your history and files, removing anything that is not in directory 1. The result is the contents of directory 1 moved up into the base of repository A.

    git filter-branch --subdirectory-filter <directory> -- --all
    # eg. git filter-branch --subdirectory-filter subfolder1/subfolder2/FOLDER_TO_KEEP -- --all

- For a single file move only: go through what is left and remove everything except the desired file. (You may need to delete unwanted files with the same name and commit.)

    git filter-branch -f --index-filter \
      'git ls-files -s | grep "$(printf "\t")FILE_TO_KEEP\$" |
       GIT_INDEX_FILE=$GIT_INDEX_FILE.new git update-index --index-info &&
       mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE" || echo "Nothing to do"' \
      --prune-empty -- --all
    # eg. FILE_TO_KEEP = pom.xml to keep only the pom.xml file from FOLDER_TO_KEEP
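The single-file filter works because git ls-files -s prints mode, hash and stage, then a TAB, then the path, so a grep anchored on TAB plus file name plus end of line matches only an entry whose whole path is that file name. Here is a small sketch of just that matching step on made-up index lines (the hashes, the paths and the index_listing helper are hypothetical, not taken from a real repository):

```shell
#!/bin/sh
# Sample `git ls-files -s` lines: "<mode> <hash> <stage><TAB><path>".
# The hashes and paths below are invented for illustration.
tab=$(printf '\t')

index_listing() {
    printf '100644 1111111111111111111111111111111111111111 0\tpom.xml\n'
    printf '100644 2222222222222222222222222222222222222222 0\tsrc/Main.java\n'
    printf '100644 3333333333333333333333333333333333333333 0\tmodule/pom.xml\n'
    printf '100644 4444444444444444444444444444444444444444 0\tnotes/old-pom.xml\n'
}

# Keep only the entry whose entire path is pom.xml: the TAB anchors the
# start of the path and the trailing $ anchors its end.
kept=$(index_listing | grep "${tab}pom\.xml\$")
printf '%s\n' "$kept"
```

module/pom.xml and notes/old-pom.xml are rejected because the character before pom.xml in those lines is not the TAB that separates the stage number from the path.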
Stage two

- Cleanup step

    git reset --hard

- Cleanup step

    git gc --aggressive

- Cleanup step

    git prune

You may want to import these files into repository B inside a directory rather than at the root:

- Create the directory

    mkdir <base directory>
    # eg. mkdir FOLDER_TO_KEEP

- Move the files into that directory

    git mv * <base directory>
    # eg. git mv * FOLDER_TO_KEEP

- Add the files to that directory

    git add .

- Commit your changes, and the files are ready to be merged into the new repository

    git commit
Stage three

- Make a copy of repository B if you do not have one already

    git clone <git repository B url>
    # eg. git clone https://username@giturl/scm/projects/FOLDER_TO_KEEP.git

  (assuming FOLDER_TO_KEEP is the name of the new repository you are copying to)

- cd into it

    cd <git repository B directory>
    # eg. cd /c/Working/GIT/FOLDER_TO_KEEP

- Create a remote connection to repository A as a branch in repository B

    git remote add repo-A-branch <git repository A directory>
    # (repo-A-branch can be anything - it's just an arbitrary name)
    # eg. git remote add repo-A-branch /c/Working/GIT/myprojects

- Pull from this branch (which contains only the directory you want to move) into repository B.

    git pull repo-A-branch master --allow-unrelated-histories

  The pull copies both the files and their history. Note: you can use a merge instead of a pull, but pull works better.

- Finally, you probably want to clean up a bit by removing the remote connection to repository A

    git remote rm repo-A-branch

- Push and you're all set.

    git push
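Here is the three-stage procedure condensed into one throwaway run, as a rough sketch rather than a drop-in script. The repository names, sub/keep, moved/ and the demo identity are all made up. I also added --no-rebase and --no-edit to the pull as my own assumption: recent Git versions may refuse to pull divergent branches without an explicit strategy, and --no-edit avoids opening an editor for the merge message.

```shell
#!/bin/sh
# Sketch of stages one to three on temporary repositories.
# repoA, repoB, sub/keep and moved/ are hypothetical names.
set -e
work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's startup warning

# Repository A: one directory to move, one file to leave behind.
git init -q "$work/repoA"
cd "$work/repoA"
mkdir -p sub/keep
echo v1 > sub/keep/file.txt
echo junk > unrelated.txt
git add .
git commit -q -m "A: initial"
echo v2 >> sub/keep/file.txt
git commit -q -am "A: edit file"
branch=$(git rev-parse --abbrev-ref HEAD)   # master or main, depending on config

# Stage one: reduce A to the contents of sub/keep.
git filter-branch --subdirectory-filter sub/keep -- --all >/dev/null 2>&1

# Stage two: move the files under a directory of their own.
mkdir moved
git mv file.txt moved/
git commit -q -m "A: move files into moved/"

# Stage three: pull the rewritten history into repository B.
git init -q "$work/repoB"
cd "$work/repoB"
echo b > b.txt
git add b.txt
git commit -q -m "B: initial"
git remote add repo-A-branch "$work/repoA"
git pull -q --no-rebase --no-edit --allow-unrelated-histories \
    repo-A-branch "$branch"
git remote rm repo-A-branch

# B now holds its own commit, A's three rewritten commits and the merge.
git log --oneline
```

The final push is left out because these repositories have no upstream.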
Answer #4

Keeping the directory name

The subdirectory-filter (or the shorter command git subtree) works well, but it did not work for me because it removes the directory name from the commit info. In my scenario, I just want to merge part of one repository into another and keep the history with the full path name.

My solution was to use the tree-filter and simply remove the unwanted files and directories from a temporary clone of the source repository, then pull from that clone into the target repository in five simple steps (plus a final check).

    # 1. clone the source
    git clone ssh://<user>@<source-repo url>
    cd <source-repo>
    # 2. remove the stuff we want to exclude
    git filter-branch --tree-filter "rm -rf <files to exclude>" --prune-empty HEAD
    # 3. move to target repo and create a merge branch (for safety)
    cd <path to target-repo>
    git checkout -b <merge branch>
    # 4. Add the source-repo as remote
    git remote add source-repo <path to source-repo>
    # 5. fetch it (add --allow-unrelated-histories for git 2.9 or later)
    git pull source-repo master
    # 6. check that you got it right (better safe than sorry, right?)
    gitk
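To show that the tree-filter keeps the full path, here is a small end-to-end sketch on temporary repositories. All names (keep/deep/k.txt, drop/, merge-branch, the demo identity) are invented, and the --no-rebase, --no-edit and --allow-unrelated-histories flags on the pull are my assumptions for running unattended on a recent Git.

```shell
#!/bin/sh
# Sketch: strip unwanted paths with --tree-filter, then pull the rest
# into a target repository with the original directory names intact.
set -e
work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's startup warning

# Source repository: keep/ is wanted, drop/ is not.
git init -q "$work/source-repo"
cd "$work/source-repo"
mkdir -p keep/deep drop
echo k > keep/deep/k.txt
echo d > drop/d.txt
git add .
git commit -q -m "source: initial"
echo d2 >> drop/d.txt
git commit -q -am "source: touch only drop/"
branch=$(git rev-parse --abbrev-ref HEAD)

# Remove drop/ from every commit; commits left empty are pruned.
git filter-branch --tree-filter "rm -rf drop" --prune-empty HEAD >/dev/null 2>&1

# Target repository: pull the filtered history on a safety branch.
git init -q "$work/target-repo"
cd "$work/target-repo"
echo t > t.txt
git add t.txt
git commit -q -m "target: initial"
git checkout -q -b merge-branch
git remote add source-repo "$work/source-repo"
git pull -q --no-rebase --no-edit --allow-unrelated-histories \
    source-repo "$branch"

# The file keeps its full original path in the target history.
git log --oneline -- keep/deep/k.txt
```

The second source commit only touched drop/, so --prune-empty removes it and just one source commit is merged.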
Answer #5

This answer provides some interesting commands based on git am, presented step by step using examples.

Objective

- You want to move some or all of your files from one repository to another.
- You want to keep their history.
- But you don't care about retaining tags and branches.
- You accept a limited history for renamed files (and files in renamed directories).

Procedure

- Extract the history in email format using
    git log --pretty=email -p --reverse --full-index --binary
- Reorganize the file tree and update the file name changes in the history [optional]
- Apply the new history using git am
1. Extract history in email format

Example: extract the history of file3, file4 and file5

    my_repo
    ├── dirA
    │   ├── file1
    │   └── file2
    ├── dirB            ^
    │   ├── subdir      | To be moved
    │   │   ├── file3   | with history
    │   │   └── file4   |
    │   └── file5       v
    └── dirC
        ├── file6
        └── file7

Clean the temporary directory destination

    export historydir=/tmp/mail/dir  # Absolute path
    rm -rf "$historydir"             # Caution when cleaning

Clean your source repo

    git commit ...           # Commit your working files
    rm .gitignore            # Disable gitignore
    git clean -n             # Simulate removal
    git clean -f             # Remove untracked file
    git checkout .gitignore  # Restore gitignore

Extract the history of each file in email format

    cd my_repo/dirB
    find -name .git -prune -o -type d -o -exec bash -c 'mkdir -p "$historydir/${0%/*}" && git log --pretty=email -p --stat --reverse --full-index --binary -- "$0" > "$historydir/$0"' {} ';'

Unfortunately, the options --follow and --find-copies-harder cannot be combined with --reverse. This is why the history is cut off when a file (or a parent directory) is renamed.

After: temporary history in email format

    /tmp/mail/dir
    ├── subdir
    │   ├── file3
    │   └── file4
    └── file5
2. Reorganize the file tree and update the file name changes in the history [optional]

Suppose you want to move these three files into this other repo (which can be the same repo).

    my_other_repo
    ├── dirF
    │   ├── file55
    │   └── file56
    ├── dirB              # New tree
    │   ├── dirB1         # was subdir
    │   │   ├── file33    # was file3
    │   │   └── file44    # was file4
    │   └── dirB2         # new dir
    │       └── file5     # = file5
    └── dirH
        └── file77

So reorganize your files:

    cd /tmp/mail/dir
    mkdir dirB
    mv subdir dirB/dirB1
    mv dirB/dirB1/file3 dirB/dirB1/file33
    mv dirB/dirB1/file4 dirB/dirB1/file44
    mkdir dirB/dirB2
    mv file5 dirB/dirB2

Your temporary history is now:

    /tmp/mail/dir
    └── dirB
        ├── dirB1
        │   ├── file33
        │   └── file44
        └── dirB2
            └── file5

Also change the file names within the history:

    cd "$historydir"
    find * -type f -exec bash -c 'sed "/^diff --git a\|^--- a\|^+++ b/s:\( [ab]\)/[^ ]*:\1/$0:g" -i "$0"' {} ';'

Note: this rewrites the history to reflect the change of path and file name (i.e. the new location/name within the new repo).
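To see what that sed expression does to a single patch, here is a small sketch that applies the same substitution to a made-up patch read from stdin. The rewrite_patch_paths helper and the sample patch are hypothetical, and I used sed -E (extended regular expressions) rather than the \| alternation of the command above, on the assumption that it is a bit more portable across sed implementations.

```shell
#!/bin/sh
# Rewrite the path in the header lines of a patch read from stdin.
# rewrite_patch_paths and the sample patch are hypothetical.
rewrite_patch_paths() {
    new_path=$1   # must not contain ':' (used as the sed delimiter)
    # Only lines starting with "diff --git a", "--- a" or "+++ b" are
    # touched; on those, every " a/<old>" or " b/<old>" gets the new path.
    sed -E "/^(diff --git a|--- a|\+\+\+ b)/s:( [ab])/[^ ]*:\1/${new_path}:g"
}

sample_patch() {
    cat <<'EOF'
diff --git a/subdir/file3 b/subdir/file3
index 1111111..2222222 100644
--- a/subdir/file3
+++ b/subdir/file3
@@ -1 +1,2 @@
 first line
+second line
EOF
}

rewritten=$(sample_patch | rewrite_patch_paths dirB/dirB1/file33)
printf '%s\n' "$rewritten"
```

The index line and the hunk body pass through unchanged. One limitation: a removed content line whose text happens to begin with "-- a" would also match the "--- a" address, so check the result on real history.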
3. Apply new history

Your other repo is:

    my_other_repo
    ├── dirF
    │   ├── file55
    │   └── file56
    └── dirH
        └── file77

Apply the commits from the temporary history files:

    cd my_other_repo
    find "$historydir" -type f -exec cat {} + | git am

Your other repo is now:

    my_other_repo
    ├── dirF
    │   ├── file55
    │   └── file56
    ├── dirB            ^
    │   ├── dirB1       | New files
    │   │   ├── file33  | with
    │   │   └── file44  | history
    │   └── dirB2       | kept
    │       └── file5   v
    └── dirH
        └── file77

Use git status to see the number of commits ready to be pushed :-)

Note: because the history has been rewritten to reflect the path and file name changes (i.e. compared with the location/name in the previous repo):

- You do not need git mv to change the location/file name.
- You do not need git log --follow to access the complete history.
Extra trick: detect renamed/moved files within your repo

To list the files that have been renamed:

    find -name .git -prune -o -exec git log --pretty=tformat:'' --numstat --follow {} ';' | grep '=>'

More customizations: you can extend the git log command with the options --find-copies-harder or --reverse. You can also remove the first two columns using cut -f3- and grep for the complete pattern '{.* => .*}'.
find -name .git -prune -o -exec git log --pretty=tformat:'' --numstat --follow --find-copies-harder --reverse {} ';' | cut -f3- | grep '{.* => .*}'
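To illustrate what the cut and grep stages of that pipeline keep, here is a tiny sketch run on invented --numstat lines instead of a real repository (the sample_numstat helper and its paths are hypothetical):

```shell
#!/bin/sh
# Sample `git log --numstat` lines: "<added><TAB><deleted><TAB><path>".
# The entries are made up for illustration.
sample_numstat() {
    printf '3\t1\tdirA/file1\n'
    printf '0\t0\tdirB/{subdir => dirB1}/file3\n'
    printf '\n'
    printf '2\t0\tREADME => README.md\n'
}

# Drop the two count columns, then keep only brace-style renames.
renames=$(sample_numstat | cut -f3- | grep '{.* => .*}')
printf '%s\n' "$renames"
```

Only the dirB/{subdir => dirB1}/file3 entry survives. A rename written without braces, like the README line, is caught by the simpler grep '=>' but not by the braces pattern.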