Sunday, June 21, 2015

How to move a project from Google Code to GitHub

Google Code is shutting down, so you need to move your projects to a new hosting site.

This article explains how to export a project from Google Code to GitHub.  Google Code provides an "Export to GitHub" button at the top of your project pages.  That button uploads the repository to GitHub (first converting it to Git if needed), but it doesn't always work; even when it does, moving an entire project involves much more.  This post takes you through the extra steps that you need to perform.  Some of these steps ought to be done by the "Export to GitHub" button, but the export tool is incomplete.  Others are necessarily manual.
  1. Warn your teammates that you are going to move the repository; give them a chance to push their changes or do other cleanup.  Ask them not to make any more changes to their clone, ever.
  2. Export one project at a time.  I suffered authentication problems when I clicked the "Export to GitHub" button in parallel for multiple projects in different browser tabs.  Click it for one project and go through GitHub authentication, etc.; only then click the "Export to GitHub" button for the next project.
  3. The "Export to GitHub" button won't work if your repository is too big or if it ever contained a file larger than 100MB.  In this case, you need to do the conversion by hand.
    1. Create the new repository on GitHub
    2. Run the following:
      git clone https://github.com/frej/fast-export.git ~/fast-export
      cd DIRFORGITCLONE
      git init
      ~/fast-export/hg-fast-export.sh -r HGCLONE
      git checkout HEAD
      git remote add origin git@github.com:USERNAME/PROJECTNAME.git
      git push --set-upstream origin master
    3. If the push failed because your repository contains files that are too large, use the BFG Repo-Cleaner to remove them, then redo the push.  Sample command lines:
      bfg --strip-blobs-bigger-than 100M
      git gc --aggressive --prune=now
    4. Migrate the issues:  see https://code.google.com/p/support-tools/wiki/IssueExporterTool .  I didn't have a problem with the throttling; issues seemed to be uploaded faster than one per second.
  4. Redo the repository export if you didn't do it manually in the previous step.
    The reason is that the tool run by the "Export to GitHub" button often reports that the repository conversion was successful when in fact some history is lost or munged (especially around merges and file renames).  One way to identify problems is to run git log --graph and look at the end of the output.  There should be just one root, at the beginning of the history; if there are multiple roots, or a root that is not at the beginning of the history, then you must regenerate the history from the Mercurial repository.  I recommend that you always regenerate it, just in case.
    Run the following commands:
    cd ~
    git clone https://github.com/frej/fast-export.git
    REPO=myrepositoryname
    cd REPOSITORY_PARENT
    rm -rf "$REPO"
    git init "$REPO"
    cd "$REPO"
    ~/fast-export/hg-fast-export.sh -r HGCLONE --force -A ~/authornames.txt
    git checkout HEAD
    git remote add origin "git@github.com:mygitusername/$REPO"
    git push --force --set-upstream origin master
  5. The GitHub repository appears in your personal GitHub account.  If you collaborate with other people on the project, move the repository to an organization's account by browsing to "Settings > Transfer ownership".
  6. Rename your old clone of the Google Code repository, and clone the new GitHub repository.
    If you manage your clones using a program such as mvc, then update its control file, such as ~/.mvc-checkouts.
  7. If the old Google Code repository used Mercurial:
    1. Convert the ignore file from Mercurial to Git:
      git mv .hgignore .gitignore
    2. Remove any occurrence of syntax: glob, and convert any regex patterns in the .gitignore file into globs.  Convert other patterns as necessary.  Take care because git interprets patterns differently depending on whether they contain a slash ("/") or not.
    3. Build the project and run tests to create temporary files that should be ignored.
    4. Run git status to ensure that they are ignored.
    5. Commit the change:
      git commit -m "Rename .hgignore -> .gitignore" .hgignore .gitignore
  8. At the top of the GitHub page for the project (not the GitHub Pages page):
    1. Click the red "Stop ignoring" button
    2. Add text for the description field (from Google Code description)
    3. Add a link to a real homepage for the project.
      If it doesn't have one already, then use http://USERNAME.github.io/PROJECTNAME/ (this is a better choice than the wiki), and create this homepage per the next step.
  9. Create a homepage for the project, if it doesn't have one already.
    A good way to do this is to create a GitHub Pages (github.io) homepage at http://USERNAME.github.io/PROJECTNAME/:
    git checkout --orphan gh-pages
    git rm -rf .
    rm -f .gitignore
    # Create index.html (for example, in a text editor), then:
    git add index.html
    git commit -a -m "First GitHub Pages commit"
    git push --set-upstream origin gh-pages
    Browse to http://USERNAME.github.io/PROJECTNAME/ to verify the content.
  10. If your project doesn't already have a README file, then create one (or maybe README.md); it will show up at the bottom of your GitHub page.  If you are creating the README purely for the benefit of people browsing on GitHub, and your real documentation appears elsewhere, then your README will be short and will redirect people to the project's real homepage.  Your project ought to include two different types of documentation:  for users and for contributors.  These often appear in different places but should be easy to find.
    You should also include a LICENSE file.  Otherwise, nobody (who is under the control of a legal department) will be able to use your software.
  11. If your wiki is intended as developer-written documentation, then move all wiki pages to GitHub Pages or to the main repository: see https://github.com/morgant/finishGoogleCodeGitHubWikiMigration .  (The instructions are hard to follow; read them carefully!)
    Keep the pages in a wiki only if you expect users to edit them.
    The import process creates a "wiki" branch.  If this branch is useless (especially if it contains only one file or is redundant with the project's homepage or GitHub Pages page), then delete it by clicking the appropriate trashcan icon at https://github.com/USERNAME/PROJECTNAME/branches
  12. Look up the permissions on Google Code at https://code.google.com/p/PROJECTNAME/adminMembers and give corresponding permissions on GitHub at https://github.com/USERNAME/PROJECTNAME/settings/collaboration .  Users in the Owners group don't get notifications, so put all of those people also in some other group.
  13. Convert post-commit hooks to GitHub; see "Post-Commit URL" and "Post-Commit Authentication Key" at https://code.google.com/p/PROJECTNAME/adminSource and add them at https://github.com/USERNAME/PROJECTNAME/settings/hooks/new .
  14. Add email notifications at https://github.com/USERNAME/PROJECTNAME/settings/hooks/new?service=email corresponding to the "Activity notifications" at https://code.google.com/p/PROJECTNAME/adminSource .
  15. Copy the files in the Google Code "Downloads" section to another location; they are not part of the repository.
  16. The project's documentation and buildfiles probably refer to Google Code.  Update all such mentions.
    1. Update direct references to files in the repository:
      preplace "https?://PROJECTNAME.googlecode.com/hg/"   "https://raw.githubusercontent.com/USERNAME/PROJECTNAME/master/"
    2. Search for more occurrences of googlecode.com and code.google.com, and edit each one manually.
    3. If the old Google Code repository used Mercurial, search for occurrences of hg, and edit each one manually.
  17. Write up the change of repositories for your changelog or release notes.
  18. Run tests, then commit and push your changes to GitHub.  For example:
    git commit -a -m "Update references from Google Code to GitHub and from Mercurial (Hg) to Git"
  19. If the push created or changed your project's homepage, run a link checker to check for broken links:
    plume-lib/bin/checklink -q -r `grep -v '^#' plume-lib/bin/checklink-args.txt` MYURL
  20. Update your continuous integration server so that it refers to the new repository location.
  21. If you have a README.html or similar HTML file in your Google Code repository, then external links to it may exist.  In the old Google Code repository, edit it to add a line such as
      <meta http-equiv="Refresh" content="1;URL=http://USERNAME.github.io/PROJECTNAME/" />
    to the header, and commit and push to Google Code.
  22. Set up forwarding ("project moved") at Google Code, at https://code.google.com/p/PROJECTNAME/adminAdvanced .  Forward to the project's homepage or new GitHub Pages homepage.
    If you ever need to temporarily undo the forwarding, go to https://code.google.com/p/PROJECTNAME/adminAdvanced .  Google Code may or may not remember the forwarding URL, so record it before you disable forwarding; you will need it when you re-enable forwarding.
  23. Tell anyone who might have cloned the old Google Code repository to rename their clone, not use it any more, and instead clone and use the new GitHub repository.
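The root check in step 4 can also be scripted rather than read off the end of the git log --graph output.  The following is a minimal sketch, not part of the export tooling: the function names count_roots and check_single_root are made up for illustration, and it assumes a Git version that supports git -C.  It only counts root commits reachable from HEAD; it does not detect other kinds of lost or munged history.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not from any tool).

# count_roots REPO_DIR
# Print the number of root commits (commits with no parent) that are
# reachable from HEAD in the Git repository at REPO_DIR.
count_roots() {
    git -C "$1" rev-list --max-parents=0 HEAD | wc -l | tr -d ' '
}

# check_single_root REPO_DIR
# Succeed if the history reachable from HEAD has exactly one root commit;
# otherwise print a warning to stderr and return a nonzero status.
check_single_root() {
    roots=$(count_roots "$1")
    if [ "$roots" -eq 1 ]; then
        echo "OK: one root commit"
    else
        echo "WARNING: $roots root commits; consider regenerating the history" >&2
        return 1
    fi
}
```

For example, after defining these functions in your shell, run check_single_root REPOSITORY_PARENT/myrepositoryname on the converted clone.  A warning suggests redoing the conversion with hg-fast-export.sh as described in step 4.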
If you have any corrections or additions to this guide, let me know.
