Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.

Bug 391473

Summary: Process for accepting GitHub pull requests
Product: Community Reporter: Wayne Beaton <wayne.beaton>
Component: GitAssignee: Eclipse Webmaster <webmaster>
Status: RESOLVED WONTFIX QA Contact:
Severity: normal    
Priority: P3 CC: adietish, caniszczyk, denis.roy, gunnar, irbull, jacek.pospychala, jeremie.bresson, jesse.mcconnell, kevin, KevinSawicki, matthias.sohn, mistria, pnehrer, robin, sbouchet, stephan.leichtvogt, thatnitind
Version: unspecified   
Target Milestone: ---   
Hardware: PC   
OS: Linux   
Whiteboard: stalebug

Description Wayne Beaton CLA 2012-10-09 15:16:02 EDT
With our Git repositories mirrored on GitHub, it's natural that we should start to receive pull requests.

The GitHub mirrors are not official eclipse.org repositories themselves, and so are not strictly subject to our policies and procedures. Having said that, having these repositories grossly out of sync with the official repositories would be a disservice to the community.

The eclipse.org repository must be treated as the main/canonical repository. All builds must run from source taken exclusively from the eclipse.org repository. All distributed artefacts must be derived from the eclipse.org repository.

Before anybody can merge a pull request, they need to have access to the GitHub repository. We can do this by setting up a group with access to the repository; I've already set up groups for Jetty, PDT, and a couple of other projects. I tend to set up these groups with the project leads as members with administration privileges (so the project leads can take care of adding and removing people to the group). The EF doesn't have the bandwidth (or interest) to maintain GitHub groups based on committer status.

Group members can merge a pull request into the GitHub repository.

As stated above, the GitHub repository isn't subject to our rules. It is perfectly reasonable for the GitHub repository to contain commits that haven't made it to the eclipse.org repository.

The eclipse.org repository is subject to the terms of the IP Due Diligence process. If the process so dictates, a CQ must be created and processed for a commit before it can be pulled into the eclipse.org repository. The contribution must be recorded in the IP Log, and--at least as of this point--the contributor must have answered the notorious three questions.

Once the IP Due Diligence process approves the contribution, it can be pushed into the eclipse.org repository.

This is where we run into a problem. GitHub sets both the "author" and "commit" field values to that of the contributor. I am not aware of any means to override the values for the "commit" fields to permit the commit to pass the "committers only" restriction on pushes to git.eclipse.org.

Once pulled out of GitHub, a developer can mess around with the commits to set the value before pushing a revised commit to git.eclipse.org; that revised commit would then be pushed to GitHub and--eventually--to forks. But this would result in merge conflicts that everybody would have to solve (at least that's my theoretical take on it). 

Ideally, the merged commit should be able to just flow into the eclipse.org repository. Maybe we can set up a means to temporarily turn off the commit restriction? Or should we be working on means to just remove the restriction altogether? Or am I just not understanding something?

Gerrit may solve this problem. As soon as the pull request is merged in GitHub, it can be pushed to Gerrit and enter the IP Due Diligence workflow there.
Comment 1 Mickael Istria CLA 2012-10-10 02:31:29 EDT
Currently Eclipse repo at GitHub are mirror of Git repos at Eclipse.org. Will mirroring still work if we start working (ie merge pull requests) on GitHub repos?

Multiplying tools (Gerrit + GitHub) will increase the amount of work for "mergers". It would probably be better to be able to rely on a single one (Gerrit).

Would it be possible to import GitHub pull requests as Gerrit contirutions at Eclipse.org? We could think about a script that polls GitHub and imports pull requests to Gerrit and sends a mail to GitHub contributor telling him that his pull request is followed on Gerrit (with a link). The difficult part here is that for some users, it would mean automatically create an Eclipse.org account for the,.
If we choose that way, it has the advantage of removing need to set up new tools, processes & policies.
Comment 2 Wayne Beaton CLA 2012-10-10 10:18:55 EDT
I've added Kevin Sawicki to the conversation; he should be able to answer the GitHub questions.

(In reply to comment #1)
> Currently Eclipse repo at GitHub are mirror of Git repos at Eclipse.org.
> Will mirroring still work if we start working (ie merge pull requests) on
> GitHub repos?

Based on my understanding of how Git works, the mirroring should still work. Kevin, can you confirm.
 
> Multiplying tools (Gerrit + GitHub) will increase the amount of work for
> "mergers". It would probably be better to be able to rely on a single one
> (Gerrit).

I prefer to avoid adding burden to developers (or mergers). I'm trying to smooth out the process for projects that opt to use GitHub. There's no rule that says that you have to use GitHub at all. Ultimately, each project has to decide for themselves how they will interact with adopters. Having said that, projects need to have some flexibility. If contributors come to GitHub in droves, then you really need to spend some time there.

> Would it be possible to import GitHub pull requests as Gerrit contirutions
> at Eclipse.org? We could think about a script that polls GitHub and imports
> pull requests to Gerrit and sends a mail to GitHub contributor telling him
> that his pull request is followed on Gerrit (with a link). 

I'm not sure if this is possible. Sounds like a cool idea if we could make it work.

> The difficult
> part here is that for some users, it would mean automatically create an
> Eclipse.org account for the,.

This is probably the biggest challenge; automatically-created accounts do not have the legal protections that we get when a user creates their own account.

> If we choose that way, it has the advantage of removing need to set up new
> tools, processes & policies.

The landscape always has and always will change. We need to constantly adapt to stay current and relevant. As I suggested earlier, if the world decides that they want to come to your project through GitHub, then you'd be crazy not to meet them there.
Comment 3 Kevin Sawicki CLA 2012-10-10 12:22:10 EDT
You can push to a mirrored GitHub repository but the repository will still updates from its source URL meaning the commits pushed to GitHub will disappear the next time the mirror updates if they aren't in the source repository.

(In reply to comment #2)
> I've added Kevin Sawicki to the conversation; he should be able to answer
> the GitHub questions.
> 
> (In reply to comment #1)
> > Currently Eclipse repo at GitHub are mirror of Git repos at Eclipse.org.
> > Will mirroring still work if we start working (ie merge pull requests) on
> > GitHub repos?
> 
> Based on my understanding of how Git works, the mirroring should still work.
> Kevin, can you confirm.
Comment 4 Gunnar Wagenknecht CLA 2012-10-10 13:54:48 EDT
FWIW, I found two very good reads:
https://github.com/SpringSource/spring-framework/wiki/Manually-merging-pull-requests
https://help.github.com/articles/using-pull-requests

Especially the SpringSource case is a good template (IMHO) to how things can be done. I think there are only two differences for Eclipse projects.

1. You need to due IP clearance before starting the merge
2. The final "origin" to push master to won't be GitHub but Eclipse.org which then gets mirrored to GitHub.

I'm ok with number two. 

I'm also ok with number one. It seems to be very easy to download a pull request as a patch and attach that to a CQ.

The way I see it, the author would remain the contributor. But if I don't squash all the commits before merging into master I won't be able to push it to Eclipse.org because the comitter email will be different. There is a script that ensures the committer id is the same person that pushes.

However, that might not be a problem if I squash all the commits before the merge. This changes the committer to me.

Kevin, do you know if there is metadata in commit message that indicates if the merge involves a pull request? It would be very convenient if GitHub recognizes on the next sync that I merged a commit from a pull request and auto-closes the pull request.
Comment 5 Gunnar Wagenknecht CLA 2012-10-10 13:59:19 EDT
(In reply to comment #1)
> Multiplying tools (Gerrit + GitHub) will increase the amount of work for
> "mergers". It would probably be better to be able to rely on a single one
> (Gerrit).

As the initiator of the request to EMO, I'd like to state that the request actually evolved because of me using Gerrit. I'm sorry, but I find GitHub pull requests much more appealing than working with Gerrit. There is also evidence in our team that getting started with GitHub and pull requests is easier than getting started with Gerrit. 

However, as stated by Wayne, it should not be considered another systems that projects have to use. It really is just an option for projects, i.e. "opt-in" model (as is Gerrit, btw).
Comment 6 Kevin Sawicki CLA 2012-10-10 14:02:55 EDT
If merged via the merge button the merge commit does include metadata about the pull request.

Otherwise no additional metadata is included in the commit message beyond what the author has entered.

(In reply to comment #4)
> Kevin, do you know if there is metadata in commit message that indicates if
> the merge involves a pull request? It would be very convenient if GitHub
> recognizes on the next sync that I merged a commit from a pull request and
> auto-closes the pull request.
Comment 7 Mickael Istria CLA 2012-12-04 03:31:15 EST
(In reply to comment #3)
> You can push to a mirrored GitHub repository but the repository will still
> updates from its source URL meaning the commits pushed to GitHub will
> disappear the next time the mirror updates if they aren't in the source
> repository.

So that means that while GitHub repos are just mirrors, we can't expect the pull request to work for Eclipse projects.
So I'm back to the idea of importing pull requests:
* If the submitter email matches an Eclipse.org login, import the Pull Request to Gerrit, and annotate the Pull Request on GitHub to link to the relevant Gerrit entry
* If the submitter email does not match an Eclipse.org login, send a mail pointing to this pull request to the <project>-dev@eclipse.org list, and annotate PR on GitHub to encourage contributor to use Gerrit. Also link him to the <project>-dev@eclipse.org list since he'll probably need some guidance for his contribution.
Comment 8 Mickael Istria CLA 2012-12-04 03:35:41 EST
Maybe the Eclipse Foundation should enforce usage and creation of a "CONTRIBUTING" file to guide people to Bugzilla or Gerrit before doing a Pull Request: https://github.com/blog/1184-contributing-guidelines
Comment 9 Wayne Beaton CLA 2012-12-04 11:46:34 EST
(In reply to comment #8)
> Maybe the Eclipse Foundation should enforce usage and creation of a
> "CONTRIBUTING" file to guide people to Bugzilla or Gerrit before doing a
> Pull Request: https://github.com/blog/1184-contributing-guidelines

+1 except for the enforce part. Unless this is something that we can do automatically. Perhaps "encourage" or "suggest" are more appropriate words.

I have added a "Recommended Practices" section [1] to the Git wiki page.

[1] http://wiki.eclipse.org/Git#Recommended_Practices
Comment 10 Gunnar Wagenknecht CLA 2012-12-17 01:42:37 EST
(In reply to comment #0)
> The eclipse.org repository is subject to the terms of the IP Due Diligence
> process. If the process so dictates, a CQ must be created and processed for
> a commit before it can be pulled into the eclipse.org repository. The
> contribution must be recorded in the IP Log, and--at least as of this
> point--the contributor must have answered the notorious three questions.

Wayne, I just found this:
http://wiki.eclipse.org/Development_Resources/Handling_Git_Contributions

Is this the preferred process to follow? Just to confirm, I'll outline the steps I intend to do.

1. As I got a first pull request, I'll open a bug and ask the three magic questions. 

2. Once that is in, I'll pull the commit and merge it into my local Git repository cloned from Eclipse.org.

3. I'll amend the commit in order to mark me as the committer. This is necessary to satisfy the Eclipse.org push hook committer check.

4. While amending the commit I'll retain the original author information in order to allow automatic detection of the contribution by the IP Log tool.
Comment 11 Gunnar Wagenknecht CLA 2012-12-17 01:42:57 EST
Kevin,

(In reply to comment #6)
> If merged via the merge button the merge commit does include metadata about
> the pull request.
> 
> Otherwise no additional metadata is included in the commit message beyond
> what the author has entered.

It would be really nice to have something like Gerrit's Change-ID in the commit message to allow automatic discovery of merged pull requests. Otherwise the process seems to be broken from a GitHub's perspective.
Comment 12 Wayne Beaton CLA 2012-12-17 09:24:56 EST
(In reply to comment #10)
> Wayne, I just found this:
> http://wiki.eclipse.org/Development_Resources/Handling_Git_Contributions
> 
> Is this the preferred process to follow? Just to confirm, I'll outline the
> steps I intend to do.
> 
> 1. As I got a first pull request, I'll open a bug and ask the three magic
> questions. 
> 
> 2. Once that is in, I'll pull the commit and merge it into my local Git
> repository cloned from Eclipse.org.
> 
> 3. I'll amend the commit in order to mark me as the committer. This is
> necessary to satisfy the Eclipse.org push hook committer check.
> 
> 4. While amending the commit I'll retain the original author information in
> order to allow automatic detection of the contribution by the IP Log tool.

Correct.

The contributor needs to answer the three questions themselves, directly in Bugzilla. We need the contributor to agree to our terms of use; the best way to do this is to get them to create the eclipse.org account required to respond on Bugzilla.
Comment 13 Kevin Sawicki CLA 2012-12-17 11:40:43 EST
(In reply to comment #11)
> Kevin,
> 
> (In reply to comment #6)
> > If merged via the merge button the merge commit does include metadata about
> > the pull request.
> > 
> > Otherwise no additional metadata is included in the commit message beyond
> > what the author has entered.
> 
> It would be really nice to have something like Gerrit's Change-ID in the
> commit message to allow automatic discovery of merged pull requests.
> Otherwise the process seems to be broken from a GitHub's perspective.

The default commit message when merging pull requests includes the pull request number, repository, and branch.

For example: https://github.com/github/hubot/commit/4571d541ade89f8fccec1ff2b2c851c1ffb6eb6e
Comment 14 Mickael Istria CLA 2012-12-17 11:50:56 EST
FWIW, we almost forbid usage of the GitHub "merge" button on pull request for JBoss Tools components. It actually adds a lot of "merge" commits when we prefer rebasing and keeping history readable (mostly linear).
So in the end, we deal with merges manually, using git pull --rebase when possible, and only when it's not straightforward, we let a merge commit appear; but the merge commit also contains necessary changes, not only the merge metadata (multiple parents).

I think I would use the same process for Eclipse projects.
Comment 15 Jeremie Bresson CLA 2014-07-16 07:57:21 EDT
I just found this Bug on the MediaWiki Bugzilla:
https://bugzilla.wikimedia.org/show_bug.cgi?id=35497

It seems to me that each organization/foundation tries to do the same...
Comment 16 Jesse McConnell CLA 2014-09-09 09:07:45 EDT
What is the status on this?

https://www.eclipse.org/org/SocialCodingFAQ.php

That page has not been updated since June 2013...
Comment 17 Jesse McConnell CLA 2014-09-10 09:10:13 EDT
updated for anyone else that stumbled across this issue 

https://wiki.eclipse.org/Social_Coding/Hosting_a_Project_at_GitHub

can this be closed then?
Comment 18 Denis Roy CLA 2014-09-10 09:13:56 EDT
Jesse, I'm not sure what you're asking. Since this bug has been opened, we've allowed projects to host on GitHub, so perhaps this is no longer an issue?
Comment 19 Eclipse Genie CLA 2016-08-31 12:31:50 EDT
This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet.

If you have further information on the current state of the bug, please add it. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant.

--
The automated Eclipse Genie.
Comment 20 Wayne Beaton CLA 2016-08-31 13:36:03 EDT
(In reply to Denis Roy from comment #18)
> Jesse, I'm not sure what you're asking. Since this bug has been opened,
> we've allowed projects to host on GitHub, so perhaps this is no longer an
> issue?

I believe that we're done here. While we haven't completely disassembled all of the mirrors, we've deprecated the concept. Pull requests are pull requests.