
Bug 514041

Summary: Incomplete authors at Winery and support for .mailmap for "Who's involved"
Product: Community Reporter: Oliver Kopp <oliver.kopp>
Component: Website Assignee: phoenix.ui <phoenix.ui-inbox>
Status: RESOLVED FIXED QA Contact:
Severity: normal    
Priority: P3 CC: wayne.beaton
Version: unspecified   
Target Milestone: ---   
Hardware: PC   
OS: Windows NT   
Whiteboard: stalebug

Description Oliver Kopp CLA 2017-03-22 02:30:53 EDT
When checking https://projects.eclipse.org/projects/soa.winery/who, there are authors missing. The git repository is https://github.com/eclipse/winery

When I execute

   git log --format='%aE - %aN' | sort --unique --ignore-case

the output is as follows:

agrimberg@linuxfoundation.org - Andrew Grimberg
alexander.stifel@googlemail.com - Alexander Stifel
balzer814_dev@web.de - Lukas Balzer
hueneburg.armin@gmail.com - Armin Hueneburg
kalman.kepes@iaas.uni-stuttgart.de - Kálmán Képes
keppler1github@gmail.com - Nicole Keppler
kopp.dev@gmail.com - Oliver Kopp
lharzenetter@gmx.de - Lukas Harzenetter
lv.bo163@zte.com.cn - lvbo
lv.bo163@zte.com.cn - Lvbo163
meyer.github@gmail.com - Philipp Meyer
nstadelmaier.dev@gmail.com - Niko
nstadelmaier.dev@gmail.com - Niko Stadelmaier
nyuuyn@users.noreply.github.com - nyuuyn
pascalhirmer@googlemail.com - Pascal Hirmer
sebastian.wagner@iaas.uni-stuttgart.de - Sebastian Wagner
tbz.fourtytwo@gmail.com - Tobias Binz
tstadelmaier.github@gmail.com - Tino Stadelmaier
zhao.huabing@zte.com.cn - HuabingZhao

For instance, "Nicole Keppler" is not shown on the "Who's involved" page.

Furthermore, the author names are sometimes wrong. I created a `.mailmap` file (https://git-scm.com/docs/git-blame#_mapping_authors) to fix that. Reading https://bugs.eclipse.org/bugs/show_bug.cgi?id=361611, I wonder: do I have to rewrite the Git history so that all email addresses and names are correct, or is a proper `.mailmap` file enough?
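
A minimal sketch of what a `.mailmap` entry does, assuming a throwaway repository created only for the demonstration. The function name `demo_mailmap` and the single mapping are illustrative (the name and address are taken from the listing above); the actual Winery `.mailmap` may differ. Note that `%aN` and `%aE` respect `.mailmap`, while the lowercase `%an` and `%ae` print the values recorded in the commit.

```shell
#!/bin/sh
# Sketch: show how one .mailmap entry collapses two author names that
# share an email address. Runs entirely in a temporary repository.
demo_mailmap() {
  repo=$(mktemp -d) || return 1
  (
    cd "$repo" || exit 1
    git init -q .
    # Two commits from the same address under different names.
    git -c user.name='Niko' -c user.email='nstadelmaier.dev@gmail.com' \
      commit -q --allow-empty -m 'first'
    git -c user.name='Niko Stadelmaier' -c user.email='nstadelmaier.dev@gmail.com' \
      commit -q --allow-empty -m 'second'
    # "Proper Name <commit email>" maps every name used with that email.
    # (Illustrative entry, not copied from the Winery repository.)
    printf '%s\n' 'Niko Stadelmaier <nstadelmaier.dev@gmail.com>' > .mailmap
    echo 'raw (%an):'
    git log --format='%ae - %an' | sort -u
    echo 'mapped (%aN):'
    git log --format='%aE - %aN' | sort -u
  )
  rm -rf "$repo"
}
```

Calling `demo_mailmap` prints two distinct names in the raw section and a single name in the mapped section, without any rewrite of the history.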
Comment 1 Oliver Kopp CLA 2017-03-22 02:31:35 EDT
I am aware that JGit does not support .mailmap: https://bugs.eclipse.org/bugs/show_bug.cgi?id=458616
Comment 2 Wayne Beaton CLA 2017-03-22 12:30:16 EDT
(In reply to Oliver Kopp from comment #0)
> For instance, "Nicole Keppler" is not shown on the "Who's involved" page.

Nicole's last commit was on December 19, 2016. The chart only shows those who have committed in the last three months.
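
To check locally who falls inside such a window, something like the following may help. It is only an approximation: the harvesting script itself is not shown in this bug, so the exact cut-off it applies is an assumption, and `recent_authors` is a hypothetical helper name. Note that `--since` filters on the committer date, not the author date.

```shell
#!/bin/sh
# Sketch: list distinct authors with commits in roughly the last three months.
# Run inside a clone of the repository. Assumes the chart's window behaves
# like git's "--since"; the real script may compute it differently.
recent_authors() {
  git log --since='3 months ago' --format='%aE - %aN' | sort -u -f
}
```

An author whose latest commit is older than the window, as in Nicole's case, does not appear in the output.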

> Furthermore, the author names are wrong sometimes. I created a `.mailmap`
> file (https://git-scm.com/docs/git-blame#_mapping_authors) to fix it.
> Reading https://bugs.eclipse.org/bugs/show_bug.cgi?id=361611, I wonder
> whether I have to rewrite the git history so that all email addresses and
> Names are correct or is a proper `.mailmap` file enough?

I hadn't thought of using a mail map. For committers, we can record multiple email addresses and we always map them to the entry in the foundation database. For non-committers, we don't have that ability and take the name from the commit record (if they have multiple names, the MySQL "group by email" picks one).
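
The addresses affected by that "picks one" behaviour can be found ahead of time. The following is a sketch, not part of the harvesting script: `emails_with_multiple_names` is a hypothetical helper that reads `email - name` lines (the format used in the description) and reports every address that occurs with more than one name.

```shell
#!/bin/sh
# Sketch: report email addresses that appear with more than one author name.
# Input: "email - name" lines on stdin. LC_ALL=C keeps the ordering stable.
emails_with_multiple_names() {
  LC_ALL=C sort -u | awk -F ' - ' '
    {
      # Everything after the first " - " is the name.
      name = substr($0, length($1) + 4)
      count[$1]++
      names[$1] = (names[$1] != "") ? (names[$1] ", " name) : name
    }
    END {
      for (e in count)
        if (count[e] > 1) print e ": " names[e]
    }
  ' | LC_ALL=C sort
}
```

Typical use would be `git log --format='%ae - %an' | emails_with_multiple_names`, with the lowercase placeholders so that the names are reported as recorded in the commits, before any `.mailmap` is applied.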

A .mailmap should give us the right level of control.

(In reply to Oliver Kopp from comment #1)
> I am aware that JGit does not support .mailmap:
> https://bugs.eclipse.org/bugs/show_bug.cgi?id=458616

The script that harvests this information is PHP based and uses CLI Git. 

Let's try the mailmap to see if it solves the problem. Let me know when you have that in place, and I'll get the script to do a complete rebuild.
Comment 3 Oliver Kopp CLA 2017-04-26 03:52:32 EDT
(In reply to Wayne Beaton from comment #2)

> Let's try the mailmap to see if it solves the problem.

Yeah :)

> Let me know when you
> have that in place, and I'll get the script to do a complete rebuild.

I added two more mappings and it is ready now: https://github.com/eclipse/winery/blob/master/.mailmap
Comment 4 Oliver Kopp CLA 2017-07-14 16:42:06 EDT
I also noticed that this has impact on the IP log - https://projects.eclipse.org/projects/soa.winery/iplog/preview
Comment 5 Wayne Beaton CLA 2017-07-17 11:00:47 EDT
(In reply to Oliver Kopp from comment #4)
> I also noticed that this has impact on the IP log -
> https://projects.eclipse.org/projects/soa.winery/iplog/preview

I have initiated a complete rebuild of the index.
Comment 6 Wayne Beaton CLA 2017-07-18 10:56:35 EDT
(In reply to Wayne Beaton from comment #5)
> I have initiated a complete rebuild of the index.

How does it look now?
Comment 7 Oliver Kopp CLA 2017-07-19 03:43:11 EDT
Is "A problem has occurred (Unknown error). Please contact EMO for assistance." something bad?

The current output I see below has the following issues:

- "HuabingZhao" has no space, even if configured in .mailmap
- 吕波 is not used (should we support Chinese characters?)

Minor thing: Bugzilla issues are linked, but GitHub pull requests are not (e.g., "fix css buttons (#26)"). Should I open another issue?
Comment 8 Wayne Beaton CLA 2017-07-19 23:58:49 EDT
(In reply to Oliver Kopp from comment #7)
> Minor thing: bugzilla issues are linked, but not GitHub-Pullrequest (e.g., "
> fix css buttons (#26)"). Should I open another issue?

The Bugzilla issues are legacy at this point and the use of Bugzilla to track contributions is deprecated. In the old (CVS) days, we had no other means of recording author information (only the actual committer credentials were represented) so we added the iplog flag to bugs and attachments.

Each Git commit should (must) record the author information; pull requests should not contain any additional information.
Comment 9 Oliver Kopp CLA 2017-08-10 20:53:14 EDT
I get no error displayed at https://projects.eclipse.org/projects/soa.winery/iplog/preview, but the names there are not formatted using the information from .mailmap.

However, the output at https://projects.eclipse.org/projects/soa.winery/who looks fine.

Regarding the PR links: Sometimes, I check pull requests for more documentation on the ideas and motivation behind a change. I know that Architectural Decision Records (https://adr.github.io/) are the way to go, but until we have those, the PRs are nice.
Comment 10 Eclipse Genie CLA 2019-08-01 07:21:13 EDT
This bug hasn't had any activity in quite some time. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet.

If you have further information on the current state of the bug, please add it. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant.

--
The automated Eclipse Genie.
Comment 11 Wayne Beaton CLA 2019-08-01 09:22:17 EDT
(In reply to Oliver Kopp from comment #9)
> I get no error displayed at
> https://projects.eclipse.org/projects/soa.winery/iplog/preview, but the
> names there are not formatted using the information of .mailmap. 

We take the email address out of the Git commit and map it to a committer in our database. The name that gets displayed is from the information provided to us by the committer (and stored in our database). Effectively, the actual name in the Git commit is ignored for committers or anybody else that we can map to in our database when we generate the IP Log. In cases (generally only when a commit predates our use of CLAs/ECAs) where the information is not in our database, we'll use the name right off the commit.
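
The lookup order described above can be illustrated with a toy sketch. It is not the actual implementation (which, per comment 2, is PHP backed by the Foundation database): `resolve_name` is a hypothetical helper, and the tab-separated file stands in for the database, with one `email<TAB>name` record per line.

```shell
#!/bin/sh
# Sketch: pick the display name for a commit author.
# $1 = email from the commit, $2 = name from the commit,
# $3 = hypothetical TSV file standing in for the database.
resolve_name() {
  email=$1 commit_name=$2 db=$3
  awk -F '\t' -v email="$email" -v fallback="$commit_name" '
    # A database match wins; the name in the commit is ignored.
    tolower($1) == tolower(email) { print $2; found = 1; exit }
    # No match: use the name right off the commit.
    END { if (!found) print fallback }
  ' "$db"
}
```

With this ordering, a `.mailmap` entry can only affect authors who have no database record, which would be consistent with the IP log not changing for known contributors.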

> However, the output at https://projects.eclipse.org/projects/soa.winery/who
> looks fine.

The "Committers" section comes directly from our database; it does not use the Git commit record in any way. The charts use the same mechanism as described above.

AFAICT, this is working as expected. Since we actually did make some changes, I'm going to mark this as FIXED.