Bug 323293

Summary: Accented characters generate incorrect user names
Product: z_Archived
Component: Dash Submission System
Status: RESOLVED FIXED
Severity: normal
Priority: P3
Version: unspecified
Target Milestone: ---
Hardware: PC
OS: Linux
Whiteboard:
Reporter: Denis Roy <denis.roy>
Assignee: Denis Roy <denis.roy>
QA Contact:
CC: anne.jacko

Description Denis Roy CLA 2010-08-20 17:02:09 EDT
The Submission System generates a user name based on the user's first and last names.  If the names contain accented/extended characters, the username becomes unusable.

For now, I'm passing the proposed username through iconv to drop the extended characters.  A newer version of PHP would likely perform a correct translation.
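
To illustrate the difference between dropping extended characters and translating them, here is a rough Python sketch. It is only an analogue of the idea, not the PHP/iconv code the Submission System actually runs; the function names `drop_extended` and `transliterate` are made up for this example.

```python
import unicodedata


def drop_extended(name: str) -> str:
    """Drop every non-ASCII character outright (lossy)."""
    return name.encode("ascii", "ignore").decode("ascii")


def transliterate(name: str) -> str:
    """Decompose accented letters, then discard the combining marks."""
    decomposed = unicodedata.normalize("NFKD", name)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Anything that still is not ASCII has no decomposition and is dropped.
    return base.encode("ascii", "ignore").decode("ascii")


print(drop_extended("Sch\u00f6pke"))   # Schpke
print(transliterate("Sch\u00f6pke"))   # Schopke
print(transliterate("Mart\u00ednez"))  # Martinez
```

Letters that have no Unicode decomposition are still dropped by this sketch, so it is not a complete transliteration.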
Comment 2 Anne Jacko CLA 2010-08-21 15:26:32 EDT
Hi Denis -- reopening because I have a question -- thanks!

I'm presuming that when someone creates a bugzilla account, there is no problem if they enter their name using an accented character -- bugzilla handles this correctly and displays the name with the accented character. The problem was how the accented character was handled by the submission system. And with this fix, a bugzilla account "human name" with an accented character will be used and displayed without that character by the submission system -- the accented character will be converted to a "plain" character.

But I'm not clear on how the sub sys will handle searches on a name with an accented character. If a user enters a name with an accented character, such as when they are searching for a person by clicking on the "Search for talks..." button, or when they are editing a submission and want to add another person via the "add new author" or "Add an assistant" options and need to search for that person to add -- what happens then?
Comment 3 Anne Jacko CLA 2010-08-21 16:01:00 EDT
Denis, not sure if this should be added to this bug, but we have some ESE 2010 submissions that need some fixing because of the accented character issue, and I'm not sure how to proceed. I started to describe the issues in the comment, but it would take a lot of typing, and I'm not really sure what is happening so my descriptions would probably just be confusing. 

I suggest that we talk about these on Tuesday, and then I'll update the bug to reflect our conversation.

Here are the submissions in question:

https://www.eclipsecon.org/submissions/ese2010/view_talk.php?id=1905
author is Norbert Schoepke, with an umlaut over the "o" in Schopke

https://www.eclipsecon.org/submissions/ese2010/view_talk.php?id=1912
https://www.eclipsecon.org/submissions/ese2010/view_talk.php?id=1931
(1931 is a duplicate of 1912 -- sort of)
author is Jabier Martinez, with an accent aigu over the "i" in Martinez

Thanks.
Comment 4 Denis Roy CLA 2010-08-23 09:59:03 EDT
> I'm presuming that when someone creates a bugzilla account, there is no problem
> if they enter their name using an accented character -- bugzilla handles this
> correctly and displays the name with the accented character. The problem was
> how the accented character was handled by the submission system. And with this
> fix, a bugzilla account "human name" with an accented character will be used and
> displayed without that character by the submission system -- the accented
> character will be converted to a "plain" character.

This bug only fixes the invalid creation of an unusable ID due to it containing an accented character.  The use case I've seen lately is where someone creates an account, creates a submission, then cannot edit their submission because their user ID is invalid.

There are other bugs open to discuss the proper display of extended characters: bug 263432

Of course, this doesn't fix existing invalid IDs.  It simply prevents new ones from being created.
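
For the IDs that are already broken, a check along these lines could flag them for manual repair. This is a hedged Python sketch, not code from the Submission System: the rule that a valid ID is lowercase ASCII letters and digits only is an assumption, and `find_invalid_ids` is a hypothetical helper.

```python
import re

# Assumed rule: a usable ID contains only lowercase ASCII letters and digits.
VALID_ID = re.compile(r"[a-z0-9]+")


def find_invalid_ids(user_ids):
    """Return the IDs that do not match the assumed valid pattern."""
    return [uid for uid in user_ids if VALID_ID.fullmatch(uid) is None]


# Example input is invented for illustration.
ids = ["ajacko", "jmart\u00ednez", "nsch\u00f6pke"]
print(find_invalid_ids(ids))  # ['jmartínez', 'nschöpke']
```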


> But I'm not clear on how the sub sys will handle searches on a name with an
> accented character.

I don't know either.  Did that work in the past?
Comment 5 Denis Roy CLA 2010-08-23 10:06:16 EDT
(In reply to comment #3)
> https://www.eclipsecon.org/submissions/ese2010/view_talk.php?id=1905
> author is Norbert Schoepke, with an umlaut over the "o" in Schopke

This is a display issue only, right?  That issue is in bug 263432. 

> 
> https://www.eclipsecon.org/submissions/ese2010/view_talk.php?id=1912
> https://www.eclipsecon.org/submissions/ese2010/view_talk.php?id=1931
> (1931 is a duplicate of 1912 -- sort of)
> author is Jabier Martinez, with an accent aigu over the "i" in Martinez

I have fixed 1912, go ahead and delete 1931.
Comment 6 Anne Jacko CLA 2010-08-23 13:26:33 EDT
(In reply to comment #5)
> (In reply to comment #3)
> > https://www.eclipsecon.org/submissions/ese2010/view_talk.php?id=1905
> > author is Norbert Schoepke, with an umlaut over the "o" in Schopke
> 
> This is a display issue only, right?  That issue is in bug 263432. 

I'm not clear on whether this submission is OK now -- I think it may be despite my inept meddling. When Norbert was having trouble associating his bugzilla account with the submission, I used the "add author without bugzilla account" feature to add him as an author. Because he was in our database, I used Norbert's email address when I added him. I later discovered that this feature is designed for adding keynotes, and probably should not be used for fixing submissions where the authors are having problems.

Because of the problem Norbert was having when he first submitted the talk, he did not get the "thanks for proposing" email. But after I added him to the talk, I created a comment on the talk to test to see if the comment would be emailed to him. It was, so I'm presuming that the submission system will now correctly send emails to him at that address. If that's the case, then this is fixed.

> 
> > 
> > https://www.eclipsecon.org/submissions/ese2010/view_talk.php?id=1912
> > https://www.eclipsecon.org/submissions/ese2010/view_talk.php?id=1931
> > (1931 is a duplicate of 1912 -- sort of)
> > author is Jabier Martinez, with an accent aigu over the "i" in Martinez
> 
> I have fixed 1912, go ahead and delete 1931.

I would love to delete 1931, but I don't know how to actually delete a submission. 

IIRC, we can make a talk go away before the Big Button is pushed by simply marking it as declined, and it eventually becomes a declined talk when the BB is pushed. If we need to make an accepted talk go away after the BB is pushed, we change its status and then push the BB again. I believe this is considered to be a "withdrawn" talk.
Comment 7 Anne Jacko CLA 2010-08-23 13:44:39 EDT
(In reply to comment #4)

> 
> This bug only fixes the invalid creation of an unusable ID due to it containing
> an accented character.  The use case I've seen lately is where someone creates
> an account, creates a submission, then cannot edit their submission because
> their user ID is invalid.

Is the ID "unusable" only in that the submission system couldn't deal with the accented character -- in other words, is it a valid ID as far as Bugzilla is concerned?

> 
> There are other bugs that are opened to discuss the proper display of extended
> characters: bug 263432
> 
> Of course, this doesn't fix existing invalid IDs.  It simply prevents new ones
> from being created.

So I guess we have to watch for problems due to this, or wait for submitters to let us know they are having problems either by emailing us or filing bugs.

> 
> 
> > But I'm not clear on how the sub sys will handle searches on a name with an
> > accented character.
> 
> I don't know either.  Did that work in the past?

I don't know for sure, but it looks as though searches may be working OK. It's just that if the accented character caused the ID to be invalid, the ID can be found, but it can't be used properly (to add an author, to upload a bio, etc.)

I'll try to do a bit more testing to figure this out.
Comment 8 Denis Roy CLA 2010-08-23 14:22:46 EDT
(In reply to comment #6)
>  When Norbert was having trouble associating his bugzilla
> account with the submission, I used the "add author without bugzilla account"
> feature to add him as an author.

Yep, let's try to avoid doing that.  The problem is fixed now, so new users won't get invalid IDs anymore.  We'll just have to 'fix' the currently broken ones.  Just send me an email if you see others.

> I would love to delete 1931, but I don't know how to actually delete a
> submission. 

Do we have an easy way of withdrawing it?  I can delete the record from the database, but I'd rather not do that if there's another way.


(In reply to comment #7)
> Is the ID "unusable" only in that the submission system  couldn't deal with the
> accented character -- in other words, it's a valid ID as far as Bugzilla is
> concerned.

Yes.  Bugzilla uses the email address as a user id, whereas we internally use an ID constructed from the name (like ajacko).  Since the accented characters were not being filtered/translated properly, and since there were encoding problems with the Sub System, the saved user IDs were unreachable.
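
As a rough illustration of building such an ID, here is a Python sketch. The scheme of first initial plus last name is inferred only from the "ajacko" example and is an assumption, as are the helper names `to_ascii` and `make_user_id`; the real Sub System code is PHP and may differ.

```python
import re
import unicodedata


def to_ascii(text: str) -> str:
    """Decompose accented letters and keep only the ASCII part."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def make_user_id(first_name: str, last_name: str) -> str:
    """Assumed scheme: first initial + last name, lowercase, a-z and 0-9 only."""
    raw = to_ascii(first_name)[:1] + to_ascii(last_name)
    return re.sub(r"[^a-z0-9]", "", raw.lower())


print(make_user_id("Anne", "Jacko"))          # ajacko
print(make_user_id("Jabier", "Mart\u00ednez"))  # jmartinez
```

Converting to ASCII before building the ID means the stored value never contains an accented character, which is the property this bug is about.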

> I don't know for sure, but it looks as though searches may be working OK. It's
> just that if the accented character caused the ID to be invalid, the ID can be
> found, but it can't be used properly (to add an author, to upload a bio, etc.)
> 
> I'll try to do a bit more testing to figure this out.

Ok, I'll close this bug as FIXED since the Sub System will not generate unusable IDs anymore.  As mentioned, if there are broken accounts, please open a separate bug, or send me an email.

If the search is broken or needs tweaking, please open a separate bug.
Comment 9 Anne Jacko CLA 2010-08-23 14:30:08 EDT
(In reply to comment #8)

> Do we have an easy way of withdrawing it?  I can delete the record from the
> database, but I'd rather not do that if there's another way.

I suggest that we not mess with the database, and just let the extra submission be declined. It looks as though Gabe will be doing some work for us, and we can ask him to document how to delete/withdraw a submission if that feature exists. 

> 
> (In reply to comment #7)
> > Is the ID "unusable" only in that the submission system  couldn't deal with the
> > accented character -- in other words, it's a valid ID as far as Bugzilla is
> > concerned.
> 
> Yes.  Bugzilla uses the email address as a user id, whereas we internally use
> an ID constructed from the name (like ajacko).

OK, that makes sense. Thanks for the explanation.
  
> 
> If the search is broken or needs tweaking, please open a separate bug.

Will do.