Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.
Bug 120523 - [Commands] parameters: support serializing and deserializing of ParameterizedCommands
Summary: [Commands] parameters: support serializing and deserializing of Parameterized...
Status: VERIFIED FIXED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: UI (show other bugs)
Version: 3.2   Edit
Hardware: All All
: P3 enhancement (vote)
Target Milestone: 3.2 M5   Edit
Assignee: Douglas Pollock CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-12-13 03:33 EST by Christopher Daly CLA
Modified: 2006-02-14 14:13 EST (History)
1 user (show)

See Also:


Attachments
patch for ParameterizedCommand serialization and deserialization (22.11 KB, patch)
2005-12-13 03:34 EST, Christopher Daly CLA
no flags Details | Diff
JUnit test patch for ParameterizedCommand serialization and deserialization (13.59 KB, patch)
2005-12-13 03:35 EST, Christopher Daly CLA
no flags Details | Diff
new patch (based on dpollock_Linkability) (30.51 KB, patch)
2005-12-20 02:20 EST, Christopher Daly CLA
no flags Details | Diff
one more rev of the patch (31.10 KB, patch)
2005-12-20 02:26 EST, Christopher Daly CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Christopher Daly CLA 2005-12-13 03:33:02 EST
UA wants to be able to store a ParameterizedCommand as a String in documents like Help, Cheatsheets and Intro parts.  The stored commands will be shown as links and when the user clicks the link we will deserialize from String back to ParameterizedCommand and execute it.

The informational content of a ParameterizedCommand amounts to the command id and the ids and values of each of the parameters.  We'd like to propose a standard serialization format and offer API for (de)serializing so that others can use it as well as us.

The serialization format we have looks like this:

  commandId?param1Id=value1&param2Id=value2

Breaking it up, looks like this:

  commandId '?' param1Id '=' value1 '&' param2Id '=' value2 ...

The commandId, paramIds and values are UTF-8 encoded to allow arbitrary characters to appear in them.

I'm going to attach 2 patches for this.  The first patch implements this functionality. It has changes to org.eclipse.core.commands and org.eclipse.ui.workbench

The second patch is some JUnit tests to add to org.eclipse.ui.tests.
Comment 1 Christopher Daly CLA 2005-12-13 03:34:35 EST
Created attachment 31618 [details]
patch for ParameterizedCommand serialization and deserialization
Comment 2 Christopher Daly CLA 2005-12-13 03:35:18 EST
Created attachment 31619 [details]
JUnit test patch for ParameterizedCommand serialization and deserialization
Comment 3 Christopher Daly CLA 2005-12-13 03:38:34 EST
By the way, the patch also fixes bug 120150.  I'll note that on bug 120150.
Comment 4 Dejan Glozic CLA 2005-12-15 15:00:34 EST
> The serialization format we have looks like this:
>   commandId?param1Id=value1&param2Id=value2

For the sake of consistency, we should also toss in the schema:

command:commandId?p1=v2&p2=v2...

I addition, I would rather encode the serialization format using URL encoding/decoding rules (see java.net.URLEncoder and java.net.URLDecoder), rather than UTF-8.
Comment 5 Christopher Daly CLA 2005-12-15 15:28:13 EST
regarding command #4,

Prepending a scheme like "command:" might be useful/desirable to some clients and not to others.  Also it doesn't convey any information that CommandManager actually needs to deserialize the string into a ParameterizedCommand.

So what I've done is provide a "URI-friendly" format.  Clients can turn it in to a well-formed URI by prepending "command:" or they can use a different prefix, like "http://...", or they can use the strings as-is if they are not concerned about URIs.


About URLDecoder/URLEncoder - I agree.  Actually my initial comment about this wasn't very clear.  What I meant to say is:

The commandId, paramIds and values are URL encoded (see java.net.URLEncoder) with the UTF-8 encoding to allow arbitrary characters to appear in them.
Comment 6 Douglas Pollock CLA 2005-12-15 16:49:20 EST
I have been looking at the patch a bit.  There were some minor things (e.g., javadoc style), but one major problem:
    URLEncoder.encode(String,String) is @since 1.4
    URLDecoder.decode(String,String) is @since 1.4

Also, Exception(String,Throwable) is @since 1.4 as well (I've corrected this problem).
Comment 7 Dejan Glozic CLA 2005-12-15 17:03:17 EST
Doug, it is possible to use URLEncoder.encode(String). It is deprecated but is as old as the class itself (JDK 1.0), so we can use it and just ignore the deprecation warning. The same for URLDecoder.decode(String).
Comment 8 Christopher Daly CLA 2005-12-15 17:44:53 EST
The deprecated methods use the "the platform's default encoding".  could this vary between platforms (e.g. Linux vs. windows) such that strings produced on one would be broken on the other?  URLEncoder looks at the "file.encoding" property.  My (windows) eclipse configuration says:

file.encoding=Cp1252

I guess one possibility is to implement a URL encoder/decoder.  EMF has a URI class that (I think) has the guts of what we would need.  Maybe there are other implementations somewhere in Eclipse (WTP?).
Comment 9 Dejan Glozic CLA 2005-12-15 18:05:43 EST
Update has URLEncoder but not URLDecoder.

As a meta-comment, why would we limit ourselves to 1.3 in the UI? I know that some very low-level plug-ins do.
Comment 10 Douglas Pollock CLA 2005-12-16 10:02:48 EST
(In reply to comment #8)
> The deprecated methods use the "the platform's default encoding".  could
> this vary between platforms (e.g. Linux vs. windows) such that strings
> produced on one would be broken on the other?

That's a big yes.  Plus, since this is intended for web browsers, using something other than a UTF-8 string could lead to badness.  Also see the deprecation comment:

    Deprecated. The resulting string may vary depending
    on the platform's default encoding. Instead, use the 
    encode(String,String) method to specify the encoding.
Comment 11 Douglas Pollock CLA 2005-12-16 10:07:17 EST
(In reply to comment #9)
> As a meta-comment, why would we limit ourselves to 1.3 in the UI? I know
> that some very low-level plug-ins do.

http://www.eclipse.org/eclipse/development/eclipse_project_plan_3_2.html
Take a look at the appendix.  Everything at the RCP level or below needs to be foundation-compatible.  IIRC, there are clients using Eclipse RCP in foundation-only environments.  Even if that weren't the case, "org.eclipse.core.commands" is exceptionally low-level.  It is designed to run in headless environments, has no dependencies on OSGi ... and even JFace (on which stand-alone applications can be built) depends on it.

And then the philosophical argument: why would we restrict the environments on which we can run?  More supported environments can lead to more users and more clients.
Comment 12 Dejan Glozic CLA 2005-12-16 11:33:18 EST
No, I understand the rational to be foundation-compatible. Unfortunately, it will have a consequence of us 'reinventing' URL encoding and decoding. Sigh...
We can borrow from other Eclipse teams (e.g. EMF).
Comment 13 Douglas Pollock CLA 2005-12-16 11:35:59 EST
(In reply to comment #12)
> No, I understand the rational to be foundation-compatible. Unfortunately, it
> will have a consequence of us 'reinventing' URL encoding and decoding.

Yeah, I'm sorry about that.
Comment 14 Dejan Glozic CLA 2005-12-16 11:37:54 EST
So, appart from the encoding/decoding that we can deal with, do you suggest any changes to the patch? It would be great if we could get it in fairly soon so that we can start adding linkability support in the UA around this new code.
Comment 15 Douglas Pollock CLA 2005-12-16 11:46:52 EST
It's the only thing keeping my from committing this patch.  All the other changes were minor.  If you can promise that it will be ready sometime soon (today or Monday at the latest), then I can commit what I have to a branch.  And then you can make the required changes based on that branch.  (It would make my life easier)
Comment 16 Douglas Pollock CLA 2005-12-16 12:35:16 EST
From Chris:
I'm not sure I can promise that.  I haven't looked closely at the other 
encoding stuff, like EMF.  But my first impression is that it's not 
exactly "url encoding with UTF-8".  Would another encoding be ok as long 
as it's consistent on all platforms?

Another random idea: I could try to move the encoding part up to 
org.eclipse.ui.workbench, so that (for now) it only works for workbench 
clients and later.  I thought about this some and felt like I would end up 
needing a getCommandManager() method on Command.  Is there any reason why 
that doesn't exist today?

Could you create another patch with your changes so I can start from that 
(rathar than creating a branch)?
Comment 17 Douglas Pollock CLA 2005-12-16 12:47:36 EST
(In reply to comment #16)
> I'm not sure I can promise that.  I haven't looked closely at the other 
> encoding stuff, like EMF.  But my first impression is that it's not 
> exactly "url encoding with UTF-8".  Would another encoding be ok as long 
> as it's consistent on all platforms?

Well, from my perspective, yes.  At the level of commands, all I want is a cross-platform, cross-locale invertible function.  At your level, you may have some grief if it's not UTF-8.  You would need to test all your browsers with non-UTF-8 strings, to make sure they work.

Another approach might be to simply append all the data together.  At your layer (where you can use non-foundation libraries), you could then UTF-8 encode the string.

So, commands would return a plain UTF-16 string:
    .../commandId?parameter=value with space&...

And then at your layer you would do the UTF-8 URI encoding:
    .../commandId?parameter=value%20with%20space&...

This is not ideal.  But, if writing a UTF-8 URIs encoder/decoder is too much work, then this might be the best compromise.


> Another random idea: I could try to move the encoding part up to 
> org.eclipse.ui.workbench, so that (for now) it only works for workbench 
> clients and later.  I thought about this some and felt like I would end up 
> needing a getCommandManager() method on Command.  Is there any reason why 
> that doesn't exist today?

The link between CommandManager and Command is one-way for style reasons.  I would be against making it a two-way link unless we really needed to.  I'm worried that it would muddy the API from the perspective of clients.


> Could you create another patch with your changes so I can start from that 
> (rathar than creating a branch)?

Yes, I'll do so after 3.2 M4 becomes available.
Comment 18 Christopher Daly CLA 2005-12-16 13:41:03 EST
(In reply to comment #17)
> Another approach might be to simply append all the data together.  At your
> layer (where you can use non-foundation libraries), you could then UTF-8 encode
> the string.
> 
> So, commands would return a plain UTF-16 string:
>     .../commandId?parameter=value with space&...

This could work provided commands will encode and decode the separator characters '?', '=' and '&' if they appear in ids and values.

However, if we are going to do that, I think we should change the separator characters.  I wouldn't want to make these strings look like URI suffixes if they are not well-formed.

Maybe something like this:

  command.id '(' param1.id '=' value1 ',' param2.id '=' value2 ')'

But if given the choice I'd prefer to use UTF-8 and URL encoding - so let me look into that a little more and try to figure out how much work it will be.
Comment 19 Christopher Daly CLA 2005-12-16 14:56:29 EST
For my reference - if I need to implement UTF-8 url encoding - here are some relevant links:

URL encoding:
http://en.wikipedia.org/wiki/URL_encoding
http://www.ietf.org/rfc/rfc3986.txt

UTF-8:
http://en.wikipedia.org/wiki/UTF-8
http://www.ietf.org/rfc/rfc3629.txt
Comment 20 Christopher Daly CLA 2005-12-16 15:34:48 EST
The more I think about this, the more i'd like to go with something like Doug and I are discussing in comment #17 and comment #18.  That is, a very simple encoding that just escapes the separator characters.

The UTF-8 and URL encoding stuff looks sort of easy to hack something together that mostly works, but maybe hard to cover all of the corner cases, and I'm reluctant to do yet-another implementation of this stuff when there are good ones out there.

Here's another idea:  We can describe and use our simple encoding now, and in the future - if someone really needs a different encoding - we could add serialize() and deserialize() methods that take an extra "encoding" parameter.  e.g:

  class ParameterizedCommand {
    public final String serialize(String encoding) // ...
  }

null would be ok and would mean our default, simple encoding.  Additional encodings would be described and labeled with some string identifier for clients to pass in.

Doug and Dejan: does this sound ok to you?  That is, do the simple separator encoding now and overload serialize() and deserialize() in the future if we need to support other encodings.
Comment 21 Dejan Glozic CLA 2005-12-17 17:48:27 EST
+1

I would rather extend ParametrizedCommand and override 'seralize' by using URLencoder.encode(String, String)
Comment 22 Douglas Pollock CLA 2005-12-19 12:42:39 EST
The work in progress is committed under the dpollock_Linkability branch in the following projects:
    org.eclipse.core.commands
    org.eclipse.ui.tests
    org.eclipse.ui.workbench

Please try to make any future patches based on this branch.  (It will make my life easier.)
Comment 23 Christopher Daly CLA 2005-12-20 02:20:43 EST
Created attachment 32012 [details]
new patch (based on dpollock_Linkability)

This new patch implements a simple escaping mechanism for the serialization so URLEncoder and URLDecoder are no longer used.

Some notes:
I made the escaping mechanism public. See ParameterizedCommand.escape() and CommandManager.unescape().  If you feel strongly that these shouldn't be public (or at least not to start) I wouldn't argue, so just change it.

Can you look again at SerializationException?  You added getCause(), but I think this was unnecessary.  I understand the JDK-level concern, but SerializationException extends CommandException and I think the problem was already addressed there.

Also SerializationException now only gets thrown by the "deserialize()" method.  Maybe this is wierd - should it be DeserializationException?  Or should we leave it as-is in case we add additional serialize() goop in the future and then we have (as before) one single exception for serialize/deserialize situations. Whatever you think is best...
Comment 24 Christopher Daly CLA 2005-12-20 02:26:02 EST
Created attachment 32013 [details]
one more rev of the patch

oops I had noticed this before - but forgot to fix it: I missed a copyright notice on the JUnit test case class.  The only difference between this patch and the one I posted 5 minutes ago is I just added that copyright notice.
Comment 25 Douglas Pollock CLA 2005-12-20 10:31:11 EST
I switched the utility methods to private.  You're right about the SerializationException, so I've fixed that up.  I think it's fine to leave the same as is.

I have extract the separator characters into constants.

Why the choice of '%' as a separator character?  Normally, I would have expected a '\'.  Does this make it easier for you guys to do the conversion into UTF-8 URI encoding later?
Comment 26 Christopher Daly CLA 2005-12-20 12:06:40 EST
I'd rathar not use '\' because it will have to be escaped again in Java string literals. So you would have "\\\\" to represent a single '\' character.

I chose '%' because it looked like a good escape character, but for no better reason than that. Of course it will have to be escaped with %25 in URLs, so maybe that will be annoying. Maybe Dejan has an opinion.  If you want to change it. I'd suggest '~' as an alternative.
Comment 27 Dejan Glozic CLA 2005-12-20 13:32:10 EST
Sorry, I am not tracking the patches to I cannot give you intelligent answer. However, for compatibility with URLs and to be able to process a URL that is already encoded, I would prefer if we are compatible with URL encoding/decoding (hence the % escape character). 
Comment 28 Christopher Daly CLA 2005-12-20 13:53:10 EST
Hi Doug,

I just checked your changes in the branch.  It looks good except the comments for PARAMETER_END_CHAR and PARAMETER_START_CHAR are backwards:

/**
 * The character that indicates the >>>start<<< of a list of parameters.
 */
static final char PARAMETER_END_CHAR = ')';
Comment 29 Douglas Pollock CLA 2005-12-20 16:05:34 EST
(In reply to comment #26)
> I'd rathar not use '\' because it will have to be escaped again in Java
> string literals. So you would have "\\\\" to represent a single '\'
> character.

Makes sense.  Oh, and I fixed the javadoc.  I'm going to close this bug as fixed now, as I don't believe there are any outstanding issues.
Comment 30 Douglas Pollock CLA 2006-02-14 13:05:06 EST
Moving to verified, as I believe Chris has been using this code for a while now.  There are also test suites for this code.  Please open a separate bug if there are any issues.
Comment 31 Christopher Daly CLA 2006-02-14 14:13:40 EST
I have been using this, and it's been working well.  For more on how we're using this support, see bug 123921.