Bug 235572

Summary: [relengtool] "Fix copyrights" doesn't detect existing Copyright in Windows .bat files
Product: [Eclipse Project] Platform Reporter: Martin Oberhuber <mober.at+eclipse>
Component: Releng Assignee: Kim Moir <kim.moir>
Status: RESOLVED FIXED QA Contact:
Severity: major    
Priority: P3 CC: bokowski, carolynmacleod4, jean-michel_lemieux, john.arthorne, Michael.Valenta, mober.at+eclipse, schacher
Version: 3.4   
Target Milestone: 3.5.1   
Hardware: PC   
OS: Windows XP   
Whiteboard:
Bug Depends on:    
Bug Blocks: 234872    
Attachments:
Description Flags
Simple patch fixing the issue none
Improved patch with pattern matching none

Description Martin Oberhuber CLA 2008-06-04 08:02:46 EDT
+++ This bug was initially created as a clone of Bug #234872 +++

My batchfile has this at its top:

@REM *************************************************************************
@REM # Copyright (c) 2006, 2007 Wind River Systems, Inc.
@REM # All rights reserved. This program and the accompanying materials 

When running "Fix Copyrights..." on the file, I get an IBM copyright prepended to my original Copyright. This is incorrect and a major limitation in functionality, since it means that I need to manually review and undo all such changes.
Comment 1 Michael Valenta CLA 2008-06-04 09:12:57 EDT
Martin, I agree that this is a major limitation of the current tool. This tool is not really owned by any one group, and traditionally users have modified it to meet their needs when they have encountered errors (i.e. there isn't anyone performing maintenance on the tool). We recently used the tool for Jazz copyrights and found that the Advanced Fix Copyright action worked better, as it allows you to specify a template comment in a preference page (i.e. our copyrights are not EPL). You may find that works better for you. If not, you may want to consider modifying the tool yourself (i.e. you can decide whether it is more work to modify the tool or to correct all the mistakes).
Comment 2 Martin Oberhuber CLA 2008-06-04 09:51:50 EDT
Thanks Michael. I guess I'm doing all that you said already :-) 

The problem is just that at the time when I'm using the tool (release endgame), I'm so swamped with other stuff that I don't find time to fix the tool. And it looks like others are in the same situation :-)

So I thought I'd report the bug for now, and once Ganymede is out the door someone (perhaps including myself) can take a stab at improving the tool.

Thanks,
Martin
Comment 3 Martin Oberhuber CLA 2009-05-29 13:40:23 EDT
I found where this can be fixed (it's actually pretty simple), and think I can provide a patch for 3.5.1 if people like.
Comment 4 Martin Oberhuber CLA 2009-05-29 14:16:09 EDT
Created attachment 137691 [details]
Simple patch fixing the issue

Attached simple patch fixes the issue, by making detection of block comments slightly more configurable. Tested and works fine for me.
Comment 5 Martin Oberhuber CLA 2009-05-29 14:32:33 EDT
In fact the fix was so simple that I made it right away - see attached.

Does anybody care to push this into 3.5? I'd personally rather see it in 3.5.1, but on the other hand people need to update their copyrights NOW, and they will need an updated version of the tool to get the best results the tool can deliver.

One compromise might be committing the patch into HEAD but not releasing it; then people can get the tool from an N-build.

Note that the patch only fixes
   rem ****
style comments; the problem persists for
   rem ----
   rem ####
etc. These are slightly harder to fix: once we allow such variations of existing comments, the comment end can no longer be detected as easily as it is today. In the worst case (when an existing copyright comment is not detected), an additional new comment is inserted although an existing one is in the file already. A similar problem exists for shell scripts and .properties files, i.e. any language that, unlike C, Java or JavaScript, doesn't really have a concept of block comments.

I guess a "proper" fix would treat a line as the start of a block comment if it matches a pattern that begins with the comment marker and is followed by a long run of the same repeated character. The comment end would then be detected by checking for a repetition of that same character. This regex might do the job (untested):

// rem, whitespace, and any non-word character repeated till EOL
Pattern p = Pattern.compile("@?rem\\s+\\W{2,}");
return p.matcher(aLine.trim().toLowerCase()).matches();

Storing away the characters that can then end the block comment should not be too hard.
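
For illustration only, here is a minimal sketch of the idea described above. It is not the attached patch. The class and method names (BatchBlockComment, borderChar, findBlockEnd) are hypothetical. Unlike the untested regex above, it uses a backreference so that only a run of one and the same character counts as a border, and it returns that character so the matching end line can be found.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the releng tool: detects "rem ****" style
// border lines in batch files and finds the line that closes the block.
public class BatchBlockComment {
    // Optional '@', "rem", whitespace, then one non-word, non-space character
    // (group 1) repeated at least once more, up to the end of the line.
    private static final Pattern BORDER =
            Pattern.compile("@?rem\\s+([^\\w\\s])\\1+\\s*", Pattern.CASE_INSENSITIVE);

    /** Returns the repeated border character, or '\0' if the line is no border. */
    static char borderChar(String line) {
        Matcher m = BORDER.matcher(line.trim());
        return m.matches() ? m.group(1).charAt(0) : '\0';
    }

    /** Returns the index of the line closing the block opened at start, or -1. */
    static int findBlockEnd(String[] lines, int start) {
        char open = borderChar(lines[start]);
        if (open == '\0') {
            return -1; // the start line is not a border line
        }
        for (int i = start + 1; i < lines.length; i++) {
            if (borderChar(lines[i]) == open) {
                return i; // same border character seen again: block ends here
            }
        }
        return -1; // block is never closed
    }

    public static void main(String[] args) {
        String[] lines = {
            "@REM ****",
            "@REM # Copyright (c) 2006, 2007 Wind River Systems, Inc.",
            "@REM ****",
            "echo hello"
        };
        System.out.println(findBlockEnd(lines, 0)); // prints 2
    }
}
```

With this sketch, "rem ----" and "rem ####" are recognized as borders as well, while a line such as "@rem foo -+-+-+-+-+" is not.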
Comment 6 Martin Oberhuber CLA 2009-05-29 15:13:35 EDT
Created attachment 137702 [details]
Improved patch with pattern matching

Knowing what could be done, I just couldn't let go of it :)

So here is an improved patch that actually does pattern matching. I think that's pretty much perfect now, unless you have real funny comments like this:

  @rem foo -+-+-+-+-+

At the same time, the patch adds support for perl (*.pl) and tcl (*.tcl) which I happened to have in my workspace. Enjoy :)
Comment 7 Kim Moir CLA 2009-08-12 14:32:25 EDT
Patch released for 3.5.1 and 3.6 stream builds.