
Bug 353490

Summary: Virgo fails deployment of big WAR files
Product: [RT] Virgo
Reporter: Borislav Kapukaranov <b.kapukaranov>
Component: runtime
Assignee: Violeta Georgieva <milesg78>
Status: CLOSED FIXED
QA Contact:
Severity: normal
Priority: P3
CC: assia.djambazova, erdal.karaca.de, glyn.normington, hsiliev, milesg78
Version: 3.0.0.RC1
Target Milestone: ---
Hardware: PC
OS: Windows 7
Whiteboard:
Attachments:
  Description: Patch for deployment of big archives
  Flags: milesg78: iplog+, milesg78: review+

Description Borislav Kapukaranov CLA 2011-08-01 08:30:58 EDT
A WAR file of 200 MB or more, big enough to take 5-10 seconds to copy, reproduces the issue.

Currently Virgo checks new files at 1 s intervals for changes in their size in order to determine whether they are still being copied.
The issue is probably limited to Windows, as NTFS sometimes updates the file's size at intervals longer than 1 s.
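
For illustration, the size-based check described above can be sketched roughly as follows. This is not Virgo's actual watcher code; the class and method names are hypothetical. It only shows why the approach breaks when the reported size does not change between two scans even though the copy is still in progress.

```java
import java.io.File;

/** Hypothetical helper, not taken from the Virgo code base. */
final class SizeStabilityCheck {

    /** Length seen on the previous scan, or -1 before the first scan. */
    private long lastLength = -1L;

    /**
     * Call once per scan interval. Returns true only when the file length
     * is the same as on the previous scan, which the watcher then treats
     * as "copy finished". A file system that reports the same size on two
     * consecutive scans during a copy defeats this check.
     */
    boolean looksStable(File file) {
        long current = file.length();
        boolean stable = (current == lastLength);
        lastLength = current;
        return stable;
    }
}
```

A caller would keep one such object per file in the pickup directory and invoke `looksStable` on every scan.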

A better and safer solution is to try to open an InputStream on the file - if it is still being copied, this will throw a FileNotFoundException.

How to reproduce:
1. Deploy a big enough WAR (one that takes ~10 seconds to copy)
Comment 1 Glyn Normington CLA 2011-08-02 05:34:24 EDT
Please ensure the fix is tested on Mac (HFS), Linux (EXT3), and Windows (NTFS) as we suspect that some file systems might allow an input stream to be opened on a file which is being updated.
Comment 2 Hristo Iliev CLA 2011-08-02 06:05:05 EDT
Actually this is possible on NTFS with shared access flags (OF_SHARE_*):

http://msdn.microsoft.com/en-us/library/aa365430%28v=vs.85%29.aspx
Comment 3 Glyn Normington CLA 2011-08-02 06:41:56 EDT
I wonder if there is some relatively fast way of checking the completeness of a JAR download without having to start the deployment process? Think jar tvf and check the return code.
Comment 4 Borislav Kapukaranov CLA 2011-08-02 06:50:22 EDT
The tvf option is interesting. If it is only used to list the entries of the jar file, we can do this programmatically.
The fix was always meant to be in the monitoring step prior to deployment.
Comment 5 Hristo Iliev CLA 2011-09-08 11:15:36 EDT
Opening the file as a ZIP should do the trick for JAR, WAR, PAR. It appears that incomplete archives simply cannot be opened. The end of central directory record (http://en.wikipedia.org/wiki/ZIP_%28file_format%29) should provide us with a way to detect a complete archive.
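
A minimal sketch of that idea, assuming the standard java.util.zip.ZipFile class is used for the probe. The class and method names are hypothetical and this is not the patch attached to this bug. ZipFile needs the end of central directory record in order to open the archive, so a file that is cut off before that record makes the constructor throw.

```java
import java.io.File;
import java.io.IOException;
import java.util.zip.ZipFile;

/** Hypothetical helper, not taken from the Virgo code base. */
final class ArchiveProbe {

    /**
     * Returns true if the file can be opened as a ZIP archive. A truncated
     * archive has no end of central directory record yet, so opening it
     * fails with a ZipException (a subclass of IOException).
     * Note that a corrupted archive fails in the same way.
     */
    static boolean canOpenAsZip(File file) {
        try (ZipFile zip = new ZipFile(file)) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

The probe cannot tell "still being copied" from "corrupted", so on its own it would keep rejecting a broken archive forever.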

Still there will be problems for big property files or unzipped bundles.
Comment 6 Hristo Iliev CLA 2011-09-08 11:33:59 EDT
I think we should have a parameterized scan interval.

This will help both with non-ZIP based artefacts and with reducing resource consumption (opening a ZIP file every second can cost too much, especially for big files).
Comment 7 Violeta Georgieva CLA 2011-09-08 13:41:26 EDT
It would also be a good idea to parameterize the time that we will wait for a file to be copied to the file system.
Comment 8 Hristo Iliev CLA 2011-09-08 13:49:42 EDT
The thread now scans indefinitely and checks the file size. Some copy operations, however, set the size to the final value before the file is complete, hence the problem.

If we have an interval after which the artefact is no longer processed, we risk ignoring a file that has been fixed (copied / edited) in the pickup.
Comment 9 Violeta Georgieva CLA 2011-09-12 08:09:18 EDT
One drawback is that a ZIP exception might be thrown because of a corrupted file, not because it is still copying.

I tested under Windows that renaming a file to its own name always returns false while it is being copied, so my proposal is to combine the current solution with this additional check in order to solve the problem under Windows.
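
A rough sketch of the combined check, using only java.io.File.renameTo. The class and method names are hypothetical and this is not necessarily how the attached patch does it. The Javadoc for renameTo states that its behaviour is platform-dependent, so the rename result is only meaningful on Windows as reported above; on other systems the size check has to carry the decision.

```java
import java.io.File;

/** Hypothetical helper, not taken from the Virgo code base. */
final class RenameProbe {

    /**
     * Combines the existing size-based result with a rename-to-self probe.
     * sizeStable is the outcome of the size comparison between two scans.
     * The rename is only attempted when the size already looks stable.
     */
    static boolean readyForDeployment(File file, boolean sizeStable) {
        // Renaming a file to its own path leaves the file untouched; on
        // Windows it is reported to return false while a copy is running.
        return sizeStable && file.renameTo(file);
    }
}
```

Because renameTo also returns false for a file that does not exist, a file that disappears from the pickup directory between scans is simply reported as not ready.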

Wdyt?
Comment 10 Assia Djambazova CLA 2011-09-16 10:48:42 EDT
Created attachment 203491
Patch for deployment of big archives

Hello,

We tried a simple patch on Windows and it is working.
Please take a look and comment.

Regards,
Assia & Violeta
Comment 11 Erdal Karaca CLA 2011-09-16 11:17:38 EDT
What about comparing the file size between two checks an interval apart?
E.g. scan at times t0 and t1; if size(t1) - size(t0) > 0, then it is still being copied...
Just a thought.
Comment 12 Borislav Kapukaranov CLA 2011-09-16 17:43:19 EDT
Erdal, this is how Virgo currently handles large files copied into pickup.
Unfortunately that isn't enough, as on Windows the size is unchanged between the intervals; the file is actually created with its seemingly complete size right from the start, although the copy process is still ongoing.

That is why we're searching for a better solution for Windows - rename seems to be the best option here.
On *nix systems the FS updates the file size regularly enough.
Comment 13 Glyn Normington CLA 2011-09-20 10:29:42 EDT
I reviewed the rename patch and it looks plausible. I have no idea whether this will improve matters on all Windows platforms though.
Comment 14 Violeta Georgieva CLA 2011-09-27 04:46:35 EDT
(In reply to comment #13)
> I reviewed the rename patch and it looks plausible. I have no idea whether this
> will improve matters on all Windows platforms though.

I tested the solution on Windows XP and Windows 7 and it works perfectly.
Assia tested it on Windows Vista and it works there as well.

Assia, could you please confirm that you wrote 100% of the code and that you have the right to contribute the code to Eclipse?

Thanks
Violeta
Comment 15 Assia Djambazova CLA 2011-09-27 04:49:12 EDT
I confirm that I wrote 100% of the code and I have the
right to contribute the code to Eclipse.
Comment 16 Violeta Georgieva CLA 2011-09-27 05:52:32 EDT
Change is committed, tested and pushed.
Commit ID: cf8fc0b9fb2562f292de6269fa5c5ef47a6a5e46
Comment 17 Violeta Georgieva CLA 2011-09-27 14:37:16 EDT
Comment #6 -> Bug 359124 - Parameterize hot deployment's FS scan interval 
Comment #7 -> Bug 359125 - Parameterize the time that hot deployer will wait file to be copied to pickup folder