| Summary: | IFile#setContents eats I/O Exception on close => potential file corruption | | |
|---|---|---|---|
| Product: | [Eclipse Project] Platform | Reporter: | James Blackburn <jamesblackburn+eclipse> |
| Component: | Resources | Assignee: | James Blackburn <jamesblackburn+eclipse> |
| Status: | VERIFIED FIXED | QA Contact: | |
| Severity: | normal | | |
| Priority: | P3 | CC: | john.arthorne, Szymon.Brandys |
| Version: | 3.7 | Flags: | Szymon.Brandys: review+, john.arthorne: review+ |
| Target Milestone: | 3.7 M5 | | |
| Hardware: | PC | | |
| OS: | Linux-GTK | | |
| Whiteboard: | | | |
| Bug Depends on: | | | |
| Bug Blocks: | 334081, 334085 | | |
| Attachments: | | | |
Description
James Blackburn
FileUtil#transferStreams copies bytes from the input stream to the file output stream. At the end of this it calls #safeClose, which swallows any IOException thrown by close().
As a result, in Eclipse the editor loses its dirty flag and believes that the #setContents has completed successfully.
On our NetApp NFS connected hosts, we see that if the file is opened and locked by a windows app, the close() can fail. (In this case it's worse than this: instead of denying the write, the filer somehow allows the file to become truncated...)
strace of a simple text editor shows:
17246 16:18:26 lstat("/home/jamesb/foo.txt", {st_mode=S_IFREG|0644, st_size=5, ...}) = 0
17246 16:18:26 open("/home/jamesb/foo.txt", O_WRONLY|O_CREAT|O_TRUNC, 0600) = 3
17246 16:18:26 fcntl(3, F_GETFL) = 0x8001 (flags O_WRONLY|O_LARGEFILE)
17246 16:18:26 fstat(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
17246 16:18:26 mmap(NULL, 32768, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2a9852f000
17246 16:18:26 lseek(3, 0, SEEK_CUR) = 0
17246 16:18:26 write(3, "asdf\nasdf2\n", 11) = 11
17246 16:18:26 close(3) = -1 EACCES (Permission denied)
man 2 close says:

    Not checking the return value of close is a common but nevertheless
    serious programming error. It is quite possible that errors on a
    previous write(2) operation are first reported at the final close.
    Not checking the return value when closing the file may lead to
    silent loss of data. This can especially be observed with NFS and
    disk quotas.

    A successful close does not guarantee that the data has been
    successfully saved to disk, as the kernel defers writes. It is not
    common for a filesystem to flush the buffers when the stream is
    closed. If you need to be sure that the data is physically stored
    use fsync(2). (It will depend on the disk hardware at this point.)
It seems wrong to ignore any exceptions on close(). Is there a good reason for doing this?
It is done this way because the file is closed in a finally block. Any exception thrown from a finally block will cause previous exceptions leading up to the finally block to be lost. So, the finally block can't throw an exception. It could be coded to handle the exception later, but the code is quite ugly:
IOException closeFailure = null;
try {
    f.write();
} catch (IOException e) {
    //handle exception
} finally {
    try {
        f.close();
    } catch (IOException e) {
        closeFailure = e;
    }
}
if (closeFailure != null) {
    //handle close failure
}
As an exercise for the reader, try doing the same thing for two streams that need closing! In the end, you can only throw one exception from the method so either the read/write failure or the close failure can get lost. I guess this is the reason for the proposed try-with-resources in Java 7, which automatically figures out which exception should be propagated and which should be swallowed.
Our general experience was that exceptions on close only happened in practice following an exception on read or write. This case where the read and write succeed but the close fails is not something I've seen before. Perhaps it's unique to network mount scenarios...
In any case, safeClose is used to avoid the complexity of handling close failures while ensuring the original exception is never lost. You're welcome to try writing it in a way that also handles the close failure in the case where the read/write succeeded.
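The try-with-resources behaviour mentioned above can be seen directly. This is a minimal sketch, not code from the patch; `FailingStream` is a hypothetical stream that fails on both write and close, to show that Java 7+ propagates the write failure as the primary exception and attaches the close failure as a suppressed one:

```java
import java.io.IOException;
import java.io.OutputStream;

public class TwrDemo {
    // Hypothetical stream that fails on both write and close.
    static class FailingStream extends OutputStream {
        @Override public void write(int b) throws IOException {
            throw new IOException("write failed");
        }
        @Override public void close() throws IOException {
            throw new IOException("close failed");
        }
    }

    // Returns "primary / suppressed: ..." so the behaviour is observable.
    static String describeFailure() {
        try (OutputStream out = new FailingStream()) {
            out.write(1);
            return "no exception";
        } catch (IOException e) {
            StringBuilder sb = new StringBuilder(e.getMessage());
            for (Throwable s : e.getSuppressed())
                sb.append(" / suppressed: ").append(s.getMessage());
            return sb.toString();
        }
    }

    public static void main(String[] args) {
        // Prints: write failed / suppressed: close failed
        System.out.println(describeFailure());
    }
}
```

With this mechanism neither exception is lost, which is exactly the bookkeeping the hand-written `closeFailure` variant above has to do manually.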
Thanks John. I had a quick look at fixing this and it's not too horrible. Basically safeClose should continue to be called in the finally's to ensure file descriptors aren't leaked. However, importantly, close should explicitly be called when we've finished with the output stream so any latent IOExceptions are re-thrown in accordance with the API contract. In this case, the issue is 100% reproducible here and the fix I've got is pretty clean. (Unfortunately as the file is truncated, the editor prompts to replace contents -- but at least this reflects what's on disk.) Will create a test and attach a patch.

Oh I see, you just do something like this:
try {
    f.write();
    f.close();
} catch (IOException e) {
    //handle exception
} finally {
    safeClose(f);
}
That way the case of a successful write but a failed close is handled just like any other IOException.
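That pattern can be sketched as a self-contained transfer helper. This is an illustrative reconstruction under the discussion above, not the actual FileUtil code; `safeClose` and `transfer` here are simplified stand-ins:

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class TransferSketch {
    // Close quietly, ignoring any secondary failure
    // (mirrors the role of FileUtil#safeClose).
    static void safeClose(Closeable c) {
        if (c == null)
            return;
        try {
            c.close();
        } catch (IOException e) {
            // ignored: the primary exception, if any, is already in flight
        }
    }

    // Copy all bytes. Note the explicit destination.close() INSIDE the
    // try, so an EACCES-style failure on close is reported to the caller
    // just like a write failure would be.
    static void transfer(InputStream source, OutputStream destination) throws IOException {
        try {
            byte[] buf = new byte[8192];
            int n;
            while ((n = source.read(buf)) != -1)
                destination.write(buf, 0, n);
            destination.close(); // may throw; must not be swallowed
        } finally {
            safeClose(source);
            safeClose(destination); // second close is a no-op on most streams
        }
    }
}
```

The finally block still guarantees no descriptor leaks, while the in-try close preserves the API contract that close failures surface as IOExceptions.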
Created attachment 186165 [details]
Tests 1
Test for the issue.
Enhances WrapperFileSystem to allow overriding the default WrapperFileStore to make it easy to write tests provoking strange fs behaviour (like IOException on OutputStream #close()).
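A generic way to provoke that behaviour, sketched here as a plain stream wrapper rather than the actual WrapperFileStore override in the attached tests, is an OutputStream that succeeds on write but throws on close, mirroring the NFS EACCES case (the class name is hypothetical):

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Wraps a real stream; all writes pass through unchanged, but close()
// fails after closing the underlying stream, like the NFS filer did.
public class CloseFailingOutputStream extends FilterOutputStream {
    public CloseFailingOutputStream(OutputStream out) {
        super(out);
    }

    @Override public void close() throws IOException {
        super.close(); // flush and close the delegate first
        throw new IOException("simulated EACCES on close");
    }
}
```

A test can hand such a stream to the code under test and assert that the IOException reaches the caller instead of being swallowed by safeClose.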
Created attachment 186166 [details]
Fix 1
Fix for the issue.
I've done a survey of OutputStream#close() in core.resources. The attached patch makes the following changes:
- ensure that where IOException is thrown (or wrapped in CoreException and re-thrown) we don't ignore it on #close()
- Document that OutputStreams should be explicitly close()d outside a safeClose()
- ensure that first IOException is thrown, not second - i.e. #safeClose in a finally, not #close().
- ModelObjectWriter#write(Object, OutputStream) throws IOException if PrintWriter#checkError() (PrintWriter usually eats IOExceptions). Is there a reason for not checking this before?
This patch touches 13 (lucky for some) files in core.resources, but I believe on the whole it shouldn't be controversial (except possibly the last point above?).
I've generally left InputStreams alone to minimize change. They're less likely to throw IOExceptions on #close & if they do there isn't the risk of data loss.
Existing tests pass.
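The PrintWriter point is easy to demonstrate: PrintWriter catches IOExceptions internally rather than throwing them, and checkError() is the only way to observe the failure. A minimal sketch (the names here are illustrative, not from the patch):

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintWriter;

public class PrintWriterErrorDemo {
    // Underlying stream that always fails, standing in for a full disk
    // or a locked NFS file.
    static final OutputStream FAILING = new OutputStream() {
        @Override public void write(int b) throws IOException {
            throw new IOException("disk full");
        }
    };

    static boolean writeAndCheck() {
        PrintWriter w = new PrintWriter(FAILING);
        w.print("hello"); // buffered; no exception surfaces here
        w.flush();        // the IOException is caught internally by PrintWriter
        return w.checkError(); // true: the only evidence of the failure
    }

    public static void main(String[] args) {
        System.out.println(writeAndCheck());
    }
}
```

So any writer that goes through PrintWriter must consult checkError() after flushing, or the write failure is silently lost, which is why the patch adds that check.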
Editing external files shows the same problem, though worse as FileStoreTextFileBuffer eats all IOExceptions. Patch on: Bug 333660

Looks good James. Feel free to release when ready. Also add your name or company to the copyright. FWIW, I don't think it is necessary to put much detail in the copyright... if someone else authored the file I usually add something like:

    Contributors: IBM Corporation - ongoing development

Someone who cared to see what changes you made can always browse the repo history.

Created attachment 186381 [details]
patch 2
Thanks John + Szymon for taking a look. I'll commit to HEAD once this week's I-Build has gone through.
Changes:
- FileStore#transferStreams had the same bug
- Copyright + Imports
Committed to HEAD. Verified in I20110124-1800