Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.
Bug 332543 - IFile#setContents eats I/O Exception on close => potential file corruption
Summary: IFile#setContents eats I/O Exception on close => potential file corruption
Status: VERIFIED FIXED
Alias: None
Product: Platform
Classification: Eclipse Project
Component: Resources (show other bugs)
Version: 3.7   Edit
Hardware: PC Linux-GTK
: P3 normal (vote)
Target Milestone: 3.7 M5   Edit
Assignee: James Blackburn CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 334081 334085
  Show dependency tree
 
Reported: 2010-12-14 11:41 EST by James Blackburn CLA
Modified: 2011-01-25 08:10 EST (History)
2 users (show)

See Also:
Szymon.Brandys: review+
john.arthorne: review+


Attachments
Tests 1 (12.00 KB, patch)
2011-01-06 09:05 EST, James Blackburn CLA
no flags Details | Diff
Fix 1 (19.71 KB, patch)
2011-01-06 09:26 EST, James Blackburn CLA
no flags Details | Diff
patch 2 (44.48 KB, patch)
2011-01-10 07:24 EST, James Blackburn CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description James Blackburn CLA 2010-12-14 11:41:56 EST

    
Comment 1 James Blackburn CLA 2010-12-14 11:47:41 EST
FileUtil#transferStreams copies bytes to the input stream to the file output stream.  At the end of this it calls #safeClose.

In Eclipse the editor loses it's dirty flag and believes that the #setContents has completed successfully.  

On our NetApp NFS connected hosts, we see that if the file is opened and locked by a windows app, the close() can fail.  (In this case it's worse than this: instead of denying the write, the filer somehow allows the file to become truncated...)


strace of a simple text editor shows:

17246 16:18:26 lstat("/home/jamesb/foo.txt", {st_mode=S_IFREG|0644, st_size=5, ...}) = 0
17246 16:18:26 open("/home/jamesb/foo.txt", O_WRONLY|O_CREAT|O_TRUNC, 0600) = 3
17246 16:18:26 fcntl(3, F_GETFL)        = 0x8001 (flags O_WRONLY|O_LARGEFILE)
17246 16:18:26 fstat(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
17246 16:18:26 mmap(NULL, 32768, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2a9852f000
17246 16:18:26 lseek(3, 0, SEEK_CUR)    = 0
17246 16:18:26 write(3, "asdf\nasdf2\n", 11) = 11
17246 16:18:26 close(3)                 = -1 EACCES (Permission denied)

man 2 close says:

       Not  checking  the  return  value of close is a common but nevertheless
       serious programming error.  It is quite possible that errors on a  pre-
       vious  write(2)  operation  are first reported at the final close.  Not
       checking the return value when closing the file may lead to silent loss
       of data.  This can especially be observed with NFS and disk quotas.

       A  successful  close does not guarantee that the data has been success-
       fully saved to disk, as the kernel defers writes. It is not common  for
       a  filesystem  to  flush  the buffers when the stream is closed. If you
       need to be sure that the data is physically stored use  fsync(2).   (It
       will depend on the disk hardware at this point.)

It seems wrong to ignore any exceptions on close().  Is there a good reason for doing this?
Comment 2 John Arthorne CLA 2010-12-14 16:27:33 EST
It is done this way because the file is closed in a finally block. Any exception thrown from a finally block will cause previous exceptions leading up to the finally block to be lost. So, the finally block can't throw an exception. It could be coded to handle the exception later, but the code is quite ugly:

IOException closeFailure;
try {
  f.write();
} catch (IOException e) {
  //handle exception
} finally
  try {
    f.close();
  } catch (IOException e) {
    closeFailure = e;
  }
}
if (closeFailure != null)
  //handle close failure

As an exercise for the reader, try doing the same thing for two streams that need closing! In the end, you can only throw one exception from the method so either the read/write failure or the close failure can get lost. I guess this is the reason for the proposed try-with-resources in Java 7, which automatically figures out which exception should be propagated and which should be swallowed.

Our general experience was that exceptions on close only happened in practice following an exception on read or write. This case where the read and write succeeds but the close fails is not something I've seen before. Perhaps its unique to network mount scenarios...

In any case, safeClose is used to avoid the complexity of handling close failures while ensuring the original exception is never lost. You're welcome to try writing it in a way that also handles the close failure in the case where the read/write succeeded.
Comment 3 James Blackburn CLA 2010-12-14 17:23:15 EST
Thanks John. I had a quick look at fixing this and it's not too horrible. Basically safeClose should continue to be called in the finally's to ensure file descriptors aren't leaked. 
However, importantly, close should explicitly be called when we've finished with the output stream so any latent IOExceptions are re-thrown in accordance with the API contract.

In this case, the issue is 100% reproducible here and the fix I've got is pretty clean. (Unfortunately as the file is truncated, the editor prompts to replace contents -- but at least this reflects what's on disk.)

Will create a test and attach a patch.
Comment 4 John Arthorne CLA 2010-12-15 12:16:32 EST
Oh I see, you just do something like this:

try {
  f.write();
  f.close();
} catch (IOException e) {
  //handle exception
} finally
  safeClose(f);
}

That way the case of a successful write but a failed close is handled just like any other IOException.
Comment 5 James Blackburn CLA 2011-01-06 09:05:50 EST
Created attachment 186165 [details]
Tests 1

Test for the issue.

Enhances WrapperFileSystem to allow overriding the default WrapperFileStore to make it easy to write tests provoking strange fs behaviour (like I/OException on OutputStream #close()).
Comment 6 James Blackburn CLA 2011-01-06 09:26:55 EST
Created attachment 186166 [details]
Fix 1

Fix for the the issue.  

I've done a survey of OutputStream#close() in core.resources. The attached patch makes the following changes:
  - ensure that where IOException is thrown (or wrapped in CoreE and re-thrown) we don't ignore it on #close()
  - Document that OutputStreams should be explicitly close()d outside a safeClose()
  - ensure that first IOException is thrown, not second - i.e. #safeClose in a finally, not #close().
  - ModelObjectWriter#write(Object, OStream) throws IOException if PrintWriter#checkError() (PrintWriter usually eats IOExceptions).  Is there a reason for not checking this before?

This patch touches 13 (lucky for some) files in core.resources, but I believe on the whole it shouldn't be controversial (except possibly the last point above?).  

I've generally left InputStreams alone to minimize change.  They're less likely to throw IOExceptions on #close & if they do there isn't the risk of data loss.

Existing tests pass.
Comment 7 James Blackburn CLA 2011-01-06 10:32:34 EST
Editing external files shows the same problem, though worse as FileStoreTextFileBuffer eats all IOExceptions.  Patch on: Bug 333660
Comment 8 John Arthorne CLA 2011-01-07 16:07:39 EST
Looks good James. Feel free to release when ready. 

Also add your name or company to the copyright. FWIW, I don't think it is necessary to put much detail in the copyright... if someone else authored the file I usually add something like :

Contributors:
  IBM Corporation - ongoing development

Someone who cared to see what changes you made can always browse the repo history.
Comment 9 James Blackburn CLA 2011-01-10 07:24:31 EST
Created attachment 186381 [details]
patch 2

Thanks John + Szymon for taking a look.  I'll commit to HEAD once this week's I-Build has gone through.

Changes:
  - FileStore#transferStreams had the same bug
  - Copyright + Imports
Comment 10 James Blackburn CLA 2011-01-12 04:39:04 EST
Committed to HEAD.
Comment 11 James Blackburn CLA 2011-01-25 08:10:54 EST
Verified in I20110124-1800