Some Eclipse Foundation services are deprecated, or will be soon. Please ensure you've read this important communication.
Bug 572282 - [performance] extend IFile API to read byte[] and char[]
Summary: [performance] extend IFile API to read byte[] and char[]
Status: CLOSED WORKSFORME
Alias: None
Product: Platform
Classification: Eclipse Project
Component: Resources (show other bugs)
Version: 4.20   Edit
Hardware: All All
: P3 enhancement (vote)
Target Milestone: ---   Edit
Assignee: Platform-Resources-Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-03-25 05:51 EDT by Jörg Kubitz CLA
Modified: 2021-03-27 08:58 EDT (History)
4 users (show)

See Also:


Attachments
Sampling Result getInputStreamAsCharArray (51.48 KB, image/png)
2021-03-25 16:12 EDT, Jörg Kubitz CLA
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Jörg Kubitz CLA 2021-03-25 05:51:35 EDT
There are several Util classes in the depending projects which convert
org.eclipse.core.resources.IFile.getContents(boolean force) to byte[] and char[].
For the LocalFile this could be improved by 
 java.nio.Files.readAllBytes()
 java.nio.Files.readString.toCharArray() // its up to 2x faster in my measurements
Therfore we would need to extend the API with
 getContentsBytes(force) 
 getContentsCharArray(force)
and duplicate some boiler plate code for the force parameter.

Please let me know if you would like such change.
Comment 1 Jörg Kubitz CLA 2021-03-25 05:55:24 EDT
julian.honnen@vector.com, Lars.Vogel@vogella.com 

relates to:
 Bug 572277
 Bug 572273
Comment 2 Thomas Wolf CLA 2021-03-25 06:28:23 EDT
Be careful not to break EFS. Methinks this would also need changes to IFileStore.

getContents() returns an InputStream. That looks like the correct abstraction to me.

Any use cases for readAllBytes() are hopefully limited to small files anyway; reading those efficiently through an InputStream should be possible.

Likewise for the readString() use cases. It's probably 2 times faster than anything one can implement via public APIs because it uses an internal string constructor that avoids an array copy. But again, uses should be limited to small files?
Comment 3 Jörg Kubitz CLA 2021-03-25 13:34:04 EDT
(In reply to Thomas Wolf from comment #2)

> Likewise for the readString() use cases. It's probably 2 times faster than
> anything one can implement via public APIs because it uses an internal
> string constructor that avoids an array copy. But again, uses should be
> limited to small files?

The files i have in mind are typically .class and .java with 30kb average. Yes small files but the performance can sadly not be reached by streams due to internal methods and copyright issues. I know that this is not pretty design just work around those limitations.
Or we might add Util methods in resources whith fast pathes for localfile implementation. Not beautifull either but keeping the api - but impossible to improve by 3rd party filesystems.
Comment 4 Jörg Kubitz CLA 2021-03-25 13:36:32 EDT
alex.blewitt@gmail.com i'd like to read your oppinion about this too.
Comment 5 Alex Blewitt CLA 2021-03-25 15:15:17 EDT
There may be an opportunity to increase a wrapper around the FileInputStream for LocalFile:

https://github.com/eclipse/eclipse.platform.resources/blob/e1782ab6dce66b452345bfdda12e1db5f363121f/bundles/org.eclipse.core.filesystem/src/org/eclipse/core/internal/filesystem/local/LocalFile.java#L404

For converting to/from byte[] and char[], can you show where that's happening? Is it just using a built-in Reader? Do you have some hotspot data where we're seeing delays?
Comment 6 Jörg Kubitz CLA 2021-03-25 16:12:05 EDT
Created attachment 285953 [details]
Sampling Result getInputStreamAsCharArray

During our build 
org.eclipse.jdt.internal.compiler.util.Util.getInputStreamAsCharArray ()
is used for reading .java files during compilation (7s) and indexing (6s) and during 
ClasspathEntry.getManifestContents (<3s)

while org.eclipse.jdt.internal.compiler.util.Util.getInputStreamAsByteArray () is used to read classes from filesystem (8s) and Jar files (8s)

worth to optimize
Comment 7 Jörg Kubitz CLA 2021-03-27 08:58:41 EDT
The following functions are good enough to work with InputStream:

b=inputStream.readAllBytes() // could even be overwritten by custom code
c=(new String(b, int, int, Charset)).toCharArray()