| Summary: | FastDamagerRepairer is too slow |
|---|---|
| Product: | [Modeling] TMF |
| Component: | Xtext |
| Status: | CLOSED FIXED |
| Severity: | enhancement |
| Priority: | P3 |
| Version: | 1.0.1 |
| Target Milestone: | M7 |
| Hardware: | All |
| OS: | All |
| Reporter: | Mark Christiaens <mark.g.j.christiaens> |
| Assignee: | Project Inbox <tmf.xtext-inbox> |
| QA Contact: | |
| CC: | sebastian.zarnekow, sven.efftinge |
| Flags: | sebastian.zarnekow: indigo+ |
| Whiteboard: | |
| Attachments: | |
Description
Mark Christiaens
We should rename it to 'EvenFasterDamageRepairer' :-) What version do you use? I doubt that it takes a minute to copy or shift some (e.g. 500,000) entries of an ArrayList ;-) Please note that such big files (10,000++ lines) often require project-specific optimizations / implementations.

I think you did not understand. The current implementation does multiple adds and removes while it is going through the changed region and updates the tokenInfos. Each such operation shifts the (large) array.

Don't doubt, measure. :) The profiler confirms I'm right. I'm working on an improved version, but it still misses a few corner cases. I'll make it available shortly.

Would still be interesting to hear which version you refer to. I see you are working with 1.0.1.

Did you measure using a LinkedList instead?

(In reply to comment #6)
> Did you measure using a LinkedList instead?

That's indeed what I'm working on. Just replacing the ArrayList with a LinkedList is not sufficient, however, because the List is indexed inside the loop, and indexing a big LinkedList is very slow. So either you use an ArrayList and the adds/removes are slow, or a LinkedList and the indexing is very slow (to quote Bart Simpson: "You're damned if you do and you're damned if you don't"). My new version will not use indexing, but it will use a LinkedList.

Created attachment 179951 [details]
Git patch against the v1.0.1 tag improving speed
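The ArrayList-vs-LinkedList trade-off discussed above can be sketched in isolation. This is a hypothetical illustration (the class and method names are not from the patch): walking a LinkedList through a ListIterator gives O(1) per step and O(1) in-place mutation, avoiding both the O(n) element shifts of `ArrayList.remove(int)` and the O(n) cost of `LinkedList.get(int)`.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

public class IteratorWalkDemo {
    /** Replaces every negative entry with zero in a single O(n) pass. */
    static void clampNegatives(List<Integer> values) {
        ListIterator<Integer> it = values.listIterator();
        while (it.hasNext()) {
            if (it.next() < 0) {
                it.set(0); // O(1) on a LinkedList when done via the iterator
            }
        }
    }

    public static void main(String[] args) {
        List<Integer> values = new LinkedList<>(Arrays.asList(3, -1, 4, -1, 5));
        clampNegatives(values);
        System.out.println(values); // prints [3, 0, 4, 0, 5]
    }
}
```

Doing the same with index-based access would be O(n²) on a LinkedList, which is exactly the "damned if you do, damned if you don't" dilemma described above.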
Created attachment 179952 [details]
Test file to illustrate speed difference
If you create an Xtext editor based on the sample project "Xtext Domain-Model Example", you can open this file in that editor. You should see that it is very slow to actually edit the file (it is also slow to build the project, but that is a different issue).
I've provided a git patch against tag v1.0.1. I'm just learning git, but it seems that you can apply it by running "git am -3 allpatches.patch" (after unzipping).
I've run all the org.eclipse.xtext.ui.tests. On my setup I get 130 errors and 1 failure for both the v1.0.1 and this patch so I seem to have something that is functionally equivalent. Specifically, the two unit tests FastDamageRepairerTest and DamagerRepairerPerformanceTest pass entirely.
One thing that still puzzles me is that the DamagerRepairer gets invoked even when a code fold is opened or collapsed. Since that does not actually modify the code, I've added a quick guard:

```java
if (e.getLength() == 0 && e.getText().length() == 0) {
    return new Region(0, 0);
}
```

which returns almost immediately. Maybe someone with more Eclipse experience can confirm that this makes sense and explain why the DamagerRepairer gets triggered in the first place.
A good review would be very welcome. It's easy to miss corner cases.
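For reviewers skimming the thread, the intent of the patch can be summarized with a much-simplified, hypothetical sketch (names are illustrative; this only matches the common prefix, whereas the real code also re-synchronizes on tokens after the change): compare the previous token lengths against the freshly lexed ones and report the span from the first mismatch onward as damaged.

```java
import java.util.Arrays;

public class DamageSketch {
    /** Returns {offset, length} of the damaged span, or {0, 0} if unchanged. */
    static int[] damage(int[] oldLengths, int[] newLengths) {
        int first = 0, offset = 0;
        // Skip the common prefix of matching token lengths.
        while (first < oldLengths.length && first < newLengths.length
                && oldLengths[first] == newLengths[first]) {
            offset += newLengths[first];
            first++;
        }
        if (first == oldLengths.length && first == newLengths.length) {
            return new int[] { 0, 0 }; // identical token streams
        }
        int end = 0;
        for (int len : newLengths) end += len;
        return new int[] { offset, end - offset };
    }

    public static void main(String[] args) {
        // Tokens of length 3,1,4 vs 3,1,5: damage starts at offset 4.
        System.out.println(Arrays.toString(damage(new int[] { 3, 1, 4 },
                                                  new int[] { 3, 1, 5 }))); // prints [4, 5]
    }
}
```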
Thanks for the patch (haven't looked at it). The IDamager needs to provide a correct damaged region even if the document hasn't been modified, because that's what the repairer (and the syntax coloring) is based on. However, I had already moved the class you have optimized to an earlier life cycle where it is only called on real document changes. The IDamager uses that information and always provides a correct damaged region. Regarding the tests: did you run them as plug-in tests?

(In reply to comment #11)
> Regarding the tests. Did you run them as plug-in tests?

Yes, I did run them as plug-in tests. If you don't (I tried), then pretty much everything fails.

I'm not absolutely sure about the state you are working on, since we had some trouble with the build system at that time (and still have), but usually all of our tests are green. And if some are broken, we fix them ASAP (just to give you some confidence in our test suite :-)). https://hudson.eclipse.org/hudson/job/Xtext-nightly-HEAD/314/testReport/?

(In reply to comment #13)
> ... but usually all of our tests are green. And if some are broken, we fix
> them ASAP ...

I should be running v1.0.1 straight from the GIT repo.
The first stack trace I get is for example:

```
1 [main] ERROR org.eclipse.xtext.ui.editor.reconciler.XtextReconcilerUnitOfWork - Parsing in reconciler failed.
org.eclipse.xtext.parser.ParseException: java.lang.IllegalStateException: Unresolved proxy http://www.eclipse.org/2008/xtext/ui/common/tests/ContentAssist#//Start. Make sure the EPackage has been registered.
	at org.eclipse.xtext.ui.tests.testlanguages.parser.antlr.ContentAssistTestLanguageParser.parse(ContentAssistTestLanguageParser.java:35)
	at org.eclipse.xtext.parser.antlr.AbstractAntlrParser.doParse(AbstractAntlrParser.java:57)
	at org.eclipse.xtext.parser.AbstractParser.parse(AbstractParser.java:46)
	at org.eclipse.xtext.resource.XtextResource.doLoad(XtextResource.java:146)
	at org.eclipse.xtext.linking.lazy.LazyLinkingResource.doLoad(LazyLinkingResource.java:63)
	at org.eclipse.xtext.resource.XtextResource.reparse(XtextResource.java:168)
	at org.eclipse.xtext.ui.editor.reconciler.XtextReconcilerUnitOfWork.process(XtextReconcilerUnitOfWork.java:57)
	at org.eclipse.xtext.ui.editor.reconciler.XtextReconcilerUnitOfWork.process(XtextReconcilerUnitOfWork.java:1)
	at org.eclipse.xtext.util.concurrent.IUnitOfWork$Void.exec(IUnitOfWork.java:36)
	at org.eclipse.xtext.util.concurrent.IStateAccess$AbstractImpl.modify(IStateAccess.java:57)
	at org.eclipse.xtext.ui.editor.model.XtextDocument$XtextDocumentLocker.modify(XtextDocument.java:161)
	at org.eclipse.xtext.ui.editor.model.XtextDocument.modify(XtextDocument.java:74)
	at org.eclipse.xtext.ui.editor.reconciler.XtextDocumentReconcileStrategy.reconcile(XtextDocumentReconcileStrategy.java:27)
	at org.eclipse.xtext.ui.tests.editor.contentassist.Bug297909Test.testReconcileDocument(Bug297909Test.java:96)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:613)
	at junit.framework.TestCase.runTest(TestCase.java:168)
	at junit.framework.TestCase.runBare(TestCase.java:134)
	at junit.framework.TestResult$1.protect(TestResult.java:110)
	at junit.framework.TestResult.runProtected(TestResult.java:128)
	at junit.framework.TestResult.run(TestResult.java:113)
	at junit.framework.TestCase.run(TestCase.java:124)
	at junit.framework.TestSuite.runTest(TestSuite.java:232)
	at junit.framework.TestSuite.run(TestSuite.java:227)
	at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
	at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:49)
	at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
	at org.eclipse.pde.internal.junit.runtime.RemotePluginTestRunner.main(RemotePluginTestRunner.java:62)
	at org.eclipse.pde.internal.junit.runtime.UITestApplication$1.run(UITestApplication.java:116)
	at org.eclipse.swt.widgets.RunnableLock.run(RunnableLock.java:35)
	at org.eclipse.swt.widgets.Synchronizer.runAsyncMessages(Synchronizer.java:134)
	at org.eclipse.swt.widgets.Display.runAsyncMessages(Display.java:3527)
	at org.eclipse.swt.widgets.Display.readAndDispatch(Display.java:3174)
	at org.eclipse.ui.internal.Workbench.runEventLoop(Workbench.java:2629)
	at org.eclipse.ui.internal.Workbench.runUI(Workbench.java:2593)
	at org.eclipse.ui.internal.Workbench.access$4(Workbench.java:2427)
	at org.eclipse.ui.internal.Workbench$7.run(Workbench.java:670)
	at org.eclipse.core.databinding.observable.Realm.runWithDefault(Realm.java:332)
	at org.eclipse.ui.internal.Workbench.createAndRunWorkbench(Workbench.java:663)
	at org.eclipse.ui.PlatformUI.createAndRunWorkbench(PlatformUI.java:149)
	at org.eclipse.ui.internal.ide.application.IDEApplication.start(IDEApplication.java:115)
	at org.eclipse.pde.internal.junit.runtime.UITestApplication.start(UITestApplication.java:47)
	at org.eclipse.equinox.internal.app.EclipseAppHandle.run(EclipseAppHandle.java:196)
	at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.runApplication(EclipseAppLauncher.java:110)
	at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.start(EclipseAppLauncher.java:79)
	at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:369)
	at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:179)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:613)
	at org.eclipse.equinox.launcher.Main.invokeFramework(Main.java:619)
	at org.eclipse.equinox.launcher.Main.basicRun(Main.java:574)
	at org.eclipse.equinox.launcher.Main.run(Main.java:1407)
	at org.eclipse.equinox.launcher.Main.main(Main.java:1383)
Caused by: org.eclipse.emf.common.util.WrappedException: java.lang.IllegalStateException: Unresolved proxy http://www.eclipse.org/2008/xtext/ui/common/tests/ContentAssist#//Start. Make sure the EPackage has been registered.
	at org.eclipse.xtext.parser.antlr.AbstractInternalAntlrParser.parse(AbstractInternalAntlrParser.java:511)
	at org.eclipse.xtext.ui.tests.testlanguages.parser.antlr.ContentAssistTestLanguageParser.parse(ContentAssistTestLanguageParser.java:32)
	... 59 more
Caused by: java.lang.IllegalStateException: Unresolved proxy http://www.eclipse.org/2008/xtext/ui/common/tests/ContentAssist#//Start. Make sure the EPackage has been registered.
	at org.eclipse.xtext.parser.DefaultEcoreElementFactory.create(DefaultEcoreElementFactory.java:56)
	at org.eclipse.xtext.ui.tests.testlanguages.parser.antlr.internal.InternalContentAssistTestLanguageParser.ruleStart(InternalContentAssistTestLanguageParser.java:159)
	at org.eclipse.xtext.ui.tests.testlanguages.parser.antlr.internal.InternalContentAssistTestLanguageParser.entryRuleStart(InternalContentAssistTestLanguageParser.java:89)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:613)
	at org.eclipse.xtext.parser.antlr.AbstractInternalAntlrParser.parse(AbstractInternalAntlrParser.java:502)
	... 60 more

!ENTRY org.apache.log4j 4 0 2010-09-30 14:38:03.264
!MESSAGE org.eclipse.xtext.ui.editor.reconciler.XtextReconcilerUnitOfWork - Parsing in reconciler failed.
```

Maybe that makes sense to you? Anyway, this probably belongs in a different ticket.

Would it be possible for you to upgrade your patch to the latest version from the Xtext master branch?

(In reply to comment #15)
> Would it be possible for you to upgrade your patch to the latest version from
> the Xtext master branch?

I've located the equivalent code in the Xtext 2.0 branch. I think it gets invoked a lot less than in the Xtext 1.0.x development line, but it may still be useful to fix it nevertheless.
I modified DocumentTokenSource like so:

```java
package org.eclipse.xtext.ui.editor.model;

import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.ListIterator;

import org.antlr.runtime.ANTLRStringStream;
import org.antlr.runtime.CommonToken;
import org.antlr.runtime.Token;
import org.antlr.runtime.TokenSource;
import org.eclipse.jface.text.DocumentEvent;
import org.eclipse.jface.text.IRegion;
import org.eclipse.jface.text.Region;
import org.eclipse.xtext.parser.antlr.Lexer;
import org.eclipse.xtext.ui.LexerUIBindings;

import com.google.common.collect.AbstractIterator;
import com.google.common.collect.Lists;
import com.google.inject.Inject;
import com.google.inject.Provider;
import com.google.inject.name.Named;

/**
 * @author Sebastian Zarnekow - Initial contribution and API
 * @author Sven Efftinge
 * @author Mark Christiaens
 */
public class DocumentTokenSource {

	public static class TokenInfo {
		private final int length;
		private final int type;

		public TokenInfo(CommonToken token) {
			length = getTokenLength(token);
			type = token.getType();
		}

		@Override
		public int hashCode() {
			final int prime = 31;
			int result = 1;
			result = prime * result + length;
			result = prime * result + type;
			return result;
		}

		@Override
		public boolean equals(Object obj) {
			if (this == obj)
				return true;
			if (obj == null)
				return false;
			if (getClass() != obj.getClass())
				return false;
			TokenInfo other = (TokenInfo) obj;
			if (length != other.length)
				return false;
			if (type != other.type)
				return false;
			return true;
		}

		@Override
		public String toString() {
			return "TokenInfo [length=" + length + ", type=" + type + "]";
		}

		public int getAntlrTokenType() {
			return this.type;
		}

		public int getLength() {
			return this.length;
		}
	}

	/**
	 * @author Sven Efftinge - Initial contribution and API
	 */
	public static class TokenAdapter implements ILexerTokenRegion {
		private TokenInfo token;
		private int offset;

		public TokenAdapter(TokenInfo token, int offset) {
			this.token = token;
			this.offset = offset;
		}

		public int getLength() {
			return token.getLength();
		}

		public int getOffset() {
			return offset;
		}

		public int getLexerTokenType() {
			return token.getAntlrTokenType();
		}
	}

	public static class IRegionIterable implements Iterable<ILexerTokenRegion> {
		private Iterable<TokenInfo> tokens = null;

		public IRegionIterable(Iterable<TokenInfo> tokens) {
			this.tokens = tokens;
		}

		public Iterator<ILexerTokenRegion> iterator() {
			return new AbstractIterator<ILexerTokenRegion>() {
				private int offset = 0;
				private Iterator<TokenInfo> infos = tokens.iterator();

				@Override
				protected ILexerTokenRegion computeNext() {
					if (!infos.hasNext()) {
						endOfData();
						return null;
					}
					TokenInfo next = infos.next();
					try {
						return new TokenAdapter(next, offset);
					} finally {
						offset += next.getLength();
					}
				}
			};
		}
	}

	private boolean checkInvariant = false;
	private List<TokenInfo> internalModifyableTokenInfos = Collections.emptyList();
	private List<TokenInfo> tokenInfos = Collections.emptyList();
	private IRegion previousRegion;
	private DocumentEvent previousEvent;

	@Inject
	@Named(LexerUIBindings.HIGHLIGHTING)
	private Provider<Lexer> lexer;

	public Iterable<ILexerTokenRegion> getTokenInfos() {
		return new IRegionIterable(tokenInfos);
	}

	public IRegion getLastDamagedRegion() {
		return previousRegion;
	}

	public void setLexer(Provider<Lexer> lexer) {
		this.lexer = lexer;
	}

	protected void setTokens(List<TokenInfo> infos) {
		this.internalModifyableTokenInfos = infos;
		this.tokenInfos = Collections.unmodifiableList(Lists.newArrayList(infos));
	}

	protected List<TokenInfo> createTokenInfos(String string) {
		List<TokenInfo> result = Lists.newLinkedList();
		TokenSource source = createLexer(string);
		CommonToken token = null;
		do {
			token = (CommonToken) source.nextToken();
			TokenInfo info = createTokenInfo(token);
			result.add(info);
		} while (token != Token.EOF_TOKEN);
		return result;
	}

	protected TokenInfo createTokenInfo(CommonToken token) {
		TokenInfo info = new TokenInfo(token);
		return info;
	}

	public void updateStructure(final DocumentEvent e) {
		try {
			if (previousEvent == e && previousRegion != null) {
				return;
			}
			previousRegion = computeDamageRegion(e);
		} finally {
			previousEvent = e;
			if (isCheckInvariant()) {
				doCheckInvariant(e);
			}
		}
	}

	protected void doCheckInvariant(final DocumentEvent e) {
		List<TokenInfo> parsedTokenInfos = createTokenInfos(e.fDocument.get());
		if (!parsedTokenInfos.equals(internalModifyableTokenInfos)) {
			throw new IllegalStateException("Expected: '" + parsedTokenInfos + "' but was: '"
					+ internalModifyableTokenInfos + "'.");
		}
	}

	protected IRegion computeDamageRegion(final DocumentEvent e) {
		long startTime = System.currentTimeMillis();
		// empty document -> no dirty region
		if (e.getDocument().getLength() == 0) {
			setTokens(createTokenInfos(e.fDocument.get()));
			return new Region(0, 0);
		}
		// previously empty -> full document dirty
		if (internalModifyableTokenInfos.isEmpty()) {
			setTokens(createTokenInfos(e.fDocument.get()));
			return new Region(0, e.getDocument().getLength());
		}
		/* If nothing is changed, return the empty region */
		if (e.getLength() == 0 && e.getText().length() == 0) {
			return new Region(0, 0);
		}
		try {
			int tokenInfoStart = -1;
			int nextTokenInfoStart = 0;
			TokenSource source = createLexer(e.fDocument.get());
			CommonToken token = null;
			ListIterator<TokenInfo> tokenInfosIt = internalModifyableTokenInfos.listIterator();
			TokenInfo tokenInfo = null;
			/*
			 * At the end of this loop, we want to have found the first token (if any) that
			 * does not correspond with the previous time we lexed, or that lies within the
			 * modified region.
			 */
			while (true) {
				token = (CommonToken) source.nextToken();
				/*
				 * Note: there is always an EOF TokenInfo at the end of the list. If we run
				 * into that, then either the lexer will also have returned an EOF or not.
				 * If so, the loop will be ended at the bottom. If not, the loop will be
				 * ended due to a mismatch. Therefore, we can safely fetch a tokenInfo
				 * without "hasNext".
				 */
				tokenInfo = tokenInfosIt.next();
				tokenInfoStart = nextTokenInfoStart;
				nextTokenInfoStart = tokenInfoStart + tokenInfo.length;
				boolean inModifiedRegion = nextTokenInfoStart > e.fOffset;
				if (!tokenCorrespondsToTokenInfo(token, tokenInfo, inModifiedRegion)) {
					/* Mismatch */
					break;
				}
				if (token.getType() == Token.EOF) {
					/*
					 * Perfect match all the way to the end of the file without running
					 * into the modified region.
					 */
					return new Region(0, 0);
				}
			}
			/*
			 * At this point tokenInfo and token are the first mismatch. tokenInfosIt
			 * points to the next mismatch (if any).
			 */
			int regionOffset = tokenInfoStart;
			int lengthDiff = e.fText.length() - e.fLength;
			int afterRegion = e.fOffset + e.fText.length();
			/*
			 * We have to shift the accounting for the position of the tokens we've seen
			 * because we will be comparing them with the position of tokens _after_ the
			 * modified region. We have to take the size of the new text into account.
			 */
			tokenInfoStart += lengthDiff;
			nextTokenInfoStart += lengthDiff;
			/*
			 * Drop all the TokenInfos that are (partially) in or before the modified
			 * region.
			 */
			while (tokenInfo.type != Token.EOF && tokenInfoStart < afterRegion) {
				tokenInfosIt.remove();
				tokenInfo = tokenInfosIt.next();
				tokenInfoStart = nextTokenInfoStart;
				nextTokenInfoStart = tokenInfoStart + tokenInfo.length;
			}
			tokenInfosIt.previous();
			/*
			 * At this point tokenInfosIt.next() will produce the first TokenInfo that
			 * comes from after the modified region. tokenInfosIt.previous() should
			 * produce the last matched TokenInfo before the region (if it exists).
			 *
			 * tokenInfo is the first TokenInfo that is located completely after the
			 * modified region.
			 */
			/*
			 * Parse all the tokens that are partially in or before the modified region.
			 */
			while (token.getType() != Token.EOF && token.getStartIndex() < afterRegion) {
				tokenInfosIt.add(createTokenInfo(token));
				token = (CommonToken) source.nextToken();
			}
			/*
			 * At this point token is the first token located after the modified region.
			 */
			tokenInfosIt.next();
			/*
			 * At this point tokenInfosIt.next() will produce the second TokenInfo that
			 * comes after the modified region.
			 *
			 * token points to the first token completely located after the modified
			 * region.
			 */
			/*
			 * Now we try to find a matching Token/TokenInfo pair located after the
			 * changed region.
			 */
			while (token != Token.EOF_TOKEN && tokenInfo.type != Token.EOF) {
				/*
				 * This loop removes all the TokenInfos that do not match with the current
				 * or any subsequent Token, and returns from the function if a match is
				 * found.
				 */
				while (true) {
					if (tokenInfoStart > token.getStopIndex() || tokenInfo.type == Token.EOF) {
						/*
						 * We've rejected matching as many TokenInfos as we could on the
						 * basis of the current Token. Go to the next Token.
						 */
						break;
					} else {
						if (tokenInfoStart == token.getStartIndex()
								&& tokenCorrespondsToTokenInfo(token, tokenInfo, false)) {
							/* Match */
							return new Region(regionOffset, token.getStartIndex() - regionOffset);
						} else {
							/* Mismatch */
							tokenInfosIt.remove();
							tokenInfo = tokenInfosIt.next();
							tokenInfoStart = nextTokenInfoStart;
							nextTokenInfoStart = tokenInfoStart + tokenInfo.length;
						}
					}
				}
				/*
				 * After this loop, tokenInfo points to EOF or to a TokenInfo that is
				 * positioned strictly after the current Token. The current Token didn't
				 * match the TokenInfos, so it must replace the TokenInfos that were
				 * removed.
				 */
				TokenInfo tokenInfoToAdd = createTokenInfo(token);
				tokenInfosIt.previous();
				tokenInfosIt.add(tokenInfoToAdd);
				tokenInfosIt.next();
				/*
				 * At this point tokenInfosIt.next() is again the TokenInfo after the
				 * current tokenInfo value.
				 */
				token = (CommonToken) source.nextToken();
			}
			/*
			 * At this point, we either ran out of TokenInfos or Tokens without actually
			 * finding a match.
			 */
			if (tokenInfo.type == Token.EOF) {
				/*
				 * If we ran out of TokenInfos, then we have to drop the EOF TokenInfo and
				 * append the remaining converted Tokens.
				 */
				tokenInfosIt.remove();
				tokenInfosIt.add(createTokenInfo(token));
				while (token.getType() != Token.EOF) {
					token = (CommonToken) source.nextToken();
					tokenInfosIt.add(createTokenInfo(token));
				}
			} else {
				/*
				 * If we ran out of Tokens, then we have to remove any remaining TokenInfos
				 * and append EOF.
				 */
				int size = internalModifyableTokenInfos.size();
				int nextIndex = tokenInfosIt.nextIndex();
				internalModifyableTokenInfos.subList(nextIndex - 1, size).clear();
				internalModifyableTokenInfos.add(createTokenInfo(token));
			}
			/*
			 * Region begins at the first mismatch and runs all the way to the end of the
			 * document.
			 */
			return new Region(regionOffset, e.fDocument.getLength() - regionOffset);
		} finally {
			setTokens(internalModifyableTokenInfos);
			System.out.println("computeDamageRegion: " + (System.currentTimeMillis() - startTime) + " ms");
		}
	}

	private boolean tokenCorrespondsToTokenInfo(CommonToken token, TokenInfo tokenInfo,
			boolean inModifiedRegion) {
		return !inModifiedRegion && tokenInfo.type == token.getType()
				&& getTokenLength(token) == tokenInfo.length;
	}

	private static int getTokenLength(CommonToken token) {
		return token.getStopIndex() - token.getStartIndex() + 1;
	}

	protected Lexer createLexer(String string) {
		Lexer l = lexer.get();
		l.setCharStream(new ANTLRStringStream(string));
		return l;
	}

	public void setCheckInvariant(boolean checkInvariant) {
		this.checkInvariant = checkInvariant;
	}

	public boolean isCheckInvariant() {
		return checkInvariant;
	}
}
```

It still seems to work, but please review carefully. I didn't think hard about it for this version :)

Postponed performance optimization.

The spurious code is in TokenScanner.setRange(IDocument, int, int), which causes a complete iteration over all tokens for each and every range that is set.
However, we can exploit the fact that the framework will reset the ranges in ascending order, which reduces the time to open a large Xtext grammar from 10 secs to approx. 1.5 secs. Pushed to master. Please reopen if the problem persists.

The change broke the highlighting.

Pushed the fixed fix to master.

Closing all bugs that were set to RESOLVED before Neon.0
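The ascending-order optimization described above can be sketched in isolation. This is a hypothetical illustration (class and field names are invented, not the Xtext TokenScanner API): instead of re-iterating all tokens from the start on every setRange call, the scanner caches its position from the previous call and only advances, falling back to a full re-scan if a range ever arrives out of order.

```java
public class AscendingRangeScanner {
    private final int[] tokenLengths; // stand-in for the lexer's token stream
    private int tokenIndex;           // cached position from the previous call
    private int tokenOffset;          // start offset of tokenLengths[tokenIndex]
    private int lastRange;            // offset of the previous setRange call

    public AscendingRangeScanner(int[] tokenLengths) {
        this.tokenLengths = tokenLengths;
    }

    /**
     * Returns the index of the token containing 'offset'. Amortized O(1) per
     * call when offsets arrive in ascending order, versus O(n) for a rescan.
     */
    public int setRange(int offset) {
        if (offset < lastRange) {
            // Non-ascending call: restart from the first token.
            tokenIndex = 0;
            tokenOffset = 0;
        }
        lastRange = offset;
        // Advance the cached cursor until it covers the requested offset.
        while (tokenIndex < tokenLengths.length - 1
                && tokenOffset + tokenLengths[tokenIndex] <= offset) {
            tokenOffset += tokenLengths[tokenIndex];
            tokenIndex++;
        }
        return tokenIndex;
    }

    public static void main(String[] args) {
        AscendingRangeScanner scanner = new AscendingRangeScanner(new int[] { 5, 5, 5 });
        System.out.println(scanner.setRange(0));  // prints 0
        System.out.println(scanner.setRange(6));  // prints 1
        System.out.println(scanner.setRange(12)); // prints 2
        System.out.println(scanner.setRange(3));  // prints 0 (out of order: rescan)
    }
}
```

The speedup reported above (10 s down to roughly 1.5 s for a large grammar) comes from exactly this change in asymptotics: n setRange calls drop from O(n²) total work to O(n).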