Comments (21)

angelozerr commented on May 27, 2024

Have you had any trouble with large files?

from lemminx.

fbricon commented on May 27, 2024

not necessarily yet, but we need to think about setting up performance tests, once all the major features are complete

angelozerr commented on May 27, 2024

> not necessarily yet, but we need to think about setting up performance tests, once all the major features are complete

If there is a performance problem with big files, I think the vscode html language service will have the same problem, since the XMLDocument is rebuilt each time you type some content inside the editor (TextDocumentSyncKind.Full).

I think that to improve performance, TextDocumentSyncKind.Incremental should be used, but I think it's a hard thing to do.
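
For illustration, here is a minimal sketch of what applying a TextDocumentSyncKind.Incremental change to a server-side text buffer could look like. It is not lsp4xml code: the class and method names are hypothetical, and it deliberately avoids the LSP4J types so that it stays self-contained. With Full sync every change carries the whole document text; with Incremental sync a change carries only a range and its replacement text, which the server has to splice into its own copy.

```java
/** Hypothetical sketch: splicing an incremental change into a text buffer. */
final class IncrementalSyncSketch {

    /**
     * Converts a zero-based (line, character) position into a buffer offset.
     * Assumes "\n" or "\r\n" line endings and UTF-16 character counts,
     * which match Java's char indices.
     */
    static int offsetAt(StringBuilder buffer, int line, int character) {
        int lineStart = 0;
        for (int i = 0; i < line; i++) {
            int newline = buffer.indexOf("\n", lineStart);
            if (newline < 0) {
                throw new IllegalArgumentException("line out of range: " + line);
            }
            lineStart = newline + 1;
        }
        int offset = lineStart + character;
        if (offset > buffer.length()) {
            throw new IllegalArgumentException("character out of range: " + character);
        }
        return offset;
    }

    /** Replaces the text between the start and end positions with newText. */
    static void applyChange(StringBuilder buffer, int startLine, int startChar,
            int endLine, int endChar, String newText) {
        // Resolve both offsets before mutating the buffer.
        int start = offsetAt(buffer, startLine, startChar);
        int end = offsetAt(buffer, endLine, endChar);
        buffer.replace(start, end, newText);
    }

    public static void main(String[] args) {
        StringBuilder buffer = new StringBuilder("<a>\n  <b></b>\n</a>");
        // Simulates typing "text" between <b> and </b> on the second line.
        applyChange(buffer, 1, 5, 1, 5, "text");
        System.out.println(buffer);
    }
}
```

Note that this only keeps the text in sync cheaply. The XMLDocument would still be rebuilt from the whole text unless the parser itself becomes incremental, which is the hard part mentioned above.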

fbricon commented on May 27, 2024

@angelozerr whatever performance the vscode html has, since the xml parser starts to deviate from it (in many ways), we might see a different behaviour, better or worse.
I'm not saying we need to improve the performance now, just that we need to keep track of it eventually. What matters now is to provide correct results. Worry about performance later.

angelozerr commented on May 27, 2024

> What matters now is to provide correct results. Worry about performance later.

Totally agree with you. Thanks for creating this issue.

NikolasKomonen commented on May 27, 2024

After testing with largeFile.txt:
I deleted the second-to-last tag </a>. In IntelliJ it took ~2.8 seconds, and in lsp4xml ~4.7 seconds, until a missing end tag response was received.

fbricon commented on May 27, 2024

15,000 lines is not what I call large :-) I was thinking about tens of MB (even that's not large, depending on the context).
Anyways, I think we're getting murdered by TextDocumentSyncKind.Full: we're sending the full document over the connection on each document change, and the whole document is rebuilt entirely on every keystroke. It's not scaling well.
For now, we'll call it a known limitation, but we'll have to work on improving performance once we get the initial features right.

angelozerr commented on May 27, 2024

@NikolasKomonen thanks for testing that and attaching your large file.

> Anyways, I think we're getting murdered by TextDocumentSyncKind.Full: we're sending the full document over the connection on each document change, and the whole document is rebuilt entirely on every keystroke. It's not scaling well.

Indeed, I think it can be a problem, and it was my fear, because we would need to manage an "incremental" parser, which is a hard task I think.

BUT I have started to study the problem: building the XMLDocument directly from the given file takes 1590 ms, which is too long. It seems the problem comes from the regexp. I will try to fix it and give you feedback.

angelozerr commented on May 27, 2024

See test at https://github.com/angelozerr/lsp4xml/blob/master/org.eclipse.lsp4xml/src/test/java/org/eclipse/lsp4xml/internal/parser/XMLDocumentTest.java

angelozerr commented on May 27, 2024

@NikolasKomonen I have improved performance, your large file takes 195 ms instead of 1590 ms!

Please give me feedback.

fbricon commented on May 27, 2024

@angelozerr awesome work! This makes the extension much snappier!!!
Did you use a profiler, or did you just notice that substring hanging there?
Are there other places where substring/string concatenation might be hurting us?

angelozerr commented on May 27, 2024

> @angelozerr awesome work! This makes the extension much snappier!!!

Glad it pleases you :) I think we now have the same performance as the HTML Language Server.

> Did you use a profiler, or did you just notice that substring hanging there?

To be honest with you, I'm not very familiar with profilers. I just noticed this hang (at first I thought it was because of the regexp, but it was String#substring).

> Are there other places where substring/string concatenation might be hurting us?

I have tried to check that, but I think it's OK. The main problem was doing a substring from a position to the end of the string: it creates a very large String each time. Using Matcher#region instead just restricts the matcher to that range of the existing text. We could reuse the same Matcher too, but that doesn't seem to improve performance.
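
To make the difference concrete, here is a minimal sketch of the two approaches. It is only an illustration, not the actual lsp4xml scanner code: the class name, method names and the pattern are hypothetical. On current JDKs String#substring copies the characters it returns, so taking the "rest of the document" at every token means copying most of the file over and over, while Matcher#region matches in place.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch comparing substring-based and region-based matching. */
final class RegionMatchSketch {

    // Illustrative pattern for a tag name; not the pattern lsp4xml uses.
    private static final Pattern TAG_NAME = Pattern.compile("[A-Za-z_][\\w.:-]*");

    /** Slow variant: copies the remainder of the text before each match attempt. */
    static String nameAtWithSubstring(String text, int offset) {
        Matcher matcher = TAG_NAME.matcher(text.substring(offset));
        return matcher.lookingAt() ? matcher.group() : null;
    }

    /** Faster variant: no copy, the matcher is restricted to [offset, length). */
    static String nameAtWithRegion(String text, int offset) {
        Matcher matcher = TAG_NAME.matcher(text);
        matcher.region(offset, text.length());
        // lookingAt() anchors the match at the start of the region.
        return matcher.lookingAt() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        String xml = "<root><child/></root>";
        System.out.println(nameAtWithSubstring(xml, 1));
        System.out.println(nameAtWithRegion(xml, 7));
    }
}
```

Both methods return the same result for the same input; only the amount of copying differs, which is what matters when the text is a multi-megabyte document and the method is called once per token.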

NikolasKomonen commented on May 27, 2024

@angelozerr This is awesome, great find.

angelozerr commented on May 27, 2024

Thanks guys!

fbricon commented on May 27, 2024

Here's a site to find xml documents of various sizes, some pretty big: http://aiweb.cs.washington.edu/research/projects/xmltk/xmldata/

Tried with a 30MB doc in vscode. Never reached the point where I got an error reported. I'll try with incremental support later.

angelozerr commented on May 27, 2024

> Here's a site to find xml documents of various sizes, some pretty big: http://aiweb.cs.washington.edu/research/projects/xmltk/xmldata/

Cool!

> Tried with a 30MB doc in vscode. Never reached the point where I got an error reported. I'll try with incremental support later.

Could you give me the link to the XML file you are testing with, please?

fbricon commented on May 27, 2024

Seems the server chokes on the nasa.xml from http://aiweb.cs.washington.edu/research/projects/xmltk/xmldata/www/repository.html#nasa

Tried with a 2 GB max heap (Xmx).

angelozerr commented on May 27, 2024

@fbricon I have added the nasa.xml in the test:

  • for largeFile.xml:
Parsed 'largeFile.xml' with XMLScanner in 31 ms.
Parsed 'largeFile.xml' with XMLParser in 25 ms.

  • for nasa.xml:
Parsed 'nasa.xml' with XMLParser in 731 ms.
Parsed 'nasa.xml' with XMLScanner in 371 ms.
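
For reference, timings in this format can be produced with a small helper along these lines. This is a hedged sketch rather than the code of the linked test: the class and method names are hypothetical, and the Runnable stands in for whatever parse call is being measured. A single run like this is noisy (JIT warm-up, GC), so the numbers are only a rough indication.

```java
/** Hypothetical sketch: timing a parse call and reporting it in milliseconds. */
final class ParseTimingSketch {

    /** Runs the given parse action once and returns a one-line timing report. */
    static String timeParse(String fileName, String parserName, Runnable parse) {
        long start = System.nanoTime();
        parse.run();
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        return "Parsed '" + fileName + "' with " + parserName + " in " + elapsedMs + " ms.";
    }

    public static void main(String[] args) {
        // Placeholder workload; a real test would load the file and invoke the parser here.
        StringBuilder content = new StringBuilder();
        for (int i = 0; i < 100_000; i++) {
            content.append("<a></a>\n");
        }
        String text = content.toString();
        System.out.println(timeParse("largeFile.xml", "placeholder", () -> text.indexOf("</b>")));
    }
}
```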

fbricon commented on May 27, 2024

try validation, formatting, hover...

angelozerr commented on May 27, 2024

> try validation, formatting, hover...

Yes, sure, we need to add those tests too. But if creating the XMLDocument already takes 731 ms, I think everything else will be slow as well.

The WTP XML Editor cannot open your nasa.xml either.

I fear that it will be very hard to support very large files. I think part of the problem is that parsing is not incremental. Have you tried with the "experimental" incremental support?

angelozerr commented on May 27, 2024

I'm closing this issue since 0.8.0 improves performance and memory usage and adds the ability to disable the outline.
