Skip to content

Integrate Piece Tree -> IncrementalTextDocument | TextDocuments Incremental Document Support#478

Merged
dbaeumer merged 15 commits intomicrosoft:masterfrom
Jma353:server/incremental-text-documents-sync
Jun 5, 2019
Merged

Integrate Piece Tree -> IncrementalTextDocument | TextDocuments Incremental Document Support#478
dbaeumer merged 15 commits intomicrosoft:masterfrom
Jma353:server/incremental-text-documents-sync

Conversation

@Jma353
Copy link
Contributor

@Jma353 Jma353 commented Apr 9, 2019

This pull request is the result of a dialogue in #476.

This PR does 2 things:

  1. Integrates @rebornix's vscode-piece-tree npm module. I use this module to implement IncrementalTextDocument, which is the Incremental sync document representation (as FullTextDocument is the Full sync document representation). This is a class internal to server module.
  2. Updates the TextDocuments document manager to support Incremental text synchronization mode. This manager defaults to Full, but has an optional constructor parameter that allows the consumer of the API to specify Incremental document sync. When Incremental document sync is specified, this document manager passes isFullSync = false to TextDocument#create, which results in IncrementalTextDocument being used per document.

This PR also bootstraps a test suite for TextDocuments (specifically involving onChange events) + a test suite for IncrementalTextDocument via TextDocuments document manager.

Output of server tests:

  Connection Tests
    ✓ Ensure request parameter passing
    ✓ Ensure notification parameter passing

  TextDocuments Tests
    ✓ onDidChangeContent TextDocumentSyncKind.Full change file content
    ✓ onDidChangeContent TextDocumentSyncKind.Full several content updates
    ✓ onDidChangeContent TextDocumentSyncKind.Incremental removing content
    ✓ onDidChangeContent TextDocumentSyncKind.Incremental adding content
    ✓ onDidChangeContent TextDocumentSyncKind.Incremental replacing content
    ✓ onDidChangeContent TextDocumentSyncKind.Incremental several content changes

  IncrementalTextDocument
    ✓ Single line document
    ✓ Multiple line document
    ✓ Varying newline characters
    ✓ getText(Range)
    ✓ Invalid inputs
    ✓ Basic append
    ✓ Multi-line append
    ✓ Basic delete
    ✓ Multi-line delete
    ✓ Single character replace
    ✓ Multi-character replace
    ✓ Invalid update ranges


  20 passing (37ms)

@msftclas
Copy link

msftclas commented Apr 9, 2019

CLA assistant check
All CLA requirements met.

let updatedDoc: UpdateableDocument = edits.reduce(
(workingTextDoc: UpdateableDocument, edit) => {
let newText = TextDocument.applyEdits(workingTextDoc, [edit]);
workingTextDoc.update({ text: newText }, version);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently every update on the TextDocument will flush the _lineOffset

https://github.com/Microsoft/vscode-languageserver-node/blob/9de8809400b7c443bc25f211dfaddcb74675b2d7/types/src/main.ts#L2654

while applyEdits relies on _lineOffset https://github.com/Microsoft/vscode-languageserver-node/blob/9de8809400b7c443bc25f211dfaddcb74675b2d7/types/src/main.ts#L2500-L2501

It means for every content change we try to apply, it will go through the whole document, character by character, to find all line breaks. In real world scenarios it can be really costly.

Copy link
Contributor Author

@Jma353 Jma353 Apr 10, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, I definitely observed this limitation.

These black-box functions (applyEdits / update) are the only ways to currently update text documents. As we discussed in #476, I can use your npm module in this PR and swap out the implementations for these core functions as well, so that these calls don't rescan the entire document.

Apologies, when I originally wrote this up, I wasn't aware that you had already published the package to npm.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries and you are doing a great job here. I'm just a bit surprised to see that lineOffsets were being calculated every time when we do full sync (the good part is we calculate only when reading line info).

Based on your proposal and this PR, I think we can see two major performance improvement

  • We don't need to send full document from client to server anymore when the sync mode is incremental
  • The line offset calculation is incremental when the sync mode is incremental, which is actually the major time spent in loading a file in VS Code.

Let me know if you need any help from me, I can push directly to your PR with piece tree related changes (if you are okay with that).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you've made some sort of local progress on adding your piece tree as a dependency and using it, then feel free to push it to this PR. Otherwise, I was planning on adding it as a dependency + using it in FullTextDocument / TextDocument namespace today.

Feel free to let me know, and we can proceed from there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Jma353 go for it, I don't have anything yet, look forward to your change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For sure - I'll update this today or tomorrow.


// Prepare Piece Tree
const ptBuilder = new PieceTreeTextBufferBuilder();
ptBuilder.acceptChunk(content);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rebornix - should I be slicing this up into 64 KB chunks, as you mentioned is done by fs.readFile in your blog post, or is this an appropriate use of your API (passing the entire content "chunk" in on instantiation).

Copy link
Member

@rebornix rebornix Apr 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to. If we do simple slice, then there is no difference as SlicedString in v8 is just a mapping to part of the original string. If we create copies from that, the initialization will be costly.

It would be great that if the document data we get from client comes as chunks instead of a single string, and then we can benefit from that.

@Jma353 Jma353 changed the title TextDocuments Incremental Document Sync Support + Tests Integrate Piece Tree -> IncrementalTextDocument | TextDocuments Incremental Document Support Apr 16, 2019
@Jma353
Copy link
Contributor Author

Jma353 commented Apr 16, 2019

@dbaeumer given that I add piece tree npm module in this PR, unsure if I now need to add a thirdpartynotices.txt file in types, similar to the one in client. I'm assuming vscode-piece-tree will eventually become first-party (under Microsoft github).

@@ -2611,6 +2616,15 @@ export interface TextDocumentContentChangeEvent {
text: string;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we update the comments for text? In FullSync mode, text is always the content of the full document from client side but now in incremntal mode, it's the change for the corresponding range. Another thing we need to make sure that Client sends only content change but not full text in incremtnal mode.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/Microsoft/vscode-languageserver-node/blob/053149a76e7ec147347e41a5847a1cc4ccb3d541/client/src/client.ts#L936-L940

It's already guaranteed that when it's incremental, it sends the change event but not the full doc. So we only need to update the documentation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will update the documentation in the next commit to the PR :)

@Jma353
Copy link
Contributor Author

Jma353 commented Apr 18, 2019

Outside of updating documentation, any additional feedback on this? @rebornix, @dbaeumer, @jrieken? I can address it all in one commit if there's any more.

@rebornix
Copy link
Member

@Jma353 the code looks good to me and thanks for your help! I'll love to hear feedback from others as well, considering some of them are in vacation, I think you can push changes and even better do some benchmark.

@Jma353
Copy link
Contributor Author

Jma353 commented May 3, 2019

@rebornix I'll submit doc change + will setup a lightweight benchmark this weekend.

In general, I feel as though an explicit win of this is just the idea that we can support incremental sync-ing in the first place. Sending full files between processes isn't cheap.

*/
export function create(uri: string, languageId: string, version: number, content: string): TextDocument {
return new FullTextDocument(uri, languageId, version, content);
export function create(uri: string, languageId: string, version: number, content: string, isFullSync: boolean = true): TextDocument {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make isFullSync a enum or string or type. It more feels like a enum in case we have another sync expierence later on.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The boolean looks strange, and the return type is always TextDocument. It's unclear to an API user when to set isFullSync and when not.
What about defining a separate constructor?

When doing that I would also officially introduce the UpdatableTextDocument. The default implementation of a UpdatableTextDocument should be the new piece tree document as it is optimised for updates.

export interface UpdatableTextDocument extends TextDocument {
    update(event: TextDocumentContentChangeEvent, version: number)
}
export namespace UpdatableTextDocument {
    export function create(uri: string, languageId: string, version: number, content: string) : UpdatableTextDocument {
       return new IncrementalTextDocument(uri, languageId, version, content);
   }
}

@dbaeumer
Copy link
Member

dbaeumer commented May 8, 2019

Other than the above comment things look good to me.

@aeschli do you have any input / comments?

@aeschli
Copy link
Contributor

aeschli commented May 9, 2019

My comment is about TextDocument.create(uri: string, languageId: string, version: number, content: string, isFullSync: boolean = true): TextDocument

The isFullSync looks strange, and the return type is always TextDocument. It's unclear to an API user when to set isFullSync and when not.
What about defining a separate constructor?

When doing that I would also officially introduce the UpdatableTextDocument. The default implementation of a UpdatableTextDocument should be the new piece tree document as it is optimised for updates.

export interface UpdatableTextDocument extends TextDocument {
    update(event: TextDocumentContentChangeEvent, version: number)
}
export namespace UpdatableTextDocument {
    export function create(uri: string, languageId: string, version: number, content: string) : UpdatableTextDocument {
       return new IncrementalTextDocument(uri, languageId, version, content);
   }
}

TextDocument.create should continue to return a FullTextDocument for now, and we should also keep FullTextDocument.update, but we should fix it to also work correctly with a range.
At some point we could decide to remove the FullTextDocument and only use the new IncrementalTextDocument also for TextDocument.create.

@dbaeumer dbaeumer added this to the May 2019 milestone May 9, 2019
@dbaeumer
Copy link
Member

I though about this again and I am in favour of making this more explicit. However I don't like the name UpdatableTextDocument since FullTextDocument are updateable as well. To make this in line with the TextDocumentSyncKind I would call them FullTextDocument and IncrementalTextDocument. We should deprecate the TextDocument namespace and expose a FullTextDocument interface and namespace. The FullTextDocument interface would define an update(text: string, version) method. The IncrementalTextDocument text document a update(event: TextDocumentContentChangeEvent, version: number) method. We can also add is methods to the namespace which do an instanceof against the private class.

@Jma353 are you willing to do these changes?

@rebornix
Copy link
Member

rebornix commented May 24, 2019

I prefer FullTextDocument and IncrementalTextDocument as well, which aligns with TextDocumentSyncKind.

@Jma353
Copy link
Contributor Author

Jma353 commented May 29, 2019

These changes seem reasonable. I appreciate the feedback all!

Apologies for the slowness in my response (been busy with some other things). I will prioritize iterating on this by EOW.

@aeschli
Copy link
Contributor

aeschli commented May 29, 2019

@Jma353 @rebornix Me and Dirk had some more discussion and came up with one more suggestion, which has the smallest impact on the API:

  • we keep types as-is, no new APIs
  • IncrementalTextDocument stays in protocol as an internal class to be used byTextDocuments and optimized for its needs.

Sorry for the back and forth. If you are very busy then I'm happy to merge and make the changes. Just let me know.

@Jma353
Copy link
Contributor Author

Jma353 commented May 29, 2019

I can change these things @aeschli - thanks for the thoughtful feedback.

Expect an update with said changes by Friday, this way you don't have to do any additional refinements on your part.

@Jma353
Copy link
Contributor Author

Jma353 commented May 31, 2019

@aeschli did you mean server (not protocol) in your above feedback?

protocol seems to be only namespace and interface declarations. Additionally, TextDocuments is in server subdirectory.

I can remove changes from types directory. The only API change that will result from this PR is an optional argument passed to TextDocuments ctor. I currently pass it as a enum value.

@Jma353
Copy link
Contributor Author

Jma353 commented May 31, 2019

I have updated the PR according to feedback, to limit API changes. Here is a summary:

  1. As mentioned, the only API change is an optional constructor parameter passed to TextDocuments, which allows for enum-based switching between FullTextDocument and IncrementalTextDocument.
  2. IncrementalTextDocument is now a class internal to server (since it's only used by TextDocuments).
  3. I have ported all test-cases that existed in types for IncrementalTextDocument to server. I have the suite in the TextDocuments test file.
  4. I have rebased on @rebornix 's change to the piece tree npm version, as well as latest master.

@rebornix
Copy link
Member

Considering @Jma353 's great effort is going to be merged into the project, I'll add some scripts to VS Code in June to ship piece tree directly from there instead of having my own repository, similar to how we ship Monaco.

@Jma353
Copy link
Contributor Author

Jma353 commented May 31, 2019

@rebornix as in have piece tree available via vscode API suite?

A.k.a. import {PieceTreeBase, ...} from 'vscode';?

@rebornix
Copy link
Member

rebornix commented Jun 3, 2019

@Jma353 I didn't go that route finally but aligned with how we ship vscode-uri, which has its implementation in its own repo. I published a npm package called vscode-textbuffer, the api doesn't change at all and the repo will be under microsoft (will be public soon).

update

vscode-textbuffer is public now https://github.com/microsoft/vscode-textbuffer

@dbaeumer dbaeumer modified the milestones: May 2019, June 2019 Jun 4, 2019
@dbaeumer
Copy link
Member

dbaeumer commented Jun 4, 2019

@Jma353 I am correct that you still need to adopt the new vscode-textbuffer npm package. Right?

@rebornix
Copy link
Member

rebornix commented Jun 4, 2019

I replaced vscode-piece-tree with vscode-textbuffer and comments related to that. @dbaeumer is it good to go?

@Jma353
Copy link
Contributor Author

Jma353 commented Jun 4, 2019

@rebornix I see travis failure relating to the new module.

Thanks for swapping out npm modules btw :)

@rebornix
Copy link
Member

rebornix commented Jun 4, 2019

@Jma353 thanks it's green green now.

@dbaeumer
Copy link
Member

dbaeumer commented Jun 5, 2019

@Jma353 & @rebornix thanks a lot for making this happen.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants