RFC: Evaluating scope name additions to built-in grammars #19623

savetheclocktower · 2019-07-03T03:59:31Z

This RFC is about how to evolve grammars and syntax themes so that their design goals don't get in each other's way.

I am utterly certain that a maximum of four people on earth will care about this, but I'd love to find out I'm wrong. Hopefully some discussion can help refine exactly what is being proposed here.

Rendered version: docs/rfcs/005-scope-naming.md

lee-dohm · 2019-07-03T17:39:32Z

It looks like this is a recommendation for the triage workflow mainly, if not totally. If that is correct and this is accepted, this document or excerpts from it will probably end up in atom/design-decisions.

maxbrunsfeld · 2019-07-03T18:39:36Z

I don't have strong opinions on this, and am not working on Atom full-time any more, but I'll give my 2 cents here, in case anyone finds anything useful in it.

> For reference, the language-babel grammar scopes foo as variable.other.readwrite.js. I’d probably opt for something like variable.import; others may want to put it into the support namespace. There’s actually little cross-language consensus here.

I'm a bit skeptical that there will ever be cross-language consensus with that level of detail. I'd actually love for all of the scopes to become much, much simpler - ideally one word like (type, function, tag, variable, property, string), and occasionally two words (e.g. type.builtin), but only when necessary. IMO, the more complex the scopes become, the less compatible themes will be across different languages, the more tightly coupled themes will become to specific grammars, and the more bike shedding will go on endlessly.

When introducing the Tree-sitter grammars, I put a lot of work into trying to make themes look consistent across languages, and I found that I could do it to some degree, by simplifying the scopes. But people ended up needing to add back some of the specificity, mostly for compatibility with community themes. Backward compatibility is a huge impediment in this area.

> I’ve got lots of commands that behave in different ways based on the surrounding scope. The richer the scope descriptor, the better.

In my ideal long-term vision, the scopes we use for syntax highlighting would be decoupled from APIs like atom.commands and atom.config. The syntax tree itself is a much more precise and performant way to customize behavior syntactically, as we have done in atom/bracket-matcher#367, and with the new folding system.

Unfortunately, I don't have detailed designs for how to use the syntax tree to serve your use cases. And once the API is designed, it's a lot of work to document it and try to migrate existing code to use it.

savetheclocktower · 2019-07-03T19:37:20Z

First off, @maxbrunsfeld: my main fear when writing this was that it would come off as unreasonably critical or dismissive of your efforts and design choices, because that’s honestly not how I feel.

> I'm a bit skeptical that there will ever be cross-language consensus with that level of detail. I'd actually love for all of the scopes to become much, much simpler - ideally one word like (type, function, tag, variable, property, string), and occasionally two words (e.g. type.builtin), but only when necessary. IMO, the more complex the scopes become, the less compatible themes will be across different languages, the more tightly coupled themes will become to specific grammars, and the more bike shedding will go on endlessly.

I see what you’re saying here, but I think this is still conflating the design goals of syntax themes with the design goals of grammars. Any syntax theme can choose to behave in this way — to color variable the same whether it’s variable.foo or variable.bar.baz.thud — and I’d even agree that the built-in themes should shoot for that kind of simplicity out of the gate.

If I were giving advice to someone writing their first syntax theme, I’d tell them to start out by paying attention to only the initial part of a scope name. Pick your colors for variables, comments, strings, and such, and then you’re 80% done, and left with a theme that will look decent in any conforming grammar. But the last 20% of writing a syntax theme is about distinguishing the exceptions: going through the most popular grammars and applying any necessary tweaks based on the semantics of the particular language or, hell, just personal preference.

I agree that it’s not feasible to get cross-language consensus on how that last 20% should be scoped. Should the foo in import foo from "thing" be scoped as variable or constant? I don’t know. Either argument could be made. But if I want it to look like a variable in my syntax theme, I’m out of luck if it’s simply scoped constant. At least if it’s constant.imported-package I’ve got something to work with.
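To make the point about specificity concrete, here is a minimal sketch of prefix-based scope matching. It is an illustration only: real Atom themes are stylesheets rather than lookup tables, and the `Theme` type, the `colorFor` helper, the color values, and the `.js` scope suffixes are all hypothetical names chosen for this example.

```typescript
// Hypothetical theme model: scope selector -> color.
// None of these names or colors come from a real Atom theme.
type Theme = Record<string, string>;

// A selector matches a scope when it equals the scope or is a
// dot-delimited prefix of it ("constant" matches "constant.numeric.js").
function selectorMatches(selector: string, scope: string): boolean {
  return scope === selector || scope.startsWith(selector + ".");
}

// Return the color of the most specific (longest) matching selector,
// or undefined when no selector in the theme matches.
function colorFor(theme: Theme, scope: string): string | undefined {
  let best: string | undefined;
  for (const selector of Object.keys(theme)) {
    if (!selectorMatches(selector, scope)) continue;
    if (best === undefined || selector.length > best.length) {
      best = selector;
    }
  }
  return best === undefined ? undefined : theme[best];
}

// The "first 80%": one color per top-level scope.
const basicTheme: Theme = { variable: "red", constant: "orange" };

// The "last 20%": a tweak that is only possible because the grammar
// emitted something more specific than a bare "constant".
const tweakedTheme: Theme = {
  ...basicTheme,
  "constant.imported-package": "red",
};

console.log(colorFor(basicTheme, "constant.imported-package.js"));
console.log(colorFor(tweakedTheme, "constant.imported-package.js"));
console.log(colorFor(tweakedTheme, "constant.numeric.js"));
```

A theme that only knows about `constant` still colors the import sensibly, while a theme that wants imports to look like variables can override the longer selector without affecting other constants. If the grammar emitted only `constant`, the second theme would have nothing to hook into.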

I understand the view that the existing hierarchy is a bit too left-brained, and even needlessly complex, but I think that the examples you’re proposing are too simple to do the job.

> When introducing the Tree-sitter grammars, I put a lot of work into trying to make themes look consistent across languages, and I found that I could do it to some degree, by simplifying the scopes. But people ended up needing to add back some of the specificity, mostly for compatibility with community themes. Backward compatibility is a huge impediment in this area.

Before I edited this RFC for brevity, I had written several paragraphs on how challenging this task must have been, and how different developers would’ve approached it in various ways, all equally valid. There’s never a good time to try to harmonize the scoping of built-in grammars (it’ll change things, and people will complain), but since the Tree-sitter grammars were going to change syntax highlighting anyway, this was a natural time to try.

> In my ideal long-term vision, the scopes we use for syntax highlighting would be decoupled from APIs like atom.commands and atom.config. The syntax tree itself is a much more precise and performant way to customize behavior syntactically, as we have done in atom/bracket-matcher#367, and with the new folding system.

I agree that bracket-matching and folding are better done outside of the scope system. I still think that it’s better to have scope names be the “public interface” around the syntax tree because (a) we still live in a world with TextMate-style (TM-style) grammars, for which there’s no syntax tree to use; and (b) scopes allow someone to customize behavior in an abstract way, without having to know the details of how a certain grammar’s Tree-sitter nodes are named.

Let me clarify what I’m talking about in the latter point:

- The built-in link package allows you to open a URL if your cursor is within it. To figure out when the cursor is within a URL, it checks for the presence of the markup.underline.link scope. This enables it to work in any grammar that scopes URLs in that manner. (The only specific knowledge it has is of Markdown, so that it can follow named hyperlinks like [link text][footnote].)
- The built-in toggle-quotes package allows you to toggle a string between single quotes and double quotes (and some other quote delimiters on a user-configurable, per-language basis). To figure out when the cursor is inside of a quoted string, it checks for the presence of the string.quoted scope.

These are packages that use scopes for their semantic value apart from syntax highlighting. Of course, these packages could inspect syntax trees instead. But for that to happen, we’d have to go beyond the tree-sitter grammars and come up with some naming conventions for tree-sitter parsers. After a lot of work, we’d end up with a standard for semantic, language-independent naming of common constructs in programming/markup languages. In other words, we’d have reinvented scopes.
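As a rough sketch of the kind of check these packages perform, the snippet below tests whether a chain of scopes contains a given scope. It is not the actual code from link or toggle-quotes: the `ScopeChain` type, the `chainHasScope` helper, and the sample scope chains are assumptions made up for illustration, and the chains are hard-coded so the example runs without an editor.

```typescript
// Scopes in effect at the cursor, outermost first. In a real package
// these would come from the editor; here they are hard-coded samples.
type ScopeChain = string[];

// True when any scope in the chain equals the selector or extends it
// with further dot-delimited segments (such as a language suffix).
function chainHasScope(chain: ScopeChain, selector: string): boolean {
  return chain.some(
    (scope) => scope === selector || scope.startsWith(selector + ".")
  );
}

// Hypothetical scope chains for a cursor inside a JavaScript string
// and inside a URL in a Markdown file.
const insideJsString: ScopeChain = ["source.js", "string.quoted.double.js"];
const insideMarkdownUrl: ScopeChain = ["source.gfm", "markup.underline.link.gfm"];

// The same language-independent checks work for both grammars.
const canToggleQuotes = chainHasScope(insideJsString, "string.quoted");
const canOpenLink = chainHasScope(insideMarkdownUrl, "markup.underline.link");

console.log({ canToggleQuotes, canOpenLink });
```

Neither check mentions JavaScript or Markdown by name. The package relies only on the grammar following the shared scope naming convention, which is the abstraction argued for above.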

> Unfortunately, I don't have detailed designs for how to use the syntax tree to serve your use cases. And once the API is designed, it's a lot of work to document it and try to migrate existing code to use it.

Understood. I’m definitely up for as much of that work as I’m able to do, provided a consensus emerges on how to proceed.

sadick254

This looks like something I would consider doing some time in the future. @savetheclocktower Thank you for the detailed RFC.

savetheclocktower added 3 commits July 2, 2019 22:42

8d3c5c2 Add RFC about how to evaluate proposed scope additions to grammars.
f31782c Give the RFC a title.
209c104 Fix typos.

savetheclocktower mentioned this pull request Jul 3, 2019

Suggestions for scope additions/changes atom/language-javascript#649

Open

claytonrcarter mentioned this pull request Aug 9, 2021

Update draft tree-sitter grammar in #303 atom/language-php#438

Draft

10 tasks

sadick254 approved these changes Sep 3, 2021

View reviewed changes

sadick254 merged commit 779a9ca into atom:master Sep 3, 2021
