most efficient and professional way

Question:
I want code for: syntax highlighting (of programming languages)
Language: C# or assembly x86 (preferably C#)
Platform: Windows
Qualifications: most efficient implementation possible / most professional / the way that big corporations like Microsoft do it
Rephrased: How do I implement syntax highlighting in C# for Windows in the most efficient way presently known?


Elaboration (feel free to skip - not needed to answer question :)):
I don't want just any way of implementing it - I've already seen several.
What I'd like to know is how Microsoft does it so well on Visual Studio (whichever version).

People keep trying to reinvent the wheel when it comes to syntax highlighting. I don't understand why.
Is this considered a very hard problem? I've seen implementations that only highlight what's currently showing on the screen, and I think that's the way to go (one of them used some clever API to find out which lines of a textbox were actually showing).
I've also seen implementations using RichTextBox, and I think that's not the way to go (maybe I'm wrong here). I think something like subclassing the routine that draws text on the regular textbox and changing its brushes might be better (maybe I've seen that somewhere; I doubt I'd think of that myself).
I've also heard that some people implement it with an AST, just like a compiler would be coded (the lexer part, I think?). I'd hope that that's overkill, since I don't see it as being efficient. (uneducated guess)

If it's indeed a hard problem, then how do the big corps always get it right? I've never heard of a way to break the syntax highlighting in Visual Studio, for example.
But any other tool that implements it does so poorly, or worse than the big guys.
What's the official "this is the best way and any other way is less efficient" way of doing it?

I really don't have any evidence that Microsoft's way is better, but seeing that they probably know more about the Windows API than anybody else, I'd guess that their way of implementing it is the best (I would love to be wrong - imagine being able to say that my implementation of syntax highlighting is better than MS's!)

Sorry for the disjointed elaboration.
Also I apologize in advance for any faux-pas - this is my first question.


I don't think there is a "this is the best way and any other way is less efficient" way to do it. In reality, I don't think efficiency is the major problem; complexity is. A good syntax highlighter is based on a good parser. As long as you can parse the code, you can highlight every part of it in any way you like. But what happens when the code is not well-formed? A lot of syntax highlighters just highlight keywords and a few block structures to get around this problem. By doing so, they can use simple regular expressions instead of a full-fledged, syntax-error-tolerant parser (which is what Visual Studio has).
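To make the "simple regular expressions" approach concrete, here is a minimal sketch of a keyword-and-token highlighter. It is written in Python for brevity, but the same structure maps onto System.Text.RegularExpressions in C#. The keyword set, the token classes, and the name highlight_spans are assumptions made for illustration, not how Visual Studio or any particular editor does it.

```python
import re

# Tiny illustrative subset of keywords for a C#-like language (an assumption).
KEYWORDS = {"class", "int", "return", "if", "else", "void", "string"}

# One combined pattern; alternatives are tried in order at each position.
# The closing quote of a string is optional, so an unterminated string on
# the current line is still colored instead of breaking the scan.
TOKEN_RE = re.compile(r"""
    (?P<comment>//[^\n]*)
  | (?P<string>"(?:\\.|[^"\\\n])*"?)
  | (?P<number>\d+(?:\.\d+)?)
  | (?P<word>[A-Za-z_]\w*)
""", re.VERBOSE)

def highlight_spans(source):
    """Return (kind, start, end) spans to color; other text is left alone."""
    spans = []
    for match in TOKEN_RE.finditer(source):
        kind = match.lastgroup
        if kind == "word":
            if match.group() not in KEYWORDS:
                continue  # plain identifier: no special color
            kind = "keyword"
        spans.append((kind, match.start(), match.end()))
    return spans
```

Because this never builds a parse tree, malformed code cannot make it fail, but it also cannot tell a type name from a variable name. That trade-off is the complexity gap between this and a real error-tolerant parser.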


The best approach is probably to reuse something that already exists, such as ScintillaNET.


As with anything in code, there is rarely a "best" way. There are multiple ways of doing things, and each of them has benefits and drawbacks.

That said, some form of the Interpreter Pattern is probably the most common way. According to the GoF book:

The Interpreter pattern is widely used in compilers implemented with object-oriented languages, as the Smalltalk compilers are. SPECTalk uses the pattern to interpret descriptions of input file formats. The QOCA constraint-solving toolkit uses it to evaluate constraints.

It also goes on to talk about its limitations in the Applicability section, saying the pattern works best when:

  • the grammar is simple. For complex grammars, the class hierarchy for the grammar becomes large and unmanageable. Tools such as parser generators are a better alternative in such cases
  • efficiency is not a critical concern. The most efficient interpreters are usually not implemented by interpreting parse trees directly but by first translating them into another form. For example, regular expressions are often transformed into state machines. But even then, the translator can be implemented by the Interpreter pattern, so the pattern is still applicable.

Understanding this, you should now see why it's better to precompile a reusable regex before performing many matches with it. If you don't, both steps (transformation and interpretation) have to be done every time, rather than building the state machine once and applying it efficiently many times over.
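As a rough illustration of the precompile-once advice, here is a sketch in Python. The pattern and the name count_identifiers are made up for the example. In .NET the equivalent is to construct the Regex object once (optionally with RegexOptions.Compiled) and keep it in a static field, rather than building it inside the loop.

```python
import re

# Compiled a single time, when the module is loaded.
# The pattern itself is an illustrative assumption.
IDENTIFIER_RE = re.compile(r"[A-Za-z_]\w*")

def count_identifiers(lines):
    """Apply the one precompiled pattern to every line."""
    return sum(len(IDENTIFIER_RE.findall(line)) for line in lines)
```

Python's re module also caches recently compiled patterns internally, so the gain there is smaller than it may be elsewhere, but holding on to the compiled object makes the "transform once, match many times" structure explicit.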

Specifically for the type of interpretation you are describing, Microsoft exposes the Microsoft.VisualStudio namespace and all of its powerful features as part of the Visual Studio SDK. You can also look at System.CodeDom for dynamic code generation and compilation.
