Convert rich Markdown to plain text
How can I convert rich Markdown into plain text, so that it can be used, for example, as a Facebook OpenGraph description?
I'm using MarkdownSharp, and it doesn't seem to have this functionality. Before reinventing the wheel, I thought I'd ask here first.
Any hints about an implementation strategy are greatly appreciated!
Example
The Monorailcat
---------------
![Picture of a Lolcat](https://media1.giphy.com/media/c7goDcMPKjw6A/200_s.gif)
One of the earliest pictures of **monorail cat** found is from the website [catmas.com’s blog][1] section, dated from November 2, 2006.
[1]: http://catmas.com/blog
Should be converted to:
The Monorailcat
One of the earliest pictures of monorail cat found is from the website catmas.com’s blog section, dated from November 2, 2006.
You have a few possibilities.
1. As stated in a comment, you can convert to HTML, then convert the HTML to plain text. This is probably the most reliable and consistent solution cross-platform.
2. Switch to a library that can convert between multiple formats, including the formats you desire. Pandoc would be an example of such a tool.
3. Use a Markdown parser which outputs an AST. While such parsers usually provide an HTML renderer (accepts AST as input and outputs HTML), you can create your own renderer which outputs whatever format you want.
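To make the third option more concrete, here is a minimal sketch of what a custom plain-text renderer could look like. It is written in Python purely for illustration, and the `Node` class and its `kind` values are hypothetical: they are not the AST of any real parser, so a real implementation would have to be adapted to whatever node types the chosen library exposes.

```python
from dataclasses import dataclass, field
from typing import List


# Hypothetical AST node, invented for this sketch; real parsers define their own.
@dataclass
class Node:
    kind: str                 # "document", "heading", "paragraph", "text", "strong", "link", "image"
    text: str = ""            # literal text, only used by "text" nodes
    children: List["Node"] = field(default_factory=list)


def render_plain(node: Node) -> str:
    """Render the hypothetical AST as plain text, dropping images and all markup."""
    if node.kind == "text":
        return node.text
    if node.kind == "image":
        return ""  # no plain-text equivalent is emitted for images
    if node.kind == "document":
        # One output line per non-empty block (heading or paragraph).
        blocks = (render_plain(child) for child in node.children)
        return "\n".join(block for block in blocks if block)
    # heading, paragraph, strong, link, ...: keep only the inner text
    return "".join(render_plain(child) for child in node.children)


# Hand-built tree for a shortened version of the example in the question.
doc = Node("document", children=[
    Node("heading", children=[Node("text", "The Monorailcat")]),
    Node("paragraph", children=[
        Node("image", children=[Node("text", "Picture of a Lolcat")]),
    ]),
    Node("paragraph", children=[
        Node("text", "One of the earliest pictures of "),
        Node("strong", children=[Node("text", "monorail cat")]),
        Node("text", " found."),
    ]),
])

print(render_plain(doc))
```

The renderer is just a recursive walk: inline nodes contribute their inner text, image nodes contribute nothing, and block nodes are joined with line breaks. The same shape should carry over to a visitor over a real parser's AST.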
Actually, it turns out that Pandoc is also an example of the third option. It just happens to already have a plain text renderer. Of course, if you are looking for a C# library, then Pandoc may not meet your needs, and I'm not aware of any C# libraries which do (the reference implementation uses regex string substitution, and many, perhaps most, parsers have followed that example). That said, I'm not familiar with any of the Markdown libraries in C#, and this is not an appropriate place to make recommendations. However, there is a lengthy, albeit incomplete, list of parsers here. You may find something of use there.
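For the first option, the second half of the pipeline (HTML to plain text) can be sketched as follows. This is an illustration in Python using only the standard library's `html.parser`, not a C# solution; the input is assumed to be the HTML already produced by whatever Markdown library is in use, and the names `PlainTextExtractor` and `html_to_plain_text` are made up for this sketch. The same idea should translate to any HTML parser available in C#.

```python
from html.parser import HTMLParser


class PlainTextExtractor(HTMLParser):
    """Collect text content and break lines after block-level elements."""

    BLOCK_TAGS = {"p", "h1", "h2", "h3", "h4", "h5", "h6", "li", "blockquote", "pre"}

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decode entities such as &amp;
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)

    def handle_endtag(self, tag):
        if tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def text(self):
        # Trim each line and drop the empty ones left behind by removed tags.
        lines = (line.strip() for line in "".join(self.parts).splitlines())
        return "\n".join(line for line in lines if line)


def html_to_plain_text(html):
    parser = PlainTextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return parser.text()


html = (
    "<h2>The Monorailcat</h2>\n"
    '<p><img src="cat.gif" alt="Picture of a Lolcat" /></p>\n'
    "<p>One of the earliest pictures of <strong>monorail cat</strong> found.</p>\n"
)
print(html_to_plain_text(html))
```

Because only text nodes are collected, the image disappears entirely (its alt text lives in an attribute), and bold and link markup collapse to their inner text. The sketch does not special-case `<script>` or `<style>` content, which Markdown output would not normally contain.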