Python: How to convert markdown-formatted text to plain text

Date: 2022-12-10 21:34:38

I need to convert markdown text to plain text so that I can display a summary on my website. I want the code in Python.

2 answers

#1


37  

This module will help do what you describe:

http://www.freewisdom.org/projects/python-markdown/Using_as_a_Module

Once you have converted the markdown to HTML, you can use an HTML parser to extract the plain text.

Your code might look something like this:

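from bs4 import BeautifulSoup  # BeautifulSoup 4 (pip install beautifulsoup4)
from markdown import markdown

# Render the markdown source to HTML first
html = markdown(some_markdown_string)
# Then join every text node, which drops the tags
text = ''.join(BeautifulSoup(html, 'html.parser').find_all(string=True))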

#2


2  

Commented and removed it because I finally think I see the rub here: it may be easier to convert your markdown text to HTML and then strip the HTML from the result. I'm not aware of anything that removes markdown from text effectively, but there are many HTML-to-plain-text solutions.
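
For what it's worth, one such HTML-to-plain-text approach needs nothing beyond the standard library. The following is only a rough sketch: the names `TextExtractor` and `html_to_text` are made up for illustration, and it assumes you already have the HTML that a markdown converter produced.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes and ignores tags (hypothetical helper name)."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into characters
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        # Called for each run of text found between tags
        self.chunks.append(data)


def html_to_text(html):
    """Return the text content of an HTML fragment with the tags removed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.chunks)


# Stand-in for the HTML produced by a markdown converter
sample_html = "<h1>Title</h1>\n<p>Some <em>emphasised</em> text &amp; a <a href='#'>link</a>.</p>"
print(html_to_text(sample_html))
```

Note that this keeps the contents of every element, so if your HTML could contain `<script>` or `<style>` blocks you would want to skip those, or use a fuller parser.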
