Using C#, how should I extract titles, subtitles and paragraphs from a docx document?

Time: 2022-10-30 13:58:49

Using C#, how should I go about extracting titles, subtitles and paragraphs from a docx document?

I am thinking of doing this through VSTO, but I do not know the Word object model. I am only familiar with the Excel object model.

Should I take the unzip + LINQ to XML approach?

Using VSTO, I could build an add-in for editing inside the application, where I would convert to and from docx.

Does anyone have prior experience with this kind of thing? Any leads will be greatly appreciated.

1 solution

#1


Personally, I'd take the unzip + LINQ to XML approach. (You can unzip using the built-in support in the framework or, if you are using an old version, the zip library provided by icsharpcode.net.)
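To make the unzip + LINQ to XML idea concrete, here is a minimal sketch, not a definitive implementation. It assumes .NET 4.5 or later (for `System.IO.Compression.ZipArchive`) and a compiler with C# 7 tuples. The class name `DocxText` and its two methods are hypothetical names made up for illustration. It also assumes the document uses Word's default English style IDs (`Title`, `Subtitle`, `Heading1`, ...); other templates or locales may use different IDs.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, for illustration only.
public static class DocxText
{
    // WordprocessingML main namespace used inside word/document.xml.
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Opens the .docx (a zip package) and parses its main document part.
    public static List<(string Style, string Text)> ReadParagraphs(Stream docx)
    {
        using (var zip = new ZipArchive(docx, ZipArchiveMode.Read, leaveOpen: true))
        {
            var entry = zip.GetEntry("word/document.xml");
            if (entry == null)
                throw new InvalidDataException("word/document.xml not found");
            using (var xml = entry.Open())
            {
                return ParseParagraphs(XDocument.Load(xml));
            }
        }
    }

    // Returns one (style ID, text) pair per <w:p> element.
    // Style is "" when the paragraph carries no explicit <w:pStyle>.
    // Note: Descendants also picks up paragraphs nested in tables and text boxes.
    public static List<(string Style, string Text)> ParseParagraphs(XDocument doc)
    {
        return doc.Descendants(W + "p")
            .Select(p => (
                Style: (string)p.Element(W + "pPr")
                                ?.Element(W + "pStyle")
                                ?.Attribute(W + "val") ?? "",
                // A paragraph's text is split across runs; join every <w:t>.
                Text: string.Concat(p.Descendants(W + "t").Select(t => t.Value))))
            .ToList();
    }
}
```

With this, you could open the file with `File.OpenRead`, pass the stream to `DocxText.ReadParagraphs`, and filter the result on `Style` (for example `"Title"`, `"Subtitle"`, or IDs starting with `"Heading"`) to separate titles and subtitles from body paragraphs. Paragraphs that rely on the default style come back with an empty style ID.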

I'd take this approach because, for something as simple as this, I'd rather not depend on VSTO. This way the end user doesn't even need to have Office installed. (It also avoids any Office licensing issues, although I don't know the details of those.)

Just my opinion.
