Date: 2014-05-17

Reading a Word Document Using C#

Original article: http://www.c-sharpcorner.com/UploadFile/Globalking/fileAccessingusingcsharp02242006050207AM/fileAccessingusingcsharp.aspx

Author: Krishnan LN

Original title: Reading a word document using C#

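A FileStream object can read the contents of a plain text file, but that approach does not work for a Word document, because a .doc file is stored in a binary format rather than as plain text.

Instead, we need a COM component called "Microsoft Word 9.0 Object Library", which provides all the objects and methods needed to read a Word document.

Here we mainly use the methods of Word.ApplicationClass to drive the Word application.

The idea is to open the Word document in memory, copy its entire content to the clipboard, and then read the data back from the clipboard.

The code is as follows: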

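// path holds the full path of the Word document to read.
Word.ApplicationClass wordApp = new Word.ApplicationClass();
object file = path;
// Missing.Value stands in for every optional argument we do not set.
object nullobj = System.Reflection.Missing.Value;
Word.Document doc = wordApp.Documents.Open(
    ref file, ref nullobj, ref nullobj,
    ref nullobj, ref nullobj, ref nullobj,
    ref nullobj, ref nullobj, ref nullobj,
    ref nullobj, ref nullobj, ref nullobj);
// Select the whole document and copy it to the clipboard.
doc.ActiveWindow.Selection.WholeStory();
doc.ActiveWindow.Selection.Copy();
// Read the copied text back from the clipboard (requires an STA thread).
IDataObject data = Clipboard.GetDataObject();
txtFileContent.Text = data.GetData(DataFormats.Text).ToString();
// Close the document, then quit Word so no WINWORD.EXE process is left running.
doc.Close(ref nullobj, ref nullobj, ref nullobj);
wordApp.Quit(ref nullobj, ref nullobj, ref nullobj);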
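
The clipboard approach above overwrites whatever the user had copied. As a hedged alternative that is not part of the original article, the sketch below reads the text directly from the document's Content range instead. It assumes the same "Microsoft Word 9.0 Object Library" COM reference (interop namespace Word) as the listing above; the class name WordTextReader and the method name ReadAllText are hypothetical, chosen only for illustration.

```csharp
using System;

static class WordTextReader
{
    // Hypothetical helper (not from the original article): returns the
    // main text of the document at `path` without using the clipboard.
    public static string ReadAllText(string path)
    {
        object file = path;
        object nullobj = System.Reflection.Missing.Value;
        // Assumption: we only read, so any changes are discarded on close.
        object dontSave = Word.WdSaveOptions.wdDoNotSaveChanges;

        Word.ApplicationClass wordApp = new Word.ApplicationClass();
        try
        {
            Word.Document doc = wordApp.Documents.Open(
                ref file, ref nullobj, ref nullobj,
                ref nullobj, ref nullobj, ref nullobj,
                ref nullobj, ref nullobj, ref nullobj,
                ref nullobj, ref nullobj, ref nullobj);
            try
            {
                // Range.Text marks paragraph ends with '\r';
                // convert them to the platform line break.
                string text = doc.Content.Text;
                return text.Replace("\r", Environment.NewLine);
            }
            finally
            {
                // Close the document even if reading the text failed.
                doc.Close(ref dontSave, ref nullobj, ref nullobj);
            }
        }
        finally
        {
            // Quit Word so no background WINWORD.EXE process is left behind.
            wordApp.Quit(ref dontSave, ref nullobj, ref nullobj);
        }
    }
}
```

With this helper, the text box from the listing above could be filled with txtFileContent.Text = WordTextReader.ReadAllText(path); the try/finally blocks are there so that Word is shut down even when opening or reading the file throws.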

The author's original text follows:

We may have used FileStream to read text from a text file but not the same way for getting text from a word document.

We have to use a Microsoft COM component called "Microsoft Word 9.0 object library" which provides classes and methods to read from a word document.

We have to use Word.ApplicationClass to have access to the word application.

Open the word document in memory, copy all the content to the clipboard and then we can take the data from the clipboard.

The code required is given below:

Word.ApplicationClass wordApp = new ApplicationClass();
object file = path;
object nullobj = System.Reflection.Missing.Value;
Word.Document doc = wordApp.Documents.Open(
    ref file, ref nullobj, ref nullobj,
    ref nullobj, ref nullobj, ref nullobj,
    ref nullobj, ref nullobj, ref nullobj,
    ref nullobj, ref nullobj, ref nullobj);
doc.ActiveWindow.Selection.WholeStory();
doc.ActiveWindow.Selection.Copy();
IDataObject data = Clipboard.GetDataObject();
txtFileContent.Text = data.GetData(DataFormats.Text).ToString();
doc.Close();