Merging and splitting Word documents in C# using Microsoft.Office.Interop.Word
I need to merge many Word files into a single file, send it to a reviewer, and then split it back into the same separate files. There are about 200 small Word documents.
So, when I do the merge, I need to add some kind of marker that I can refer to later when splitting. Right now I'm adding a tag with the original file name, so the final Word file looks like this:
[ c:\doc\file1.doc ]
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Duis eget ipsum non est ultricies bibendum ac a sapien. Etiam facilisis nunc ut arcu tincidunt, in fermentum ipsum pretium. Phasellus non viverra orci. Vestibulum varius vulputate leo quis fermentum. Phasellus adipiscing diam ultricies odio accumsan, et dapibus velit dapibus. Sed eleifend lectus et lacinia facilisis. Pellentesque eleifend, purus in convallis faucibus, sapien purus fringilla arcu, a volutpat dolor arcu ullamcorper purus. In viverra magna neque, eget imperdiet urna luctus at. In hac habitasse platea dictumst. Praesent aliquam arcu diam, quis fermentum lacus pellentesque ut. Aliquam nulla eros, porttitor quis molestie eu, mollis vel lacus. Sed nec aliquam libero. Donec vel congue sapien, sed dignissim nisl. Praesent dui nulla, fringilla iaculis lorem id, lacinia imperdiet odio.
[ c:\doc\file1.doc ]
[ c:\doc\file2.doc ]
Proin eu consectetur turpis, vel sagittis arcu. Mauris iaculis lacus ut orci adipiscing, vitae eleifend ipsum egestas. Suspendisse ullamcorper consequat laoreet. Nullam interdum augue eget ante tempor porttitor. Sed dignissim nulla libero, eu ultricies urna vestibulum quis. Phasellus rhoncus leo sed leo gravida, nec ullamcorper neque tempor. Sed sollicitudin, nisi ut lobortis sollicitudin, dui enim tristique leo, ac sodales leo elit quis odio. Nulla dictum mattis mi in tempus.
[ c:\doc\file2.doc ]
I'm using this code to merge the files, and it works fine:
using System;
using System.Collections.Generic;
using Word = Microsoft.Office.Interop.Word;

namespace MyDocs
{
    public class MsWord
    {
        public static void Merge(List<string> filesToMerge, string outputFilename, string documentTemplate)
        {
            object defaultTemplate = documentTemplate;
            object missing = System.Type.Missing;
            object outputFile = outputFilename;

            // Create a new Word application
            Word._Application wordApplication = new Word.Application();
            try
            {
                // Create a new file based on our template
                Word._Document wordDocument = wordApplication.Documents.Add(ref defaultTemplate, ref missing, ref missing, ref missing);

                // Make a Word selection object.
                Word.Selection selection = wordApplication.Selection;

                // Loop through each of the Word documents
                foreach (var file in filesToMerge)
                {
                    // Create a tag with the file name and write it before and after the file's content
                    string uid = String.Format("\n[ {0} ]\n", file);
                    selection.TypeText(uid);
                    selection.InsertFile(file, ref missing, ref missing, ref missing, ref missing);
                    selection.TypeText(uid);
                }

                // Save the document to its output file.
                wordDocument.SaveAs(ref outputFile,
                    ref missing, ref missing, ref missing, ref missing, ref missing,
                    ref missing, ref missing, ref missing, ref missing, ref missing,
                    ref missing, ref missing, ref missing, ref missing, ref missing);

                // Clean up!
                wordDocument = null;
            }
            catch (Exception)
            {
                // I didn't include a default error handler, so I'm just rethrowing.
                // A bare "throw;" keeps the original stack trace; "throw ex;" would reset it.
                throw;
            }
            finally
            {
                // Finally, close our Word application
                wordApplication.Quit(ref missing, ref missing, ref missing);
            }
        }
    }
}
Now I'm stuck: I don't know how to do the split, and I don't understand the Interop classes. I need to read the entire Word document, find the tags, and split it into separate files.
I don't think the tag is the best approach, because I don't need it to be visible. I tried using the Section object like this:
foreach (var file in filesToMerge)
{
    selection.Sections.Add();
    selection.InsertFile(Environment.CurrentDirectory + @"\" + file, ref missing, ref missing, ref missing, ref missing);
}
And then I read the document back like this:
foreach (Word.Section section in wordDocument.Sections)
{
    // do save stuff
}
But now only 2 sections are returned :(
In my opinion, the best option (instead of tags) would be to use bookmarks. Bookmarks are:
- easy to add, with something like ActiveDocument.Bookmarks.Add... (based on VBA syntax),
- easy to find (by name),
- iterable with a For Each loop, where the iteration goes by bookmark name,
- equipped with a Range object property, which lets you find the exact point in the document where the bookmark is located,
- allowed to be a zero-length range if needed,
- invisible if the name starts with _ (an underscore; this works only when the bookmark is added programmatically).
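To make the bookmark idea concrete, here is a minimal C# sketch of both directions. It is an illustration under stated assumptions, not a tested implementation: the class name `BookmarkMergeSketch`, the `src0000`-style bookmark names, and the `outputFolder` parameter are all made up for this example, and it assumes C# 4 or later so that optional `ref` arguments of the Interop calls can be omitted. Bookmark names cannot contain characters such as `\` or `:`, so a file path cannot be used as the name directly; the sketch stores the file's index in the bookmark name instead and expects the same file list at split time.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using Word = Microsoft.Office.Interop.Word;

public static class BookmarkMergeSketch
{
    // Hypothetical naming scheme: "src0000", "src0001", ... (index into the file list).
    private const string Prefix = "src";

    public static void Merge(IList<string> files, string outputPath)
    {
        Word._Application app = new Word.Application();
        try
        {
            Word._Document merged = app.Documents.Add();
            for (int i = 0; i < files.Count; i++)
            {
                // Insert the file just before the document's final paragraph mark.
                int start = merged.Content.End - 1;
                merged.Range(start, start).InsertFile(files[i]);
                int end = merged.Content.End - 1;

                // Bookmark the span that was just inserted.
                merged.Bookmarks.Add(Prefix + i.ToString("D4"), merged.Range(start, end));

                // Separator paragraph, intended to keep neighbouring bookmarks from touching.
                merged.Content.InsertParagraphAfter();
            }
            merged.SaveAs(outputPath);
            merged.Close(Word.WdSaveOptions.wdDoNotSaveChanges);
        }
        finally
        {
            app.Quit(Word.WdSaveOptions.wdDoNotSaveChanges);
        }
    }

    public static void Split(string mergedPath, IList<string> files, string outputFolder)
    {
        Word._Application app = new Word.Application();
        try
        {
            Word._Document merged = app.Documents.Open(mergedPath);
            foreach (Word.Bookmark bookmark in merged.Bookmarks)
            {
                int index;
                if (!TryParseIndex(bookmark.Name, out index) || index >= files.Count)
                    continue; // not one of our bookmarks

                // Copy the bookmarked range, with formatting, into a new document.
                Word._Document part = app.Documents.Add();
                part.Content.FormattedText = bookmark.Range.FormattedText;

                // The file format is left at Word's default; pass FileFormat explicitly if it matters.
                part.SaveAs(Path.Combine(outputFolder, Path.GetFileName(files[index])));
                part.Close(Word.WdSaveOptions.wdDoNotSaveChanges);
            }
            merged.Close(Word.WdSaveOptions.wdDoNotSaveChanges);
        }
        finally
        {
            app.Quit(Word.WdSaveOptions.wdDoNotSaveChanges);
        }
    }

    // Extracts the numeric index from a name such as "src0012".
    private static bool TryParseIndex(string name, out int index)
    {
        index = -1;
        return name.StartsWith(Prefix, StringComparison.Ordinal)
            && int.TryParse(name.Substring(Prefix.Length), NumberStyles.None,
                            CultureInfo.InvariantCulture, out index);
    }
}
```

Pass absolute paths, since Word resolves relative paths against its own working directory. If the bookmarks should be hidden, the name could start with `_` as described above, in which case `merged.Bookmarks.ShowHidden` would probably have to be set to `true` before the loop in `Split`. Also keep in mind that a reviewer who deletes all of a file's content may delete its bookmark too, so it is worth checking that every index was found after splitting.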