Converting a Docx file to text in Swift

Problem description:

I have a .docx file in my temporary storage:

    let location: NSURL = NSURL.fileURLWithPath(NSTemporaryDirectory())
    let file_Name = location.URLByAppendingPathComponent("5 November 2016.docx")

What I now want to do is extract the text inside this document, but I cannot seem to find any converter or method for doing this.

I have already tried this:

    let file_Content = try? NSString(contentsOfFile: String(file_Name), encoding: NSUTF8StringEncoding)
    print(file_Content)

But it prints nil.

So how do I read the text in a docx file?

Your initial issue is with how you get the string from the URL. String(file_Name) is not the correct way to convert a file URL into a file path. The proper way is to use the URL's path property.

    let location = NSURL.fileURLWithPath(NSTemporaryDirectory())
    let fileURL = location.URLByAppendingPathComponent("My File.docx")
    let fileContent = try? NSString(contentsOfFile: fileURL.path, encoding: NSUTF8StringEncoding)

Note the other changes as well: use proper naming conventions, and name variables more clearly.
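To make the difference between a URL's string form and its path concrete, here is a minimal sketch. It is written in current Swift syntax (URL rather than NSURL), which is an assumption on my part since the code above is Swift 2, and it uses a fixed "/tmp" directory as a stand-in for NSTemporaryDirectory(). The string form is roughly what String(file_Name) gives you.

```swift
import Foundation

// Assumption: "/tmp" stands in for NSTemporaryDirectory() so the result is predictable.
let directory = URL(fileURLWithPath: "/tmp", isDirectory: true)
let fileURL = directory.appendingPathComponent("5 November 2016.docx")

// The URL's string form keeps the file:// scheme and percent-encodes the spaces,
// so it cannot be used as a file system path.
let urlString = fileURL.absoluteString

// The path property gives the plain file system path that file APIs expect.
let filePath = fileURL.path

print(urlString)
print(filePath)
```

Passing the first string to a contentsOfFile: initializer fails because no file exists at a path that begins with "file:".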

Now here's the thing. The corrected code still won't work, because a docx file is a zipped-up collection of XML and other files. You can't load a docx file into an NSString. You would need to use NSData to load the zip contents, then unzip it, then go through the files in the archive and find the desired text. It's far from trivial, and a complete solution is well beyond the scope of a single Stack Overflow post.
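To give a rough idea of the last step only, here is a sketch that pulls the text out of the main document part once it has already been unzipped. It makes several assumptions: the archive has been extracted by some other means (Foundation has no public zip API, so that step is not shown), the part you want is word/document.xml, and the document uses the conventional "w" namespace prefix. The names DocxTextCollector and plainText(fromDocumentXML:) are hypothetical, and the code is in current Swift syntax rather than Swift 2. It ignores tables, headers, footnotes and everything else a real converter would have to deal with.

```swift
import Foundation
#if canImport(FoundationXML)
import FoundationXML  // XMLParser lives in this module on Linux
#endif

// Collects the contents of <w:t> text runs, adding a newline at the end of each <w:p> paragraph.
// Assumption: the document uses the conventional "w" prefix for the WordprocessingML namespace.
final class DocxTextCollector: NSObject, XMLParserDelegate {
    private(set) var text = ""
    private var insideTextRun = false

    func parser(_ parser: XMLParser, didStartElement elementName: String,
                namespaceURI: String?, qualifiedName qName: String?,
                attributes attributeDict: [String: String] = [:]) {
        if elementName == "w:t" { insideTextRun = true }
    }

    func parser(_ parser: XMLParser, foundCharacters string: String) {
        // Only keep characters that belong to a text run; skip layout whitespace.
        if insideTextRun { text += string }
    }

    func parser(_ parser: XMLParser, didEndElement elementName: String,
                namespaceURI: String?, qualifiedName qName: String?) {
        if elementName == "w:t" { insideTextRun = false }
        if elementName == "w:p" { text += "\n" }
    }
}

// Hypothetical helper: `documentXML` is the already-unzipped word/document.xml.
// Returns nil if the XML cannot be parsed.
func plainText(fromDocumentXML documentXML: Data) -> String? {
    let parser = XMLParser(data: documentXML)
    let collector = DocxTextCollector()
    parser.delegate = collector
    return parser.parse() ? collector.text : nil
}
```

You would feed this the bytes of word/document.xml after extracting them from the .docx archive with whatever unzip library or tool you choose.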