Converting Docx Files To Text In Swift
Your initial issue is with how you get the string from the URL. String(File_Name) is not the correct way to convert a file URL into a file path. The proper way is to use the URL's path property.
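import Foundation
// Build the file URL, then read through its path property (same calls, current Swift spelling)
let location = URL(fileURLWithPath: NSTemporaryDirectory())
let fileURL = location.appendingPathComponent("My File.docx")
let fileContent = try? NSString(contentsOfFile: fileURL.path, encoding: String.Encoding.utf8.rawValue)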
Note the many changes. Use proper naming conventions. Name variables more clearly.
Now here's the thing. This still won't work because a docx file is a zipped up collection of XML and other files. You can't load a docx file into an NSString. You would need to use NSData to load the zip contents. Then you would need to unzip it. Then you would need to go through all of the files and find the desired text. It's far from trivial and it is far beyond the scope of a single Stack Overflow post.
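To make the "zipped up collection" point concrete, here is a minimal sketch that checks whether some raw bytes start with the signature of a non-empty zip archive (the bytes "PK" followed by 0x03 0x04). The function name looksLikeZipArchive is hypothetical and not part of any library; it only shows why the file has to be loaded as raw data rather than decoded as text.

```swift
import Foundation

/// Returns true when the data begins with the ZIP local-file-header
/// signature: 0x50 0x4B ("PK") followed by 0x03 0x04.
/// Hypothetical helper, not part of Foundation.
func looksLikeZipArchive(_ data: Data) -> Bool {
    let signature: [UInt8] = [0x50, 0x4B, 0x03, 0x04]
    return data.count >= signature.count
        && Array(data.prefix(signature.count)) == signature
}

// Hypothetical usage: load raw bytes instead of decoding the file as text.
// let data = try Data(contentsOf: fileURL)
// if looksLikeZipArchive(data) { /* unzip, then look for the document XML */ }
```

A plain text file that has merely been given a .docx extension fails this check, which is one quick way to tell the two apart.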
Convert text to .docx document file and open share dialog, Swift 2
I figured this one out for you.
Take the answer that user Casey made, and just change this line:
let fileName = "\(title).txt"
to this
let fileName = "\(title).docx"
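The file is then exported with a .docx extension. Note that this only renames the file: its contents are still plain text, not a real Word document, so Swift does not actually convert anything. As the edit at the end of this page explains, Google Docs will open such a file, but Pages and Word will not.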
How can I parse text from rich text format files like (.doc, .pages, .docx, etc.)
This can't be done using Data(contentsOf:) or String(contentsOf:) alone, because .docx is a zipped format consisting of XML and other files. In order to parse the text from the .docx file, you should unzip it first. In my case, I used ZIPFoundation to unzip the document. Parse the file named word/document.xml under the extraction path using any XML parser and you will be able to get the text from the document.
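As a rough sketch of the step after unzipping, the helper below reads word/document.xml from a directory that already holds the extracted archive. The name loadDocumentXML and its parameter are hypothetical. The unzip step itself is left out; with ZIPFoundation it is, as far as I know, a call along the lines of FileManager.default.unzipItem(at:to:), but check that library's documentation for the exact signature.

```swift
import Foundation

/// Hypothetical helper: reads word/document.xml from a directory into which
/// the .docx archive has ALREADY been extracted by some zip library.
func loadDocumentXML(fromExtractedDocxAt extractionURL: URL) throws -> Data {
    let documentURL = extractionURL
        .appendingPathComponent("word")
        .appendingPathComponent("document.xml")
    // Throws if the file is missing, e.g. when the archive was not a Word document.
    return try Data(contentsOf: documentURL)
}
```

The returned Data can then be handed to any XML parser.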
Sources:
Converting Docx Files To Text In Swift
Reading or Converting word .doc files iOS
Read contents of a DOC (.docx or .doc) file and convert it to String
You can use SNDocx, or do it yourself as follows.
Doing this is not as easy as you might imagine: a docx file is a zipped up collection of XML and other files. You can't load a docx file into a String. You would need to use Data to load the zip contents. Then you would need to unzip it. Then you would need to go through all of the files, find word/document.xml, and read and parse that XML.
I use Zippy for the unzipping. Look at this code:
import Foundation
import Zippy

// Runs inside a function or method (hence the return in the guard).
guard let originalFileURL = Bundle.main.url(forResource: "test", withExtension: "docx") else {
    print("file not found :( ")
    return
}
do {
    // Use try (not try!) so that a failure actually reaches the catch block
    let zipFile = try ZipFile(url: originalFileURL)
    // Iterating the archive yields its entry names. For this sample document:
    // - 0 : "[Content_Types].xml"
    // - 1 : "word/numbering.xml"
    // - 2 : "_rels/.rels"
    // - 3 : "word/theme/theme1.xml"
    // - 4 : "word/fontTable.xml"
    // - 5 : "word/document.xml"
    // - 6 : "word/settings.xml"
    // - 7 : "word/styles.xml"
    // - 8 : "word/_rels/document.xml.rels"
    for file in zipFile {
        // Match the exact entry; contains("document.xml") would also match document.xml.rels
        if file == "word/document.xml", let data = zipFile[file] {
            print(String(data: data, encoding: .utf8) ?? "document.xml is not valid UTF-8")
        }
    }
} catch {
    print(error)
}
Output
<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\r<w:document xmlns:mc=\"http://schemas.openxmlformats.org/markup-compatibility/2006\" xmlns:o=\"urn:schemas-microsoft-com:office:office\" xmlns:r=\"http://schemas.openxmlformats.org/officeDocument/2006/relationships\" xmlns:m=\"http://schemas.openxmlformats.org/officeDocument/2006/math\" xmlns:v=\"urn:schemas-microsoft-com:vml\" xmlns:wp=\"http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing\" xmlns:w10=\"urn:schemas-microsoft-com:office:word\" xmlns:w=\"http://schemas.openxmlformats.org/wordprocessingml/2006/main\" xmlns:wne=\"http://schemas.microsoft.com/office/word/2006/wordml\" xmlns:sl=\"http://schemas.openxmlformats.org/schemaLibrary/2006/main\" xmlns:a=\"http://schemas.openxmlformats.org/drawingml/2006/main\" xmlns:pic=\"http://schemas.openxmlformats.org/drawingml/2006/picture\" xmlns:c=\"http://schemas.openxmlformats.org/drawingml/2006/chart\" xmlns:lc=\"http://schemas.openxmlformats.org/drawingml/2006/lockedCanvas\" xmlns:dgm=\"http://schemas.openxmlformats.org/drawingml/2006/diagram\" xmlns:wps=\"http://schemas.microsoft.com/office/word/2010/wordprocessingShape\" xmlns:wpg=\"http://schemas.microsoft.com/office/word/2010/wordprocessingGroup\" xmlns:w14=\"http://schemas.microsoft.com/office/word/2010/wordml\" xmlns:w15=\"http://schemas.microsoft.com/office/word/2012/wordml\"><w:body><w:p w:rsidR=\"00000000\" w:rsidDel=\"00000000\" w:rsidP=\"00000000\" w:rsidRDefault=\"00000000\" w:rsidRPr=\"00000000\" w14:paraId=\"00000000\"><w:pPr><w:contextualSpacing w:val=\"0\"/><w:rPr/></w:pPr><w:r w:rsidDel=\"00000000\" w:rsidR=\"00000000\" w:rsidRPr=\"00000000\"><w:rPr><w:rtl w:val=\"0\"/></w:rPr><w:t xml:space=\"preserve\">test</w:t></w:r></w:p><w:sectPr><w:pgSz w:h=\"15840\" w:w=\"12240\"/><w:pgMar w:bottom=\"1440\" w:top=\"1440\" w:left=\"1440\" w:right=\"1440\" w:header=\"0\"/><w:pgNumType w:start=\"1\"/></w:sectPr></w:body></w:document>
You then have to parse this XML. In my output, the text you are after is the content of this element:
<w:t xml:space=\"preserve\">test</w:t>
docx XML Format reference
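For the parsing step, here is a minimal sketch using Foundation's XMLParser. It collects the characters inside every w:t element and adds a line break at the end of each w:p paragraph. The class name DocxTextExtractor and the function plainText(fromDocumentXML:) are hypothetical, and the code assumes the document uses the conventional w: prefix, as the output above does. It ignores tables, tabs, footnotes and everything else a real document can contain.

```swift
import Foundation
#if canImport(FoundationXML)
import FoundationXML  // XMLParser lives in this module on Linux
#endif

/// Collects the text of every <w:t> run and ends each <w:p> paragraph with a newline.
/// Assumes the conventional "w:" prefix; namespace processing is left off.
final class DocxTextExtractor: NSObject, XMLParserDelegate {
    private(set) var text = ""
    private var insideTextRun = false

    func parser(_ parser: XMLParser, didStartElement elementName: String,
                namespaceURI: String?, qualifiedName qName: String?,
                attributes attributeDict: [String: String] = [:]) {
        if elementName == "w:t" { insideTextRun = true }
    }

    func parser(_ parser: XMLParser, foundCharacters string: String) {
        // Only keep characters that sit inside a text run.
        if insideTextRun { text += string }
    }

    func parser(_ parser: XMLParser, didEndElement elementName: String,
                namespaceURI: String?, qualifiedName qName: String?) {
        if elementName == "w:t" { insideTextRun = false }
        if elementName == "w:p" { text += "\n" }
    }
}

/// Returns the plain text of a document.xml payload, or nil if the XML does not parse.
func plainText(fromDocumentXML data: Data) -> String? {
    let extractor = DocxTextExtractor()
    let parser = XMLParser(data: data)
    parser.delegate = extractor
    return parser.parse() ? extractor.text : nil
}
```

Passing it the data for word/document.xml from the code above should yield the paragraph text, one paragraph per line.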
Create a word document, Swift
Unfortunately, it is nearly impossible to create a .docx file by hand in Swift, given how complicated they are (you can see for yourself by changing the file extension on any old .docx file to .zip, which will reveal their inner structure). The next best thing is to simply create a .txt file, which can also be opened in Pages (though sadly not Docs). If you're looking for a more polished format, complete with formatting and possibly even images, you could choose to create a .pdf file.
Here are some code samples that might be of assistance:
Creating and sharing a .txt file in Swift 3:
// Assumes `import UIKit` and that this method lives in a view controller which declares
// `var interactionController: UIDocumentInteractionController?` and adopts UIDocumentInteractionControllerDelegate.
func export(_ string: String, title: String) throws {
    // create a file path in a temporary directory
    let fileName = "\(title).txt"
    let filePath = (NSTemporaryDirectory() as NSString).appendingPathComponent(fileName)
    // save the string to the file
    try string.write(toFile: filePath, atomically: true, encoding: .utf8)
    // open share dialog
    // Initialize and configure the Document Interaction Controller
    let controller = UIDocumentInteractionController(url: URL(fileURLWithPath: filePath))
    controller.delegate = self
    // Keep a strong reference so the controller outlives this call
    self.interactionController = controller
    // Present Open In Menu
    controller.presentOptionsMenu(from: yourexportbarbuttonoutlet, animated: true) // create an outlet from an Export bar button outlet, then use it as the `from` argument
}
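This can be called with
try export("Hello World", title: "HelloWorld")
to instantly create a txt file and open the share dialog for it. Because export is marked throws, the call needs try, either inside a do/catch block or in another throwing function.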
Creating and sharing a simple .pdf file in Swift 3:
// Assumes `import UIKit` and the same view controller context as export(_:title:).
func openPDF(_ string: String, title: String) throws {
    // 1. Create a print formatter
    let html = "<h2>\(title)</h2><br><h4>\(string)</h4>" // create some text as the body of the PDF with html.
    let fmt = UIMarkupTextPrintFormatter(markupText: html)
    // 2. Assign print formatter to UIPrintPageRenderer
    let render = UIPrintPageRenderer()
    render.addPrintFormatter(fmt, startingAtPageAt: 0)
    // 3. Assign paperRect and printableRect
    let page = CGRect(x: 10, y: 10, width: 595.2, height: 841.8) // A4, 72 dpi, x and y are horizontal and vertical margins
    let printable = page.insetBy(dx: 0, dy: 0)
    render.setValue(NSValue(cgRect: page), forKey: "paperRect")
    render.setValue(NSValue(cgRect: printable), forKey: "printableRect")
    // 4. Create PDF context and draw
    let pdfData = NSMutableData()
    UIGraphicsBeginPDFContextToData(pdfData, CGRect.zero, nil)
    // A half-open range does not trap when there are zero pages
    for pageIndex in 0..<render.numberOfPages {
        UIGraphicsBeginPDFPage()
        let bounds = UIGraphicsGetPDFContextBounds()
        render.drawPage(at: pageIndex, in: bounds)
    }
    UIGraphicsEndPDFContext()
    // 5. Save PDF file (throws if the write fails)
    let path = (NSTemporaryDirectory() as NSString).appendingPathComponent("\(title).pdf")
    try pdfData.write(toFile: path, options: .atomic)
    print("open \(path)") // check if we got the path right.
    // open share dialog
    print("opening share dialog")
    // Initialize and configure the Document Interaction Controller
    let controller = UIDocumentInteractionController(url: URL(fileURLWithPath: path))
    controller.delegate = self
    self.interactionController = controller // keep a strong reference
    // Present Open In Menu
    controller.presentOptionsMenu(from: yourexportbarbuttonoutlet, animated: true) // create an outlet from an Export bar button outlet, then use it as the `from` argument
}
This can be called with
try openPDF("Hello World", title: "HelloWorld")
to instantly create a pdf file and open the share dialog for it. As with export, the function is marked throws, so the call needs try.
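One thing to watch in openPDF: the title and body are interpolated straight into HTML, so text containing characters such as < or & may be rendered incorrectly. A minimal sketch of an escaping helper follows; escapeHTML is a hypothetical name, not a UIKit or Foundation API, and it covers only the four most common special characters.

```swift
import Foundation

/// Hypothetical helper: replaces the characters that have special meaning
/// in HTML text (&, <, >, ") with their entity equivalents.
func escapeHTML(_ text: String) -> String {
    var escaped = ""
    for character in text {
        switch character {
        case "&": escaped += "&amp;"
        case "<": escaped += "&lt;"
        case ">": escaped += "&gt;"
        case "\"": escaped += "&quot;"
        default: escaped.append(character)
        }
    }
    return escaped
}

// Hypothetical use inside openPDF:
// let html = "<h2>\(escapeHTML(title))</h2><br><h4>\(escapeHTML(string))</h4>"
```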
Edit: Found an interesting (though not polished) workaround to getting text to open up in Google Docs: use the function from the "creating a .txt file" section here, and just change the filename to "\(title).docx". This will fool Docs into thinking it's a .docx document, which will allow the text to open in Docs successfully. Unfortunately, this creates an invalid document that can't be opened by Pages, Word, or really any other app because it doesn't actually create a real document file. And the Interaction Controller will make it look to the user like they can also open it in Pages, though that invariably fails.