Convert NSAttributedString into Data for storage

You need to specify what kind of document data you would like to convert your attributed string to:



.txt   // Plain Text Document Type (Simple Text)
.html  // HTML Text Document Type (Hypertext Markup Language)
.rtf   // RTF Text Document Type (Rich text format document)
.rtfd  // RTFD Text Document Type (Rich text format document with attachment)

Update: Xcode 10.2 • Swift 5 or later

let textView = UITextView()
textView.attributedText = .init(string: "abc",
                                attributes: [.font: UIFont(name: "Helvetica", size: 16)!])
if let attributedText = textView.attributedText {
    do {
        let htmlData = try attributedText.data(from: .init(location: 0, length: attributedText.length),
                                               documentAttributes: [.documentType: NSAttributedString.DocumentType.html])
        let htmlString = String(data: htmlData, encoding: .utf8) ?? ""
        print(htmlString)
    } catch {
        print(error)
    }
}

Expanding on that:

extension NSAttributedString {

    convenience init(data: Data, documentType: DocumentType, encoding: String.Encoding = .utf8) throws {
        try self.init(attributedString: .init(data: data, options: [.documentType: documentType, .characterEncoding: encoding.rawValue], documentAttributes: nil))
    }

    func data(_ documentType: DocumentType) -> Data {
        // Discussion
        // Raises a rangeException if any part of the range lies beyond the end of the receiver's characters.
        // Passing a range that is always valid is what allows us to use try! on the result.
        try! data(from: .init(location: 0, length: length),
                  documentAttributes: [.documentType: documentType])
    }

    var text: Data { data(.plain) }
    var html: Data { data(.html) }
    var rtf: Data { data(.rtf) }
    var rtfd: Data { data(.rtfd) }
}

Usage:

let textView = UITextView()
textView.attributedText = .init(string: "abc", attributes: [.font: UIFont(name: "Helvetica", size: 16)!])
if let textData = textView.attributedText?.text {
    let text = String(data: textData, encoding: .utf8) ?? ""
    print(text) // abc
}
if let htmlData = textView.attributedText?.html {
    let html = String(data: htmlData, encoding: .utf8) ?? ""
    print(html) // /* <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" ...
}

This will print

abc
/* <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta http-equiv="Content-Style-Type" content="text/css">
<title></title>
<meta name="Generator" content="Cocoa HTML Writer">
<style type="text/css">
p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 16.0px Helvetica}
span.s1 {font-family: 'Helvetica'; font-weight: normal; font-style: normal; font-size: 16.00pt}
</style>
</head>
<body>
<p class="p1"><span class="s1">abc</span></p>
</body>
</html>
*/
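Going the other way (Data back into an attributed string) uses the same document type on the read side. Here is a minimal sketch of an RTF round trip; the function name roundTripRTF is made up for illustration, and the conditional imports are only there so the snippet builds on both iOS and macOS:

```swift
import Foundation
#if canImport(UIKit)
import UIKit
#elseif canImport(AppKit)
import AppKit
#endif

// Hypothetical helper: serializes to RTF and immediately parses the result back.
func roundTripRTF(_ original: NSAttributedString) throws -> NSAttributedString {
    let fullRange = NSRange(location: 0, length: original.length)

    // Write: the document type selects the output format.
    let rtfData = try original.data(
        from: fullRange,
        documentAttributes: [.documentType: NSAttributedString.DocumentType.rtf])

    // Read: the same document type tells the parser how to interpret the bytes.
    return try NSAttributedString(
        data: rtfData,
        options: [.documentType: NSAttributedString.DocumentType.rtf],
        documentAttributes: nil)
}
```

Both calls throw, so in an app you would normally wrap the call in do/catch rather than use try!.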

Simple way to store NSMutableAttributedString in CoreData

I started using Core Text when iOS 5 was out, and thus used the Core Foundation values as attributes. However, I now realize that since iOS 6 came out, I can use NSForegroundColorAttributeName, NSParagraphStyleAttributeName, NSFontAttributeName, etc. in the attributes dictionary, and those keys are accompanied by objects like UIColor, NSMutableParagraphStyle, and UIFont, which can be archived.

How to save NSAttributedString to CoreData

The attribute should look like this in the data model.

[Screenshot: the attribute as configured in the data model editor]

The header file should be modified to match this:

@property (nonatomic, retain) NSAttributedString * attributedText;

That's it. You should be able to persist your attributed string just like any other attributes.

Suppose your entity is Event and you have an object event of type Event. You can then access the attribute as event.attributedText. Here is some sample Swift code:

event.attributedText = NSAttributedString(string: "Hello World")
let attributedString = event.attributedText

Let us know should you prefer the answer in your native language.

Storing NSAttributedString Core Data

I was checking the Apple Developer Forums and found a thread almost exactly the same as this question. One person had done this but unfortunately did not share the code. All they said was the following:

"In Core Data I have a transformable in the database and I use my own NSValueTransformer. This is a subclass of NSValueTransformer and creates an attributed string from the data object and back.

Therefore I created a class called PersistableAttributedString which is NSCoding compliant. This class has a string and an array of attributes and builds the attributed string. I also created a class for the possible text attributes which is NSCoding compliant. Together the string and the attributes are all NSCoding compliant.

The class NSAttributedString is also NSCoding compliant, but the attributes are not, that's the problem."

Hope that might help.
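Since the forum poster did not share code, here is a rough sketch of what such a transformer could look like. It is not their implementation: the class name AttributedStringTransformer is invented, and instead of a custom PersistableAttributedString class it simply archives the NSAttributedString itself with NSKeyedArchiver, assuming the attributes you use support secure coding.

```swift
import Foundation

// Hypothetical transformer: NSAttributedString <-> Data via keyed archiving.
final class AttributedStringTransformer: ValueTransformer {
    override class func transformedValueClass() -> AnyClass { NSData.self }
    override class func allowsReverseTransformation() -> Bool { true }

    // Attributed string -> Data (what gets written to the store).
    override func transformedValue(_ value: Any?) -> Any? {
        guard let string = value as? NSAttributedString else { return nil }
        return try? NSKeyedArchiver.archivedData(withRootObject: string,
                                                 requiringSecureCoding: true)
    }

    // Data -> attributed string (what you get back when reading).
    override func reverseTransformedValue(_ value: Any?) -> Any? {
        guard let data = value as? Data else { return nil }
        return try? NSKeyedUnarchiver.unarchivedObject(ofClass: NSAttributedString.self,
                                                       from: data)
    }
}

// Register once at launch; the name is an assumption and must match
// whatever transformer name is entered for the attribute in the data model.
ValueTransformer.setValueTransformer(
    AttributedStringTransformer(),
    forName: NSValueTransformerName(rawValue: "AttributedStringTransformer"))
```

If decoding returns nil for strings with unusual attributes or attachments, the allowed classes for unarchiving are the first thing to check.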

How To Optimize Storage Of NSAttributedString In Swift Using Data And Codable?

TL;DR: RTFD encodes images as PNGs, but you can make it encode JPGs instead to save space. A custom format might be better and easier though if you have the time to create one.

NSAttributedString can encode to HTML, rtf, rtfd, plain text, a variety of Office/Word formats, etc. Given that each of these is an official format with an official spec that must be followed, there's not much that can be done in terms of saving space other than:

  1. Choosing the supported format that works best for your use cases and has the smallest footprint.

OR

  2. Writing your own format.

Approach 1: RTFD

Of the supported formats, RTFD does indeed sound best for your use case because it includes support for attachments such as images. Feel free to try out the other included formats, which are described below in "Other Formats".

"Saving it as Data, for example using the rtfd(from:documentAttributes:) method, and as part of a Codable structure, results in a very large string, much larger than the content itself especially when inserting an image into the NSTextView. For example, inserting a 200K image will result in a 5MB JSON file."

To understand what is happening here, try out the following code:

do {
    let rtfd = someAttributedString.rtfdFileWrapper(from: NSRange(location: 0, length: someAttributedString.length), documentAttributes: [:])
    // Writing the wrapper to disk is the call that can throw
    try rtfd?.write(to: URL(fileURLWithPath: "/Users/yourname/someFolder/RTFD.rtfd"), options: .atomic, originalContentsURL: nil)
} catch {
    print("\(error)")
}

When you call rtfd(from:documentAttributes:), you're getting flat Data. This flat data can then be encoded somewhere and read back into NSAttributedString. But make no mistake: RTFD is a package format ("D" stands for directory). So by instead calling rtfdFileWrapper(from:documentAttributes:), and writing that to a URL with the rtfd extension, we can see the actual package format that rtfd(from:documentAttributes:) replicates, but as a directory instead of raw data. In Finder, right click the generated file and choose "Show Package Contents".

The RTFD package contains an RTF file to specify the text and attributes, and a copy of each attachment. So why was your example so much bigger? In my tests, the answer seems to be that RTFD expects to find its images in PNG format. When calling rtfdFileWrapper(from:documentAttributes:) or rtfd(from:documentAttributes:), any image attachments seem to get written out as PNG files, which take up significantly more space. This happens because your image gets wrapped in an NSImage before getting wrapped in an NSTextAttachment. The NSImage is able to write the image data out in other formats, including larger formats like PNG.

I'm assuming the image you tried was in a compressed format like JPEG, and NSAttributedString wrote it to RTFD as PNG.

Using JPEG instead

Assuming you're okay with the image being compressed and not having info such as an alpha channel, you should be able to create an RTFD file with jpg images.

For example, I was able to get an RTFD file down to 2.8 MB from over 12 MB (large image) just by replacing the generated PNG image with the original JPG one. This initially was unacceptable to TextEdit, but I then changed the file extension of the image to .png (even though it is still a JPG) and it accepted it.

In code it was even simpler. You may be able to get away with just changing how you add image attachments.

// Don't do this unless you want PNG
let image = NSImage(contentsOf: ...) // NSImage will write to a larger PNG file
let attachment = NSTextAttachment()
attachment.image = image

// Do this if you want smaller files
let image = try? Data(contentsOf: ...) // This will remain in raw JPG format
let attachment = NSTextAttachment(data: image, ofType: kUTTypeJPEG as String) // Explicitly specify JPG

Then when you create a new NSAttributedString with that NSTextAttachment and append it to NSTextStorage, the RTFD data you write will be significantly smaller.

Of course, you may not have control of this process if you're relying on Cocoa UI/API for attaching images. That could make the process more difficult and you may need to resort to modifying the generated data by swapping images.

Approach 2: Custom Format

The approach described immediately above might be inconvenient due to not having control of the attachment-adding process and needing flat data. In that case a custom format might be better.

There's nothing stopping you from designing your own format (binary, text, package, whatever) and then writing a coder for it. You could specify a specific image format or support a variety. It's up to you. And unless you're a fancy word processor, you probably don't need to store all the attributes like font all the time.

"I am also wondering whether there is a valid binary encoding option for Codable."

First, note that NSAttributedString is an Objective-C class (when used on Apple platforms) and conforms to NSSecureCoding instead of Codable.

Note that you cannot extend NSAttributedString to conform to Codable, because the init(from:) requirement on Decodable can only be satisfied by guaranteeing that the initializer will be included on all subclasses as well. Since this class is non-final, that means it can only be satisfied by a required init. Required initializers can only be specified on the original declaration, not in extensions.

For this reason, if you wanted to conform it to Codable, you would need to use a wrapper object. enumerateAttributes(in:options:using:) should be helpful for getting the attributes and raw characters that need to be encoded, but you'll need to be sure to pay attention to the images too.
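To give a feel for the enumerateAttributes(in:options:using:) part, here is a deliberately incomplete sketch. The AttributeRun type and the "demo.highlight" key are invented for illustration, and it only records which attribute keys apply to each run; a real format would also have to encode the attribute values (fonts, colors, attachments) in some Codable form.

```swift
import Foundation

// Hypothetical Codable value: one entry per run of identical attributes.
struct AttributeRun: Codable, Equatable {
    let text: String
    let attributeNames: [String] // keys only; values are left out of this sketch
}

func attributeRuns(of attributedString: NSAttributedString) -> [AttributeRun] {
    var runs: [AttributeRun] = []
    let fullRange = NSRange(location: 0, length: attributedString.length)
    let characters = attributedString.string as NSString

    // The block is called once per maximal range sharing the same attributes.
    attributedString.enumerateAttributes(in: fullRange, options: []) { attributes, range, _ in
        runs.append(AttributeRun(
            text: characters.substring(with: range),
            attributeNames: attributes.keys.map { $0.rawValue }.sorted()))
    }
    return runs
}

// Usage: mark the first five characters with a made-up attribute.
let sample = NSMutableAttributedString(string: "Hello World")
sample.addAttribute(NSAttributedString.Key("demo.highlight"),
                    value: true,
                    range: NSRange(location: 0, length: 5))
let runs = attributeRuns(of: sample)
```

Because [AttributeRun] is plain Codable data, it can be handed to JSONEncoder or any other encoder without involving NSCoding at all.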

As for encoding in binary, Codable is completely agnostic to format, so you could write your own types conforming to Encoder and Decoder that do whatever you want, including storing everything as raw bytes.

Aside: Other Formats

Here's a quick rundown of other supported formats (in order of size). In these tests, I used the very small string "Hello World! There's so much to see!" in the system font. After each format description (in parentheses) is the number of bytes to store that string.

  • Plain Text can store the above string in 36 bytes (1 for each character), but won't preserve attributes or attachments. (36 bytes)
  • RTF seems most lightweight if you need to preserve attributes but not attachments. (331 bytes)
  • HTML is next lightest, but isn't really designed to be a storage format. In my experience, some attributes such as line spacing get lost when converted to HTML by NSAttributedString. (536 bytes)
  • Binary Plist, which is made when you use NSKeyedArchiver, is a fine option if you only need compatibility with Apple platforms and don't like the above formats. This format supports images too, but is generally still larger than the above (and RTFD). (648 bytes)
  • Web Archive is next for size, but I don't recommend using it as WebKit has deprecated it. Safari still uses it though for some things. (784 bytes)
  • Word ML is probably only useful for those that already know they need it. This format and all below it will generally have a bunch of boilerplate that will become a smaller percentage of the file as text is added. (~1.2 KB)
  • Open Document (OASIS) is smaller than most of the Word formats, but you probably wouldn't use it without a good reason. (~2.4 KB)
  • Office Open XML is another format you'd only use if you needed that format exactly. (~3.5 KB)
  • Doc (Microsoft Word) is very large in comparison for small amounts of text. While I would expect this format to allow images, in my testing the file size did not actually go up when I added one. (~19.4 KB)
  • Mac Simple Text seems to always generate an error. (N/A)

Final Note

In the end, the encoding experience for NSAttributedString should get better as Foundation continues to adapt to Swift rather than Objective-C. You can imagine a day where NSAttributedString or some similar Swifty type conforms to Codable out of the box and can then be paired with any file format Coder.

Can you save attributed strings to Cloud Firestore?

As Doug mentioned in his answer, trying to use NSAttributedString cross platform is challenging as there is no direct Android equivalent so it's probably best to keep the data as primitives.

...But the short answer is: Yes, you can store an NSAttributedString in Firestore because Firestore supports NSData objects.

If you really want to go cross platform with your string style, one thought is to understand that an NSAttributedString is a string with a dictionary of key: value pairs that define the string's look. So you could store primitives in Firestore and then use the appropriate platform's functions to re-assemble the string.

So the string could be stored in Firestore like this

string_0 (a document)
    text: "Hello, World"
    attrs:
        font: "Helvetica"
        size: "12"
        color: "blue"

You could then read that in and create an attributed string based on those attributes.
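On macOS, the re-assembly step might look roughly like this. It is only a sketch: the namedColors table and the makeAttributedString function are made up for illustration, the values are assumed to arrive as strings exactly as in the layout above, and the fallbacks (system font, size 12, default text color) are arbitrary choices.

```swift
import AppKit

// Hypothetical lookup from stored color names to platform colors.
let namedColors: [String: NSColor] = ["blue": .blue, "red": .red, "black": .black]

// Builds an attributed string from the primitive fields of the document.
func makeAttributedString(text: String, attrs: [String: String]) -> NSAttributedString {
    // "size" is stored as a string, so parse it and fall back to 12 points.
    let size = CGFloat(Double(attrs["size"] ?? "") ?? 12)

    // Unknown font names fall back to the system font.
    let font = attrs["font"].flatMap { NSFont(name: $0, size: size) }
        ?? NSFont.systemFont(ofSize: size)

    // Unknown color names fall back to the default text color.
    let color = attrs["color"].flatMap { namedColors[$0] } ?? NSColor.textColor

    return NSAttributedString(string: text,
                              attributes: [.font: font, .foregroundColor: color])
}

let rebuilt = makeAttributedString(
    text: "Hello, World",
    attrs: ["font": "Helvetica", "size": "12", "color": "blue"])
```

An Android client would read the same fields and build its own styled text from them, which is the point of keeping the stored data as primitives.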

That being said, I can get you halfway there on the iOS/macOS side if you really want to store an NSAttributedString in Firestore.

Here's a function that creates an NSAttributedString, archives it and stores the data in Firestore.

func storeAttributedString() {
    let quote = "Hello, World"

    let font = NSFont.boldSystemFont(ofSize: 20)
    let color = NSColor.blue

    let initialAttributes: [NSAttributedString.Key: Any] = [
        .font: font,
        .foregroundColor: color,
    ]

    let attrString = NSAttributedString(string: quote, attributes: initialAttributes)
    let archivedData: Data = try! NSKeyedArchiver.archivedData(withRootObject: attrString, requiringSecureCoding: false)

    let dict: [String: Any] = [
        "attrString": archivedData
    ]

    let attrStringCollection = self.db.collection("attr_strings")
    let doc = attrStringCollection.document("string_0")
    doc.setData(dict)
}

Then, to read it back, here's the function that reads it and displays the attributed string in a macOS NSTextField.

func readAttributedString() {
    self.myField.allowsEditingTextAttributes = true // allows rich text
    let attrStringCollection = self.db.collection("attr_strings")
    let doc = attrStringCollection.document("string_0")
    doc.getDocument(completion: { snapshot, error in
        if let err = error {
            print(err.localizedDescription)
            return
        }

        // Return instead of crashing if the field is missing or does not unarchive to an attributed string
        guard let snap = snapshot,
              let archivedData = snap.get("attrString") as? Data,
              let attrString = try? NSKeyedUnarchiver.unarchiveTopLevelObjectWithData(archivedData) as? NSAttributedString
        else { return }
        self.myField.attributedStringValue = attrString
    })
}

Conforming NSAttributedString to Codable throws error

You can try unarchiveTopLevelObjectWithData to unarchive your AttributedString object data:

NSKeyedUnarchiver.unarchiveTopLevelObjectWithData(data)

Your AttributedString implemented as a struct should look something like this:

struct AttributedString {
    let attributedString: NSAttributedString
    init(attributedString: NSAttributedString) { self.attributedString = attributedString }
    init(string str: String, attributes attrs: [NSAttributedString.Key: Any]? = nil) { attributedString = .init(string: str, attributes: attrs) }
}

Archiving / Encoding

extension NSAttributedString {
    func data() throws -> Data { try NSKeyedArchiver.archivedData(withRootObject: self, requiringSecureCoding: false) }
}

extension AttributedString: Encodable {
    func encode(to encoder: Encoder) throws {
        var container = encoder.singleValueContainer()
        try container.encode(attributedString.data())
    }
}

Unarchiving / Decoding

extension Data {
    func topLevelObject() throws -> Any? { try NSKeyedUnarchiver.unarchiveTopLevelObjectWithData(self) }
    func unarchive<T>() throws -> T? { try topLevelObject() as? T }
    func attributedString() throws -> NSAttributedString? { try unarchive() }
}

extension AttributedString: Decodable {
    public init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        guard let attributedString = try container.decode(Data.self).attributedString() else {
            throw DecodingError.dataCorruptedError(in: container, debugDescription: "Corrupted Data")
        }
        self.attributedString = attributedString
    }
}
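For a quick end-to-end check, here is a compact, self-contained variant of the same idea run through JSONEncoder and JSONDecoder. It differs from the code above in a few ways that are my own choices rather than part of the answer: the wrapper is called CodableAttributedString (so it cannot be confused with Foundation's own AttributedString type on newer SDKs), it uses unarchivedObject(ofClass:from:) with secure coding, and the Note type is a made-up container so the attributed string is not the top-level JSON value.

```swift
import Foundation

// Hypothetical wrapper: stores the attributed string as keyed-archive Data.
struct CodableAttributedString: Codable {
    let value: NSAttributedString

    init(_ value: NSAttributedString) { self.value = value }

    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        let data = try container.decode(Data.self)
        guard let string = try NSKeyedUnarchiver.unarchivedObject(
            ofClass: NSAttributedString.self, from: data) else {
            throw DecodingError.dataCorruptedError(
                in: container, debugDescription: "Not an archived NSAttributedString")
        }
        value = string
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.singleValueContainer()
        try container.encode(
            NSKeyedArchiver.archivedData(withRootObject: value, requiringSecureCoding: true))
    }
}

// Made-up model type that embeds the wrapper.
struct Note: Codable {
    let title: String
    let body: CodableAttributedString
}

// Encodes to JSON and decodes again.
func roundTrip(_ note: Note) throws -> Note {
    let json = try JSONEncoder().encode(note)
    return try JSONDecoder().decode(Note.self, from: json)
}
```

In the JSON output the archive appears as a Base64 string, so expect the encoded size to be noticeably larger than the raw archive.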

