PDFKitten is highlighting at the wrong position
This might be a bug in PDFKitten when calculating the width of characters whose character identifier (CID) does not coincide with their Unicode code point.
appendPDFString in StringDetector works with two strings when processing some string data:
// Use CID string for font-related computations.
NSString *cidString = [font stringWithPDFString:string];
// Use Unicode string to compare with user input.
NSString *unicodeString = [[font stringWithPDFString:string] lowercaseString];
stringWithPDFString in Font transforms the sequence of character identifiers of its argument into a Unicode string.
Thus, in spite of its name, cidString is not a sequence of character identifiers but a string of Unicode characters. Nonetheless, its entries are passed to didScanCharacter, which Scanner implements to advance the position by the character width: it passes the value to widthOfCharacter in Font to determine that width, and that method (according to its comment, "Width of the given character (CID) scaled to fontsize") expects its argument to be a character identifier.
So, if the CID and the Unicode code point don't coincide, the wrong character width is determined and the position of any following character cannot be trusted. In the case at hand, the /fi ligature has a CID of 12, which is very different from its Unicode code point 0xfb01.
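To illustrate the effect, here is a minimal sketch in plain C (which also compiles as Objective-C). It is not PDFKitten code: the width table, its values, the fallback width, and the names widths_by_cid, width_for_code and drift_for are assumptions made up for this example. Only the CID 12 and the code point 0xfb01 come from the case above.

```c
#include <stddef.h>

/* Hypothetical glyph widths in 1/1000 text space units, indexed by CID.
   All values are invented for illustration. */
#define WIDTH_COUNT 16
#define DEFAULT_WIDTH 1000

static const int widths_by_cid[WIDTH_COUNT] = {
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    556,            /* CID 12: the fi ligature in this example */
    0, 0, 0
};

/* Look up a width by code. In this sketch, codes outside the table fall
   back to a default width (an assumption, not PDFKitten's behaviour). */
static int width_for_code(unsigned int code)
{
    if (code < WIDTH_COUNT) {
        return widths_by_cid[code];
    }
    return DEFAULT_WIDTH;
}

/* Horizontal error introduced each time the Unicode value is used
   where the CID was expected. */
static int drift_for(unsigned int cid, unsigned int unicode)
{
    return width_for_code(unicode) - width_for_code(cid);
}
```

With these made-up numbers, drift_for(12, 0xfb01) is non-zero, so every character after the ligature would be placed at a shifted position.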
I would propose enhancing PDFKitten to also define a didScanCID method in StringDetector, which appendPDFString should call alongside didScanCharacter for each processed character, forwarding its CID. Scanner should then use this new method instead to calculate the width by which to advance its cursor.
This should be triple-checked first, though. Maybe some widthOfCharacter implementations (there are different ones for different font types) expect the argument to be a Unicode code point after all, in spite of the comment...
(Sorry if I used the wrong vocabulary here or there, I'm a Java guy... :))
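The idea can be sketched in plain C as follows. This is only an analogue of the proposal, not a patch for PDFKitten: DecodedChar, ScannerState, did_scan_cid, append_chars and example_width are hypothetical names, and the widths are invented.

```c
#include <stddef.h>

/* Hypothetical decoded character carrying both representations. */
typedef struct {
    unsigned int cid;      /* character identifier, used for font metrics */
    unsigned int unicode;  /* Unicode value, used for text matching */
} DecodedChar;

typedef int (*WidthForCID)(unsigned int cid);

typedef struct {
    int cursor_x;               /* accumulated advance in 1/1000 units */
    WidthForCID width_for_cid;  /* width lookup that expects a CID */
} ScannerState;

/* Invented widths: 556 for CID 12, 500 for everything else. */
static int example_width(unsigned int cid)
{
    return cid == 12 ? 556 : 500;
}

/* Analogue of the proposed didScanCID: advance the cursor by the width
   belonging to the CID, never to the Unicode value. */
static void did_scan_cid(ScannerState *s, unsigned int cid)
{
    s->cursor_x += s->width_for_cid(cid);
}

/* Analogue of appendPDFString: collect the Unicode value for keyword
   comparison and forward the CID for the width computation. */
static void append_chars(ScannerState *s, const DecodedChar *chars,
                         size_t count, unsigned int *text_out)
{
    for (size_t i = 0; i < count; i++) {
        text_out[i] = chars[i].unicode;
        did_scan_cid(s, chars[i].cid);
    }
}
```

Keeping the two values side by side means the keyword comparison still sees Unicode text while the cursor only ever sees CIDs.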
How to select lines of text in a PDF and then highlight them? (iOS)
Your potential solution is the way to go. The bounding rectangle of a Tj string is built up from the bounding rectangles of its glyphs, so you can select any part of the string. The PDFKitten library might help you with text processing: https://github.com/KurtCode/PDFKitten
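As a rough sketch of that idea in plain C: given the advance width of each glyph, the rectangle for any run of glyphs follows from summing advances. The Rect type and rect_for_glyph_range are hypothetical names, and the sketch assumes a single line of unrotated horizontal text with no extra character or word spacing.

```c
#include <stddef.h>

typedef struct {
    double x, y, width, height;
} Rect;

/* Rectangle covering glyphs [start, start + count) of a string.
   advances holds the horizontal advance of each of the n glyphs;
   origin_x and origin_y mark where the string starts. */
static Rect rect_for_glyph_range(const double *advances, size_t n,
                                 size_t start, size_t count,
                                 double origin_x, double origin_y,
                                 double height)
{
    Rect r = { origin_x, origin_y, 0.0, height };
    for (size_t i = 0; i < n && i < start + count; i++) {
        if (i < start) {
            r.x += advances[i];      /* glyphs before the selection shift it right */
        } else {
            r.width += advances[i];  /* selected glyphs widen it */
        }
    }
    return r;
}
```

Selecting the whole string is the special case start = 0 and count = n, where the width is the sum of all advances.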
Possible to show PDF over a PageViewController in an iOS app?
First, create a new instance of the scanner.
CGPDFPageRef page = CGPDFDocumentGetPage(document, 1);
Scanner *scanner = [Scanner scannerWithPage:page];
Set a keyword (case-insensitive) and scan a page.
NSArray *selections = [scanner select:@"happiness"];
Finally, iterate over the selections and draw them.
for (Selection *selection in selections)
{
    // draw selection
}
Then highlight the selections using the Core Graphics framework.