Android SAX parser not getting full text from between tags
As you can see, it's cutting everything off the URL from the ampersand escape code and after.
From the documentation of the characters() method:
The Parser will call this method to report each chunk of character data. SAX parsers may return all contiguous character data in a single chunk, or they may split it into several chunks; however, all of the characters in any single event must come from the same external entity so that the Locator provides useful information.
When I write SAX parsers, I use a StringBuilder to append everything passed to characters():
public void characters(char[] ch, int start, int length) {
    // Append this chunk; characters() may be called several times per element.
    if (buf != null) {
        buf.append(ch, start, length);
    }
}
Then in endElement(), I take the contents of the StringBuilder and do something with it. That way, if the parser calls characters() several times, I don't miss anything.
SaxParser doesn't get full string between the tags
The characters method can be called more than once for the text within a single pair of open and close tags.
Your code assumes it's only called once, which will frequently be true for small data, but not always.
You need to initialize a buffer in the startElement method for that tag, collect into the buffer in the characters method, and convert the buffer to a string in the endElement.
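The three steps above can be sketched as a small self-contained program. This is only an illustration: the class name BufferedTextDemo, the handler LinkHandler, the helper parseLink and the sample XML are all assumptions made for the example, not code from the question.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class BufferedTextDemo {

    // Hypothetical handler: collects the complete text of a <link> element.
    static class LinkHandler extends DefaultHandler {
        private StringBuilder buf; // non-null only while inside <link>
        String link;               // text of the last <link> that was closed

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            if ("link".equals(qName)) {
                buf = new StringBuilder(); // step 1: start a fresh buffer
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (buf != null) {
                buf.append(ch, start, length); // step 2: collect every chunk
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("link".equals(qName)) {
                link = buf.toString(); // step 3: the text is complete only here
                buf = null;
            }
        }
    }

    // Parses the given XML and returns the text of its last <link> element.
    static String parseLink(String xml) throws Exception {
        LinkHandler handler = new LinkHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.link;
    }

    public static void main(String[] args) throws Exception {
        // Sample input with an escaped ampersand, the case that often gets split.
        String xml = "<item><link>http://example.com/?a=1&amp;b=2</link></item>";
        System.out.println(parseLink(xml));
    }
}
```

Because the buffer is only read in endElement, it does not matter whether the parser delivers the text before and after the ampersand in one call or in several.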
Android,SAX parser Problem while reading Html Tags
An HTML file is not necessarily well-formed XML, so an XML parser such as SAX can fail on it.
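A quick way to see this is to feed a typical HTML fragment to a SAX parser and check whether it is accepted. The sketch below assumes a hypothetical helper isWellFormedXml and made-up sample strings; it only shows that an unclosed tag such as <br> makes the parse fail.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class WellFormedCheck {

    // Hypothetical helper: true if the text parses as XML, false if the parser rejects it.
    static boolean isWellFormedXml(String text) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(text)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            // Raised for markup that is not well-formed, e.g. an unclosed element.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Valid HTML, but <br> is never closed, so it is not well-formed XML.
        System.out.println(isWellFormedXml("<p>line one<br>line two</p>"));
        // The same content with a self-closing <br/> is well-formed XML.
        System.out.println(isWellFormedXml("<p>line one<br/>line two</p>"));
    }
}
```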
RSS Reader using Sax Parser losing characters from title
Use a StringBuilder to accumulate the tag's text, rather than creating a new String instance on each characters() call. As the documentation says:
The Parser will call this method to report each chunk of character data. SAX parsers may return all contiguous character data in a single chunk, or they may split it into several chunks; however, all of the characters in any single event must come from the same external entity so that the Locator provides useful information.
@CommonWares says exactly this in his post.
Build the tag's text as it is found using a StringBuilder, since the text can arrive in several chunks rather than as the entire string (this explains the incomplete tags!). You may or may not need the isBuilding flag, but I don't know your entire implementation, so I added it just in case.
StringBuilder mSb;
boolean isBuilding;

@Override
public void startElement(String uri, String localName, String qName,
        Attributes attributes) throws SAXException {
    // Start a fresh buffer for every element.
    mSb = new StringBuilder();
    isBuilding = true;
    if (qName.equals("title")) {
        parsingTitle = true;
    }
    ...
    ...
}

@Override
public void characters(char[] ch, int start, int length) {
    // May be called several times per element, so append each chunk.
    if (mSb != null && isBuilding) {
        mSb.append(ch, start, length);
    }
}

@Override
public void endElement(String uri, String localName, String qName)
        throws SAXException {
    if (parsingTitle) {
        // Read from the same buffer that characters() fills (mSb, not sb).
        currentItem.setTitle(mSb.toString().trim());
        parsingTitle = false;
        isBuilding = false;
    }
}
SAXParser - Handle tags with same text at different level in XML structure
You can use XPath rather than parsing your XML using SAX.
XPath expression for your case is:
/channel/item/title
Example code:
import org.xml.sax.InputSource;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import java.io.StringReader;

public class XPathTest {
    public static void main(String[] args) throws XPathExpressionException {
        String xml = "<channel>\n" +
                "\n" +
                " <title>Site Name</title>\n" +
                "\n" +
                " <item> \n" +
                " <title>News Title!</title> \n" +
                " </item>\n" +
                "\n" +
                "</channel>";
        Object result = XPathFactory.newInstance().newXPath()
                .compile("/channel/item/title")
                .evaluate(new InputSource(new StringReader(xml)));
        System.out.print(result);
    }
}
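If you would rather stay with SAX, one possible approach (an alternative to the XPath answer above, not part of it) is to track the current element path on a stack and collect text only when the path matches /channel/item/title. The names PathAwareSax, ItemTitleHandler and itemTitles are hypothetical and exist only for this sketch.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class PathAwareSax {

    // Hypothetical handler: keeps only <title> elements nested as /channel/item/title.
    static class ItemTitleHandler extends DefaultHandler {
        private static final String TARGET = "/channel/item/title";
        private final Deque<String> path = new ArrayDeque<>(); // root first, current element last
        private StringBuilder buf;                             // non-null only inside a target element
        final List<String> titles = new ArrayList<>();

        private boolean atTarget() {
            return TARGET.equals("/" + String.join("/", path));
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            path.addLast(qName);
            if (atTarget()) {
                buf = new StringBuilder();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (buf != null) {
                buf.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (buf != null && atTarget()) {
                titles.add(buf.toString().trim());
                buf = null;
            }
            path.removeLast();
        }
    }

    // Parses the XML and returns the text of every /channel/item/title element.
    static List<String> itemTitles(String xml) throws Exception {
        ItemTitleHandler handler = new ItemTitleHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.titles;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<channel><title>Site Name</title>"
                + "<item><title>News Title!</title></item></channel>";
        System.out.println(itemTitles(xml)); // the channel-level title is skipped
    }
}
```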
Special characters in Text node not getting parsed by SAX's characters() method
Here, the parameter 'char[] ch' is supposed to fetch the entire line "Deals & Dealmakers: Technology, media and communications M&A", but it is only getting "Deals ".
You seem to be assuming that you'll get the whole text in one call. There's no guarantee of that. I strongly suspect that your characters method will be called multiple times for the same text node, which is valid for the parser to do. You need to make sure your code handles that.
From the documentation:
SAX parsers may return all contiguous character data in a single chunk, or they may split it into several chunks; however, all of the characters in any single event must come from the same external entity so that the Locator provides useful information.
There may be a feature you can set to ensure you get all the data in one go; I'm not sure.