Parsing large JSON file in .NET
As you've correctly diagnosed in your update, the issue is that the JSON has a closing ]
followed immediately by an opening [
to start the next set. This format makes the JSON invalid when taken as a whole, and that is why Json.NET throws an error.
Fortunately, this problem seems to come up often enough that Json.NET actually has a special setting to deal with it. If you use a JsonTextReader
directly to read the JSON, you can set the SupportMultipleContent
flag to true
, and then use a loop to deserialize each item individually.
This should allow you to process the non-standard JSON successfully and in a memory efficient manner, regardless of how many arrays there are or how many items in each array.
using (WebClient client = new WebClient())
using (Stream stream = client.OpenRead(stringUrl))
using (StreamReader streamReader = new StreamReader(stream))
using (JsonTextReader reader = new JsonTextReader(streamReader))
{
reader.SupportMultipleContent = true;
var serializer = new JsonSerializer();
while (reader.Read())
{
if (reader.TokenType == JsonToken.StartObject)
{
Contact c = serializer.Deserialize<Contact>(reader);
Console.WriteLine(c.FirstName + " " + c.LastName);
}
}
}
Full demo here: https://dotnetfiddle.net/2TQa8p
How to parse huge JSON file as stream in Json.NET?
This should resolve your problem. Basically it works just like your initial code except it's only deserializing object when the reader hits the {
character in the stream and otherwise it's just skipping to the next one until it finds another start object token.
JsonSerializer serializer = new JsonSerializer();
MyObject o;
using (FileStream s = File.Open("bigfile.json", FileMode.Open))
using (StreamReader sr = new StreamReader(s))
using (JsonReader reader = new JsonTextReader(sr))
{
while (reader.Read())
{
// deserialize only when there's "{" character in the stream
if (reader.TokenType == JsonToken.StartObject)
{
o = serializer.Deserialize<MyObject>(reader);
}
}
}
Reading Large JSON file into variable in C#.net
Use 64 bit, check RredCat's answer on a similar question:
Newtonsoft.Json - Out of memory exception while deserializing big object
NewtonSoft Jason Performance Tips
Read the article by David Cox about tokenizing:
"The basic approach is to use a JsonTextReader object, which is part of the Json.NET library. A JsonTextReader reads a JSON file one token at a time. It, therefore, avoids the overhead of reading the entire file into a string. As tokens are read from the file, objects are created and pushed onto and off of a stack. When the end of the file is reached, the top of the stack contains one object — the top of a very big tree of objects corresponding to the objects in the original JSON file"
Parsing Big Records with Json.NET
Deserializing large json from WebService using Json.NET
What you can do is to adopt the basic approach of Issues parsing a 1GB json file using JSON.NET and Deserialize json array stream one item at a time, which is to stream through the JSON and deserialize and yield return each object; but in addition apply some stateful filtering expression to deserialize only StartObject
tokens matching the path d.results[*]
.
To do this, first define the following interface and extension method:
public interface IJsonReaderFilter
{
public bool ShouldDeserializeToken(JsonReader reader);
}
public static class JsonExtensions
{
public static IEnumerable<T> DeserializeSelectedTokens<T>(Stream stream, IJsonReaderFilter filter, JsonSerializerSettings settings = null, bool leaveOpen = false)
{
using (var sr = new StreamReader(stream, leaveOpen : leaveOpen))
using (var reader = new JsonTextReader(sr))
foreach (var item in DeserializeSelectedTokens<T>(reader, filter, settings))
yield return item;
}
public static IEnumerable<T> DeserializeSelectedTokens<T>(JsonReader reader, IJsonReaderFilter filter, JsonSerializerSettings settings = null)
{
var serializer = JsonSerializer.CreateDefault(settings);
while (reader.Read())
if (filter.ShouldDeserializeToken(reader))
yield return serializer.Deserialize<T>(reader);
}
}
Now, to filter only those items matching the path d.results[*]
, define the following filter:
class ResultsFilter : IJsonReaderFilter
{
const string path = "d.results";
const int pathDepth = 2;
bool inArray = false;
public bool ShouldDeserializeToken(JsonReader reader)
{
if (!inArray && reader.Depth == pathDepth && reader.TokenType == JsonToken.StartArray && string.Equals(reader.Path, "d.results", StringComparison.OrdinalIgnoreCase))
{
inArray = true;
return false;
}
else if (inArray && reader.Depth == pathDepth + 1 && reader.TokenType == JsonToken.StartObject)
return true;
else if (inArray && reader.Depth == pathDepth && reader.TokenType == JsonToken.EndArray)
{
inArray = false;
return false;
}
else
{
return false;
}
}
}
Next, create the following data model for each result:
public class Metadata
{
public string id { get; set; }
public string uri { get; set; }
public string type { get; set; }
}
public class Result
{
public Metadata metadata { get; set; }
public string ID { get; set; }
public string Value1 { get; set; }
public string Value2 { get; set; }
public string Value3 { get; set; }
}
And now you can deserialize your JSON stream incrementally as follows:
foreach (var result in JsonExtensions.DeserializeSelectedTokens<Result>(dataStream, new ResultsFilter()))
{
// Process each result in some manner.
result.Dump();
}
Demo fiddle here.
How to Save Large Json Data?
Instead of serializing to a string, and then writing the string to a stream, stream it directly:
using var stream = File.Create("file.json");
JsonSerializer.Serialize(stream, content, new JsonSerializerOptions
{
WriteIdented = true
});
Related Topics
How to Force Https Using a Web.Config File
Why Should You Remove Unnecessary C# Using Directives
Decompressing Gzip Stream from Httpclient Response
Convert String to Datetime in C#
Entity Framework Provider Type Could Not Be Loaded
Writing to a Textbox from Another Thread
General Purpose Fromevent Method
Unity - Need to Return Value Only After Coroutine Finishes
How to Bind an Enum to a Combobox Control in Wpf
How to Convert Rgb Color to Hsv
Alphanumeric Sorting Using Linq
Omitting All Xsi and Xsd Namespaces When Serializing an Object in .Net
How to Get First N Elements of a List in C#
Loop Through All the Resources in a .Resx File
Best Way to Repeat a Character in C#