Merge Json Files, Overwriting

You need to convert each JSON string into an array using json_decode, merge all the arrays using array_merge (later values overwrite earlier ones for matching keys), and then encode the result again using json_encode to get a single JSON string.

$json1 = '{"AEK":{"country":"Greece","shtv":"VEK"} ,
"BER":{"country":"Germany","shtv":"BKE"} ,
"CAR":{"country":"Italy","shtv":"CRA"}}';

$json2 = '{"AEK":{"country":"Greece","shtv":"MOR"} ,
"DAR":{"country":"Turkey","shtv":"DDR"}}';

$json3 = '{"AEK":{"country":"Greece","shtv":"MIL"} ,
"BER":{"country":"Germany","shtv":"BKE"} ,
"CAR":{"country":"Italy","shtv":"KUN"}}';

$arr1 = json_decode($json1,true);
$arr2 = json_decode($json2,true);
$arr3 = json_decode($json3,true);

$finalArr = array_merge($arr1,$arr2,$arr3);

$final_json = json_encode($finalArr);

echo $final_json;
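The same later-wins behaviour can be sketched in Python, where unpacking several dicts into one keeps the last value seen for each key (the data below mirrors the PHP snippet above):

```python
import json

json1 = '{"AEK":{"country":"Greece","shtv":"VEK"},"BER":{"country":"Germany","shtv":"BKE"},"CAR":{"country":"Italy","shtv":"CRA"}}'
json2 = '{"AEK":{"country":"Greece","shtv":"MOR"},"DAR":{"country":"Turkey","shtv":"DDR"}}'
json3 = '{"AEK":{"country":"Greece","shtv":"MIL"},"BER":{"country":"Germany","shtv":"BKE"},"CAR":{"country":"Italy","shtv":"KUN"}}'

# later dicts overwrite earlier ones on key collisions, like array_merge
merged = {**json.loads(json1), **json.loads(json2), **json.loads(json3)}
final_json = json.dumps(merged)
```

As with array_merge, AEK ends up with the value from the last input, and keys keep their first-occurrence order.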

Merge partially overlapping JSON files overwriting old data with most recent data

One solution is to use an extra column in which you keep the file priority.

Here are the steps:

  • Add one column according to the file's priority (1 is the lowest priority, n the highest)
  • Concatenate all the dataframes using concat
  • Remove duplicates by keeping only the rows with the biggest priority:

    • Group rows by hours using groupby
    • Select the largest priority value using transform.

Code:

print(df_1)
#                  wind_v    wind_u   dewpoint       temp        rh  file
# hours
# 1555038000000 -1.412255 -0.023800  283.15000  284.15000  24.53871     1
# 1555048800000 -0.164115 -1.692985  284.32549  286.47823  21.82451     1
# 1555059600000  0.055155 -2.692985  285.04983  288.09226  21.82451     1
# 1555070400000 -2.412255 -0.857100  284.26031  286.47823  33.66288     1
# 1555081200000 -1.192985 -0.055155  284.15000  285.15000  30.98614     1
# 1555092000000 -0.857100  0.114030  283.71146  284.32549  37.11403     1
print(df_2)
#               wind_v  wind_u    dewpoint       temp       rh  file
# hours
# 1555070400000 -0.0572  3.4300  210.152144  292.03969  79.8188     2
# 1555081200000  0.4200  4.7622  207.006067  291.71146  83.1700     2
# 1555092000000  1.1578 -1.2322  205.239848  294.32549  73.7388     2
# 1555102800000  0.1750  0.9200  205.420127  297.86420  83.2532     2
# 1555113600000  0.2778  2.6106  206.944729  297.03969  82.2800     2
# 1555124400000 -2.4828  3.3722  208.115948  296.15000  83.7500     2

df = pd.concat([df_1, df_2])
print(df)
#                  wind_v    wind_u    dewpoint       temp        rh  file
# hours
# 1555038000000 -1.412255 -0.023800  283.150000  284.15000  24.53871     1
# 1555048800000 -0.164115 -1.692985  284.325490  286.47823  21.82451     1
# 1555059600000  0.055155 -2.692985  285.049830  288.09226  21.82451     1
# 1555070400000 -2.412255 -0.857100  284.260310  286.47823  33.66288     1
# 1555081200000 -1.192985 -0.055155  284.150000  285.15000  30.98614     1
# 1555092000000 -0.857100  0.114030  283.711460  284.32549  37.11403     1
# 1555070400000 -0.057200  3.430000  210.152144  292.03969  79.81880     2
# 1555081200000  0.420000  4.762200  207.006067  291.71146  83.17000     2
# 1555092000000  1.157800 -1.232200  205.239848  294.32549  73.73880     2
# 1555102800000  0.175000  0.920000  205.420127  297.86420  83.25320     2
# 1555113600000  0.277800  2.610600  206.944729  297.03969  82.28000     2
# 1555124400000 -2.482800  3.372200  208.115948  296.15000  83.75000     2

df = df[df['file'] == df.groupby("hours")['file'].transform('max')]
print(df)
#                  wind_v    wind_u    dewpoint       temp        rh  file
# hours
# 1555038000000 -1.412255 -0.023800  283.150000  284.15000  24.53871     1
# 1555048800000 -0.164115 -1.692985  284.325490  286.47823  21.82451     1
# 1555059600000  0.055155 -2.692985  285.049830  288.09226  21.82451     1
# 1555070400000 -0.057200  3.430000  210.152144  292.03969  79.81880     2
# 1555081200000  0.420000  4.762200  207.006067  291.71146  83.17000     2
# 1555092000000  1.157800 -1.232200  205.239848  294.32549  73.73880     2
# 1555102800000  0.175000  0.920000  205.420127  297.86420  83.25320     2
# 1555113600000  0.277800  2.610600  206.944729  297.03969  82.28000     2
# 1555124400000 -2.482800  3.372200  208.115948  296.15000  83.75000     2
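Stripped of pandas, the priority trick is just "for each hour, keep the row from the highest-priority file"; a minimal pure-Python sketch, with plain dicts standing in for the dataframes and values abridged from the example above:

```python
# each frame maps hour -> row; the 'file' field carries the priority
df_1 = {1555038000000: {"temp": 284.15000, "file": 1},
        1555070400000: {"temp": 286.47823, "file": 1}}
df_2 = {1555070400000: {"temp": 292.03969, "file": 2}}

merged = {}
for frame in (df_1, df_2):
    for hour, row in frame.items():
        # keep the row only if it comes from a higher-priority file
        if hour not in merged or row["file"] > merged[hour]["file"]:
            merged[hour] = row
```

The pandas version does the same thing in bulk: transform('max') computes the winning priority per hour, and the boolean mask keeps only rows that carry it.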

merge and overwrite json file where id's match, in PHP

You need to swap the arguments in array_merge like so:

$user_array = array_merge($user_array, $values);
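Argument order matters because the later argument wins on key collisions; the same effect can be shown in Python (the user_array/values data here is hypothetical):

```python
user_array = {"name": "old name", "city": "Athens"}
values = {"name": "new name"}

# later mapping wins, so values overwrites user_array
merged = {**user_array, **values}
# with the arguments swapped, the old data would win instead
swapped = {**values, **user_array}
```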

overwrite a json file with objects from another json file using jq

Here is a solution which uses Object Multiplication. Assuming your data is in A.json and B.json:

$ jq -M --argfile b B.json '.[0].elements[0] *= $b[0].elements[0]' A.json

produces

[
  {
    "uri": "https://someurl.com",
    "id": "some-id",
    "keyword": "SomeKeyword",
    "name": "Some Name",
    "description": "Some description for that test result",
    "line": 2,
    "tags": [
      {
        "name": "@sometag",
        "line": 1
      }
    ],
    "elements": [
      {
        "a": 5,
        "b": 2
      }
    ]
  }
]

This approach is easily generalized if your arrays contain more data, but you'll need to understand how corresponding elements should be identified.
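jq's * operator merges objects recursively, with the right-hand side winning wherever the values are not both objects; a rough Python equivalent, for intuition only:

```python
def deep_merge(a, b):
    """Mimic jq's object multiplication: merge b into a recursively."""
    if isinstance(a, dict) and isinstance(b, dict):
        out = dict(a)
        for key, value in b.items():
            out[key] = deep_merge(a[key], value) if key in a else value
        return out
    # non-objects (scalars, arrays): the right-hand side simply replaces the left
    return b

result = deep_merge({"id": "some-id", "elements": [{"a": 1}]},
                    {"elements": [{"a": 5, "b": 2}]})
```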


Regarding the revised question, here is a filter which updates objects of A.json with corresponding objects of B.json having the same .id:

def INDEX(stream; idx_expr):
  reduce stream as $row ({};
    .[$row | idx_expr | if type != "string" then tojson else . end] |= $row);

def merge_by_id(a; b):
  if b then INDEX(a[]; .id) * INDEX(b[]; .id) | map(.) else a end;

INDEX($b[]; .id) as $i
| map( .elements = merge_by_id(.elements; $i[.id].elements) )

For example, if the above filter is in filter.jq, A.json contains the revised sample data, and B.json contains

[
  {
    "id": "safety-tests",
    "elements": [
      {
        "id": "some-element-id",
        "description": "updated description"
      }
    ]
  }
]

The command

$ jq -M --argfile b B.json -f filter.jq A.json

generates the result

[
  {
    "uri": "some/url.feature",
    "id": "safety-tests",                       <------ top level .id
    ...
    "elements": [
      {
        "id": "some-element-id",                <------ element .id
        "keyword": "Scenario Outline",
        "name": ": Some scenario name",
        "description": "updated description",   <------ updated value
        "line": 46,
        "type": "scenario",
        ...

Note that the above solution assumes the .id values of the elements in A.json are unique; otherwise merge_by_id won't produce the desired output. In that case the following filter should suffice:

def INDEX(stream; idx_expr):
  reduce stream as $row ({};
    .[$row | idx_expr | if type != "string" then tojson else . end] |= $row);

(INDEX($b[]; .id) | map_values(INDEX(.elements[]; .id))) as $i
| map( $i[.id] as $o | if $o then .elements |= map($o[.id] // .) else . end )

This filter only requires the .id values of the objects in B.json to be unique. If non-unique elements are possible in both A.json and B.json, then a more sophisticated mapping than this one will be required.

Here is a version of the filter with comments:

def INDEX(stream; idx_expr):
  reduce stream as $row ({};
    .[$row | idx_expr | if type != "string" then tojson else . end] |= $row);

# first create a lookup table for elements from B.json
(                                        # [{id:x, elements:[{id:y, ...}]}]
  INDEX($b[]; .id)                       # -> {x: {id:x, elements:[{id:y, ...}]}..}
  | map_values(INDEX(.elements[]; .id))  # -> {x: {y: {id:y, ...}}}
) as $i

# update A.json objects
| map(                                   # for each object in A.json
    $i[.id] as $o                        # do we have updated values from B.json?
    | if $o then .elements |= map($o[.id] // .)  # if so, replace corresponding elements
      else . end                         # otherwise leave object unchanged
  )
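The lookup-table idea translates directly to Python: index B's elements by (object id, element id), then walk A replacing any element that has an updated counterpart. Toy data stands in for A.json and B.json:

```python
a = [{"id": "safety-tests",
      "elements": [{"id": "some-element-id", "description": "old"},
                   {"id": "other-id", "description": "untouched"}]}]
b = [{"id": "safety-tests",
      "elements": [{"id": "some-element-id", "description": "updated description"}]}]

# lookup table: top-level id -> {element id -> replacement element}
b_index = {obj["id"]: {el["id"]: el for el in obj["elements"]} for obj in b}

for obj in a:
    replacements = b_index.get(obj["id"])
    if replacements:
        # keep the original element when no replacement exists (jq's `// .`)
        obj["elements"] = [replacements.get(el["id"], el) for el in obj["elements"]]
```

Note that, like the jq filter's map($o[.id] // .), this replaces whole elements rather than merging their fields.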

Merging two Objects without overwriting

var o = {
  1: {
    "a": { "1": "test 1", "2": "test 2" },
    "b": { "3": "test 3", "4": "test 4" }
  },
  2: {
    "a": { "1": "test 5", "2": "test 6" },
    "b": { "3": "test 7", "4": "test 8" }
  }
}

function mergeObject (base, toMerge) {
  // loop through the keys in the item to be merged (`toMerge`)
  for (var key in toMerge) {
    // if `base[key]` is not already an object, set it as one
    base[key] = base[key] || {}
    // look through the keys in `toMerge[key]`
    for (var k in toMerge[key]) {
      // if the base already has an array at `base[key][k]`
      if (Array.isArray(base[key][k])) {
        // then push the current element
        base[key][k].push(toMerge[key][k])
      } else {
        // otherwise, create an array and set `toMerge[key][k]` as the first element
        base[key][k] = [toMerge[key][k]]
      }
    }
  }
  return base
}

function merge (o1, o2) {
  return mergeObject(mergeObject({}, o1), o2)
}

document.write(JSON.stringify(merge(o[1], o[2])))

Merging json files together in php

The function array_merge has this overwriting behaviour, as specified in the manual:

If the input arrays have the same string keys, then the later value for that key will overwrite the previous one.

This effect can be illustrated with this little example:

$a1 = array("a" => 1, "b" => 2);
$a2 = array("a" => 100, "b" => 200);
$result = array_merge($a1, $a2);
print_r (json_encode($result));

output:

{"a":100,"b":200}

So, the values of the first array are lost.

There are several solutions, but it depends on which result you would like to get. If for instance you would like to get this:

{"a":[1, 100],"b":[2, 200]}

Then use the function array_merge_recursive instead of array_merge.

If you prefer to get this:

[{"a":1,"b":2},{"a":100,"b":200}]

Then use this code:

$result[] = $a1;
$result[] = $a2;

In your original code, that last solution would look like this:

$result[] = json_decode($json,true); 
foreach ($dir as $fileinfo) {
// ...
$result[] = $current;
// ...
}
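The three possible outcomes can be sketched side by side in Python, with plain dicts standing in for the decoded JSON:

```python
a1 = {"a": 1, "b": 2}
a2 = {"a": 100, "b": 200}

# array_merge: the later value wins per key
overwritten = {**a1, **a2}
# array_merge_recursive: values sharing a key are collected into a list
# (this comprehension assumes both arrays have the same keys)
collected = {k: [a1[k], a2[k]] for k in a1}
# $result[] = ...: keep each decoded file as its own entry
stacked = [a1, a2]
```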

JSON merge and override array

I have made a node package called json_merger; it's probably a little overkill, but it will do the job:

// https://www.npmjs.com/package/json_merger
var json_merger = require('json_merger');

var a = {
  "servers": {
    "services": [
      {
        "name": "api",
        "prop1": "XXX",
        "prop2": "XXX"
      },
      {
        "name": "web",
        "prop1": "XXX",
        "prop2": "XXX"
      }
    ]
  }
};

var b = {
  "servers": {
    "services": [
      {
        "@match": "[name=web]",
        "prop1": "overriden value"
      }
    ]
  }
};

var result = json_merger.merge(a, b);

But any decent merge tool would preserve the indexes:

// This is pseudo code:
var a = [{...}, {...}, {...}]

// to override object at index 1 you should be able to do this:
merge(a, [{/*intentional empty object, so we get index up*/}, 'new value at index 1']);
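That pseudo-code amounts to positional overwriting, where an empty placeholder means "keep the original entry at this index"; a minimal Python sketch of the idea:

```python
def merge_by_index(a, b):
    """Overwrite entries of a positionally; empty placeholders keep the original."""
    out = list(a)
    for i, value in enumerate(b):
        if value not in ({}, None):  # {} / None act as "skip this index"
            out[i] = value
    return out

merged = merge_by_index(["old 0", "old 1", "old 2"], [{}, "new value at index 1"])
```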

Combine multiple json to single json using jq

This approach uses INDEX to create a dictionary of unique elements based on their .name field, and reduce to iterate over the group fields to be considered. The initial state is created by combining the slurped (-s) input files using add, after first removing the group fields with del so they can be processed separately.

jq -s '
[ "group1", "group2" ] as $gs | . as $in | reduce $gs[] as $g (
map(del(.[$gs[]])) | add; .[$g] = [INDEX($in[][$g][]; .name)[]]
)
' file1.json file2.json
{
  "version": 4,
  "group1": [
    {
      "name": "olditem1",
      "content": "new content"
    },
    {
      "name": "newitem1"
    }
  ],
  "group2": [
    {
      "name": "olditem2"
    },
    {
      "name": "newitem2"
    }
  ]
}
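The same per-name deduplication can be sketched in Python, with toy data mirroring the file1.json/file2.json shapes implied by the output above:

```python
file1 = {"version": 4,
         "group1": [{"name": "olditem1", "content": "old content"}],
         "group2": [{"name": "olditem2"}]}
file2 = {"group1": [{"name": "olditem1", "content": "new content"},
                    {"name": "newitem1"}],
         "group2": [{"name": "newitem2"}]}

# non-group fields: later file wins, like jq's `add`
merged = {k: v for src in (file1, file2) for k, v in src.items()
          if k not in ("group1", "group2")}

for group in ("group1", "group2"):
    index = {}  # name -> item; later files overwrite, like INDEX(...; .name)
    for src in (file1, file2):
        for item in src.get(group, []):
            index[item["name"]] = item
    merged[group] = list(index.values())
```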



