text = Autotune exists! Hoorah! You can use microbolus-related features. {"iob":0.121,
"activity":0.0079,
"basaliob":-1.447,
"bolusiob":1.568,
"netbasalinsulin":-1.9,
"bolusinsulin":6.5,
"time":"2022-12-25T21:17:45.000Z",
"iobWithZeroTemp":
{"iob":0.121,
"activity":0.0079,
"basaliob":-1.447,
"bolusiob":1.568,
"netbasalinsulin":-1.9,
"bolusinsulin":6.5,
"time":"2022-12-25T21:17:45.000Z"},
"lastBolusTime":1671999216000,
"lastTemp":
{"rate":0,
"timestamp":"2022-12-25T23:56:14+03:00",
"started_at":"2022-12-25T20:56:14.000Z",
"date":1672001774000,
"duration":22.52}}
import json
import re

# Regular expression pattern to match nested JSON objects
pattern = r'(?<=\{)\s*[^{]*?(?=[\},])'
matches = re.findall(pattern, text)
parsed_objects = [json.loads(match) for match in matches]
for obj in parsed_objects:
    print(obj)
JSONDecodeError: Extra data: line 1 column 6 (char 5)
Here is an attempt to get all valid JSON dicts from text using JSONDecoder.raw_decode():
text = """\
text = Autotune exists! Hoorah! You can use microbolus-related features. {"iob":0.121,
"activity":0.0079,
"basaliob":-1.447,
"bolusiob":1.568,
"netbasalinsulin":-1.9,
"bolusinsulin":6.5,
"time":"2022-12-25T21:17:45.000Z",
"iobWithZeroTemp":
{"iob":0.121,
"activity":0.0079,
"basaliob":-1.447,
"bolusiob":1.568,
"netbasalinsulin":-1.9,
"bolusinsulin":6.5,
"time":"2022-12-25T21:17:45.000Z"},
"lastBolusTime":1671999216000,
"lastTemp":
{"rate":0,
"timestamp":"2022-12-25T23:56:14+03:00",
"started_at":"2022-12-25T20:56:14.000Z",
"date":1672001774000,
"duration":22.52}}
This is some other text with { not valid JSON }
{"another valid JSON object": [1, 2, 3]}
"""
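import json

decoder = json.JSONDecoder()

decoded_objs, idx = [], 0
while True:
    # Find the next candidate opening brace; stop when none are left.
    try:
        idx = text.index("{", idx)
    except ValueError:
        break

    # Decode as many consecutive JSON values as possible from this position.
    while True:
        try:
            obj, new_idx = decoder.raw_decode(text[idx:])
            decoded_objs.append(obj)
            idx += new_idx  # new_idx is relative to the slice
        except json.decoder.JSONDecodeError:
            idx += 1  # not valid JSON here, move past this character
            break

print(decoded_objs)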
Prints:
[
    {
        "iob": 0.121,
        "activity": 0.0079,
        "basaliob": -1.447,
        "bolusiob": 1.568,
        "netbasalinsulin": -1.9,
        "bolusinsulin": 6.5,
        "time": "2022-12-25T21:17:45.000Z",
        "iobWithZeroTemp": {
            "iob": 0.121,
            "activity": 0.0079,
            "basaliob": -1.447,
            "bolusiob": 1.568,
            "netbasalinsulin": -1.9,
            "bolusinsulin": 6.5,
            "time": "2022-12-25T21:17:45.000Z",
        },
        "lastBolusTime": 1671999216000,
        "lastTemp": {
            "rate": 0,
            "timestamp": "2022-12-25T23:56:14+03:00",
            "started_at": "2022-12-25T20:56:14.000Z",
            "date": 1672001774000,
            "duration": 22.52,
        },
    },
    {"another valid JSON object": [1, 2, 3]},
]
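The same idea can also be wrapped in a reusable generator. This is only a sketch, not part of the code above: the name iter_json_objects and the sample string are made up for illustration. It relies on the fact that raw_decode() accepts a start index as its second argument, so the text does not need to be sliced on every attempt.

```python
import json
from typing import Any, Iterator


def iter_json_objects(text: str) -> Iterator[Any]:
    """Yield each JSON object found at a '{' in text (hypothetical helper)."""
    decoder = json.JSONDecoder()
    idx = 0
    while True:
        idx = text.find("{", idx)
        if idx == -1:
            return  # no more candidate braces
        try:
            # raw_decode(s, idx) parses starting at idx and returns
            # (object, index just past the parsed value).
            obj, end = decoder.raw_decode(text, idx)
        except json.JSONDecodeError:
            idx += 1  # not valid JSON at this brace, try the next one
        else:
            yield obj
            idx = end  # resume scanning after the decoded object


# Illustrative input, not the data from the question.
sample = 'noise {"a": {"b": 1}} more { not json } {"c": [1, 2, 3]}'
print(list(iter_json_objects(sample)))
```

Because scanning resumes after each decoded object, nested objects are returned only as part of their parent and are not reported a second time.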
This is hopeless. You can easily enough strip out any text before an initial {, but scanning random text looking for JSON is not trivial.
Your JSON has nested objects. The regexp only matches objects with no nesting.
Using lookarounds to match the { and } means you’ll just get the middle of the object. But that’s not valid JSON by itself.
@Barmar can you help with pattern?
No, this is not an appropriate use of regexp, for the reason @TimRoberts explained.
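To make the points in these comments concrete, here is a small sketch. The text string is a shortened, made-up stand-in for the data in the question; the pattern is the one from the question. It shows that the lookarounds capture only a fragment from the middle of each object, that json.loads() rejects those fragments, and that stripping everything before the first { is enough when the JSON runs to the end of the text.

```python
import json
import re

# Shortened, illustrative stand-in for the text in the question.
text = 'Some log text. {"iob":0.121,"lastTemp":{"rate":0,"duration":22.52}}'

# Pattern from the question: the lookarounds exclude the braces themselves.
pattern = r'(?<=\{)\s*[^{]*?(?=[\},])'
fragments = re.findall(pattern, text)
print(fragments)  # pieces from inside the objects, not whole objects

# Each fragment is a bare key/value pair, which is not valid JSON by itself.
errors = []
for fragment in fragments:
    try:
        json.loads(fragment)
    except json.JSONDecodeError as exc:
        errors.append(str(exc))
print(errors)

# Stripping the leading text works here only because the JSON
# extends to the very end of the string.
whole = json.loads(text[text.index("{"):])
print(whole["lastTemp"]["duration"])
```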