Consider this JSON file named h.json. I want to convert it into a Python dataclass.

{
    "acc1":{
        "email":"[email protected]",
        "password":"acc1",
        "name":"ACC1",
        "salary":1
    },
    "acc2":{
        "email":"[email protected]",
        "password":"acc2",
        "name":"ACC2",
        "salary":2
    }

}

I could use an alternative constructor for getting each account, for example:

import json
from dataclasses import dataclass

@dataclass
class Account(object):
    email:str
    password:str
    name:str
    salary:int
    
    @classmethod
    def from_json(cls, json_key):
        # Read the file with a context manager so the handle gets closed.
        with open("h.json") as f:
            file = json.load(f)
        return cls(**file[json_key])

but this is limited to what arguments (email, name, etc.) were defined in the dataclass.

What if I were to modify the JSON to include another key, say age? The script would raise a TypeError, specifically TypeError: __init__() got an unexpected keyword argument 'age'.
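
A minimal reproduction of that error, as a sketch that uses an inline dict instead of reading h.json. The record values (including the example.com address and the age of 30) are made up for illustration:

```python
from dataclasses import dataclass


@dataclass
class Account:
    email: str
    password: str
    name: str
    salary: int


# Hypothetical record: same shape as an entry in h.json, plus an extra "age" key.
record = {
    "email": "acc1@example.com",
    "password": "acc1",
    "name": "ACC1",
    "salary": 1,
    "age": 30,
}

try:
    Account(**record)
except TypeError as exc:
    # The generated __init__ only accepts the four declared fields.
    error_message = str(exc)
    print(error_message)
```

The exact wording of the message varies slightly between Python versions, but it names the unexpected 'age' keyword in each case.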

Is there a way to dynamically adjust the class attributes based on the keys of the dict (json object), so that I don't have to add attributes each time I add a new key to the json?

  • For such flexibility, it's better to keep the data as a dict instead of trying to fit it to a class. Commented Oct 29, 2021 at 18:50
  • The point of a dataclass is that it keeps you from defining new fields like this. If you want to dynamically change what fields can be defined, you can use a class. Commented Oct 29, 2021 at 18:51

3 Answers

Since it sounds like your data is expected to be dynamic and you want the freedom to add more fields to the JSON object without reflecting the same changes in the model, I'd suggest checking out typing.TypedDict instead of a dataclass.

Here's an example with TypedDict, which should work in Python 3.7+. Since TypedDict was introduced in 3.8, I've instead imported it from typing_extensions so it's compatible with 3.7 code.

from __future__ import annotations

import json
from io import StringIO
from typing_extensions import TypedDict


class Account(TypedDict):
    email: str
    password: str
    name: str
    salary: int


json_data = StringIO("""{
    "acc1":{
        "email":"[email protected]",
        "password":"acc1",
        "name":"ACC1",
        "salary":1
    },
    "acc2":{
        "email":"[email protected]",
        "password":"acc2",
        "name":"ACC2",
        "salary":2,
        "someRandomKey": "string"
    }
}
""")

data = json.load(json_data)
name_to_account: dict[str, Account] = data

acct = name_to_account['acc2']

# Your IDE should be able to offer auto-complete suggestions within the
# brackets, when you start typing or press 'Ctrl + Space' for example.
print(acct['someRandomKey'])
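
To make the runtime behaviour concrete, here is a small sketch (assuming Python 3.8+, where TypedDict can be imported from typing directly; the record values are made up). It shows that a TypedDict value is an ordinary dict at runtime, so undeclared keys are kept rather than rejected:

```python
from typing import Dict, TypedDict  # TypedDict lives in typing from Python 3.8 on


class Account(TypedDict):
    email: str
    password: str
    name: str
    salary: int


# Hypothetical record with one key that the class does not declare.
raw = {
    "email": "acc2@example.com",
    "password": "acc2",
    "name": "ACC2",
    "salary": 2,
    "someRandomKey": "string",
}

# The annotation is only a hint: no conversion or validation happens here.
accounts: Dict[str, Account] = {"acc2": raw}
acct = accounts["acc2"]

is_plain_dict = type(acct) is dict          # still a regular dict at runtime
extra_value = acct.get("someRandomKey")     # the undeclared key is preserved
declared_keys = sorted(Account.__annotations__)

print(is_plain_dict, extra_value, declared_keys)
```

A static type checker may still flag the undeclared key; the point of the sketch is only that nothing fails when the code runs.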

If you are set on using dataclasses to model your data, I'd suggest checking out a JSON serialization library like the dataclass-wizard (disclaimer: I am the creator) which should handle extraneous fields in the JSON data as mentioned, as well as a nested dataclass model if you find your data becoming more complex.

It also has a handy tool that you can use to generate a dataclass schema from JSON data, which can be useful for instance if you want to update your model class whenever you add new fields in the JSON file as mentioned.

5 Comments

wow, TypedDict !!! very good idea
Yep, definitely agree. It's a cool but, I feel, rather little-known feature of typing :-)
@rv.kvetch, this is something which will definitely come in handy, thanks for letting me know. For my specific use case, though, I have many other methods in the Account class besides the alternative constructor, and inheriting from TypedDict limits the class body to annotations only. I also don't get type hints for "someRandomKey", which is understandable since I haven't specified that field in the class.
Ah, that definitely makes sense. Yep agreed, one limitation of TypedDict is you can't define and use methods as you normally would. If you are still set on using dataclasses, I'd suggest checking out the linked library above as it has a CLI tool you can use to convert a JSON schema to a dataclass model, which can potentially be used if you add a bunch of new JSON fields. It is actually inspired in part by the other excellent tool here: russbiggs.github.io/json2dataclass
Insanely awesome. Works really well with a JSON dict list: {"mystr": [{"mystr2": {mystr3: -999}, more dictionaries...}]}. json_data_to_class = dict[list, Myclass]= _json_data. In other words, it looks like this technique works with many JSON formats.

With this approach you lose some dataclass features, such as:

  • Determining whether a field is optional or not
  • Auto-completion

However, you know your project best, so decide accordingly.

There are probably many ways to do this; here is one of them:

import json
from dataclasses import dataclass, fields

@dataclass
class Account(object):
    email: str
    password: str
    name: str
    salary: int

    @classmethod
    def from_json(cls, json_key):
        # "1.txt" is assumed to hold the question's JSON with an extra "age" key
        with open("1.txt") as fp:
            file = json.load(fp)
        keys = [f.name for f in fields(cls)]
        # or: keys = cls.__dataclass_fields__.keys()
        json_data = file[json_key]
        # keys declared on the dataclass are passed to the constructor
        normal_json_data = {key: json_data[key] for key in json_data if key in keys}
        # every other key is attached afterwards as a plain instance attribute
        anormal_json_data = {key: json_data[key] for key in json_data if key not in keys}
        tmp = cls(**normal_json_data)
        for anormal_key in anormal_json_data:
            setattr(tmp, anormal_key, anormal_json_data[anormal_key])
        return tmp

test = Account.from_json("acc1")
print(test.age)
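
A possible variation on the same idea, sketched under a couple of assumptions: instead of attaching unknown keys with setattr, they are collected in one declared dict field, so they still show up in the repr and in dataclasses.asdict. The extras field and the from_dict name are hypothetical and are not part of the code above; the record is passed in as a dict so the sketch does not depend on a file.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict


@dataclass
class Account:
    email: str
    password: str
    name: str
    salary: int
    # Hypothetical catch-all for keys the class does not declare.
    extras: Dict[str, Any] = field(default_factory=dict)

    @classmethod
    def from_dict(cls, record: Dict[str, Any]) -> "Account":
        # Declared field names, minus the catch-all itself.
        declared = {f.name for f in fields(cls)} - {"extras"}
        known = {k: v for k, v in record.items() if k in declared}
        unknown = {k: v for k, v in record.items() if k not in declared}
        return cls(**known, extras=unknown)


# Made-up record with one undeclared key.
acct = Account.from_dict(
    {"email": "acc1@example.com", "password": "acc1", "name": "ACC1", "salary": 1, "age": 30}
)
print(acct.salary, acct.extras)
```

The trade-off is that extra values are read as acct.extras["age"] rather than acct.age, but the declared fields keep their type hints and auto-completion.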

8 Comments

The __dataclass_fields__ attribute is internal to the dataclasses module and could change at any time; you should prefer to use dataclasses.fields here instead (which is documented).
@rv.kvetch thanks, but for this usage it doesn't matter
@rv.kvetch yes, it's true, thanks 🌹
@rv.kvetch Now you can't create a new Account object without password, email, and so on, because those are required fields, while age is optional. For example, if you want email to be optional (meaning you can create an Account without passing email), you must give it a default: email: Optional[str] = None, and move it below the fields that have no default.
yes @Kanishk, sorry

For a flat (non-nested) dataclass, the code below does the job.
If you need to handle nested dataclasses, you should use a framework like dacite.
Note 1: loading the data from the JSON file should not be part of your class logic.

Note 2: if your JSON can contain anything, you cannot map it to a dataclass and will have to work with a dict.

from dataclasses import dataclass
from typing import List

data = {
    "acc1":{
        "email":"[email protected]",
        "password":"acc1",
        "name":"ACC1",
        "salary":1
    },
    "acc2":{
        "email":"[email protected]",
        "password":"acc2",
        "name":"ACC2",
        "salary":2
    }

}



@dataclass
class Account:
    email:str
    password:str
    name:str
    salary:int

accounts: List[Account] = [Account(**x) for x in data.values()]
print(accounts)

output

[Account(email='[email protected]', password='acc1', name='ACC1', salary=1), Account(email='[email protected]', password='acc2', name='ACC2', salary=2)]
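
If the JSON may carry a few extra keys that you simply want to ignore, one option is to filter each record against the declared fields before constructing the dataclass, and to keep the parsing in a separate function as Note 1 suggests. This is only a sketch: load_accounts is a hypothetical helper name, and the sample JSON text is made up.

```python
import json
from dataclasses import dataclass, fields
from typing import Any, Dict


@dataclass
class Account:
    email: str
    password: str
    name: str
    salary: int


def load_accounts(text: str) -> Dict[str, Account]:
    """Parse JSON text and build one Account per top-level key.

    Keys that are not declared on Account are silently dropped.
    """
    declared = {f.name for f in fields(Account)}
    raw: Dict[str, Dict[str, Any]] = json.loads(text)
    return {
        key: Account(**{k: v for k, v in record.items() if k in declared})
        for key, record in raw.items()
    }


# Made-up sample with one undeclared key ("age").
sample = (
    '{"acc1": {"email": "acc1@example.com", "password": "acc1", '
    '"name": "ACC1", "salary": 1, "age": 30}}'
)
accounts = load_accounts(sample)
print(accounts["acc1"])
```

Dropping unknown keys silently can hide typos in the JSON, so logging or collecting the discarded keys may be worth adding in real code.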

1 Comment

I was going to do the same thing with obj.__dict__.update(x), but this is better.
