Pydantic is a library that validates JSON documents and converts them into Python objects, based on Python type annotations. It works really well, is easy to use, supports everything we could reasonably expect it to support, and saves us a ton of effort.
However, there are a few less intuitive aspects to it, which is what this post is about.
Mystery Int
Puzzler: What is the output of the following code?
from typing import Union

import pydantic


class Project(pydantic.BaseModel):
    value: Union[pydantic.types.StrictBool, int]


def roundtrip(value_in: Union[bool, int]) -> None:
    """
    Convert to json and back
    """
    project_in = Project(value=value_in)
    as_json = project_in.dict(by_alias=True)
    project_out = Project(**as_json)
    print(f"input: {project_in.value}, output: {project_out.value}")


roundtrip(5)
roundtrip(True)
The answer on Python 3.7:
input: 5, output: 5
input: True, output: True
So, the round trip works exactly as expected.
On Python 3.6:
input: 5, output: 5
input: True, output: 1
The boolean also becomes an int?
Step 1: Stepping in
It took us a while to understand the problem. The first obstacle was that the debugger would no longer step into the Pydantic code, and neither breakpoint() calls nor print statements added to it had any effect.
We eventually figured it out:
import pydantic
print(pydantic.__file__)
[...]/pydantic/__init__.cpython-36m-x86_64-linux-gnu.so
Pydantic ships with a cythonized (compiled) version, so the Python source files are not what actually gets executed. By removing the *.so files, we regained the ability to step into the code.
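A quick way to check whether a module is served from a compiled extension rather than from its .py source is to ask the import system where it found the module. The sketch below uses only the standard library; the helper names has_extension_suffix and is_compiled_extension are our own and not part of Pydantic.

```python
import importlib.machinery
import importlib.util


def has_extension_suffix(path: str) -> bool:
    """Return True if the path ends with a compiled-extension suffix (e.g. .so or .pyd)."""
    return path.endswith(tuple(importlib.machinery.EXTENSION_SUFFIXES))


def is_compiled_extension(module_name: str) -> bool:
    """Return True if the named module would be loaded from a compiled extension file."""
    spec = importlib.util.find_spec(module_name)
    if spec is None or spec.origin is None:
        return False
    return has_extension_suffix(spec.origin)


# The standard-library json package is plain Python source.
print(is_compiled_extension("json"))
```

In an environment like the one in this post, we would expect is_compiled_extension("pydantic") to return True for as long as the *.so files are in place.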
Step 2: It’s not your fault
We happily stepped through the Pydantic code, only to realize the problem was not serialization or deserialization, but the initial object construction! We simplified the test case to:
def roundtrip(value_in: Union[bool, int]) -> None:
    """
    Construct the model and compare the input with the stored value
    """
    project_in = Project(value=value_in)
    print(f"input: {value_in}, output: {project_in.value}")
It turned out that Pydantic is not at fault. The type annotation itself is not what we wrote:
>>> Project.__annotations__["value"]
<class 'int'>
The reason is that Python 3.6 flattens unions when they are constructed: all duplicates and strict subtypes are removed. Both bool and StrictBool are subclasses of int, so Union[StrictBool, int] collapses to plain int. In Python 3.7, subclasses are no longer removed.
>>> issubclass(pydantic.types.StrictBool, int)
True
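To make the flattening rule concrete, here is a minimal sketch that mimics it on a plain tuple of classes. It is not the actual typing implementation: flatten_union, StrictBoolLike and NonIntBool are hypothetical names used only for illustration, and no Pydantic import is needed to run it.

```python
from typing import Tuple


def flatten_union(members: Tuple[type, ...]) -> Tuple[type, ...]:
    """Drop duplicates and any member that is a strict subclass of another member."""
    unique = []
    for member in members:
        if member not in unique:
            unique.append(member)
    return tuple(
        member
        for member in unique
        if not any(other is not member and issubclass(member, other) for other in unique)
    )


class StrictBoolLike(int):
    """Stand-in for a strict bool type that subclasses int."""


class NonIntBool:
    """Stand-in for a bool validator type that does not subclass int."""


print(flatten_union((StrictBoolLike, int)))  # only int survives
print(flatten_union((NonIntBool, int)))      # both members survive
```

Under this rule, any union member that subclasses int disappears next to int, while a member that only derives from object is kept.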
The Solution
The solution is simple, but not pretty. We made a copy of StrictBool that is not a subclass of int:
from typing import Any, Union

import pydantic
from pydantic import errors, types


class StrictNonIntBool(object):
    """
    StrictNonIntBool to allow for bools which are not type-coerced and that are not a subclass of int
    Based on StrictBool from pydantic
    """

    @classmethod
    def __get_validators__(cls) -> "types.CallableGenerator":
        yield cls.validate

    @classmethod
    def validate(cls, value: Any) -> bool:
        """
        Ensure that we only allow bools.
        """
        if isinstance(value, bool):
            return value
        raise errors.StrictBoolError()


class Project(pydantic.BaseModel):
    value: Union[StrictNonIntBool, int]
This produces the correct result on both Python 3.6 and Python 3.7.
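The isinstance check in validate is sufficient because a plain int is never an instance of bool, even though True == 1 and every bool is an int. A standalone sketch of the same check, without Pydantic, is shown here; the function name strict_bool and the use of TypeError are illustrative choices, not the library's API.

```python
from typing import Any


def strict_bool(value: Any) -> bool:
    """Return the value unchanged if it is a real bool, otherwise raise TypeError."""
    if isinstance(value, bool):
        return value
    raise TypeError(f"value is not a valid boolean: {value!r}")


print(strict_bool(True))       # True
print(isinstance(1, bool))     # False: an int is not a bool
print(isinstance(True, int))   # True: a bool is an int
```

Because the check rejects ints outright, an int input falls through to the int member of the union instead of being coerced to a bool.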