I’m writing a custom MessagePack serializer to add to an old Python project that uses MessagePack extensively. In particular, I often build up lists of objects that I need to serialize all at once, but the GIL prevents doing that in parallel, so I want to do it in Rust, and make it faster in other ways besides.
I have already gotten fundamental Python types to serialize this way, but I also want to serialize some of the custom classes that my Python app has, and would prefer not to port those into Rust. Even though PyO3 would make that easy to do, I’m nervous about adding a hard dependency on this Rust code, and would like a fallback to the standard Python MessagePack library.
So I started by writing a struct to represent a singleton:
struct PyFinalRule(PyAny);
unsafe impl PyTypeInfo for PyFinalRule {
    const NAME: &'static str = "FinalRule";
    const MODULE: Option<&'static str> = Option::Some("LiSE.util");
    type AsRefTarget = PyAny;
    fn type_object_raw(py: Python<'_>) -> *mut PyTypeObject {
        let modu = py.import("LiSE.util").unwrap();
        let final_rule = modu.getattr("FinalRule").unwrap();
        final_rule.get_type_ptr()
    }
}
unsafe impl PyNativeType for PyFinalRule {}
I was expecting to be able to use PyFinalRule::is_type_of to check when a PyAny object is really an instance of FinalRule. Instead, PyFinalRule::is_type_of always returns false when passed the one instance of the FinalRule singleton. It returns true when passed the FinalRule type object, but that’s not very useful, since the app serializes the instance and not the type object.
How do I get PyFinalRule::is_type_of to really check the type?
I would advise getting the Python type object for your custom class. You can do this in a helper function or as part of your serialization logic, like this:
use pyo3::prelude::*;
use pyo3::types::PyType;
fn get_final_rule_type(py: Python) -> Py<PyType> {
    let modu = py.import("LiSE.util").expect("Failed to import module");
    let final_rule = modu.getattr("FinalRule").expect("Failed to get FinalRule class");
    final_rule.extract::<Py<PyType>>().expect("Failed to extract PyType")
}
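Then check the instance type like this:
fn is_final_rule_instance(obj: &PyAny, final_rule_type: &PyType) -> PyResult<bool> {
    // is_instance takes the type object as an argument;
    // is_instance_of is the generic variant that takes a Rust type parameter instead
    obj.is_instance(final_rule_type)
}
Reference: https://pyo3.rs/main/doc/pyo3/types/struct.pyany
And in your serialization logic: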
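#[pyfunction]
fn serialize_custom(py: Python, obj: &PyAny) -> PyResult<Vec<u8>> {
    let final_rule_type = get_final_rule_type(py);
    // Placeholder output buffer, to be filled in by the branches below
    let mut buf: Vec<u8> = Vec::new();
    // Py<PyType> has to be borrowed under the GIL to get a &PyType
    if is_final_rule_instance(obj, final_rule_type.as_ref(py))? {
        // Perform custom serialization for FinalRule instances
    } else {
        // Handle other types
    }
    // ... rest of your serialization logic ...
    Ok(buf)
}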
You don’t need any unsafe code or anything complex to check that; it’s simple:
fn is_instance_of_final_rule(py: Python<'_>, object: &PyAny) -> PyResult<bool> {
    let module = py.import("LiSE.util")?;
    let final_rule = module.getattr("FinalRule")?;
    object.is_instance(final_rule)
}
Of course, I’d recommend caching the FinalRule object somewhere, so the import and attribute lookup don’t run on every call.
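As a rough sketch of what such caching could look like, here is the pattern in plain Rust using std::sync::OnceLock. The function lookup_final_rule and the string it returns are hypothetical stand-ins for the import and getattr calls; with PyO3 you would more likely keep a Py<PyType> in a GIL-aware cell such as pyo3::sync::GILOnceCell, depending on your PyO3 version.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts how often the slow lookup really runs (for demonstration only).
static LOOKUPS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for the expensive step (module import plus getattr).
fn lookup_final_rule() -> String {
    LOOKUPS.fetch_add(1, Ordering::SeqCst);
    String::from("LiSE.util.FinalRule")
}

/// Returns the cached value, running the lookup only on first use.
fn final_rule_cached() -> &'static str {
    static CACHE: OnceLock<String> = OnceLock::new();
    CACHE.get_or_init(lookup_final_rule).as_str()
}
```

Every call after the first returns the stored value without repeating the lookup, which is the property you want for something checked on every serialized object.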
Here’s what I ended up with for type_object_raw:
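fn type_object_raw(py: Python<'_>) -> *mut PyTypeObject {
    let modu = py.import("LiSE.util").unwrap();
    let final_rule = modu.getattr("FinalRule").unwrap();
    let typ = final_rule.extract::<&PyType>().unwrap();
    // as_type_ptr() points at FinalRule itself; get_type_ptr() would point
    // at the type of the class object (its metaclass) instead
    typ.as_type_ptr()
}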
Adapted a bit from Adesoji Alu’s answer.
Accessing Python objects requires holding the GIL even from Rust, so I don’t see how porting to Rust will help you avoid the GIL.
I still have to extract the data while inside the GIL, but can serialize it outside. Copying is cheaper than serializing.
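To illustrate that split, here is a minimal sketch, assuming the data has already been copied out of the Python objects into an owned Rust type. The names Value, encode, encode_one, and encode_all are hypothetical, and only a handful of MessagePack formats are covered. Nothing in it touches Python, so with PyO3 this part could run with the GIL released (for example inside Python::allow_threads, depending on the PyO3 version).

```rust
use std::thread;

/// Owned copy of data extracted from Python objects (hypothetical type;
/// it covers only a few MessagePack types).
#[derive(Debug)]
enum Value {
    Nil,
    Bool(bool),
    UInt(u64),
    Str(String),
    Array(Vec<Value>),
}

/// Appends the MessagePack encoding of `value` to `out`.
/// Assumes strings and arrays hold fewer than 2^32 bytes/elements.
fn encode(value: &Value, out: &mut Vec<u8>) {
    match value {
        Value::Nil => out.push(0xc0),
        Value::Bool(false) => out.push(0xc2),
        Value::Bool(true) => out.push(0xc3),
        // positive fixint: the byte is the value itself
        Value::UInt(n) if *n < 0x80 => out.push(*n as u8),
        Value::UInt(n) => {
            // uint 64: valid for any value, though not always the shortest form
            out.push(0xcf);
            out.extend_from_slice(&n.to_be_bytes());
        }
        Value::Str(s) => {
            let len = s.len();
            if len < 32 {
                out.push(0xa0 | len as u8); // fixstr
            } else {
                out.push(0xdb); // str 32
                out.extend_from_slice(&(len as u32).to_be_bytes());
            }
            out.extend_from_slice(s.as_bytes());
        }
        Value::Array(items) => {
            let len = items.len();
            if len < 16 {
                out.push(0x90 | len as u8); // fixarray
            } else {
                out.push(0xdd); // array 32
                out.extend_from_slice(&(len as u32).to_be_bytes());
            }
            for item in items {
                encode(item, out);
            }
        }
    }
}

/// Encodes a single value into a fresh buffer.
fn encode_one(value: &Value) -> Vec<u8> {
    let mut buf = Vec::new();
    encode(value, &mut buf);
    buf
}

/// Encodes every value, splitting the slice across up to `workers` threads.
/// The output order matches the input order.
fn encode_all(values: &[Value], workers: usize) -> Vec<Vec<u8>> {
    let workers = workers.max(1);
    let chunk_size = ((values.len() + workers - 1) / workers).max(1);
    thread::scope(|scope| {
        let handles: Vec<_> = values
            .chunks(chunk_size)
            .map(|chunk| scope.spawn(move || chunk.iter().map(encode_one).collect::<Vec<_>>()))
            .collect();
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}
```

Calling encode_all(&values, 4) would spread the list over at most four threads. The copy into Value still has to happen while holding the GIL, which is the cost being traded against serialization time.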
Ah, ok. If serializing is that expensive then you’re right.
In fact there’s already a very fast msgpack serializer for Python, ormsgpack, but my app does so much serializing so often that I really want to optimize it every way I can.