A reasonably fast DAG-CBOR parser for Python
Convert between DAG-CBOR and Python objects at hundreds of megabytes per second. Take a look at the benchmarks.
Other than speed, a distinguishing feature is that it operates non-recursively. This means you can decode or encode arbitrarily deeply nested objects without running out of call stack (although of course you might still run out of heap).
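To illustrate what "non-recursive" buys you, here is a minimal pure-Python sketch (not cbrrr's actual implementation, which lives in a native module). The helpers `nest` and `max_depth` are hypothetical names invented for this example; they walk a structure with an explicit stack, so the nesting depth is bounded by heap memory rather than by the interpreter's recursion limit.

```python
import sys


def nest(depth: int) -> list:
    """Build a list nested `depth` levels deep, using a loop rather than recursion."""
    root: list = []
    for _ in range(depth - 1):
        root = [root]
    return root


def max_depth(obj) -> int:
    """Measure nesting depth with an explicit stack instead of recursive calls."""
    deepest = 0
    stack = [(obj, 1)]
    while stack:
        node, depth = stack.pop()
        if isinstance(node, list):
            deepest = max(deepest, depth)
            stack.extend((child, depth + 1) for child in node)
        elif isinstance(node, dict):
            deepest = max(deepest, depth)
            stack.extend((child, depth + 1) for child in node.values())
    return deepest


# Far deeper than the default recursion limit (typically 1000).
deep = nest(100_000)
print(max_depth(deep))  # 100000
print(max_depth(deep) > sys.getrecursionlimit())
```

A naive recursive walker would raise RecursionError on the same input; the explicit stack trades call-stack frames for heap-allocated tuples.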
Finally, cbrrr aims to be maximally strict regarding DAG-CBOR canonicalization rules. See below for further details.
From pypi:
python3 -m pip install cbrrr
From git:
git clone https://github.com/DavidBuchanan314/dag-cbrrr
cd dag-cbrrr
python3 -m pip install -v .
Here are the basics:
import cbrrr
encoded = cbrrr.encode_dag_cbor({"hello": [b"world", 1, 2, 3]})
print(encoded) # b'\xa1ehello\x84Eworld\x01\x02\x03'
decoded = cbrrr.decode_dag_cbor(encoded)
print(decoded) # {'hello': [b'world', 1, 2, 3]}
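To make sense of the encoded bytes printed above, it can help to split each CBOR initial byte into its major type (top 3 bits) and additional info (low 5 bits). The following is a small stdlib-only sketch; `describe_head` is a hypothetical helper written for this example and is not part of cbrrr.

```python
MAJOR_TYPES = {
    0: "unsigned int",
    1: "negative int",
    2: "byte string",
    3: "text string",
    4: "array",
    5: "map",
    6: "tag",
    7: "simple/float",
}


def describe_head(byte: int) -> tuple:
    """Split a CBOR initial byte into (major type name, additional info)."""
    return MAJOR_TYPES[byte >> 5], byte & 0x1F


encoded = b"\xa1ehello\x84Eworld\x01\x02\x03"

# Offsets of the initial bytes in this particular message;
# the bytes in between are the payloads of "hello" and b"world".
for offset in (0, 1, 7, 8, 14, 15, 16):
    kind, info = describe_head(encoded[offset])
    print(f"offset {offset:2}: 0x{encoded[offset]:02x} -> {kind}, info={info}")
```

For these short values the additional info is the length (or the integer itself): a map with 1 entry, a 5-byte text key, an array of 4 items, a 5-byte byte string, then the integers 1, 2 and 3.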
For more detailed API information, take a look at the commented Python source, which provides an ergonomic wrapper for the native module (more docs coming soon™).
TL;DR:
class CID:
    def __init__(self, cid_bytes: bytes) -> None:
        ...
    @classmethod
    def decode(cls, data: Union[bytes, str]) -> "CID":
        ...
    def encode(self, base="base32") -> str:
        ...
    ...
DagCborTypes = Union[str, bytes, int, bool, float, CID, list, dict, None]
def decode_dag_cbor(
    data: bytes,
    atjson_mode: bool=False,
    cid_ctor: Callable[[bytes], Any]=CID
) -> DagCborTypes:
    ...
def decode_multi_dag_cbor_in_violation_of_the_spec(
    data: bytes,
    atjson_mode: bool=False,
    cid_ctor: Callable[[bytes], Any]=CID
) -> Iterator[DagCborTypes]:
    ...
def encode_dag_cbor(
    obj: DagCborTypes,
    atjson_mode: bool=False,
    cid_type: Type=CID
) -> bytes:
    ...
"atjson_mode" refers to the representation used in atproto HTTP APIs, documented here. It is not a round-trip-safe representation.
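As a rough illustration of why a JSON-style representation may not round-trip, consider how the atproto data model writes bytes in JSON: as an object with a single "$bytes" key holding a base64 string. The sketch below is stdlib-only and does not call cbrrr; `bytes_to_atjson` is a hypothetical helper, and stripping the base64 padding is an assumption made for this example. How cbrrr itself treats such look-alike maps is not covered here.

```python
import base64


def bytes_to_atjson(value: bytes) -> dict:
    """Hypothetical helper: wrap bytes as a {"$bytes": ...} object (padding stripped by assumption)."""
    b64 = base64.b64encode(value).decode("ascii").rstrip("=")
    return {"$bytes": b64}


# Actual bytes, converted to the JSON-friendly form.
from_real_bytes = bytes_to_atjson(b"world")

# An ordinary map that merely happens to use "$bytes" as a key.
lookalike_map = {"$bytes": "d29ybGQ"}

# Both look identical once in JSON form, so the original type cannot be recovered.
print(from_real_bytes == lookalike_map)  # True
```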
cbrrr aims to conform to all the strictness rules set out in the DAG-CBOR specification.
It decodes strictly, and there is no non-strict mode available. This means, among other things:
In its default configuration, valid DAG-CBOR should round-trip perfectly, i.e. encode_dag_cbor(decode_dag_cbor(data)) == data. (This is not necessarily true if you specify atjson_mode=True, or pass a custom CID type (see below) that misbehaves in some way).
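One example of a canonicalization rule from the DAG-CBOR specification is map key ordering: keys are sorted by the length of their encoded bytes first, then bytewise. The sketch below shows that ordering in plain Python; `canonical_key_order` is a hypothetical helper for illustration and is not part of cbrrr's API.

```python
def canonical_key_order(keys):
    """Sort string map keys by encoded length first, then bytewise."""
    return sorted(keys, key=lambda k: (len(k.encode("utf-8")), k.encode("utf-8")))


keys = ["bb", "c", "aaa", "a"]

print(canonical_key_order(keys))  # ['a', 'c', 'bb', 'aaa']
print(sorted(keys))               # ['a', 'aaa', 'bb', 'c'] (plain lexicographic order differs)
```

Because there is only one valid ordering, a strict decoder can reject input whose keys arrive in any other order, which is part of what makes a byte-for-byte round trip possible.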
multiformats.CID
cbrrr brings its own performance-oriented CID class, but it's relatively bare-bones (supporting only base32, for now). If you want more features and broader compatibility, you can use the CID class from hashberg-io/multiformats like so:
import cbrrr
import multiformats
encoded = cbrrr.encode_dag_cbor(
    multiformats.CID.decode("bafkreibm6jg3ux5qumhcn2b3flc3tyu6dmlb4xa7u5bf44yegnrjhc4yeq"),
    cid_type=multiformats.CID
)
decoded = cbrrr.decode_dag_cbor(encoded, cid_ctor=multiformats.CID.decode)
print(decoded) # zb2rhZfjRh2FHHB2RkHVEvL2vJnCTcu7kwRqgVsf9gpkLgteo
# clone the repo
python3 -m pip install -ve .
python3 -m unittest -v