xtquant.xtbson.bson36.raw_bson

Tools for representing raw BSON documents.

Inserting and Retrieving RawBSONDocuments

Example: Moving a document between different databases/collections

>>> import bson
>>> from pymongo import MongoClient
>>> from xtquant.xtbson.bson36.raw_bson import RawBSONDocument
>>> client = MongoClient(document_class=RawBSONDocument)
>>> client.drop_database('db')
>>> client.drop_database('replica_db')
>>> db = client.db
>>> result = db.test.insert_many([{'_id': 1, 'a': 1},
...                               {'_id': 2, 'b': 1},
...                               {'_id': 3, 'c': 1},
...                               {'_id': 4, 'd': 1}])
>>> replica_db = client.replica_db
>>> for doc in db.test.find():
...    print(f"raw document: {doc.raw}")
...    print(f"decoded document: {bson.decode(doc.raw)}")
...    result = replica_db.test.insert_one(doc)
raw document: b'...'
decoded document: {'_id': 1, 'a': 1}
raw document: b'...'
decoded document: {'_id': 2, 'b': 1}
raw document: b'...'
decoded document: {'_id': 3, 'c': 1}
raw document: b'...'
decoded document: {'_id': 4, 'd': 1}

For use cases like moving documents across different databases or writing binary blobs to disk, raw BSON documents are faster because they avoid the overhead of decoding and re-encoding BSON.
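Why raw pass-through is cheap can be sketched without any MongoDB library. Every BSON document begins with a little-endian int32 holding its total size, so a blob of concatenated documents can be split and copied without decoding a single field. The sketch below uses only the standard library; `encode_int32_doc` and `split_raw_documents` are hypothetical helpers written for illustration and are not part of this module.

```python
import struct


def encode_int32_doc(name, value):
    """Hand-encode a BSON document with one int32 field (illustration only)."""
    # Element layout: type byte 0x10 (int32), field name as a C string, 4-byte value.
    element = b"\x10" + name.encode("utf-8") + b"\x00" + struct.pack("<i", value)
    # The total size counts the 4-byte length prefix and the trailing NUL byte.
    size = 4 + len(element) + 1
    return struct.pack("<i", size) + element + b"\x00"


def split_raw_documents(stream):
    """Yield each document's bytes from concatenated BSON without decoding fields."""
    position = 0
    while position < len(stream):
        (size,) = struct.unpack_from("<i", stream, position)
        if size < 5 or position + size > len(stream):
            raise ValueError("bad document length at offset %d" % position)
        yield stream[position:position + size]
        position += size


docs = [encode_int32_doc("a", 1), encode_int32_doc("b", 2)]
blob = b"".join(docs)  # stands in for the contents of a dump file
copied = list(split_raw_documents(blob))
```

Only the length prefix of each document is read; the field bytes are sliced and passed along untouched, which is the work a raw document class saves compared with a full decode and re-encode.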

# Copyright 2015-present MongoDB, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Tools for representing raw BSON documents.

Inserting and Retrieving RawBSONDocuments
=========================================

Example: Moving a document between different databases/collections

.. doctest::

  >>> import bson
  >>> from pymongo import MongoClient
  >>> from .raw_bson import RawBSONDocument
  >>> client = MongoClient(document_class=RawBSONDocument)
  >>> client.drop_database('db')
  >>> client.drop_database('replica_db')
  >>> db = client.db
  >>> result = db.test.insert_many([{'_id': 1, 'a': 1},
  ...                               {'_id': 2, 'b': 1},
  ...                               {'_id': 3, 'c': 1},
  ...                               {'_id': 4, 'd': 1}])
  >>> replica_db = client.replica_db
  >>> for doc in db.test.find():
  ...    print(f"raw document: {doc.raw}")
  ...    print(f"decoded document: {bson.decode(doc.raw)}")
  ...    result = replica_db.test.insert_one(doc)
  raw document: b'...'
  decoded document: {'_id': 1, 'a': 1}
  raw document: b'...'
  decoded document: {'_id': 2, 'b': 1}
  raw document: b'...'
  decoded document: {'_id': 3, 'c': 1}
  raw document: b'...'
  decoded document: {'_id': 4, 'd': 1}

For use cases like moving documents across different databases or writing binary
blobs to disk, using raw BSON documents provides better speed and avoids the
overhead of decoding or encoding BSON.
"""

from collections.abc import Mapping as _Mapping

from . import _get_object_size, _raw_to_dict
from .codec_options import _RAW_BSON_DOCUMENT_MARKER
from .codec_options import DEFAULT_CODEC_OPTIONS as DEFAULT
from .son import SON


class RawBSONDocument(_Mapping):
    """Representation for a MongoDB document that provides access to the raw
    BSON bytes that compose it.

    Only when a field is accessed or modified within the document does
    RawBSONDocument decode its bytes.
    """

    __slots__ = ("__raw", "__inflated_doc", "__codec_options")
    _type_marker = _RAW_BSON_DOCUMENT_MARKER

    def __init__(self, bson_bytes, codec_options=None):
        """Create a new :class:`RawBSONDocument`

        :class:`RawBSONDocument` is a representation of a BSON document that
        provides access to the underlying raw BSON bytes. Only when a field is
        accessed or modified within the document does RawBSONDocument decode
        its bytes.

        :class:`RawBSONDocument` implements the ``Mapping`` abstract base
        class from the standard library so it can be used like a read-only
        ``dict``::

            >>> from . import encode
            >>> raw_doc = RawBSONDocument(encode({'_id': 'my_doc'}))
            >>> raw_doc.raw
            b'...'
            >>> raw_doc['_id']
            'my_doc'

        :Parameters:
          - `bson_bytes`: the BSON bytes that compose this document
          - `codec_options` (optional): An instance of
            :class:`~bson.codec_options.CodecOptions` whose ``document_class``
            must be :class:`RawBSONDocument`. The default is
            :attr:`DEFAULT_RAW_BSON_OPTIONS`.

        .. versionchanged:: 3.8
          :class:`RawBSONDocument` now validates that the ``bson_bytes``
          passed in represent a single bson document.

        .. versionchanged:: 3.5
          If a :class:`~bson.codec_options.CodecOptions` is passed in, its
          `document_class` must be :class:`RawBSONDocument`.
        """
        self.__raw = bson_bytes
        self.__inflated_doc = None
        # Can't default codec_options to DEFAULT_RAW_BSON_OPTIONS in signature,
        # it refers to this class RawBSONDocument.
        if codec_options is None:
            codec_options = DEFAULT_RAW_BSON_OPTIONS
        elif codec_options.document_class is not RawBSONDocument:
            raise TypeError(
                "RawBSONDocument cannot use CodecOptions with document "
                "class %s" % (codec_options.document_class,)
            )
        self.__codec_options = codec_options
        # Validate the bson object size.
        _get_object_size(bson_bytes, 0, len(bson_bytes))

    @property
    def raw(self):
        """The raw BSON bytes composing this document."""
        return self.__raw

    def items(self):
        """Lazily decode and iterate elements in this document."""
        return self.__inflated.items()

    @property
    def __inflated(self):
        if self.__inflated_doc is None:
            # We already validated the object's size when this document was
            # created, so no need to do that again.
            # Use SON to preserve ordering of elements.
            self.__inflated_doc = _inflate_bson(self.__raw, self.__codec_options)
        return self.__inflated_doc

    def __getitem__(self, item):
        return self.__inflated[item]

    def __iter__(self):
        return iter(self.__inflated)

    def __len__(self):
        return len(self.__inflated)

    def __eq__(self, other):
        if isinstance(other, RawBSONDocument):
            return self.__raw == other.raw
        return NotImplemented

    def __repr__(self):
        return "RawBSONDocument(%r, codec_options=%r)" % (self.raw, self.__codec_options)


def _inflate_bson(bson_bytes, codec_options):
    """Inflates the top level fields of a BSON document.

    :Parameters:
      - `bson_bytes`: the BSON bytes that compose this document
      - `codec_options`: An instance of
        :class:`~bson.codec_options.CodecOptions` whose ``document_class``
        must be :class:`RawBSONDocument`.
    """
    # Use SON to preserve ordering of elements.
    return _raw_to_dict(bson_bytes, 4, len(bson_bytes) - 1, codec_options, SON())


DEFAULT_RAW_BSON_OPTIONS = DEFAULT.with_options(document_class=RawBSONDocument)
"""The default :class:`~bson.codec_options.CodecOptions` for
:class:`RawBSONDocument`.
"""
class RawBSONDocument(collections.abc.Mapping):

Representation for a MongoDB document that provides access to the raw BSON bytes that compose it.

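RawBSONDocument decodes its bytes lazily: nothing is decoded until a field is first accessed, and the decoded result is then cached for later lookups. The class is a read-only mapping, so fields cannot be modified in place.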
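The decode-on-first-access pattern can be sketched with the standard library alone. In the sketch below, `LazyDocument` is a hypothetical stand-in, JSON replaces BSON so that no third-party package is needed, and `decode_calls` exists only to make the caching visible; none of these names belong to this module.

```python
import json
from collections.abc import Mapping


class LazyDocument(Mapping):
    """Hypothetical stand-in: keeps raw bytes and decodes them on first access."""

    def __init__(self, raw_bytes):
        self._raw = raw_bytes
        self._decoded = None   # filled in on first field access
        self.decode_calls = 0  # instrumentation for this demo only

    @property
    def raw(self):
        return self._raw

    def _inflated(self):
        if self._decoded is None:
            self.decode_calls += 1
            # JSON stands in for BSON here; the caching logic is the point.
            self._decoded = json.loads(self._raw.decode("utf-8"))
        return self._decoded

    def __getitem__(self, key):
        return self._inflated()[key]

    def __iter__(self):
        return iter(self._inflated())

    def __len__(self):
        return len(self._inflated())


doc = LazyDocument(b'{"_id": 1, "a": 1}')
_ = doc.raw                      # reading the raw bytes never decodes
calls_before = doc.decode_calls
first = doc["a"]                 # the first field access triggers the decode
second = doc["_id"]              # later lookups reuse the cached result
```

A document that is only forwarded through `raw` never pays for a decode, while one that is inspected pays for it once.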

RawBSONDocument(bson_bytes, codec_options=None)

Create a new RawBSONDocument.

RawBSONDocument is a representation of a BSON document that provides access to the underlying raw BSON bytes. The bytes are decoded only when a field within the document is first accessed.

RawBSONDocument implements the Mapping abstract base class from the standard library, so it can be used like a read-only dict:

>>> from xtquant.xtbson.bson36 import encode
>>> raw_doc = RawBSONDocument(encode({'_id': 'my_doc'}))
>>> raw_doc.raw
b'...'
>>> raw_doc['_id']
'my_doc'

Parameters:

  • bson_bytes: the BSON bytes that compose this document
  • codec_options (optional): An instance of bson.codec_options.CodecOptions whose document_class must be RawBSONDocument; any other document_class raises TypeError. The default is DEFAULT_RAW_BSON_OPTIONS.

Changed in version 3.8: RawBSONDocument now validates that the bson_bytes passed in represent a single BSON document.

Changed in version 3.5: If a bson.codec_options.CodecOptions is passed in, its document_class must be RawBSONDocument.
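The single-document validation can be approximated as follows. This is a sketch under assumptions: the constructor delegates to `_get_object_size`, whose body is not shown on this page, so `check_single_document` is a hypothetical helper that applies the checks implied by the BSON layout (a length prefix that matches the byte count, and a trailing NUL). The real helper may differ in detail and raises its own exception type.

```python
import struct


def check_single_document(bson_bytes):
    """Raise ValueError unless bson_bytes looks like exactly one BSON document.

    Assumption: this only approximates the check done by the library.
    """
    if len(bson_bytes) < 5:
        raise ValueError("a BSON document is at least 5 bytes long")
    (declared_size,) = struct.unpack_from("<i", bson_bytes, 0)
    if declared_size != len(bson_bytes):
        raise ValueError("declared size does not match the number of bytes")
    if bson_bytes[-1:] != b"\x00":
        raise ValueError("document is not NUL-terminated")
    return declared_size


empty_doc = b"\x05\x00\x00\x00\x00"  # the BSON encoding of {}
size = check_single_document(empty_doc)

try:
    check_single_document(empty_doc + empty_doc)  # two documents back to back
    rejected = False
except ValueError:
    rejected = True
```

Note that a check of this kind inspects only the envelope; the fields themselves are still decoded lazily.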

raw

The raw BSON bytes composing this document.
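According to the source listing, `__eq__` compares the `raw` bytes of two RawBSONDocument instances and returns NotImplemented for any other type. One consequence is that equality is sensitive to field order, unlike equality between plain dicts. The sketch below illustrates this with hand-encoded bytes; `encode_int32_fields` is a hypothetical helper that handles only int32 values.

```python
import struct


def encode_int32_fields(pairs):
    """Hand-encode a BSON document of int32 fields, in the given order."""
    body = b"".join(
        b"\x10" + name.encode("utf-8") + b"\x00" + struct.pack("<i", value)
        for name, value in pairs
    )
    # Size = 4-byte length prefix + elements + trailing NUL byte.
    return struct.pack("<i", 4 + len(body) + 1) + body + b"\x00"


ab = encode_int32_fields([("a", 1), ("b", 2)])
ba = encode_int32_fields([("b", 2), ("a", 1)])

# Plain dicts ignore insertion order when compared...
same_fields = dict([("a", 1), ("b", 2)]) == dict([("b", 2), ("a", 1)])
# ...but the encoded bytes differ, so a byte-wise comparison does not.
same_bytes = ab == ba
```

When order-insensitive comparison is needed, comparing decoded mappings rather than raw documents is the safer choice.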

def items(self):

Lazily decode and iterate elements in this document.

Inherited Members
collections.abc.Mapping
get
keys
values
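
These inherited members come from the `collections.abc.Mapping` mixin: a class that defines `__getitem__`, `__iter__`, and `__len__` receives `get`, `keys`, and `values` automatically. A minimal sketch, where `ReadOnlyView` is a hypothetical class unrelated to this module:

```python
from collections.abc import Mapping


class ReadOnlyView(Mapping):
    """Hypothetical minimal Mapping: only the three abstract methods are defined."""

    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


view = ReadOnlyView({"_id": 1, "a": 1})
keys = list(view.keys())           # mixin built on __iter__
values = list(view.values())       # mixin built on __getitem__
missing = view.get("b", "absent")  # mixin: returns the default on KeyError

try:
    view["a"] = 2                  # Mapping defines no __setitem__
    writable = True
except TypeError:
    writable = False
```

The same mechanism is why a Mapping subclass behaves like a read-only dict: item assignment fails because no `__setitem__` exists.
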
DEFAULT_RAW_BSON_OPTIONS = CodecOptions(document_class=<class 'xtquant.xtbson.bson36.raw_bson.RawBSONDocument'>, tz_aware=False, uuid_representation=UuidRepresentation.UNSPECIFIED, unicode_decode_error_handler='strict', tzinfo=None, type_registry=TypeRegistry(type_codecs=[], fallback_encoder=None))

The default bson.codec_options.CodecOptions for RawBSONDocument.
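The source comments note that this default cannot appear in the constructor signature because it refers to the class being defined. The usual workaround is a `None` sentinel resolved at call time, with the default object created after the class. The sketch below shows that pattern with hypothetical stand-ins: `Options` is a simplified namedtuple in place of CodecOptions, and `_replace` plays the role of `with_options`.

```python
from collections import namedtuple

# Hypothetical, simplified stand-in for CodecOptions.
Options = namedtuple("Options", ["document_class", "tz_aware"])
BASE_OPTIONS = Options(document_class=dict, tz_aware=False)


class RawDoc:
    def __init__(self, raw, options=None):
        # DEFAULT_RAW_OPTIONS cannot be the signature default: it is built
        # from RawDoc, which does not exist yet while this def is evaluated.
        if options is None:
            options = DEFAULT_RAW_OPTIONS
        elif options.document_class is not RawDoc:
            raise TypeError(
                "RawDoc cannot use options with document class %s"
                % (options.document_class,)
            )
        self.raw = raw
        self.options = options


# Created after the class, analogous to DEFAULT.with_options(document_class=...).
DEFAULT_RAW_OPTIONS = BASE_OPTIONS._replace(document_class=RawDoc)

doc = RawDoc(b"\x05\x00\x00\x00\x00")

try:
    RawDoc(b"\x05\x00\x00\x00\x00", options=BASE_OPTIONS)
    rejected = False
except TypeError:
    rejected = True
```

The module-level name is looked up only when the constructor runs, by which time the default already exists.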