xtquant.xtbson.bson36.json_util

Tools for using Python's json module with BSON documents.

This module provides two helper functions, dumps and loads, that wrap the native json functions and provide explicit BSON conversion to and from JSON. JSONOptions controls how JSON is emitted and parsed; the default is the Relaxed Extended JSON format. json_util can also generate Canonical or legacy Extended JSON when CANONICAL_JSON_OPTIONS or LEGACY_JSON_OPTIONS is provided, respectively.

Example usage (deserialization):

.. doctest::

>>> from xtquant.xtbson.bson36.json_util import loads
>>> loads('[{"foo": [1, 2]}, {"bar": {"hello": "world"}}, {"code": {"$scope": {}, "$code": "function x() { return 1; }"}}, {"bin": {"$type": "80", "$binary": "AQIDBA=="}}]')
[{'foo': [1, 2]}, {'bar': {'hello': 'world'}}, {'code': Code('function x() { return 1; }', {})}, {'bin': Binary(b'...', 128)}]
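
As a rough picture of what loads does, the sketch below passes an object_pairs_hook to json.loads and unwraps two Extended JSON type wrappers. It uses only the standard library. The name decode_wrappers and the tuple it returns for binary data are illustrative assumptions, not this module's actual object_hook, which returns Binary, Code and other BSON types.

```python
import base64
import json


def decode_wrappers(pairs):
    """Hypothetical hook: unwrap a few Extended JSON type wrappers."""
    doc = dict(pairs)
    if "$numberInt" in doc:
        # Canonical int32 wrapper: {"$numberInt": "<digits>"}
        if len(doc) != 1 or not isinstance(doc["$numberInt"], str):
            raise TypeError("Bad $numberInt: %r" % (doc,))
        return int(doc["$numberInt"])
    if "$binary" in doc and "$type" in doc:
        # Legacy binary wrapper: {"$binary": "<base64>", "$type": "<hex subtype>"}
        # Returned as a plain (bytes, subtype) tuple for illustration only.
        return base64.b64decode(doc["$binary"]), int(doc["$type"], 16)
    return doc


text = '{"n": {"$numberInt": "7"}, "bin": {"$type": "80", "$binary": "AQIDBA=="}}'
# json calls the hook for every JSON object, innermost first.
decoded = json.loads(text, object_pairs_hook=decode_wrappers)
print(decoded)  # {'n': 7, 'bin': (b'\x01\x02\x03\x04', 128)}
```

Because the hook runs on inner objects first, a wrapper nested inside a larger document is already converted by the time the enclosing object is processed.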

Example usage with RELAXED_JSON_OPTIONS (the default):

.. doctest::

>>> from xtquant.xtbson.bson36 import Binary, Code
>>> from xtquant.xtbson.bson36.json_util import dumps
>>> dumps([{'foo': [1, 2]},
...        {'bar': {'hello': 'world'}},
...        {'code': Code("function x() { return 1; }")},
...        {'bin': Binary(b"\x01\x02\x03\x04")}])
'[{"foo": [1, 2]}, {"bar": {"hello": "world"}}, {"code": {"$code": "function x() { return 1; }"}}, {"bin": {"$binary": {"base64": "AQIDBA==", "subType": "00"}}}]'
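
The Relaxed example contains no datetimes. Under the ISO8601 datetime representation, which Relaxed mode selects, datetimes at or after the Unix epoch are written as an ISO-8601 string and earlier ones fall back to a $numberLong count of milliseconds. The standard-library sketch below illustrates that rule. The helper encode_datetime, the exact string layout (millisecond precision, trailing "Z") and the treatment of naive datetimes as UTC are assumptions made for illustration, not a copy of this module's encoder.

```python
import datetime

UTC = datetime.timezone.utc
EPOCH = datetime.datetime(1970, 1, 1, tzinfo=UTC)


def encode_datetime(dt):
    """Hypothetical encoder following the ISO8601 representation rule."""
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=UTC)  # assumption: naive datetimes are UTC
    dt = dt.astimezone(UTC)
    if dt >= EPOCH:
        millis = dt.microsecond // 1000
        frac = ".%03d" % millis if millis else ""
        return {"$date": dt.strftime("%Y-%m-%dT%H:%M:%S") + frac + "Z"}
    # Before the epoch: milliseconds since the epoch as a $numberLong string.
    total_ms = (dt - EPOCH) // datetime.timedelta(milliseconds=1)
    return {"$date": {"$numberLong": str(total_ms)}}


after_epoch = encode_datetime(datetime.datetime(2020, 1, 2, 3, 4, 5, 678000))
before_epoch = encode_datetime(datetime.datetime(1969, 12, 31, 23, 59, 59))
print(after_epoch)   # {'$date': '2020-01-02T03:04:05.678Z'}
print(before_epoch)  # {'$date': {'$numberLong': '-1000'}}
```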

Example usage (with CANONICAL_JSON_OPTIONS):

.. doctest::

>>> from xtquant.xtbson.bson36 import Binary, Code
>>> from xtquant.xtbson.bson36.json_util import dumps, CANONICAL_JSON_OPTIONS
>>> dumps([{'foo': [1, 2]},
...        {'bar': {'hello': 'world'}},
...        {'code': Code("function x() { return 1; }")},
...        {'bin': Binary(b"\x01\x02\x03\x04")}],
...       json_options=CANONICAL_JSON_OPTIONS)
'[{"foo": [{"$numberInt": "1"}, {"$numberInt": "2"}]}, {"bar": {"hello": "world"}}, {"code": {"$code": "function x() { return 1; }"}}, {"bin": {"$binary": {"base64": "AQIDBA==", "subType": "00"}}}]'
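
The difference between the Relaxed and Canonical outputs for integers can be sketched with the standard library alone. The helper wrap_number is hypothetical, and the 32-bit range split between $numberInt and $numberLong is an assumption taken from the Extended JSON specification rather than from code shown on this page.

```python
import json

INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1


def wrap_number(value, canonical):
    """Hypothetical helper: add Canonical type wrappers to ints."""
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        return value
    if not canonical:
        return value  # Relaxed mode keeps the native JSON number
    if INT32_MIN <= value <= INT32_MAX:
        return {"$numberInt": str(value)}
    return {"$numberLong": str(value)}


values = [1, 2, 2**40]
relaxed = json.dumps([wrap_number(v, canonical=False) for v in values])
canonical = json.dumps([wrap_number(v, canonical=True) for v in values])
print(relaxed)    # [1, 2, 1099511627776]
print(canonical)  # [{"$numberInt": "1"}, {"$numberInt": "2"}, {"$numberLong": "1099511627776"}]
```

The wrapped form is more verbose, but it lets a reader of the JSON recover the original numeric type, which is the point of Canonical mode.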

Example usage (with LEGACY_JSON_OPTIONS):

.. doctest::

>>> from xtquant.xtbson.bson36 import Binary, Code
>>> from xtquant.xtbson.bson36.json_util import dumps, LEGACY_JSON_OPTIONS
>>> dumps([{'foo': [1, 2]},
...        {'bar': {'hello': 'world'}},
...        {'code': Code("function x() { return 1; }", {})},
...        {'bin': Binary(b"\x01\x02\x03\x04")}],
...       json_options=LEGACY_JSON_OPTIONS)
'[{"foo": [1, 2]}, {"bar": {"hello": "world"}}, {"code": {"$code": "function x() { return 1; }", "$scope": {}}}, {"bin": {"$binary": "AQIDBA==", "$type": "00"}}]'
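
Going the other way, the legacy {"$binary": ..., "$type": ...} form can be decoded roughly as follows. This standard-library sketch follows the subtype handling of this module's _parse_legacy_binary, including the sign-extended subtypes that mongoexport may write, but it returns a plain (data, subtype) tuple instead of a Binary and does not modify its argument. The name parse_legacy_binary is illustrative.

```python
import base64


def parse_legacy_binary(doc):
    """Illustrative decoder for the legacy $binary/$type wrapper."""
    subtype_field = doc["$type"]
    if isinstance(subtype_field, int):
        # Some producers emit the subtype as a number instead of hex text.
        subtype_field = "%02x" % subtype_field
    subtype = int(subtype_field, 16)
    if subtype >= 0xFFFFFF80:
        # mongoexport values such as "ffffff80": keep only the last byte.
        subtype = int(subtype_field[6:], 16)
    data = base64.b64decode(doc["$binary"].encode())
    return data, subtype


plain = parse_legacy_binary({"$binary": "AQIDBA==", "$type": "00"})
exported = parse_legacy_binary({"$binary": "AQIDBA==", "$type": "ffffff80"})
print(plain)     # (b'\x01\x02\x03\x04', 0)
print(exported)  # (b'\x01\x02\x03\x04', 128)
```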

Alternatively, you can manually pass this module's default function as the default argument of json.dumps(). It won't handle Binary and Code instances (they extend built-in string types, so a custom default cannot be provided for them), but it will be faster as there is less recursion.
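
The limitation comes from how json.dumps uses its default argument: the hook is consulted only for objects json cannot serialize natively. The standard-library sketch below shows this with two hypothetical stand-in classes, Tag (a str subclass, playing the role of a type like Code) and Point (a type json knows nothing about). The hook here is a toy, not json_util's default function.

```python
import json


class Tag(str):
    """Hypothetical str subclass, standing in for a type like Code."""


class Point:
    """Hypothetical type that json cannot serialize on its own."""

    def __init__(self, x, y):
        self.x, self.y = x, y


seen = []  # records which types actually reach the hook


def default(obj):
    # json calls this only for objects it cannot encode natively.
    seen.append(type(obj).__name__)
    if isinstance(obj, Point):
        return {"$point": [obj.x, obj.y]}
    raise TypeError("%r is not JSON serializable" % (obj,))


out = json.dumps({"tag": Tag("x"), "pt": Point(1, 2)}, default=default)
print(out)   # {"tag": "x", "pt": {"$point": [1, 2]}}
print(seen)  # ['Point']
```

Tag is written as an ordinary JSON string and never reaches the hook, so any type information it carried is lost. Converting the whole document up front, as dumps does, avoids this at the cost of an extra recursive pass.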

If your application does not need the flexibility offered by JSONOptions and spends a large amount of time in the json_util module, consider python-bsonjs for better performance. python-bsonjs is a fast BSON to MongoDB Extended JSON converter for Python built on top of libbson. python-bsonjs works best with PyMongo when using RawBSONDocument.

  1# Copyright 2009-present MongoDB, Inc.
  2#
  3# Licensed under the Apache License, Version 2.0 (the "License");
  4# you may not use this file except in compliance with the License.
  5# You may obtain a copy of the License at
  6#
  7# http://www.apache.org/licenses/LICENSE-2.0
  8#
  9# Unless required by applicable law or agreed to in writing, software
 10# distributed under the License is distributed on an "AS IS" BASIS,
 11# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 12# See the License for the specific language governing permissions and
 13# limitations under the License.
 14
 15"""Tools for using Python's :mod:`json` module with BSON documents.
 16
 17This module provides two helper methods `dumps` and `loads` that wrap the
 18native :mod:`json` methods and provide explicit BSON conversion to and from
 19JSON. :class:`~bson.json_util.JSONOptions` provides a way to control how JSON
 20is emitted and parsed, with the default being the Relaxed Extended JSON format.
 21:mod:`~bson.json_util` can also generate Canonical or legacy `Extended JSON`_
 22when :const:`CANONICAL_JSON_OPTIONS` or :const:`LEGACY_JSON_OPTIONS` is
 23provided, respectively.
 24
 25.. _Extended JSON: https://github.com/mongodb/specifications/blob/master/source/extended-json.rst
 26
 27Example usage (deserialization):
 28
 29.. doctest::
 30
 31   >>> from .json_util import loads
 32   >>> loads('[{"foo": [1, 2]}, {"bar": {"hello": "world"}}, {"code": {"$scope": {}, "$code": "function x() { return 1; }"}}, {"bin": {"$type": "80", "$binary": "AQIDBA=="}}]')
 33   [{'foo': [1, 2]}, {'bar': {'hello': 'world'}}, {'code': Code('function x() { return 1; }', {})}, {'bin': Binary(b'...', 128)}]
 34
 35Example usage with :const:`RELAXED_JSON_OPTIONS` (the default):
 36
 37.. doctest::
 38
 39   >>> from . import Binary, Code
 40   >>> from .json_util import dumps
 41   >>> dumps([{'foo': [1, 2]},
 42   ...        {'bar': {'hello': 'world'}},
 43   ...        {'code': Code("function x() { return 1; }")},
 44   ...        {'bin': Binary(b"\x01\x02\x03\x04")}])
 45   '[{"foo": [1, 2]}, {"bar": {"hello": "world"}}, {"code": {"$code": "function x() { return 1; }"}}, {"bin": {"$binary": {"base64": "AQIDBA==", "subType": "00"}}}]'
 46
 47Example usage (with :const:`CANONICAL_JSON_OPTIONS`):
 48
 49.. doctest::
 50
 51   >>> from . import Binary, Code
 52   >>> from .json_util import dumps, CANONICAL_JSON_OPTIONS
 53   >>> dumps([{'foo': [1, 2]},
 54   ...        {'bar': {'hello': 'world'}},
 55   ...        {'code': Code("function x() { return 1; }")},
 56   ...        {'bin': Binary(b"\x01\x02\x03\x04")}],
 57   ...       json_options=CANONICAL_JSON_OPTIONS)
 58   '[{"foo": [{"$numberInt": "1"}, {"$numberInt": "2"}]}, {"bar": {"hello": "world"}}, {"code": {"$code": "function x() { return 1; }"}}, {"bin": {"$binary": {"base64": "AQIDBA==", "subType": "00"}}}]'
 59
 60Example usage (with :const:`LEGACY_JSON_OPTIONS`):
 61
 62.. doctest::
 63
 64   >>> from . import Binary, Code
 65   >>> from .json_util import dumps, LEGACY_JSON_OPTIONS
 66   >>> dumps([{'foo': [1, 2]},
 67   ...        {'bar': {'hello': 'world'}},
 68   ...        {'code': Code("function x() { return 1; }", {})},
 69   ...        {'bin': Binary(b"\x01\x02\x03\x04")}],
 70   ...       json_options=LEGACY_JSON_OPTIONS)
 71   '[{"foo": [1, 2]}, {"bar": {"hello": "world"}}, {"code": {"$code": "function x() { return 1; }", "$scope": {}}}, {"bin": {"$binary": "AQIDBA==", "$type": "00"}}]'
 72
 73Alternatively, you can manually pass the `default` to :func:`json.dumps`.
 74It won't handle :class:`~bson.binary.Binary` and :class:`~bson.code.Code`
 75instances (as they are extended strings you can't provide custom defaults),
 76but it will be faster as there is less recursion.
 77
 78.. note::
 79   If your application does not need the flexibility offered by
 80   :class:`JSONOptions` and spends a large amount of time in the `json_util`
 81   module, look to
 82   `python-bsonjs <https://pypi.python.org/pypi/python-bsonjs>`_ for a nice
 83   performance improvement. `python-bsonjs` is a fast BSON to MongoDB
 84   Extended JSON converter for Python built on top of
 85   `libbson <https://github.com/mongodb/libbson>`_. `python-bsonjs` works best
 86   with PyMongo when using :class:`~bson.raw_bson.RawBSONDocument`.
 87"""
 88
 89import base64
 90import datetime
 91import json
 92import math
 93import re
 94import uuid
 95
 96import bson
 97from . import EPOCH_AWARE, RE_TYPE, SON
 98from .binary import ALL_UUID_SUBTYPES, UUID_SUBTYPE, Binary, UuidRepresentation
 99from .code import Code
100from .codec_options import CodecOptions
101from .dbref import DBRef
102from .decimal128 import Decimal128
103from .int64 import Int64
104from .max_key import MaxKey
105from .min_key import MinKey
106from .objectid import ObjectId
107from .regex import Regex
108from .timestamp import Timestamp
109from .tz_util import utc
110
111_RE_OPT_TABLE = {
112    "i": re.I,
113    "l": re.L,
114    "m": re.M,
115    "s": re.S,
116    "u": re.U,
117    "x": re.X,
118}
119
120
121class DatetimeRepresentation:
122    LEGACY = 0
123    """Legacy MongoDB Extended JSON datetime representation.
124
125    :class:`datetime.datetime` instances will be encoded to JSON in the
126    format `{"$date": <dateAsMilliseconds>}`, where `dateAsMilliseconds` is
127    a 64-bit signed integer giving the number of milliseconds since the Unix
128    epoch UTC. This was the default encoding before PyMongo version 3.4.
129
130    .. versionadded:: 3.4
131    """
132
133    NUMBERLONG = 1
134    """NumberLong datetime representation.
135
136    :class:`datetime.datetime` instances will be encoded to JSON in the
137    format `{"$date": {"$numberLong": "<dateAsMilliseconds>"}}`,
138    where `dateAsMilliseconds` is the string representation of a 64-bit signed
139    integer giving the number of milliseconds since the Unix epoch UTC.
140
141    .. versionadded:: 3.4
142    """
143
144    ISO8601 = 2
145    """ISO-8601 datetime representation.
146
147    :class:`datetime.datetime` instances greater than or equal to the Unix
148    epoch UTC will be encoded to JSON in the format `{"$date": "<ISO-8601>"}`.
149    :class:`datetime.datetime` instances before the Unix epoch UTC will be
150    encoded as if the datetime representation is
151    :const:`~DatetimeRepresentation.NUMBERLONG`.
152
153    .. versionadded:: 3.4
154    """
155
156
157class JSONMode:
158    LEGACY = 0
159    """Legacy Extended JSON representation.
160
161    In this mode, :func:`~bson.json_util.dumps` produces PyMongo's legacy
162    non-standard JSON output. Consider using
163    :const:`~bson.json_util.JSONMode.RELAXED` or
164    :const:`~bson.json_util.JSONMode.CANONICAL` instead.
165
166    .. versionadded:: 3.5
167    """
168
169    RELAXED = 1
170    """Relaxed Extended JSON representation.
171
172    In this mode, :func:`~bson.json_util.dumps` produces Relaxed Extended JSON,
173    a mostly JSON-like format. Consider using this for things like a web API,
174    where one is sending a document (or a projection of a document) that only
175    uses ordinary JSON type primitives. In particular, the ``int``,
176    :class:`~bson.int64.Int64`, and ``float`` numeric types are represented in
177    the native JSON number format. This output is also the most human readable
178    and is useful for debugging and documentation.
179
180    .. seealso:: The specification for Relaxed `Extended JSON`_.
181
182    .. versionadded:: 3.5
183    """
184
185    CANONICAL = 2
186    """Canonical Extended JSON representation.
187
188    In this mode, :func:`~bson.json_util.dumps` produces Canonical Extended
189    JSON, a type preserving format. Consider using this for things like
190    testing, where one has to precisely specify expected types in JSON. In
191    particular, the ``int``, :class:`~bson.int64.Int64`, and ``float`` numeric
192    types are encoded with type wrappers.
193
194    .. seealso:: The specification for Canonical `Extended JSON`_.
195
196    .. versionadded:: 3.5
197    """
198
199
200class JSONOptions(CodecOptions):
201    """Encapsulates JSON options for :func:`dumps` and :func:`loads`.
202
203    :Parameters:
204      - `strict_number_long`: If ``True``, :class:`~bson.int64.Int64` objects
205        are encoded to MongoDB Extended JSON's *Strict mode* type
206        `NumberLong`, i.e. ``'{"$numberLong": "<number>" }'``. Otherwise they
207        will be encoded as an `int`. Defaults to ``False`` (``True`` in CANONICAL mode).
208      - `datetime_representation`: The representation to use when encoding
209        instances of :class:`datetime.datetime`. If omitted, it is chosen
210        by `json_mode` (:const:`~DatetimeRepresentation.ISO8601` for RELAXED).
211      - `strict_uuid`: If ``True``, :class:`uuid.UUID` objects are encoded to
212        MongoDB Extended JSON's *Strict mode* type `Binary`. Otherwise they
213        are encoded as ``'{"$uuid": "<hex>" }'``. Defaults to ``True`` (``False`` in LEGACY mode).
214      - `json_mode`: The :class:`JSONMode` to use when encoding BSON types to
215        Extended JSON. Defaults to :const:`~JSONMode.RELAXED`.
216      - `document_class`: BSON documents returned by :func:`loads` will be
217        decoded to an instance of this class. Must be a subclass of
218        :class:`collections.abc.MutableMapping`. Defaults to :class:`dict`.
219      - `uuid_representation`: The :class:`~bson.binary.UuidRepresentation`
220        to use when encoding and decoding instances of :class:`uuid.UUID`.
221        Defaults to :const:`~bson.binary.UuidRepresentation.UNSPECIFIED`.
222      - `tz_aware`: If ``True``, MongoDB Extended JSON's *Strict mode* type
223        `Date` will be decoded to timezone aware instances of
224        :class:`datetime.datetime`. Otherwise they will be naive. Defaults
225        to ``False``.
226      - `tzinfo`: A :class:`datetime.tzinfo` subclass that specifies the
227        timezone from which :class:`~datetime.datetime` objects should be
228        decoded. Defaults to :const:`~bson.tz_util.utc`.
229      - `args`: arguments to :class:`~bson.codec_options.CodecOptions`
230      - `kwargs`: arguments to :class:`~bson.codec_options.CodecOptions`
231
232    .. seealso:: The specification for Relaxed and Canonical `Extended JSON`_.
233
234    .. versionchanged:: 4.0
235       The default for `json_mode` was changed from :const:`JSONMode.LEGACY`
236       to :const:`JSONMode.RELAXED`.
237       The default for `uuid_representation` was changed from
238       :const:`~bson.binary.UuidRepresentation.PYTHON_LEGACY` to
239       :const:`~bson.binary.UuidRepresentation.UNSPECIFIED`.
240
241    .. versionchanged:: 3.5
242       Accepts the optional parameter `json_mode`.
243
244    .. versionchanged:: 4.0
245       Changed default value of `tz_aware` to False.
246    """
247
248    def __new__(
249        cls,
250        strict_number_long=None,
251        datetime_representation=None,
252        strict_uuid=None,
253        json_mode=JSONMode.RELAXED,
254        *args,
255        **kwargs
256    ):
257        kwargs["tz_aware"] = kwargs.get("tz_aware", False)
258        if kwargs["tz_aware"]:
259            kwargs["tzinfo"] = kwargs.get("tzinfo", utc)
260        if datetime_representation not in (
261            DatetimeRepresentation.LEGACY,
262            DatetimeRepresentation.NUMBERLONG,
263            DatetimeRepresentation.ISO8601,
264            None,
265        ):
266            raise ValueError(
267                "JSONOptions.datetime_representation must be one of LEGACY, "
268                "NUMBERLONG, or ISO8601 from DatetimeRepresentation."
269            )
270        self = super(JSONOptions, cls).__new__(cls, *args, **kwargs)
271        if json_mode not in (JSONMode.LEGACY, JSONMode.RELAXED, JSONMode.CANONICAL):
272            raise ValueError(
273                "JSONOptions.json_mode must be one of LEGACY, RELAXED, "
274                "or CANONICAL from JSONMode."
275            )
276        self.json_mode = json_mode
277        if self.json_mode == JSONMode.RELAXED:
278            if strict_number_long:
279                raise ValueError("Cannot specify strict_number_long=True with" " JSONMode.RELAXED")
280            if datetime_representation not in (None, DatetimeRepresentation.ISO8601):
281                raise ValueError(
282                    "datetime_representation must be DatetimeRepresentation."
283                    "ISO8601 or omitted with JSONMode.RELAXED"
284                )
285            if strict_uuid not in (None, True):
286                raise ValueError("Cannot specify strict_uuid=False with JSONMode.RELAXED")
287            self.strict_number_long = False
288            self.datetime_representation = DatetimeRepresentation.ISO8601
289            self.strict_uuid = True
290        elif self.json_mode == JSONMode.CANONICAL:
291            if strict_number_long not in (None, True):
292                raise ValueError("Cannot specify strict_number_long=False with JSONMode.CANONICAL")
293            if datetime_representation not in (None, DatetimeRepresentation.NUMBERLONG):
294                raise ValueError(
295                    "datetime_representation must be DatetimeRepresentation."
296                    "NUMBERLONG or omitted with JSONMode.CANONICAL"
297                )
298            if strict_uuid not in (None, True):
299                raise ValueError("Cannot specify strict_uuid=False with JSONMode.CANONICAL")
300            self.strict_number_long = True
301            self.datetime_representation = DatetimeRepresentation.NUMBERLONG
302            self.strict_uuid = True
303        else:  # JSONMode.LEGACY
304            self.strict_number_long = False
305            self.datetime_representation = DatetimeRepresentation.LEGACY
306            self.strict_uuid = False
307            if strict_number_long is not None:
308                self.strict_number_long = strict_number_long
309            if datetime_representation is not None:
310                self.datetime_representation = datetime_representation
311            if strict_uuid is not None:
312                self.strict_uuid = strict_uuid
313        return self
314
315    def _arguments_repr(self):
316        return (
317            "strict_number_long=%r, "
318            "datetime_representation=%r, "
319            "strict_uuid=%r, json_mode=%r, %s"
320            % (
321                self.strict_number_long,
322                self.datetime_representation,
323                self.strict_uuid,
324                self.json_mode,
325                super(JSONOptions, self)._arguments_repr(),
326            )
327        )
328
329    def _options_dict(self):
330        # TODO: PYTHON-2442 use _asdict() instead
331        options_dict = super(JSONOptions, self)._options_dict()
332        options_dict.update(
333            {
334                "strict_number_long": self.strict_number_long,
335                "datetime_representation": self.datetime_representation,
336                "strict_uuid": self.strict_uuid,
337                "json_mode": self.json_mode,
338            }
339        )
340        return options_dict
341
342    def with_options(self, **kwargs):
343        """
344        Make a copy of this JSONOptions, overriding some options::
345
346            >>> from .json_util import CANONICAL_JSON_OPTIONS
347            >>> CANONICAL_JSON_OPTIONS.tz_aware
348            False
349            >>> json_options = CANONICAL_JSON_OPTIONS.with_options(tz_aware=True)
350            >>> json_options.tz_aware
351            True
352
353        .. versionadded:: 3.12
354        """
355        opts = self._options_dict()
356        for opt in ("strict_number_long", "datetime_representation", "strict_uuid", "json_mode"):
357            opts[opt] = kwargs.get(opt, getattr(self, opt))
358        opts.update(kwargs)
359        return JSONOptions(**opts)
360
361
362LEGACY_JSON_OPTIONS = JSONOptions(json_mode=JSONMode.LEGACY)
363""":class:`JSONOptions` for encoding to PyMongo's legacy JSON format.
364
365.. seealso:: The documentation for :const:`bson.json_util.JSONMode.LEGACY`.
366
367.. versionadded:: 3.5
368"""
369
370CANONICAL_JSON_OPTIONS = JSONOptions(json_mode=JSONMode.CANONICAL)
371""":class:`JSONOptions` for Canonical Extended JSON.
372
373.. seealso:: The documentation for :const:`bson.json_util.JSONMode.CANONICAL`.
374
375.. versionadded:: 3.5
376"""
377
378RELAXED_JSON_OPTIONS = JSONOptions(json_mode=JSONMode.RELAXED)
379""":class:`JSONOptions` for Relaxed Extended JSON.
380
381.. seealso:: The documentation for :const:`bson.json_util.JSONMode.RELAXED`.
382
383.. versionadded:: 3.5
384"""
385
386DEFAULT_JSON_OPTIONS = RELAXED_JSON_OPTIONS
387"""The default :class:`JSONOptions` for JSON encoding/decoding.
388
389The same as :const:`RELAXED_JSON_OPTIONS`.
390
391.. versionchanged:: 4.0
392   Changed from :const:`LEGACY_JSON_OPTIONS` to
393   :const:`RELAXED_JSON_OPTIONS`.
394
395.. versionadded:: 3.4
396"""
397
398
399def dumps(obj, *args, **kwargs):
400    """Helper function that wraps :func:`json.dumps`.
401
402    Recursive function that handles all BSON types including
403    :class:`~bson.binary.Binary` and :class:`~bson.code.Code`.
404
405    :Parameters:
406      - `json_options`: A :class:`JSONOptions` instance used to modify the
407        encoding of MongoDB Extended JSON types. Defaults to
408        :const:`DEFAULT_JSON_OPTIONS`.
409
410    .. versionchanged:: 4.0
411       Now outputs MongoDB Relaxed Extended JSON by default (using
412       :const:`DEFAULT_JSON_OPTIONS`).
413
414    .. versionchanged:: 3.4
415       Accepts optional parameter `json_options`. See :class:`JSONOptions`.
416    """
417    json_options = kwargs.pop("json_options", DEFAULT_JSON_OPTIONS)
418    return json.dumps(_json_convert(obj, json_options), *args, **kwargs)
419
420
421def loads(s, *args, **kwargs):
422    """Helper function that wraps :func:`json.loads`.
423
424    Automatically passes the object_hook for BSON type conversion.
425
426    Raises ``TypeError``, ``ValueError``, ``KeyError``, or
427    :exc:`~bson.errors.InvalidId` on invalid MongoDB Extended JSON.
428
429    :Parameters:
430      - `json_options`: A :class:`JSONOptions` instance used to modify the
431        decoding of MongoDB Extended JSON types. Defaults to
432        :const:`DEFAULT_JSON_OPTIONS`.
433
434    .. versionchanged:: 3.5
435       Parses Relaxed and Canonical Extended JSON as well as PyMongo's legacy
436       format. Now raises ``TypeError`` or ``ValueError`` when parsing JSON
437       type wrappers with values of the wrong type or any extra keys.
438
439    .. versionchanged:: 3.4
440       Accepts optional parameter `json_options`. See :class:`JSONOptions`.
441    """
442    json_options = kwargs.pop("json_options", DEFAULT_JSON_OPTIONS)
443    kwargs["object_pairs_hook"] = lambda pairs: object_pairs_hook(pairs, json_options)
444    return json.loads(s, *args, **kwargs)
445
446
447def _json_convert(obj, json_options=DEFAULT_JSON_OPTIONS):
448    """Recursive helper method that converts BSON types so they can be
449    converted into json.
450    """
451    if hasattr(obj, "items"):
452        return SON(((k, _json_convert(v, json_options)) for k, v in obj.items()))
453    elif hasattr(obj, "__iter__") and not isinstance(obj, (str, bytes)):
454        return list((_json_convert(v, json_options) for v in obj))
455    try:
456        return default(obj, json_options)
457    except TypeError:
458        return obj
459
460
461def object_pairs_hook(pairs, json_options=DEFAULT_JSON_OPTIONS):
462    return object_hook(json_options.document_class(pairs), json_options)
463
464
465def object_hook(dct, json_options=DEFAULT_JSON_OPTIONS):
466    if "$oid" in dct:
467        return _parse_canonical_oid(dct)
468    if (
469        isinstance(dct.get("$ref"), str)
470        and "$id" in dct
471        and isinstance(dct.get("$db"), (str, type(None)))
472    ):
473        return _parse_canonical_dbref(dct)
474    if "$date" in dct:
475        return _parse_canonical_datetime(dct, json_options)
476    if "$regex" in dct:
477        return _parse_legacy_regex(dct)
478    if "$minKey" in dct:
479        return _parse_canonical_minkey(dct)
480    if "$maxKey" in dct:
481        return _parse_canonical_maxkey(dct)
482    if "$binary" in dct:
483        if "$type" in dct:
484            return _parse_legacy_binary(dct, json_options)
485        else:
486            return _parse_canonical_binary(dct, json_options)
487    if "$code" in dct:
488        return _parse_canonical_code(dct)
489    if "$uuid" in dct:
490        return _parse_legacy_uuid(dct, json_options)
491    if "$undefined" in dct:
492        return None
493    if "$numberLong" in dct:
494        return _parse_canonical_int64(dct)
495    if "$timestamp" in dct:
496        tsp = dct["$timestamp"]
497        return Timestamp(tsp["t"], tsp["i"])
498    if "$numberDecimal" in dct:
499        return _parse_canonical_decimal128(dct)
500    if "$dbPointer" in dct:
501        return _parse_canonical_dbpointer(dct)
502    if "$regularExpression" in dct:
503        return _parse_canonical_regex(dct)
504    if "$symbol" in dct:
505        return _parse_canonical_symbol(dct)
506    if "$numberInt" in dct:
507        return _parse_canonical_int32(dct)
508    if "$numberDouble" in dct:
509        return _parse_canonical_double(dct)
510    return dct
511
512
513def _parse_legacy_regex(doc):
514    pattern = doc["$regex"]
515    # Check if this is the $regex query operator.
516    if not isinstance(pattern, (str, bytes)):
517        return doc
518    flags = 0
519    # PyMongo always adds $options but some other tools may not.
520    for opt in doc.get("$options", ""):
521        flags |= _RE_OPT_TABLE.get(opt, 0)
522    return Regex(pattern, flags)
523
524
525def _parse_legacy_uuid(doc, json_options):
526    """Decode a JSON legacy $uuid to Python UUID."""
527    if len(doc) != 1:
528        raise TypeError("Bad $uuid, extra field(s): %s" % (doc,))
529    if not isinstance(doc["$uuid"], str):
530        raise TypeError("$uuid must be a string: %s" % (doc,))
531    if json_options.uuid_representation == UuidRepresentation.UNSPECIFIED:
532        return Binary.from_uuid(uuid.UUID(doc["$uuid"]))
533    else:
534        return uuid.UUID(doc["$uuid"])
535
536
537def _binary_or_uuid(data, subtype, json_options):
538    # special handling for UUID
539    if subtype in ALL_UUID_SUBTYPES:
540        uuid_representation = json_options.uuid_representation
541        binary_value = Binary(data, subtype)
542        if uuid_representation == UuidRepresentation.UNSPECIFIED:
543            return binary_value
544        if subtype == UUID_SUBTYPE:
545            # Legacy behavior: use STANDARD with binary subtype 4.
546            uuid_representation = UuidRepresentation.STANDARD
547        elif uuid_representation == UuidRepresentation.STANDARD:
548            # subtype == OLD_UUID_SUBTYPE
549            # Legacy behavior: STANDARD is the same as PYTHON_LEGACY.
550            uuid_representation = UuidRepresentation.PYTHON_LEGACY
551        return binary_value.as_uuid(uuid_representation)
552
553    if subtype == 0:
554        return data
555    return Binary(data, subtype)
556
557
558def _parse_legacy_binary(doc, json_options):
559    if isinstance(doc["$type"], int):
560        doc["$type"] = "%02x" % doc["$type"]
561    subtype = int(doc["$type"], 16)
562    if subtype >= 0xFFFFFF80:  # Handle mongoexport values
563        subtype = int(doc["$type"][6:], 16)
564    data = base64.b64decode(doc["$binary"].encode())
565    return _binary_or_uuid(data, subtype, json_options)
566
567
568def _parse_canonical_binary(doc, json_options):
569    binary = doc["$binary"]
570    b64 = binary["base64"]
571    subtype = binary["subType"]
572    if not isinstance(b64, str):
573        raise TypeError("$binary base64 must be a string: %s" % (doc,))
574    if not isinstance(subtype, str) or len(subtype) > 2:
575        raise TypeError("$binary subType must be a string at most 2 " "characters: %s" % (doc,))
576    if len(binary) != 2:
577        raise TypeError(
578            '$binary must include only "base64" and "subType" ' "components: %s" % (doc,)
579        )
580
581    data = base64.b64decode(b64.encode())
582    return _binary_or_uuid(data, int(subtype, 16), json_options)
583
584
585def _parse_canonical_datetime(doc, json_options):
586    """Decode a JSON datetime to python datetime.datetime."""
587    dtm = doc["$date"]
588    if len(doc) != 1:
589        raise TypeError("Bad $date, extra field(s): %s" % (doc,))
590    # mongoexport 2.6 and newer
591    if isinstance(dtm, str):
592        # Parse offset
593        if dtm[-1] == "Z":
594            dt = dtm[:-1]
595            offset = "Z"
596        elif dtm[-6] in ("+", "-") and dtm[-3] == ":":
597            # (+|-)HH:MM
598            dt = dtm[:-6]
599            offset = dtm[-6:]
600        elif dtm[-5] in ("+", "-"):
601            # (+|-)HHMM
602            dt = dtm[:-5]
603            offset = dtm[-5:]
604        elif dtm[-3] in ("+", "-"):
605            # (+|-)HH
606            dt = dtm[:-3]
607            offset = dtm[-3:]
608        else:
609            dt = dtm
610            offset = ""
611
612        # Parse the optional fractional seconds portion.
613        dot_index = dt.rfind(".")
614        microsecond = 0
615        if dot_index != -1:
616            microsecond = int(float(dt[dot_index:]) * 1000000)
617            dt = dt[:dot_index]
618
619        aware = datetime.datetime.strptime(dt, "%Y-%m-%dT%H:%M:%S").replace(
620            microsecond=microsecond, tzinfo=utc
621        )
622
623        if offset and offset != "Z":
624            if len(offset) == 6:
625                hours, minutes = offset[1:].split(":")
626                secs = int(hours) * 3600 + int(minutes) * 60
627            elif len(offset) == 5:
628                secs = int(offset[1:3]) * 3600 + int(offset[3:]) * 60
629            elif len(offset) == 3:
630                secs = int(offset[1:3]) * 3600
631            if offset[0] == "-":
632                secs *= -1
633            aware = aware - datetime.timedelta(seconds=secs)
634
635        if json_options.tz_aware:
636            if json_options.tzinfo:
637                aware = aware.astimezone(json_options.tzinfo)
638            return aware
639        else:
640            return aware.replace(tzinfo=None)
641    return bson._millis_to_datetime(int(dtm), json_options)
642
643
644def _parse_canonical_oid(doc):
645    """Decode a JSON ObjectId to bson.objectid.ObjectId."""
646    if len(doc) != 1:
647        raise TypeError("Bad $oid, extra field(s): %s" % (doc,))
648    return ObjectId(doc["$oid"])
649
650
651def _parse_canonical_symbol(doc):
652    """Decode a JSON symbol to Python string."""
653    symbol = doc["$symbol"]
654    if len(doc) != 1:
655        raise TypeError("Bad $symbol, extra field(s): %s" % (doc,))
656    return str(symbol)
657
658
659def _parse_canonical_code(doc):
660    """Decode a JSON code to bson.code.Code."""
661    for key in doc:
662        if key not in ("$code", "$scope"):
663            raise TypeError("Bad $code, extra field(s): %s" % (doc,))
664    return Code(doc["$code"], scope=doc.get("$scope"))
665
666
667def _parse_canonical_regex(doc):
668    """Decode a JSON regex to bson.regex.Regex."""
669    regex = doc["$regularExpression"]
670    if len(doc) != 1:
671        raise TypeError("Bad $regularExpression, extra field(s): %s" % (doc,))
672    if len(regex) != 2:
673        raise TypeError(
674            'Bad $regularExpression must include only "pattern" '
675            'and "options" components: %s' % (doc,)
676        )
677    opts = regex["options"]
678    if not isinstance(opts, str):
679        raise TypeError(
680            "Bad $regularExpression options, options must be " "string, was type %s" % (type(opts))
681        )
682    return Regex(regex["pattern"], opts)
683
684
def _parse_canonical_dbref(doc):
    """Decode a JSON DBRef to bson.dbref.DBRef."""
    return DBRef(doc.pop("$ref"), doc.pop("$id"), database=doc.pop("$db", None), **doc)


def _parse_canonical_dbpointer(doc):
    """Decode a JSON (deprecated) DBPointer to bson.dbref.DBRef."""
    dbref = doc["$dbPointer"]
    if len(doc) != 1:
        raise TypeError("Bad $dbPointer, extra field(s): %s" % (doc,))
    if isinstance(dbref, DBRef):
        dbref_doc = dbref.as_doc()
        # DBPointer must not contain $db in its value.
        if dbref.database is not None:
            raise TypeError("Bad $dbPointer, extra field $db: %s" % (dbref_doc,))
        if not isinstance(dbref.id, ObjectId):
            raise TypeError("Bad $dbPointer, $id must be an ObjectId: %s" % (dbref_doc,))
        if len(dbref_doc) != 2:
            raise TypeError("Bad $dbPointer, extra field(s) in DBRef: %s" % (dbref_doc,))
        return dbref
    else:
        raise TypeError("Bad $dbPointer, expected a DBRef: %s" % (doc,))


def _parse_canonical_int32(doc):
    """Decode a JSON int32 to python int."""
    i_str = doc["$numberInt"]
    if len(doc) != 1:
        raise TypeError("Bad $numberInt, extra field(s): %s" % (doc,))
    if not isinstance(i_str, str):
        raise TypeError("$numberInt must be string: %s" % (doc,))
    return int(i_str)


def _parse_canonical_int64(doc):
    """Decode a JSON int64 to bson.int64.Int64."""
    l_str = doc["$numberLong"]
    if len(doc) != 1:
        raise TypeError("Bad $numberLong, extra field(s): %s" % (doc,))
    return Int64(l_str)


def _parse_canonical_double(doc):
    """Decode a JSON double to python float."""
    d_str = doc["$numberDouble"]
    if len(doc) != 1:
        raise TypeError("Bad $numberDouble, extra field(s): %s" % (doc,))
    if not isinstance(d_str, str):
        raise TypeError("$numberDouble must be string: %s" % (doc,))
    return float(d_str)


def _parse_canonical_decimal128(doc):
    """Decode a JSON decimal128 to bson.decimal128.Decimal128."""
    d_str = doc["$numberDecimal"]
    if len(doc) != 1:
        raise TypeError("Bad $numberDecimal, extra field(s): %s" % (doc,))
    if not isinstance(d_str, str):
        raise TypeError("$numberDecimal must be string: %s" % (doc,))
    return Decimal128(d_str)


def _parse_canonical_minkey(doc):
    """Decode a JSON MinKey to bson.min_key.MinKey."""
    if type(doc["$minKey"]) is not int or doc["$minKey"] != 1:
        raise TypeError("$minKey value must be 1: %s" % (doc,))
    if len(doc) != 1:
        raise TypeError("Bad $minKey, extra field(s): %s" % (doc,))
    return MinKey()


def _parse_canonical_maxkey(doc):
    """Decode a JSON MaxKey to bson.max_key.MaxKey."""
    if type(doc["$maxKey"]) is not int or doc["$maxKey"] != 1:
        # Format the message with "%" so the document is interpolated.
        raise TypeError("$maxKey value must be 1: %s" % (doc,))
    if len(doc) != 1:
        raise TypeError("Bad $maxKey, extra field(s): %s" % (doc,))
    return MaxKey()


def _encode_binary(data, subtype, json_options):
    if json_options.json_mode == JSONMode.LEGACY:
        return SON([("$binary", base64.b64encode(data).decode()), ("$type", "%02x" % subtype)])
    return {
        "$binary": SON([("base64", base64.b64encode(data).decode()), ("subType", "%02x" % subtype)])
    }


def default(obj, json_options=DEFAULT_JSON_OPTIONS):
    # We preserve key order when rendering SON, DBRef, etc. as JSON by
    # returning a SON for those types instead of a dict.
    if isinstance(obj, ObjectId):
        return {"$oid": str(obj)}
    if isinstance(obj, DBRef):
        return _json_convert(obj.as_doc(), json_options=json_options)
    if isinstance(obj, datetime.datetime):
        if json_options.datetime_representation == DatetimeRepresentation.ISO8601:
            if not obj.tzinfo:
                obj = obj.replace(tzinfo=utc)
            if obj >= EPOCH_AWARE:
                off = obj.tzinfo.utcoffset(obj)
                if (off.days, off.seconds, off.microseconds) == (0, 0, 0):
                    tz_string = "Z"
                else:
                    tz_string = obj.strftime("%z")
                millis = int(obj.microsecond / 1000)
                fracsecs = ".%03d" % (millis,) if millis else ""
                return {
                    "$date": "%s%s%s" % (obj.strftime("%Y-%m-%dT%H:%M:%S"), fracsecs, tz_string)
                }

        millis = bson._datetime_to_millis(obj)
        if json_options.datetime_representation == DatetimeRepresentation.LEGACY:
            return {"$date": millis}
        return {"$date": {"$numberLong": str(millis)}}
    if json_options.strict_number_long and isinstance(obj, Int64):
        return {"$numberLong": str(obj)}
    if isinstance(obj, (RE_TYPE, Regex)):
        flags = ""
        if obj.flags & re.IGNORECASE:
            flags += "i"
        if obj.flags & re.LOCALE:
            flags += "l"
        if obj.flags & re.MULTILINE:
            flags += "m"
        if obj.flags & re.DOTALL:
            flags += "s"
        if obj.flags & re.UNICODE:
            flags += "u"
        if obj.flags & re.VERBOSE:
            flags += "x"
        if isinstance(obj.pattern, str):
            pattern = obj.pattern
        else:
            pattern = obj.pattern.decode("utf-8")
        if json_options.json_mode == JSONMode.LEGACY:
            return SON([("$regex", pattern), ("$options", flags)])
        return {"$regularExpression": SON([("pattern", pattern), ("options", flags)])}
    if isinstance(obj, MinKey):
        return {"$minKey": 1}
    if isinstance(obj, MaxKey):
        return {"$maxKey": 1}
    if isinstance(obj, Timestamp):
        return {"$timestamp": SON([("t", obj.time), ("i", obj.inc)])}
    if isinstance(obj, Code):
        if obj.scope is None:
            return {"$code": str(obj)}
        return SON([("$code", str(obj)), ("$scope", _json_convert(obj.scope, json_options))])
    if isinstance(obj, Binary):
        return _encode_binary(obj, obj.subtype, json_options)
    if isinstance(obj, bytes):
        return _encode_binary(obj, 0, json_options)
    if isinstance(obj, uuid.UUID):
        if json_options.strict_uuid:
            binval = Binary.from_uuid(obj, uuid_representation=json_options.uuid_representation)
            return _encode_binary(binval, binval.subtype, json_options)
        else:
            return {"$uuid": obj.hex}
    if isinstance(obj, Decimal128):
        return {"$numberDecimal": str(obj)}
    if isinstance(obj, bool):
        return obj
    if json_options.json_mode == JSONMode.CANONICAL and isinstance(obj, int):
        if -(2**31) <= obj < 2**31:
            return {"$numberInt": str(obj)}
        return {"$numberLong": str(obj)}
    if json_options.json_mode != JSONMode.LEGACY and isinstance(obj, float):
        if math.isnan(obj):
            return {"$numberDouble": "NaN"}
        elif math.isinf(obj):
            representation = "Infinity" if obj > 0 else "-Infinity"
            return {"$numberDouble": representation}
        elif json_options.json_mode == JSONMode.CANONICAL:
            # repr() will return the shortest string guaranteed to produce the
            # original value, when float() is called on it.
            return {"$numberDouble": str(repr(obj))}
    raise TypeError("%r is not JSON serializable" % obj)
class DatetimeRepresentation:
class DatetimeRepresentation:
    LEGACY = 0
    """Legacy MongoDB Extended JSON datetime representation.

    :class:`datetime.datetime` instances will be encoded to JSON in the
    format `{"$date": <dateAsMilliseconds>}`, where `dateAsMilliseconds` is
    a 64-bit signed integer giving the number of milliseconds since the Unix
    epoch UTC. This was the default encoding before PyMongo version 3.4.

    .. versionadded:: 3.4
    """

    NUMBERLONG = 1
    """NumberLong datetime representation.

    :class:`datetime.datetime` instances will be encoded to JSON in the
    format `{"$date": {"$numberLong": "<dateAsMilliseconds>"}}`,
    where `dateAsMilliseconds` is the string representation of a 64-bit signed
    integer giving the number of milliseconds since the Unix epoch UTC.

    .. versionadded:: 3.4
    """

    ISO8601 = 2
    """ISO-8601 datetime representation.

    :class:`datetime.datetime` instances greater than or equal to the Unix
    epoch UTC will be encoded to JSON in the format `{"$date": "<ISO-8601>"}`.
    :class:`datetime.datetime` instances before the Unix epoch UTC will be
    encoded as if the datetime representation is
    :const:`~DatetimeRepresentation.NUMBERLONG`.

    .. versionadded:: 3.4
    """
LEGACY = 0

Legacy MongoDB Extended JSON datetime representation.

datetime.datetime instances will be encoded to JSON in the format {"$date": <dateAsMilliseconds>}, where dateAsMilliseconds is a 64-bit signed integer giving the number of milliseconds since the Unix epoch UTC. This was the default encoding before PyMongo version 3.4.

New in version 3.4.

NUMBERLONG = 1

NumberLong datetime representation.

datetime.datetime instances will be encoded to JSON in the format {"$date": {"$numberLong": "<dateAsMilliseconds>"}}, where dateAsMilliseconds is the string representation of a 64-bit signed integer giving the number of milliseconds since the Unix epoch UTC.

New in version 3.4.

ISO8601 = 2

ISO-8601 datetime representation.

datetime.datetime instances greater than or equal to the Unix epoch UTC will be encoded to JSON in the format {"$date": "<ISO-8601>"}. datetime.datetime instances before the Unix epoch UTC will be encoded as if the datetime representation is ~DatetimeRepresentation.NUMBERLONG.

New in version 3.4.
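
The three representations can be compared with a small standard-library sketch. The helper `encode_datetime`, the integer constants, and the `EPOCH` name below are illustrative stand-ins that mirror the datetime branch of `default`; they are assumptions for this example and not part of the module's API.

```python
import datetime

# Stand-ins for DatetimeRepresentation.LEGACY / NUMBERLONG / ISO8601.
LEGACY, NUMBERLONG, ISO8601 = 0, 1, 2

UTC = datetime.timezone.utc
EPOCH = datetime.datetime(1970, 1, 1, tzinfo=UTC)


def encode_datetime(dt, representation):
    """Hypothetical helper: encode ``dt`` as each representation describes."""
    if dt.tzinfo is None:
        # Naive datetimes are treated as UTC.
        dt = dt.replace(tzinfo=UTC)
    if representation == ISO8601 and dt >= EPOCH:
        offset = dt.utcoffset()
        tz_string = "Z" if not offset else dt.strftime("%z")
        millis = dt.microsecond // 1000
        fracsecs = ".%03d" % millis if millis else ""
        return {"$date": dt.strftime("%Y-%m-%dT%H:%M:%S") + fracsecs + tz_string}
    # Whole milliseconds since the Unix epoch, computed with integer arithmetic.
    millis = (dt - EPOCH) // datetime.timedelta(milliseconds=1)
    if representation == LEGACY:
        return {"$date": millis}
    # NUMBERLONG, and ISO8601 for datetimes before the epoch.
    return {"$date": {"$numberLong": str(millis)}}


when = datetime.datetime(2020, 1, 2, 3, 4, 5, 678000)
for representation in (LEGACY, NUMBERLONG, ISO8601):
    print(encode_datetime(when, representation))
```

Note that the ISO8601 branch falls through to the NumberLong form for datetimes before the epoch, which matches the behaviour described for ISO8601 above.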

class JSONMode:
class JSONMode:
    LEGACY = 0
    """Legacy Extended JSON representation.

    In this mode, :func:`~bson.json_util.dumps` produces PyMongo's legacy
    non-standard JSON output. Consider using
    :const:`~bson.json_util.JSONMode.RELAXED` or
    :const:`~bson.json_util.JSONMode.CANONICAL` instead.

    .. versionadded:: 3.5
    """

    RELAXED = 1
    """Relaxed Extended JSON representation.

    In this mode, :func:`~bson.json_util.dumps` produces Relaxed Extended JSON,
    a mostly JSON-like format. Consider using this for things like a web API,
    where one is sending a document (or a projection of a document) that only
    uses ordinary JSON type primitives. In particular, the ``int``,
    :class:`~bson.int64.Int64`, and ``float`` numeric types are represented in
    the native JSON number format. This output is also the most human readable
    and is useful for debugging and documentation.

    .. seealso:: The specification for Relaxed `Extended JSON`_.

    .. versionadded:: 3.5
    """

    CANONICAL = 2
    """Canonical Extended JSON representation.

    In this mode, :func:`~bson.json_util.dumps` produces Canonical Extended
    JSON, a type preserving format. Consider using this for things like
    testing, where one has to precisely specify expected types in JSON. In
    particular, the ``int``, :class:`~bson.int64.Int64`, and ``float`` numeric
    types are encoded with type wrappers.

    .. seealso:: The specification for Canonical `Extended JSON`_.

    .. versionadded:: 3.5
    """
LEGACY = 0

Legacy Extended JSON representation.

In this mode, ~bson.json_util.dumps() produces PyMongo's legacy non-standard JSON output. Consider using ~bson.json_util.JSONMode.RELAXED or ~bson.json_util.JSONMode.CANONICAL instead.

New in version 3.5.

RELAXED = 1

Relaxed Extended JSON representation.

In this mode, ~bson.json_util.dumps() produces Relaxed Extended JSON, a mostly JSON-like format. Consider using this for things like a web API, where one is sending a document (or a projection of a document) that only uses ordinary JSON type primitives. In particular, the int, ~bson.int64.Int64, and float numeric types are represented in the native JSON number format. This output is also the most human readable and is useful for debugging and documentation.

See also: the specification for Relaxed Extended JSON.

New in version 3.5.

CANONICAL = 2

Canonical Extended JSON representation.

In this mode, ~bson.json_util.dumps() produces Canonical Extended JSON, a type preserving format. Consider using this for things like testing, where one has to precisely specify expected types in JSON. In particular, the int, ~bson.int64.Int64, and float numeric types are encoded with type wrappers.

See also: the specification for Canonical Extended JSON.

New in version 3.5.
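
The difference between the Relaxed and Canonical treatment of numbers can be sketched with the standard library alone. The helper `encode_number` is a hypothetical stand-in modelled on the numeric branches of `default`; plain `int` stands in for `Int64`, and the Legacy mode is deliberately left out.

```python
import json
import math


def encode_number(value, canonical):
    """Hypothetical helper: wrap ints and floats the way each mode describes."""
    if isinstance(value, bool):
        return value  # bool is a subclass of int and is never wrapped
    if isinstance(value, int):
        if not canonical:
            return value  # Relaxed: native JSON number
        if -(2**31) <= value < 2**31:
            return {"$numberInt": str(value)}
        return {"$numberLong": str(value)}
    if isinstance(value, float):
        # Non-finite floats have no JSON literal, so both modes wrap them.
        if math.isnan(value):
            return {"$numberDouble": "NaN"}
        if math.isinf(value):
            return {"$numberDouble": "Infinity" if value > 0 else "-Infinity"}
        if canonical:
            return {"$numberDouble": repr(value)}
        return value
    raise TypeError("%r is not a number" % (value,))


values = [1, 2**40, 1.5, float("inf")]
print(json.dumps([encode_number(v, canonical=False) for v in values]))
print(json.dumps([encode_number(v, canonical=True) for v in values]))
```

The Relaxed output stays close to ordinary JSON, while the Canonical output keeps enough type information to tell a 32-bit integer from a 64-bit one.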

class JSONOptions(xtquant.xtbson.bson36.codec_options.CodecOptions):
class JSONOptions(CodecOptions):
    """Encapsulates JSON options for :func:`dumps` and :func:`loads`.

    :Parameters:
      - `strict_number_long`: If ``True``, :class:`~bson.int64.Int64` objects
        are encoded to MongoDB Extended JSON's *Strict mode* type
        `NumberLong`, i.e. ``'{"$numberLong": "<number>" }'``. Otherwise they
        will be encoded as an `int`. Defaults to ``False``.
      - `datetime_representation`: The representation to use when encoding
        instances of :class:`datetime.datetime`. When omitted, the value
        implied by `json_mode` is used, which is
        :const:`~DatetimeRepresentation.ISO8601` for the default
        :const:`~JSONMode.RELAXED` mode.
      - `strict_uuid`: If ``True``, :class:`uuid.UUID` objects are encoded to
        MongoDB Extended JSON's *Strict mode* type `Binary`. Otherwise they
        will be encoded as ``'{"$uuid": "<hex>" }'``. When omitted, the value
        implied by `json_mode` is used, which is ``True`` for the default
        :const:`~JSONMode.RELAXED` mode.
      - `json_mode`: The :class:`JSONMode` to use when encoding BSON types to
        Extended JSON. Defaults to :const:`~JSONMode.RELAXED`.
      - `document_class`: BSON documents returned by :func:`loads` will be
        decoded to an instance of this class. Must be a subclass of
        :class:`collections.abc.MutableMapping`. Defaults to :class:`dict`.
      - `uuid_representation`: The :class:`~bson.binary.UuidRepresentation`
        to use when encoding and decoding instances of :class:`uuid.UUID`.
        Defaults to :const:`~bson.binary.UuidRepresentation.UNSPECIFIED`.
      - `tz_aware`: If ``True``, MongoDB Extended JSON's *Strict mode* type
        `Date` will be decoded to timezone aware instances of
        :class:`datetime.datetime`. Otherwise they will be naive. Defaults
        to ``False``.
      - `tzinfo`: A :class:`datetime.tzinfo` subclass that specifies the
        timezone from which :class:`~datetime.datetime` objects should be
        decoded. Defaults to :const:`~bson.tz_util.utc` when `tz_aware` is
        ``True``.
      - `args`: arguments to :class:`~bson.codec_options.CodecOptions`
      - `kwargs`: arguments to :class:`~bson.codec_options.CodecOptions`

    .. seealso:: The specification for Relaxed and Canonical `Extended JSON`_.

    .. versionchanged:: 4.0
       The default for `json_mode` was changed from :const:`JSONMode.LEGACY`
       to :const:`JSONMode.RELAXED`.
       The default for `uuid_representation` was changed from
       :const:`~bson.binary.UuidRepresentation.PYTHON_LEGACY` to
       :const:`~bson.binary.UuidRepresentation.UNSPECIFIED`.

    .. versionchanged:: 3.5
       Accepts the optional parameter `json_mode`.

    .. versionchanged:: 4.0
       Changed default value of `tz_aware` to False.
    """

    def __new__(
        cls,
        strict_number_long=None,
        datetime_representation=None,
        strict_uuid=None,
        json_mode=JSONMode.RELAXED,
        *args,
        **kwargs
    ):
        kwargs["tz_aware"] = kwargs.get("tz_aware", False)
        if kwargs["tz_aware"]:
            kwargs["tzinfo"] = kwargs.get("tzinfo", utc)
        if datetime_representation not in (
            DatetimeRepresentation.LEGACY,
            DatetimeRepresentation.NUMBERLONG,
            DatetimeRepresentation.ISO8601,
            None,
        ):
            raise ValueError(
                "JSONOptions.datetime_representation must be one of LEGACY, "
                "NUMBERLONG, or ISO8601 from DatetimeRepresentation."
            )
        self = super(JSONOptions, cls).__new__(cls, *args, **kwargs)
        if json_mode not in (JSONMode.LEGACY, JSONMode.RELAXED, JSONMode.CANONICAL):
            raise ValueError(
                "JSONOptions.json_mode must be one of LEGACY, RELAXED, "
                "or CANONICAL from JSONMode."
            )
        self.json_mode = json_mode
        if self.json_mode == JSONMode.RELAXED:
            if strict_number_long:
                raise ValueError("Cannot specify strict_number_long=True with JSONMode.RELAXED")
            if datetime_representation not in (None, DatetimeRepresentation.ISO8601):
                raise ValueError(
                    "datetime_representation must be DatetimeRepresentation."
                    "ISO8601 or omitted with JSONMode.RELAXED"
                )
            if strict_uuid not in (None, True):
                raise ValueError("Cannot specify strict_uuid=False with JSONMode.RELAXED")
            self.strict_number_long = False
            self.datetime_representation = DatetimeRepresentation.ISO8601
            self.strict_uuid = True
        elif self.json_mode == JSONMode.CANONICAL:
            # These messages name CANONICAL, the mode this branch validates.
            if strict_number_long not in (None, True):
                raise ValueError("Cannot specify strict_number_long=False with JSONMode.CANONICAL")
            if datetime_representation not in (None, DatetimeRepresentation.NUMBERLONG):
                raise ValueError(
                    "datetime_representation must be DatetimeRepresentation."
                    "NUMBERLONG or omitted with JSONMode.CANONICAL"
                )
            if strict_uuid not in (None, True):
                raise ValueError("Cannot specify strict_uuid=False with JSONMode.CANONICAL")
            self.strict_number_long = True
            self.datetime_representation = DatetimeRepresentation.NUMBERLONG
            self.strict_uuid = True
        else:  # JSONMode.LEGACY
            self.strict_number_long = False
            self.datetime_representation = DatetimeRepresentation.LEGACY
            self.strict_uuid = False
            if strict_number_long is not None:
                self.strict_number_long = strict_number_long
            if datetime_representation is not None:
                self.datetime_representation = datetime_representation
            if strict_uuid is not None:
                self.strict_uuid = strict_uuid
        return self

    def _arguments_repr(self):
        return (
            "strict_number_long=%r, "
            "datetime_representation=%r, "
            "strict_uuid=%r, json_mode=%r, %s"
            % (
                self.strict_number_long,
                self.datetime_representation,
                self.strict_uuid,
                self.json_mode,
                super(JSONOptions, self)._arguments_repr(),
            )
        )

    def _options_dict(self):
        # TODO: PYTHON-2442 use _asdict() instead
        options_dict = super(JSONOptions, self)._options_dict()
        options_dict.update(
            {
                "strict_number_long": self.strict_number_long,
                "datetime_representation": self.datetime_representation,
                "strict_uuid": self.strict_uuid,
                "json_mode": self.json_mode,
            }
        )
        return options_dict

    def with_options(self, **kwargs):
        """
        Make a copy of this JSONOptions, overriding some options::

            >>> from .json_util import CANONICAL_JSON_OPTIONS
            >>> CANONICAL_JSON_OPTIONS.tz_aware
            False
            >>> json_options = CANONICAL_JSON_OPTIONS.with_options(tz_aware=True)
            >>> json_options.tz_aware
            True

        .. versionadded:: 3.12
        """
        opts = self._options_dict()
        for opt in ("strict_number_long", "datetime_representation", "strict_uuid", "json_mode"):
            opts[opt] = kwargs.get(opt, getattr(self, opt))
        opts.update(kwargs)
        return JSONOptions(**opts)

Encapsulates JSON options for dumps() and loads().

:Parameters:

  • strict_number_long: If True, ~bson.int64.Int64 objects are encoded to MongoDB Extended JSON's Strict mode type NumberLong, i.e. '{"$numberLong": "<number>" }'. Otherwise they will be encoded as an int. Defaults to False.
  • datetime_representation: The representation to use when encoding instances of datetime.datetime. When omitted, the value implied by json_mode is used, which is ~DatetimeRepresentation.ISO8601 for the default ~JSONMode.RELAXED mode.
  • strict_uuid: If True, uuid.UUID objects are encoded to MongoDB Extended JSON's Strict mode type Binary. Otherwise they will be encoded as '{"$uuid": "<hex>" }'. When omitted, the value implied by json_mode is used, which is True for the default ~JSONMode.RELAXED mode.
  • json_mode: The JSONMode to use when encoding BSON types to Extended JSON. Defaults to ~JSONMode.RELAXED.
  • document_class: BSON documents returned by loads() will be decoded to an instance of this class. Must be a subclass of collections.abc.MutableMapping. Defaults to dict.
  • uuid_representation: The ~bson.binary.UuidRepresentation to use when encoding and decoding instances of uuid.UUID. Defaults to ~bson.binary.UuidRepresentation.UNSPECIFIED.
  • tz_aware: If True, MongoDB Extended JSON's Strict mode type Date will be decoded to timezone aware instances of datetime.datetime. Otherwise they will be naive. Defaults to False.
  • tzinfo: A datetime.tzinfo subclass that specifies the timezone from which ~datetime.datetime objects should be decoded. Defaults to ~bson.tz_util.utc when tz_aware is True.
  • args: arguments to ~bson.codec_options.CodecOptions
  • kwargs: arguments to ~bson.codec_options.CodecOptions

See also: the specification for Relaxed and Canonical Extended JSON.

Changed in version 4.0: The default for json_mode was changed from JSONMode.LEGACY to JSONMode.RELAXED. The default for uuid_representation was changed from ~bson.binary.UuidRepresentation.PYTHON_LEGACY to ~bson.binary.UuidRepresentation.UNSPECIFIED.

Changed in version 3.5: Accepts the optional parameter json_mode.

Changed in version 4.0: Changed default value of tz_aware to False.
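
How the mode constrains the three encoding options can be sketched without the library. The function `resolve_options` and the constants below are hypothetical stand-ins modelled on the branches of `JSONOptions.__new__`; they only illustrate the rule that RELAXED and CANONICAL fix their values while LEGACY accepts overrides.

```python
# Stand-ins for the JSONMode and DatetimeRepresentation constants.
LEGACY, RELAXED, CANONICAL = 0, 1, 2
DT_LEGACY, DT_NUMBERLONG, DT_ISO8601 = 0, 1, 2

NAMES = ("strict_number_long", "datetime_representation", "strict_uuid")

# Values that each non-legacy mode fixes, in the order given by NAMES.
FIXED = {
    RELAXED: (False, DT_ISO8601, True),
    CANONICAL: (True, DT_NUMBERLONG, True),
}


def resolve_options(json_mode=RELAXED, strict_number_long=None,
                    datetime_representation=None, strict_uuid=None):
    """Hypothetical helper: return the effective option values for a mode."""
    given = dict(zip(NAMES, (strict_number_long, datetime_representation, strict_uuid)))
    if json_mode == LEGACY:
        # Legacy mode has its own defaults and accepts any explicit override.
        defaults = dict(zip(NAMES, (False, DT_LEGACY, False)))
        return {name: defaults[name] if given[name] is None else given[name]
                for name in NAMES}
    if json_mode not in FIXED:
        raise ValueError("json_mode must be LEGACY, RELAXED, or CANONICAL")
    fixed = dict(zip(NAMES, FIXED[json_mode]))
    for name in NAMES:
        # RELAXED and CANONICAL reject explicit values that contradict the mode.
        if given[name] is not None and given[name] != fixed[name]:
            raise ValueError("%s=%r conflicts with the chosen json_mode" % (name, given[name]))
    return fixed


print(resolve_options())
print(resolve_options(CANONICAL))
print(resolve_options(LEGACY, strict_uuid=True))
```

Passing an explicit value only makes a difference in LEGACY mode; in the other two modes it is either redundant or an error.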

def with_options(self, **kwargs):

Make a copy of this JSONOptions, overriding some options:

>>> from .json_util import CANONICAL_JSON_OPTIONS
>>> CANONICAL_JSON_OPTIONS.tz_aware
False
>>> json_options = CANONICAL_JSON_OPTIONS.with_options(tz_aware=True)
>>> json_options.tz_aware
True

New in version 3.12.

LEGACY_JSON_OPTIONS = JSONOptions(strict_number_long=False, datetime_representation=0, strict_uuid=False, json_mode=0, document_class=dict, tz_aware=False, uuid_representation=UuidRepresentation.UNSPECIFIED, unicode_decode_error_handler='strict', tzinfo=None, type_registry=TypeRegistry(type_codecs=[], fallback_encoder=None))

JSONOptions for encoding to PyMongo's legacy JSON format.

See also: the documentation for bson.json_util.JSONMode.LEGACY.

New in version 3.5.

CANONICAL_JSON_OPTIONS = JSONOptions(strict_number_long=True, datetime_representation=1, strict_uuid=True, json_mode=2, document_class=dict, tz_aware=False, uuid_representation=UuidRepresentation.UNSPECIFIED, unicode_decode_error_handler='strict', tzinfo=None, type_registry=TypeRegistry(type_codecs=[], fallback_encoder=None))

JSONOptions for Canonical Extended JSON.

See also: the documentation for bson.json_util.JSONMode.CANONICAL.

New in version 3.5.

RELAXED_JSON_OPTIONS = JSONOptions(strict_number_long=False, datetime_representation=2, strict_uuid=True, json_mode=1, document_class=dict, tz_aware=False, uuid_representation=UuidRepresentation.UNSPECIFIED, unicode_decode_error_handler='strict', tzinfo=None, type_registry=TypeRegistry(type_codecs=[], fallback_encoder=None))

JSONOptions for Relaxed Extended JSON.

See also: the documentation for bson.json_util.JSONMode.RELAXED.

New in version 3.5.

DEFAULT_JSON_OPTIONS = JSONOptions(strict_number_long=False, datetime_representation=2, strict_uuid=True, json_mode=1, document_class=dict, tz_aware=False, uuid_representation=UuidRepresentation.UNSPECIFIED, unicode_decode_error_handler='strict', tzinfo=None, type_registry=TypeRegistry(type_codecs=[], fallback_encoder=None))

The default JSONOptions for JSON encoding/decoding.

The same as RELAXED_JSON_OPTIONS.

Changed in version 4.0: Changed from LEGACY_JSON_OPTIONS to RELAXED_JSON_OPTIONS.

New in version 3.4.

def dumps(obj, *args, **kwargs):
def dumps(obj, *args, **kwargs):
    """Helper function that wraps :func:`json.dumps`.

    Recursive function that handles all BSON types including
    :class:`~bson.binary.Binary` and :class:`~bson.code.Code`.

    :Parameters:
      - `json_options`: A :class:`JSONOptions` instance used to modify the
        encoding of MongoDB Extended JSON types. Defaults to
        :const:`DEFAULT_JSON_OPTIONS`.

    .. versionchanged:: 4.0
       Now outputs MongoDB Relaxed Extended JSON by default (using
       :const:`DEFAULT_JSON_OPTIONS`).

    .. versionchanged:: 3.4
       Accepts optional parameter `json_options`. See :class:`JSONOptions`.
    """
    json_options = kwargs.pop("json_options", DEFAULT_JSON_OPTIONS)
    return json.dumps(_json_convert(obj, json_options), *args, **kwargs)

Helper function that wraps json.dumps().

Recursive function that handles all BSON types including ~bson.binary.Binary and ~bson.code.Code.

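:Parameters:

  • json_options: A JSONOptions instance used to modify the encoding of MongoDB Extended JSON types. Defaults to DEFAULT_JSON_OPTIONS.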

Changed in version 4.0: Now outputs MongoDB Relaxed Extended JSON by default (using DEFAULT_JSON_OPTIONS).

Changed in version 3.4: Accepts optional parameter json_options. See JSONOptions.
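
The shape of `dumps`, a recursive conversion pass followed by a plain `json.dumps` call, can be illustrated for a single type. The sketch below handles only `bytes`; `encode_binary`, `convert`, and `dumps_sketch` are hypothetical names, and the `legacy` flag is an assumed simplification of the `json_options` argument.

```python
import base64
import json


def encode_binary(data, subtype=0, legacy=False):
    """Hypothetical helper: Extended JSON form of raw bytes."""
    b64 = base64.b64encode(data).decode()
    if legacy:
        return {"$binary": b64, "$type": "%02x" % subtype}
    return {"$binary": {"base64": b64, "subType": "%02x" % subtype}}


def convert(obj, legacy=False):
    """Recursively replace bytes values so json.dumps can serialise the result."""
    if isinstance(obj, dict):
        return {key: convert(value, legacy) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [convert(item, legacy) for item in obj]
    if isinstance(obj, bytes):
        return encode_binary(obj, 0, legacy)
    return obj


def dumps_sketch(obj, *args, legacy=False, **kwargs):
    # Convert first, then hand the plain structure to the standard encoder.
    return json.dumps(convert(obj, legacy), *args, **kwargs)


print(dumps_sketch({"bin": b"\x01\x02\x03\x04"}))
print(dumps_sketch({"bin": b"\x01\x02\x03\x04"}, legacy=True))
```

Because the conversion happens before `json.dumps` is called, the usual keyword arguments such as `indent` or `sort_keys` still pass straight through.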

def loads(s, *args, **kwargs):
def loads(s, *args, **kwargs):
    """Helper function that wraps :func:`json.loads`.

    Automatically passes an ``object_pairs_hook`` for BSON type conversion.

    Raises ``TypeError``, ``ValueError``, ``KeyError``, or
    :exc:`~bson.errors.InvalidId` on invalid MongoDB Extended JSON.

    :Parameters:
      - `json_options`: A :class:`JSONOptions` instance used to modify the
        decoding of MongoDB Extended JSON types. Defaults to
        :const:`DEFAULT_JSON_OPTIONS`.

    .. versionchanged:: 3.5
       Parses Relaxed and Canonical Extended JSON as well as PyMongo's legacy
       format. Now raises ``TypeError`` or ``ValueError`` when parsing JSON
       type wrappers with values of the wrong type or any extra keys.

    .. versionchanged:: 3.4
       Accepts optional parameter `json_options`. See :class:`JSONOptions`.
    """
    json_options = kwargs.pop("json_options", DEFAULT_JSON_OPTIONS)
    kwargs["object_pairs_hook"] = lambda pairs: object_pairs_hook(pairs, json_options)
    return json.loads(s, *args, **kwargs)

Helper function that wraps json.loads().

Automatically passes an object_pairs_hook for BSON type conversion.

Raises TypeError, ValueError, KeyError, or ~bson.errors.InvalidId on invalid MongoDB Extended JSON.

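:Parameters:

  • json_options: A JSONOptions instance used to modify the decoding of MongoDB Extended JSON types. Defaults to DEFAULT_JSON_OPTIONS.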

Changed in version 3.5: Parses Relaxed and Canonical Extended JSON as well as PyMongo's legacy format. Now raises TypeError or ValueError when parsing JSON type wrappers with values of the wrong type or any extra keys.

Changed in version 3.4: Accepts optional parameter json_options. See JSONOptions.
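
The decoding side follows the same pattern in reverse: `json.loads` calls a hook for every JSON object, and the hook replaces recognised type wrappers with Python values. The sketch below covers three numeric wrappers only; `parse_wrapper` and `loads_sketch` are hypothetical names, and plain `int` stands in for `Int64`.

```python
import json

# Wrapper key -> converter applied to its string payload (illustrative subset).
WRAPPERS = (("$numberInt", int), ("$numberLong", int), ("$numberDouble", float))


def parse_wrapper(doc):
    """Hypothetical hook: decode a single-key type wrapper, else return doc."""
    for key, convert in WRAPPERS:
        if key in doc:
            if len(doc) != 1:
                raise TypeError("Bad %s, extra field(s): %s" % (key, doc))
            if not isinstance(doc[key], str):
                raise TypeError("%s must be string: %s" % (key, doc))
            return convert(doc[key])
    return doc


def loads_sketch(s, document_class=dict, **kwargs):
    # Build the document class from the ordered pairs, then try to decode it.
    kwargs["object_pairs_hook"] = lambda pairs: parse_wrapper(document_class(pairs))
    return json.loads(s, **kwargs)


print(loads_sketch('{"a": {"$numberInt": "1"}, "b": [{"$numberDouble": "1.5"}]}'))
```

Since the hook runs on the innermost objects first, nested wrappers are already decoded by the time the enclosing document is built.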

def object_pairs_hook( pairs, json_options=JSONOptions(strict_number_long=False, datetime_representation=2, strict_uuid=True, json_mode=1, document_class=dict, tz_aware=False, uuid_representation=UuidRepresentation.UNSPECIFIED, unicode_decode_error_handler='strict', tzinfo=None, type_registry=TypeRegistry(type_codecs=[], fallback_encoder=None))):
def object_pairs_hook(pairs, json_options=DEFAULT_JSON_OPTIONS):
    return object_hook(json_options.document_class(pairs), json_options)
def object_hook( dct, json_options=JSONOptions(strict_number_long=False, datetime_representation=2, strict_uuid=True, json_mode=1, document_class=dict, tz_aware=False, uuid_representation=UuidRepresentation.UNSPECIFIED, unicode_decode_error_handler='strict', tzinfo=None, type_registry=TypeRegistry(type_codecs=[], fallback_encoder=None))):
def object_hook(dct, json_options=DEFAULT_JSON_OPTIONS):
    if "$oid" in dct:
        return _parse_canonical_oid(dct)
    if (
        isinstance(dct.get("$ref"), str)
        and "$id" in dct
        and isinstance(dct.get("$db"), (str, type(None)))
    ):
        return _parse_canonical_dbref(dct)
    if "$date" in dct:
        return _parse_canonical_datetime(dct, json_options)
    if "$regex" in dct:
        return _parse_legacy_regex(dct)
    if "$minKey" in dct:
        return _parse_canonical_minkey(dct)
    if "$maxKey" in dct:
        return _parse_canonical_maxkey(dct)
    if "$binary" in dct:
        if "$type" in dct:
            return _parse_legacy_binary(dct, json_options)
        else:
            return _parse_canonical_binary(dct, json_options)
    if "$code" in dct:
        return _parse_canonical_code(dct)
    if "$uuid" in dct:
        return _parse_legacy_uuid(dct, json_options)
    if "$undefined" in dct:
        return None
    if "$numberLong" in dct:
        return _parse_canonical_int64(dct)
    if "$timestamp" in dct:
        tsp = dct["$timestamp"]
        return Timestamp(tsp["t"], tsp["i"])
    if "$numberDecimal" in dct:
        return _parse_canonical_decimal128(dct)
    if "$dbPointer" in dct:
        return _parse_canonical_dbpointer(dct)
    if "$regularExpression" in dct:
        return _parse_canonical_regex(dct)
    if "$symbol" in dct:
        return _parse_canonical_symbol(dct)
    if "$numberInt" in dct:
        return _parse_canonical_int32(dct)
    if "$numberDouble" in dct:
        return _parse_canonical_double(dct)
    return dct
def default( obj, json_options=JSONOptions(strict_number_long=False, datetime_representation=2, strict_uuid=True, json_mode=1, document_class=dict, tz_aware=False, uuid_representation=UuidRepresentation.UNSPECIFIED, unicode_decode_error_handler='strict', tzinfo=None, type_registry=TypeRegistry(type_codecs=[], fallback_encoder=None))):
def default(obj, json_options=DEFAULT_JSON_OPTIONS):
    # We preserve key order when rendering SON, DBRef, etc. as JSON by
    # returning a SON for those types instead of a dict.
    if isinstance(obj, ObjectId):
        return {"$oid": str(obj)}
    if isinstance(obj, DBRef):
        return _json_convert(obj.as_doc(), json_options=json_options)
    if isinstance(obj, datetime.datetime):
        if json_options.datetime_representation == DatetimeRepresentation.ISO8601:
            if not obj.tzinfo:
                obj = obj.replace(tzinfo=utc)
            if obj >= EPOCH_AWARE:
                off = obj.tzinfo.utcoffset(obj)
                if (off.days, off.seconds, off.microseconds) == (0, 0, 0):
                    tz_string = "Z"
                else:
                    tz_string = obj.strftime("%z")
                millis = int(obj.microsecond / 1000)
                fracsecs = ".%03d" % (millis,) if millis else ""
                return {
                    "$date": "%s%s%s" % (obj.strftime("%Y-%m-%dT%H:%M:%S"), fracsecs, tz_string)
                }

        millis = bson._datetime_to_millis(obj)
        if json_options.datetime_representation == DatetimeRepresentation.LEGACY:
            return {"$date": millis}
        return {"$date": {"$numberLong": str(millis)}}
    if json_options.strict_number_long and isinstance(obj, Int64):
        return {"$numberLong": str(obj)}
    if isinstance(obj, (RE_TYPE, Regex)):
        flags = ""
        if obj.flags & re.IGNORECASE:
            flags += "i"
        if obj.flags & re.LOCALE:
            flags += "l"
        if obj.flags & re.MULTILINE:
            flags += "m"
        if obj.flags & re.DOTALL:
            flags += "s"
        if obj.flags & re.UNICODE:
            flags += "u"
        if obj.flags & re.VERBOSE:
            flags += "x"
        if isinstance(obj.pattern, str):
            pattern = obj.pattern
        else:
            pattern = obj.pattern.decode("utf-8")
        if json_options.json_mode == JSONMode.LEGACY:
            return SON([("$regex", pattern), ("$options", flags)])
        return {"$regularExpression": SON([("pattern", pattern), ("options", flags)])}
    if isinstance(obj, MinKey):
        return {"$minKey": 1}
    if isinstance(obj, MaxKey):
        return {"$maxKey": 1}
    if isinstance(obj, Timestamp):
        return {"$timestamp": SON([("t", obj.time), ("i", obj.inc)])}
    if isinstance(obj, Code):
        if obj.scope is None:
            return {"$code": str(obj)}
        return SON([("$code", str(obj)), ("$scope", _json_convert(obj.scope, json_options))])
    if isinstance(obj, Binary):
        return _encode_binary(obj, obj.subtype, json_options)
    if isinstance(obj, bytes):
        return _encode_binary(obj, 0, json_options)
    if isinstance(obj, uuid.UUID):
        if json_options.strict_uuid:
            binval = Binary.from_uuid(obj, uuid_representation=json_options.uuid_representation)
841            return _encode_binary(binval, binval.subtype, json_options)
842        else:
843            return {"$uuid": obj.hex}
844    if isinstance(obj, Decimal128):
845        return {"$numberDecimal": str(obj)}
846    if isinstance(obj, bool):
847        return obj
848    if json_options.json_mode == JSONMode.CANONICAL and isinstance(obj, int):
849        if -(2**31) <= obj < 2**31:
850            return {"$numberInt": str(obj)}
851        return {"$numberLong": str(obj)}
852    if json_options.json_mode != JSONMode.LEGACY and isinstance(obj, float):
853        if math.isnan(obj):
854            return {"$numberDouble": "NaN"}
855        elif math.isinf(obj):
856            representation = "Infinity" if obj > 0 else "-Infinity"
857            return {"$numberDouble": representation}
858        elif json_options.json_mode == JSONMode.CANONICAL:
859            # repr() will return the shortest string guaranteed to produce the
860            # original value, when float() is called on it.
861            return {"$numberDouble": str(repr(obj))}
862    raise TypeError("%r is not JSON serializable" % obj)
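
Two branches of `default` carry most of the subtle logic: the ISO-8601 rendering of datetimes and the 32-bit range check for integers in canonical mode. The sketch below mirrors both with the standard library only, so it can be run without the module. `sketch_iso_date` and `sketch_canonical_int` are hypothetical names, and raising `ValueError` for pre-epoch dates is a simplification assumed here; the function above instead falls back to the `$numberLong` millisecond form.

```python
import datetime

EPOCH_AWARE = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)


def sketch_iso_date(dt):
    """Mirror the ISO8601 branch for datetimes at or after the epoch."""
    if not dt.tzinfo:
        # Naive datetimes are treated as UTC.
        dt = dt.replace(tzinfo=datetime.timezone.utc)
    if dt < EPOCH_AWARE:
        # Simplification: the real code emits a millisecond form here.
        raise ValueError("pre-epoch datetimes are not covered by this sketch")
    off = dt.tzinfo.utcoffset(dt)
    # A zero offset is written as "Z"; anything else uses strftime's +HHMM.
    tz_string = "Z" if off == datetime.timedelta(0) else dt.strftime("%z")
    millis = dt.microsecond // 1000
    # The fractional part is omitted entirely when it is zero.
    fracsecs = ".%03d" % millis if millis else ""
    return {"$date": dt.strftime("%Y-%m-%dT%H:%M:%S") + fracsecs + tz_string}


def sketch_canonical_int(value):
    """Mirror the CANONICAL int branch: the 32-bit range selects $numberInt."""
    if -(2**31) <= value < 2**31:
        return {"$numberInt": str(value)}
    return {"$numberLong": str(value)}


print(sketch_iso_date(datetime.datetime(2020, 1, 2, 3, 4, 5, 678000)))
print(sketch_canonical_int(2**31))
```

Sub-millisecond precision is dropped by the integer division, which matches the millisecond resolution of the datetime branch above.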