-
Notifications
You must be signed in to change notification settings - Fork 1.1k
PYTHON-5309 Ensure AsyncMongoClient doesn't use PyOpenSSL #2286
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ure if this is correct or not, we'll seeeeee)
CONTRIBUTING.md
Outdated
@@ -420,3 +420,21 @@ partially-converted asynchronous version of the same name to the `test/asynchron | |||
Use this generated file as a starting point for the completed conversion. | |||
|
|||
The script is used like so: `python tools/convert_test_to_async.py [test_file.py]` | |||
|
|||
## Running PyMongo with SSL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the motivation for adding this section?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh uh, I was struggling to get it working and asked Noah a bunch of questions, so he suggested I add a section here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm concerned about people seeing this section and misinterpreting it to mean that they should be using tlsAllowInvalidCertificates. Can we remove it or make it very clear that this is only for local testing (using our test certificates) and not production?
pymongo/encryption_options.py
Outdated
@@ -58,6 +58,7 @@ def __init__( | |||
bypass_query_analysis: bool = False, | |||
encrypted_fields_map: Optional[Mapping[str, Any]] = None, | |||
key_expiration_ms: Optional[int] = None, | |||
is_sync: bool = True, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a public api so we can't add "is_sync" here. What's the motivation for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I added is_sync
as a param to a few functions because in ssl_support.get_ssl_context()
, we'll need to know which version of ssl should be used,, is there another way to go about this that is preferred?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, we still can not add "is_sync" here so we need to find another way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One way would be to lazily init the async SSLContext in kms_request().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I did it? Not sure if this is what you meant, but what I have now would leave the params to AutoEncryptOpts unchanged.
I basically delayed the parse_kms_tls_options because that's where is_sync was needed. I believe i appropriately delayed the definition of kms_tls_options but let me know if I missed something. I'm not super familiar with how encryption in the driver works >.<
Can you schedule a run of all the pyopenssl variants/tasks? |
oops yeah, sorry i'll add it to the latest run with the updated evergreen patch changes from steve's PR |
pymongo/network_layer.py
Outdated
BLOCKING_IO_WRITE_ERROR, | ||
_sslConn, | ||
) | ||
from pymongo.pyopenssl_context import _sslConn as _pysslConn |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the motivation behind the renaming to _pysslConn
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When i was making these changes, I started interpreting py* to mean it would refer to pyopenssl (if it existed, and stdlib ssl if it did not) whereas ssl related vars without py in the front meant it was strictly referring to stdlib ssl. I'm not strongly attached to the renaming and wouldn't mind undoing it, but it helped my brain read the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer the old name (_sslConn). A larger refactor could be a good idea but I'd prefer this PR be as minimal as possible to avoid regressions because it's going into a bugfix release.
@@ -57,6 +60,20 @@ | |||
BLOCKING_IO_WRITE_ERROR = _ssl.BLOCKING_IO_WRITE_ERROR | |||
BLOCKING_IO_LOOKUP_ERROR = BLOCKING_IO_READ_ERROR | |||
|
|||
if HAVE_PYSSL: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use Union
types here to remove the need for two separate exports of everything? Something like:
BLOCKING_IO_ERRORS = _ssl.BLOCKING_IO_ERRORS | _pyssl.BLOCKING_IO_ERRORS
Are there any situations where we specifically care if an exception type is from PyOpenSSL or stdlib SSL?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ooo good call out. I don't think we particularly care if its pyopenssl vs stdlib ssl error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think python3.9 didn't love the union types? so i just made them tuples. But its a similar idea.
I couldn't apply this to SSLError though because sometimes we'd raise SSLError so it needed to be one specific type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you expand on why the union types didn't work? We use Union
elsewhere in our type hints.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm okay not sure what i did incorrectly last time, but it works now! sorry about that >.<
(I'm going to guess i handled BLOCKING_IO_ERRORS
incorrectly cuz i believe that's already a tuple of types and the union of two different types of tuples was not good?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait,, how am i supposed to do it then? I thought Union[x, y]
was for type hints??
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That error is from trying to use the pipe |
operator, right? What happens when you use Union[x, y]
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, it works! thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, this is a runtime object, not a type hint. We should be using the same type as ssl.BLOCKING_IO_ERRORS
(most likely a tuple or a list)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, it didn't immediately error (as a runtime object) so i thought it was just something I didn't know about how python works HAHA but okay, changing it back to a tuple now xD
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🥳
CONTRIBUTING.md
Outdated
@@ -420,3 +420,21 @@ partially-converted asynchronous version of the same name to the `test/asynchron | |||
Use this generated file as a starting point for the completed conversion. | |||
|
|||
The script is used like so: `python tools/convert_test_to_async.py [test_file.py]` | |||
|
|||
## Running PyMongo with SSL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm concerned about people seeing this section and misinterpreting it to mean that they should be using tlsAllowInvalidCertificates. Can we remove it or make it very clear that this is only for local testing (using our test certificates) and not production?
That's fair, I've removed it and added it to our onboarding google doc for now. :) |
I would rather keep the section in contributing.md to help any external contributors. Having a warning that this is purely for local testing and should never be done in production seems sufficient to me. |
test/test_encryption.py
Outdated
@@ -168,7 +169,9 @@ def test_init_spawn_args(self): | |||
def test_init_kms_tls_options(self): | |||
# Error cases: | |||
with self.assertRaisesRegex(TypeError, r'kms_tls_options\["kmip"\] must be a dict'): | |||
AutoEncryptionOpts({}, "k.d", kms_tls_options={"kmip": 1}) | |||
opts = AutoEncryptionOpts({}, "k.d", kms_tls_options={"kmip": 1}) | |||
_parse_kms_tls_options(opts._kms_tls_options, _IS_SYNC) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These tests should use the real validation path. Something like:
# Error cases:
opts = AutoEncryptionOpts({}, "k.d", kms_tls_options={"kmip": 1})
with self.assertRaisesRegex(TypeError, r'kms_tls_options\["kmip"\] must be a dict'):
self.mongo_client(auto_encryption_opts=opts)
Alternatively, we can keep the old sync validation behavior in the AutoEncryptionOpts constructor and only re-parse the kms_tls_options in the async _EncryptionIO case.
I'm fine with either option.
That's fine but it's unrelated to this PR so should be done in a different ticket. |
@sleepyStick could you also update the PR title to describe the bug fix? |
test/test_encryption.py
Outdated
with self.assertRaisesRegex(TypeError, r'kms_tls_options\["kmip"\] must be a dict'): | ||
AutoEncryptionOpts({}, "k.d", kms_tls_options={"kmip": 1}) | ||
client = self.rs_or_single_client(auto_encryption_opts=opts) | ||
client.db.coll.insert_one({"encrypted": "test"}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These tests should raise from the AsyncMongoClient constructor, without even attempting to connect. We could even use AsyncMongoClient() directly:
with self.assertRaisesRegex(TypeError, r'kms_tls_options\["kmip"\] must be a dict'):
AsyncMongoClient(auto_encryption_opts=opts)
Could you confirm?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on the test failure, the current approach is a little problematic because the validation error happens after we've created an internal MongoClient. When the validation fails, we leave the client open, leading to a resource warning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahhh yeah, i see now. Thanks for popping over to further explain it, and for working thru the soln with me!
@@ -398,6 +398,7 @@ def __init__(self, client: AsyncMongoClient[_DocumentTypeArg], opts: AutoEncrypt | |||
encrypted_fields_map = _dict_to_bson(opts._encrypted_fields_map, False, _DATA_KEY_OPTS) | |||
self._bypass_auto_encryption = opts._bypass_auto_encryption | |||
self._internal_client = None | |||
opts._kms_ssl_contexts(_IS_SYNC) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's worth adding a comment here to explain that we call this here so that parsing errors can be raised before creating internal clients.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense, added :)
pymongo/encryption_options.py
Outdated
self._kms_ssl_contexts = _parse_kms_tls_options(kms_tls_options) | ||
self._kms_tls_options = kms_tls_options | ||
self._sync_kms_ssl_contexts: dict[str, SSLContext] = None # type:ignore[assignment] | ||
self._async_kms_ssl_contexts: dict[str, SSLContext] = None # type:ignore[assignment] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove type:ignore[assignment]
. Those are bug factories and should only be used in exceptional circumstances.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and fixed
Sorry, unable to cherry-pick to v4.12, please backport manually. Here are approximate instructions:
|
Co-authored-by: Noah Stapp <noah@noahstapp.com>
…2319) Co-authored-by: Iris <58442094+sleepyStick@users.noreply.github.com>
jira: https://jira.mongodb.org/browse/PYTHON-5309
problem: if a user had pyopenssl installed on their machine, the async client wouldnt configure ssl properly
brief description of my changes:
AutoEncryptionOpts
delays the resolution of_kms_ssl_contexts