Skip to content

Conversation

@faresbakhit
Copy link
Contributor

@faresbakhit faresbakhit commented Jul 10, 2022

Change Summary

  • Add percent_encode function to pydantic.utils
  • Add tests for percent_encode
  • Fix AnyUrl.build
  • Add new tests for AnyUrl.build

Related issue number

fix #3061

Checklist

  • Unit tests for the changes exist
  • Tests pass on CI and coverage remains at 100%
  • Documentation reflects the changes where applicable, not applicable
  • changes/<pull request or issue id>-<github username>.md file added describing change
  • My PR is ready to review

@faresbakhit
Copy link
Contributor Author

please review

Copy link
Member

@samuelcolvin samuelcolvin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good in principle, can we get some benchmarks to see how much this slows down validation

@samuelcolvin
Copy link
Member

please update.

@github-actions github-actions bot added awaiting author revision awaiting changes from the PR author and removed ready for review labels Aug 8, 2022
@faresbakhit
Copy link
Contributor Author

Looks good in principle, can we get some benchmarks to see how much this slows down validation

I'm not sure how this affects validation or how to benchmark pydantic's validation, can you assist me on this?

@samuelcolvin
Copy link
Member

Thanks so much for the benchmarks, I think we should go with from urllib.parse import quote_plus for the following reasons:

  • keep it simple, minimise our own logic
  • I would guess most URLs will be short enough that's quote_plus will be marginally quicker

I'm not sure how this affects validation or how to benchmark pydantic's validation, can you assist me on this?

I think we can skip this, we have to get this right, so there's no option of not quote encoding.

The real question long terms is whether we should move this logic to rust, but provided the API doesn't change we could do that in a minor release after V2.

@faresbakhit
Copy link
Contributor Author

faresbakhit commented Aug 13, 2022

Okay, I removed pydantic.utils.percent_encode in the latest commit.

The real question long terms is whether we should move this logic to rust, but provided the API doesn't change we could do that in a minor release after V2.

I think this should be moved to rust since urllib.parse is just pure python.

I will drop some benchmarks later for urllib.prase.quote against percent-encoding and urlencoding

Edit:

Here's the benchmark, code:

Python vs Rust percent encoding benchmark

This is a 12%ish increase

Copy link
Member

@samuelcolvin samuelcolvin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good, my main concerns are:

  • this could in theory be considered a breaking change, I think it's fine because it's pretty clear than without this the built URLs are "wrong", @PrettyWood do you agree?
  • there's no way to set quote_plus on a URL field, other than stricturl - maybe that's fine?

please can you add to the docs:

  • An example of this in action
  • an example with stricturl of using quote_plus=False (or whatever the non-default is)
  • a "note" admonition saying "percent encoding was added in V1.10 ..."

Please update.

path: Optional[str] = None,
query: Optional[str] = None,
fragment: Optional[str] = None,
plus: bool = False,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
plus: bool = False,
quote_plus: bool = False,

might be a clearer name.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, is there a good reason not to have quote_plus = True as the default?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add quote_plus to stricturl

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, is there a good reason not to have quote_plus = True as the default?

Just conforming to the standard (RFC 3986)

@samuelcolvin
Copy link
Member

Also, @FaresAhmedb please remember to use the magic comment "please review" (with one space) to request a review, otherwise we might miss updates on the PR.

@faresbakhit
Copy link
Contributor Author

please review

@pydantic-hooky pydantic-hooky bot added ready for review and removed awaiting author revision awaiting changes from the PR author labels Aug 15, 2022
Copy link
Member

@samuelcolvin samuelcolvin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good, just a few small things here, but you need to fix linting in the docs so build succeeds and we can check the docs look right.

Please update.

@pydantic-hooky pydantic-hooky bot added awaiting author revision awaiting changes from the PR author and removed ready for review labels Aug 16, 2022
@faresbakhit
Copy link
Contributor Author

please review

@pydantic-hooky pydantic-hooky bot added ready for review and removed awaiting author revision awaiting changes from the PR author labels Aug 17, 2022
@samuelcolvin samuelcolvin enabled auto-merge (squash) August 19, 2022 09:02
@samuelcolvin
Copy link
Member

Thanks so much.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

*Url.build functions doesn't percent-encode arguments

4 participants