Skip to content

Conversation

@sunmy2019
Copy link
Member

@sunmy2019 sunmy2019 commented Oct 3, 2023

@sunmy2019 sunmy2019 added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Oct 3, 2023
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @sunmy2019 for commit 19d1301 🤖

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Oct 3, 2023
@pablogsal
Copy link
Member

Thanks a lot for the PR @sunmy2019 !

@sunmy2019 sunmy2019 added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Oct 3, 2023
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @sunmy2019 for commit 60b4666 🤖

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Oct 3, 2023
@sunmy2019 sunmy2019 added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Oct 4, 2023
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @sunmy2019 for commit 19c68e2 🤖

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Oct 4, 2023
@lysnikolaou
Copy link
Member

I'm wondering why we cannot do this in the tokenizer directly. Wouldn't something like this work?

cpython on  main [!] via C v15.0.0-clang via 🐍 pyenv 3.11.3 (venv) took 8s 
❯ git diff
diff --git a/Parser/tokenizer.c b/Parser/tokenizer.c
index 41d0d16a47..504bc9bed9 100644
--- a/Parser/tokenizer.c
+++ b/Parser/tokenizer.c
@@ -2639,6 +2639,12 @@ tok_get_fstring_mode(struct tok_state *tok, tokenizer_mode* current_tok, struct
     tok->first_lineno = tok->lineno;
     tok->starting_col_offset = tok->col_offset;
 
+    int in_format_spec = (
+            current_tok->last_expr_end != -1
+            &&
+            INSIDE_FSTRING_EXPR(current_tok)
+    );
+
     // If we start with a bracket, we defer to the normal mode as there is nothing for us to tokenize
     // before it.
     int start_char = tok_nextc(tok);
@@ -2655,6 +2661,10 @@ tok_get_fstring_mode(struct tok_state *tok, tokenizer_mode* current_tok, struct
             return tok_get_normal_mode(tok, current_tok, token);
         }
     }
+    else if (start_char == '}' && in_format_spec) {
+        tok_backup(tok, start_char);
+        return tok_get_normal_mode(tok, current_tok, token);
+    }
     else {
         tok_backup(tok, start_char);
     }
@@ -2726,11 +2736,6 @@ tok_get_fstring_mode(struct tok_state *tok, tokenizer_mode* current_tok, struct
             end_quote_size = 0;
         }
 
-        int in_format_spec = (
-                current_tok->last_expr_end != -1
-                &&
-                INSIDE_FSTRING_EXPR(current_tok)
-        );
         if (c == '{') {
             int peek = tok_nextc(tok);
             if (peek != '{' || in_format_spec) {

Didn't thoroughly test it, but one pass through the tests only shows one failure that's got to do with the error message when we have a lambda in the expression part.

@sunmy2019
Copy link
Member Author

sunmy2019 commented Oct 4, 2023

I'm wondering why we cannot do this in the tokenizer directly.

I have no objection to this.

One thing to note: the same special handling of empty str constants is at the other parts of the code.

If your idea is adopted, we'd better clean up those special handling.

@sunmy2019
Copy link
Member Author

I'd propose we land this first and backport to 3.12.

Then maybe we can optimize further on the 3.13 branch.

@lysnikolaou
Copy link
Member

I'd propose we land this first and backport to 3.12.

Then maybe we can optimize further on the 3.13 branch.

Agreed. Let's land this and then iterate on it in main.

Copy link
Member

@lysnikolaou lysnikolaou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@pablogsal
Copy link
Member

I think we need to regenerate some files (run make regen-all)

@pablogsal pablogsal merged commit 2cb62c6 into python:main Oct 5, 2023
@sunmy2019 sunmy2019 deleted the gh-110309 branch October 5, 2023 15:35
Glyphack pushed a commit to Glyphack/cpython that referenced this pull request Sep 2, 2024
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

empty string constant in f-string format_spec

4 participants