
r/bigquery


Built a SQL analyzer with a specific rule set for BigQuery cost traps

BigQuery will happily scan your entire table and bill you for it without a single warning. The dangerous part is the queries that do this aren't obviously wrong. They pass review, they work fine in dev, and then they run in production against a 2TB table and you're looking at a bill you weren't expecting.

The patterns that cause it are pretty consistent. SELECT * when you only needed three columns. No partition filter on a partitioned table. Wildcard table queries with no date range. Joins where the larger table isn't filtered before the join. Unbounded aggregations on wide tables.
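To make the first two patterns concrete, here is a sketch of the trap and the fix. The table and column names are placeholders, not from the tool itself; the idea is that BigQuery bills by bytes scanned, so naming columns and filtering on the partition column directly cuts the bill:

```sql
-- Cost trap: scans every column of every partition
SELECT * FROM project.dataset.events;

-- Cheaper: name only the columns you need and filter on the
-- partition column so BigQuery can prune partitions
SELECT user_id, event_type, created_at
FROM project.dataset.events
WHERE created_at >= TIMESTAMP('2024-01-01')
  AND created_at <  TIMESTAMP('2024-02-01');
```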

Built SlowQL to catch these statically before anything runs. You point it at your SQL files and it flags the cost traps along with security issues, missing WHERE clauses, injection patterns, and more. Works offline, zero dependencies, plugs into CI.

171 rules total.

pip install slowql

github.com/makroumi/slowql

What's the most unexpected BigQuery bill you've seen from a query that looked totally reasonable?




Best way to load Sheets into BigQuery?

We’ve ended up in a pretty common situation where a lot of reporting still starts in Google Sheets, but the sheet itself is becoming the weakest part of the process. People keep editing rows, formulas get copied in strange ways, and every month we spend time figuring out whether a reporting issue is actually a data issue or just another spreadsheet problem. At this point I’m less interested in keeping Sheets “connected” and more interested in moving the data into BigQuery in a cleaner, more controlled way. Not looking for a super heavy solution here - mostly curious what people have found works well when the goal is to treat Sheets as an input source, but not as the place where the reporting logic keeps living.
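One lightweight pattern that fits this goal: define an external table over the sheet, then materialize a managed copy on a schedule so reporting logic reads the controlled table rather than the live spreadsheet. A sketch using BigQuery's standard DDL (dataset, schema, and sheet URI are placeholders for illustration):

```sql
-- External table reading live from the sheet
CREATE OR REPLACE EXTERNAL TABLE reporting.raw_sheet (
  region  STRING,
  revenue NUMERIC
)
OPTIONS (
  format = 'GOOGLE_SHEETS',
  uris = ['https://docs.google.com/spreadsheets/d/SHEET_ID'],
  skip_leading_rows = 1  -- ignore the header row
);

-- Materialize a controlled copy; run this on a schedule so
-- downstream reporting never queries the sheet directly
CREATE OR REPLACE TABLE reporting.sheet_copy AS
SELECT *, CURRENT_TIMESTAMP() AS loaded_at
FROM reporting.raw_sheet;
```

Declaring the schema explicitly (instead of autodetect) also surfaces the "someone typed text into a number column" class of spreadsheet problem at load time rather than in a report.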


BigQuery backup strategies

Hi all – I’m trying to better understand how people actually handle backup and recovery for BigQuery in real environments. Some questions I’d love to hear about from folks running BigQuery in production, especially anyone using GCP table snapshots.

  • Are table snapshots generally “good enough” for backups?

  • Do you care about cross-region backups? Or is regional redundancy within BigQuery typically sufficient for your risk tolerance?

  • What kinds of restore scenarios do you actually see? An entire table, a whole dataset, or only specific records or partitions?

  • How often do you need data older than 7 days? Is restoring older historical states a real need in practice?
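For context on the questions above, the built-in options look roughly like this (dataset and table names are placeholders): snapshots are cheap point-in-time clones you create and keep as long as you like, while time travel only reaches back over the configured window, 7 days by default:

```sql
-- Point-in-time snapshot; storage is billed only for data
-- that later diverges from the base table
CREATE SNAPSHOT TABLE backups.orders_20240101
CLONE prod.orders
FOR SYSTEM_TIME AS OF TIMESTAMP '2024-01-01 00:00:00+00'
OPTIONS (expiration_timestamp = TIMESTAMP '2025-01-01 00:00:00+00');

-- Time travel: query the table as it was 24 hours ago
-- (works only within the time-travel window)
SELECT *
FROM prod.orders
FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR);
```

Anything older than the time-travel window has to come from a snapshot (or an export), which is presumably where the "data older than 7 days" question bites.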

Has anyone used commercial backup tools for BigQuery? If so, what problems were they solving that the built-in features didn’t? Mostly trying to understand what actually happens in practice vs what docs recommend.

Disclaimer: I work for Eon, and I’m trying to learn more about real-world backup/recovery needs for BigQuery users. Not here to pitch anything — genuinely curious about how people approach this. Thanks!