Switch books page lists to solr from db #11187

benbdeitch · 2025-08-24T19:46:34Z

This PR enables Edition pages to fetch the lists that a given edition belongs to from Solr in a single query, rather than performing multiple queries against the infogami database. This should improve loading times. It also allows us to display only non-trivial lists on these pages (i.e., lists that contain more than a single entry).

Technical

Testing

Step 1: Create two new lists on your local environment. One with two books, and one with only one book.
Step 2: Navigate to the page of the book they share in common, and examine the Lists section.
Step 3: If only one list is shown on that page, then the code is behaving properly.

Screenshot

Stakeholders

@cdrini

cdrini

Nice work @benbdeitch ! A few comments, but one core blocker. One of the goals of the issue is to fix "We have to query twice against the DB and show 2 rows base on lists including the Work v. Edition". Looking at the code, I think we might need to rename/tweak _get_lists_solr_uncached and friends, since this requires taking advantage of a new type of solr query to search for both work and edition keys. Eg something like:

@public
def get_book_lists(work_key: str, edition_key: str) -> list[List]:
    q = f'seed:("{work_key}" OR "{edition_key}")'
    # ... the query you have now in `_get_lists_solr_uncached`

That would let us achieve the stated goal: the partials.py code can switch to this method and call render_template("lists/widget", ...) only once with the results!

openlibrary/core/models.py

openlibrary/plugins/worksearch/code.py

cdrini · 2025-10-10T18:36:44Z

openlibrary/templates/lists/widget.html

 $ seed_info = get_seed_info(page)
 $ user_lists = [] if async_load or not ctx.user else get_user_lists(seed_info)
-$ page_lists = get_page_lists(page, seed_info)
+$ page_lists = get_book_lists(page.key.split('/')[-1], None) if work else get_book_lists(None, page.key.split('/')[-1])


Ahh, I think this is the issue! We don't need to split the keys; the entire key is stored in solr, eg /works/OL3261646W
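
To illustrate the point about keys, here is a minimal sketch using the example key from the comment above (the variable names are hypothetical). Splitting drops the /works/ prefix, so the resulting value no longer equals the full key that solr stores:

```python
# Example key from the comment above; solr stores the full key as the seed value.
full_key = "/works/OL3261646W"

# What the template currently passes along: only the trailing identifier.
short_id = full_key.split("/")[-1]

print(short_id)              # OL3261646W
print(short_id == full_key)  # False
```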

cdrini · 2025-10-10T18:37:29Z

openlibrary/templates/lists/widget.html

 $ seed_info = get_seed_info(page)
 $ user_lists = [] if async_load or not ctx.user else get_user_lists(seed_info)
-$ page_lists = get_page_lists(page, seed_info)
+$ page_lists = get_book_lists(page.key.split('/')[-1], None) if work else get_book_lists(None, page.key.split('/')[-1])


Oh and we'll want to forward along both, eg get_book_lists(work and work.key, edition and edition.key), and update the method to handle None. That'll let it do the magic of combining both!
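
A rough sketch of what "handle None" could look like when building the seed query. The helper name build_seed_query and the edition key /books/OL1M are made up for illustration and are not part of the PR; the real method would still run the solr query with the resulting string:

```python
from typing import Optional


def build_seed_query(work_key: Optional[str], edition_key: Optional[str]) -> Optional[str]:
    """Build a solr query matching lists seeded with the work and/or the edition.

    Hypothetical helper for illustration; not part of the PR.
    """
    # Drop whichever key is missing (None or empty string).
    keys = [key for key in (work_key, edition_key) if key]
    if not keys:
        # Nothing to search for; the caller can skip the solr request.
        return None
    quoted = " OR ".join(f'"{key}"' for key in keys)
    return f"seed:({quoted})"


# Both keys present: one query covers the work and the edition.
print(build_seed_query("/works/OL3261646W", "/books/OL1M"))
# Only the work is known: the edition is simply left out.
print(build_seed_query("/works/OL3261646W", None))
```

With both keys in a single OR clause, one solr request returns lists that include either the work or the edition, which is what allows a single render_template call.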

cdrini · 2025-10-10T18:37:54Z

openlibrary/plugins/openlibrary/lists.py

+    from openlibrary.plugins.worksearch.code import run_solr_query
+    from openlibrary.plugins.worksearch.schemes.lists import ListSearchScheme
+
+    filter_query = "seed_count:[2 TO *] OR (NOT seed_count:*)"


Now that the solr reindex finished, this is no longer necessary!

Suggested change

-filter_query = "seed_count:[2 TO *] OR (NOT seed_count:*)"
+filter_query = "seed_count:[2 TO *]"
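
For readers skimming the thread, the difference between the two filters can be modelled in plain Python. This is only a sketch of the intended reading, not an emulation of solr's query parser, and it assumes that the NOT clause existed to keep lists that had not yet been reindexed with a seed_count value. The document names are hypothetical:

```python
from typing import Optional


def old_filter(seed_count: Optional[int]) -> bool:
    # Intended reading of: seed_count:[2 TO *] OR (NOT seed_count:*)
    return seed_count is None or seed_count >= 2


def new_filter(seed_count: Optional[int]) -> bool:
    # Intended reading of: seed_count:[2 TO *]
    return seed_count is not None and seed_count >= 2


# Hypothetical list documents; None stands for a document without a seed_count field.
docs = {"two_books": 2, "one_book": 1, "not_reindexed": None}

print([name for name, count in docs.items() if old_filter(count)])  # ['two_books', 'not_reindexed']
print([name for name, count in docs.items() if new_filter(count)])  # ['two_books']
```

Once every list document carries seed_count, the two filters select the same documents, so the shorter one is enough.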

cdrini

Lgtm! Did some refactoring of the lists widget html since it was kind of tricky to work through. The lists look much better now with the seed_count filter, too!

github-actions bot assigned cdrini Aug 24, 2025

github-actions bot added the Priority: 2 Important, as time permits. [managed] label Aug 24, 2025

mekarpeles added this to the Sprint 2025-08 milestone Sep 5, 2025

mekarpeles added Priority: 1 Do this week, receiving emails, time sensitive, . [managed] and removed Priority: 2 Important, as time permits. [managed] labels Sep 8, 2025

mekarpeles modified the milestones: Sprint 2025-08, Sprint 2025-09 Sep 12, 2025

cdrini requested changes Sep 16, 2025

cdrini added the Needs: Submitter Input Waiting on input from the creator of the issue/pr [managed] label Sep 16, 2025

mekarpeles added Priority: 2 Important, as time permits. [managed] and removed Priority: 1 Do this week, receiving emails, time sensitive, . [managed] labels Sep 22, 2025

mekarpeles modified the milestones: Sprint 2025-09, Sprint 2025-10 Oct 8, 2025

github-actions bot removed the Needs: Submitter Input Waiting on input from the creator of the issue/pr [managed] label Oct 9, 2025

cdrini reviewed Oct 10, 2025

benbdeitch added 5 commits October 19, 2025 17:38

Added functions for searching lists by Solr

609b6ad

Implemented fetching lists for editions from Solr.

7f6de6c

Incorporated feedback from code review.

1ee5259

Further changes to partials and widgets, still in testing.

480c6da

Fixed missing quotation mark, other formatting.

7f5593f

benbdeitch force-pushed the Switch-Books-Page-Lists-to-Solr-from-DB branch from 1f466fa to 7f5593f Compare October 20, 2025 17:52

Disentangle lists/carousel from lists/widget + fixes to solr-based li…

56f7292

…st carousel

cdrini force-pushed the Switch-Books-Page-Lists-to-Solr-from-DB branch from 5c5c8ff to 56f7292 Compare December 11, 2025 17:05

cdrini approved these changes Dec 11, 2025

cdrini merged commit 0c475be into internetarchive:master Dec 11, 2025
4 checks passed

cdrini mentioned this pull request Dec 20, 2025

Sort the list carousel by last_modified #11607

Merged
