I noticed that Wikipedia blocks the default HTTPX user-agent, as explained here: https://foundation.wikimedia.org/wiki/Policy:Wikimedia_Foundation_User-Agent_Policy
For example:
llm -f https://en.wikipedia.org/wiki/1988_World_Snooker_Championship 'top 3 winners'
Returns:
httpx.HTTPStatusError: Client error '403 Forbidden' for url 'https://en.wikipedia.org/wiki/1988_World_Snooker_Championship' For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/403
Setting a custom user-agent similar to this should fix it:
User-Agent: CoolBot/0.0 (https://example.org/coolbot/; coolbot@example.org) generic-library/0.0
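As a rough sketch of what the fix could look like on the HTTPX side, the snippet below builds a User-Agent in the shape shown above and passes it with the request. The helper names `build_user_agent` and `fetch` are hypothetical, the CoolBot values are placeholders taken from the policy example, and nothing here reflects how `llm` actually loads URLs for `-f`.

```python
def build_user_agent(tool, version, url, contact, library):
    """Format a User-Agent string in the shape of the policy example."""
    return f"{tool}/{version} ({url}; {contact}) {library}"


# Placeholder values from the policy example; a real tool would use its own
# name, project URL, and contact address.
USER_AGENT = build_user_agent(
    "CoolBot",
    "0.0",
    "https://example.org/coolbot/",
    "coolbot@example.org",
    "generic-library/0.0",
)


def fetch(url, user_agent=USER_AGENT):
    """Fetch a URL with an explicit User-Agent instead of the HTTPX default."""
    # Imported lazily so build_user_agent() works without httpx installed.
    import httpx

    response = httpx.get(
        url,
        headers={"User-Agent": user_agent},
        follow_redirects=True,
    )
    response.raise_for_status()
    return response.text
```

Calling `fetch("https://en.wikipedia.org/wiki/1988_World_Snooker_Championship")` would then send the custom header rather than the default one.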