If you're a startup running your own crawlers to gather data for whatever purpose, you should try really hard not to make the world a worse place by driving up costs for the sites you are scraping.
There's really no excuse for crawling Wikipedia ("65% of our most expensive traffic comes from bots") when they offer a comprehensive collection of bulk download options.
Do better!
Recent articles
- My Lethal Trifecta talk at the Bay Area AI Security Meetup - 9th August 2025
- The surprise deprecation of GPT-4o for ChatGPT consumers - 8th August 2025
- GPT-5: Key characteristics, pricing and model card - 7th August 2025