- A summary of my bot defence systems
- Butlerian Jihad - Blog posts on the topic of fighting off spam bots, search engine spiders and other non-humans wasting the precious resources we have on Earth
- EmacsWiki's robots.txt
VirtualTam's bookmarks
2025-04-09 Web page archive formats:
Tools for crawling, scraping and archiving Web pages:
- internetarchive/heritrix3 - Extensible, web-scale, archival-quality web crawler project (Java)
- internetarchive/Zeno - State-of-the-art web crawler (Go)
- internetarchive/gowarc - Read and write WARC files in Go
- webrecorder/pywb - Web Archiving Toolkit for replay and recording of web archives (Python)
Self-hosted solutions:
- ArchiveBox - A self-hosted app that lets you preserve content from websites in a variety of formats
- Wallabag - Save and classify articles, read them later
- BeamNG.tech Technical Paper (2021-06-21)
- Publications
- BeamNG.tech Blog
- Project References
- BeamNG/BeamNGpy - Python API for BeamNG.tech
- BeamNG/beamng-ros2-integration - Robot Operating System integration
2022-06-19 -
2018-09-11
- http://tinysubversions.com/botsummit/2016/
- http://tinysubversions.com/botsummit/2014/
- http://opentranscripts.org/sources/bot-summit/bot-summit-2014/
- https://docs.google.com/document/d/1bka4o1RE9RPUeoUzgpTIKRWsgWHzZEKEADialnv7haQ/view
- https://github.com/dariusk/corpora
- https://brianshumate.com/articles/twitter-bots/
- https://brianshumate.com/projects/index.html#-twitter-robots
2018-09-03 -
2018-04-05 -
2018-02-23 -
2018-01-31 -
2017-08-20 -
2017-08-08 -
2016-08-24 -
2014-04-24 -
2014-01-15