@internetarchive

Internet Archive

The Internet Archive is "the library of the Internet", and a big supporter of Free Software.

Pinned

  1. openlibrary Public

    One webpage for every book ever published!

    Python · 6.2k stars · 1.8k forks

  2. bookreader Public

    The Internet Archive BookReader

    JavaScript · 1.1k stars · 480 forks

  3. heritrix3 Public

    Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.

    Java · 3.2k stars · 778 forks

  4. cicd Public

    Build and test using the GitHub registry; deploy to Nomad clusters

    22 stars · 1 fork

  5. elements Public

    A web component library from the Internet Archive

    TypeScript · 8 stars

Repositories

Showing 10 of 271 repositories
  • iaux-collection-browser Public

    TypeScript · 8 stars · AGPL-3.0 · 1 fork · 2 issues · 25 pull requests · Updated Mar 12, 2026
  • openlibrary Public

    One webpage for every book ever published!

    Python · 6,246 stars · AGPL-3.0 · 1,815 forks · 764 issues (11 need help) · 169 pull requests · Updated Mar 12, 2026
  • emularity-config Public

    archive.org software emulation

    JavaScript · 6 stars · AGPL-3.0 · 1 fork · 0 issues · 0 pull requests · Updated Mar 11, 2026
  • scholar Public

    IA Scholar

    HTML · 3 stars · AGPL-3.0 · 0 forks · 0 issues · 0 pull requests · Updated Mar 11, 2026
  • wayback-machine-webextension Public

    A web browser extension for Chrome, Firefox, Edge, and Safari 14.

    JavaScript · 781 stars · AGPL-3.0 · 225 forks · 56 issues · 5 pull requests · Updated Mar 11, 2026
  • wiki-references-db Public

    Data models and scripts to build a database of references (broadly defined) that appear on Wikipedia and other wikis

    Python · 7 stars · GPL-3.0 · 0 forks · 3 issues · 0 pull requests · Updated Mar 11, 2026
  • Zeno Public

    State-of-the-art web crawler 🔱

    Go · 393 stars · AGPL-3.0 · 55 forks · 36 issues (2 need help) · 10 pull requests · Updated Mar 11, 2026
  • RevisionChest Public

    Transforms Wikipedia XML dumps into a more compact, stream-friendly format

    Rust · 0 stars · GPL-3.0 · 0 forks · 0 issues · 0 pull requests · Updated Mar 10, 2026
  • wiki-references-extractor Public

    Extracts references from Wikipedia articles

    Python · 6 stars · GPL-3.0 · 2 forks · 1 issue · 0 pull requests · Updated Mar 10, 2026
  • heritrix3 Public

    Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.

    Java · 3,202 stars · 778 forks · 36 issues · 5 pull requests · Updated Mar 10, 2026