Tracking progress on the SHM content archiving tools. Updated as work progresses.

    🎉 Phase 1 Complete!

      The web page archiver is working! Key features:

      ✅ HTML-to-SHM conversion with hierarchy inference

      ✅ Image upload to IPFS

      ✅ Dedicated archive keys (one per source)

      ✅ Archive profile creation with metadata

      ✅ Provenance tracking (source URL, timestamp, tool)

    Live Demo Archives

    CLI Usage

      # Create a dedicated archive identity
      shm-archive create-archive web "example-site" \
        --description "Archived articles from Example" \
        --source-url "https://example.com"
      
      # Archive a URL to that identity
      shm-archive url "https://example.com/article" \
        --key archive-web-example-site
      
      # Or test parsing without publishing
      shm-archive test-parse "https://example.com/article"

    Technical Achievement: Hierarchy Inference

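      The hardest problem: HTML headings are flat siblings with no explicit nesting, but SHM requires a tree structure. The archiver solves this by tracking heading levels with a stack, so that content nests under the most recent heading above it.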
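
The stack approach can be illustrated with a minimal sketch. This is not the code in src/html-to-shm.js. The function name `inferHierarchy` and the block shape (`type`, `level`, `text`, `children`) are hypothetical and chosen only for illustration. The idea is that each new heading pops the stack until the top entry is a strictly shallower heading, then nests under it.

```javascript
// Illustrative sketch only: the block shape below is an assumption,
// not the actual SHM schema or the archiver's internal format.
function inferHierarchy(blocks) {
  // A virtual root at level 0 sits below every real heading (h1 = level 1),
  // so it is never popped off the stack.
  const root = { level: 0, text: "root", children: [] };
  const stack = [root]; // the top of the stack is the current parent

  for (const block of blocks) {
    if (block.type === "heading") {
      // Pop until the top of the stack is a strictly shallower heading.
      while (stack[stack.length - 1].level >= block.level) {
        stack.pop();
      }
      const node = { level: block.level, text: block.text, children: [] };
      stack[stack.length - 1].children.push(node);
      stack.push(node);
    } else {
      // Non-heading content attaches to the most recent heading
      // and is never pushed, so it cannot become a parent.
      stack[stack.length - 1].children.push({ text: block.text, children: [] });
    }
  }
  return root.children;
}

// Flat input, as it might come out of an HTML parser (hypothetical data).
const flat = [
  { type: "heading", level: 1, text: "Title" },
  { type: "paragraph", text: "Intro" },
  { type: "heading", level: 2, text: "Section A" },
  { type: "paragraph", text: "Body A" },
  { type: "heading", level: 3, text: "Detail" },
  { type: "heading", level: 2, text: "Section B" },
];

const tree = inferHierarchy(flat);
console.log(JSON.stringify(tree, null, 2));
```

In this sketch a skipped level (for example an h3 directly under an h1) nests under the nearest shallower heading rather than raising an error, which is one reasonable choice for messy real-world HTML.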


    Next: Phase 2 - More Sources

      ⬜ Wikipedia archiver (MediaWiki API)

      ⬜ Social media (Twitter, Mastodon)

      ⬜ Web UI for triggering archives

      ⬜ Batch archiving support

    Source Code

      Tool: ~/shm-web-archiver/

      Key files:

      • src/html-to-shm.js - HTML parsing and hierarchy inference

      • src/image-uploader.js - IPFS image upload

      • src/archive-key.js - Dedicated key management

      • bin/archive.js - CLI interface