Decentralized & Fediverse
AT-Protocol and Fediverse moderation — FightCSAM’s #1 target. Our planned Bluesky adapter fills the perceptual-hash gap in hepa and emits to Ozone.
5 projects — 2 use · 1 learn from · 1 reference · 1 out of scope.
Project descriptions are adapted from awesome-safety-tools (maintained by ROOST); the verdicts and analysis are ours. Snapshot: June 2026 — a point-in-time view that complements, and does not replace, their living list.
Automod (hepa)
Use · by Bluesky
Automod passes the raw media bytes to its rules but ships no perceptual-hash hook, so image-similarity matching is the gap it leaves open. FightCSAM's planned AT-Proto adapter slots in here as a hepa blob rule wrapping hashkit + hashkit-match, then emits to Ozone.
A 'rules engine' framework that augments human moderators on the AT Protocol network by proactively identifying patterns of behavior and content. It processes firehose events (new posts, handle changes) via the hepa service daemon, maintains metadata caches and counters, and can fire outcomes like account reports and content labels.
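The matching step that such a blob rule would perform can be sketched as follows. This is a minimal illustration and not the hashkit or hepa API: the function names, the 64-bit hash width, and the distance threshold of 8 are all assumptions chosen for the example, and a real hepa rule would be written in Go against a vetted hash list.

```python
from typing import Iterable, Optional, Tuple


def hamming_distance(a: int, b: int) -> int:
    """Count the bits that differ between two integer-encoded hashes."""
    return bin(a ^ b).count("1")


def best_match(
    candidate: int,
    known_hashes: Iterable[int],
    max_distance: int = 8,  # illustrative threshold, not a recommended value
) -> Optional[Tuple[int, int]]:
    """Return (known_hash, distance) for the closest hash within the
    threshold, or None when nothing is close enough."""
    best: Optional[Tuple[int, int]] = None
    for known in known_hashes:
        distance = hamming_distance(candidate, known)
        if distance <= max_distance and (best is None or distance < best[1]):
            best = (known, distance)
    return best


# Hypothetical hash list and candidate, for illustration only.
known = [0xFFFF0000FFFF0000, 0x0123456789ABCDEF]
near = best_match(0xFFFF0000FFFF0003, known)  # differs from known[0] in 2 bits
far = best_match(0x0, known)                  # 32 bits away from both entries
```

In an adapter of this shape, a non-None result is what would be turned into a label or report outcome; a None result lets the blob pass through untouched.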
Ozone
Use · by Bluesky
Ozone is the natural sink for our AT-Proto adapter: the hepa rule emits labels and reports straight into its queue. We also plan a safemod skin for its reviewer pane so hash-match context lands in front of human reviewers.
A self-hostable web interface for labeling and moderating content on AT Protocol / Bluesky. Moderators triage, escalate, and action reports; apply labels and takedowns to content and accounts; review profiles and post threads (including some removed content) in a reviewer pane; and send templated moderation emails.
FIRES
Learn from · by FediMod
FIRES is the model for advisory-style distribution we want to interoperate with: we plan a FIRES-compatible output for hashstream so Fediverse admins can subscribe to our recommendations and decide for themselves, rather than receiving forced blocks.
A protocol and reference server (Fediverse Intelligence Replication Endpoint Server) for exchanging moderation advisories and recommendations across the Fediverse. Trust & safety teams publish research-backed recommendations that client servers pull and periodically refresh; it is explicitly not designed for creating denylists, leaving final decisions to each moderator.
FediCheck
Reference · by IFTAS
FediCheck operates at the domain/instance layer rather than per-media, so it sits adjacent to our hash-matching work. It's a useful reference for how Fediverse admins consume shared trust-and-safety lists, and a complement to the FIRES-style advisory output we plan.
A Moderation-as-a-Service tool for ActivityPub providers (e.g. Mastodon) that synchronizes a server's domain-level denylist with curated upstream lists such as IFTAS's CARIAD, sparing admins from manually researching and blocking problem domains. After IFTAS wound down operations, it moved toward being open-sourced so anyone can run the service against their own upstream providers.
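The synchronization idea described above reduces to a set difference between what a server currently blocks and what the upstream list recommends. The sketch below is an assumption-laden illustration and not FediCheck's code: `plan_sync`, `SyncPlan`, and the normalization rules are hypothetical names and choices made for this example.

```python
from typing import Iterable, NamedTuple, Set


class SyncPlan(NamedTuple):
    to_block: Set[str]    # upstream domains not yet blocked locally
    to_unblock: Set[str]  # locally synced domains that upstream has dropped


def normalize(domain: str) -> str:
    """Lowercase and strip whitespace and any trailing dot so that
    'Bad.Example.' and 'bad.example' compare equal."""
    return domain.strip().lower().rstrip(".")


def plan_sync(local_managed: Iterable[str], upstream: Iterable[str]) -> SyncPlan:
    """Compute the changes needed to bring the sync-managed part of a
    local denylist in line with an upstream list."""
    local = {normalize(d) for d in local_managed}
    remote = {normalize(d) for d in upstream}
    return SyncPlan(to_block=remote - local, to_unblock=local - remote)


plan = plan_sync(
    local_managed=["bad.example", "old.example"],
    upstream=["Bad.Example", "new.example."],
)
```

Passing in only the entries that the sync tool itself manages, rather than the whole denylist, is a design choice assumed here so that blocks an admin added by hand are never proposed for removal.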
Fediverse Spam Filtering
Out of scope · by Marc Damie
This targets text-spam classification, a different problem from the perceptual-hash media matching FightCSAM focuses on. It's an interesting PoC for Fediverse moderation tooling, but out of scope for our AT-Proto adapter.
A proof-of-concept spam filter for Fediverse platforms (e.g. Mastodon) using a Naive Bayes classifier over status features like content words, spoiler text, media attachments, tags, and sensitivity flags. It exposes REST endpoints for prediction, outlier review, and model import/export, and minimizes admin workload by surfacing outliers and random samples for labeling.
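For readers unfamiliar with the technique, a Naive Bayes classifier over status features can be sketched in a few lines. This is a generic multinomial Naive Bayes with Laplace smoothing, not the project's implementation; the class name, the feature tokens such as `has_media` and `tag:promo`, and the training examples are all invented for illustration.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Set, Tuple


class NaiveBayes:
    """Multinomial Naive Bayes over string features, with Laplace smoothing."""

    def __init__(self, alpha: float = 1.0) -> None:
        self.alpha = alpha
        self.class_counts: Counter = Counter()
        self.feature_counts: Dict[str, Counter] = defaultdict(Counter)
        self.vocabulary: Set[str] = set()

    def fit(self, samples: Iterable[Tuple[List[str], str]]) -> None:
        """Accumulate per-class sample counts and per-class feature counts."""
        for features, label in samples:
            self.class_counts[label] += 1
            self.feature_counts[label].update(features)
            self.vocabulary.update(features)

    def log_scores(self, features: List[str]) -> Dict[str, float]:
        """Return log prior plus summed log likelihoods for each class."""
        total = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        scores: Dict[str, float] = {}
        for label, n_samples in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + self.alpha * vocab_size
            score = math.log(n_samples / total)
            for feature in features:
                if feature in self.vocabulary:  # skip features never seen in training
                    score += math.log((counts[feature] + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, features: List[str]) -> str:
        scores = self.log_scores(features)
        return max(scores, key=scores.get)


# Invented training data mixing content words with non-text status features.
model = NaiveBayes()
model.fit([
    (["buy", "cheap", "pills", "has_media"], "spam"),
    (["cheap", "pills", "tag:promo"], "spam"),
    (["meeting", "notes", "today"], "ham"),
    (["lunch", "today"], "ham"),
])
```

Treating flags such as media attachments or tags as extra tokens alongside content words is one simple way to fold non-text status features into the same model; whether the project encodes them this way is not stated in the description above.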
Investigation & signal-sharing
Threat-signal sharing and investigation tooling. Meta ThreatExchange / python-threatexchange set the bar for hashstream; disinformation and platform-o
Datasets & benchmarks
Training and evaluation datasets. We anchor promptshield’s evaluation to NVIDIA Aegis 2.0 and borrow Tattle / Uli annotation methodology; the rest are