How the catalog stays honest.
The site has one rule that overrides every other consideration: nothing in the catalog is fabricated. Every actor, every alias, every timeline event, and every indicator either has a verifiable citation or doesn't exist on the site.
Actor records
Each actor record is anchored on a primary naming source — most often a G#### MITRE ATT&CK group ID. Aliases carry explicit source attribution (mitre, microsoft, crowdstrike, mandiant, other) so a CTI engineer reading the page can immediately see which vendor uses which name. Descriptions stick to what the public record says; we don't synthesize attribution beyond what the cited publisher claims.
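As a rough illustration of the record shape described above, here is a minimal Python sketch. The class and field names (Alias, ActorRecord, primary_source_id, names_by_source) are assumptions made for this example, not the site's actual schema; only the five source labels come from the text.

```python
from dataclasses import dataclass, field

# Source labels listed in the text above.
ALIAS_SOURCES = {"mitre", "microsoft", "crowdstrike", "mandiant", "other"}


@dataclass(frozen=True)
class Alias:
    name: str
    source: str  # which vendor uses this name

    def __post_init__(self):
        # Reject aliases that do not say where the name comes from.
        if self.source not in ALIAS_SOURCES:
            raise ValueError(f"unknown alias source: {self.source!r}")


@dataclass
class ActorRecord:
    primary_name: str
    primary_source_id: str  # hypothetical field, e.g. a G#### ATT&CK group ID
    aliases: list[Alias] = field(default_factory=list)

    def names_by_source(self) -> dict[str, list[str]]:
        """Group alias names by the vendor that uses them."""
        grouped: dict[str, list[str]] = {}
        for alias in self.aliases:
            grouped.setdefault(alias.source, []).append(alias.name)
        return grouped
```

Making the source a required, validated field means an alias without attribution cannot be constructed at all, which is one way to keep the "which vendor uses which name" promise mechanical rather than editorial.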
Timeline events
Each event row carries a publisher and a URL pointing at the publisher's own statement — a CISA advisory page, a DOJ press release, an NCSC attribution notice, a vendor blog post. The occurred_on date is the date the event itself happened (an intrusion was disclosed, an indictment was unsealed, sanctions were imposed), not the date we added it to the catalog. When the exact day is ambiguous, we use the cited source's publication date and note that in the seed comments.
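The date rule above could be sketched roughly as follows. Only occurred_on is a name taken from the text; TimelineEvent, make_event, and the date_note field are hypothetical (the text says the note lives in seed comments, so a field is a stand-in here).

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional
from urllib.parse import urlparse


@dataclass(frozen=True)
class TimelineEvent:
    title: str
    publisher: str
    url: str
    occurred_on: date
    date_note: str = ""  # hypothetical stand-in for a seed comment


def make_event(title: str, publisher: str, url: str,
               event_date: Optional[date], published_on: date) -> TimelineEvent:
    """Build an event row, refusing rows that lack a citation."""
    parsed = urlparse(url)
    if not publisher or parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("every event needs a publisher and a citation URL")
    if event_date is not None:
        # Normal case: the day the event itself happened.
        return TimelineEvent(title, publisher, url, event_date)
    # Exact day is ambiguous: fall back to the source's publication date.
    return TimelineEvent(title, publisher, url, published_on,
                         "exact day ambiguous; using source publication date")
```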
Indicators of compromise
IOCs are where a sloppy tracker most easily loses CTI credibility. The seed catalog ships a small, deliberately conservative set: every value is something we could quote verbatim from a public CISA advisory, vendor disclosure, or well-known incident write-up. We'd rather have one rock-solid IOC for an actor than ten plausible-sounding guesses. The underlying data model is built for STIX/MISP-style ingestion, so the catalog can grow from public feeds.
Display defangs domains, IPs, and URLs by default (evil.com becomes evil[.]com) so a casual click can't navigate to a malicious target. The underlying value is stored raw; copy-to-clipboard returns the raw value for use in downstream tools.
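A minimal sketch of the store-raw, display-defanged split, in Python. The function and class names are hypothetical, and rewriting the scheme to hxxp is a common defanging convention assumed here; the text itself only specifies the evil.com to evil[.]com transformation.

```python
import re
from dataclasses import dataclass


def defang(value: str) -> str:
    """Return a display-safe form of a domain, IP, or URL."""
    # Assumption: neutralize http/https schemes using the hxxp convention.
    value = re.sub(r"^http(s?)://", r"hxxp\1://", value, flags=re.IGNORECASE)
    # Bracket every dot so the string is no longer a clickable target.
    return value.replace(".", "[.]")


@dataclass(frozen=True)
class Indicator:
    value: str  # stored raw, exactly as published

    @property
    def display(self) -> str:
        # What the page renders.
        return defang(self.value)

    def clipboard_value(self) -> str:
        # What copy-to-clipboard hands to downstream tools.
        return self.value
```

Defanging only at render time keeps the stored value usable for matching and export, so nothing downstream has to reverse the transformation.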
Ingested events
Events from automated ingestion (CISA, Microsoft Threat Intelligence, Google Cloud Threat Intelligence / Mandiant) are attributed to actors via a deliberately narrow keyword match: a whole-word, case-insensitive match on the actor's primary name or on any alias of four or more characters. We don't do substring matching, and short, ambiguous tokens (the dukes, etc.) are blocklisted. False attribution is worse than missing data, so we err toward false negatives.
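The matching rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the production matcher: the names below are hypothetical, the blocklist holds only the one example the text gives, and the length floor is applied to aliases but not to the primary name, which is one reading of the rule.

```python
import re

MIN_ALIAS_LEN = 4
BLOCKLIST = {"the dukes"}  # the text's example; the real list is not shown here


def usable_terms(primary: str, aliases: list[str]) -> list[str]:
    """Primary name plus aliases that pass the length floor and blocklist."""
    terms = [primary] + [a for a in aliases if len(a) >= MIN_ALIAS_LEN]
    return [t for t in terms if t.lower() not in BLOCKLIST]


def matches(term: str, text: str) -> bool:
    # The lookarounds require a non-word character (or the string edge) on
    # both sides, so "APT29" does not match inside "APT290".
    pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def attribute(text: str, actors: dict[str, list[str]]) -> set[str]:
    """Return the primary names of actors whose terms appear in the text."""
    return {
        primary
        for primary, aliases in actors.items()
        if any(matches(t, text) for t in usable_terms(primary, aliases))
    }
```

An event that matches nothing simply stays unattributed, which is the false-negative bias the rule is designed to have.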
When the Claude summarization client is configured, the raw publisher description is rewritten into a neutral, factual blurb before it lands on the public timeline. The summary is constrained by a JSON schema enforced by the API — prompt injection in a feed body cannot produce arbitrary markup or instructions. Failures fall back to the publisher's headline verbatim.
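The fallback behavior might look roughly like this in Python. No real API call is shown: the summarizer is passed in as a plain callable, and the function name, the "summary" field name, and the local shape check are assumptions layered on top of whatever schema enforcement the API itself performs.

```python
import json
from typing import Callable


def public_blurb(headline: str, description: str,
                 summarize: Callable[[str], str]) -> str:
    """Return a summary of the description, or the headline on any failure.

    `summarize` is a hypothetical callable that takes the raw publisher
    description and returns a JSON string like {"summary": "..."}.
    """
    try:
        payload = json.loads(summarize(description))
        summary = payload["summary"]
    except Exception:
        # Client errors, malformed JSON, or an unexpected shape all
        # fall back to the publisher's headline verbatim.
        return headline
    if not isinstance(summary, str) or not summary.strip():
        return headline
    return summary.strip()
```

Treating every failure mode the same way keeps the public timeline populated with publisher-authored text even when summarization is unavailable.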
What we don't do
- We don't invent IOCs to fill out actor pages. Empty indicator panels stay empty.
- We don't synthesize attribution. If a publisher says “suspected,” the event description says “suspected.”
- We don't scrape. Every source is a published RSS or JSON feed.
- We don't track every cluster ever named. Curation favors actors with multiple-vendor public attribution and stable published reporting.
Read the about page for the high-level pitch, or jump into the catalog.