Wayback Machine filters

Wayback Machine filters: content changes, age, snapshot dates, keywords, languages, HTTP codes, CJK, redirects, 403, Website IDs. How to set conditions in the UI.

How Wayback Machine filters work

Wayback Machine filters search domains using data from the Internet Archive (Wayback Machine). They analyze a domain’s history, content, languages, HTTP response codes, and other parameters stored in the archive.

Wayback Machine filters in Karma.Domains

Important: All Wayback Machine filters rely on archive data. If a domain was not crawled by Wayback Machine, these filters cannot match it.

Content changes filter

The Content changes filter selects domains by how many content changes appear in their archive history. The UI has min and max fields, each accepting a value of 0 or higher.

A content change means the page changed by more than 50% compared to the previous archived version in Wayback Machine.

This filter helps find domains with little change (0–3 changes). A large number of changes may indicate ownership changes.
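To make the 50% rule concrete, here is a minimal Python sketch that counts content changes across a list of archived page texts. How Karma.Domains actually measures the difference between two versions is not described here, so the word-level comparison with `difflib.SequenceMatcher` and the function name `count_content_changes` are assumptions used only for illustration.

```python
from difflib import SequenceMatcher


def count_content_changes(snapshots: list[str], threshold: float = 0.5) -> int:
    """Count snapshots that differ from the previous one by more than `threshold`.

    Assumption: "difference" is 1 minus the word-level similarity ratio.
    """
    changes = 0
    for previous, current in zip(snapshots, snapshots[1:]):
        similarity = SequenceMatcher(None, previous.split(), current.split()).ratio()
        if 1.0 - similarity > threshold:
            changes += 1
    return changes


history = [
    "Welcome to our bakery. Fresh bread daily.",
    "Welcome to our bakery. Fresh bread every day.",  # small edit, not counted
    "Casino bonus codes and free spins 2024",         # complete rewrite, counted
]
print(count_content_changes(history))  # 1
```

A result in the 0–3 range would correspond to the "little change" domains mentioned above.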

How to use

  1. Find Content changes in the Wayback Machine block
  2. Enter bounds in min and max
  3. You can fill only min, only max, or both

Examples

Example 1: Actively updated domains (50 or more changes). Set min to 50 and leave max empty.

Example 2: Stable domains (0–5 changes). Set min to 0 and max to 5.

Age filter

The Age filter selects domains by age in years (from archive data). The UI has min and max (years, ≥ 0).

Age is measured from the first content snapshot in Wayback Machine.

You can use this to find aged domains.

How to use

  1. Find Age in the Wayback Machine block
  2. Enter min and max in years
  3. Fill only min, only max, or both

Examples

Example 1: Old domains (10+ years). Set min to 10 and leave max empty.

Example 2: Young domains (0–2 years). Set min to 0 and max to 2.
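The age rule and the optional min/max bounds can be sketched in Python as follows. This is an illustration only: counting age in whole years, treating both bounds as inclusive, and the helper names `age_in_years` and `in_range` are assumptions, not the tool's documented behavior.

```python
from datetime import date
from typing import Optional


def age_in_years(first_snapshot: date, today: date) -> int:
    """Whole years elapsed since the first archived snapshot."""
    years = today.year - first_snapshot.year
    # Subtract one year if this year's anniversary has not been reached yet.
    if (today.month, today.day) < (first_snapshot.month, first_snapshot.day):
        years -= 1
    return years


def in_range(value: int, minimum: Optional[int] = None, maximum: Optional[int] = None) -> bool:
    """Check a value against optional bounds; an empty bound is not enforced."""
    if minimum is not None and value < minimum:
        return False
    if maximum is not None and value > maximum:
        return False
    return True


age = age_in_years(date(2012, 6, 15), today=date(2024, 6, 15))
print(age, in_range(age, minimum=10))  # 12 True
```

The same `in_range` check would apply to the Content changes filter, which also accepts only min, only max, or both.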

First snap filter

First snap selects domains by the date of the first snapshot in the archive. The UI has a date picker and quick buttons.

How to use

  1. Find First snap
  2. Click the field to open the calendar, then pick the start and end dates of the range
  3. Or use quick buttons: Older than 2 years, Older than 5 years, Older than 10 years, Last year

Examples

Example 1: First archived more than 10 years ago. Use the Older than 10 years quick button.

Example 2: First snapshot in a specific year — pick a range in the calendar (e.g. Jan 1–Dec 31 of that year).

Last snap filter

Last snap selects domains by the date of the latest snapshot. The UI has a date picker and quick buttons.

How to use

  1. Find Last snap
  2. Click the field to open the calendar, then pick the start and end dates
  3. Or use quick buttons: Last 7 days, Last 30 days, Last 60 days, This month, Last year

Examples

Example 1: Latest snapshot in the last 30 days — Last 30 days.

Example 2: Snapshot in the previous year — Last year or a calendar range.
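The quick buttons for First snap and Last snap are shortcuts for date ranges. The Python sketch below shows one plausible reading of "Older than N years" and "Last N days". The exact boundaries the UI uses (inclusive or exclusive, calendar years or rolling years) are not stated here, so the function names and the boundary choices are assumptions.

```python
from datetime import date, timedelta


def years_ago(today: date, years: int) -> date:
    """Same calendar day `years` earlier; Feb 29 falls back to Feb 28."""
    try:
        return today.replace(year=today.year - years)
    except ValueError:
        return today.replace(year=today.year - years, day=28)


def older_than_years(snapshot: date, years: int, today: date) -> bool:
    """True if the snapshot date is earlier than `years` years before today."""
    return snapshot < years_ago(today, years)


def within_last_days(snapshot: date, days: int, today: date) -> bool:
    """True if the snapshot date falls in the last `days` days, today included."""
    return today - timedelta(days=days) <= snapshot <= today


today = date(2024, 3, 1)
print(older_than_years(date(2010, 1, 1), 10, today))   # True
print(within_last_days(date(2024, 2, 15), 30, today))  # True
```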

Keywords in content filter

Keywords in content searches domains by keywords in archived page content. The text field shows hints about the AND/OR logic and an Elements counter (max 20); if the limit is exceeded, a Maximum … elements allowed warning appears.

How to use

  1. Find Keywords in content
  2. Enter keywords: comma = AND (all terms required), pipe | = OR (any term)
  3. Watch the Elements counter

Examples

Example 1: Any of several words (OR) — shop | store | buy.

Example 2: Both words (AND) — tech, news.
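The comma and pipe rules can be modelled as a small parser. The Python sketch below is illustrative only. It assumes that a mixed expression is read as an AND of OR-groups, that each keyword counts as one element toward the limit of 20, and that matching is a case-insensitive substring test; none of these details are confirmed by this article, and the names `parse_keywords` and `matches` are hypothetical.

```python
MAX_ELEMENTS = 20  # limit shown by the Elements counter


def parse_keywords(expression: str) -> list[list[str]]:
    """Split on commas into AND-ed groups; terms inside a group are OR-ed via '|'."""
    groups = []
    for part in expression.split(","):
        terms = [term.strip().lower() for term in part.split("|") if term.strip()]
        if terms:
            groups.append(terms)
    if sum(len(group) for group in groups) > MAX_ELEMENTS:
        raise ValueError("too many keyword elements")
    return groups


def matches(content: str, groups: list[list[str]]) -> bool:
    """Every group must be satisfied by at least one of its terms."""
    text = content.lower()
    return all(any(term in text for term in group) for group in groups)


print(parse_keywords("shop | store | buy"))  # [['shop', 'store', 'buy']]
print(parse_keywords("tech, news"))          # [['tech'], ['news']]
print(matches("Daily tech news digest", parse_keywords("tech, news")))  # True
```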

Language Filters

Language Filters select domains by content languages in the archive, with an optional minimum share per language. The UI has a searchable language dropdown, a percentage field, and an Add button. Added languages appear as badges (click × to remove one), and Clear All removes them all. A domain matches if any listed language condition is met.

How to use

  1. Find Language Filters
  2. Pick a language; optionally set minimum % (0–100%)
  3. Click Add. To remove a language, click × on its badge; Clear All removes all of them

Examples

Example 1: English content with no threshold — English (en), Add.

Example 2: At least 50% Russian — Russian (ru), 50%, Add.

Example 3: Several languages (any can match) — e.g. English 30%+, German 20%+, French with no %.
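The "any condition can match" rule from Example 3 can be sketched in Python. The data shape (a mapping from language code to percentage share) and the function name `matches_languages` are assumptions for illustration; a condition without a percentage is treated here as "the language is present at all".

```python
from typing import Optional


def matches_languages(
    shares: dict[str, float],
    conditions: list[tuple[str, Optional[float]]],
) -> bool:
    """Return True if at least one (language, minimum %) condition is met."""
    for language, minimum in conditions:
        share = shares.get(language, 0.0)
        required = minimum or 0.0  # no threshold means any nonzero share
        if share > 0 and share >= required:
            return True
    return False


# English 30%+, German 20%+, French with no threshold
conditions = [("en", 30), ("de", 20), ("fr", None)]
print(matches_languages({"en": 10, "fr": 5}, conditions))   # True (French present)
print(matches_languages({"en": 10, "de": 15}, conditions))  # False
```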

Server Code Share filter

Server Code Share finds domains by the share of specific HTTP status codes in Wayback history. You can set a code and its percentage.

A 100% share of “200” responses can surface domains whose history had no server errors, redirects, or access-denied responses — often a sign of stable past behavior.

How to use

  1. Find Server Code Share in Wayback filters
  2. Enter minimum % in min % (0–100)
  3. Enter maximum % in max % (0–100)
  4. Click Select Code to pick an HTTP code (enabled after you enter a percentage)
  5. Choose the code from the list
  6. Clear resets the selected code

Available HTTP codes

  • 200 — OK
  • 301 — Moved Permanently
  • 302 — Found
  • 307 — Temporary Redirect
  • 308 — Permanent Redirect
  • 400 — Bad Request
  • 401 — Unauthorized
  • 403 — Forbidden
  • 404 — Not Found
  • 500 — Internal Server Error
  • 502 — Bad Gateway
  • 503 — Service Unavailable

Select Code activates after at least one of min % or max % is set. Clear clears the selected code.

Examples

Example 1: At least 80% “200 OK” — min % 80, Select Code, choose 200.

Example 2: 50–100% “404” errors — min % 50, max % 100, code 404.

Example 3: Widest range for 301 redirects. Set min % to 0 and max % to 100 (Select Code stays disabled until a percentage is entered), then choose 301.
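The share of a status code can be computed as the percentage of archived responses that returned it. The Python sketch below assumes the share is counted per snapshot and that both bounds are inclusive; the article does not specify either point, and the function names are hypothetical.

```python
from collections import Counter
from typing import Optional


def code_share(status_codes: list[int], code: int) -> float:
    """Percentage of responses in the history that returned `code`."""
    if not status_codes:
        return 0.0
    return 100.0 * Counter(status_codes)[code] / len(status_codes)


def matches_code_share(
    status_codes: list[int],
    code: int,
    min_pct: Optional[float] = None,
    max_pct: Optional[float] = None,
) -> bool:
    """Check the share of `code` against optional min % and max % bounds."""
    share = code_share(status_codes, code)
    if min_pct is not None and share < min_pct:
        return False
    if max_pct is not None and share > max_pct:
        return False
    return True


history = [200] * 8 + [404, 301]  # ten archived responses
print(code_share(history, 200))                                     # 80.0
print(matches_code_share(history, 200, min_pct=80))                 # True
print(matches_code_share(history, 404, min_pct=50, max_pct=100))    # False
```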

Hieroglyphs (CJK) filter

Hieroglyphs (CJK) finds domains that do or do not contain CJK content (Chinese, Japanese, Korean).

If you are not targeting Asian-language domains, excluding such content is often a good idea — historically many such domains used aggressive promotion tactics.

How to use

  1. Find Hieroglyphs (CJK)
  2. Three-state control:
    • Unknown (forbidden icon) — filter off
    • Yes (checkmark) — only domains with CJK content
    • No (cross) — only domains without CJK content

Details

  • The filter checks for CJK characters in content
  • Detection uses patterns from Wayback data
  • Useful for including or excluding Asian-language content
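A character-level CJK check can be sketched with a regular expression over Unicode ranges. The patterns Karma.Domains actually uses are not documented here, so the ranges below (Hiragana, Katakana, CJK Unified Ideographs with Extension A, and Hangul Syllables) and the name `has_cjk` are assumptions chosen for illustration.

```python
import re

CJK_PATTERN = re.compile(
    "["
    "\u3040-\u30ff"  # Hiragana and Katakana
    "\u3400-\u4dbf"  # CJK Unified Ideographs Extension A
    "\u4e00-\u9fff"  # CJK Unified Ideographs
    "\uac00-\ud7af"  # Hangul Syllables
    "]"
)


def has_cjk(text: str) -> bool:
    """True if the text contains at least one character from the ranges above."""
    return CJK_PATTERN.search(text) is not None


print(has_cjk("Hello, world"))  # False
print(has_cjk("東京 hotels"))    # True
```

A production check would probably require more than a single character, for example a minimum share of CJK characters, to avoid flagging pages that merely quote a foreign name.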

Examples

Example 1: Domains with CJK content — set to Yes.

Example 2: Domains without CJK — set to No.

Redirects (30x) filter

Redirects (30x) finds domains that do or do not have redirects (HTTP 301, 302, 307, 308) in Wayback history to other domains. Same-domain redirects (e.g. http to https) do not count.

Redirects to other domains can indicate black-hat SEO, but also ownership changes (registrar parking), moves to another domain, or hosting issues (redirect to host page).

How to use

  1. Find Redirects (30x)
  2. Three-state control: Unknown, Yes (only with cross-domain redirects), No (only without)

Details

  • Checks for 301, 302, 307, 308 in domain history
  • Only cross-domain redirects count
  • May reflect ownership changes, migrations, or SEO tactics
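The cross-domain rule can be sketched by comparing the host of the archived URL with the host of the redirect target. This Python sketch is illustrative: treating `www.` as the same domain is an assumption, subdomains other than `www.` are compared literally, and the function names are hypothetical.

```python
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = {301, 302, 307, 308}


def normalized_host(url: str) -> str:
    """Lowercased host name without a leading 'www.'."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def is_cross_domain_redirect(status: int, source_url: str, target_url: str) -> bool:
    """True only for 30x responses whose target is on a different host."""
    if status not in REDIRECT_CODES:
        return False
    # Resolve relative targets such as "/new-page" against the source URL.
    absolute_target = urljoin(source_url, target_url)
    return normalized_host(source_url) != normalized_host(absolute_target)


print(is_cross_domain_redirect(301, "http://example.com/", "https://example.com/"))          # False
print(is_cross_domain_redirect(302, "http://example.com/", "https://parking.example.net/"))  # True
```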

Examples

Example 1: Domains with redirects — Yes.

Example 2: Domains without redirects — No.

Error 403 filter

Error 403 finds domains that do or do not have HTTP 403 (Forbidden) responses in Wayback history.

403 often means the site owner blocked Wayback from crawling — sometimes a sign of black-hat SEO practices.

How to use

  1. Find Error 403
  2. Set the three-state control: Unknown (filter off), Yes (only with 403), No (only without)

Details

  • Detects HTTP 403 in history
  • May indicate access restrictions, protected content, or SEO-related blocking

Examples

Example 1: Domains with 403 — Yes.

Example 2: Domains without 403 — No.

Website IDs filter

Website IDs searches domains by analytics/widget IDs (e.g. Google Analytics, Yandex Metrica) found in archived content — useful for finding domains likely owned by the same party.

How to use

  1. Find Website IDs
  2. Enter an ID (e.g. UA-123456789, GTM-XXXXX)
  3. Click Add, or press Enter or Space, to add it
  4. IDs appear as badges; × removes one; Clear All clears all

There is a help icon next to the field name. A domain matches if its archived content contains any of the listed IDs; matching is case-insensitive.

Details

  • Supports analytics/widget IDs found in archive content
  • Multiple IDs mean domains matching any ID

Examples

Example 1: Specific Google Analytics ID — UA-123456789.

Example 2: Any of several IDs — UA-123456789, GTM-XXXXX, 12345678.

Example 3: Same owner — add an ID from one domain to find others using the same ID.
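The "any ID, case-insensitive" rule can be sketched as a search over archived page content. In the Python sketch below, the token-boundary check (an ID must not be surrounded by other letters or digits) and the names `contains_id` and `matches_any_id` are assumptions; the article does not describe how IDs are extracted from the archive.

```python
import re


def contains_id(content: str, website_id: str) -> bool:
    """Case-insensitive search for the ID as a standalone token."""
    pattern = (
        r"(?<![A-Za-z0-9])"
        + re.escape(website_id.strip())
        + r"(?![A-Za-z0-9])"
    )
    return re.search(pattern, content, re.IGNORECASE) is not None


def matches_any_id(content: str, website_ids: list[str]) -> bool:
    """True if the content contains at least one of the non-empty IDs."""
    return any(contains_id(content, wid) for wid in website_ids if wid.strip())


page = "<script>ga('create', 'ua-123456789-1', 'auto');</script>"
print(matches_any_id(page, ["UA-123456789"]))           # True
print(matches_any_id(page, ["GTM-XXXXX", "12345678"]))  # False
```

Running the same check over several domains' archived pages is the idea behind Example 3: domains that share an ID are candidates for common ownership.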

Combining Wayback filters

All Wayback filters can be combined with other filters. Multiple filters use AND — every condition must hold. You can save combinations.

Combined example

Goal: Old domains (10+ years), at least 50% English content, no redirects, last snapshot in the last 30 days.

UI: Age min 10; Language Filters: English (en) 50%; Redirects (30x): No; Last snap: Last 30 days.
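Because combined filters use AND, the example above can be modelled as a list of predicates that must all hold. The Python sketch below uses a hypothetical `DomainRecord` structure with precomputed fields; it is not the tool's data model, only an illustration of the AND logic.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Callable


@dataclass
class DomainRecord:
    # Hypothetical, precomputed archive facts about one domain.
    name: str
    age_years: int
    language_shares: dict[str, float]
    has_cross_domain_redirect: bool
    last_snapshot: date


Filter = Callable[[DomainRecord], bool]


def build_filters(today: date) -> list[Filter]:
    """Predicates mirroring the combined example above."""
    return [
        lambda d: d.age_years >= 10,                              # Age min 10
        lambda d: d.language_shares.get("en", 0.0) >= 50,         # English 50%
        lambda d: not d.has_cross_domain_redirect,                # Redirects (30x): No
        lambda d: d.last_snapshot >= today - timedelta(days=30),  # Last 30 days
    ]


def matches_all(domain: DomainRecord, filters: list[Filter]) -> bool:
    """AND logic: every filter must accept the domain."""
    return all(f(domain) for f in filters)


today = date(2024, 3, 1)
good = DomainRecord("example.com", 12, {"en": 80.0}, False, date(2024, 2, 20))
stale = DomainRecord("example.org", 12, {"en": 80.0}, False, date(2023, 1, 5))
print(matches_all(good, build_filters(today)))   # True
print(matches_all(stale, build_filters(today)))  # False
```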

Troubleshooting

“Maximum … elements allowed”

Why: More than 20 elements in Keywords in content; red warning under the field.

Fix: Remove elements until the counter is ≤ 20.

No results

Why: Conditions too strict, or no domains in the archive match.

Fix: Relax or disable filters step by step.

First snap / Last snap

Fix: Use the calendar or quick buttons — date format is applied automatically.

Select Code inactive

Why: In Server Code Share, neither min % nor max % is set.

Fix: Enter at least one percentage, then Select Code becomes available.

Tips

  1. First snap / Last snap: Quick buttons (Older than 2/5/10 years, Last 7/30/60 days, This month, Last year) simplify ranges.

  2. Combine filters to narrow results (age, language, codes, redirects, etc.).

  3. Keywords in content: comma = AND, | = OR. Watch the Elements limit (20).

  4. Language Filters: use minimum % for tighter language targeting.

  5. Website IDs: add a known analytics ID to find other domains using the same ID (possible same owner).
