Password guessing without AI: How attackers build targeted wordlists

Passwords remain a persistent point of tension between usability and security. Controls designed to strengthen authentication often introduce complexity, which encourages users to rely on familiar patterns rather than genuinely unpredictable credentials. In practice, this frequently results in passwords derived from an organization’s own language.

Attackers have long recognized this behavioral pattern and continue to exploit it. Rather than relying on artificial intelligence or sophisticated guessing algorithms, many credential attacks begin with something far simpler: harvesting contextual language and converting it into highly targeted password guesses.

Tools such as the Custom Word List generator (CeWL) make this process efficient and repeatable without adding technical complexity. Targeted lists improve success rates while reducing noise and detection risk.

This attacker behavior helps explain why NIST SP 800-63B explicitly advises against the use of context-specific words in passwords, including service names, usernames, and related derivatives. Enforcing that guidance, however, requires an understanding of how attackers assemble and operationalize these wordlists in real-world attacks.

This matters because many defensive strategies still assume that password guessing relies on broad, generic datasets rather than on language drawn from the target organization itself.

Where targeted wordlists really come from

CeWL is an open-source web crawler that extracts words from websites and compiles them into structured lists. It is included by default in widely used penetration testing distributions such as Kali Linux and Parrot OS, which lowers the barrier to entry for both attackers and defenders.

Attackers use CeWL to crawl an organization’s public-facing digital presence and collect terminology that reflects how that organization communicates externally.

This typically includes company service descriptions, internal phrasing surfaced in documentation, and industry-specific language that would not appear in generic password dictionaries.

The effectiveness of this approach lies not in novelty, but in relevance. The resulting wordlists closely mirror the vocabulary users already encounter in their day-to-day work and are therefore more likely to influence password construction.

Verizon’s Data Breach Investigations Report found that stolen credentials are involved in 44.7% of breaches.

From public-facing content to password guesses

CeWL can be configured to control crawl depth and minimum word length, allowing attackers to exclude low-value results. When harvested in this way, the output forms realistic password candidates through predictable transformations.
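The harvesting step can be illustrated with a minimal sketch that a defender might run against their own pages to see which terms are exposed. This is not CeWL's implementation: it uses only the Python standard library, works on an HTML string instead of crawling, and the `harvest_terms` function, the `min_length` parameter, and the sample page (including the organization name "Northgate") are all hypothetical.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def harvest_terms(html, min_length=5):
    """Return a Counter of lowercased words of at least min_length letters."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    words = re.findall(r"[A-Za-z]+", " ".join(parser.chunks))
    return Counter(w.lower() for w in words if len(w) >= min_length)


# Hypothetical page content, used only for illustration.
SAMPLE_PAGE = """
<html><head><style>body { color: black; }</style></head>
<body><h1>Northgate Hospital</h1>
<p>Northgate provides cardiology and oncology services in Riverton.</p>
<script>var tracking = "ignored";</script></body></html>
"""

terms = harvest_terms(SAMPLE_PAGE, min_length=5)
```

The minimum length filter drops short words such as "and" and "in", and the frequency counts give a rough sense of which terms dominate the organization's public language. The same output is a reasonable starting point for a custom exclusion dictionary.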

For a healthcare organization such as a hospital, public-facing content may expose terms such as the name of the organization, references to its location, or the services and treatments it offers.

These terms are rarely used as passwords in isolation but instead serve as a foundational candidate set that attackers systematically modify using common patterns such as numeric suffixes, capitalization, or appended symbols to generate plausible password guesses.
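To show how small and predictable this modification space is, here is a rough sketch of the transformation step. It is not a Hashcat rule set; the `mutate` function, the suffix and symbol lists, and the base term "northgate" are assumptions chosen for illustration. Defenders can use the same idea to check whether a password policy rejects the obvious variants of their own organization's terms.

```python
from itertools import product


def mutate(term, years=("2024", "2025"), symbols=("", "!", "#")):
    """Expand one base term using capitalization, numeric suffixes and symbols."""
    bases = {term.lower(), term.capitalize()}
    suffixes = ("", "1", "123") + tuple(years)
    candidates = {
        base + suffix + symbol
        for base, suffix, symbol in product(bases, suffixes, symbols)
    }
    return sorted(candidates)


# Hypothetical base term; a real list would come from harvested content.
variants = mutate("northgate")
```

With these assumed lists, a single term expands to only a few dozen candidates, so even a harvested list of several thousand terms stays small enough to test quickly.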

Once attackers obtain password hashes, often through third-party breaches or infostealer infections, tools such as Hashcat apply these mutation rules at scale. Millions of targeted candidates can be generated and tested efficiently against compromised data.

The same wordlists can also be used against live authentication services, where attackers may rely on throttling, timing, or low-and-slow guessing techniques to reduce the likelihood of detection or account lockout.

Why password complexity rules still fail

A key challenge is that many passwords generated in this way satisfy standard complexity requirements.

Specops analysis of more than six billion compromised passwords suggests that organizations continue to struggle with the distinction between passwords that are compliant and passwords that are hard to guess, even where awareness and training programs are in place. When passwords are constructed from familiar organizational language, added length or character variety does little to offset the reduced uncertainty introduced by highly contextual base terms.
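The loss of uncertainty can be made concrete with some back-of-the-envelope arithmetic. The figures below are illustrative assumptions, not measurements: a 16-character password drawn uniformly from 94 printable ASCII characters, compared with a targeted list of 5,000 harvested terms with 1,000 mutations each.

```python
import math


def bits(search_space):
    """Size of a search space expressed in bits (log base 2)."""
    return math.log2(search_space)


# Assumed: 16 characters, each chosen uniformly from 94 printable symbols.
random_bits = bits(94 ** 16)

# Assumed: 5,000 harvested terms, each expanded into 1,000 variants.
targeted_bits = bits(5_000 * 1_000)
```

Under these assumptions the random password sits at roughly 105 bits, while the targeted candidate set is a little over 22 bits. A password that falls inside the targeted set is only as strong as the smaller number, regardless of how long or varied it looks.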

A password such as HospitalName123! illustrates the problem. Although it satisfies default Active Directory complexity requirements, it remains a weak choice within a healthcare environment.
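A simplified check makes the point. The sketch below approximates a "three of four character classes" complexity rule; it is not the actual Active Directory implementation, which also considers other Unicode characters and the user's account name. The function name and the `min_length=7` value are assumptions for illustration.

```python
import string


def meets_basic_complexity(password, min_length=7):
    """Approximate a 'three of four character classes' complexity rule."""
    classes = [
        any(c.isupper() for c in password),
        any(c.islower() for c in password),
        any(c.isdigit() for c in password),
        any(c in string.punctuation for c in password),
    ]
    return len(password) >= min_length and sum(classes) >= 3


passes = meets_basic_complexity("HospitalName123!")
fails = meets_basic_complexity("hospital")
```

The contextual password passes because the rule inspects only length and character classes. Nothing in it asks where the base word came from.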

CeWL-derived wordlists readily identify organization names and abbreviations harvested from public-facing content, allowing attackers to arrive at plausible password variants through minimal and systematic modification.

Defending against targeted wordlist attacks

Reducing exposure to wordlist-based attacks requires controls that address password construction rather than complexity alone.

Block context-derived and known-compromised passwords

Prevent users from creating passwords based on organization-specific language such as company and product names, internal project terms, industry vocabulary, and common attacker substitutions, while also blocking credentials that have already appeared in data breaches.
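As a rough sketch of what such a check involves, the example below normalizes a candidate password and tests it against a list of organization-specific terms. It does not describe how any particular product works; the substitution map, the function names, and the blocked terms are hypothetical, and a production dictionary would be far larger.

```python
import re

# Assumed mapping of common character substitutions back to letters.
LEET_MAP = str.maketrans({
    "0": "o", "1": "i", "3": "e", "4": "a", "5": "s",
    "7": "t", "@": "a", "$": "s", "!": "i",
})


def normalize(password):
    """Lowercase, drop trailing digits/symbols, then undo substitutions."""
    core = re.sub(r"[^a-z]+$", "", password.lower())
    core = core.translate(LEET_MAP)
    return re.sub(r"[^a-z]", "", core)


def is_context_derived(password, blocked_terms):
    """True if any blocked term appears in the normalized password."""
    core = normalize(password)
    return any(term in core for term in blocked_terms)


# Hypothetical exclusion dictionary built from an organization's own language.
BLOCKED = {"hospital", "northgate", "cardiology"}

rejected_a = is_context_derived("H0sp1tal2024!", BLOCKED)
rejected_b = is_context_derived("N0rthg@te#1", BLOCKED)
accepted = is_context_derived("correct-horse-battery-staple", BLOCKED)
```

Trailing digits and symbols are stripped before substitutions are reversed so that a suffix such as "123!" is not translated into letters. Substring matching is deliberately strict; short blocked terms would cause false positives, so a real dictionary would likely enforce a minimum term length.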

Specops Password Policy can enforce custom exclusion dictionaries and continuously scans Active Directory against more than 5.4 billion known-compromised passwords, disrupting CeWL-style wordlist attacks and reducing the reuse of exposed credentials.

Specops Password Policy Continuously block over 5.4 billion breached passwords

Enforce minimum length and complexity

Require passphrases of at least 15 characters, as length and unpredictability offer the best protection against brute-force techniques. Passphrases are also the most practical way to get users to create long, strong passwords.

Enable multi-factor authentication (MFA)

If you haven’t already, this is the obvious place to start. Consider a simple, effective MFA solution such as Specops Secure Access that can protect Windows Logon, VPNs, and RDP connections.

While MFA does not prevent password compromise, it significantly limits the impact of credential exposure by preventing passwords from being used as a standalone authentication factor.

Align password policy with real-world attacks

Treat passwords as an active security control rather than a static compliance requirement. Enforcing policies that prevent context-derived, previously exposed, or easily inferred passwords reduces the value attackers gain from targeted wordlists, while MFA provides a necessary second line of defense when credentials are compromised.

Together, these controls form a more resilient authentication strategy that reflects how password attacks actually occur.

Speak with one of our experts to learn how Specops can support stronger, more resilient password security without adding unnecessary complexity for users.

Sponsored and written by Specops Software.