get the words online
from urllib.request import Request, urlopen
url="https://svnweb.freebsd.org/csrg/share/dict/words?revision=61569&view=co"
req = Request(url, headers={'User-Agent': 'Mozilla/5.0'})
web_byte = urlopen(req).read()
webpage = web_byte.decode('utf-8')
print(webpage)
Randomizing the words in the first 500 characters
from urllib.request import Request, urlopen
import random
url="https://svnweb.freebsd.org/csrg/share/dict/words?revision=61569&view=co"
req = Request(url, headers={'User-Agent': 'Mozilla/5.0'})
web_byte = urlopen(req).read()
webpage = web_byte.decode('utf-8')
first500 = webpage[:500].split("\n")
random.shuffle(first500)
print(first500)
Output
['abnegation', 'able', 'aborning', 'Abigail', 'Abidjan', 'ablaze', 'abolish', 'abbe', 'above', 'abort', 'aberrant', 'aboriginal', 'aborigine', 'Aberdeen', 'Abbott', 'Abernathy', 'aback', 'abate', 'abominate', 'AAA', 'abc', 'abed', 'abhorred', 'abolition', 'ablate', 'abbey', 'abbot', 'Abelson', 'ABA', 'Abner', 'abduct', 'aboard', 'Abo', 'abalone', 'a', 'abhorrent', 'Abelian', 'aardvark', 'Aarhus', 'Abe', 'abjure', 'abeyance', 'Abel', 'abetting', 'abash', 'AAAS', 'abdicate', 'abbreviate', 'abnormal', 'abject', 'abacus', 'abide', 'abominable', 'abode', 'abandon', 'abase', 'Ababa', 'abdominal', 'abet', 'abbas', 'aberrate', 'abdomen', 'abetted', 'abound', 'Aaron', 'abhor', 'ablution', 'abeyant', 'about']
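Note that webpage[:500] slices the first 500 characters, not the first 500 words, which is why the output holds far fewer than 500 entries. A minimal sketch of taking the first 500 words instead, assuming the list has one word per line; the helper name first_words_shuffled and the inline sample text are illustrative and not part of the original snippet:

```python
import random

def first_words_shuffled(text, count=500):
    """Return the first `count` non-empty lines of `text` in random order."""
    words = [line.strip() for line in text.splitlines() if line.strip()]
    selected = words[:count]
    random.shuffle(selected)  # shuffles the list in place
    return selected

# Small inline sample standing in for the downloaded word list.
sample_text = "a\nAAA\nAAAS\nAarhus\nAaron\nABA\nAbaba\n"
print(first_words_shuffled(sample_text, count=5))
```

With the real download, the call would be first_words_shuffled(webpage, 500).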
Project description
This is a simple python package to generate random English words.
If you need help after reading the below, please find me on Twitter at @vaibhavsingh97.
If you love the package, please :star2: the repo.
Installation
You should be able to install using easy_install or pip in the usual ways:
$ easy_install random-word
$ pip install random-word
Or clone this repository and run:
$ python3 setup.py install
Or place the random-word folder that you downloaded somewhere where your scripts can access it.
Basic Usage
👋 This package will now, by default, fetch the random word from a local database.
from random_word import RandomWords
r = RandomWords()
# Return a single random word
r.get_random_word()
Different services are available as part of the random-word package, which fetch random words from various API providers. Please check the Services section for more details.
Services
- Wordnik
- API Ninjas
Development
Assuming that you have Python and pipenv installed, set up your environment and install the required dependencies like this instead of the pip install random-word defined above:
$ git clone https://github.com/vaibhavsingh97/random-word.git
$ cd random-word
$ make init
Add your API key to config.yml in the random_word directory. If you don't have an API key, request one from the Wordnik website.
API_KEY = "<API KEY>"
To check your desired changes, you can install your package locally.
$ pip install -e .
Issues
You can report the bugs at the issue tracker
License
Built with ♥ by Vaibhav Singh(@vaibhavsingh97) under MIT License
You can find a copy of the License at https://vaibhavsingh97.mit-license.org/
This post goes over how to generate a random word or letter in Python.
Install
Install random-word and PyYaml:
pip3 install random-word pyyaml
PyYaml is required or else you’ll get the error:
ModuleNotFoundError: No module named 'yaml'
Usage
Generate a random word:
from random_word import RandomWords
random_words = RandomWords()
print(random_words.get_random_word())
See the package documentation for more information.
Random Letter
Get a random letter from the alphabet:
from string import ascii_lowercase
from random import randrange
print(ascii_lowercase[randrange(len(ascii_lowercase))])
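An equivalent sketch uses random.choice, which picks one element from a sequence directly and avoids the manual index:

```python
from random import choice
from string import ascii_lowercase

# choice() returns a single randomly selected character from the string.
letter = choice(ascii_lowercase)
print(letter)
```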
Random Word Generator
How to install this library?
pip3 install Random-Word-Generator
OR
pip install Random-Word-Generator
Which other Python packages need to be installed?
- No external packages are needed
- Only Python version >= 3 is required
What does this library do?
It generates random words, i.e. random noise in text data, which is helpful in many text-augmentation tasks, NER, etc.
Which methods are available currently in this library?
Method | Args | Description |
---|---|---|
.generate() | None | This will return a randomly generated word |
.getList(num_of_words) | num_of_words | This will return list of random words |
Settings to review before generating random words
Basic
from RandomWordGenerator import RandomWord
# Creating a random word object
rw = RandomWord(max_word_size=10,
constant_word_size=True,
include_digits=False,
special_chars=r"@_!#$%^&*()<>?/|}{~:",
include_special_chars=False)
Args | Data Type | Default | Description |
---|---|---|---|
max_word_size | int | 10 | Represents maximum length of randomly generated word |
constant_word_size | bool | True | Represents whether generated words have a constant length (max_word_size) or a variable one |
include_digits | bool | False | Represents whether or not to include digits in generated words |
special_chars | regex/string | r"@_!#$%^&*()<>?/\|}{~:" | Represents a regex string of all special characters you want to include in generated words |
include_special_chars | bool | False | Represents inclusion of special characters in generated words |
How to get started with this library?
- Simple random word generation with constant word size
from RandomWordGenerator import RandomWord
rw = RandomWord(max_word_size=5)
print(rw.generate())
Output will be some random word like > hdsjq
- Simple random word generation with variable word size
from RandomWordGenerator import RandomWord
rw = RandomWord(max_word_size=5, constant_word_size=False)
print(rw.generate())
Output will be some random word like > gw
- Random word generation with constant word size and special characters included
from RandomWordGenerator import RandomWord
rw = RandomWord(max_word_size=5, constant_word_size=True, special_chars=r"@#$%.*", include_special_chars=True)
print(rw.generate())
Output will be some random word like > gsd$
- If we want randomly generated words in a list, we just have to pass the number of words we want
from RandomWordGenerator import RandomWord
rw = RandomWord(max_word_size=5, constant_word_size=False)
print(rw.getList(num_of_random_words=3))
Output will be a list of random words like > ['adjse', 'qytqw', 'klsdf']
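For comparison, here is a rough standard-library sketch of the same idea (random noise words). The function noise_word and its parameters are hypothetical names that only mirror the README's arguments; this is not how Random-Word-Generator itself is implemented:

```python
import random
import string

def noise_word(max_word_size=10, constant_word_size=True, include_digits=False):
    """Build one random 'word' from lowercase letters (and optionally digits)."""
    alphabet = string.ascii_lowercase + (string.digits if include_digits else "")
    # Fixed length when constant_word_size is True, otherwise 1..max_word_size.
    size = max_word_size if constant_word_size else random.randint(1, max_word_size)
    return "".join(random.choice(alphabet) for _ in range(size))

print(noise_word(max_word_size=5))
print([noise_word(5, constant_word_size=False) for _ in range(3)])
```

Because it relies on the random module, this sketch is suitable for text noise but not for authorization codes or passwords.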
Application
- In cases where we need to add random noise in text
- Text Data Augmentation based tasks
- Can be used to generate random tokens for some particular application like authorization code
- In Automatic Password Suggestion system
Author
I will be happy to connect with you guys!!
Citation
@software{abhishek_c_salian_2020_4384164,
author = {Abhishek C. Salian},
title = {AbhishekSalian/Random-Word-Generator v1.0.0},
month = dec,
year = 2020,
publisher = {Zenodo},
version = {v1.0.0},
doi = {10.5281/zenodo.4384164},
url = {https://doi.org/10.5281/zenodo.4384164}
}
Any suggestions are most welcome.
File handling in Python is really simple and easy to implement. In order to pull a random word or string from a text file, we will first open the file in read mode and then use the methods in Python’s random module to pick a random word.
There are various ways to perform this operation:
This is the text file we will read from:
Method 1: Using random.choice()
Steps:
- Using the with statement, open the file in read mode. The with statement takes care of closing the file automatically.
- Read all the text from the file and store in a string
- Split the string into words separated by space.
- Use random.choice() to pick a word or string.
Python
import random

with open("MyFile.txt", "r") as file:
    allText = file.read()
    words = list(map(str, allText.split()))

    print(random.choice(words))
Note: The split() function, by default, splits by white space. If you want any other delimiter like newline character you can specify that as an argument.
Output:
Output for two sample runs
The above can be achieved with just a single line of code like this :
Python
import random

print(random.choice(open("myFile.txt", "r").read().split()))
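The one-liner never closes the file explicitly. A sketch that keeps the with statement and wraps the logic in a reusable function; pick_random_word is an illustrative name, and the temporary file only stands in for myFile.txt:

```python
import os
import random
import tempfile

def pick_random_word(path):
    """Return one random whitespace-separated word from the file at `path`."""
    with open(path, "r") as file:
        words = file.read().split()
    if not words:
        raise ValueError("file contains no words")
    return random.choice(words)

# Demo with a temporary file standing in for myFile.txt.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("apple banana\ncherry date\n")

print(pick_random_word(tmp.name))
os.remove(tmp.name)  # clean up the demo file
```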
Method 2: Using random.randint()
Steps:
- Open the file in read mode using the with statement
- Store all data from the file in a string and split the string into words.
- Count the total number of words.
- Use random.randint() to generate a random number between 0 and word_count - 1 (randint() includes both endpoints).
- Print the word at that position.
Python
import random

with open("myFile.txt", "r") as file:
    data = file.read()
    words = data.split()

    word_pos = random.randint(0, len(words) - 1)
    print("Position:", word_pos)
    print("Word at position:", words[word_pos])
Output:
Output for two sample runs
I have made a few improvements to the code. Its output is now much better and much closer to what I had in mind when I wrote it, though it still isn't ideal.
It now imports ascii_lowercase from the string module, has a constant that stores vowels, and gets consonants by subtracting vowels from letters.
I capitalized the name of tails because it acts as a constant during code execution, and only store unique entries in it.
I made a few corrections to the letter following rules, and extended the definition of vowels to include the letter «y», because y can follow virtually any letter.
I lowered the maximum limit of heads a word can have to four, and set the limit to the maximum number of tails a head can have to a random integer in the range between two and five, just in the improbable case the coin-flip fails to break the loops.
I have also made the script discard the random letter if there are already two consecutive vowels or consonants or same letters.
So this is the updated code:
import random
from collections import Counter
from string import ascii_lowercase
SAMPLE = Counter({
'e': 1202,
't': 910,
'a': 812,
'o': 768,
'i': 731,
'n': 695,
's': 628,
'r': 602,
'h': 592,
'd': 432,
'l': 398,
'u': 288,
'c': 271,
'm': 261,
'f': 230,
'y': 211,
'w': 209,
'g': 203,
'p': 182,
'b': 149,
'v': 111,
'k': 69,
'x': 17,
'q': 11,
'j': 10,
'z': 8
})
pool = list(SAMPLE.elements())
randpool = []
while len(pool) > 0:
    elem = random.choice(pool)
    randpool.append(elem)
    pool.remove(elem)
LETTERS = set(ascii_lowercase)
VOWELS = set('aeiouy')
CONSONANTS = LETTERS - VOWELS
TAILS = {
'a': CONSONANTS,
'b': 'bjlr',
'c': 'chjklr',
'd': 'dgjw',
'e': CONSONANTS,
'f': 'fjlr',
'g': 'ghjlrw',
'h': '',
'i': CONSONANTS,
'j': '',
'k': 'hklrvw',
'l': 'l',
'm': 'cm',
'n': 'gn',
'o': CONSONANTS,
'p': 'fhlprst',
'q': '',
'r': 'hrw',
's': 'chjklmnpqstw',
't': 'hjrstw',
'u': CONSONANTS,
'v': 'lv',
'w': 'hr',
'x': 'h',
'y': 'sv',
'z': 'hlvw'
}
# variables expanded:
# w: Word, r: Random Letter, sc: Serial Consonants count, sv: Serial Vowels Count, ss: Serial Same-letter count, lm: Max Length of tails, l: Length of tails
def randomword():
    count = random.randint(1, 4)
    heads = [random.choice(randpool) for i in range(count)]
    i = 0
    segments = []
    while count > 0:
        sc, ss, sv = 0, 0, 0
        w = heads[i]
        if w in CONSONANTS: sc += 1
        else: sv += 1
        while True:
            r = random.choice(randpool)
            if r in TAILS[w] or r in VOWELS:
                if i == 0 and r == w: continue
                else:
                    if r in VOWELS:
                        sc = 0
                        sv += 1
                        break
                    else:
                        sv = 0
                        sc += 1
                        break
        w += r
        l = 1
        lm = random.randint(2, 5)
        while True:
            if l == lm:
                segments.append(w)
                count -= 1
                break
            f = r
            r = random.choice(randpool)
            if r in TAILS[f] or r in VOWELS:
                if r in VOWELS:
                    sc = 0
                    sv += 1
                elif r in CONSONANTS:
                    sv = 0
                    sc += 1
                if sv == 3 or sc == 3: continue
                if r != f: ss = 0
                if r == f and ss == 1: continue
                if r == f: ss += 1
                w += r
                l += 1
                if random.getrandbits(1):
                    segments.append(w)
                    count -= 1
                    break
        i += 1
    return ''.join(segments)

if __name__ == '__main__':
    print(randomword())
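As an aside, the shuffled randpool exists only so that frequent letters are drawn more often. A minimal sketch of the same weighted draw using random.choices with the counts as weights; this swaps in a different technique from the code above, and the shortened LETTER_COUNTS table and the random_letters helper are illustrative:

```python
import random
from collections import Counter

# Shortened stand-in for the SAMPLE frequency table above.
LETTER_COUNTS = Counter({'e': 1202, 't': 910, 'a': 812, 'q': 11, 'z': 8})

def random_letters(k):
    """Draw k letters, each with probability proportional to its count."""
    letters = list(LETTER_COUNTS)
    weights = list(LETTER_COUNTS.values())
    return random.choices(letters, weights=weights, k=k)

print(''.join(random_letters(10)))
```

This avoids materializing one list entry per counted occurrence, at the cost of departing from the original structure.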
In this lesson, you will learn how to create a random string and passwords in Python.
Table of contents
- String Constants
- How to Create a Random String in Python
- Example to generate a random string of any length
- Random String of Lower Case and Upper Case Letters
- Random string of specific letters
- Random String without Repeating Characters
- Create Random Password with Special characters, letters, and digits
- Random password with a fixed count of letters, digits, and symbols
- Generate a secure random string and password
- Generate a random alphanumeric string of letters and digits
- Random alphanumeric string with a fixed count of letters and digits
- Generate a random string token
- Generate universally unique secure random string Id
- Use the StringGenerator module to generate a random string
- Next Steps
- Practice Problem
String Constants
Below is the list of string constants you can use to get a different set of characters as a source for creating a random string.
Constant | Description |
---|---|
ascii_lowercase | Contains all lowercase letters |
ascii_uppercase | Contains all uppercase letters |
ascii_letters | Contains both lowercase and uppercase letters |
digits | Contains the digits '0123456789' |
punctuation | All special symbols !"#$%&'()*+,-./:;<=>?@[\]^_`{\|}~ |
whitespace | Includes the characters space, tab, linefeed, return, formfeed, and vertical tab (' \t\n\r\x0b\x0c') |
printable | Characters that are considered printable. This is a combination of the constants digits, ascii_letters, punctuation, and whitespace |
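A quick way to inspect these constants is to print them. The short sketch below uses only attributes of the standard string module:

```python
import string

for name in ("ascii_lowercase", "ascii_uppercase", "digits", "punctuation"):
    print(name, "=", getattr(string, name))

# ascii_letters is the lowercase constant followed by the uppercase one.
print(string.ascii_letters == string.ascii_lowercase + string.ascii_uppercase)

# repr() makes the otherwise invisible whitespace characters readable.
print(repr(string.whitespace))
```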
How to Create a Random String in Python
We can generate a random string using the random module and the string module. Use the steps below to create a random string of any length in Python.
- Import the string and random modules
The string module contains various string constants that hold the ASCII characters of all cases. It has separate constants for lowercase letters, uppercase letters, digits, and special symbols, which we use as a source to generate a random string.
Pass string constants as a source of randomness to the random module to create a random string.
- Use the string constant ascii_lowercase
string.ascii_lowercase is a string of all the lowercase letters from 'a' to 'z'. This data will be used as a source to generate random characters.
- Decide the length of the string
Decide how many characters you want in the resultant string.
- Use a for loop and the random choice() function to choose characters from a source
Run a for loop up to the decided string length and use the random choice() function in each iteration to pick a single character from the string constant, then combine the characters into the string variable using the join() function. Print the final string after the loop completes.
- Generate a random password
Use the string.ascii_letters, string.digits, and string.punctuation constants together to create a random password and repeat the first four steps.
Example to generate a random string of any length
import random
import string
def get_random_string(length):
    # choose from all lowercase letters
    letters = string.ascii_lowercase
    result_str = ''.join(random.choice(letters) for i in range(length))
    print("Random string of length", length, "is:", result_str)

get_random_string(8)
get_random_string(6)
get_random_string(4)
Output:
Random string of length 8 is: ijarubtd
Random string of length 6 is: ycfxbs
Random string of length 4 is: dpla
- The random choice() function is used to choose a single item from any sequence and it can repeat characters.
- The above random strings contain all lowercase letters. If you want only uppercase letters, use the string.ascii_uppercase constant in place of string.ascii_lowercase.
Random String of Lower Case and Upper Case Letters
In Python, to generate a random string with a combination of lowercase and uppercase letters, we need to use the string.ascii_letters constant as the source. This constant contains all the lowercase and uppercase letters.
Example
import random
import string
def get_random_string(length):
    # With combination of lower and upper case
    result_str = ''.join(random.choice(string.ascii_letters) for i in range(length))
    # print random string
    print(result_str)

# string of length 8
get_random_string(8)
get_random_string(8)
Output:
WxQqJQlD
NoCpqruK
Random string of specific letters
If you wanted to generate a random string from a fixed set of characters, please use the following example.
import random
# Random string of length 5
result_str = ''.join((random.choice('abcdxyzpqr') for i in range(5)))
print(result_str)
# Output ryxay
Random String without Repeating Characters
Note: The choice() method can repeat characters. If you don't want repeated characters in the resultant string, use the random.sample() method.
import random
import string
for i in range(3):
    # get random string of length 8 without repeating letters
    result_str = ''.join(random.sample(string.ascii_lowercase, 8))
    print(result_str)
Output:
wxvdkbfl
ztondpef
voeduias
Warning: As you can see in the output, all characters are unique, but the result is less secure: disallowing repeated characters reduces the number of possible combinations.
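To put a number on that warning, the sketch below counts the possible 8-letter lowercase strings with and without repetition. It assumes Python 3.8 or later for math.perm; the variable names are illustrative:

```python
import math

alphabet_size = 26
length = 8

# Each position chosen independently, as with random.choice().
with_repeats = alphabet_size ** length
# Ordered selections without repetition, as with random.sample().
without_repeats = math.perm(alphabet_size, length)

print(with_repeats)
print(without_repeats)
print(round(without_repeats / with_repeats, 3))
```

The last line prints the fraction of the full space that remains reachable once repeats are forbidden.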
Create Random Password with Special characters, letters, and digits
A password that contains a combination of characters, digits, and special symbols is considered a strong password.
Assume, you want to generate a random password like: –
- ab23cd#$
- jk%m&l98
- 87t@h*ki
We can generate a random string password in Python with letters, special characters, and digits using the following two ways.
- Combine the following three constants and use them as a data source for the random.choice() function to select random characters from it.
  - string.ascii_letters: To include letters from a-z and A-Z
  - string.digits: To include digits from 0 to 9
  - string.punctuation: To get special symbols
- Use the string.printable constant and the choice() function. string.printable contains a combination of digits, ascii_letters (lowercase and uppercase letters), punctuation, and whitespace.
Example
import random
import string
# get random password of length 8 with letters, digits, and symbols
characters = string.ascii_letters + string.digits + string.punctuation
password = ''.join(random.choice(characters) for i in range(8))
print("Random password is:", password)
Output:
Random password is: 6(I3goZ}
Using the string.printable
import random
import string
password = ''.join(random.choice(string.printable) for i in range(8))
print("Random password is:", password)
Output
Random password is: hY*34jj.
Random password with a fixed count of letters, digits, and symbols
It is a widespread use case that passwords must contain some count of digits and special symbols.
Let’s see how to generate a random password that contains at least one lowercase letter, one uppercase letter, one digit, and one special symbol.
Steps: –
- First, select the number of random lowercase and uppercase letters specified
- Next, choose the number of random digits
- Next, choose the number of special symbols
- Combine both letters, digits, and special symbols into a list
- At last shuffle the list
- Convert list back to a string
import random
import string
def get_random_password():
    random_source = string.ascii_letters + string.digits + string.punctuation
    # select 1 lowercase
    password = random.choice(string.ascii_lowercase)
    # select 1 uppercase
    password += random.choice(string.ascii_uppercase)
    # select 1 digit
    password += random.choice(string.digits)
    # select 1 special symbol
    password += random.choice(string.punctuation)
    # generate other characters
    for i in range(6):
        password += random.choice(random_source)

    password_list = list(password)
    # shuffle all characters
    random.SystemRandom().shuffle(password_list)
    password = ''.join(password_list)
    return password

print("First Random Password is ", get_random_password())
# output qX49}]Ru!(
print("Second Random Password is ", get_random_password())
# Output 3nI0.V#[T
Generate a secure random string and password
None of the above examples is cryptographically secure. A cryptographically secure random generator draws on the operating system's source of randomness, so its output is much harder to predict than that of the default random module generator.
If you are producing random passwords or strings for a security-sensitive application, then you must use this approach.
If you are using a Python version lower than 3.6, use the random.SystemRandom().choice() function instead of random.choice().
If you are using Python 3.6 or higher, you can use the secrets module to generate a secure random password.
Use the secrets.choice() function instead of random.choice().
import secrets
import string
# secure random string
secure_str = ''.join((secrets.choice(string.ascii_letters) for i in range(8)))
print(secure_str)
# Output QQkABLyK
# secure password
password = ''.join((secrets.choice(string.ascii_letters + string.digits + string.punctuation) for i in range(8)))
print(password)
# output 4x]>@;4)
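The fixed-count password recipe from earlier can be combined with the secrets module. The following is a minimal sketch under that assumption; secure_password is an illustrative name, and the guarantee of one character per category follows the earlier example rather than any particular password policy:

```python
import secrets
import string

def secure_password(length=10):
    """Return a password with at least one lowercase, uppercase, digit, and symbol."""
    if length < 4:
        raise ValueError("length must be at least 4")
    pools = [string.ascii_lowercase, string.ascii_uppercase,
             string.digits, string.punctuation]
    all_chars = ''.join(pools)
    # One guaranteed character from each pool, then fill the rest freely.
    chars = [secrets.choice(pool) for pool in pools]
    chars += [secrets.choice(all_chars) for _ in range(length - len(pools))]
    # Shuffle with the OS-backed generator so the guaranteed characters move around.
    secrets.SystemRandom().shuffle(chars)
    return ''.join(chars)

print(secure_password())
```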
Generate a random alphanumeric string of letters and digits
We often want to create a random string containing both letters and digits such as ab23cd, jkml98, 87thki. In such cases, we use the string.ascii_letters and string.digits constants to get combinations of letters and numbers in our random string.
Now, let's see how to create a random string with a combination of letters from A-Z, a-z, and digits 0-9.
import random
import string
# get random string of letters and digits
source = string.ascii_letters + string.digits
result_str = ''.join((random.choice(source) for i in range(8)))
print(result_str)
# Output vZkOkL97
Random alphanumeric string with a fixed count of letters and digits
For example, I want to create a random alpha-numeric string that contains 5 letters and 3 numbers.
Example
import random
import string
def get_string(letters_count, digits_count):
    letters = ''.join((random.choice(string.ascii_letters) for i in range(letters_count)))
    digits = ''.join((random.choice(string.digits) for i in range(digits_count)))

    # Convert resultant string to list and shuffle it to mix letters and digits
    sample_list = list(letters + digits)
    random.shuffle(sample_list)
    # convert list to string
    final_string = ''.join(sample_list)
    print('Random string with', letters_count, 'letters', 'and', digits_count, 'digits', 'is:', final_string)

get_string(5, 3)
get_string(6, 2)
Output:
Random string with 5 letters and 3 digits is: v809mCxH
Random string with 6 letters and 2 digits is: mF6m1TRk
Generate a random string token
The above examples depend on string constants and random module functions. There are also other ways to generate a random string in Python. Let's see those now.
We can use secrets.token_hex() to get a secure random text in hexadecimal format.
import secrets
print("Secure hexadecimal string token", secrets.token_hex(32))
Output:
Secure hexadecimal string token 25cd4dd7bedd7dfb1261e2dc1489bc2f046c70f986841d3cb3d59a9626e0d802
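The argument to token_hex() is a number of bytes, so the resulting string has twice as many hexadecimal characters. A short sketch that also shows secrets.token_urlsafe(), a related function from the same module that returns URL-safe text:

```python
import secrets

# 16 random bytes rendered as 32 hexadecimal characters.
hex_token = secrets.token_hex(16)
# 16 random bytes rendered as URL-safe Base64 text.
url_token = secrets.token_urlsafe(16)

print(hex_token)
print(url_token)
```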
Generate universally unique secure random string Id
The uuid module can also produce a random string Id. Its uuid4() function generates a random UUID; in CPython the random bytes come from os.urandom(), but for security tokens the secrets module shown above is the safer choice. Here in this example, we use uuid4() to generate a random string Id.
import uuid
stringId = uuid.uuid4()
print("Secure unique string id", stringId)
# Output 0682042d-318e-45bf-8a16-6cc763dc8806
Use the StringGenerator module to generate a random string
The StringGenerator module is not a part of the standard library. However, you can install it using pip and start using it.
Steps: –
- pip install StringGenerator
- Use the render() function of StringGenerator to generate randomized strings of characters using a template
Let's see the example now.
import strgen
random_str = strgen.StringGenerator(r"[\w\d]{10}").render()
print(random_str)
# Output 4VX1yInC9S
random_str2 = strgen.StringGenerator(r"[\d]{3}&[\w]{3}&[\p]{2}").render()
print(random_str2)
# output "C01N=10
Next Steps
I want to hear from you. What do you think of this article? Or maybe I missed one of the ways to generate random string in Python. Either way, let me know by leaving a comment below.
Also, try to solve the random module exercise and quiz to have a better understanding of working with random data in Python.
Practice Problem
Create a random alphanumeric string of length ten that must contain at least four digits. For example, the output can be anything like 1o32WzUS87, 1P56X9Vh87
import random
import string
digits = string.digits
letter_digit_list = list(string.digits + string.ascii_letters)
# shuffle random source of letters and digits
random.shuffle(letter_digit_list)
# first generate 4 random digits
sample_str = ''.join((random.choice(digits) for i in range(4)))
# Now create random string of length 6 which is a combination of letters and digits
# Next, concatenate it with sample_str
sample_str += ''.join((random.choice(letter_digit_list) for i in range(6)))
aList = list(sample_str)
random.shuffle(aList)
final_str = ''.join(aList)
print("Random String:", final_str)
# Output 81OYQ6D430
Python Random Word Generator Game
I found a question on a Random Word Generator game on the question and answer site StackOverflow. The question contained a small runnable version of the game code.
The author’s question was: where to find large English word lists on the internet?
Getting this large list of words would add good game replay value to the game and potentially make it a lot more compelling for the end-users.
I thought that the question and its small, readable code contained many interesting ideas that I could expand upon. I could use it to learn how Python handles randomness in a program that users interact with. I could also extend other features of the game around the word list to make it more robust.
The StackOverflow question was entitled Random word generator- Python.
Motivational Example Word Game
Here’s the runnable code of the game:
import random

WORDS = ("python", "jumble", "easy", "difficult", "answer", "xylophone")
word = random.choice(WORDS)
correct = word
jumble = ""
while word:
    position = random.randrange(len(word))
    jumble += word[position]
    word = word[:position] + word[(position + 1):]
print(
"""
Welcome to WORD JUMBLE!!!

Unscramble the letters to make a word.
(press the enter key at prompt to quit)
"""
)
print("The jumble is:", jumble)
guess = input("Your guess: ")
while guess != correct and guess != "":
    print("Sorry, that's not it")
    guess = input("Your guess: ")
if guess == correct:
    print("That's it, you guessed it!\n")
print("Thanks for playing")
input("\n\nPress the enter key to exit")
The game randomly chooses a word from a list. Then it jumbles or scrambles the word by changing the order of its letters. The code does this by randomly choosing a number from 0 to the length of the word minus 1, which is then used as an index. The letter at that index is appended to the jumble and removed from the word, and this repeats until no letters are left. The game user is supposed to figure out what the correct word is when the letters are unscrambled.
After that, the user unscrambles the letters to make a word. The user inputs this guess using the keyboard and pressing Enter. If the user unscrambles the word incorrectly, they are asked to keep guessing. Once the user guesses the correct answer (for example, python), the program prints "That's it, you guessed it!" followed by "Thanks for playing". The game ends when the user presses Enter to exit.
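The jumbling step described above can also be sketched with random.sample, which returns the letters of the word in a new random order. The name jumble_word is illustrative and not part of the original game:

```python
import random

def jumble_word(word):
    """Return the letters of `word` in a random order."""
    # Sampling len(word) items without replacement yields a random permutation.
    return ''.join(random.sample(word, len(word)))

print(jumble_word("xylophone"))
```

As with the original loop, the result can occasionally equal the input word, since the identity ordering is one of the possible permutations.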
In the second line of code, the author of the question just pulls in a couple of words that are hardcoded into the code. I found some word lists and optimized the randomization and retrieval of word lists. I also cleaned the word lists for any inconsistencies in type or formatting.
How to Get Word Lists
The StackOverflow question about where to find word lists had multiple answers. The response that the author marked as the answer pointed to a 10,000-word list from word lists – MIT. The author of the answer showed how to read the word list by making a web request or reading it from the hard drive.
The author did not integrate this with the code from the StackOverflow question. Since this was not done, I decided to implement a web request function that pulled in the word list resources and read them, as well as a file IO function.
Some of the word lists came from plain text files and others from files that contained a byte type.
There was a collection of word lists here:
- I used the 10,000-word list from word lists – MIT.
- Natural Language Corpus Data: Beautiful Data – This word list has data from the most frequently used word list from 2008 to 2009. These word lists also show how many times the words were used.
- This is a good list of spelling words for kids, from second grade up to eighth grade. This could be useful if the game is designed for kids. I decided to make this list the default so I could more easily guess and test what the words were.
I saw a couple of other word lists that I chose not to use because they’d require scraping from the web, were proprietary, or did not seem as comprehensive. There did seem to be other good word lists on Kaggle.
Adding Value to the Game
One of the most fun parts of going through this coding exercise was adding additional features to the game. I added code retrieving the word lists. I also added a feature that enforced a set degree of randomness that I determined was necessary to have the unjumbling of the word to be challenging.
I also added value to the game by
- Adding game settings
  - The settings MINIMUM_WORD_LENGTH = 5 and MAXIMUM_WORD_LENGTH = 7 control the size of the words that the user can guess
  - The WORDS_FROM_FILE setting is a flag that decides whether the word list comes from a file or from a web request.
  - The user could also choose which word list to use with WORD_LIST_TO_USE
#GAME SETTINGS
MINIMUM_WORD_LENGTH = 5
MAXIMUM_WORD_LENGTH = 7
WORDS_FROM_FILE = False
WORD_LIST_TO_USE = "THIRD_GRADE_WORDS"
- Creating functions so the code was more testable. This can be seen throughout the code
- Cleaned up the words in the word list so they could be read in if they were bytes or strings
  - The MIT word list was in a file format that, when read, was in bytes. Other word lists were in strings. The code was changed so it could convert a word that was in bytes into a string so it could be jumbled. I modified the code so there were separate functions that I could easily test for the proper conversion of bytes to strings.
  - Some words had additional characters like numbers or extra characters. I used a regular expression to remove these extra characters.
def format_words(words):
    if len(words) > 1:
        words_pattern = '[a-z]+'
        if type(words[0]) is bytes:
            words = [re.findall(words_pattern, word.decode('utf-8'), flags=re.IGNORECASE)[0] for word in words]
        else:
            words = [re.findall(words_pattern, word, flags=re.IGNORECASE)[0] for word in words]
        words = [word for word in words if len(word) >= MINIMUM_WORD_LENGTH and len(word) <= MAXIMUM_WORD_LENGTH]
    return words
- Making it easy to swap between word lists by adding a dictionary
if WORDS_FROM_FILE:
    words = get_words_from_file(WORD_LIST_FILE[WORD_LIST_TO_USE])
else:
    words = get_word_list_from_web(WORD_LIST_WEB[WORD_LIST_TO_USE])
words = format_words(words)
- Made sure that word was jumbled to a degree that made guessing fun
  - I added sequence matcher code that enforced a certain percentage of randomness in the word. It did so by looping until the shuffled word was different enough from the original.
  - Code was added to make sure that the word was jumbled to a certain degree. If it was not, the word was jumbled again. Here's how a SequenceMatcher works: SequenceMatcher in Python. A human-friendly longest contiguous &… | by Nikhil Jaiswal | Towards Data Science
def generate_unique_shuffled_word(word):
    while True:
        shuffled_word = shuffle_word(word)
        simliar_percent = SequenceMatcher(None, shuffled_word, word).ratio()
        if MINIMUM_WORD_LENGTH >= 5 and simliar_percent <= 0.5:
            break
    return shuffled_word
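One caveat with the loop above is that it never exits if no shuffle can reach a similarity ratio of 0.5 or less (for example, a word made of one repeated letter) or if MINIMUM_WORD_LENGTH is set below 5. A hedged sketch of a bounded variant follows; shuffle_until_different, max_similarity, and max_attempts are hypothetical names and not part of the original game:

```python
import random
from difflib import SequenceMatcher

def shuffle_until_different(word, max_similarity=0.5, max_attempts=100):
    """Shuffle `word`, retrying a limited number of times for a dissimilar result."""
    best, best_ratio = word, 1.0
    for _ in range(max_attempts):
        candidate = ''.join(random.sample(word, len(word)))
        ratio = SequenceMatcher(None, candidate, word).ratio()
        # Remember the least similar candidate seen so far.
        if ratio < best_ratio:
            best, best_ratio = candidate, ratio
        if ratio <= max_similarity:
            break
    return best

print(shuffle_until_different("python"))
```

If the threshold is never met, the function falls back to the least similar shuffle it found instead of looping forever.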
Full Code
import random
import requests
import re
from difflib import SequenceMatcher
from pathlib import Path

#GAME SETTINGS
MINIMUM_WORD_LENGTH = 5
MAXIMUM_WORD_LENGTH = 7
WORDS_FROM_FILE = False
WORD_LIST_TO_USE = "THIRD_GRADE_WORDS"

WORD_LIST_WEB = {
    "MIT_WORDS": "https://www.mit.edu/~ecprice/wordlist.10000",
    "NORVIG_WORDS": "http://norvig.com/ngrams/count_1w.txt",
    "THIRD_GRADE_WORDS": "http://www.ideal-group.org/dictionary/p-3_ok.txt"
}

WORD_LIST_FILE = {
    "MIT_WORDS": "mit_wordlist.10000",
    "NORVIG_WORDS": "norvig_count_1w.txt",
    "THIRD_GRADE_WORDS": "p-3_ok.txt"
}

def get_word_list_from_web(word_site):
    response = requests.get(word_site)
    words = response.content.splitlines()
    return words

def format_words(words):
    if len(words) > 1:
        words_pattern = '[a-z]+'
        if type(words[0]) is bytes:
            words = [re.findall(words_pattern, word.decode('utf-8'), flags=re.IGNORECASE)[0] for word in words]
        else:
            words = [re.findall(words_pattern, word, flags=re.IGNORECASE)[0] for word in words]
        words = [word for word in words if len(word) >= MINIMUM_WORD_LENGTH and len(word) <= MAXIMUM_WORD_LENGTH]
    return words

def get_words_from_file(word_path):
    # Resolve the word list file relative to the current working directory
    word_file_path = Path().absolute() / word_path
    with open(word_file_path) as word_file:
        words = word_file.readlines()
    return words

def shuffle_word(word):
    jumble = ""
    while word:
        position = random.randrange(len(word))
        jumble += word[position]
        word = word[:position] + word[(position + 1):]
    return jumble

def generate_unique_shuffled_word(word):
    while True:
        shuffled_word = shuffle_word(word)
        simliar_percent = SequenceMatcher(None, shuffled_word, word).ratio()
        if MINIMUM_WORD_LENGTH >= 5 and simliar_percent <= 0.5:
            break
    return shuffled_word

def main():
    print(
"""
Welcome to WORD JUMBLE!!!

Unscramble the letters to make a word.
(press the enter key at prompt to quit)
"""
    )
    if WORDS_FROM_FILE:
        words = get_words_from_file(WORD_LIST_FILE[WORD_LIST_TO_USE])
    else:
        words = get_word_list_from_web(WORD_LIST_WEB[WORD_LIST_TO_USE])
    words = format_words(words)
    word = random.choice(words).lower()
    shuffle_word = generate_unique_shuffled_word(word)
    correct_word = word
    print(shuffle_word)
    guess = input("Your guess: ")
    while (guess != correct_word and guess != ""):
        print("Sorry, that's not it")
        guess = input("Your guess: ")
    if guess == correct_word:
        print("That's it, you guessed it!\n")
    print("Thanks for playing")
    input("\n\nPress the enter key to exit")

main()
The version of the code is here on GitHub.
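One detail worth noting in format_words is that re.findall(...)[0] raises an IndexError for any line that contains no letters at all, such as a blank or numeric-only line. A minimal sketch of a more forgiving variant; format_words_safe and its keyword parameters are illustrative names, not the author's code:

```python
import re

WORD_PATTERN = re.compile(r'[a-z]+', re.IGNORECASE)

def format_words_safe(lines, min_length=5, max_length=7):
    """Extract the first alphabetic run from each line and filter by length."""
    cleaned = []
    for line in lines:
        if isinstance(line, bytes):
            line = line.decode('utf-8', errors='ignore')
        match = WORD_PATTERN.search(line)
        # Lines without any letters are skipped instead of raising an error.
        if match and min_length <= len(match.group()) <= max_length:
            cleaned.append(match.group())
    return cleaned

print(format_words_safe([b"apple 12", "123", "banana\n", "", "kiwi"]))
```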
Conclusion
I learned about Python, different word lists, and implementing randomness for a user. Most importantly, I had fun coding it!
I hope you had fun reading and learning about it, and that you picked something up from this simple piece of code as well.
Kevin MacNeel likes soccer, chess, hiking, traveling, and computer programming. His fascination with the development of machine intelligence that mimics human intelligence reflects his interest in using technology to make the world a better place.
Kevin works as a software developer who specializes in web development. He enjoys the challenge of creating innovative software and applications that help businesses and individuals achieve their goals. Kevin also pursues computer programming as a hobby in his free time, working on personal projects that allow him to explore new technologies and techniques.
Kevin is fascinated by the field of machine intelligence. He is particularly interested in the development of machine intelligence that mimics human intelligence, which is often referred to as artificial intelligence (AI). Kevin believes that AI has the potential to revolutionize many industries, from healthcare to transportation and software development. He is fascinated by the challenge of creating AI systems that can accurately mimic human intelligence and the ethical implications of using such systems in society.