Advertisement
Advertisement


Random string generation with upper case letters and digits


Question

I want to generate a string of size N.

It should be made up of numbers and uppercase English letters such as:

  • 6U1S75
  • 4Z4UKK
  • U911K4

How can I achieve this in a pythonic way?

2019/03/28
1
1364
3/28/2019 6:10:36 PM

Accepted Answer

Answer in one line:

''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(N))

or even shorter starting with Python 3.6 using random.choices():

''.join(random.choices(string.ascii_uppercase + string.digits, k=N))

A cryptographically more secure version; see https://stackoverflow.com/a/23728630/2213647:

''.join(random.SystemRandom().choice(string.ascii_uppercase + string.digits) for _ in range(N))

In details, with a clean function for further reuse:

>>> import string
>>> import random
>>> def id_generator(size=6, chars=string.ascii_uppercase + string.digits):
...    return ''.join(random.choice(chars) for _ in range(size))
...
>>> id_generator()
'G5G74W'
>>> id_generator(3, "6793YUIO")
'Y3U'

How does it work ?

We import string, a module that contains sequences of common ASCII characters, and random, a module that deals with random generation.

string.ascii_uppercase + string.digits just concatenates the list of characters representing uppercase ASCII chars and digits:

>>> string.ascii_uppercase
'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
>>> string.digits
'0123456789'
>>> string.ascii_uppercase + string.digits
'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'

Then we use a list comprehension to create a list of 'n' elements:

>>> range(4) # range create a list of 'n' numbers
[0, 1, 2, 3]
>>> ['elem' for _ in range(4)] # we use range to create 4 times 'elem'
['elem', 'elem', 'elem', 'elem']

In the example above, we use [ to create the list, but we don't in the id_generator function so Python doesn't create the list in memory, but generates the elements on the fly, one by one (more about this here).

Instead of asking to create 'n' times the string elem, we will ask Python to create 'n' times a random character, picked from a sequence of characters:

>>> random.choice("abcde")
'a'
>>> random.choice("abcde")
'd'
>>> random.choice("abcde")
'b'

Therefore random.choice(chars) for _ in range(size) really is creating a sequence of size characters. Characters that are randomly picked from chars:

>>> [random.choice('abcde') for _ in range(3)]
['a', 'b', 'b']
>>> [random.choice('abcde') for _ in range(3)]
['e', 'b', 'e']
>>> [random.choice('abcde') for _ in range(3)]
['d', 'a', 'c']

Then we just join them with an empty string so the sequence becomes a string:

>>> ''.join(['a', 'b', 'b'])
'abb'
>>> [random.choice('abcde') for _ in range(3)]
['d', 'c', 'b']
>>> ''.join(random.choice('abcde') for _ in range(3))
'dac'
2593
8/4/2018 7:28:53 AM

This Stack Overflow quesion is the current top Google result for "random string Python". The current top answer is:

''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(N))

This is an excellent method, but the PRNG in random is not cryptographically secure. I assume many people researching this question will want to generate random strings for encryption or passwords. You can do this securely by making a small change in the above code:

''.join(random.SystemRandom().choice(string.ascii_uppercase + string.digits) for _ in range(N))

Using random.SystemRandom() instead of just random uses /dev/urandom on *nix machines and CryptGenRandom() in Windows. These are cryptographically secure PRNGs. Using random.choice instead of random.SystemRandom().choice in an application that requires a secure PRNG could be potentially devastating, and given the popularity of this question, I bet that mistake has been made many times already.

If you're using python3.6 or above, you can use the new secrets module as mentioned in MSeifert's answer:

''.join(secrets.choice(string.ascii_uppercase + string.digits) for _ in range(N))

The module docs also discuss convenient ways to generate secure tokens and best practices.

2019/06/26

Simply use Python's builtin uuid:

If UUIDs are okay for your purposes, use the built-in uuid package.

One Line Solution:

import uuid; uuid.uuid4().hex.upper()[0:6]

In Depth Version:

Example:

import uuid
uuid.uuid4() #uuid4 => full random uuid
# Outputs something like: UUID('0172fc9a-1dac-4414-b88d-6b9a6feb91ea')

If you need exactly your format (for example, "6U1S75"), you can do it like this:

import uuid

def my_random_string(string_length=10):
    """Returns a random string of length string_length."""
    random = str(uuid.uuid4()) # Convert UUID format to a Python string.
    random = random.upper() # Make all characters uppercase.
    random = random.replace("-","") # Remove the UUID '-'.
    return random[0:string_length] # Return the random string.

print(my_random_string(6)) # For example, D9E50C
2018/07/31

A simpler, faster but slightly less random way is to use random.sample instead of choosing each letter separately, If n-repetitions are allowed, enlarge your random basis by n times e.g.

import random
import string

char_set = string.ascii_uppercase + string.digits
print ''.join(random.sample(char_set*6, 6))

Note: random.sample prevents character reuse, multiplying the size of the character set makes multiple repetitions possible, but they are still less likely then they are in a pure random choice. If we go for a string of length 6, and we pick 'X' as the first character, in the choice example, the odds of getting 'X' for the second character are the same as the odds of getting 'X' as the first character. In the random.sample implementation, the odds of getting 'X' as any subsequent character are only 6/7 the chance of getting it as the first character

2016/02/13

import uuid
lowercase_str = uuid.uuid4().hex  

lowercase_str is a random value like 'cea8b32e00934aaea8c005a35d85a5c0'

uppercase_str = lowercase_str.upper()

uppercase_str is 'CEA8B32E00934AAEA8C005A35D85A5C0'

2018/01/25

A faster, easier and more flexible way to do this is to use the strgen module (pip install StringGenerator).

Generate a 6-character random string with upper case letters and digits:

>>> from strgen import StringGenerator as SG
>>> SG("[\u\d]{6}").render()
u'YZI2CI'

Get a unique list:

>>> SG("[\l\d]{10}").render_list(5,unique=True)
[u'xqqtmi1pOk', u'zmkWdUr63O', u'PGaGcPHrX2', u'6RZiUbkk2i', u'j9eIeeWgEF']

Guarantee one "special" character in the string:

>>> SG("[\l\d]{10}&[\p]").render()
u'jaYI0bcPG*0'

A random HTML color:

>>> SG("#[\h]{6}").render()
u'#CEdFCa'

etc.

We need to be aware that this:

''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(N))

might not have a digit (or uppercase character) in it.

strgen is faster in developer-time than any of the above solutions. The solution from Ignacio is the fastest run-time performing and is the right answer using the Python Standard Library. But you will hardly ever use it in that form. You will want to use SystemRandom (or fallback if not available), make sure required character sets are represented, use unicode (or not), make sure successive invocations produce a unique string, use a subset of one of the string module character classes, etc. This all requires lots more code than in the answers provided. The various attempts to generalize a solution all have limitations that strgen solves with greater brevity and expressive power using a simple template language.

It's on PyPI:

pip install StringGenerator

Disclosure: I'm the author of the strgen module.

2019/03/22