This article shows how to generate large files using Python.
1. The environment
Python 2.7.10
2. The targets
There are four targets in this post:
generate a big binary file filled with random hex codes
generate a big text file filled with random alphabets/letters
generate a big empty/sparse file
generate a big text file filled with lines of random strings (sentences)
2.1 generate a big binary file filled with random hex codes
The script uses the os.urandom function to generate random bytes. This is the explanation of this function from the Python documentation:
os.urandom(n)
Return a string of n random bytes suitable for cryptographic use.
This function returns random bytes from an OS-specific randomness source. The returned data should be unpredictable enough for cryptographic applications, though its exact quality depends on the OS implementation. On a UNIX-like system this will query /dev/urandom, and on Windows it will use CryptGenRandom(). If a randomness source is not found, NotImplementedError will be raised.
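A minimal sketch of this approach follows. The function name generate_binary_file, its parameters, and the chunk size are assumptions made for illustration. Writing in chunks keeps memory use bounded when the target size is large.

```python
import os


def generate_binary_file(path, size_bytes, chunk_size=1024 * 1024):
    """Write size_bytes random bytes to path, one chunk at a time.

    The name and parameters are hypothetical; only os.urandom comes
    from the text above.
    """
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            # Never request more than what is still missing.
            n = min(chunk_size, remaining)
            f.write(os.urandom(n))
            remaining -= n
```

For example, generate_binary_file("big_binary.bin", 1024 * 1024 * 1024) would write a 1 GiB file (the file name is a placeholder).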
run it:
and we got this result:
2.2 generate a big text file filled with random alphabets/letters
The key point is the random.choice function. Here is an introduction to this function:
random.choice(seq)
Return a random element from the non-empty sequence seq. If seq is empty, raises IndexError.
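A minimal sketch of this idea could look like the following. The function name generate_letter_file, its parameters, and the use of string.ascii_letters as the alphabet are assumptions. Each chunk is built by calling random.choice once per character.

```python
import random
import string


def generate_letter_file(path, size_chars, chunk_size=64 * 1024):
    """Write size_chars random ASCII letters to path.

    The name, parameters and alphabet are hypothetical choices.
    """
    letters = string.ascii_letters  # a-z followed by A-Z
    remaining = size_chars
    with open(path, "w") as f:
        while remaining > 0:
            n = min(chunk_size, remaining)
            # Pick one random letter per position in this chunk.
            f.write("".join(random.choice(letters) for _ in range(n)))
            remaining -= n
```

Because every character is a single-byte ASCII letter, the resulting file size in bytes equals size_chars.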
run it:
and we got this result:
2.3 generate a big empty/sparse file
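The key point is the f.seek call: it moves the file pointer to the last position of the desired file size, and a single byte is then written there. On file systems that support sparse files, the skipped region typically occupies no disk space.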
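The seek-then-write idea can be sketched as below. The function name generate_sparse_file and the choice of 0x01 as the final byte are assumptions. Whether the file is actually stored sparsely depends on the operating system and file system.

```python
def generate_sparse_file(path, size_bytes):
    """Create a file of size_bytes by seeking to the last offset and
    writing one byte there.

    The name and the 0x01 marker byte are hypothetical choices.
    """
    if size_bytes < 1:
        raise ValueError("size_bytes must be at least 1")
    with open(path, "wb") as f:
        # Jump straight to the final byte position; nothing is written
        # for the bytes before it.
        f.seek(size_bytes - 1)
        f.write(b"\x01")
```

Reading the skipped region back returns zero bytes, and only the last byte holds the written value.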
run it:
and we got this file content:
We can see that the byte 0001 is at the end of the file.
2.4 generate a big text file filled with lines of random strings (sentences)
The key points are:
set up four arrays that contain the elements of a sentence, e.g. nouns, verbs, adverbs and adjectives
combine them into an array of arrays
use random.choice to pick a random word from each array and build a random sentence
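The steps above can be sketched as follows. The word lists, the word order within a sentence, and the function names are all assumptions made for illustration; any vocabulary would do.

```python
import random

# Hypothetical sample vocabulary.
nouns = ["dog", "teacher", "river", "engine"]
verbs = ["runs", "reads", "jumps", "sings"]
adverbs = ["quickly", "quietly", "happily", "slowly"]
adjectives = ["red", "tall", "noisy", "clever"]

# Array of arrays, in the order the words appear in a sentence:
# adjective, noun, verb, adverb.
word_lists = [adjectives, nouns, verbs, adverbs]


def random_sentence():
    """Build one sentence by picking a random word from each list."""
    return " ".join(random.choice(words) for words in word_lists)


def generate_sentence_file(path, line_count):
    """Write line_count random sentences to path, one per line."""
    with open(path, "w") as f:
        for _ in range(line_count):
            f.write(random_sentence() + "\n")
```

Calling generate_sentence_file("sentences.txt", 1000) would produce a file with 1000 lines (the file name is a placeholder).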
run it (generate 1000 lines of sentences):
and we got this file content:
You can find detailed documentation on Python I/O here: