Reading Text Files

File Object Methods for Reading a Text File:

  • .read(): read the entire file at once into one long string
  • .readline(): read the file one line at a time
  • .readlines(): read all lines into a list, with each line as an element in the list.

.read(), .readline() and .readlines() all include the EOL character in the strings they return. Removing the EOL character requires extra string processing; see .strip(), .lstrip() and .rstrip(), which remove the EOL or other characters.
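As a rough illustration of the stripping methods mentioned above, the sketch below reads one line and applies each method to it. The file name sample.txt, its contents, and the temporary directory are assumptions made only so the example can run on its own.

```python
import os
import tempfile

# Hypothetical sample file, created here only so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
f_out = open(path, 'w')
f_out.write('  first line\nsecond line\n')
f_out.close()

f_in = open(path, 'r')
raw_line = f_in.readline()          # the EOL character is still attached
f_in.close()

no_eol = raw_line.rstrip('\n')      # removes only the trailing EOL
no_trailing = raw_line.rstrip()     # removes all trailing whitespace, including the EOL
no_leading = raw_line.lstrip()      # removes leading whitespace, keeps the EOL
both_ends = raw_line.strip()        # removes whitespace from both ends

# repr() makes the spaces and the '\n' visible when printed
print(repr(raw_line))
print(repr(no_eol))
print(repr(no_trailing))
print(repr(no_leading))
print(repr(both_ends))
```

Calling .rstrip('\n') is the most conservative choice when only the EOL should go, because it leaves any other whitespace in the line untouched.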

Before a file can be read, it first must be opened. See the File Processing note.

.read()

.read() is a method of a file object. The file object is created by the open() function.

When the read() method is called, it returns data from the file. If read() is not given a parameter, the entire file is read and returned as a single string, which is normally assigned to a variable.

If the EOF (End Of File) has already been reached, read() will return an empty string "" (or ''; either a pair of single or double quotes can be used).

CAUTION: if the file is larger than the amount of memory available on the computer you are using, the program may crash or fail with a MemoryError.
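One way to limit memory use is to give read() a size parameter so that only part of the file is held at a time. The following is a minimal sketch of that idea, not the only approach; the file name, its contents, and the tiny chunk_size of 4 characters are assumptions chosen to keep the example short.

```python
import os
import tempfile

# Hypothetical sample file, created here only so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
f_out = open(path, 'w')
f_out.write('abcdefghij')
f_out.close()

chunk_size = 4                      # assumed value; a real program would use a much larger size
chunks = []

f_in = open(path, 'r')
chunk = f_in.read(chunk_size)       # read at most chunk_size characters
while chunk != '':                  # read() returns '' once the EOF has been reached
    chunks.append(chunk)            # this is where each chunk would be processed
    chunk = f_in.read(chunk_size)
f_in.close()

print(chunks)
```

The loop ends because read() returns an empty string at the EOF, the same behavior described above for read() without a parameter.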

.read() Syntax

Note: To read() a file, it first must be opened

string_variable = file_object.read()

string_variable: the variable that receives the string returned by read().

file_object: the name that will be used in the program to process the file.

.read() Example

Read an entire file into a single variable:

fileToRead = 'textFile.txt'

f_in = open(fileToRead,'r')

print ('Reading file ' + fileToRead)

data = f_in.read()

# all the data in the file is now the variable data

# you can use any valid variable name in place of data

# this is where the data would be processed

print(data)

f_in.close()

Read an entire file into a list of lines, removing the EOL characters at the same time:

# all EOL characters are removed and each line is placed in a list

fname = 'textFile.txt'

f = open(fname, 'r')

lines = f.read().splitlines()

print(lines)

f.close()
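To make the difference between the two examples above concrete, the sketch below reads the same file both ways and prints the results. The file name and its three-line contents are assumptions made so the example can run on its own.

```python
import os
import tempfile

# Hypothetical sample file, created here only so the example is self-contained.
fname = os.path.join(tempfile.mkdtemp(), 'sample.txt')
f = open(fname, 'w')
f.write('one\ntwo\nthree\n')
f.close()

# read() alone: one string, EOL characters included
f = open(fname, 'r')
whole_text = f.read()
f.close()

# read().splitlines(): a list of lines, EOL characters removed
f = open(fname, 'r')
lines = f.read().splitlines()
f.close()

print(repr(whole_text))
print(lines)
```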

.readline()

.readline() is a method of a file object. The file object is created by the open() function.

When the .readline() method is called, it reads one line from a text file. The EOL (End Of Line) is represented by the '\n' character. Every line in a text file ends with an EOL character, except possibly the last line in the file.

If the EOF (End Of File) has been reached, .readline() will return an empty string "". A blank line in the middle of the file is returned as '\n', so it is not mistaken for the EOF.
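The short sketch below shows what .readline() returns for an ordinary line, a blank line, a last line without an EOL character, and a call made after the EOF. The file name and contents are assumptions chosen to cover those four cases.

```python
import os
import tempfile

# Hypothetical sample file: a blank line in the middle, no EOL on the last line.
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
f_out = open(path, 'w')
f_out.write('first\n\nlast')
f_out.close()

results = []
f_in = open(path, 'r')
for _ in range(4):                  # one call more than the number of lines in the file
    results.append(f_in.readline())
f_in.close()

# A blank line comes back as '\n'; only the EOF comes back as ''
print(results)
```

Because a blank line still contains its EOL character, comparing the result with '' is a reliable test for the EOF.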

.readline() Syntax

Note: To readline() a file, it first must be opened

string_variable = file_object.readline()

string_variable: the variable that receives the line returned by readline().

file_object: the name that will be used in the program to process the file.

.readline() Example

This is a very traditional method of reading a file. The logic used here is called "priming the loop". The first line of the file is read before the while loop. The while loop checks whether the EOF has been reached. Provided the EOF has not been reached, the data is processed and another line is read.

fname = 'textFile.txt'

f_input = open(fname,'r')

# f_input is the file object created by the open() function

# read the first line of the file

one_line_of_data = f_input.readline()

while '' != one_line_of_data: # keep looping while the EOF has not been read

    print(one_line_of_data)

    one_line_of_data = f_input.readline()

print ('done')

f_input.close()

This method uses an implied readline(). The for statement does the reading and quits when the EOF is reached.

fname = 'textFile.txt'

f_input = open(fname,'r')

# implied reading line by line -- no readline required

# line is a variable and can be any valid variable name

# f_input is the file object created by the open() function

for line in f_input:

    # process the data in line here

    # NOTICE that the data is printed double spaced.

    # One of the EOLs is in the file;

    # the other is created by the print() function.

    # If the last line in the file does not have an EOL character,

    # it is not double spaced.

    print(line)

print ('done')

f_input.close()
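If the double spacing is not wanted, one option is to remove the EOL character from each line before printing it. The sketch below shows this under the assumption of a small three-line file; the file name, its contents, and the list called printed are hypothetical and exist only to make the example runnable and checkable.

```python
import os
import tempfile

# Hypothetical sample file, created here only so the example is self-contained.
fname = os.path.join(tempfile.mkdtemp(), 'sample.txt')
f_output = open(fname, 'w')
f_output.write('alpha\nbeta\ngamma')
f_output.close()

printed = []                        # keeps a copy of what was printed

f_input = open(fname, 'r')
for line in f_input:
    text = line.rstrip('\n')        # drop the EOL that came from the file
    printed.append(text)
    print(text)                     # print() supplies the only EOL, so output is single spaced
f_input.close()

print('done')
```

Another option is print(line, end=''), which keeps the EOL from the file and stops print() from adding a second one.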

.readlines()

.readlines() is a method of a file object. The file object is created by the open() function.

When the .readlines() method is called, the entire file is read into a list with each line being one element of the list.

If the EOF (End Of File) has already been reached, .readlines() will return an empty list [].

.readlines() Syntax

Note: To .readlines() a file, it first must be opened

listOfStrings_variable = file_object.readlines()

listOfStrings_variable: a list (an array) of strings. Each element in the list contains a string that holds one line from the file.

file_object: the name that will be used in the program to process the file.

.readlines() Example

fileToRead = 'textFile.txt'

f_in = open(fileToRead, 'r')

print('Reading file ' + fileToRead)

list_of_data = f_in.readlines()

# the data in the file can now be processed in the variable list_of_data

# the EOL \n is in the string

print(list_of_data)

print("done")

f_in.close()
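As a follow-up, the sketch below shows what the list from .readlines() looks like, what a second call returns once the EOF has been reached, and one possible way to remove the EOL characters from every element. The file name, its contents, and the names second_call and clean_list are assumptions used only for illustration.

```python
import os
import tempfile

# Hypothetical sample file, created here only so the example is self-contained.
fileToRead = os.path.join(tempfile.mkdtemp(), 'sample.txt')
f_out = open(fileToRead, 'w')
f_out.write('red\ngreen\nblue\n')
f_out.close()

f_in = open(fileToRead, 'r')
list_of_data = f_in.readlines()     # every element still ends with the EOL character
second_call = f_in.readlines()      # the EOF has already been reached, so nothing is left
f_in.close()

# build a new list with the EOL removed from each element
clean_list = [line.rstrip('\n') for line in list_of_data]

print(list_of_data)
print(second_call)
print(clean_list)
```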