How to Retrieve the Line Count of a File in Python : Kat McKelvie

How to Retrieve the Line Count of a File in Python
by: Kat McKelvie
The post content below was copied from Finxter.



Problem Formulation and Solution Overview

In this article, you’ll learn how to quickly retrieve the line count of a file in Python.

To follow along, save the contents below to a flat-text file called mona_lisa.txt and move this file to the current working directory.

The Mona Lisa: A painting by Leonardo da Vinci
Leonardo da Vinci began painting the Mona Lisa about 1503, which was in his studio when he died in 1519. He worked on it intermittently over several years, adding multiple layers of thin oil glazes at different times.

Reference: https://www.britannica.com/topic/Mona-Lisa-painting

💬 Question: How would we write Python code to retrieve the line count?

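We can accomplish this task by one of the following options:

  • Method 1: Use open() and len()
  • Method 2: Use sum()
  • Method 3: Use read() and split()
  • Method 4: Use List Comprehension
  • Method 5: Use List Comprehension and a Generator
  • Bonus: Use NumPy loadtxt()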

Method 1: Use open() and len()

This method uses three (3) functions, open(), readlines(), and len(), to retrieve the file’s line count. It suits reasonably sized files, because readlines() loads every line into memory at once.

with open('mona_lisa.txt', 'r') as fp:
    line_count = len(fp.readlines())
print(line_count)

The code above opens the file mona_lisa.txt in read (r) mode, creating a File Object (similar to below; the encoding shown depends on the platform default). This object is assigned to fp, allowing access to and manipulation of the stated file.

<_io.TextIOWrapper name='mona_lisa.txt' mode='r' encoding='cp1252'>

The next line does the following:

  • Reads the contents of the stated flat-text file into a list of lines (readlines()).
  • Passes this list as an argument to the len() function, which returns the file’s line count (including blank lines).
  • Saves the result to line_count.

Then, line_count is output to the terminal.

4
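As a minimal sketch, the same approach can be wrapped in a reusable function. The names count_lines and write_sample, the explicit utf-8 encoding, and the abbreviated sample text are assumptions made for illustration and do not come from the original article.

```python
import os
import tempfile

# Abbreviated stand-in for mona_lisa.txt: four lines, the third one blank.
SAMPLE_TEXT = (
    "The Mona Lisa: A painting by Leonardo da Vinci\n"
    "Leonardo da Vinci began painting the Mona Lisa about 1503.\n"
    "\n"
    "Reference: https://www.britannica.com/topic/Mona-Lisa-painting"
)


def count_lines(path, encoding="utf-8"):
    """Return the number of lines in a text file, blank lines included."""
    with open(path, "r", encoding=encoding) as fp:
        # readlines() loads the whole file into a list, one item per line.
        return len(fp.readlines())


def write_sample(text):
    """Write text to a temporary file and return its path (demo helper)."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w", encoding="utf-8", newline="") as fp:
        fp.write(text)
    return path


path = write_sample(SAMPLE_TEXT)
print(count_lines(path))  # 4
os.remove(path)
```

Passing the encoding explicitly avoids relying on the platform default, such as the cp1252 shown in the File Object above.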

Method 2: Use sum()

This method uses the sum() function. This function takes two (2) arguments: an iterable (required) and a start value (optional, default 0) that is added to the total.

line_count = sum(1 for x in open('mona_lisa.txt', 'r'))
print(line_count)

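The above code snippet calls the sum() function and passes it a generator expression that opens the mona_lisa.txt file in read (r) mode. Because the file is opened inline rather than in a with block, it is not closed explicitly.

The generator yields one (1) for each line in the file (including blank lines), and sum() adds these values together. The result is saved to line_count.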

Then, line_count is output to the terminal.

4
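A variation on this method, sketched below, keeps the sum() technique but opens the file in a with block so that it is closed as soon as counting finishes. The function name count_lines_lazy, the demo file name, and the utf-8 encoding are illustrative assumptions.

```python
import os
import tempfile


def count_lines_lazy(path, encoding="utf-8"):
    """Count lines one at a time without loading the whole file."""
    with open(path, "r", encoding=encoding) as fp:
        # Iterating over the file object yields one line per step.
        return sum(1 for _ in fp)


# Demo on a throwaway file with four lines (the second one is blank).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.txt")
    with open(demo_path, "w", encoding="utf-8") as fp:
        fp.write("first\n\nthird\nfourth")
    print(count_lines_lazy(demo_path))  # 4
```

Since only one line is held in memory at a time, this form is a reasonable choice for larger files.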

Method 3: Use read() and split()

This method uses open(), read(), split() and len() to determine a file’s line count. It is less efficient than the other solutions, because it reads the whole file into one string and then builds a list from it, but it gets the job done.

with open('mona_lisa.txt', 'r') as fp:
    all_lines = fp.read()
line_count = len(all_lines.split('\n'))
print(line_count)

The code above opens the mona_lisa.txt file in read (r) mode. Then, read() is called with no argument. The result is saved to all_lines.

💡Note: Passing no argument into read() means to read in the entire file (including blank lines).

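Next, the contents of all_lines are split on the newline character (\n), and the length of the resulting list (the total number of lines) is saved to line_count. If the file ends with a newline character, split('\n') produces one extra empty string at the end, so the count is one higher than the number of lines.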

Then, line_count is output to the terminal.

4
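The sketch below illustrates how a trailing newline affects this method, and how str.splitlines() behaves on the same input. The helper names and sample strings are made up for illustration. Note that splitlines() also treats some other characters, such as \r, as line boundaries.

```python
def count_with_split(text):
    """Count lines by splitting on the newline character."""
    return len(text.split("\n"))


def count_with_splitlines(text):
    """Count lines with str.splitlines(), which ignores a final newline."""
    return len(text.splitlines())


no_trailing = "line one\nline two\n\nline four"
with_trailing = no_trailing + "\n"

print(count_with_split(no_trailing))         # 4
print(count_with_split(with_trailing))       # 5
print(count_with_splitlines(no_trailing))    # 4
print(count_with_splitlines(with_trailing))  # 4
```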

Method 4: Use List Comprehension

This method uses List Comprehension and len() to retrieve the file’s line count while ignoring blank lines.

lines = [x for x in open('mona_lisa.txt') if len(x) > 1]
print(len(lines))

The code above opens the file mona_lisa.txt in read (r) mode, which is the default when no mode is given. Then each line is examined, and if the line length exceeds one (1), it is appended to lines.

💡Note: The code (if len(x) > 1) checks to see if the line in question contains data. A blank line consists of only the newline character (\n), so it has a length of one (1) and is not appended.

The contents of lines are displayed below.

['The Mona Lisa: A painting by Leonardo da Vinci\n', 'Leonardo da Vinci began painting the Mona Lisa about 1503, which was in his studio when he died in 1519. He worked on it intermittently over several years, adding multiple layers of thin oil glazes at different times. \n', 'Reference: https://www.britannica.com/topic/Mona-Lisa-painting']

Then, the length of lines (len(lines)) is output to the terminal.

3
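The length check has two edge cases worth knowing about, sketched here with made-up sample data: a line holding only spaces is longer than one character and is therefore counted, while a one-character final line with no trailing newline is skipped. Testing x.strip() instead, as one possible alternative, handles both. The helper names are hypothetical.

```python
def count_by_length(lines):
    """Count lines longer than one character (the check used above)."""
    return sum(1 for x in lines if len(x) > 1)


def count_by_strip(lines):
    """Count lines that still contain text after stripping whitespace."""
    return sum(1 for x in lines if x.strip())


sample = ["title\n", "\n", "   \n", "body text\n", "last line"]

print(count_by_length(sample))  # 4, the whitespace-only line is counted
print(count_by_strip(sample))   # 3

# A single-character last line without a newline has length one.
print(count_by_length(["x"]))   # 0
print(count_by_strip(["x"]))    # 1
```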

Method 5: Use List Comprehension and a Generator

This method uses List Comprehension and a Generator to retrieve the file’s line count while ignoring blank lines.

with open('mona_lisa.txt') as fp:
    line_count = [ln for ln in (line.strip() for line in fp) if ln]
print(len(line_count))

The code above opens the file mona_lisa.txt in read (r) mode, which is the default when no mode is given, creating a File Object (similar to below). This object is assigned to fp, allowing access to and manipulation of the stated file.

<_io.TextIOWrapper name='mona_lisa.txt' mode='r' encoding='cp1252'>

The Generator strips any leading or trailing whitespace (including the newline character) from each line in the file, and the List Comprehension loops through the stripped lines. If a line still contains data, it is appended to line_count, which despite its name is a list of the non-blank lines.

Next, the length of line_count is determined (len(line_count)) and output to the terminal.

3
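If only the count is needed, the list does not have to be built at all. The sketch below, which assumes a hypothetical helper named count_non_blank, combines the strip() check with sum() and accepts any iterable of lines, including an open file object. io.StringIO stands in for a file here so the example runs without touching the disk.

```python
import io


def count_non_blank(lines):
    """Count lines that contain something other than whitespace."""
    return sum(1 for line in lines if line.strip())


# StringIO behaves like an open text file for iteration purposes.
demo = io.StringIO("first\n\n   \nsecond\nthird")
print(count_non_blank(demo))  # 3
```

With a real file, the call would look like count_non_blank(fp) inside a with open(...) as fp: block.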

Bonus: Use NumPy loadtxt()

What if you needed to determine the line count from a file containing floating-point numbers? You could use NumPy’s loadtxt() function.

The contents of the flat-text file nums.txt are shown below.

110.90 146.03
44.83 211.82
97.13 209.30
105.64 164.21
23.55 435.67

import numpy as np
data = np.loadtxt('nums.txt')
print(len(data))

The first line imports the NumPy library. If this library is not installed, it can be installed with pip install numpy.

Then, nums.txt is read using NumPy’s loadtxt() function. The contents are saved to data as follows.

[[110.9  146.03]
 [ 44.83 211.82]
 [ 97.13 209.3 ]
 [105.64 164.21]
 [ 23.55 435.67]]

Then, len(data) is called to determine the number of rows in the array, which matches the file’s line count here, and the result is output to the terminal.

5

Summary

These methods of retrieving a file’s line count should give you enough information to select the best one for your coding requirements.

Good Luck & Happy Coding!



July 10, 2022 at 11:00PM

=============================
The original post is available in Finxter by Kat McKelvie
============================