
Find The Number Of Characters In A File Using Python

Here is the question: I have a file with these words: hey how are you I am fine and you Yes I am fine. I am asked to find the number of words, lines, and characters. Below is m

Solution 1:

Sum up the length of all words in a line:

characters += sum(len(word) for word in wordslist)

The whole program:

with open('my_words.txt') as infile:
    lines = 0
    words = 0
    characters = 0
    for line in infile:
        wordslist = line.split()
        lines = lines + 1
        words = words + len(wordslist)
        characters += sum(len(word) for word in wordslist)
print(lines)
print(words)
print(characters)

Output:

3
13
35

This:

(len(word) for word in wordslist)

is a generator expression. It is essentially a loop in one line that produces the length of each word. We feed these lengths directly to sum:

sum(len(word) for word in wordslist)
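To make the equivalence concrete, here is a small sketch that computes the same total with an explicit loop and with the generator expression. The sample line is typed in by hand for illustration rather than read from the file.

```python
# Hypothetical sample: the words from one line of the file.
wordslist = "I am fine and you".split()

# Explicit loop: add up the length of each word.
total_loop = 0
for word in wordslist:
    total_loop += len(word)

# Generator expression: the same computation, fed directly to sum().
total_gen = sum(len(word) for word in wordslist)

print(total_loop, total_gen)
```

Both totals should agree, since the generator expression simply yields the same lengths one at a time.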

Improved version

This version takes advantage of enumerate, so you save two lines of code, while keeping the readability:

with open('my_words.txt') as infile:
    words = 0
    characters = 0
    for lineno, line in enumerate(infile, 1):
        wordslist = line.split()
        words += len(wordslist)
        characters += sum(len(word) for word in wordslist)

print(lineno)
print(words)
print(characters)
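As a rough illustration of what enumerate(infile, 1) produces, the sketch below uses io.StringIO as a stand-in for the file. That stand-in and the three-line layout of the text are assumptions, chosen so the example runs without my_words.txt on disk. Note that lineno is only assigned when the loop body runs at least once, so the sketch sets a fallback of 0 for an empty file.

```python
import io

# Stand-in for the opened file; the three-line layout is assumed.
infile = io.StringIO("hey how are you\nI am fine and you\nYes I am fine\n")

lineno = 0  # fallback so an empty file reports 0 lines
pairs = []
for lineno, line in enumerate(infile, 1):
    # enumerate(..., 1) starts counting at 1 instead of 0
    pairs.append((lineno, line.rstrip("\n")))

print(pairs)
print(lineno)
```

After the loop, lineno holds the number of the last line read, which is why it can double as the line count.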

This line:

with open('my_words.txt') as infile:

opens the file with the promise to close it as soon as you leave the indented block. It is always good practice to close a file after you are done using it.
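The closing behavior can be checked with the file object's closed attribute. This is a minimal sketch that writes a throwaway temporary file (an assumption, so that it does not depend on my_words.txt existing) and inspects closed inside and outside the with block.

```python
import os
import tempfile

# Create a throwaway file so the example does not depend on my_words.txt.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as tmp:
    tmp.write("hey how are you\n")

with open(path) as infile:
    inside = infile.closed        # still open inside the block
    first_line = infile.readline()

outside = infile.closed           # closed once the block is left

os.remove(path)                   # clean up the temporary file
print(inside, outside)
```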

Solution 2:

Remember that each line (except possibly the last) ends with a line separator, that is, "\r\n" on Windows or "\n" on Linux and Mac.

Thus, when you count with len(line), exactly two extra characters are added in this case, which is why you get 47 and not 45.

A nice way to overcome this could be to use:

import os

fname = input("enter the name of the file:")
infile = open(fname, 'r')
lines = 0
words = 0
characters = 0
for line in infile:
    # drop the line separator so it is not counted as a character
    line = line.strip(os.linesep)
    wordslist = line.split()
    lines = lines + 1
    words = words + len(wordslist)
    characters = characters + len(line)
infile.close()
print(lines)
print(words)
print(characters)
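The 47 versus 45 difference can be reproduced without a file. The sketch below assumes the sample text is laid out on three lines with no newline after the last one, and uses io.StringIO in place of the opened file. Spaces are counted here, unlike in Solution 1.

```python
import io

# Assumed layout of the sample file: three lines, no newline after the last.
text = "hey how are you\nI am fine and you\nYes I am fine"

# Counting len(line) directly includes the "\n" at the end of the first two lines.
with_newlines = sum(len(line) for line in io.StringIO(text))

# Stripping the newline first counts only letters and spaces.
without_newlines = sum(len(line.rstrip("\n")) for line in io.StringIO(text))

print(with_newlines, without_newlines)
```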

Solution 3:

To count the characters, you should count each individual word. So you could have another loop that counts characters:

for word in wordslist:
    characters += len(word)

That ought to do it. The wordslist should probably take away newline characters on the right, something like wordslist = line.rstrip().split() perhaps.
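A quick check suggests the rstrip() is harmless but probably not needed: split() with no arguments already discards leading and trailing whitespace, including the newline. The sample line below is typed in by hand for illustration.

```python
line = "hey how are you\n"

plain = line.split()                    # no explicit stripping
stripped_first = line.rstrip().split()  # strip the newline first

# Count characters word by word, as in the loop above.
characters = 0
for word in plain:
    characters += len(word)

print(plain == stripped_first, characters)
```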

Solution 4:

I found this solution very simple and readable:

with open("filename", 'r') as file:
    text = file.read().strip().split()
    len_chars = sum(len(word) for word in text)
    print(len_chars)

Solution 5:

This is too long for a comment.

Python 2 or 3? Because it really matters. Try out the following in your REPL for both:

Python 2.7.12
>>> len("taña")
5

Python 3.5.2
>>> len("taña")
4

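Huh? The answer lies in Unicode. In Python 2, a plain string literal is a byte string, and ñ takes two bytes in UTF-8 (assuming a UTF-8 terminal), so len counts 5 bytes. In Python 3, strings are Unicode, so ñ counts as one character. Meaning it's 1 character, but not 1 byte. So unless you're working with plain ASCII text, you'd better specify which version of Python your character-counting function is for.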
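In Python 3 the different counts can be seen side by side. This sketch assumes the ñ is the single precomposed code point U+00F1, and uses unicodedata.normalize to also show the decomposed form (n followed by a combining tilde), which counts differently again.

```python
import unicodedata

word = "ta\u00f1a"  # "taña" with the precomposed ñ (U+00F1)

code_points = len(word)                          # characters as Python 3 sees them
utf8_bytes = len(word.encode("utf-8"))           # bytes, which is what Python 2's str counted
decomposed = unicodedata.normalize("NFD", word)  # n + combining tilde (U+0303)
decomposed_len = len(decomposed)

print(code_points, utf8_bytes, decomposed_len)
```

So even within Python 3, the answer to "how many characters" can depend on how the text is normalized.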
