Word count from a txt file program

I am counting the words in a txt file with the following code:

    #!/usr/bin/python
    file=open("D:\\zzzz\\names2.txt","r+")
    wordcount={}
    for word in file.read().split():
        if word not in wordcount:
            wordcount[word] = 1
        else:
            wordcount[word] += 1
    print (word,wordcount)
    file.close();

This gives me output like this:

 >>> goat {'goat': 2, 'cow': 1, 'Dog': 1, 'lion': 1, 'snake': 1, 'horse': 1, '': 1, 'tiger': 1, 'cat': 2, 'dog': 1} 

But I want the output in the following form:

    word  wordcount
    goat  2
    cow   1
    dog   1.....

I am also getting an extra symbol in the output (  ). How can I remove it?

The funny symbols you are seeing are a UTF-8 BOM (byte order mark). To get rid of them, open the file with the correct encoding (I'm assuming you're on Python 3):

 file = open(r"D:\zzzz\names2.txt", "r", encoding="utf-8-sig") 
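
To see what `utf-8-sig` changes, here is a minimal sketch. The temporary file and its contents are made up for illustration; only the `encoding` argument is the point.

```python
import os
import tempfile

# Hypothetical sample file: a UTF-8 BOM (EF BB BF) followed by a few words.
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write(b"\xef\xbb\xbfgoat cow goat")
    path = tmp.name

# Plain "utf-8" keeps the BOM as the character U+FEFF attached to the first word.
with open(path, "r", encoding="utf-8") as f:
    words_plain = f.read().split()

# "utf-8-sig" strips a leading BOM if one is present.
with open(path, "r", encoding="utf-8-sig") as f:
    words_sig = f.read().split()

os.remove(path)

print(words_plain)
print(words_sig)
```

With plain `utf-8`, the first word is counted separately from the later `goat` because it still carries the invisible BOM character.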

Also, for the counting, you can use collections.Counter:

    from collections import Counter
    wordcount = Counter(file.read().split())

Displaying them is easy, too:

    >>> for item in wordcount.items(): print("{}\t{}".format(*item))
    ...
    snake   1
    lion    2
    goat    2
    horse   3
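
Putting the counting and display steps together, a minimal sketch might look like the following. The function names `count_words` and `format_counts` are hypothetical, and the sample string stands in for the contents of the file; for a real file, read it after opening with `encoding="utf-8-sig"`.

```python
from collections import Counter


def count_words(text):
    """Count whitespace-separated words, as Counter(text.split()) does."""
    return Counter(text.split())


def format_counts(counts):
    """Return one 'word<TAB>count' string per word, most frequent first."""
    return ["{}\t{}".format(word, n) for word, n in counts.most_common()]


# Sample text standing in for file.read() (an assumption for illustration).
sample = "goat cow dog cat goat cat lion"
wordcount = count_words(sample)

print("word\twordcount")
for line in format_counts(wordcount):
    print(line)
```

`most_common()` is used here only to get a stable, frequency-sorted order; iterating over `wordcount.items()` works just as well if the order does not matter.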

    #!/usr/bin/python
    file=open("D:\\zzzz\\names2.txt","r+")
    wordcount={}
    for word in file.read().split():
        if word not in wordcount:
            wordcount[word] = 1
        else:
            wordcount[word] += 1
    # Print one "word count" pair per line.
    for k,v in wordcount.items():
        print(k, v)

    import sys

    # The file to count is taken from the first command-line argument.
    file=open(sys.argv[1],"r+")
    wordcount={}
    for word in file.read().split():
        if word not in wordcount:
            wordcount[word] = 1
        else:
            wordcount[word] += 1
    for key in wordcount.keys():
        print("%s %s " % (key, wordcount[key]))
    file.close()

If you are using GraphLab, you can use this function. It is really powerful.

    products['word_count'] = graphlab.text_analytics.count_words(your_text)

    #!/usr/bin/python
    file=open("D:\\zzzz\\names2.txt","r+")
    wordcount={}
    for word in file.read().split():
        if word not in wordcount:
            wordcount[word] = 1
        else:
            wordcount[word] += 1
    for k,v in wordcount.items():
        print(k, v)
    file.close()

    FILE_NAME = 'file.txt'

    wordCounter = {}

    with open(FILE_NAME,'r') as fh:
        for line in fh:
            # Remove punctuation characters and lowercase the line.
            # split() then breaks the line into a list of words.
            word_list = line.replace(',','').replace('\'','').replace('.','').lower().split()
            for word in word_list:
                # Add the word to the wordCounter dictionary.
                if word not in wordCounter:
                    wordCounter[word] = 1
                else:
                    # If the word is already in the dictionary, update its count.
                    wordCounter[word] = wordCounter[word] + 1

    print('{:15}{:3}'.format('Word','Count'))
    print('-' * 18)

    # Print each word and its number of occurrences.
    for (word,occurrence) in wordCounter.items():
        print('{:15}{:3}'.format(word,occurrence))

Output:

    Word           Count
    ------------------
    of               6
    examples         2
    used             2
    development      2
    modified         2
    open-source      2
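
The chained `replace` calls in the code above remove only commas, apostrophes, and periods. As a sketch of one way to generalize that step, the snippet below deletes every character in `string.punctuation` with `str.translate` and counts with `collections.Counter`. The helper name `normalize_and_count` and the sample lines are hypothetical. Note that, unlike the code above, this also removes hyphens, so `open-source` would be counted as `opensource`.

```python
import string
from collections import Counter

# Translation table that deletes every ASCII punctuation character.
_STRIP_PUNCT = str.maketrans("", "", string.punctuation)


def normalize_and_count(lines):
    """Strip punctuation, lowercase, and count words across all lines."""
    counts = Counter()
    for line in lines:
        counts.update(line.translate(_STRIP_PUNCT).lower().split())
    return counts


# Sample lines standing in for the lines of a file (an assumption).
sample_lines = ["The cat, the dog.", "Don't wake the cat!"]
counts = normalize_and_count(sample_lines)

print('{:15}{:3}'.format('Word', 'Count'))
print('-' * 18)
for word, n in counts.most_common():
    print('{:15}{:3}'.format(word, n))
```

Any iterable of strings works as input, so an open file handle can be passed directly in place of `sample_lines`.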