295

Given an unordered list of values like

a = [5, 1, 2, 2, 4, 3, 1, 2, 3, 1, 1, 5, 2]

How can I get the frequency of each value that appears in the list, like so?

# `a` has 4 instances of `1`, 4 of `2`, 2 of `3`, 1 of `4`, 2 of `5`
b = [4, 4, 2, 1, 2] # expected output
3
  • Does this answer your question? How do I count the occurrences of a list item? Commented Jul 28, 2022 at 4:08
  • 2
    @Alireza How does it answer this question? This linked question is about counting a single, specific item from a list. This question asks to get the count of all elements in a list Commented Jul 28, 2022 at 7:33
  • @Tomerikoo see the 'user52028778' answer and just use Counter.values() Commented Jul 28, 2022 at 7:43

32 Answers

655

In Python 2.7 (or newer), you can use collections.Counter:

>>> import collections
>>> a = [5, 1, 2, 2, 4, 3, 1, 2, 3, 1, 1, 5, 2]
>>> counter = collections.Counter(a)
>>> counter
Counter({1: 4, 2: 4, 5: 2, 3: 2, 4: 1})
>>> counter.values()
dict_values([2, 4, 4, 1, 2])
>>> counter.keys()
dict_keys([5, 1, 2, 4, 3])
>>> counter.most_common(3)
[(1, 4), (2, 4), (5, 2)]
>>> dict(counter)
{5: 2, 1: 4, 2: 4, 4: 1, 3: 2}
>>> # Get the counts in order matching the original specification,
>>> # by iterating over keys in sorted order
>>> [counter[x] for x in sorted(counter.keys())]
[4, 4, 2, 1, 2]

If you are using Python 2.6 or older, you can download an implementation here.


5 Comments

@unutbu: What if I have three lists, a, b, c, for which a and b remain the same, but c changes? How do I count the value of c for which a and c are the same?
@Srivatsan: I don't understand the situation. Please post a new question where you can elaborate.
Is there a way to extract the dictionary {1:4, 2:4, 3:2, 5:2, 4:1} from the counter object ?
@Pavan: collections.Counter is a subclass of dict. You can use it in the same way you would a normal dict. If you really want a dict, however, you could convert it to a dict using dict(counter).
Is there a way to count values if the list is a set of co-ordinates? Say a = [(0,0),(0,1),(0,2),(1,0),(0,1)...] I need to get the frequency in place, preferably in another list
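On the coordinates question above: tuples are hashable, so collections.Counter can count them directly. A minimal sketch, assuming the coordinates are stored as tuples (the sample data and variable names are made up for illustration):

```python
from collections import Counter

# Hypothetical sample data: (x, y) coordinate pairs with one repeat.
points = [(0, 0), (0, 1), (0, 2), (1, 0), (0, 1)]

point_counts = Counter(points)

# Frequencies as a separate list, ordered by first appearance of each point
# (Counter keeps insertion order on Python 3.7+).
unique_points = list(point_counts)
frequencies = [point_counts[p] for p in unique_points]

print(unique_points)  # [(0, 0), (0, 1), (0, 2), (1, 0)]
print(frequencies)    # [1, 2, 1, 1]
```

If the coordinates were lists instead of tuples, they would need converting first (for example with tuple(p)), since lists are not hashable.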
174

If the list is sorted, you can use groupby from the itertools standard library (if it isn't, you can just sort it first, although this takes O(n lg n) time):

from itertools import groupby

a = [5, 1, 2, 2, 4, 3, 1, 2, 3, 1, 1, 5, 2]
[len(list(group)) for key, group in groupby(sorted(a))]

Output:

[4, 4, 2, 1, 2]

11 Comments

nice, using groupby. I wonder about its efficiency vs. the dict approach, though
The Python groupby creates a new group whenever the value it sees changes. In this case, [1,1,1,2,1,1,1] would return [3,1,3]. If you expected [6,1], then just be sure to sort the data before using groupby.
@CristianCiupitu: sum(1 for _ in group).
This is not a solution. The output doesn't tell what was counted.
[(key, len(list(group))) for key, group in groupby(a)] or {key: len(list(group)) for key, group in groupby(a)} @buhtz
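The point made in the comments about unsorted input can be checked with a short sketch. The helper name run_lengths is hypothetical; it uses sum(1 for _ in group), as suggested above, to avoid building a list per group:

```python
from itertools import groupby

def run_lengths(values):
    """Return (value, run length) pairs for consecutive equal values."""
    return [(key, sum(1 for _ in group)) for key, group in groupby(values)]

data = [1, 1, 1, 2, 1, 1, 1]

unsorted_result = run_lengths(data)        # counts runs, not totals
sorted_result = run_lengths(sorted(data))  # sorting merges the runs

print(unsorted_result)  # [(1, 3), (2, 1), (1, 3)]
print(sorted_result)    # [(1, 6), (2, 1)]
```

Only the sorted call gives per-value frequencies; the unsorted call reports each consecutive run separately.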
119

Python 2.7+ introduces Dictionary Comprehension. Building the dictionary from the list will get you the count as well as get rid of duplicates.

>>> a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
>>> d = {x:a.count(x) for x in a}
>>> d
{1: 4, 2: 4, 3: 2, 4: 1, 5: 2}
>>> a, b = d.keys(), d.values()
>>> a
[1, 2, 3, 4, 5]
>>> b
[4, 4, 2, 1, 2]

9 Comments

It's faster using a set: {x:a.count(x) for x in set(a)}
This is hugely inefficient. a.count() does a full traversal for each element in a, making this an O(N^2) quadratic approach. collections.Counter() is much more efficient because it counts in linear time (O(N)). In numbers, that means this approach will execute 1 million steps for a list of length 1000, vs. just 1000 steps with Counter(), 10^12 steps where only 10^6 are needed by Counter for a million items in a list, etc.
@stenci: sure, but the horror of using a.count() completely dwarfs the efficiency of having used a set there.
@MartijnPieters one more reason to use it fewer times :)
@DylanYoung, that's what collections.Counter does but better.
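To put rough numbers on the comments above, here is a small sketch using the list from the question. The visit counts are back-of-the-envelope estimates (each a.count call scans the whole list), not measured timings:

```python
from collections import Counter

a = [5, 1, 2, 2, 4, 3, 1, 2, 3, 1, 1, 5, 2]

by_count_all = {x: a.count(x) for x in a}       # len(a) calls to a.count
by_count_set = {x: a.count(x) for x in set(a)}  # len(set(a)) calls
by_counter = Counter(a)                         # a single pass over a

# Approximate number of element visits for each approach.
visits_all = len(a) * len(a)        # 13 * 13
visits_set = len(set(a)) * len(a)   # 5 * 13
visits_counter = len(a)             # 13

# All three produce the same mapping; only the amount of work differs.
assert by_count_all == by_count_set == dict(by_counter)
print(visits_all, visits_set, visits_counter)  # 169 65 13
```

Using set(a) helps only when there are few distinct values; with mostly unique items it is still close to quadratic.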
53

Count the number of appearances manually by iterating through the list and counting them up, using a collections.defaultdict to track what has been seen so far:

from collections import defaultdict

appearances = defaultdict(int)

for curr in a:
    appearances[curr] += 1
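The snippet above assumes a is already defined. A self-contained sketch using the list from the question, with the counts read back in ascending order of the values to match the expected output:

```python
from collections import defaultdict

a = [5, 1, 2, 2, 4, 3, 1, 2, 3, 1, 1, 5, 2]

appearances = defaultdict(int)  # missing keys start at int() == 0
for curr in a:
    appearances[curr] += 1

# Counts listed in ascending order of the values they belong to.
b = [appearances[value] for value in sorted(appearances)]
print(b)  # [4, 4, 2, 1, 2]
```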

3 Comments

+1 for collections.defaultdict. Also, in python 3.x, look up collections.Counter. It is the same as collections.defaultdict(int).
@hughdbrown, actually Counter can use multiple numeric types including float or Decimal, not just int.
collections.Counter does much more, and is a much more specialized tool, than collections.defaultdict with a numeric value type. It has extra convenience functions, and conceptually models the idea that the values represent counts rather than just being arbitrary numbers.
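A few of the conveniences mentioned in the last comment, as a minimal sketch with made-up sample data:

```python
from collections import Counter

first = Counter([1, 1, 2, 3])
second = Counter([1, 3])

combined = first + second            # adds the counts key by key
top = combined.most_common(1)        # the single most frequent item
missing = combined[99]               # absent keys count as 0, no KeyError
expanded = sorted(combined.elements())  # each item repeated by its count

print(dict(combined))  # {1: 3, 2: 1, 3: 2}
print(top)             # [(1, 3)]
print(missing)         # 0
print(expanded)        # [1, 1, 1, 2, 3, 3]
```

A defaultdict(int) also returns 0 for a missing key, but reading it inserts that key as a side effect; a Counter lookup does not.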
38

In Python 2.7+, you could use collections.Counter to count items

>>> a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
>>>
>>> from collections import Counter
>>> c=Counter(a)
>>>
>>> c.values()
[4, 4, 2, 1, 2]
>>>
>>> c.keys()
[1, 2, 3, 4, 5]

2 Comments

Counter is much slower than the default dict, and the default dict is much slower than manual use of a dict.
@JonathanRay, not anymore, stackoverflow.com/a/27802189/1382487.
32

Counting the frequency of elements is probably best done with a dictionary:

b = {}
for item in a:
    b[item] = b.get(item, 0) + 1

To remove the duplicates, use a set:

a = list(set(a))
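Note that set does not keep the original order. If first-seen order matters, the keys of the counting dictionary already give the unique items in that order, since dicts preserve insertion order on Python 3.7+. A sketch under that assumption (unique_in_order and counts_in_order are illustrative names):

```python
a = [5, 1, 2, 2, 4, 3, 1, 2, 3, 1, 1, 5, 2]

b = {}
for item in a:
    b[item] = b.get(item, 0) + 1

# The keys of b are the unique items in the order they first appeared,
# and the values line up with them position by position.
unique_in_order = list(b)
counts_in_order = list(b.values())

print(unique_in_order)  # [5, 1, 2, 4, 3]
print(counts_in_order)  # [2, 4, 4, 1, 2]
```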

5 Comments

@phkahler: Mine would only be a tiny bit better than this. It's hardly worth my posting a separate answer when this can be improved with a small change. The point of SO is to get to the best answers. I could simply edit this, but I prefer to allow the original author a chance to make their own improvements.
@S.Lott The code is much cleaner without having to import defaultdict.
Why not preinitialize b: b = {k:0 for k in a}?
@DylanYoung, because then you have to scan the list twice. And there's unlikely to be any benefit in Python: but check this for yourself.
The benefit is clean code :) Could use a defaultdict too of course, then you don't have to iterate through a
25

You can do this:

import numpy as np
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
np.unique(a, return_counts=True)

Output:

(array([1, 2, 3, 4, 5]), array([4, 4, 2, 1, 2], dtype=int64))

The first array is values, and the second array is the number of elements with these values.

So if you want just the array of counts, use this:

np.unique(a, return_counts=True)[1]

Comments

22

Here's another succinct alternative using itertools.groupby which also works for unordered input:

from itertools import groupby

items = [5, 1, 1, 2, 2, 1, 1, 2, 2, 3, 4, 3, 5]

results = {value: len(list(freq)) for value, freq in groupby(sorted(items))}

results

format: {value: num_of_occurrences}
{1: 4, 2: 4, 3: 2, 4: 1, 5: 2}

Comments

10

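I would simply use scipy.stats.itemfreq in the following manner (note that itemfreq was deprecated in SciPy 1.0 and removed in later releases; np.unique(a, return_counts=True) is its documented replacement):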

from scipy.stats import itemfreq

a = [1,1,1,1,2,2,2,2,3,3,4,5,5]

freq = itemfreq(a)

a = freq[:,0]
b = freq[:,1]

you may check the documentation here: http://docs.scipy.org/doc/scipy-0.16.0/reference/generated/scipy.stats.itemfreq.html

Comments

9
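from collections import Counter

import numpy as np
import pandas as pd

a=["E","D","C","G","B","A","B","F","D","D","C","A","G","A","C","B","F","C","B"]

# Count each letter, then pair the keys with their counts
counter=Counter(a)

kk=[list(counter.keys()),list(counter.values())]

# Transpose so each row is (letter, count); NumPy stores both columns as strings here
pd.DataFrame(np.array(kk).T, columns=['Letter','Count'])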

2 Comments

While this code snippet may be the solution, including an explanation really helps to improve the quality of your post. Remember that you are answering the question for readers in the future, and those people might not know the reasons for your code suggestion
Yes will do that Rahul Gupta
8

Suppose we have a list:

fruits = ['banana', 'banana', 'apple', 'banana']

We can find out how many of each fruit we have in the list like so:

import numpy as np    
(unique, counts) = np.unique(fruits, return_counts=True)
{x:y for x,y in zip(unique, counts)}

Result:

{'banana': 3, 'apple': 1}

Comments

6
seta = set(a)
b = [a.count(el) for el in seta]
a = list(seta) #Only if you really want it.

3 Comments

using lists count is ridiculously expensive and uncalled for in this scenario.
@IdanK why count is expensive?
@KritikaRajain For each unique element in the list you iterate over the whole list to generate a count (quadratic in the number of unique elements in the list). Instead, you can iterate over the list once and count up the number of each unique element (linear in the size of the list). If your list has only one unique element, the result will be the same. Moreover, this approach requires an additional intermediate set.
5

This answer is more explicit

a = [1,1,1,1,2,2,2,2,3,3,3,4,4]

d = {}
for item in a:
    if item in d:
        d[item] = d.get(item)+1
    else:
        d[item] = 1

for k,v in d.items():
    print(str(k)+':'+str(v))

# output
#1:4
#2:4
#3:3
#4:2

#remove dups
d = set(a)
print(d)
#{1, 2, 3, 4}

2 Comments

Good work, simple solution to implement occurrence count in dictionary.
There is no need to use d.get(item) after checking if item in d: – both will check the exact same thing. Either use d[item] = d[item]+1 inside the if, or remove the if and use the single case of d[item] = d.get(item, 0) + 1.
3

For your first question, iterate over the list and use a dictionary to keep track of an element's existence.

For your second question, just use the set operator.
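A minimal sketch of both steps, assuming the items are hashable. The function name count_and_dedupe is hypothetical and only for illustration:

```python
def count_and_dedupe(items):
    """Return (counts, unique_items) for a list of hashable items."""
    counts = {}
    for item in items:
        # First time we see an item, get() falls back to 0.
        counts[item] = counts.get(item, 0) + 1
    unique_items = set(items)  # a set drops duplicates (and ordering)
    return counts, unique_items

counts, unique_items = count_and_dedupe([1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 4, 5, 5])
print(counts)        # {1: 4, 2: 4, 3: 2, 4: 1, 5: 2}
print(unique_items)  # a set containing 1, 2, 3, 4 and 5
```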

1 Comment

Can you please elaborate on the first answer
3
def frequencyDistribution(data):
    return {i: data.count(i) for i in data}   

print(frequencyDistribution([1,2,3,4]))

...

 {1: 1, 2: 1, 3: 1, 4: 1}   # originalNumber: count

Comments

3

I am quite late, but this will also work, and will help others:

a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
freq_list = []
a_l = list(set(a))

for x in a_l:
    freq_list.append(a.count(x))


print('Freq', freq_list)
print('number', a_l)

will produce this:

Freq [4, 4, 2, 1, 2]
number [1, 2, 3, 4, 5]

Comments

3
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
counts = dict.fromkeys(a, 0)
for el in a: counts[el] += 1
print(counts)
# {1: 4, 2: 4, 3: 2, 4: 1, 5: 2}

Comments

1
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]

# 1. Get counts and store in another list
output = []
for i in set(a):
    output.append(a.count(i))
print(output)

# 2. Remove duplicates using set constructor
a = list(set(a))
print(a)
  1. A set does not allow duplicates, so passing a list to the set() constructor gives an iterable of unique objects. The list's count() method returns how many times a given object appears in the list. Each unique object is counted this way, and each count is appended to the initially empty list output.
  2. The list() constructor converts set(a) back into a list, which is assigned to the same variable a.

Output

D:\MLrec\venv\Scripts\python.exe D:/MLrec/listgroup.py
[4, 4, 2, 1, 2]
[1, 2, 3, 4, 5]

Comments

1

Simple solution using a dictionary.

def frequency(l):
     d = {}
     for i in l:
        if i in d.keys():
           d[i] += 1
        else:
           d[i] = 1

     for k, v in d.items():
        if v ==max (d.values()):
           return k,d.keys()

print(frequency([10,10,10,10,20,20,20,20,40,40,50,50,30]))

2 Comments

max(d.values()) will not change in the last loop. Don't compute it in the loop, compute it before the loop.
This returns the most common item plus all unique items, not the count/frequency of items.
0
#!/usr/bin/python
def frq(words):
    """Map each word to the number of times it appears."""
    freq = {}
    for w in words:
        freq[w] = freq.get(w, 0) + 1
    return freq

# Read the whole file and split it into words on whitespace
with open("poem", "r") as fp:
    words = fp.read().split()
print(words)

d = frq(words)
print("frequency of input:")
print(d)

# Write one "word:count" line per distinct word
with open("output.txt", "w") as fp1:
    for k, v in d.items():
        fp1.write(str(k) + ':' + str(v) + "\n")

Comments

0
from collections import OrderedDict
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
def get_count(lists):
    dictionary = OrderedDict()
    for val in lists:
        dictionary.setdefault(val,[]).append(1)
    return [sum(val) for val in dictionary.values()]
print(get_count(a))
# [4, 4, 2, 1, 2]

To remove duplicates and maintain order:

list(dict.fromkeys(get_count(a)))
# [4, 2, 1]

Comments

0

I'm using Counter to generate a frequency dict from the words of a text file in one line of code:

import re
from collections import Counter

def _fileIndex(fh):
    ''' create a dict using Counter of a
    flat list of words (re.findall(re.compile(r"[a-zA-Z]+"), lines)) in (lines in file->for lines in fh)
    '''
    return Counter(
        [wrd.lower() for wrdList in
         [words for words in
          [re.findall(re.compile(r'[a-zA-Z]+'), lines) for lines in fh]]
         for wrd in wrdList])

Comments

0

For the record, a functional answer:

>>> L = [1,1,1,1,2,2,2,2,3,3,4,5,5]
>>> import functools
>>> functools.reduce(lambda acc, e: [v+(i==e) for i, v in enumerate(acc,1)] if e<=len(acc) else acc+[0 for _ in range(e-len(acc)-1)]+[1], L, [])
[4, 4, 2, 1, 2]

It's cleaner if you count zeroes too:

>>> functools.reduce(lambda acc, e: [v+(i==e) for i, v in enumerate(acc)] if e<len(acc) else acc+[0 for _ in range(e-len(acc))]+[1], L, [])
[0, 4, 4, 2, 1, 2]

An explanation:

  • we start with an empty acc list;
  • if the next element e of L is lower than the size of acc, we just update this element: v+(i==e) means v+1 if the index i of acc is the current element e, otherwise the previous value v;
  • if the next element e of L is greater than or equal to the size of acc, we have to expand acc to host the new 1.

Unlike with itertools.groupby, the elements do not have to be sorted. You'll get weird results if you have negative numbers.
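If negative numbers (or other hashable values) may appear, one possible variation is to fold into a dict instead of a positional list. This is a sketch of a different accumulator, not the approach above, and it copies the dict at every step, so it is for illustration rather than speed:

```python
import functools

# The list from above plus one negative value, added for illustration.
L = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 4, 5, 5, -7]

# Each step returns a new dict with the count for e increased by one.
counts = functools.reduce(
    lambda acc, e: {**acc, e: acc.get(e, 0) + 1},
    L,
    {},
)

sorted_counts = [counts[k] for k in sorted(counts)]

print(counts)         # {1: 4, 2: 4, 3: 2, 4: 1, 5: 2, -7: 1}
print(sorted_counts)  # [1, 4, 4, 2, 1, 2]
```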

Comments

0

Another approach of doing this, albeit by using a heavier but powerful library - NLTK.

import nltk

fdist = nltk.FreqDist(a)
fdist.values()
fdist.most_common()

Comments

0

Found another way of doing this, using sets.

#ar is the list of elements
#convert ar to set to get unique elements
sock_set = set(ar)

#create dictionary of frequency of socks
sock_dict = {}

for sock in sock_set:
    sock_dict[sock] = ar.count(sock)

Comments

0

For an unordered list you should use:

[a.count(el) for el in set(a)]

The output is

[4, 4, 2, 1, 2]

1 Comment

Note that sets do not preserve order. As a result, the positions in the list and thus the meaning of the contained counts are completely arbitrary wrt the actual items.
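One way to address the ordering concern, as a sketch: sort the unique values first so each count's position has a defined meaning, and keep the values next to their counts.

```python
a = [5, 1, 2, 2, 4, 3, 1, 2, 3, 1, 1, 5, 2]

# Sorting the set fixes the order, so counts[i] belongs to values[i].
values = sorted(set(a))
counts = [a.count(value) for value in values]
pairs = list(zip(values, counts))

print(counts)  # [4, 4, 2, 1, 2]
print(pairs)   # [(1, 4), (2, 4), (3, 2), (4, 1), (5, 2)]
```

This still calls a.count once per distinct value, so it suits short lists; for long ones a single-pass counter is the cheaper choice.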
-1

Yet another solution with another algorithm without using collections:

def countFreq(A):
   if not A:
      return []
   count=[0]*(max(A)+1)            # One slot per value from 0 to max(A); assumes non-negative integers
   for x in A:
      count[x]+= 1                 # increase occurrence for value x
   return [x for x in count if x]  # return non-zero count

Comments

-1
num=[3,2,3,5,5,3,7,6,4,6,7,2]
print ('\nelements are:\t',num)
count_dict={}
for elements in num:
    count_dict[elements]=num.count(elements)
print ('\nfrequency:\t',count_dict)

2 Comments

Please don't post code-only answers but clarify your code, especially when a question already has a valid answer.
this is one of the slowest way you can do it
-1

You can use the in-built function provided in python

l.count(l[i])


d=[]
for i in range(len(l)):
    if l[i] not in d:
        d.append(l[i])
        print(l.count(l[i]))

The above code automatically removes duplicates in a list and also prints the frequency of each element in original list and the list without duplicates.

Two birds for one shot ! X D

Comments

-1

This approach can be tried if you don't want to use any library and keep it simple and short!

a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
marked = []
b = [(a.count(i), marked.append(i))[0] for i in a if i not in marked]
print(b)

o/p

[4, 4, 2, 1, 2]

Comments
