Python: group by one object attribute

Question 1

I have a generator containing 5400 Sqlalchemy rows of 1 table(Entity). I would like to group rows by the value of one column (let’s say column 1); for each “group” I want to calculate the mean of the rows based on another column(let’s say column 2).

Something like this:

Step 1: {col1_value_1: [row1.col_2_value, …..rown.col_2_value], col1_value_2:[rowj.col_2_value, rowm.col_2_value, …….rowk.col_2_value], ……col1_value_p:[rowq.col_2_value, …..rowt.col_2_value]}

Step2: {col1_value_1: arreay_mean, col1_value2:array_mean, …..}

I tried something like this:

def group_by_col_1_val(rows):
    map_ = {}
    for row in rows:
        key = row.col_1
        value = row.col_2
        try:
            map_[key] = map_[key].append(value)
        except KeyError:
            map_[key] = [value]
    return map_

I am quite sure there is a better way of doing it.

I found this question Python group by a object key gives errros but I did not understand how to use itertools in my case

Question 2

Enhance the code by importing defaultdict from the collections module for a more streamlined approach. Initialize map_ as a defaultdict with a list as the default value, eliminating the necessity for try-except blocks when adding to the dictionary. To maintain consistency, return the result as a regular dictionary.

from collections import defaultdict

def group_by_col_1_val(rows):
   map_ = defaultdict(list)
   for row in rows:
      map_[row.col_1].append(row.col_2)
   return dict(map_)

Leave a Comment Cancel reply