Merging repeated items in a list into a Python dict

Posted 2020-03-26 12:02

Question:

I have a list that looks like the one below, in which the first item of each entry is repeated several times.

l = (['aaron distilled ', 'alcohol', '5'], 
['aaron distilled ', 'gin', '2'], 
['aaron distilled ', 'beer', '6'], 
['aaron distilled ', 'vodka', '9'], 
['aaron evicted ', 'owner', '1'], 
['aaron evicted ', 'bum', '1'], 
['aaron evicted ', 'deadbeat', '1'])

I would like to convert it to a dictionary in which all repetitions of the first item are merged into one key, so the end result would look like:

data = {'aaron distilled ': ['alcohol', '5', 'gin', '2', 'beer', '6', 'vodka', '9'],
'aaron evicted ': ['owner', '1', 'bum', '1', 'deadbeat', '1']}

I was trying something like:

result = {}
for row in data:
    key = row[0]
    result = {row[0]: row[1:] for row in data}

or

for dicts in data:
   for key, value in dicts.items():
    new_dict.setdefault(key,[]).extend(value)

But I get the wrong result. I am very new to Python and would really appreciate any tips on how to solve this, or a pointer to where I can find the information that would allow me to do this. Thanks!

Answer 1:

Use a collections.defaultdict() object for ease:

from collections import defaultdict

result = defaultdict(list)

for key, *values in data:
    result[key].extend(values)

Your first attempt will overwrite keys; a dict comprehension does not merge the values. The second attempt seems to treat the lists in the data list as dictionaries, so that wouldn't work at all.
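
To see the overwriting concretely, here is a minimal sketch using a shortened copy of the question's data (the name overwritten is only illustrative). Each later row with the same first item replaces the earlier one, so only the last row per key survives:

```python
# Shortened sample of the question's data (illustrative subset).
data = (
    ['aaron distilled ', 'alcohol', '5'],
    ['aaron distilled ', 'gin', '2'],
    ['aaron evicted ', 'owner', '1'],
    ['aaron evicted ', 'bum', '1'],
)

# A dict comprehension assigns each key once per row, replacing any
# value stored earlier under the same key instead of extending it.
overwritten = {row[0]: row[1:] for row in data}

print(overwritten)
# {'aaron distilled ': ['gin', '2'], 'aaron evicted ': ['bum', '1']}
```

Accumulating into a list per key, as the loop above does, avoids this.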

Demo:

>>> from collections import defaultdict
>>> data = (['aaron distilled ', 'alcohol', '5'], 
... ['aaron distilled ', 'gin', '2'], 
... ['aaron distilled ', 'beer', '6'], 
... ['aaron distilled ', 'vodka', '9'], 
... ['aaron evicted ', 'owner', '1'], 
... ['aaron evicted ', 'bum', '1'], 
... ['aaron evicted ', 'deadbeat', '1'])
>>> result = defaultdict(list)
>>> for key, *values in data:
...    result[key].extend(values)
... 
>>> result
defaultdict(<class 'list'>, {'aaron distilled ': ['alcohol', '5', 'gin', '2', 'beer', '6', 'vodka', '9'], 'aaron evicted ': ['owner', '1', 'bum', '1', 'deadbeat', '1']})
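
If a plain dict is preferred, the setdefault() idea from the question's second attempt can probably be made to work by iterating over the rows as lists rather than calling .items() on them. This is only a sketch of that variant, not a different recommendation; the variable names are illustrative:

```python
data = (
    ['aaron distilled ', 'alcohol', '5'],
    ['aaron distilled ', 'gin', '2'],
    ['aaron distilled ', 'beer', '6'],
    ['aaron distilled ', 'vodka', '9'],
    ['aaron evicted ', 'owner', '1'],
    ['aaron evicted ', 'bum', '1'],
    ['aaron evicted ', 'deadbeat', '1'],
)

result = {}
for row in data:
    # Each row is a list: the first item is the key, the rest are values.
    key, values = row[0], row[1:]
    # setdefault() returns the existing list for key, or stores and
    # returns a new empty list the first time the key is seen.
    result.setdefault(key, []).extend(values)

print(result)
```

The result is an ordinary dict with the same contents as the defaultdict above. Calling dict(result) on the defaultdict gives the same thing if the default-creating behaviour is not wanted afterwards.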


Answer 2:

If the items in L are sorted by the first element, you can use groupby:

>>> L = (['aaron distilled ', 'alcohol', '5'], 
... ['aaron distilled ', 'gin', '2'], 
... ['aaron distilled ', 'beer', '6'], 
... ['aaron distilled ', 'vodka', '9'], 
... ['aaron evicted ', 'owner', '1'], 
... ['aaron evicted ', 'bum', '1'], 
... ['aaron evicted ', 'deadbeat', '1'])
>>> from operator import itemgetter
>>> from itertools import groupby
>>> {k: [j for a, b, c in g for j in (b, c)] for k, g in groupby(L, itemgetter(0))}
{'aaron distilled ': ['alcohol', '5', 'gin', '2', 'beer', '6', 'vodka', '9'], 'aaron evicted ': ['owner', '1', 'bum', '1', 'deadbeat', '1']}
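
Note that groupby only merges adjacent rows, so on unsorted input the same key can appear in several groups and the later group would overwrite the earlier one in the dict. One option is to sort by the first element beforehand. A minimal sketch, assuming a hypothetical shuffled input called rows:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical input where rows with the same first element are not adjacent.
rows = [
    ['aaron evicted ', 'owner', '1'],
    ['aaron distilled ', 'alcohol', '5'],
    ['aaron evicted ', 'bum', '1'],
    ['aaron distilled ', 'gin', '2'],
]

first = itemgetter(0)

# sorted() is stable, so rows sharing a key keep their original relative order.
merged = {
    key: [item for row in group for item in row[1:]]
    for key, group in groupby(sorted(rows, key=first), key=first)
}

print(merged)
# {'aaron distilled ': ['alcohol', '5', 'gin', '2'], 'aaron evicted ': ['owner', '1', 'bum', '1']}
```

Sorting costs O(n log n), so for unsorted data the defaultdict loop from the first answer is likely the simpler choice.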