Getting the difference (delta) between two lists of dictionaries

I have the following Python data structures:

    data1 = [{'name': u'String 1'}, {'name': u'String 2'}]
    data2 = [{'name': u'String 1'}, {'name': u'String 2'}, {'name': u'String 3'}]

I am looking for the best way to get the delta between the two lists. Is there anything in Python as convenient as the JavaScript library Underscore.js (_.difference)?

Use itertools.filterfalse:

    import itertools

    # Items only in data2, followed by items only in data1 (symmetric difference)
    r = (list(itertools.filterfalse(lambda x: x in data1, data2))
         + list(itertools.filterfalse(lambda x: x in data2, data1)))
    assert r == [{'name': 'String 3'}]

How about this:

    >>> [x for x in data2 if x not in data1]
    [{'name': u'String 3'}]

Edit:

If you need the symmetric difference you can use:

    >>> [x for x in data1 + data2 if x not in data1 or x not in data2]

or

    >>> [x for x in data1 if x not in data2] + [y for y in data2 if y not in data1]
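The comprehensions above test membership with `in` on a list, so every lookup scans the other list. For larger inputs, one possible alternative is to compare hashable keys instead. The sketch below is only an illustration: the helper names `canonical`, `difference` and `symmetric_difference` are made up for this example, and it assumes every dictionary is JSON-serializable. Note that JSON text is not a perfect stand-in for `==` (for instance, `1` and `1.0` serialize differently).

```python
import json


def canonical(d):
    """Hypothetical helper: hashable, key-order-independent form of a dict.

    Assumes the dict is JSON-serializable.
    """
    return json.dumps(d, sort_keys=True)


def difference(first, second):
    """Dicts of `first` that have no equal counterpart in `second`, in order."""
    seen = {canonical(d) for d in second}
    return [d for d in first if canonical(d) not in seen]


def symmetric_difference(first, second):
    """Dicts that appear in exactly one of the two lists."""
    return difference(first, second) + difference(second, first)


data1 = [{'name': 'String 1'}, {'name': 'String 2'}]
data2 = [{'name': 'String 1'}, {'name': 'String 2'}, {'name': 'String 3'}]

# Nested values (lists) are fine here because the key is a string
nested1 = [{'name': 'a', 'tags': ['x', 'y']}]
nested2 = [{'tags': ['x', 'y'], 'name': 'a'}, {'name': 'b', 'tags': []}]

print(difference(data2, data1))            # [{'name': 'String 3'}]
print(symmetric_difference(data1, data2))  # [{'name': 'String 3'}]
print(difference(nested2, nested1))        # [{'name': 'b', 'tags': []}]
```

The result keeps the original order of the input lists, which the list-comprehension versions also do.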

One more edit

You can also use sets:

    >>> from functools import reduce
    >>> # list(...) is needed on Python 3, where dict.items() returns a view that does not support +
    >>> s1 = set(reduce(lambda x, y: x + y, [list(x.items()) for x in data1]))
    >>> s2 = set(reduce(lambda x, y: x + y, [list(x.items()) for x in data2]))
    >>> s2.difference(s1)
    >>> s2.symmetric_difference(s1)
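One thing to keep in mind with the set approach above: it pools the (key, value) pairs of all dictionaries into a single set, so with multi-key dictionaries it no longer knows which pairs belonged together. The following sketch illustrates that and shows one possible variation that keeps one frozenset per dictionary. The sample data and the helper names `flat_items` and `as_records` are invented for this example, and the variation assumes all values are hashable.

```python
from functools import reduce

# Hypothetical sample data with more than one key per dict
a = [{'name': 'x', 'id': 1}]
b = [{'name': 'x', 'id': 2}, {'name': 'y', 'id': 1}]


def flat_items(dicts):
    """Pool every (key, value) pair of every dict into one set."""
    return set(reduce(lambda x, y: x + y, [list(d.items()) for d in dicts]))


def as_records(dicts):
    """One frozenset of (key, value) pairs per dict (values must be hashable)."""
    return {frozenset(d.items()) for d in dicts}


# Pooled pairs: ('name', 'x') and ('id', 1) both occur somewhere in b,
# so the difference is empty even though the dict itself is not in b.
flat = flat_items(a) - flat_items(b)
print(flat)  # set()

# Per-dict records: the dict from a is reported as missing from b.
per_dict = [dict(r) for r in as_records(a) - as_records(b)]
print(per_dict)
```

For the single-key dictionaries in the question both variants give the same answer; the difference only shows up once a dictionary has several keys. Sets also drop duplicates and ordering.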

In case you want the difference recursively, I have written a package for Python: https://github.com/seperman/deepdiff

Installation

Install from PyPI:

    pip install deepdiff

Example usage

Importing

    >>> from deepdiff import DeepDiff
    >>> from pprint import pprint
    >>> from __future__ import print_function # In case running on Python 2

Same object returns empty

    >>> t1 = {1:1, 2:2, 3:3}
    >>> t2 = t1
    >>> print(DeepDiff(t1, t2))
    {}

Type of an item has changed

    >>> t1 = {1:1, 2:2, 3:3}
    >>> t2 = {1:1, 2:"2", 3:3}
    >>> pprint(DeepDiff(t1, t2), indent=2)
    { 'type_changes': { 'root[2]': { 'newtype': <class 'str'>,
                                     'newvalue': '2',
                                     'oldtype': <class 'int'>,
                                     'oldvalue': 2}}}

Value of an item has changed

    >>> t1 = {1:1, 2:2, 3:3}
    >>> t2 = {1:1, 2:4, 3:3}
    >>> pprint(DeepDiff(t1, t2), indent=2)
    {'values_changed': {'root[2]': {'newvalue': 4, 'oldvalue': 2}}}

Item added and/or removed

    >>> t1 = {1:1, 2:2, 3:3, 4:4}
    >>> t2 = {1:1, 2:4, 3:3, 5:5, 6:6}
    >>> ddiff = DeepDiff(t1, t2)
    >>> pprint (ddiff)
    {'dic_item_added': ['root[5]', 'root[6]'],
     'dic_item_removed': ['root[4]'],
     'values_changed': {'root[2]': {'newvalue': 4, 'oldvalue': 2}}}

String difference

    >>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world"}}
    >>> t2 = {1:1, 2:4, 3:3, 4:{"a":"hello", "b":"world!"}}
    >>> ddiff = DeepDiff(t1, t2)
    >>> pprint (ddiff, indent = 2)
    { 'values_changed': { 'root[2]': {'newvalue': 4, 'oldvalue': 2},
                          "root[4]['b']": { 'newvalue': 'world!',
                                            'oldvalue': 'world'}}}

String difference 2

    >>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world!\nGoodbye!\n1\n2\nEnd"}}
    >>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world\n1\n2\nEnd"}}
    >>> ddiff = DeepDiff(t1, t2)
    >>> pprint (ddiff, indent = 2)
    { 'values_changed': { "root[4]['b']": { 'diff': '--- \n'
                                                    '+++ \n'
                                                    '@@ -1,5 +1,4 @@\n'
                                                    '-world!\n'
                                                    '-Goodbye!\n'
                                                    '+world\n'
                                                    ' 1\n'
                                                    ' 2\n'
                                                    ' End',
                                            'newvalue': 'world\n1\n2\nEnd',
                                            'oldvalue': 'world!\n'
                                                        'Goodbye!\n'
                                                        '1\n'
                                                        '2\n'
                                                        'End'}}}
    >>>
    >>> print (ddiff['values_changed']["root[4]['b']"]["diff"])
    --- 
    +++ 
    @@ -1,5 +1,4 @@
    -world!
    -Goodbye!
    +world
     1
     2
     End

Type change

    >>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
    >>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world\n\n\nEnd"}}
    >>> ddiff = DeepDiff(t1, t2)
    >>> pprint (ddiff, indent = 2)
    { 'type_changes': { "root[4]['b']": { 'newtype': <class 'str'>,
                                          'newvalue': 'world\n\n\nEnd',
                                          'oldtype': <class 'list'>,
                                          'oldvalue': [1, 2, 3]}}}

List difference

    >>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3, 4]}}
    >>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2]}}
    >>> ddiff = DeepDiff(t1, t2)
    >>> pprint (ddiff, indent = 2)
    {'iterable_item_removed': {"root[4]['b'][2]": 3, "root[4]['b'][3]": 4}}

List difference 2:

    >>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
    >>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 3, 2, 3]}}
    >>> ddiff = DeepDiff(t1, t2)
    >>> pprint (ddiff, indent = 2)
    { 'iterable_item_added': {"root[4]['b'][3]": 3},
      'values_changed': { "root[4]['b'][1]": {'newvalue': 3, 'oldvalue': 2},
                          "root[4]['b'][2]": {'newvalue': 2, 'oldvalue': 3}}}

List difference ignoring order or duplicates (with the same dictionaries as above):

    >>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
    >>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 3, 2, 3]}}
    >>> ddiff = DeepDiff(t1, t2, ignore_order=True)
    >>> print (ddiff)
    {}

List that contains a dictionary:

    >>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, {1:1, 2:2}]}}
    >>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, {1:3}]}}
    >>> ddiff = DeepDiff(t1, t2)
    >>> pprint (ddiff, indent = 2)
    { 'dic_item_removed': ["root[4]['b'][2][2]"],
      'values_changed': {"root[4]['b'][2][1]": {'newvalue': 3, 'oldvalue': 1}}}

Sets:

    >>> t1 = {1, 2, 8}
    >>> t2 = {1, 2, 3, 5}
    >>> ddiff = DeepDiff(t1, t2)
    >>> pprint (DeepDiff(t1, t2))
    {'set_item_added': ['root[3]', 'root[5]'], 'set_item_removed': ['root[8]']}

Named tuples:

    >>> from collections import namedtuple
    >>> Point = namedtuple('Point', ['x', 'y'])
    >>> t1 = Point(x=11, y=22)
    >>> t2 = Point(x=11, y=23)
    >>> pprint (DeepDiff(t1, t2))
    {'values_changed': {'root.y': {'newvalue': 23, 'oldvalue': 22}}}

Custom objects:

    >>> class ClassA(object):
    ...     a = 1
    ...     def __init__(self, b):
    ...         self.b = b
    ...
    >>> t1 = ClassA(1)
    >>> t2 = ClassA(2)
    >>>
    >>> pprint(DeepDiff(t1, t2))
    {'values_changed': {'root.b': {'newvalue': 2, 'oldvalue': 1}}}

Object attribute added:

    >>> t2.c = "new attribute"
    >>> pprint(DeepDiff(t1, t2))
    {'attribute_added': ['root.c'],
     'values_changed': {'root.b': {'newvalue': 2, 'oldvalue': 1}}}

Another option, comparing only the 'name' values:

    data1 = [{'name': u'String 1'}, {'name': u'String 2'}]
    data2 = [{'name': u'String 1'}, {'name': u'String 2'}, {'name': u'String 3'}]

    # Names present in data2 but not in data1, rebuilt as dictionaries
    delta = list({dict2['name'] for dict2 in data2} - {dict1['name'] for dict1 in data1})
    delta_dict = [{'name': value} for value in delta]
    print(delta_dict)
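
The set-based snippet above rebuilds each dictionary from its name alone, so any other fields are lost, and the order of the result is not guaranteed. If the dictionaries carry more than one key, a possible variation is to use the chosen key only for the lookup and return the original dictionaries. This is a rough sketch: `delta_by_key` and the extra `size` field are hypothetical, and it assumes every dictionary contains the key and that its value is hashable.

```python
def delta_by_key(old, new, key='name'):
    """Dicts of `new` whose `key` value does not occur in `old`, in order.

    Hypothetical helper; assumes every dict has `key` with a hashable value.
    """
    known = {d[key] for d in old}
    return [d for d in new if d[key] not in known]


old = [{'name': 'String 1', 'size': 1}, {'name': 'String 2', 'size': 2}]
new = [
    {'name': 'String 3', 'size': 3},
    {'name': 'String 1', 'size': 10},  # same name, different size: not reported
    {'name': 'String 4', 'size': 4},
]

print(delta_by_key(old, new))
# [{'name': 'String 3', 'size': 3}, {'name': 'String 4', 'size': 4}]
```

Because only the key is compared, a dictionary whose other fields changed is treated as unchanged, which may or may not be what you want.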