unión python de múltiples rangos

Tengo estos rangos:

7,10 11,13 11,15 14,20 23,39 

Necesito realizar una unión de los rangos superpuestos para dar rangos que no se superpongan, así que en el ejemplo:

 7,20 23,39 

He hecho esto en Ruby, donde he presionado el inicio y el final del rango en la matriz, los he ordenado y luego realizo la unión de los rangos superpuestos. ¿Alguna forma rápida de hacer esto en Python?

Gracias

    Digamos que (7, 10) y (11, 13) resultan en (7, 13) :

     a = [(7, 10), (11, 13), (11, 15), (14, 20), (23, 39)] b = [] for begin,end in sorted(a): if b and b[-1][1] >= begin - 1: b[-1] = (b[-1][0], end) else: b.append((begin, end)) 

    b es ahora

     [(7, 20), (23, 39)] 

    EDITAR :

    Como @CentAu nota correctamente, [(2,4), (1,6)] devolvería (1,4) lugar de (1,6) . Aquí está la nueva versión con el manejo correcto de este caso:

     a = [(7, 10), (11, 13), (11, 15), (14, 20), (23, 39)] b = [] for begin,end in sorted(a): if b and b[-1][1] >= begin - 1: b[-1][1] = max(b[-1][1], end) else: b.append([begin, end]) 

    Vieja pregunta Pero quería agregar esta respuesta para futuras referencias. Sympy se puede utilizar para lograr la unión de intervalos:

     from sympy import Interval, Union def union(data): """ Union of a list of intervals eg [(1,2),(3,4)] """ intervals = [Interval(begin, end) for (begin, end) in data] u = Union(*intervals) return [list(u.args[:2])] if isinstance(u, Interval) \ else list(u.args) 

    Si la salida de Union es más de dos intervalos, es un objeto de Union mientras que cuando hay un solo intervalo, la salida es un objeto de Interval . Esa es la razón de la if statement en la línea de retorno.

    ejemplos:

     In [26]: union([(10, 12), (14, 16), (15, 22)]) Out[26]: [[10, 12], [14, 22]] In [27]: union([(10, 12), (9, 16)]) Out[27]: [[9, 16]] 

    Lo intenté con casos particulares de presencia de (45, 46) y (45, 45)
    y también los casos de prueba que es poco probable que ocurran en su aplicación: presencia de (11,6), presencia de (-1, -5), presencia de (-9, 5), presencia de (-3, 10).
    De todos modos los resultados son correctos para todos estos casos, es un punto.

    El algoritmo:

     def yi(li): gen = (x for a,b in li for x in xrange(a,b+1)) start = p = gen.next() for x in gen: if x>p+2: yield (start,p) start = p = x else: p = x yield (start,x) 

    Si aff en el siguiente código se establece en Verdadero, se muestran los pasos de la ejecución.

     def yi(li): aff = 0 gen = (x for a,b in li for x in xrange(a,b+1)) start = p = gen.next() for x in gen: if aff: print ('start %sp %d p+2 %d ' 'x==%s' % (start,p,p+2,x)) if x>p+2: if aff: print 'yield range(%d,%d)' % (start,p+1) yield (start,p) start = p = x else: p = x if aff: print 'yield range(%d,%d)' % (start,x+1) yield (start,x) for li in ([(7,10),(23,39),(11,13),(11,15),(14,20),(45,46)], [(7,10),(23,39),(11,13),(11,15),(14,20),(45,46),(45,45)], [(7,10),(23,39),(11,13),(11,15),(14,20),(45,45)], [(7,10),(23,39),(11,13),(11,6),(14,20),(45,46)], #1 presence of (11, 6) [(7,10),(23,39),(11,13),(-1,-5),(14,20),(45,45)], #2 presence of (-1,-5) [(7,10),(23,39),(11,13),(-9,-5),(14,20),(45,45)], #3 presence of (-9, -5) [(7,10),(23,39),(11,13),(-3,10),(14,20),(45,45)]): #4 presence of (-3, 10) li.sort() print 'sorted li %s'%li print '\n'.join(' (%d,%d) %r' % (a,b,range(a,b)) for a,b in li) print 'list(yi(li)) %s\n' % list(yi(li)) 

    resultado

     sorted li [(7, 10), (11, 13), (11, 15), (14, 20), (23, 39), (45, 46)] (7,10) [7, 8, 9] (11,13) [11, 12] (11,15) [11, 12, 13, 14] (14,20) [14, 15, 16, 17, 18, 19] (23,39) [23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38] (45,46) [45] list(yi(li)) [(7, 20), (23, 39), (45, 46)] sorted li [(7, 10), (11, 13), (11, 15), (14, 20), (23, 39), (45, 45), (45, 46)] (7,10) [7, 8, 9] (11,13) [11, 12] (11,15) [11, 12, 13, 14] (14,20) [14, 15, 16, 17, 18, 19] (23,39) [23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38] (45,45) [] (45,46) [45] list(yi(li)) [(7, 20), (23, 39), (45, 46)] sorted li [(7, 10), (11, 13), (11, 15), (14, 20), (23, 39), (45, 45)] (7,10) [7, 8, 9] (11,13) [11, 12] (11,15) [11, 12, 13, 14] (14,20) [14, 15, 16, 17, 18, 19] (23,39) [23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38] (45,45) [] list(yi(li)) [(7, 20), (23, 39), (45, 45)] sorted li [(7, 10), (11, 6), (11, 13), (14, 20), (23, 39), (45, 46)] (7,10) [7, 8, 9] (11,6) [] (11,13) [11, 12] (14,20) [14, 15, 16, 17, 18, 19] (23,39) [23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38] (45,46) [45] list(yi(li)) [(7, 20), (23, 39), (45, 46)] sorted li [(-1, -5), (7, 10), (11, 13), (14, 20), (23, 39), (45, 45)] (-1,-5) [] (7,10) [7, 8, 9] (11,13) [11, 12] (14,20) [14, 15, 16, 17, 18, 19] (23,39) [23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38] (45,45) [] list(yi(li)) [(7, 20), (23, 39), (45, 45)] sorted li [(-9, -5), (7, 10), (11, 13), (14, 20), (23, 39), (45, 45)] (-9,-5) [-9, -8, -7, -6] (7,10) [7, 8, 9] (11,13) [11, 12] (14,20) [14, 15, 16, 17, 18, 19] (23,39) [23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38] (45,45) [] list(yi(li)) [(-9, -5), (7, 20), (23, 39), (45, 45)] sorted li [(-3, 10), (7, 10), (11, 13), (14, 20), (23, 39), (45, 45)] (-3,10) [-3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9] (7,10) [7, 8, 9] (11,13) [11, 12] (14,20) [14, 15, 16, 17, 18, 19] (23,39) [23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38] (45,45) [] list(yi(li)) [(-3, 20), (23, 39), (45, 45)] 

    La siguiente función funciona bien con:

    a = [(7, 10), (11, 13), (11, 15), (14, 20), (23, 39)]

    y

    a = [(2,4), (1,6)]

     def range_overlap_adjust(list_ranges): overlap_corrected = [] for start, stop in sorted(list_ranges): if overlap_corrected and start-1 <= overlap_corrected [-1][1] and stop >= overlap_corrected [-1][1]: overlap_corrected [-1] = min(overlap_corrected [-1][0], start), stop elif overlap_corrected and start <= overlap_corrected [-1][1] and stop <= overlap_corrected [-1][1]: break else: overlap_corrected.append((start,stop)) return overlap_corrected 

    prueba

     list_ranges = [(7, 10), (11, 13), (11, 15), (14, 20), (23, 39)] print range_overlap_adjust(list_ranges) 

    da:

    [(7, 20), (23, 39)]