Python re.split lookahead pattern

I am trying to use re.split to get the BCF #, BTS #, and the LAC and CI from a log file that has a header and a regular structure inside:

    ==================================================================================
    RADIO NETWORK CONFIGURATION IN BSC:
    EPB FTRC D-CHANNEL BUSY AD OP R ET- BCCH/CBCH/ RES O&M LINK HR FR LAC CI HOP ST STATE FREQ T PCM ERACH XFU NAME ST /GP
    ===================== == ====== ==== == ==== =========== = = == ===== == === ===
    BCF-0010 FLEXI MULTI U WO 2 LM10 WO
    10090 31335 BTS-0010 U WO 0 0 KHAKHAATT070D BB/- 7
    TRX-001 U WO 779 0 1348 MBCCH+CBCH P 0
    TRX-002 U WO 659 0 1348 1
    TRX-003 U WO 661 0 1348 2
    TRX-004 U WO 670 0 1348 0
    TRX-005 U WO 674 0 1348 1
    10090 31336 BTS-0011 U WO 0 0 KHAKHAATT200D BB/- 7
    TRX-006 U WO 811 0 1348 MBCCH+CBCH P 2
    TRX-009 U WO 845 0 1349 2
    TRX-010 U WO 819 0 1349 0
    TRX-011 U WO 823 0 1349 1
    TRX-012 U WO 836 0 1349 2
    10090 31337 BTS-0012 U WO 0 0 KHAKHAATT340D BB/- 5
    TRX-013 U WO 799 0 1349 MBCCH+CBCH P 0
    TRX-014 U WO 829 0 1349 1
    TRX-017 U WO 831 0 1302 2
    TRX-018 U WO 834 0 1302 1
    TRX-019 U WO 853 0 1302 0
    TRX-020 U WO 858 0 1302 2
    TRX-021 U WO 861 0 1302 1
    BCF-0020 FLEXI MULTI U WO 0 LM20 WO
    10090 30341 BTS-0020 U WO 0 0 KHAKHABYT100G BB/- 1
    TRX-001 U WO 14 0 1856 MBCCH+CBCH P 0
    TRX-002 U WO 85 0 1856 1
    10090 30342 BTS-0021 U WO 0 0 KHAKHABYT230G BB/- 1
    TRX-003 U WO 4 0 1856 MBCCH+CBCH P 2
    TRX-004 U WO 12 0 1856 0
    10090 30343 BTS-0022 U WO 0 0 KHAKHABYT340G BB/- 1
    TRX-005 U WO 20 0 1856 MBCCH+CBCH P 1
    TRX-006 U WO 22 0 1856 2
    10090 30345 BTS-0025 U WO 0 0 KHAKHABYT100D BB/- 5
    TRX-007 U WO 793 0 1856 MBCCH+CBCH P 0
    TRX-008 U WO 851 0 1856 1
    TRX-009 U WO 834 0 1857 2
    TRX-010 U WO 825 0 1857 1
    10090 30346 BTS-0026 U WO 0 0 KHAKHABYT230D BB/- 4
    TRX-011 U WO 803 0 1857 MBCCH+CBCH P 2
    TRX-012 U WO 860 0 1857 0
    TRX-013 U WO 846 0 1857 1
    TRX-014 U WO 844 0 1857 2
    TRX-015 U WO 828 0 1857 0
    TRX-016 U WO 813 0 1857 1
    10090 30347 BTS-0027 U WO 0 2 KHAKHABYT340D BB/- 5
    TRX-017 U WO 801 0 1352 MBCCH+CBCH P 2
    TRX-018 U WO 857 0 1352 0
    TRX-019 U WO 840 0 1352 1
    TRX-020 U WO 838 0 1352 0
    TRX-021 U WO 836 0 1352 1
    TRX-022 U WO 823 0 1352 2
    TRX-023 U WO 821 0 1352 0
    TRX-024 U WO 817 0 1352 1
    =======================================================================================

with this code:

    import re

    def GetTheSentences(infile):
        # `con` is a database connection opened elsewhere in the script
        with con:
            cur = con.cursor()
            cur.execute("DROP TABLE IF EXISTS eei")
            cur.execute("CREATE TABLE eei(BCF INT, BTS INT PRIMARY KEY)")
            with open(infile) as fp:
                # first level: one chunk per BCF
                for result_1 in re.split('BCF-', fp.read(), flags=re.UNICODE):
                    BCF = result_1[:4]
                    # second level: intended to give one chunk per BTS
                    for result_2 in re.compile("(?=BTS-)").split(result_1):
                        rec = re.search('TRX-', result_2)
                        if rec is not None:
                            BTS = result_2[4:8]
                            print BCF + "," + BTS

I need to split result_1 into its BTS-related parts, including the 13 characters before "BTS-" (for example "10090 31335 BTS-0010"), using a regular expression, and then split each of those parts into result_3, one per TRX, but I have had no success.

Please help!

Before Python 3.7, re.split() does not split on zero-length matches.

Therefore, in your Python 2 code, re.compile("(?=BTS-)").split(result_1) will never split your string. You need to find a solution without re.split(), or use the newer regex module.
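One possible way to do this without re.split() is to collect the start positions of the matches with re.finditer() and slice the string at those positions yourself. The sketch below is only an illustration, not a drop-in replacement for your function: split_before and parse are hypothetical helper names, SAMPLE is a shortened excerpt of your log, and the pattern assumes that LAC and CI are always two digit groups directly in front of "BTS-".

```python
import re

def split_before(pattern, text):
    """Split text at the start of every match of pattern, keeping each match
    at the beginning of the chunk that follows it (hypothetical helper)."""
    starts = [m.start() for m in re.finditer(pattern, text)]
    if not starts or starts[0] != 0:
        starts.insert(0, 0)  # keep whatever precedes the first match
    bounds = starts + [len(text)]
    return [text[a:b] for a, b in zip(bounds, bounds[1:])]

def parse(text):
    """Return (BCF, BTS, LAC, CI, [TRX numbers]) tuples found in text."""
    rows = []
    for bcf_part in split_before(r"BCF-", text):
        bcf = re.match(r"BCF-(\d+)", bcf_part)
        if bcf is None:
            continue  # header text before the first BCF
        # Assumption: "<LAC> <CI> BTS-" always appears in this order.
        for bts_part in split_before(r"\b\d+\s+\d+\s+BTS-", bcf_part):
            bts = re.match(r"(\d+)\s+(\d+)\s+BTS-(\d+)", bts_part)
            if bts is None:
                continue  # the BCF line itself, no BTS yet
            trx = re.findall(r"TRX-(\d+)", bts_part)
            rows.append((bcf.group(1), bts.group(3),
                         bts.group(1), bts.group(2), trx))
    return rows

# Shortened excerpt of the log from the question, flattened into one string.
SAMPLE = (
    "RADIO NETWORK CONFIGURATION IN BSC: "
    "BCF-0010 FLEXI MULTI U WO 2 LM10 WO "
    "10090 31335 BTS-0010 U WO 0 0 KHAKHAATT070D BB/- 7 "
    "TRX-001 U WO 779 0 1348 MBCCH+CBCH P 0 "
    "TRX-002 U WO 659 0 1348 1 "
    "10090 31336 BTS-0011 U WO 0 0 KHAKHAATT200D BB/- 7 "
    "TRX-006 U WO 811 0 1348 MBCCH+CBCH P 2 "
    "BCF-0020 FLEXI MULTI U WO 0 LM20 WO "
    "10090 30341 BTS-0020 U WO 0 0 KHAKHABYT100G BB/- 1 "
    "TRX-001 U WO 14 0 1856 MBCCH+CBCH P 0 "
)

rows = parse(SAMPLE)
```

Each tuple in rows holds the BCF number, the BTS number, the LAC, the CI, and the list of TRX numbers under that BTS, so the values can be passed to your INSERT statement instead of being printed. If you need the full text of each TRX line as well, split_before(r"TRX-", bts_part) gives one chunk per TRX in the same way.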