Python Gene Sequence Search
I want to write Python code that searches a genome sequence for genes in both the forward and reverse reading frames. Each gene must lie between a start codon and a stop codon, and its length must be divisible by 3. I wrote the code below using regular expressions, but it gives no output. Can anyone help fix the code so that it does exactly what I describe, or suggest another way of doing it? https://code.sololearn.com/cI5EsEAEHGcq/?ref=app
2 Answers
+ 2
You forgot to print the result.
+ 2
The fastest algorithm I can think of is:
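seq = 'TGCACTGATG'
seq_len = len(seq)
with open('ecoli.fa', 'r') as ecoli_sequence:
    # Tail of the previous line, kept so a match spanning a line break is still found.
    previous = ''
    for line, data in enumerate(ecoli_sequence):
        data = data.rstrip('\n')  # drop the newline so it cannot break a match
        if seq not in previous + data:
            # Only the last seq_len - 1 characters can start a split match.
            previous = data[-(seq_len - 1):]
        else:
            # A split match uses at most seq_len - 1 characters of the current line.
            if seq in previous + data[:seq_len - 1]:
                print("It's split over lines:", line, "and", line + 1)
            else:
                print("It's in line:", line + 1)
            break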
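The snippet above only locates one fixed substring in a file. For the requirement in the question (a gene between a start and a stop codon, length divisible by 3, searched in the forward and reverse direction), a regular-expression approach might look roughly like the sketch below. It assumes the standard start codon ATG and stop codons TAA, TAG and TGA, and a sequence made only of A, C, G and T; the names ORF_PATTERN, reverse_complement and find_orfs are made up for this example and do not come from the linked code.

```python
import re

# Assumption: standard start codon ATG and stop codons TAA, TAG, TGA.
# The lazy group (?:[ACGT]{3})*? advances one codon at a time, so every match
# ends at the first in-frame stop codon and has a length divisible by 3.
# Wrapping the pattern in a lookahead lets overlapping candidates be reported.
ORF_PATTERN = re.compile(r"(?=(ATG(?:[ACGT]{3})*?(?:TAA|TAG|TGA)))")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string made of A, C, G, T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(seq):
    """Return (strand, start, orf) tuples for both strands.

    For the '-' strand, start is an index into the reverse-complemented
    sequence, not into the original one.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for match in ORF_PATTERN.finditer(s):
            hits.append((strand, match.start(1), match.group(1)))
    return hits


print(find_orfs("CCATGAAATAGGG"))  # [('+', 2, 'ATGAAATAG')]
```

Because every ATG is tried as a start, an ATG inside a longer gene in the same frame is reported as a shorter, nested candidate as well; filter by a minimum length if you only want the long ones.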