Python Write in lines from a txt that is called by the occurrence of two strings

kkhhhkk

Coder
I have a data file which has lines that look like this:

input
lines
lines
optimized
input
lines
lines
input
lines
lines
optimized

I want my script to go through this file, find the lines with input, find the line with optimized, and write all the lines between them, but only if the sequence is input lines lines optimized, as opposed to input lines lines input lines lines optimized (basically, for the above example, input lines lines optimized should only be written twice).

I have the following code, however it only prints the lines with optimized, how can I alter it to do what I intend it to?

Python:
writefile = open("input_two.txt","a")
skip_line = True
with open("input_n.txt","r") as myfile:
    skip_line = True
    for line in myfile:
        if "Input" in line:
            next_skip_line = False
        elif "Optimized" in line:
            next_skip_line = True


        if not skip_line and next_skip_line:
            writefile.write(line)
        skip_line = next_skip_line
 
I have a current way that does it by line index, but this is not something I can use for generalised data:

Python:
writefile = open("input_two.txt","a")
with open("input_n.txt","r") as myfile:
    lines = myfile.readlines()
    for index, line in enumerate(lines):
        if "Optimized" in line:
            writefile.write("".join(lines[max(0,index-13):index]))
 
It could be just me being thick but I cannot make heads or tails of your description
I want my script to go through this file, find the lines with input, find the line with optimized, and write all the lines between them, but only if the sequence is input lines lines optimized, as opposed to input lines lines input lines lines optimized (basically, for the above example, input lines lines optimized should only be written twice).
I find that often when someone uses the word "basically" they are failing to properly explain the problem.
 
Look at it this way. If you see a line that is 'input' you want to start collecting lines. If you see a line that is 'optimized' you want to stop collecting lines, and if the number of collected lines is two you want to print them out. All you need to add is some housekeeping to set/clear a collection flag, and add/print/clear the list. Give it a try now and if you are still having problems I can post some code, but I think you'd be better served trying to write the code yourself.
 
There are better ways of doing this. Just something I threw together.

Python:
# Create empty list
alist = []

# Open file for reading
with open('Python/my.txt', 'r') as read_file:

    # Strip each line and join everything into one string for manipulation
    string = ' '.join(map(str.rstrip, read_file))

    # Splitting the string on 'input' creates a new list
    string = string.split('input')

    # Pop off index 0, as it is blank
    string.pop(0)

    # Loop through the new list and add data to the return list if it contains optimized
    # Remove the trailing optimized word; removesuffix (Python 3.9+) is used because
    # strip('optimized') strips a set of characters, not the word
    for data in string:
        if 'optimized' in data.strip():
            alist.append(data.strip().removesuffix('optimized'))
    print(alist)

Output

Code:
['lines lines ', 'lines lines ']
 
Personally I prefer clear over clever.

Python:
"""
Displays all lines between 'input' and 'optimized' only if
the number of lines is two.
"""

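collect = False
lines = []

with open('input.txt') as infile:
    for line in infile:

        line = line.rstrip('\n')

        if line == 'input':
            # Start (or restart) collecting at every 'input' line
            collect = True
            lines = []
        elif line == 'optimized':
            # Only print complete blocks that hold exactly two lines
            if collect and len(lines) == 2:
                for l in lines:
                    print(l)
                print('')
            lines = []
            collect = False
        elif collect:
            # Ignore anything that is not inside an input...optimized block
            lines.append(line)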
 
OP didn't say 'two' was a constraint on how many lines he wants in the pattern or that 'two' is the total number of lines that should be collected. He said he only has two constraints: input at the beginning, only, and optimized at the end, only, with an arbitrary number of lines between them.

So I think what you want is output like
Code:
input
lines
lines
optimized
so you have a specific start trigger and a specific end trigger and you want your start trigger to not be included as one of those lines (as in, if you hit input again, that would be the new start to your output). This is basically like regex matching, but spanning multiple lines.

I did this:
Python:
input_file_lines = """
input
lines
lines
optimized
input
lines
lines
input
lines
lines
optimized
"""

def find_line_pattern(input_lines):
    collect = False
    input_lines_list = input_lines.splitlines() # Split on line breaks, not on every whitespace
    collected_lines = []
    for line in input_lines_list:
        if line == 'input':
            collect = True
            collected_lines.clear() # Start fresh if we hit the input line again
            collected_lines.append(line)
        elif not collect:
            continue # If we haven't hit an 'input' line yet, no sense in going all the way through the loop
        elif line == 'optimized':
            collected_lines.append(line)
            print(collected_lines) # Give you the entire group of lines
            collected_lines.clear()
            collect = False # Start over; ignore lines until the next 'input'
        else:
            collected_lines.append(line) # If we're here, then we have an 'input' line, but have not yet found the
                                         # 'optimized' line, so keep adding lines to the list until we hit 'optimized'

find_line_pattern(input_file_lines)

Result:
Code:
['input', 'lines', 'lines', 'optimized']
['input', 'lines', 'lines', 'optimized']

That could probably be a little cleaner, though.
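
Since this is basically regex matching across multiple lines, here is a rough sketch of the same idea using the `re` module. It assumes the marker lines are exactly `input` and `optimized` and that the text uses `\n` line endings; the names `BLOCK` and `find_blocks` are made up for this example and are not from anyone's code above.

```python
import re

# A block is: a line that is exactly 'input', then any number of lines that are
# neither 'input' nor 'optimized', then a line that is exactly 'optimized'.
BLOCK = re.compile(
    r"^input\n((?:(?!input$|optimized$).*\n)*)optimized$",
    re.MULTILINE,
)

def find_blocks(text):
    """Return the lines found between each input/optimized pair."""
    return [match.group(1).splitlines() for match in BLOCK.finditer(text)]

sample = (
    "input\nlines\nlines\noptimized\n"
    "input\nlines\nlines\n"
    "input\nlines\nlines\noptimized\n"
)

print(find_blocks(sample))  # [['lines', 'lines'], ['lines', 'lines']]
```

The negative lookahead is what makes a second `input` restart the match: a block that runs into another `input` before reaching `optimized` simply fails to match and is skipped.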
 
Once again we have an OP who is nowhere to be seen while several people donate their time to try and provide a solution to their ill-formulated problem. And the same happens in their other posts 🙄 How hard can it be to acknowledge a reply?
 
I went ahead and reused the last example I did.
  • Joins the lines into a string
  • Splits the string @ input
  • Appends lines with optimized
  • Strips white spaces and the word optimized

text file:
Code:
input
lines
lines
optimized
input
lines
lines
input
lines
lines
optimized
input
lines lines
another lines
lines again
optimized
input
lines bottom
lines

Python:
# Create empty list
tmp = []

# Open file for reading
with open('Python/my.txt', 'r') as read_file:

    # Join list for manipulation
    string = ' '.join(list(map(str.strip, read_file)))

    # Split string @ input
    string = string.split('input')

    # Loop through list and append elements with the word optimized
    # into a tmp list
    # removesuffix (Python 3.9+) drops the trailing optimized word; strip('optimized')
    # would strip a set of characters instead. The final strip removes white space
    for data in string:
        if 'optimized' in data:
            tmp.append(data.strip().removesuffix('optimized').strip())

# Print tmp list
print(tmp)

Output
Code:
['lines lines', 'lines lines', 'lines lines another lines lines again']
 
It sounded like his target was to start over if he hit his start line again.
 
Your original spec says "write all the lines between them but only if its input lines lines optimized". According to the spec, the output should only be the following starred lines:

input
*lines
*lines
optimized
input
lines
lines
input
*lines
*lines
optimized
input
lines lines
another lines
lines again
optimized
input
lines bottom
lines
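
To make that reading of the spec concrete, here is a minimal sketch that writes only the starred lines. It assumes the markers are lines that are exactly `input` and `optimized`, that a block only counts when it holds exactly two lines, and that a repeated `input` restarts the block. The names `two_line_blocks` and `write_blocks` are hypothetical, and `io.StringIO` stands in for real files so the example runs on its own.

```python
import io

def two_line_blocks(lines, start="input", end="optimized", wanted=2):
    """Yield the body of each start...end block that has exactly `wanted` lines."""
    body = None  # None means we are not inside a block
    for raw in lines:
        line = raw.rstrip("\n")
        if line == start:
            body = []  # start, or restart, collecting
        elif line == end:
            if body is not None and len(body) == wanted:
                yield body
            body = None  # ignore everything until the next start line
        elif body is not None:
            body.append(line)

def write_blocks(src, dst, wanted=2):
    """Write every qualifying block from src to dst, one line per row."""
    for body in two_line_blocks(src, wanted=wanted):
        dst.write("\n".join(body) + "\n")

sample = io.StringIO(
    "input\nlines\nlines\noptimized\n"
    "input\nlines\nlines\n"
    "input\nlines\nlines\noptimized\n"
    "input\nlines lines\nanother lines\nlines again\noptimized\n"
    "input\nlines bottom\nlines\n"
)
out = io.StringIO()
write_blocks(sample, out)
print(out.getvalue())
```

With real files, the same call would look like `with open("input_n.txt") as src, open("input_two.txt", "a") as dst: write_blocks(src, dst)`, using the file names from the original post.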
 