I have a text file from which I want to delete all data up to the point where I see the value ‘NODATACODE’.
The text in the text file is:
MMMMM ; MMMMM : MMMMMMMMMMN, AAAAAAAAAAA,52, AAAA,CCCCCC, MMMMM ; MMMMM : MMMMMMMMMMN,
>AAAAAAAAAAA,200, AAAA,CCCCCC,;MMMMM ; MMMMM : MMMMMMMMMMN, AAAAAAAAAAA,53,
>AAAA,CCCCCC,AAAA AAAAA AAAAAAAAAAA AAAAAAAAAAA AAAAAAAAAAA NODATACODE, : Food Meal
Please let me know how I can rewrite the following code in Python to perform this task.
I tried the following code but it doesn’t work:
with open('Schedule.txt', 'w') as fw:
    for line in lines:
        if line.strip('\n') = 'NODATACODE':
            fw.write(line)
Error message that I get is below:
  Cell In[1], line 5
    if line.strip('\n') = 'NODATACODE':
       ^
SyntaxError: cannot assign to function call here. Maybe you meant '==' instead of '='?
Thank you in advance.
Here’s a revised version of your script:
# Read the file content first
with open('Schedule.txt', 'r') as file:
    data = file.read()

# Find the index where 'NODATACODE' occurs
nodata_index = data.find('NODATACODE')

# Check if 'NODATACODE' is found
if nodata_index != -1:
    # Extract the text from 'NODATACODE' to the end
    data_to_write = data[nodata_index:]
    # Write the modified data back to the file
    with open('Schedule.txt', 'w') as file:
        file.write(data_to_write)
else:
    print("'NODATACODE' not found in the file.")
In this script:
- The file is first opened in read mode to get all its content.
- We find the index of the first occurrence of ‘NODATACODE’ in the file’s data.
- If ‘NODATACODE’ is found, we extract all the text from this point onwards.
- Finally, we open the file in write mode and overwrite it with the extracted text.
This script cuts at the first occurrence of ‘NODATACODE’, which is all you need if the marker appears only once in your file. If ‘NODATACODE’ can appear multiple times and you want to delete content up to the last occurrence, you would need to find the last index of ‘NODATACODE’ instead, for example with str.rfind in place of str.find.
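As a rough sketch of that last-occurrence variant: the helper name keep_from_last_marker and the sample string below are made up for illustration, and the function works on a string in memory rather than on your file, so treat it as a starting point rather than a drop-in replacement.

```python
def keep_from_last_marker(text, marker="NODATACODE"):
    """Return text from the LAST occurrence of marker onward, or None if absent.

    Hypothetical helper for illustration; not part of the script above.
    """
    index = text.rfind(marker)  # rfind returns -1 when the marker is missing
    if index == -1:
        return None
    return text[index:]


# Made-up sample with the marker appearing twice
sample = "aaa NODATACODE, first NODATACODE, : Food Meal"
print(keep_from_last_marker(sample))
```

Swapping rfind back to find gives the first-occurrence behaviour of the script above.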
Remember, opening a file in write mode (‘w’) as you did initially will immediately truncate the file, so it’s important to read its content before overwriting it. The syntax error you encountered is also fixed in this revised script.
This should do what you want. Note that we test whether the line begins with ‘NODATACODE’, not whether it is equal to it, and we use a flag so that the following lines are written to the output file too:
with open('input_file.txt') as f_in:
    with open('output_file.txt', 'w') as f_out:
        write_flag = False
        for line in f_in:
            if line.startswith('NODATACODE'):
                write_flag = True
            if write_flag:
                f_out.write(line)
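The loop above only triggers when a line starts with the marker. If you want to keep the line-by-line approach but the marker may sit in the middle of a line, one possible variant is sketched below. The generator name lines_from_marker and the sample lines are assumptions for illustration only; in practice you would pass it the open input file and write each yielded piece to the output file.

```python
def lines_from_marker(lines, marker="NODATACODE"):
    """Yield content starting at the first occurrence of marker, even mid-line.

    Hypothetical helper for illustration; `lines` can be any iterable of
    strings, such as an open file object.
    """
    found = False
    for line in lines:
        if found:
            # Marker already seen: pass every later line through unchanged
            yield line
        elif marker in line:
            found = True
            # Keep only the part of this line from the marker onward
            yield line[line.index(marker):]


# Made-up sample lines, loosely modelled on the data in the question
sample_lines = [
    "MMMMM ; MMMMM : MMMMMMMMMMN,\n",
    "AAAA,CCCCCC, NODATACODE, : Food Meal\n",
    "next line\n",
]
print("".join(lines_from_marker(sample_lines)), end="")
```

This keeps memory use low for large files, since only one line is held at a time.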
If ‘NODATACODE’ is likely to be inside a line, an approach with regex could be better:
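import re

with open('input_file.txt') as f_in:
    with open('output_file.txt', 'w') as f_out:
        data = f_in.read()
        # [\w\W]* matches any character, newlines included, and is greedy,
        # so everything up to the last 'NODATACODE' is replaced by the marker itself
        f_out.write(re.sub(r'[\w\W]*NODATACODE', 'NODATACODE', data))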
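One thing worth checking with the regex approach is which occurrence it cuts at when the marker appears more than once. The small comparison below uses a made-up sample string (an assumption, not your real data) to contrast the greedy pattern from the snippet above with a non-greedy one.

```python
import re

# Made-up sample with the marker appearing twice
sample = "AAAA,CCCCCC, NODATACODE, first NODATACODE, : Food Meal"

# Greedy: [\w\W]* consumes as much as possible, so the cut lands on the LAST marker
greedy = re.sub(r'[\w\W]*NODATACODE', 'NODATACODE', sample, count=1)

# Non-greedy: [\w\W]*? consumes as little as possible, so the cut lands on the FIRST marker
lazy = re.sub(r'[\w\W]*?NODATACODE', 'NODATACODE', sample, count=1)

print(greedy)
print(lazy)
```

If the marker appears only once, both patterns give the same result.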
Did you read the error message? if line.strip('\n') == 'NODATACODE':
Line 5 should be !=, but this is a wild guess since your question is not clear enough. In that file, are those lines separated by line breaks? Are there more lines after “NODATACODE”? The indentation is wrong. And I think you might need a read handle to get all the lines first, close it, and then a write handle to write the lines you want.
@AnalysisNerd, can you make a meaningful example and show the exact matching expected output?
@Timeless. Just making the required edits to my question. Thank you for your patience.