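Plain str.split() cannot tell a quoted newline from an unquoted one, so split the text with re.split() and a regular expression that skips newlines enclosed in double quotes. Here is an example: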
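import re

def split_lines(text):
    # Split on a newline only when an even number of double quotes
    # follows it, i.e. when the newline is outside any quoted span.
    lines = re.split(r'\n(?=(?:(?:[^"]*"){2})*[^"]*$)', text)
    return lines

# Test example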
text = '''This is a line.
This is another line.
"This is a line with quoted text.
And this is the second line with quoted text."
This is a normal line.
"This is a line with quoted text."
'''
result = split_lines(text)
print(result)
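Output:
['This is a line.', 'This is another line.', '"This is a line with quoted text.\nAnd this is the second line with quoted text."', 'This is a normal line.', '"This is a line with quoted text."', '']
The trailing empty string comes from the newline that ends the text; strip it first (for example with text.rstrip('\n')) if you do not want it.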
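In the code above, split_lines() uses the regular expression r'\n(?=(?:(?:[^"]*"){2})*[^"]*$)' to match newlines. The pattern relies on a positive lookahead: it matches a newline only if the rest of the text, up to the end of the string, contains an even number of double quotes. An even count means the newline sits outside any quoted span; an odd count means a quote is still open, so that newline is left alone.
re.split() then splits the text at every newline that matched and returns the pieces as a list, which the function returns.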
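The lookahead condition can also be written out by hand, which may make the parity rule easier to see. The following is a minimal sketch, not part of the original answer; the name split_lines_by_count is hypothetical. It counts the double quotes after each newline and splits only when that count is even. Like the regex, it rescans the rest of the text for every newline, so it is meant as an illustration rather than an efficient implementation.

```python
def split_lines_by_count(text):
    """Split on newlines that are followed by an even number of double quotes."""
    lines = []
    start = 0  # index where the current piece begins
    for i, ch in enumerate(text):
        # text.count('"', i + 1) counts the quotes located after position i
        if ch == "\n" and text.count('"', i + 1) % 2 == 0:
            lines.append(text[start:i])
            start = i + 1
    lines.append(text[start:])  # whatever remains after the last split
    return lines


sample = 'a\n"b\nc"\nd'
print(split_lines_by_count(sample))  # ['a', '"b\nc"', 'd']
```

For input with balanced quotes this should give the same pieces as the re.split() version.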
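Note that this approach only works when double quotes come in pairs. An unpaired quote flips the parity for every newline that precedes it, so the split will be wrong. The pattern also counts every " character, so backslash-escaped quotes (\") are treated as ordinary quotes.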
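To see what an unpaired quote does, it can help to compare the regex with a single-pass scanner that tracks whether it is currently inside quotes. This is a sketch under stated assumptions, not the original author's method: split_lines_scan and split_lines_regex are hypothetical names, and the scanner simply treats everything after an unclosed quote as quoted.

```python
import re


def split_lines_scan(text):
    """Split on newlines outside double quotes, scanning left to right once."""
    lines = []
    current = []       # characters of the piece being built
    in_quotes = False  # True while an opening quote has not been closed
    for ch in text:
        if ch == "\n" and not in_quotes:
            lines.append("".join(current))
            current = []
            continue
        if ch == '"':
            in_quotes = not in_quotes
        current.append(ch)
    lines.append("".join(current))
    return lines


def split_lines_regex(text):
    """Same pattern as split_lines() above, repeated here for comparison."""
    return re.split(r'\n(?=(?:(?:[^"]*"){2})*[^"]*$)', text)


balanced = 'a\n"b\nc"\nd'
unbalanced = 'a\nb "c\nd'  # one quote with no partner

print(split_lines_scan(balanced))     # ['a', '"b\nc"', 'd']
print(split_lines_scan(unbalanced))   # ['a', 'b "c\nd']
print(split_lines_regex(unbalanced))  # ['a\nb "c', 'd']
```

With the unpaired quote the two approaches disagree: the regex counts quotes from the end of the text, so it refuses to split the newline before the stray quote, while the scanner counts from the start and keeps everything after the stray quote together. Neither result is "right"; the input itself is ambiguous, so validating that quotes are balanced before splitting is probably the safer choice.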