This is probably due to the … after a specific position
Example code:
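import re

import requests
from bs4 import BeautifulSoup

response = requests.get("http://example.com")
soup = BeautifulSoup(response.text, 'html.parser')

# find() returns None when the page contains no <table>
table = soup.find('table')
rows = table.find_all('tr') if table else []

for row in rows:
    cells = row.find_all('td')
    for cell in cells:
        # Use a regular expression to strip the tags and keep the cell's content
        content = re.sub('<.*?>', '', str(cell))
        print(content)
        # Or use lxml: text_content() is a method of lxml elements,
        # so it applies only if the page was parsed with lxml.html;
        # on a BeautifulSoup Tag the equivalent is cell.get_text()
        # content = cell.text_content()
        # print(content)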
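The regular-expression step in the loop above can be tried without a network request. The following is only a minimal sketch: `SAMPLE_CELL` and `strip_tags` are hypothetical names introduced for illustration, and the sample markup is made up to stand in for `str(cell)`. It also shows one caveat of the regex approach, namely that HTML entities such as `&amp;` are left escaped unless they are decoded separately.

```python
import html
import re

# Hypothetical sample markup, standing in for str(cell) in the loop above.
SAMPLE_CELL = '<td class="price"><b>Tom &amp; Jerry</b> <span>42</span></td>'


def strip_tags(markup: str) -> str:
    """Remove anything that looks like a tag, then decode HTML entities."""
    # re.DOTALL lets the non-greedy match span a tag that is split across lines.
    without_tags = re.sub(r'<.*?>', '', markup, flags=re.DOTALL)
    # The regex leaves entities such as &amp; untouched, so decode them here.
    return html.unescape(without_tags).strip()


print(strip_tags(SAMPLE_CELL))  # prints: Tom & Jerry 42
```

A regex like this is fine for simple, well-formed cells, but it can misfire on markup that contains `<` or `>` inside attribute values or scripts; in those cases a parser method such as `get_text()` is probably the safer choice.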