Here is how to get href and src attributes using Beautiful Soup and Splinter: open the page with Splinter, hand the rendered HTML to Beautiful Soup, then filter elements by attribute.
The complete example code is as follows:
from bs4 import BeautifulSoup
from splinter import Browser

url = "https://example.com"  # placeholder: replace with the page you want to scrape

# Open the page with Splinter
browser = Browser()
browser.visit(url)
html = browser.html

# Parse the rendered HTML with Beautiful Soup
soup = BeautifulSoup(html, 'html.parser')

# Find every element that carries an href attribute
href_elements = soup.find_all(href=True)
for element in href_elements:
    print(element['href'])

# Find every element that carries a src attribute
src_elements = soup.find_all(src=True)
for element in src_elements:
    print(element['src'])

# Close the browser when done
browser.quit()
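Links collected this way are often relative paths, which need to be resolved against the page URL before they can be requested. A minimal sketch using the standard library's urljoin (the base URL and href values below are made up for illustration):

```python
from urllib.parse import urljoin

base_url = "https://example.com/articles/"  # hypothetical page URL

# Sample values as they might come out of soup.find_all(href=True)
hrefs = ["/about", "contact.html", "https://other.site/page"]

# Resolve each href against the page it was found on
absolute = [urljoin(base_url, h) for h in hrefs]
print(absolute)
# → ['https://example.com/about',
#    'https://example.com/articles/contact.html',
#    'https://other.site/page']
```

Root-relative paths replace everything after the host, document-relative paths resolve against the current directory, and already-absolute URLs pass through unchanged.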
Note: before running the code, install and configure the Beautiful Soup and Splinter libraries (Splinter also needs a WebDriver for the browser it drives), and change the URL in the code to suit your actual needs.
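If the page is static and installing Beautiful Soup or Splinter is not an option, the same attribute scan can be sketched with Python's built-in html.parser module; the sample HTML here is made up:

```python
from html.parser import HTMLParser

class AttrCollector(HTMLParser):
    """Collects every href and src attribute value encountered."""
    def __init__(self):
        super().__init__()
        self.hrefs = []
        self.srcs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples for the tag
        for name, value in attrs:
            if name == "href":
                self.hrefs.append(value)
            elif name == "src":
                self.srcs.append(value)

sample_html = '<a href="/home">Home</a><img src="logo.png">'  # made-up snippet
parser = AttrCollector()
parser.feed(sample_html)
print(parser.hrefs)  # → ['/home']
print(parser.srcs)   # → ['logo.png']
```

Unlike Splinter, this approach does not execute JavaScript, so it only sees attributes present in the raw HTML source.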