You can determine the last page by extracting the page parameter from the "go to the last page" element, then loop over every page while maintaining a web-scraping session via requests.Session():

import re

import requests
from bs4 import BeautifulSoup

with requests.Session() as session:
    # extract the last page
    response = session.get("http://www.engineering.careers360/colleges/list-of-engineering-colleges-in-India?sort_filter=alpha")
    soup = BeautifulSoup(response.content, "html.parser")
    # raw string so that \d is passed to the regex engine unchanged
    last_page = int(re.search(r"page=(\d+)", soup.select_one("li.pager-last").a["href"]).group(1))

    # loop over every page
    for page in range(last_page):
        # %d formats the page number as an integer (%f would produce "1.000000")
        response = session.get("http://www.engineering.careers360/colleges/list-of-engineering-colleges-in-India?sort_filter=alpha&page=%d" % page)
        soup = BeautifulSoup(response.content, "html.parser")

        # print the title of every search result
        for result in soup.select("li.search-result"):
            title = result.find("div", class_="title").get_text(strip=True)
            print(title)

This prints the title of every search result, one per line.
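As an optional variation on the regex and string formatting used in the code above, the page parameter can be read and written with the standard library's urllib.parse. This is only a minimal sketch: the helper names last_page_from_href and page_urls, the example.com host, and the page number 84 are made up for illustration and do not come from the original answer.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def last_page_from_href(href):
    """Return the integer value of the `page` query parameter in href.

    Hypothetical helper: raises KeyError if the link has no `page` parameter.
    """
    query = parse_qs(urlsplit(href).query)
    return int(query["page"][0])


def page_urls(base_url, pages):
    """Yield base_url once per page number, with `page` set in the query string.

    Hypothetical helper: `pages` is any iterable of integers, for example
    range(last_page) as in the loop above.
    """
    parts = urlsplit(base_url)
    for page in pages:
        params = parse_qs(parts.query)
        params["page"] = [str(page)]
        query = urlencode(params, doseq=True)
        yield urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))


if __name__ == "__main__":
    # Illustrative values only; the real href comes from the "li.pager-last" link.
    href = "/colleges/list?sort_filter=alpha&page=84"
    last_page = last_page_from_href(href)
    for url in page_urls("http://example.com/colleges/list?sort_filter=alpha", range(3)):
        print(url)
```

Each generated URL can then be passed to session.get() in place of the hand-formatted string. Parsing the query string this way keeps working if the order of the parameters in the link changes, which a fixed format string does not guarantee.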
