python - How to handle dynamic pagination in Selenium where XPath changes for each page?


I am making a scraper for the web.archive.org site. The scraper should open the following link and:

scroll down the page
scrape the information
go to the next page
repeat.

The problem is that I can't get the code to keep clicking the next-page link until there is nothing left to click. The current code clicks it once, but when it reaches page 2 it does not go on to page 3.

Here is the minimal code that represents the pagination:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
import time

URL = "https://web.archive.org/web/20230203123249/https://www.coinpeople.com/forum/64-new-member-information-and-welcome33/"

driver = webdriver.Chrome()
driver.get(URL)

time.sleep(5)

driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

time.sleep(2)

while True:
    try:
        next_page = WebDriverWait(driver, 10).until(
            EC.element_to_be_clickable((By.XPATH, '//*[@id="elPagination_30a5535a7a65933469c1ef7e81dc96e0_806040338"]/li[9]/a'))
        )
        next_page.click()
    except TimeoutException:
        print("No more pages to click.")
        break


driver.quit()

I have tried the following:

next_page = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, f'("a[rel='next']")')))

and

driver.execute_script("document.querySelector('[rel=next]').click();")

The problem is that the XPath changes on every page. Here are the XPaths of the next-page link on the first 3 pages:

//*[@id="elPagination_30a5535a7a65933469c1ef7e81dc96e0_806040338"]/li[9]/a
//*[@id="elPagination_f4be6b25f47a268e848f4596d5b1e3a6_66137219"]/li[10]/a
//*[@id="elPagination_728dd0ae3583cbafa454d08121ab9841_298843258"]/li[11]/a

What can I do so the program goes through all pages?
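
The three XPaths above differ only in the generated id and in the li index, so a locator keyed to a stable attribute would not depend on either. Below is a minimal sketch of that idea using only the standard library. The markup is hypothetical (modeled on the XPaths above, with the next-page link assumed to carry rel="next"), and make_pagination and find_next_href are illustrative names, not part of Selenium or of the site.

```python
import xml.etree.ElementTree as ET


def make_pagination(list_id, page_count):
    """Build hypothetical pagination markup: numbered links, then a rel="next" link."""
    items = "".join(
        f'<li><a href="?page={n}">{n}</a></li>' for n in range(1, page_count + 1)
    )
    return ET.fromstring(
        f'<ul id="{list_id}">{items}'
        '<li><a rel="next" href="?page=next">Next</a></li></ul>'
    )


def find_next_href(root):
    """Return the href of the first <a rel="next">, or None if there is none."""
    link = root.find(".//a[@rel='next']")
    return None if link is None else link.get("href")


# The ids come from the question; the number of numbered links is chosen so
# that the next-page link sits at li[9], li[10] and li[11] respectively.
samples = [
    ("elPagination_30a5535a7a65933469c1ef7e81dc96e0_806040338", 8),
    ("elPagination_f4be6b25f47a268e848f4596d5b1e3a6_66137219", 9),
    ("elPagination_728dd0ae3583cbafa454d08121ab9841_298843258", 10),
]
for list_id, count in samples:
    # The same attribute-based lookup works for every id and every position.
    print(list_id, find_next_href(make_pagination(list_id, count)))
```

In Selenium the equivalent locator would be (By.CSS_SELECTOR, "a[rel='next']") or (By.XPATH, "//a[@rel='next']"), written without the extra parentheses and quotes around the selector used in the attempt above. Whether the archived page really exposes rel="next" on that link should be confirmed in the browser's inspector.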

Asked Nov 19, 2024 at 10:32 by NewUser (edited Nov 19, 2024 at 10:32)

1 Answer

This worked in the end:

driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
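
For readers who want a full loop around that scroll call, here is a rough sketch rather than the code the answer author ran. It assumes the next-page link can be located by rel="next" and carries a usable href, which the question's attempts suggest but do not confirm. It also swaps clicking for direct navigation to that href, which avoids holding a reference to an element from the previous page. The names crawl_pages, scrape_page, max_pages and pause are hypothetical, and the URL assumes the web.archive.org and coinpeople.com domains.

```python
import time


def crawl_pages(driver, next_locator, scrape_page, max_pages=500, pause=2.0):
    """Visit pages until no next link is found; return the number of pages scraped.

    next_locator is a Selenium-style (by, value) tuple.
    """
    seen_urls = set()
    pages_done = 0
    while pages_done < max_pages:
        seen_urls.add(driver.current_url)
        # Scroll to the bottom, as in the answer, so the pagination bar is rendered.
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)
        scrape_page(driver)
        pages_done += 1
        # find_elements returns an empty list instead of raising when nothing matches.
        links = driver.find_elements(*next_locator)
        href = links[0].get_attribute("href") if links else None
        # Stop on the last page, or if the link points back to a page already seen.
        if not href or href in seen_urls:
            break
        driver.get(href)  # navigate directly instead of clicking
    return pages_done


def main():
    # Imported here so that crawl_pages can be tried with a stub driver
    # on a machine where Selenium is not installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    # Assumed URL: the domains are reconstructed from the question.
    url = (
        "https://web.archive.org/web/20230203123249/"
        "https://www.coinpeople.com/forum/64-new-member-information-and-welcome33/"
    )
    driver = webdriver.Chrome()
    try:
        driver.get(url)
        count = crawl_pages(
            driver,
            (By.CSS_SELECTOR, "a[rel='next']"),
            lambda d: print("scraping", d.current_url),  # placeholder scraper
        )
        print("pages visited:", count)
    finally:
        driver.quit()
```

Calling main() runs the crawl in Chrome. The max_pages cap and the seen_urls check are only safety nets in case the last page still shows a next link.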
