Python Web Scraping (BeautifulSoup)

Hyeonseop-Noh · April 22, 2022

URLs

Website

URL_HOLDER = 'https://polygonscan.com/token/0x6A8Ec2d9BfBDD20A7F5A4E89D640F7E7cebA4499'

Open API

URL_PRICE = 'https://openapi.digifinex.com/v3/ticker'
URL_CURRENCY = 'https://api.exchangerate-api.com/v4/latest/USD'

Target file path

FILE_PATH = '/var/www/html/customer/globalmsq.com/public_html/index.html'

Scrape the holder count

Find the element by its class name and take its text. Then strip the surrounding whitespace and the ' addresses' suffix so that only the holder count remains.

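import requests
from bs4 import BeautifulSoup
holder_page = requests.get(URL_HOLDER).content
holder_soup = BeautifulSoup(holder_page, 'html.parser')
# Keep only the count: drop the surrounding whitespace and the ' addresses' suffix.
holder_result = holder_soup.find(class_="mr-3").text.strip().replace(' addresses', '')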
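
The same lookup can be tried offline against a small HTML string. The sketch below is only an illustration: the sample markup and the helper name parse_holder_count are assumptions, not the real page or part of the original script. It also returns None when the element is missing, since find() returns None in that case and calling .text on it would raise AttributeError.

```python
from bs4 import BeautifulSoup

# Hypothetical markup that imitates the holder element; the real page may differ.
SAMPLE_HTML = '<div><div class="mr-3"> 1,234 addresses </div></div>'


def parse_holder_count(html):
    """Return the holder count as text, or None if the element is not found."""
    soup = BeautifulSoup(html, 'html.parser')
    tag = soup.find(class_="mr-3")
    if tag is None:
        return None
    return tag.text.strip().replace(' addresses', '')


print(parse_holder_count(SAMPLE_HTML))
```

Returning None gives the caller a chance to skip the update instead of crashing when the site changes its layout.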

Get the coin price

Send a GET request to the DIGIFINEX open API and read the last traded price of the msq_usdt pair from the response.

# The dict is sent as query parameters (?symbol=msq_usdt), not as HTTP headers.
params = { "symbol": "msq_usdt" }
price = requests.get(URL_PRICE, params=params).json()["ticker"][0]["last"]

Next, fetch the USD to KRW rate from the exchange rate API and use it as a multiplier to convert the price to Korean won. This treats 1 USDT as 1 USD.

currency = requests.get(URL_CURRENCY).json()["rates"]["KRW"]
price_result = "{:,}".format(round(price*currency, 2))
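
To see what the formatting step produces, here is a minimal sketch with made-up numbers. The helper name format_krw and the sample values are assumptions for illustration, not real market data, and the float() call is a defensive addition in case an API returns the price as a string.

```python
def format_krw(price_usdt, krw_per_usd):
    """Convert a USDT price to KRW and format it with thousands separators."""
    # float() is defensive: it accepts both numbers and numeric strings.
    return "{:,}".format(round(float(price_usdt) * krw_per_usd, 2))


# Made-up sample values, not real market data.
print(format_krw(0.5, 1240.0))   # 620.0
print(format_krw("2", 1000.0))   # 2,000.0
```

Note that "{:,}" keeps however many decimals round() leaves, so a whole number still shows one decimal place.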

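Edit the original HTML file

Chain find(class_="CLASS_NAME") calls to reach the statistics block, then use each tag's position in the find_all result to replace the target values: the first value (index 0) gets the price and the third (index 2) gets the holder count.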

with open(FILE_PATH) as inf:
  soup = BeautifulSoup(inf.read(), 'html.parser')
  values = soup.find(class_="section section-01").find(class_="statistics").find_all(class_="value")
  for index, tag in enumerate(values):
    if index == 0:  # first value: price in KRW
      tag.string.replace_with(price_result + " KRW")
    elif index == 2:  # third value: holder count
      tag.string.replace_with(holder_result)
  new_txt = soup.prettify()
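
The index-based replacement can be checked without touching the real file. The following sketch runs it on an in-memory fragment; the sample HTML, the helper name update_values, and the replacement strings are all assumptions made for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical page fragment that reuses the class names from the code above.
SAMPLE_HTML = """
<div class="section section-01">
  <div class="statistics">
    <span class="value">old price</span>
    <span class="value">unchanged</span>
    <span class="value">old holders</span>
  </div>
</div>
"""


def update_values(html, updates):
    """Replace the text of the .value tags whose position is a key in updates."""
    soup = BeautifulSoup(html, 'html.parser')
    section = soup.find(class_="section section-01")
    values = section.find(class_="statistics").find_all(class_="value")
    for index, tag in enumerate(values):
        if index in updates:
            tag.string = updates[index]
    return soup


demo_soup = update_values(SAMPLE_HTML, {0: "620.0 KRW", 2: "1,234"})
print([tag.text for tag in demo_soup.find_all(class_="value")])
```

Passing the positions as a dict keeps the mapping between index and value in one place, which may be easier to adjust if the page gains or loses a statistic.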

Overwrite

with open(FILE_PATH, 'w') as new_file:
  new_file.write(new_txt)
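
Because the page is served while the script rewrites it, a visitor could in principle receive a half-written file. One possible way to reduce that risk, sketched here as an assumption rather than as part of the original script, is to write to a temporary file in the same directory and then swap it into place with os.replace. The helper name write_atomically is hypothetical. Temporary files are created with restrictive permissions, so the sketch copies the mode of the existing file before the swap.

```python
import os
import shutil
import tempfile


def write_atomically(path, text):
    """Write text to a temp file next to path, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    try:
        with os.fdopen(fd, 'w', encoding='utf-8') as tmp_file:
            tmp_file.write(text)
        if os.path.exists(path):
            # Keep the original permissions so the web server can still read it.
            shutil.copymode(path, tmp_path)
        # os.replace swaps the file in a single step on the same filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the temp file if anything failed before the swap.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


# Demo in a throwaway directory instead of the real FILE_PATH.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, 'index.html')
write_atomically(demo_path, '<p>first</p>')
write_atomically(demo_path, '<p>second</p>')
with open(demo_path, encoding='utf-8') as demo_file:
    demo_content = demo_file.read()
print(demo_content)
```

In the script, write_atomically(FILE_PATH, new_txt) would take the place of the plain open(FILE_PATH, 'w') block.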

Crontab entry

This entry runs the script every minute.

* * * * * /usr/bin/python3 /var/www/html/customer/globalmsq.com/public_html/scrap.py

References

https://ostechnix.com/a-beginners-guide-to-cron-jobs/
https://towardsdatascience.com/how-to-schedule-python-scripts-with-cron-the-only-guide-youll-ever-need-deea2df63b4e
