Python Web Crawler

Time: 2023-03-08 19:40:21

Python version: 3.5.2

IDE: PyCharm

URL Parsing

https://docs.python.org/3.5/library/urllib.parse.html?highlight=urlparse#urllib.parse.urlparse

>>> from urllib.parse import urlparse
>>> o = urlparse('http://www.cwi.nl:80/%7Eguido/Python.html')
>>> o
ParseResult(scheme='http', netloc='www.cwi.nl:80', path='/%7Eguido/Python.html', params='', query='', fragment='')
>>> o.scheme
'http'
>>> o.port
80
>>> o.geturl()
'http://www.cwi.nl:80/%7Eguido/Python.html'
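
In a crawler, links found on a page are often relative and may carry a #fragment, so they usually need to be resolved against the page URL before they are queued. The following is a minimal sketch of one way to do that with urllib.parse, reusing the URL from the session above. The helper names normalize_link and same_host and the link 'FAQ.html#top' are made up for illustration and are not part of the library.

```python
from urllib.parse import urljoin, urldefrag, urlparse


def normalize_link(base_url, href):
    """Resolve href against base_url and drop any #fragment (hypothetical helper)."""
    absolute = urljoin(base_url, href)
    url, _fragment = urldefrag(absolute)
    return url


def same_host(url_a, url_b):
    """Return True when both URLs have the same hostname; the port is ignored."""
    return urlparse(url_a).hostname == urlparse(url_b).hostname


base = 'http://www.cwi.nl:80/%7Eguido/Python.html'
link = normalize_link(base, 'FAQ.html#top')  # 'FAQ.html#top' is an assumed example link
print(link)                   # http://www.cwi.nl:80/%7Eguido/FAQ.html
print(same_host(base, link))  # True
```

Comparing hostname rather than netloc means 'www.cwi.nl:80' and 'www.cwi.nl' count as the same site. Whether that is the right rule depends on the crawler; it is only an assumption here.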