There are a lot of things I’m not understanding.
First of all, the code runs well with the following (I added spaces so I could post links):
# Define root domain to crawl
domain = "openai . com"
full_url = "ht*ps: // openai. com"
I get a list of the files it’s crawling, but many say “can’t parse”. Why is that?
Secondly, if I change the code to:
# Define root domain to crawl
domain = "mysite. com"
full_url = "ht*ps: // mysite. com"
then I get a straight 403 error on the first request. Why?
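To narrow down the 403, a small check outside the crawler might help. This is only a sketch under assumptions: the helper names `build_request` and `fetch_status` and the `BROWSER_UA` string are made up for illustration and are not part of the crawler code. It requests the same URL with Python's default User-Agent and then with a browser-like one, since some servers appear to reject the default.

```python
import urllib.error
import urllib.request

# Hypothetical browser-like User-Agent; any realistic value could be substituted.
BROWSER_UA = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
)


def build_request(url, user_agent=None):
    """Build a GET request, optionally overriding the default User-Agent."""
    headers = {"User-Agent": user_agent} if user_agent else {}
    return urllib.request.Request(url, headers=headers)


def fetch_status(url, user_agent=None, timeout=10):
    """Return the HTTP status code for url, including error codes such as 403."""
    try:
        request = build_request(url, user_agent)
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx; the code is still the useful information here.
        return err.code
```

Calling `fetch_status(full_url)` and then `fetch_status(full_url, BROWSER_UA)` and comparing the two results would show whether the response depends on the User-Agent header. If both return 403, the block is probably coming from something else, such as a firewall rule or bot protection on the site.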
Thank you