Python Syntax Error

I have a Python script (attached) that works functionally, but I’m getting a syntax error on the return line and can’t figure out why it doesn’t work despite indenting the line. Can you please help solve the indentation or syntax issue? Sorry, I’m not very familiar with Python indentation, so any advice/help would be appreciated.
This is the error:
File "scrape.py", line 26
return e.extract(r.text,base_url=url)
^
SyntaxError: 'return' outside function

And this is the script:
scrape.js (1.4 KB)

Syntax errors are usually related to an older version of Python.

Perhaps you can explain what you mean @msm1365? Your reply isn’t very enlightening.

Python is quite picky when it comes to indentation. I’ll assume you wanted to place your return statement inside the scrape() function but forgot to indent it, which is why Python thinks the return statement isn’t part of the scrape() function.
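
To illustrate the point, here is a minimal sketch that compiles two tiny stand-in snippets (not the attached script): one with the return at column 0 and one with it indented under the function. The helper name syntax_error_of and the snippet contents are made up for this example.

```python
# Hypothetical stand-ins for the script: in `broken` the return sits at
# column 0, so Python no longer treats it as part of scrape().
broken = "def scrape(url):\n    r = url.upper()\nreturn r\n"
fixed = "def scrape(url):\n    r = url.upper()\n    return r\n"


def syntax_error_of(source):
    """Return the SyntaxError message for source, or None if it compiles."""
    try:
        compile(source, "scrape.py", "exec")
    except SyntaxError as exc:
        return exc.msg
    return None


print(syntax_error_of(broken))  # the 'return' outside function message
print(syntax_error_of(fixed))   # None
```

The only difference between the two snippets is the indentation of the last line, which is what decides whether the return belongs to the function body.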

Yes, but when I try that, it complains about unexpected indentation.

So you have it looking something like this?

from selectorlib import Extractor
import requests 
from time import sleep
import csv
# Create an Extractor by reading from the YAML file
e = Extractor.from_yaml_file('booking.yml')


def scrape(url):    
	headers = {
		'Connection': 'keep-alive',
		'Pragma': 'no-cache',
		'Cache-Control': 'no-cache',
		'DNT': '1',
		'Upgrade-Insecure-Requests': '1',
		# You may want to change the user agent if you get blocked
		'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.113 Safari/537.36',
		'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
		'Referer': 'https://www.booking.com/index.en-gb.html',
		'Accept-Language': 'en-GB,en-US;q=0.9,en;q=0.8',
              }

	# Download the page using requests
	print("Downloading %s"%url)
	r = requests.get(url, headers=headers)

	# Pass the HTML of the page and create 
	return e.extract(r.text,base_url=url)

with open("urls.txt",'r') as urllist, open('data.csv','w') as outfile:
	fieldnames = [
		"name",
		"location",
		"price",
		"price_for",
		"room_type",
		"beds",
		"rating",
		"rating_title",
		"number_of_ratings",
		"url"
		]
	writer = csv.DictWriter(outfile, fieldnames=fieldnames,quoting=csv.QUOTE_ALL)
	writer.writeheader()
	for url in urllist.readlines():
	     data = scrape(url) 
	     if data:
		for h in data['hotels']:
			writer.writerow(h)

# sleep(5)

The one you posted is better indented, though it is now complaining about line 48:
File ".\scrape.py", line 48
for h in data['hotels']:
^
TabError: inconsistent use of tabs and spaces in indentation

You are using spaces in one place and tabs in another; you need to pick one.
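
If it helps to track down where the mix is, here is a rough sketch of a checker that lists lines whose leading whitespace contains both tabs and spaces. The function names and the sample string are made up for this example; the sample imitates the loop at the end of the posted script.

```python
def find_mixed_indentation(source):
    """Return 1-based line numbers whose leading whitespace mixes tabs and spaces."""
    flagged = []
    for number, line in enumerate(source.splitlines(), start=1):
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if " " in indent and "\t" in indent:
            flagged.append(number)
    return flagged


def raises_tab_error(source):
    """Return True if compiling source fails with a TabError."""
    try:
        compile(source, "scrape.py", "exec")
    except TabError:
        return True
    except SyntaxError:
        return False
    return False


# Imitation of the posted loop: lines 2 and 3 use a tab followed by spaces,
# lines 4 and 5 use tabs only.
sample = (
    "for url in urls:\n"
    "\t     data = scrape(url)\n"
    "\t     if data:\n"
    "\t\tfor h in data['hotels']:\n"
    "\t\t\twriter.writerow(h)\n"
)

print(find_mixed_indentation(sample))  # [2, 3]
print(raises_tab_error(sample))        # True
```

Note that Python can report the TabError on a later line than the ones that actually contain the mix, because the error is raised where the indentation can no longer be compared unambiguously. The standard library also ships a checker you can run as python -m tabnanny scrape.py.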

That was probably me trying to clean up the display.

@hm9, like RiversideRocks said, make sure there aren’t any errant spaces instead of tabs (or the other way around, however you want to make it work…)
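
For reference, a sketch of the writing loop with four spaces at every level and no tabs. To keep it runnable on its own, scrape() is replaced by a hypothetical stub, the field list is shortened, and in-memory buffers stand in for urls.txt and data.csv; none of that is from the original script. The url.strip() call is also an addition, to drop the trailing newline from each line.

```python
import csv
import io


def scrape(url):
    # Hypothetical stub standing in for the real scrape(); it returns data
    # in the shape the loop expects.
    return {"hotels": [{"name": "Example Hotel", "url": url}]}


fieldnames = ["name", "url"]  # shortened from the list in the thread

# In-memory stand-ins for urls.txt and data.csv (assumption for this sketch).
urllist = io.StringIO("https://example.com/a\nhttps://example.com/b\n")
outfile = io.StringIO()

writer = csv.DictWriter(outfile, fieldnames=fieldnames, quoting=csv.QUOTE_ALL)
writer.writeheader()
for url in urllist:
    data = scrape(url.strip())
    if data:
        for h in data["hotels"]:
            writer.writerow(h)

print(outfile.getvalue())
```

Most editors have a setting to insert spaces when you press Tab, which avoids the problem coming back.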
