Getting some data

A project log for Prize stats

having a crack at playing with the data from hackadayPrize2016 to look for trends

johnowhitaker 04/25/2016 at 17:58

I'm a little past the deadline, so I may net some extra data not counted in the first round but who cares. Here goes:

I couldn't find an easy way to access a list of entries using the API, so I went to https://hackaday.io/submissions/prize2016/list and downloaded the page source. Some regexp magic, and I have a list of unique project numbers for which I can download data:

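import urllib, json, time, re
# Match strings like 'project/XXXX-name...' (raw string, so \d is not read as a string escape)
p = re.compile(r'project/\d+-')
with open("source.html", "r") as f: # the saved source
	matches = p.findall(f.readlines()[2]) # the relevant bit is the third line of the saved file
# 'project/' is 8 characters and each match ends in '-', so s[8:-1] is just the number;
# set() removes duplicates
project_numbers = [int(s[8:-1]) for s in set(matches)]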
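To see what the pattern does without the saved page to hand, here is a small self-contained sketch. The sample_html string is made up for illustration (the real page's markup may well differ). The capture group and sorted() are my additions, the latter only to make the output order predictable.

```python
import re

# Made-up fragment standing in for the saved page source (hypothetical markup)
sample_html = (
    '<a href="/project/1234-blinky-badge">Blinky badge</a>'
    '<a href="/project/1234-blinky-badge">thumbnail</a>'
    '<a href="/project/987-solar-logger">Solar logger</a>'
)

# Same idea as the pattern above, with a capture group around the digits
pattern = re.compile(r'project/(\d+)-')

# With one group, findall returns only the captured digits, so no slicing is needed;
# the set comprehension drops the duplicate link
ids = sorted({int(n) for n in pattern.findall(sample_html)})
print(ids)  # [987, 1234]
```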
Now I can use the Hackaday API to get a JSON object describing each project:
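import urllib.request # urlopen lives here in Python 3
url = "https://api.hackaday.io/v1/projects/%s?api_key=MY_KEY" # note the '?' before the query string; MY_KEY is your own key
data = []
for ID in project_numbers:
	response = urllib.request.urlopen(url % ID)
	data.append(json.loads(response.read()))
	time.sleep(1) # pause for a second between requests
	print(ID) # progress indicator
import numpy
# numpy.save writes bytes, so the file must be opened in binary mode ('wb')
numpy.save(open("projects.txt", 'wb'), data) # so we don't have to do this each time we start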
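One thing to note about the loop above: urlopen raises an exception on a failed request, so a single bad project ID ends the whole run. Below is a rough sketch of a more tolerant version, not what I actually ran. The names project_url and fetch_projects, the opener parameter (there so the function can be tried without network access) and the delay parameter are all made up for this example.

```python
import json
import time
import urllib.parse
import urllib.request

API_ROOT = "https://api.hackaday.io/v1/projects/"  # same endpoint as above


def project_url(project_id, api_key):
    """Build the request URL, letting urlencode format and escape the query string."""
    query = urllib.parse.urlencode({"api_key": api_key})
    return "%s%d?%s" % (API_ROOT, project_id, query)


def fetch_projects(ids, api_key, opener=urllib.request.urlopen, delay=1.0):
    """Download each project; return (data, failed_ids) instead of stopping at the first error."""
    data, failed = [], []
    for project_id in ids:
        try:
            response = opener(project_url(project_id, api_key))
            data.append(json.loads(response.read()))
        except (OSError, ValueError):
            # OSError covers urllib's URLError/HTTPError and socket timeouts;
            # ValueError covers a response body that is not valid JSON
            failed.append(project_id)
        time.sleep(delay)  # same pause between requests as the original loop
    return data, failed
```

Calling fetch_projects(project_numbers, "MY_KEY") would then give back both the downloaded records and a list of IDs worth retrying.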
So now I can get a project's view count with data[i][u'views'] and so on.
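As a sketch of pulling fields back out, the snippet below uses two made-up records in place of real API responses. Only the views field comes from this log; the other keys and all the values are invented. It also caches the list with the json module instead of numpy.save, a swapped-in alternative that keeps the file human-readable.

```python
import json
import os
import tempfile

# Invented stand-ins for the API responses collected in `data`
data = [
    {"id": 1234, "name": "Blinky badge", "views": 150},
    {"id": 987, "name": "Solar logger", "views": 42},
]

# Cache the list as plain JSON (an alternative to numpy.save, not what the log above does)
cache_path = os.path.join(tempfile.mkdtemp(), "projects.json")
with open(cache_path, "w") as f:
    json.dump(data, f)

# A later session can reload the cache instead of hitting the API again
with open(cache_path) as f:
    cached = json.load(f)

views = [project["views"] for project in cached]
most_viewed = max(cached, key=lambda project: project["views"])
print(views)                # [150, 42]
print(most_viewed["name"])  # Blinky badge
```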
