
I am making the world better by publishing free data sets for the semantic web.

The first product I developed is

which is a conversion of Freebase data to industry-standard RDF. Every week, Freebase publishes a new data dump, and I use the open source

software to process it into a usable product. More recently I've downloaded Wikipedia Pagecount information into the AWS cloud. It records how many hits every page on every Wikipedia site received each hour from 2008 to the present. The data set is several terabytes in size and costs several hundred dollars a month to store. I've also created an open source project for working with this data using Amazon Elastic MapReduce

and I've averaged this data over time to produce popularity statistics for all of the concepts in Wikipedia:

from which no end of interesting reports can be generated.

The full data set (nearly 4 TB) is now available for free in the AWS cloud.
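To give a feel for the averaging step mentioned above, here is a minimal sketch in Python. It is not the code I actually run, which works at scale on Elastic MapReduce. It assumes each hourly file holds space-separated records of the form `project page_title view_count bytes`, and it counts a page that is missing from an hour as zero hits for that hour. The function names `parse_line` and `average_hourly_views` are made up for this illustration.

```python
from collections import defaultdict


def parse_line(line):
    """Parse one 'project page_title view_count bytes' record.

    Returns (project, title, views), or None if the line is malformed.
    The four-field layout is an assumption about the input format.
    """
    parts = line.split(" ")
    if len(parts) != 4:
        return None
    project, title, views, _size = parts
    try:
        return project, title, int(views)
    except ValueError:
        return None


def average_hourly_views(hourly_files):
    """Average view counts per page across a sequence of hourly files.

    hourly_files is an iterable with one item per hour; each item is an
    iterable of text lines. A page absent from an hour contributes zero
    views for that hour, so the mean is total views / number of hours.
    """
    totals = defaultdict(int)
    hours = 0
    for lines in hourly_files:
        hours += 1
        for line in lines:
            record = parse_line(line.rstrip("\n"))
            if record is None:
                continue  # skip malformed records rather than fail
            project, title, views = record
            totals[(project, title)] += views
    if hours == 0:
        return {}
    return {page: total / hours for page, total in totals.items()}


if __name__ == "__main__":
    # Two tiny made-up hours of data, for illustration only.
    hour_1 = ["en Main_Page 10 100", "en Foo 4 50"]
    hour_2 = ["en Main_Page 20 200", "not a valid record"]
    for page, mean in sorted(average_hourly_views([hour_1, hour_2]).items()):
        print(page, mean)
```

On the real data the same sum-then-divide logic would be split into a map step that emits a count per page and a reduce step that adds them up, since the full set is far too large for one machine's memory.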

I am finding that more and more people are doing projects using :BaseKB data such as

At this point I'm getting data set creation down to a science, and the thing that limits me is my budget for cloud resources.

There's a spiritual teaching that people should give 10% of their earnings to good causes. Some people give to a church, some give to the United Way, but I think Gittip is a good place to do that, so I have a policy of giving 10% of my Gittip earnings to other Gittip users to promote general prosperity.

