@furion · (edited)
$123.58
Roadmap for SteemData 2.0 ∙ Crowdfunding

SteemData

Last week I launched a preview of SteemData, and the response, both public and private, has been fantastic.

Based on your feedback, I've created a roadmap for SteemData 2.0.

Main Features

Real-Time

One of the major drawbacks of the current design is that the database lags behind the blockchain state. While this is sufficient for a large array of applications, it is not optimal for certain kinds of user-facing apps that rely on fresh data.

A major redesign to an event-based model is required, so that when new blocks become available, all relevant parts of the database get updated in near-real-time.
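
One way to picture the event-based model is a small dispatcher that routes each operation in a newly arrived block to the handlers interested in it. The sketch below is only an illustration: the `BlockEventBus` class, the handler signature, the block layout (`block_num` plus a list of operation dicts with a `type` field) and the sample `vote` operation are assumptions made for this example, not SteemData's actual design.

```python
from collections import defaultdict


class BlockEventBus:
    """Hypothetical dispatcher: routes operations to handlers by type."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, op_type, handler):
        """Register a handler that is called as handler(block_num, op)."""
        self._handlers[op_type].append(handler)

    def process_block(self, block):
        """Dispatch every operation in the block; return the number of handler calls."""
        calls = 0
        for op in block["operations"]:
            for handler in self._handlers.get(op["type"], []):
                handler(block["block_num"], op)
                calls += 1
        return calls


# Example: remember which accounts need refreshing after a transfer.
stale_accounts = set()


def mark_accounts_stale(block_num, op):
    stale_accounts.update([op["from"], op["to"]])


bus = BlockEventBus()
bus.on("transfer", mark_accounts_stale)

# Assumed block layout, for illustration only.
handled = bus.process_block({
    "block_num": 6717326,
    "operations": [
        {"type": "transfer", "from": "bittrex", "to": "poloniex",
         "amount": "466.319 STEEM"},
        {"type": "vote", "voter": "alice", "author": "bob"},
    ],
})
```

Only the parts of the database touched by an operation are marked for update here, which is the property that would keep the data close to real time.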

Native Data Types

A vast majority of the data in the current version is under-typed. This makes it hard for people to make good queries, and often requires additional post-processing to get the results we're interested in. SteemData 2.0 needs to address these issues.
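
As a rough sketch of what native typing could look like, the snippet below converts an amount string such as "466.319 STEEM" and a timestamp string such as "2016-11-14T13:21:30" into a number and a `datetime`. The helper names and the `{"amount": ..., "asset": ...}` layout are assumptions for illustration; the actual SteemData 2.0 schema may differ.

```python
from datetime import datetime


def parse_amount(raw):
    """Split a string like '466.319 STEEM' into a numeric amount and an asset name."""
    value, asset = raw.split()
    return {"amount": float(value), "asset": asset}


def parse_timestamp(raw):
    """Parse a timestamp string like '2016-11-14T13:21:30' into a datetime."""
    return datetime.strptime(raw, "%Y-%m-%dT%H:%M:%S")


def to_native(op):
    """Return a copy of an operation with 'amount' and 'timestamp' typed natively."""
    typed = dict(op)
    if isinstance(typed.get("amount"), str):
        typed["amount"] = parse_amount(typed["amount"])
    if isinstance(typed.get("timestamp"), str):
        typed["timestamp"] = parse_timestamp(typed["timestamp"])
    return typed


typed_op = to_native({
    "type": "transfer",
    "timestamp": "2016-11-14T13:21:30",
    "amount": "466.319 STEEM",
})
```

With values stored this way, range queries on amounts and dates work directly, without post-processing strings on the client.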

Indexes

The indexes need to be studied and optimized to fit real world usage patterns. Furthermore, it would be nice to have relationships between objects in different collections.

Backups

It takes more than a week to re-create (sync) the database from scratch. This is why SteemData needs to implement automatic Amazon S3 snapshots, so that in the event of catastrophic failure, the service can be brought back to life with minimum downtime.

Open Source

SteemData is written in Python and uses Docker services for deployment/orchestration.
The software stack needs to be refactored, documented and made available so that anyone can deploy their own version of SteemData for personal use.

Furthermore, reference implementations for Python and JavaScript clients, as well as helper utilities, should be created.

The license will be MIT, as it is highly permissive and basically grants users the power to do whatever they like. This will hopefully also help developers create support for different databases (SQL, Firebase, etc.).

More Features

  • Proper Integration of Post Comments
  • Restructure Operations
  • Support for HF 16.1 and beyond
  • Historic Price Feeds
  • Steemle-like charts on SteemData.com

Target shipping date: March 1st 2017

Donations

The Steem community has proven that a donation model can work: Busy raised close to $60,000 for the awesome work they're doing.

The main goal for SteemData is to help developers create new and interesting applications for the STEEM ecosystem. I believe that the ecosystem will play an important role in Steem's future success, and I would love to contribute in my humble little way.

Crowdfunding

To fund the development of SteemData 2.0 I am looking to raise $5,000 in STEEM or SBD.

The donations should be sent to @steemdata, and the list of friendly donors will be published and updated here, as well as in future announcements.

Jan 22nd: We have raised $2,600 of the $5,000 goal so far.

Supporters
@cass $2,500
@fabien $100
Created: 2017-01-17 15:10:15
@chinadaily ·
$13.73
Some information about steemit users which you may be interested in. (To Test steemdata) / Some data about users, testing steemdata

Some days ago, @furion published an amazing project: steemdata.
You can find more information about it here: https://steemit.com/steemdata/@furion/introducing-steemdata-a-database-layer-for-steem

It is great because it makes development easier.
I grabbed some interesting data from steemdata, and it works well.

Top 10 Users (sorted by Reputation score)

Rank  ID                 Reputation score
1     @steemsports       77.12
2     @knozaki2015       75.81
3     @gavvet            75.35
4     @ozchartart        75.21
5     @ericvancewalton   74.40
6     @krnel             74.31
7     @curie             74.15
8     @sirwinchester     74.08
9     @stellabelle       73.68
10    @dantheman         72.90

Top 10 Users (sorted by Steem Power)

Rank  ID             Steem Power
1     @steemit       87957012.36
2     @ned           5709083.28
3     @blocktrades   5244418.14
4     @dan           4713097.20
5     @jamesc        3314099.49
6     @freedom       3086063.05
7     @val-a         2958387.01
8     @abit          2952517.73
9     @smooth        2632645.43
10    @ben           2523023.72

Top 10 Users (sorted by Followers count)

Rank  ID                 Followers
1     @dollarvigilante   3560
2     @dantheman         2623
3     @ned               2028
4     @always1success    1930
5     @charlieshrem      1821
6     @instructor2121    1703
7     @larkenrose        1679
8     @stellabelle       1657
9     @gavvet            1555
10    @blocktrades       1521

Top 10 Users (sorted by Following count)

Rank  ID                Following
1     @always1success   21220
2     @instructor2121   16127
3     @arnoldwish       13938
4     @joanaltres       13379
5     @carlobelgado     12761
6     @steemlinks       11899
7     @sergey44         11094
8     @manosteel211     10906
9     @katiasan1978     9067
10    @skyefox          8954

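Chinese version

A few days ago, @furion released the steemdata project.
Details can be found here: https://steemit.com/steemdata/@furion/introducing-steemdata-a-database-layer-for-steem

It is great because it makes developing steemit-related programs easier.
Using steemdata, I grabbed some data I was interested in, and it performed well.

Many thanks to @furion for providing this project.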

Created: 2017-01-15 13:35:06
@furion ·
$645.65
Introducing SteemData - A Database Layer for STEEM

Why

The goal of the SteemData project is to make data from the STEEM blockchain more accessible to developers, researchers and 3rd party services.

Today, most apps use steemd as the source of data. In this context, steemd is used for fetching information about the blockchain itself, requesting blocks, and fetching recent content (i.e. new blog posts from a user, homepage feed, etc.).

Unfortunately it also comes with a few shortcomings.

Running steemd locally is very hard due to its growing RAM requirements (none of my computers are capable of running it). This means that we have to rely on remote RPCs, which brings up another issue: time.

It takes a long time for a round-trip request to a remote RPC server (sometimes more than 1 second per request).

Because steemd was never intended for running queries, aggregates, map-reduce or text search, it is not well equipped to deal with historic data. If we are interested in historic data, we have to fetch it block by block from the remote RPC, which takes a very long time.

For example, fetching the data required to create a monthly STEEM report now takes more than a week. This is simply not feasible.

Hello MongoDB

I have chosen MongoDB for this project for a couple of reasons:

  • Mongo is a document-based database, which is great for storing unstructured (schema-less) data.
  • Mongo has a powerful and expressive query language, and can run aggregate queries and JavaScript functions directly in its shell (for example, the map-reduce pattern).
  • By utilizing Mongo's Oplog we can 'subscribe' to new data as well as database changes. This is useful for creating real-time applications.
  • Steemit Inc is already developing a MySQL-based solution, and a Microsoft SQL solution exists at http://steemsql.com/

Server

I have set up a preview version of the database as a service. You can access it at:

Host: mongo0.steemdata.com
Port: 27017

Database: Steem
Username: steemit
Password: steemit

The steemit user account is read-only.
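
For scripted access, the connection details above can be assembled into a standard MongoDB connection URI. This is a minimal sketch: the helper names are made up for this example, placing the database name in the URI path (so that it also serves as the authentication database) is an assumption, and `connect` relies on the third-party `pymongo` package, which must be installed separately.

```python
from urllib.parse import quote_plus


def steemdata_uri(host="mongo0.steemdata.com", port=27017, database="Steem",
                  username="steemit", password="steemit"):
    """Build a MongoDB connection URI from the published read-only credentials."""
    return "mongodb://%s:%s@%s:%d/%s" % (
        quote_plus(username), quote_plus(password), host, port, database)


def connect(uri, timeout_ms=5000):
    """Open the database named in the URI (requires the pymongo package)."""
    from pymongo import MongoClient  # imported lazily; third-party dependency
    client = MongoClient(uri, serverSelectionTimeoutMS=timeout_ms)
    return client.get_default_database()


uri = steemdata_uri()
```

Calling `connect(uri)` would then return a database handle whose collections can be queried with `find`, assuming the preview server is reachable.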

I highly recommend RoboMongo as a GUI utility for experimenting with the database.

After you're connected, you can run queries against any collection, as in the examples under Data Layout.

Data Layout

Accounts

Accounts contains Steem Accounts and their:

  • account info / profile
  • balances
  • vesting routes
  • open conversion requests
  • voting history on posts
  • a list of followers and followings
  • witness votes
  • curation stats

Example
Find all Steemit users that have more than 500 followers, hold at most 50,000 SBD in cash, have set their profile picture, and follow me (@furion) on Steemit.

db.getCollection('Accounts').find({
    'followers_count': {'$gt': 500},
    'balances.SBD': {'$lte': 50000},
    'profile.profile_image': {'$exists': true},
    'following': {'$in': ['furion']},
    })
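
The same filter can be written as a plain Python dict and passed to a MongoDB driver. The sketch below is hedged: `accounts_filter` and `find_accounts` are hypothetical helpers, and `db` is assumed to be a database handle obtained from a driver such as `pymongo`. Note that the shell's `true` becomes Python's `True`.

```python
def accounts_filter(followed, more_than_followers=500, max_sbd=50000):
    """Build the Accounts filter from the example as a Python dict."""
    return {
        "followers_count": {"$gt": more_than_followers},
        "balances.SBD": {"$lte": max_sbd},
        "profile.profile_image": {"$exists": True},
        "following": {"$in": [followed]},
    }


def find_accounts(db, followed):
    """Run the filter against the Accounts collection of a driver database handle."""
    return db["Accounts"].find(accounts_filter(followed))


query = accounts_filter("furion")
```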

Posts

Posts provide us with easy to query post objects, and include content, metadata, and a few added helpers. They also come with all the replies, which are also full Post objects.

A few extra niceties:

  • body field supports Full Text Search
  • timestamps are parsed as native ISO dates
  • amounts are parsed as Amount objects

Example
Find all Posts by @steemsports from October 2016 which have raised at least $200.5 in post rewards, have more than 20 comments, and mention @theprophet0 in the metadata.

db.getCollection('Posts').find({
    'author': 'steemsports',
    'created': {
        '$gte': ISODate('2016-10-01 00:00:00.000Z'),
        '$lt': ISODate('2016-11-01 00:00:00.000Z'),
    },
    'total_payout_reward.amount': {'$gte': 200.5},
    '$where': 'this.replies.length > 20',
    'json_metadata.people': {'$in': ['theprophet0']},
})
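
When building this query from Python, the month bounds can be computed rather than typed by hand, and the `$where` clause can be swapped for a plain field test. This is a sketch under assumptions: the helper names are made up, and the `replies.N` existence check is a substitute technique, not the author's query. It relies on MongoDB's dot notation for array positions, where an element at index 20 exists only if the array holds more than 20 items.

```python
from datetime import datetime


def month_range(year, month):
    """Return (start, end) datetimes covering one calendar month, end exclusive."""
    start = datetime(year, month, 1)
    if month == 12:
        end = datetime(year + 1, 1, 1)
    else:
        end = datetime(year, month + 1, 1)
    return start, end


def posts_filter(author, year, month, min_payout, more_than_replies, mentioned):
    """Build a Posts filter equivalent to the shell example, without $where."""
    start, end = month_range(year, month)
    return {
        "author": author,
        "created": {"$gte": start, "$lt": end},
        "total_payout_reward.amount": {"$gte": min_payout},
        # 'replies.20' exists only when the replies array has more than 20 items
        "replies.%d" % more_than_replies: {"$exists": True},
        "json_metadata.people": {"$in": mentioned},
    }


october_posts = posts_filter("steemsports", 2016, 10, 200.5, 20, ["theprophet0"])
```

Avoiding `$where` means the server does not have to evaluate JavaScript for every candidate document.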

Example 2
Find all posts which mention meteor in their body:

db.getCollection('Posts').find({'$text': {'$search': 'meteor'}})

Operations

Operations represent the entire blockchain, as seen through a time series of individual actions, such as:

operation_types = [
    'vote', 'comment_options', 'delete_comment', 'account_create', 'account_update',
    'limit_order_create', 'limit_order_cancel',
    'transfer', 'transfer_to_vesting', 'withdraw_vesting', 'convert', 'set_withdraw_vesting_route',
    'pow', 'pow2', 'feed_publish', 'witness_update',
    'account_witness_vote', 'account_witness_proxy',
    'recover_account', 'request_account_recovery', 'change_recovery_account',
    'custom', 'custom_json'
]

Operations have the same structure as on the Blockchain, but come with a few extra fields, such as timestamp, type and block_num.

Example
Find all transfers in block 6717326.

db.getCollection('Operations').find({'type':'transfer', 'block_num': 6717326})

We get 1 result:

{
    "_id" : ObjectId("584eac2fd6194c5ab027f671"),
    "from" : "bittrex",
    "to" : "poloniex",
    "type" : "transfer",
    "timestamp" : "2016-11-14T13:21:30",
    "block_num" : 6717326,
    "amount" : "466.319 STEEM",
    "memo" : "83ad5b2c56448d45"
}

VirtualOperations

Virtual Operations represent all actions performed by individual accounts, such as:

    types = {
        'account_create',
        'account_update',
        'account_witness_vote',
        'comment',
        'delete_comment',
        'comment_reward',
        'author_reward',
        'convert',
        'curate_reward',
        'curation_reward',
        'fill_order',
        'fill_vesting_withdraw',
        'fill_convert_request',
        'set_withdraw_vesting_route',
        'interest',
        'limit_order_cancel',
        'limit_order_create',
        'transfer',
        'transfer_to_vesting',
        'vote',
        'witness_update',
        'account_witness_proxy',
        'feed_publish',
        'pow', 'pow2',
        'liquidity_reward',
        'withdraw_vesting',
        'transfer_to_savings',
        'transfer_from_savings',
        'cancel_transfer_from_savings',
        'custom',
    }

Virtual Operations have the same structure as in the steemd database, but come with a few extra fields, such as account, timestamp, type, index and trx_id.

Example:
Query all transfers from @steemsports to @furion in October 2016.

db.getCollection('VirtualOperations').find({
    'account': 'steemsports',
    'type': 'transfer',
    'to': 'furion',
    'timestamp': {
        '$gte': ISODate('2016-10-01 00:00:00.000Z'),
        '$lt': ISODate('2016-11-01 00:00:00.000Z'),
    }})
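
The query above uses fixed date bounds. If a rolling window is wanted instead, the bounds could be computed relative to the current time, as in this sketch. The helper name and the 30-day default are assumptions, and `now` is passed in explicitly so that the result is reproducible.

```python
from datetime import datetime, timedelta


def recent_transfers_filter(sender, receiver, now, days=30):
    """Build a VirtualOperations filter for transfers within the last `days` days."""
    return {
        "account": sender,
        "type": "transfer",
        "to": receiver,
        "timestamp": {"$gte": now - timedelta(days=days), "$lt": now},
    }


window = recent_transfers_filter("steemsports", "furion", now=datetime(2017, 1, 24))
```

In a live script, `now` would typically be `datetime.utcnow()`, on the assumption that timestamps are stored in UTC.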

TODO

  • [] Historic 3rd party price feeds (partially done)
  • [] add Indexes based on usage patterns (partially done)
  • [] parse more values into native data types
  • [] create relationships using HRefs
  • [] Create Open-Source Server (Python+Docker based)
  • [] Create Open-Source Client Libraries (Python, JS?)

Looking for feedback and testers

I would love to get community feedback on the database structure, as well as feature requests.

If you're a hacker and have a cool app idea, feel free to use the public mongo endpoint provided by steemdata.com.

Expansion Ideas

I would love to expand this service to PostgreSQL as well as build a https://steemdata.com portal with useful utilities, statistics and charts.

Sponsored by SteemSports

The quad-core bare-metal server with 32GB of RAM that powers SteemData has been kindly provided by SteemSports.



Don't miss out on the next post - follow me

Created: 2017-01-10 18:34:24