python - Changing a dictionary consisting of 16k dicts to a Pandas DataFrame


I'm working on a data mining problem for my master's thesis. I'm using Python for the data analysis, but I have no experience with pandas, and I need to convert my data to a DataFrame. In order to do survival regression with a Python package called lifelines, I need to create a covariate matrix from experiment_data, a dict containing over 16k dicts of Twitter data on Kickstarter projects (see the example dict below).

16041: {'goal': 1200, 'launch': 1353544772, 'days-before-deadline': 3, 'followers': 149, 'date-funded': 1355887690.9189188, 'id': 52687, 'tweet_ids': [280965208409796608, ... n], 'state': 1, 'deadline': 1356136772, 'retweets': 0, 'favorites': 0, 'duration': 31, 'timestamps': [1355876412.0], 'favourites': 0, 'runtime': 27, 'friends': 127, 'pledges': [0.0, 0.0625, 0.0625, ... n], 'statuses': 7460} 

If I can create a pandas DataFrame from this dict, I'll be able to create the covariate matrix using patsy, for example like this:

x = patsy.dmatrix('friends + followers + retweets + favorites - 1', data, return_type='dataframe')

Now my question is: how do I create a pandas DataFrame from the experiment_data dicts? The keys of the inner dictionaries (goal, launch, followers, etc.) should be the columns, with one row for each Kickstarter project (i.e. index nr.: 0 to 16041).

Any help is appreciated. Thanks in advance!

P.S. If you have experience with survival regression in Python using lifelines, please let me know!

I think you want from_dict using the param orient='index':

In [31]: d = {16041: {'goal': 1200, 'launch': 1353544772, 'days-before-deadline': 3, 'followers': 149, 'date-funded': 1355887690.9189188, 'id': 52687, 'tweet_ids': [280965208409796608], 'state': 1, 'deadline': 1356136772, 'retweets': 0, 'favorites': 0, 'duration': 31, 'timestamps': [1355876412.0], 'favourites': 0, 'runtime': 27, 'friends': 127, 'pledges': [0.0, 0.0625, 0.0625], 'statuses': 7460}}
pd.DataFrame.from_dict(d, orient='index')

Out[31]:
          id  followers  days-before-deadline  statuses  duration  state  \
16041  52687        149                     3      7460        31      1

       goal             tweet_ids                pledges  favourites  \
16041  1200  [280965208409796608]  [0.0, 0.0625, 0.0625]           0

         deadline  favorites  retweets  runtime  friends      launch  \
16041  1356136772          0         0       27      127  1353544772

           timestamps   date-funded
16041  [1355876412.0]  1.355888e+09
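To see how this behaves with more than one project, here is a minimal sketch. The experiment_data below is a shortened, hypothetical stand-in for the real dict: the values for project 0 are invented for illustration, and only a handful of the keys are kept. Each outer key becomes a row label and each inner key becomes a column.

```python
import pandas as pd

# Hypothetical, shortened stand-in for experiment_data.
# Project 0 is invented for illustration; project 16041 reuses values from the question.
experiment_data = {
    0: {"goal": 5000, "followers": 320, "friends": 210,
        "retweets": 4, "favorites": 2, "pledges": [0.0, 0.1]},
    16041: {"goal": 1200, "followers": 149, "friends": 127,
            "retweets": 0, "favorites": 0, "pledges": [0.0, 0.0625, 0.0625]},
}

# orient="index" treats the outer keys as row labels and the inner keys as columns.
df = pd.DataFrame.from_dict(experiment_data, orient="index")

# Keep only the scalar columns intended as covariates; list-valued columns
# such as "pledges" stay in df but are left out of this selection.
covariate_columns = ["friends", "followers", "retweets", "favorites"]
covariates = df[covariate_columns]

print(covariates)
```

A frame like this can then be passed as the data argument to patsy.dmatrix. List-valued columns such as pledges, tweet_ids and timestamps are stored as Python objects in each cell, so they would probably need to be summarised into numbers before being used as covariates.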
