similarity - What is the best way to process multiple files in Storm?


I am new to Apache Storm and want to use Storm to compute the similarity of files. I want the cosine similarity of a file in folder "a" and a file in folder "b". Can anyone show me a way to get this result? Thanks very much.

I did not understand what you meant by 'cosine of files', but in general, you can think of each folder as a 'stream'. You can have spoutA read, understand, format and emit the files in folderA, and spoutB do the same for folderB, which gives you 2 tuple streams (I am assuming there are differences between the 2 folders in encoding, formatting, etc.). The processing bolt can then 'subscribe' to both streams, e.g.,

    bolt.fieldsGrouping(spoutA, streamName, new Fields("field_in_stream"));
    bolt.fieldsGrouping(spoutB, streamName, new Fields("field_in_stream"));
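Once a bolt has received the contents of one file from each stream, the similarity itself is ordinary Java. The following is a minimal sketch, assuming 'cosine of files' means the cosine similarity of whitespace-tokenized term-count vectors; the class `CosineSimilarity` and its methods are hypothetical helpers, not part of Storm.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not a Storm class): cosine similarity of two texts
// based on raw term counts.
class CosineSimilarity {

    // Count how often each lower-cased, whitespace-separated token occurs.
    static Map<String, Integer> termCounts(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : text.toLowerCase().trim().split("\\s+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Euclidean length of a term-count vector.
    static double norm(Map<String, Integer> vector) {
        double sumOfSquares = 0.0;
        for (int count : vector.values()) {
            sumOfSquares += (double) count * count;
        }
        return Math.sqrt(sumOfSquares);
    }

    // cos(a, b) = (a . b) / (|a| * |b|); returns 0.0 if either text has no tokens.
    static double cosine(String textA, String textB) {
        Map<String, Integer> a = termCounts(textA);
        Map<String, Integer> b = termCounts(textB);
        double dot = 0.0;
        for (Map.Entry<String, Integer> entry : a.entrySet()) {
            Integer other = b.get(entry.getKey());
            if (other != null) {
                dot += (double) entry.getValue() * other;
            }
        }
        double normA = norm(a);
        double normB = norm(b);
        return (normA == 0.0 || normB == 0.0) ? 0.0 : dot / (normA * normB);
    }
}
```

A bolt's `execute` method could call `CosineSimilarity.cosine(contentA, contentB)` once it holds one tuple from each stream. How the two tuples are paired up depends on which field you group on, which the question does not specify.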

If, on the other hand, you meant 2 different instances of the same spout reading different folders:

  • Not a great idea, because the number of spout executors is tied to the number of folders you have. It is not scalable.
  • The load distribution will be pretty bad.
  • If you still want it, you can use the task index of the spout to give different spout executors different behavior (different behavior meaning reading different folders).

Like this, maybe:

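    public class MySpout extends BaseRichSpout {
        @Override
        public void open(Map conf, TopologyContext context,
                         SpoutOutputCollector collector) {
            // The task index runs from 0 to (number of tasks - 1) for this spout;
            // getThisTaskId() would return the topology-wide task id instead.
            System.out.println("spout index = " + context.getThisTaskIndex());
        }
        // nextTuple() and declareOutputFields() omitted for brevity
    }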
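To turn the index into 'different behavior', the spout could pick its folder from a list. This is a minimal sketch, assuming the index comes from `TopologyContext.getThisTaskIndex()` inside `open`; the class `FolderAssignment` and the folder names are hypothetical and only illustrate the mapping.

```java
import java.util.List;

// Hypothetical helper (not a Storm class): choose which folder a spout task reads.
class FolderAssignment {

    // taskIndex is the 0-based index of the task within its spout component.
    // If there are more tasks than folders, several tasks share a folder.
    static String folderFor(int taskIndex, List<String> folders) {
        if (folders.isEmpty()) {
            throw new IllegalArgumentException("at least one folder is required");
        }
        return folders.get(taskIndex % folders.size());
    }
}
```

With the folders `"/data/a"` and `"/data/b"` and a spout parallelism of 2, task 0 would read `"/data/a"` and task 1 would read `"/data/b"`. The drawbacks listed above still apply: adding a folder means changing the spout's parallelism.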
