riak

Mapreduce with Riak

懵懂的女人 submitted on 2019-12-04 13:03:57
Does anyone have example code for mapreduce for Riak that can be run on a single Riak node.

cd ~/riak
erl -name zed@127.0.0.1 -setcookie riak -pa apps/riak/ebin

In the shell:

# connect to the server
> {ok, Client} = riak:client_connect('riak@127.0.0.1').
{ok,{riak_client,'riak@127.0.0.1',<<6,201,208,64>>}}

# create and insert objects
> Client:put(riak_object:new(<<"groceries">>, <<"mine">>, ["eggs", "bacons"]), 1).
ok
> Client:put(riak_object:new(<<"groceries">>, <<"yours">>, ["eggs", "sausages"]), 1).
ok

# create Map and Reduce functions
> Count = fun(G, 'undefined', 'none') -> [dict:from
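The excerpt cuts off in the middle of the map fun. Below is a minimal sketch of how the map and reduce funs could be completed and run from that same erl session; it assumes the legacy local-client API shown above (riak:client_connect/1, Client:mapred/2), and the Merge helper and exact phase options are assumptions, not the poster's code.

%% Map: turn each grocery-list object into a dict of Item -> 1.
Count = fun(G, _KeyData, _Arg) ->
            [dict:from_list([{Item, 1} || Item <- riak_object:get_value(G)])]
        end.

%% Reduce: merge the dicts by summing the counts per item.
Merge = fun(Dicts, _Arg) ->
            [lists:foldl(fun(D, Acc) ->
                             dict:merge(fun(_K, A, B) -> A + B end, D, Acc)
                         end, dict:new(), Dicts)]
        end.

%% Run the map/reduce job over the two keys inserted above.
{ok, [Result]} = Client:mapred([{<<"groceries">>, <<"mine">>},
                                {<<"groceries">>, <<"yours">>}],
                               [{map,    {qfun, Count}, none, false},
                                {reduce, {qfun, Merge}, none, true}]).
dict:to_list(Result).
%% e.g. [{"bacons",1},{"eggs",2},{"sausages",1}] (order not guaranteed)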

Riak node no longer working after changing IP address

风格不统一 submitted on 2019-12-04 10:01:44
I'm using an instanced Amazon EC2 virtual Ubuntu 12.04 server as my single Riak node. I've gone through all the proper stages of setting up Riak on the instance using the guide on the basho website here. Where x.x.x.x is the private IP address of the instance, this included:

Installation:
- Using sudo su - to gain root privileges (EC2 logs me in as 'Ubuntu').
- Installing the SSL Lib with: sudo apt-get install libssl0.9.8
- Downloading the 64-bit package for 12.04: wget http://downloads.basho.com.s3-website-us-east-1.amazonaws.com/riak/CURRENT/ubuntu/precise/riak_1.2.1-1_amd64.deb
- Then unpacking via

How to append data to a Riak key in a heavily distributed environment?

两盒软妹~` submitted on 2019-12-04 07:14:25
Using Riak, I want to append data sequentially in a way that lets me retrieve everything I have appended from time to time. Think of logs: if I pick up incremental log rows and transfer them to Riak, at some point I want to reconstitute everything I have appended. I thought of doing this by creating a new bucket for that purpose, then adding keys defined by a sequential number or datetime stamp with the content as the value, and finally using the list-keys API to reconstitute the data I need. The problem with that is that the list-keys API is neither efficient nor recommended for production. What I like about this approach is
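One way around list-keys is to make the keys predictable: write each chunk under a sequence-numbered key and keep track of the highest sequence number, so reading everything back is just a series of plain gets. A rough sketch in the same legacy local-client style as the MapReduce question above; the bucket name, key format, and the assumption of a single writer owning the counter are illustrative only.

%% Sketch: store chunk N under a zero-padded sequential key.
AppendChunk = fun(Client, Seq, Chunk) ->
    Key = list_to_binary(io_lib:format("chunk-~10..0B", [Seq])),
    Client:put(riak_object:new(<<"logs">>, Key, Chunk), 1)
end.

%% Read chunks 1..LastSeq back in order to reconstitute the log,
%% without ever listing the bucket's keys.
ReadAll = fun(Client, LastSeq) ->
    [begin
         Key = list_to_binary(io_lib:format("chunk-~10..0B", [Seq])),
         {ok, Obj} = Client:get(<<"logs">>, Key, 1),
         riak_object:get_value(Obj)
     end || Seq <- lists:seq(1, LastSeq)]
end.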

How to deactivate or delete a bucket type in Riak?

不打扰是莪最后的温柔 submitted on 2019-12-04 04:52:13
/home/khorkak> sudo riak-admin bucket-type
Usage: riak-admin bucket-type <command>

The follow commands can be used to manage bucket types for the cluster:

   list                     List all bucket types and their activation status
   status <type>            Display the status and properties of a type
   activate <type>          Activate a type
   create <type> <json>     Create or modify a type before activation
   update <type> <json>     Update a type after activation

/home/khorkak>

Well, I have a set of bucket types I created while trying some things out that I no longer want around - can I get rid of these without reinstalling Riak? Unfortunately

Riak link-walking like a join?

我只是一个虾纸丫 submitted on 2019-12-04 04:25:53
I am looking to store pictures in a NoSQL database (<5MB) and link them to articles in a different bucket. What kind of speed does Riak's link-walking feature offer? Is it like an RDBMS join at all? Links are not at all similar to JOINs (which involve a Cartesian product), but they can be used for similar purposes in some senses. They are very similar to links in an HTML document. With link-walking you either start with a single key, or you create a map-reduce job that starts with multiple keys. (Link-walking/traversal is actually a special case of map-reduce.) Those values are fetched, their
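Since link-walking is a special case of map/reduce, it can also be issued as a map/reduce query with a link phase. A hedged sketch using the same legacy local-client API as the first question on this page; the articles/pictures buckets and the "picture" link tag are invented for illustration, and the article objects are assumed to already carry those links in their metadata.

%% Sketch: start from one article, follow its "picture"-tagged links into
%% the pictures bucket, then fetch and return the linked objects' values.
{ok, Pictures} = Client:mapred(
    [{<<"articles">>, <<"article-1">>}],
    [{link, <<"pictures">>, <<"picture">>, false},
     {map, {modfun, riak_kv_mapreduce, map_object_value}, none, true}]).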

Quick Reference Guide to Various NoSQL Databases

生来就可爱ヽ(ⅴ<●) submitted on 2019-12-03 16:31:34
I'm looking for one place that summarizes the main properties of the NoSQL databases that I keep seeing referenced - in particular, MongoDB, Riak, Redis, Memcached, Membase, and Cassandra. Types of queries, ACID properties, architecture for/properties of scaling, etc. All in memory, overflow to disk, backup on disk, or mainly only indexes in memory? Probably one of the best sources that summarizes basic information (and points you to more detailed sources in the first place) about various NoSQL databases is this website. Other than that you should check out these: Cassandra vs MongoDB vs CouchDB vs Redis

Bitcask ok for simple and high performant file store?

£可爱£侵袭症+ submitted on 2019-12-03 14:57:39
I am looking for a simple way to store and retrieve millions of XML files. Currently everything is done in a filesystem, which has some performance issues. Our requirements are:

- Ability to store millions of XML files in a batch process.
- XML files may be up to a few MB in size, most in the 100 KB range.
- Very fast random lookup by id (e.g. document URL).
- Accessible by both Java and Perl.
- Available on the most important Linux distros and Windows.

I did have a look at several NoSQL platforms (e.g. CouchDB, Riak and others), and while those systems look great, they seem almost like being overkill: No
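For what it's worth, Bitcask can also be used on its own as an embedded Erlang key/value library, without running a full Riak cluster. A minimal sketch of its basic API, assuming the bitcask application is on the code path; the directory and key scheme are placeholders. One Bitcask caveat relevant here: every key is held in memory (the keydir), so millions of small keys are fine as long as the key count fits in RAM.

%% Sketch: open a Bitcask directory, store an XML document under its URL, read it back.
XmlBinary = <<"<doc>...</doc>">>,
Ref = bitcask:open("/var/data/xmlstore", [read_write]),
ok = bitcask:put(Ref, <<"http://example.com/docs/a.xml">>, XmlBinary),
{ok, XmlBinary} = bitcask:get(Ref, <<"http://example.com/docs/a.xml">>),
ok = bitcask:close(Ref).

Bitcask itself is Erlang-only, though, so the Java and Perl requirement would still need some server (for example Riak's HTTP or protocol-buffers interface) in front of it.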

Downsides of storing binary data in Riak?

非 Y 不嫁゛ submitted on 2019-12-03 10:14:22
What are the problems, if any, of storing binary data in Riak? Does it affect the maintainability and performance of the clustering? What would the performance differences be between using Riak for this rather than a distributed file system?

Elad: Adding to @Oscar-Godson's excellent answer, you're likely to experience problems with values much larger than 50 MB. Bitcask is best suited for values that are up to a few KB. If you're storing large values, you may want to consider alternative storage backends, such as innostore. I don't have experience with storing binary values, but we've a
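As a concrete reference point for the write path, a binary value is stored like any other value; the main practical detail is setting the content type so the data comes back with a sensible MIME type over HTTP. A hedged sketch using the official Erlang protocol-buffers client (riak-erlang-client); the host, port, bucket, key, and file name are placeholders.

%% Sketch: store a JPEG as a binary Riak value with an explicit content type.
{ok, Pid} = riakc_pb_socket:start_link("127.0.0.1", 8087),
{ok, Jpeg} = file:read_file("photo.jpg"),
Obj = riakc_obj:new(<<"images">>, <<"photo-1">>, Jpeg, "image/jpeg"),
ok = riakc_pb_socket:put(Pid, Obj),
{ok, Fetched} = riakc_pb_socket:get(Pid, <<"images">>, <<"photo-1">>),
Jpeg = riakc_obj:get_value(Fetched).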

Which, if any, of the NoSQL databases can provide stream of *changes* to a query result set?

天涯浪子 submitted on 2019-12-03 06:02:36
Question: Which, if any, of the NoSQL databases can provide a stream of changes to a query result set? Could anyone point me at some examples? Firstly, I believe that none of the SQL databases provide this functionality - am I correct? I need to be able to specify arbitrary, simple queries, whose equivalent in SQL might be written:

SELECT * FROM accounts WHERE balance < 0 AND balance > -1000;

I want an initial result set:

id: 100, name: Fred, balance: -10
id: 103, name: Mary, balance: -200

but then I

What NoSQL DB to use for sparse Time Series like data?

。_饼干妹妹 submitted on 2019-12-03 04:32:36
Question: I'm planning a side project where I will be dealing with time-series-like data and would like to give one of those shiny new NoSQL DBs a try, and I am looking for a recommendation. For a (growing) set of symbols I will have a list of (time, value) tuples (increasing over time). Not all symbols will be updated; some symbols may be updated while others may not, and completely new symbols may be added. The database should therefore allow:

- Add symbols with an initial one-element (tuple) list. E.g. A
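If Riak were the choice here, one straightforward model is one object per symbol whose value is that symbol's list of {Time, Value} tuples, appended with a get-modify-put. A rough sketch in the same legacy local-client style as the earlier examples on this page; the bucket name is invented, and concurrent writers to the same symbol would still need sibling/conflict handling.

%% Sketch: append a sample to a symbol's series, creating the object on first write.
AddSample = fun(Client, Symbol, Time, Value) ->
    case Client:get(<<"series">>, Symbol, 1) of
        {ok, Obj} ->
            Samples = riak_object:get_value(Obj),
            Client:put(riak_object:update_value(Obj, [{Time, Value} | Samples]), 1);
        {error, notfound} ->
            Client:put(riak_object:new(<<"series">>, Symbol, [{Time, Value}]), 1)
    end
end.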