I need to execute MapReduce on my Cassandra cluster, including data locality, ie. each job queries only rows which belong to local Casandra Node where the job runs.
yes I was looking for the same thing, seems DataStaxEnterprise has a simplified Hadoop integration,
read this http://wiki.apache.org/cassandra/HadoopSupport