Cassandra update fails

Posted by 流过昼夜 on 2019-12-08 02:22:16

Question


Solved: I was testing updates on 3 nodes, and the clock on one of those nodes was 1 second behind. When a row was updated, the write time was therefore always behind the row's existing timestamp, and Cassandra would not update the row. I synchronized the time on all nodes, and the issue was fixed.
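To make the cause concrete, here is a minimal toy model of last-write-wins reconciliation, the rule by which Cassandra keeps, for each cell, the value with the highest write timestamp. It is not driver or server code: Cell, reconcile, apply_writes, and the constants are hypothetical names chosen for illustration, and tie-breaking on equal timestamps is not modeled. It assumes the insert was timestamped by a node with a correct clock and the update by a node whose clock was 1 second behind.

```python
# Toy model of last-write-wins for a single cell (illustration only).

class Cell(object):
    """A column value tagged with its write timestamp in microseconds."""
    def __init__(self, value, timestamp_us):
        self.value = value
        self.timestamp_us = timestamp_us


def reconcile(stored, incoming):
    """Keep whichever cell carries the larger write timestamp."""
    if stored is None or incoming.timestamp_us > stored.timestamp_us:
        return incoming
    return stored


def apply_writes(writes):
    """Apply writes in arrival order and return the surviving cell."""
    stored = None
    for cell in writes:
        stored = reconcile(stored, cell)
    return stored


REAL_TIME_US = 1449541336000000  # arbitrary wall-clock time of the insert
SKEW_US = 1000000                # one node's clock is 1 second behind
UPDATE_DELAY_US = 200000         # the update is issued 0.2 seconds later

insert = Cell("default", REAL_TIME_US)
# Timestamped by the lagging node: later in real time, earlier by its clock.
skewed_update = Cell("column x value", REAL_TIME_US + UPDATE_DELAY_US - SKEW_US)
# The same update once all clocks are synchronized.
synced_update = Cell("column x value", REAL_TIME_US + UPDATE_DELAY_US)

print(apply_writes([insert, skewed_update]).value)  # default
print(apply_writes([insert, synced_update]).value)  # column x value
```

In this model the skewed update arrives after the insert but loses the comparison, which matches the symptom of updates that are silently ignored without any error.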

Edit: I double-checked the result: all insertions succeeded, but some of the updates failed. There were no error or exception messages.

I have a Cassandra cluster (Cassandra 2.0.13) with 5 nodes. I am using the Python (2.6.6) Cassandra driver (2.6.0c2) to insert data into the database. My servers run CentOS 6.x.

The following code is how I connect to Cassandra and get a session. I provide at most 2 node IP addresses and select the keyspace.

import cassandra.cluster


def get_cassandra_session():
    """Creates the cluster and returns a session bound to the keyspace."""
    # be aware that the session cannot be shared between threads/processes
    # or it will raise an OperationTimedOut exception
    if CLUSTER_HOST2:
        cluster = cassandra.cluster.Cluster([CLUSTER_HOST1, CLUSTER_HOST2])
    else:
        # if only one address is available, we have to use an older protocol version
        cluster = cassandra.cluster.Cluster([CLUSTER_HOST1], protocol_version=1)

    session = cluster.connect(KEY_SPACE)
    return session

Each row has 17 columns. If the key does not exist in the database, I insert the key with default values for the remaining columns, and then update the value of one specific column.

def insert_initial_row(session, key):
    session.execute(INITIAL_INSERTION_STATEMENT, tuple(INITIAL_COLUMNS_VALUES))


def update_columnX(session, key, column):
    session.execute("INSERT INTO " + TABLE + "(" + KEY + "," + COLUMN_X + ") VALUES(%s, %s)", (key, column))

def has_found(session, key):
    """Checks whether the key is in the database."""
    # pass the key as a bound parameter instead of concatenating it into the CQL
    query = "SELECT * FROM " + KEY_SPACE + "." + TABLE \
            + " WHERE " + KEY + " = %s"
    # returns a list
    rows = session.execute(query, (key,))
    return bool(rows)

The following is how I invoke them:

for a_key in keys_set:
    # keys_set contains 100 unique keys
    if has_found(session, a_key):
        update_columnX(session, a_key, "column x value")
    else:
        # the key is not in the db: initialize it with all default values, then update column x
        insert_initial_row(session, a_key)
        if has_found(session, a_key):
            update_columnX(session, a_key, "column x value")
        else:
            logger.error("not initialized correctly...")

I was trying to insert 100 rows and update columnX in each of them, but only some of those 100 rows were updated; in the rest, columnX still has its default value. insert_initial_row was invoked and initialized the default values for all 100 rows, but update_columnX did not take effect for all of them. Even when I changed the consistency level to QUORUM, it did not help at all. "not initialized correctly..." was never printed. I also added a print line in update_columnX, and that line was printed 100 times, so the function was invoked 100 times, but not all of the updates were applied.

Any idea? Please help.

Thanks


Answer 1:


If your session.execute writes were not successful (they did not meet the required consistency level), then the driver will raise one of the following exceptions:

  1. Unavailable - There were not enough live replicas to satisfy the requested consistency level, so the coordinator node immediately failed the request without forwarding it to any replicas.
  2. Timeout - Replicas did not respond to the coordinator before the Cassandra timeout expired.
  3. Write timeout - Replicas did not respond to the coordinator before the write timeout, which is configured in cassandra.yaml. There is a similar timeout for reads; read and write timeouts are configured separately in the YAML file.
  4. Operation timeout - The operation took longer than the specified client-side timeout, which you configure in your application code.

You can try tracing your queries and find out what exactly happened for each write. This will show you the coordinators and the replica nodes involved in the operation and how much time the request spent in each.
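As a rough sketch of what such tracing could look like from Python: the helper below, summarize_trace, is a hypothetical name and not part of the driver. It assumes a 3.x DataStax Python driver, where session.execute(..., trace=True) returns a result object exposing get_query_trace(). With the 2.x driver used in the question, the trace is, as far as the driver documentation describes, attached to the statement's trace attribute instead, so the code would need adapting.

```python
def summarize_trace(session, query, parameters=None):
    """Run one statement with tracing enabled and return readable lines.

    Assumption: `session` behaves like a 3.x cassandra.cluster.Session,
    so the result of execute() offers get_query_trace().
    """
    result = session.execute(query, parameters, trace=True)
    trace = result.get_query_trace()

    # First line: which node coordinated the request and how long it took.
    lines = ["coordinator=%s duration=%s" % (trace.coordinator, trace.duration)]

    # One line per trace event: what happened, on which node, and how many
    # microseconds had elapsed on that node when it happened.
    for event in trace.events:
        lines.append("%s on %s after %s us"
                     % (event.description, event.source, event.source_elapsed))
    return lines
```

Calling this for one of the updates that did not stick, for example summarize_trace(session, "INSERT INTO my_table (k, x) VALUES (%s, %s)", (key, value)) with placeholder table and column names, shows which node acted as coordinator for that write. That is a useful first hint when one node's clock is suspected.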



Source: https://stackoverflow.com/questions/31256238/cassandra-update-fails
