Segmentation fault 11, python hash with lists, hashing 1 million objects

Submitted by 北城以北 on 2019-12-12 03:05:08

Question


When I try to build and hash objects from a file containing one million songs, I get a weird segmentation fault after about 12,000 successful hashes.

Anyone have any idea why this:

Segmentation fault: 11

happens when I run the program?

I have these classes for hashing the objects:

class Node():
    def __init__(self, key, value = None):
        self.key = key
        self.value = value

    def __str__(self):
        return str(self.key) + " : " + str(self.value)

class Hashtable():
    def __init__(self, hashsize, hashlist = [None]):
        self.hashsize = hashsize*2
        self.hashlist = hashlist*(self.hashsize)

    def __str__(self):
        return self.hashlist

    def hash_num(self, name):
        result = 0
        name_list = list(name)
        for letter in name_list:
            result = (result*self.hashsize + ord(letter))%self.hashsize
        return result

    def check(self, num):
        if self.hashlist[num] != None:
            num = (num + 11**2)%self.hashsize#Check here very carefully!
            chk_num = self.check(num)#here too
            return chk_num#learn this
        else:
            return num

    def check_atom(self, num, name):
        if self.hashlist[num].key == name:
            return num
        else:
            num = (num + 11**2)%self.hashsize
            chk_num = self.check_atom(num, name)#read here
            return chk_num#read this

    def put(self, name, new_atom):
        node = Node(name)
        node.value = new_atom
        num = self.hash_num(name)
        chk_num = self.check(num)
        print(chk_num)
        self.hashlist[chk_num] = node

    def get(self, name):
        num = self.hash_num(name)
        chk_num = self.check_atom(num, name)
        atom = self.hashlist[chk_num]
        return atom.value

And I call upon the function in this code:

from time import *
from hashlist import *
import sys

sys.setrecursionlimit(1000000000)

def lasfil(filnamn, h):
    with open(filnamn, "r", encoding="utf-8") as fil:
        for rad in fil:
            data = rad.split("<SEP>")
            artist = data[2].strip()
            song = data[3].strip()
            h.put(artist, song)

def hitta(artist, h):
    try:
        start = time()
        print(h.get(artist))
        stop = time()
        tidhash = stop - start
        return tidhash
    except AttributeError:
        pass

h = Hashtable(1000000)
lasfil("write.txt", h)

Answer 1:


The reason you're getting a segmentation fault is this line:

sys.setrecursionlimit(1000000000)

I assume you added it because you received a RuntimeError: maximum recursion depth exceeded. Raising the recursion limit doesn't allocate any more memory for the call stack, it just defers the aforementioned exception. If you set it too high, the interpreter runs out of stack space and accesses memory that doesn't belong to it, causing random errors (likely segfaults, but in theory anything is possible).
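To illustrate the difference: at the default recursion limit (typically 1000 in CPython), exceeding the stack raises a catchable Python-level RecursionError rather than crashing the process. A minimal sketch (the function and names are made up for illustration):

```python
def descend(n):
    # Recurse n levels deep, doing no real work.
    return 0 if n == 0 else descend(n - 1)

def try_depth(depth):
    # With the default limit, deep recursion fails cleanly with
    # RecursionError instead of a segmentation fault.
    try:
        descend(depth)
        return "completed"
    except RecursionError:
        return "RecursionError"
```

Raising the limit with sys.setrecursionlimit merely moves the point at which this safety check fires; set it beyond what the real C stack can hold and the check never triggers before the crash.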

The real solution is to not use unbounded recursion. For things like balanced search trees, where the recursion depth is limited to a few dozen levels, it's okay, but you can't replace long loops with recursion.
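For instance, the recursive check and check_atom in the question can be rewritten as loops with the same (num + 11**2) % hashsize probe step; the probe sequence can then be arbitrarily long without growing the call stack. A minimal self-contained sketch of the question's table with iterative probing:

```python
class Node:
    def __init__(self, key, value=None):
        self.key = key
        self.value = value

class Hashtable:
    def __init__(self, hashsize):
        self.hashsize = hashsize * 2
        self.hashlist = [None] * self.hashsize

    def hash_num(self, name):
        result = 0
        for letter in name:
            result = (result * self.hashsize + ord(letter)) % self.hashsize
        return result

    def check(self, num):
        # Probe iteratively until an empty slot is found -- same step
        # as the original, but a loop instead of recursion.
        while self.hashlist[num] is not None:
            num = (num + 11**2) % self.hashsize
        return num

    def check_atom(self, num, name):
        # Probe iteratively until the slot with the matching key is found.
        while self.hashlist[num].key != name:
            num = (num + 11**2) % self.hashsize
        return num

    def put(self, name, new_atom):
        self.hashlist[self.check(self.hash_num(name))] = Node(name, new_atom)

    def get(self, name):
        return self.hashlist[self.check_atom(self.hash_num(name), name)].value
```

This keeps the original probing behavior (including its weaknesses); it only removes the recursion, so sys.setrecursionlimit is no longer needed at all.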

Also, unless this is an exercise in creating hash tables, you should just use the built-in dict. If it is an exercise, consider the need for such deep recursion a hint that something about your hash table is badly off: it implies probe sequences at least 1000 slots long, more likely several thousand. Probes should be a few dozen at most, ideally in the single digits.
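For comparison, the file-loading step with a plain dict reduces to a few lines (a sketch; the field positions data[2] for artist and data[3] for song follow the question's code):

```python
def lasfil(filnamn, h):
    # Same parsing as the question, but storing into a built-in dict,
    # which handles hashing and collisions internally.
    with open(filnamn, "r", encoding="utf-8") as fil:
        for rad in fil:
            data = rad.split("<SEP>")
            h[data[2].strip()] = data[3].strip()
```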



Source: https://stackoverflow.com/questions/19320903/segmentation-fault-11-python-hash-with-lists-hashing-1-million-objects
