Segmentation fault 11, python hash with lists, hashing 1 million objects

Submitted by 北城以北 on 2019-12-12 03:05:08

Question


When I try to build and hash objects from a file containing one million songs, I get a weird segmentation fault after about 12,000 successful hashes.

Anyone have any idea why this:

Segmentation fault: 11

happens when I run the program?

I have these classes for hashing the objects:

class Node():
    def __init__(self, key, value = None):
        self.key = key
        self.value = value

    def __str__(self):
        return str(self.key) + " : " + str(self.value)

class Hashtable():
    def __init__(self, hashsize, hashlist = [None]):
        self.hashsize = hashsize*2
        self.hashlist = hashlist*(self.hashsize)

    def __str__(self):
        return self.hashlist

    def hash_num(self, name):
        result = 0
        name_list = list(name)
        for letter in name_list:
            result = (result*self.hashsize + ord(letter))%self.hashsize
        return result

    def check(self, num):
        if self.hashlist[num] != None:
            num = (num + 11**2)%self.hashsize#Check here very carefully!
            chk_num = self.check(num)#here too
            return chk_num#learn this
        else:
            return num

    def check_atom(self, num, name):
        if self.hashlist[num].key == name:
            return num
        else:
            num = (num + 11**2)%self.hashsize
            chk_num = self.check_atom(num, name)#read here
            return chk_num#read this

    def put(self, name, new_atom):
        node = Node(name)
        node.value = new_atom
        num = self.hash_num(name)
        chk_num = self.check(num)
        print(chk_num)
        self.hashlist[chk_num] = node

    def get(self, name):
        num = self.hash_num(name)
        chk_num = self.check_atom(num, name)
        atom = self.hashlist[chk_num]
        return atom.value

And I call upon the function in this code:

from time import *
from hashlist import *
import sys

sys.setrecursionlimit(1000000000)

def lasfil(filnamn, h):
    with open(filnamn, "r", encoding="utf-8") as fil:
        for rad in fil:
            data = rad.split("<SEP>")
            artist = data[2].strip()
            song = data[3].strip()
            h.put(artist, song)

def hitta(artist, h):
    try:
        start = time()
        print(h.get(artist))
        stop = time()
        tidhash = stop - start
        return tidhash
    except AttributeError:
        pass

h = Hashtable(1000000)
lasfil("write.txt", h)

Answer 1:


The reason you're getting a segmentation fault is this line:

sys.setrecursionlimit(1000000000)

I assume you added it because you received a RuntimeError: maximum recursion depth exceeded. Raising the recursion limit doesn't allocate any more memory for the call stack, it just defers the aforementioned exception. If you set it too high, the interpreter runs out of stack space and accesses memory that doesn't belong to it, causing random errors (likely segfaults, but in theory anything is possible).
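To illustrate the difference: at the default recursion limit (typically 1000 in CPython), exceeding the stack raises a catchable Python-level RecursionError rather than crashing the process. A minimal sketch (the function and names are made up for illustration):

```python
def descend(n):
    # Recurse n levels deep, doing no real work.
    return 0 if n == 0 else descend(n - 1)

def try_depth(depth):
    # With the default limit, deep recursion fails cleanly with
    # RecursionError instead of a segmentation fault.
    try:
        descend(depth)
        return "completed"
    except RecursionError:
        return "RecursionError"
```

Raising the limit with sys.setrecursionlimit merely moves the point at which this safety check fires; set it beyond what the real C stack can hold and the check never triggers before the crash.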

The real solution is to not use unbounded recursion. For things like balanced search trees, where the recursion depth is limited to a few dozen levels, it's okay, but you can't replace long loops with recursion.
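For instance, the recursive check and check_atom in the question can be rewritten as loops with the same (num + 11**2) % hashsize probe step; the probe sequence can then be arbitrarily long without growing the call stack. A minimal self-contained sketch of the question's table with iterative probing:

```python
class Node:
    def __init__(self, key, value=None):
        self.key = key
        self.value = value

class Hashtable:
    def __init__(self, hashsize):
        self.hashsize = hashsize * 2
        self.hashlist = [None] * self.hashsize

    def hash_num(self, name):
        result = 0
        for letter in name:
            result = (result * self.hashsize + ord(letter)) % self.hashsize
        return result

    def check(self, num):
        # Probe iteratively until an empty slot is found -- same step
        # as the original, but a loop instead of recursion.
        while self.hashlist[num] is not None:
            num = (num + 11**2) % self.hashsize
        return num

    def check_atom(self, num, name):
        # Probe iteratively until the slot with the matching key is found.
        while self.hashlist[num].key != name:
            num = (num + 11**2) % self.hashsize
        return num

    def put(self, name, new_atom):
        self.hashlist[self.check(self.hash_num(name))] = Node(name, new_atom)

    def get(self, name):
        return self.hashlist[self.check_atom(self.hash_num(name), name)].value
```

This keeps the original probing behavior (including its weaknesses); it only removes the recursion, so sys.setrecursionlimit is no longer needed at all.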

Also, unless this is an exercise in creating hash tables, you should just use the built-in dict. If it is an exercise, consider the need for such deep recursion a hint that something about your hash table is badly off: it implies probe sequences at least 1000 slots long, more likely several thousand. Probes should be a few dozen at most, ideally in the single digits.
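For comparison, the file-loading step with a plain dict reduces to a few lines (a sketch; the field positions data[2] for artist and data[3] for song follow the question's code):

```python
def lasfil(filnamn, h):
    # Same parsing as the question, but storing into a built-in dict,
    # which handles hashing and collisions internally.
    with open(filnamn, "r", encoding="utf-8") as fil:
        for rad in fil:
            data = rad.split("<SEP>")
            h[data[2].strip()] = data[3].strip()
```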



Source: https://stackoverflow.com/questions/19320903/segmentation-fault-11-python-hash-with-lists-hashing-1-million-objects
