My Python for loop is causing a MemoryError. How can I optimize this?

Posted by 怎甘沉沦 on 2019-12-01 19:34:24

Question


I'm trying to compile a list of all the MAC addresses that Apple devices can have. oui.txt tells me Apple has been assigned 77 MAC ranges. These ranges come in the form of:

00:00:00
00:11:11
etc...

This leaves me the last three bytes (six hex digits) to append. That's 16^6 = 16,777,216 addresses per range, for a total of 77 × 16,777,216 = 1,291,845,632 Apple MAC addresses.

The problem I'm having is writing a program to create a list of these MAC addresses. Here's my current code:

import re

apple_mac_range = []
apple_macs      = []

# Parse the HTML of http://standards.ieee.org/cgi-bin/ouisearch to get the MACs
with open('apple mac list', 'r') as f:
    for line in f.readlines():

        match = re.search(r'[\w\d]{2}-[\w\d]{2}-[\w\d]{2}', line)

        if match:
            apple_mac_range.append(match.group().split('-'))

for mac in apple_mac_range:
    for i in range(1, 1291845633):
        print i

This gives me a MemoryError... How can I optimize it?


Answer 1:


In Python 2, range(1, 1291845633) builds a list of 1,291,845,632 elements (several GB) all at once. Use xrange(1, 1291845633) instead; it generates each element as you need it rather than allocating them all up front.

Regardless, it looks like you want something more like this:

for mac in apple_mac_range: 
    for i in xrange(16777216): 
        print mac, i 
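The loop above prints the prefix parts and a bare counter. A minimal sketch of turning each counter value into a full address, assuming the prefixes are stored as lists of three hex strings (as the question's split('-') produces). The name iter_macs and the sample prefix are illustrative, and the code is written for Python 3, where range is already lazy; under Python 2 substitute xrange.

```python
from itertools import islice

def iter_macs(mac_ranges):
    """Yield every address under each 3-byte prefix, one at a time."""
    for parts in mac_ranges:
        for i in range(16 ** 6):  # lazy in Python 3; use xrange in Python 2
            suffix = '%06X' % i   # six hex digits, zero-padded
            yield ':'.join(list(parts) + [suffix[0:2], suffix[2:4], suffix[4:6]])

# Placeholder prefix in the question's example format, not a real assignment.
first_three = list(islice(iter_macs([['00', '11', '11']]), 3))
print(first_three)
```

Because iter_macs is a generator, only one address exists in memory at a time, however many prefixes are passed in.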

Of course it's quite likely that a list of 1.3e+9 MAC addresses will not be very useful. If you want to see if a given MAC address is an Apple device, you should just check to see if the 3-byte prefix is in the list of 77. If you're trying to do access control by giving a router or something a list of all possible MAC addresses, it's unlikely that the device will accept 1.3e+9 items in its list.
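The prefix check described above could look roughly like this. It is a sketch under assumptions: the helper names normalize_prefix and is_apple_mac are made up for illustration, and the two prefixes are placeholders taken from the question's example format, not real Apple assignments.

```python
def normalize_prefix(mac):
    """Return the first three bytes of a MAC as six uppercase hex digits."""
    hex_digits = mac.replace(':', '').replace('-', '').upper()
    return hex_digits[:6]

# Placeholder prefixes; in practice, load the 77 real ones parsed from oui.txt.
apple_prefixes = {normalize_prefix(p) for p in ('00:00:00', '00:11:11')}

def is_apple_mac(mac):
    """True if the address starts with one of the known prefixes."""
    return normalize_prefix(mac) in apple_prefixes

print(is_apple_mac('00-11-11-ab-cd-ef'))
print(is_apple_mac('12:34:56:00:00:00'))
```

A set lookup over 77 prefixes takes constant time and a few kilobytes, versus more than a billion entries for the expanded list.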




Answer 2:


Others have answered your actual question, but I'm not sure that's what's warranted here. Why not create a class that implements __contains__ to test a MAC address algorithmically? I presume you're getting a MAC and want to test whether it could be an iPhone MAC, so you could implement that class and then do something like:

if found_mac in mac_tester:  # mac_tester: an instance of the MACTester class
  ...do work...

Alternatively, if you really do want an iterable sequence, you should at least use a generator instead of trying to fit them all in memory.
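One way such a class might look, as a sketch rather than a definitive implementation. Only the name MACTester and the use of __contains__ come from the answer; the constructor argument, the _clean helper, and the placeholder prefixes are assumptions. It also implements __iter__ as a generator, covering the "iterable sequence" alternative. Written for Python 3; under Python 2 replace range with xrange.

```python
class MACTester(object):
    """Membership test and lazy iteration over all MACs under given prefixes."""

    def __init__(self, prefixes):
        # Store each 3-byte prefix as six uppercase hex digits.
        self._prefixes = frozenset(self._clean(p)[:6] for p in prefixes)

    @staticmethod
    def _clean(mac):
        return mac.replace(':', '').replace('-', '').upper()

    def __contains__(self, mac):
        cleaned = self._clean(mac)
        return len(cleaned) == 12 and cleaned[:6] in self._prefixes

    def __iter__(self):
        # Generator: yields one 12-digit address at a time, never a full list.
        for prefix in sorted(self._prefixes):
            for i in range(16 ** 6):
                yield prefix + '%06X' % i

# Placeholder prefixes, not real Apple assignments.
tester = MACTester(['00:00:00', '00:11:11'])
print('00:11:11:AB:CD:EF' in tester)
print(next(iter(tester)))
```

The instance holds only the prefix set, so the membership test costs the same whether it covers 77 prefixes or 77,000.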




Answer 3:


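Don't use readlines(); it reads the whole file into a list in memory. Iterate over the file object instead, which yields one line at a time:

with open('apple mac list') as f:
    for x in f:
        print x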
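Applied to the question's parsing step, line-by-line iteration might be sketched as follows. The function name parse_prefixes and the sample lines are made up for illustration, and the pattern is tightened to hex digits only, which is a swap from the question's [\w\d] character class.

```python
import re

# Three hex byte pairs separated by hyphens, e.g. 00-11-11.
OUI_PATTERN = re.compile(r'[0-9A-Fa-f]{2}-[0-9A-Fa-f]{2}-[0-9A-Fa-f]{2}')

def parse_prefixes(lines):
    """Yield each prefix as a list of three hex strings, one line at a time."""
    for line in lines:
        match = OUI_PATTERN.search(line)
        if match:
            yield match.group().split('-')

# Made-up sample input; any iterable of lines works, including an open file.
sample_lines = ['00-11-11   (hex)\t\tExample Vendor\n', 'no prefix here\n']
prefixes = list(parse_prefixes(sample_lines))
print(prefixes)
```

Passing the open file object itself (for example, parse_prefixes(f) inside the with block) keeps only one line in memory at a time.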



Answer 4:


How about:

i = 1  # start at 1 to match range(1, 1291845633)
while i < 1291845633:
  print i
  i += 1



Answer 5:


Well, to start with, range(1, 1291845633) creates a list containing about 1.3 billion entries. Since each entry takes at least sizeof(PyObject), it's not too surprising that you run out of memory. Don't do that.
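A rough back-of-the-envelope estimate of that cost, as a sketch. It assumes a CPython-style layout where the list stores one pointer per element and each integer is a separate object; the exact sizes depend on the interpreter version and on whether the build is 32-bit or 64-bit, so no figures are quoted here.

```python
import struct
import sys

count = 1291845632                   # number of elements range() would create
pointer_size = struct.calcsize('P')  # bytes per pointer on this build
int_size = sys.getsizeof(10 ** 6)    # bytes for one ordinary int object

list_bytes = count * pointer_size                # the list's pointer array alone
total_bytes = count * (pointer_size + int_size)  # pointers plus the int objects

print('pointers alone: %.1f GiB' % (list_bytes / 2.0 ** 30))
print('pointers + int objects: %.1f GiB' % (total_bytes / 2.0 ** 30))
```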



Source: https://stackoverflow.com/questions/4405083/my-python-for-loop-is-causing-a-memoryerror-how-can-i-optimize-this
