问题
From this question, I thought I could get around the 2 GB collection size limit by creating a BigList datatype using the following pattern (and by the way, this limit seems to be imposed by default on x86 applications, if you are curious about trying it out):
using Microsoft.Win32;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading;
using System.Threading.Tasks;
namespace RegistryHawk
{
class Program
{
struct RegistryPath
{
public RegistryView View;
public string Path;
public bool IsKey;
public RegistryValueKind ValueKind;
public string ValueName;
public object Value;
public int HashValue;
}
public class BigList<T>
{
object listLock = new object();
List<List<T>> Items = new List<List<T>>();
int PageSize = 1000000; // Tweak this to be the maximum size you can grow each individual list before reaching the 2 GB size limit of .NET.
public ulong Count = 0;
int listCount = 0;
public BigList()
{
Items.Add(new List<T>());
}
public void Add(T item)
{
lock (listLock)
{
if (Items[listCount].Count == PageSize)
{
Items.Add(new List<T>());
listCount++;
}
Items[listCount].Add(item);
Count++;
}
}
}
static void Main(string[] args)
{
BigList<RegistryPath> snapshotOne = new BigList<RegistryPath>();
WalkTheRegistryAndPopulateTheSnapshot(snapshotOne);
BigList<RegistryPath> snapshotTwo = new BigList<RegistryPath>();
WalkTheRegistryAndPopulateTheSnapshot(snapshotTwo);
}
private static void WalkTheRegistryAndPopulateTheSnapshot(BigList<RegistryPath> snapshot)
{
List<ManualResetEvent> handles = new List<ManualResetEvent>();
foreach (RegistryHive hive in Enum.GetValues(typeof(RegistryHive)))
{
foreach (RegistryView view in Enum.GetValues(typeof(RegistryView)).Cast<RegistryView>().ToList().Where(x => x != RegistryView.Default))
{
ManualResetEvent manualResetEvent = new ManualResetEvent(false);
handles.Add(manualResetEvent);
new Thread(() =>
{
WalkKey(snapshot, view, RegistryKey.OpenBaseKey(hive, view));
manualResetEvent.Set();
}).Start();
}
}
ManualResetEvent.WaitAll(handles.ToArray());
}
private static void WalkKey(BigList<RegistryPath> snapshot, RegistryView view, RegistryKey key)
{
RegistryPath path = new RegistryPath { View = view, Path = key.Name, HashValue = (view.GetHashCode() ^ key.Name.GetHashCode()).GetHashCode() };
snapshot.Add(path);
string[] valueNames = null;
try
{
valueNames = key.GetValueNames();
}
catch { }
if (valueNames != null)
{
foreach (string valueName in valueNames)
{
RegistryValueKind valueKind = RegistryValueKind.Unknown;
try
{
valueKind = key.GetValueKind(valueName);
}
catch { }
object value = key.GetValue(valueName);
RegistryPath pathForValue = new RegistryPath { View = view, Path = key.Name, ValueKind = valueKind, ValueName = valueName, Value = value, HashValue = (view.GetHashCode() ^ key.Name.GetHashCode() ^ valueKind.GetHashCode() ^ valueName.GetHashCode()).GetHashCode() };
snapshot.Add(pathForValue);
}
}
string[] subKeyNames = null;
try
{
subKeyNames = key.GetSubKeyNames();
}
catch { }
if (subKeyNames != null)
{
foreach (string subKeyName in subKeyNames)
{
try
{
WalkKey(snapshot, view, key.OpenSubKey(subKeyName));
}
catch { }
}
}
}
}
}
However, CLR still triggers a System.OutOfMemory exception. It is not thrown anywhere, but I see program execution stop entirely at around 2 GB of RAM, and when I freeze my code in Visual Studio, it shows that an out of memory exception was thrown whenever I try to view the state of variables within any thread of the application. It never happens on the first call to WalkTheRegistryAndPopulateTheSnapshot(snapshotOne);, but when the second call to WalkTheRegistryAndPopulateTheSnapshot(snapshotTwo); proceeds, it ends up stopping program execution at around 2 GB of overall RAM usage in my collections. The entire code is posted, so if you have a beefy registry you can probably see it get generated on an x86 console application. Is there something that I failed to grasp here, or is this pattern not a valid means to get around the 2 GB collection size limit that the other question on Stack seems to play up to?
回答1:
I'm going to expand on my comment. If you're writing a 32-bit app, you have some serious memory constraints when you're working with large amounts of data.
The most important thing to remember is that the 32-bit application is limited to an absolute maximum of 2^32 bytes (4 GB) of memory. In practice, it's usually 2 GB, or perhaps 3 GB if you have that much memory and the application is large address aware.
There's also the .NET imposed 2 GB limit, which limits the size of any single object to no more than 2 GB. It's rare that you'll encounter this limit in a 32-bit program, simply because, even on a machine that has more than 2 GB of memory, it's unlikely that there will be a contiguous chunk of memory that's 2 GB in size.
The 2 GB limit also exists in 64 bit versions of .NET, unless you're running .NET 4.5 and use the app.config setting that enables large objects.
As for why something like BigList exists in 32-bit versions, it's a way to get around requiring a contiguous block of memory. For example, a List<int> with 250 million items requires a gigabyte: a contiguous block of memory that's 1 GB in size. But if you use the BigList trick (as you did in your code), then you need 250 individual blocks of memory that are 4 MB in size. It's a whole lot more likely that you'll have 250 blocks of 4 MB than you will a single 1 GB block.
来源:https://stackoverflow.com/questions/26953877/getting-around-the-2-gb-collection-limit-in-net