C# Dictionary 的实际工作原理_c#dictionary.add_编码文章

Dictionary是 C# 中非常流行的数据结构，也是面试问题的热门选择。我已经使用了 10 亿次，我非常确定我了解它们的工作原理。但是，当我更深入地研究它们并检查实际代码时，我发现它们比我想象的还要好（也许您也是如此）。在本文中，我们将一起进行深入研究，甚至编写我们自己的词典教育副本。所以和我一起开始吧！Dictionary

为了确保我们的副本与实际代码匹配，我们将首先探索原始代码中的内容，然后删除所有不必要的内容，并在需要的地方添加日志（）。只需复制两个主要方法就足够了：Add 和 GetValueOrDefault 来重新创建的所有基本要素，因此这就是我们要执行的操作。但首先，让我们看看 a 中的字段：Console.WriteLineDictionaryDictionary

我将使用参考源中的 .NET Framework 4.8 中的代码。尽管现代 .NET 代码稍微复杂一些，但基本要素仍然相同，因此 Framework 版本应该可以。

private struct Entry { 
public int hashCode; 
public int next; 
public TKey key; 
public TValue value; 
} 

private int[] buckets; 
private Entry[] entries; 
private int count; 
private int version; 
private int freeList; 
private int freeCount; 
private IEqualityComparer<TKey> comparer;

和 properties 是优化技术的一部分，对于 a 的运行不是必需的 - 我们将把它们传递给我们的 replica。只是一个 changes 计数器，我们也会传递它。为简单起见，我们还将使用而不是允许外部配置。添加打印当前状态（所有属性的值）的方法，我们将得到：freeListfreeCountDictionaryversionEqualityComparer.Default

public class EducationalDictionary<TKey, TValue>
{
private struct Entry {
public int hashCode;
public int next;
public TKey key;
public TValue? value;

string ValueString => value ==  ? "" : value!.ToString()!;
public override string ToString()
 {
return $"{key} - {ValueString} + (next = {next})";
 }
 }

private int[] buckets;
private Entry[] entries;
private int count;
private readonly EqualityComparer<TKey> comparer = EqualityComparer.Default;
public void PrintFullState(string preface)
 {
 Console.WriteLine();
 Console.Write(this.ToString(preface));
 }
public string ToString(string preface)
 {
StringBuilder result = new();

 result.AppendLine($"{preface}, state:");
 result.AppendLine();
 result.AppendLine($"buckets: [{String.Join(", ", buckets!)}]");
 result.AppendLine($"entries:");
for (int i = 0; i < entries.Length; i++)
 {
 result.AppendLine($" [{i}] = {entries[i]}");
 }
 result.AppendLine($"count: {count}");
return result.ToString();
 }
}

源代码中的方法实质上是对另一个方法的调用：Add

public void Add(TKey key, TValue value) { 
Insert(key, value, true); 
}

该方法可能看起来很可怕，但别担心，我们会一点一点地剖析它：Insert

private void Insert(TKey key, TValue value, bool add) {
if( key ==  ) {
 ThrowHelper.ThrowArgumentException(ExceptionArgument.key);
 }

if (buckets == ) Initialize(0);
int hashCode = comparer.GetHashCode(key) & 0x7FFFFFFF;
int targetBucket = hashCode % buckets.Length;

#if FEATURE_RANDOMIZED_STRING_HASHING
int collisionCount = 0;
#endif

for (int i = buckets[targetBucket]; i >= 0; i = entries[i].next) {
if (entries[i].hashCode == hashCode && comparer.Equals(entries[i].key, key)) {
if (add) { 
 ThrowHelper.ThrowArgumentException(ExceptionResource.Argument_AddingDuplicate);
 }
 entries[i].value = value;
 version++;
return;
 } 

#if FEATURE_RANDOMIZED_STRING_HASHING
 collisionCount++;
#endif
 }
int index;
if (freeCount > 0) {
 index = freeList;
 freeList = entries[index].next;
 freeCount--;
 }
else {
if (count == entries.Length)
 {
Resize();
 targetBucket = hashCode % buckets.Length;
 }
 index = count;
 count++;
 }

 entries[index].hashCode = hashCode;
 entries[index].next = buckets[targetBucket];
 entries[index].key = key;
 entries[index].value = value;
 buckets[targetBucket] = index;
 version++;

#if FEATURE_RANDOMIZED_STRING_HASHING

#if FEATURE_CORECLR
// In case we hit the collision threshold we'll need to switch to the comparer which is using randomized string hashing
// in this case will be EqualityComparer.Default.
// Note, randomized string hashing is turned on by default on coreclr so EqualityComparer.Default will 
// be using randomized string hashing

if (collisionCount > HashHelpers.HashCollisionThreshold && comparer == NonRandomizedStringEqualityComparer.Default) 
 {
 comparer = (IEqualityComparer) EqualityComparer<string>.Default;
Resize(entries.Length, true);
 }
#else
if(collisionCount > HashHelpers.HashCollisionThreshold && HashHelpers.IsWellKnownEqualityComparer(comparer)) 
 {
 comparer = (IEqualityComparer) HashHelpers.GetRandomizedEqualityComparer(comparer);
Resize(entries.Length, true);
 }
#endif // FEATURE_CORECLR

#endif 
}

它还调用其他方法：

private void Initialize(int capacity) {
int size = HashHelpers.GetPrime(capacity);
 buckets = new int[size];
for (int i = 0; i < buckets.Length; i++) buckets[i] = -1;
 entries = new Entry[size];
 freeList = -1;
}

private void Resize(int newSize, bool forceNewHashCodes) {
 Contract.Assert(newSize >= entries.Length);
int[] newBuckets = new int[newSize];
for (int i = 0; i < newBuckets.Length; i++) newBuckets[i] = -1;
Entry[] newEntries = new Entry[newSize];
 Array.Copy(entries, 0, newEntries, 0, count);
if(forceNewHashCodes) {
for (int i = 0; i < count; i++) {
if(newEntries[i].hashCode != -1) {
 newEntries[i].hashCode = (comparer.GetHashCode(newEntries[i].key) & 0x7FFFFFFF);
 }
 }
 }
for (int i = 0; i < count; i++) {
if (newEntries[i].hashCode >= 0) {
int bucket = newEntries[i].hashCode % newSize;
 newEntries[i].next = newBuckets[bucket];
 newBuckets[bucket] = i;
 }
 }
 buckets = newBuckets;
 entries = newEntries;
}

最后，它们调用上的方法。我们将复制这些，因为如果您不处理边缘情况，它们将非常微不足道。我们的版本将如下所示：HashHelpers

public static class HashHelpers
{
public static readonly int[] primes = {
3, 7, 11, 17, 23, 29, 37, 47, 59, 71, 89, 107, 131, 163, 197, 239, 293, 353, 431, 521, 631, 761, 919,
1103, 1327, 1597, 1931, 2333, 2801, 3371, 4049, 4861, 5839, 7013, 8419, 10103, 12143, 14591,
17519, 21023, 25229, 30293, 36353, 43627, 52361, 62851, 75431, 90523, 108631, 130363, 156437,
187751, 225307, 270371, 324449, 389357, 467237, 560689, 672827, 807403, 968897, 1162687, 1395263,
1674319, 2009191, 2411033, 2893249, 3471899, 4166287, 4999559, 5999471, 7199369};

public static int GetPrime(int min)
 {
foreach (var prime in primes)
 {
if (prime >= min) return prime;
 }

return min;
 }

public static int ExpandPrime(int oldSize)
 {
return GetPrime(2 * oldSize);
 }
}

因为我们将在调整大小完成后删除 logic 和 print state 。老实说，代码并不是很重要，它本质上是重新排列和针对新大小。这是我们的结果：Resizeif (forceNewHashCodes)entriesbuckets

private void Resize(int newSize) {
var newBuckets = new int[newSize];
for (var i = 0; i < newBuckets.Length; i++) newBuckets[i] = -1;
var newEntries = new Entry[newSize];
 Array.Copy(entries, 0, newEntries, 0, count);

for (var i = 0; i < count; i++) {
if (newEntries[i].hashCode >= 0) {
var bucket = newEntries[i].hashCode % newSize;
 newEntries[i].next = newBuckets[bucket];
 newBuckets[bucket] = i;
 }
 }
 buckets = newBuckets;
 entries = newEntries;

PrintFullState("\u2194\ufe0f Resize");
}

我们将初始化逻辑从 straight 移动到构造函数（因为它将更容易处理可空性）：if (buckets == ) Initialize(0);

以及初始化完成后的打印状态

public EducationalDictionary()
{
var size = HashHelpers.GetPrime(0);
 buckets = new int[size];
for (var i = 0; i < buckets.Length; i++) buckets[i] = -1;
 entries = new Entry[size];

PrintFullState("\ud83d\ude80 Initialized");
}

以下是我们将执行的更改列表：Add

删除参数验证
删除#
FEATURE_RANDOMIZED_STRING_HASHING
删除 logic under，因为我们还是删除了参数if (freeCount > 0)
删除 loop，检查已设置的 key：for (int i = buckets[targetBucket]; i >= 0; i = entries[i].next) {
添加后的打印状态

public void Add(TKey key, TValue value) {
var hashCode = comparer.GetHashCode(key) & 0x7FFFFFFF;
var targetBucket = hashCode % buckets!.Length;

if (count == entries!.Length)
 {
Resize(HashHelpers.ExpandPrime(count));
 targetBucket = hashCode % buckets.Length;
 }

var index = count;
 count++;

 entries[index].hashCode = hashCode;
 entries[index].next = buckets[targetBucket];
 entries[index].key = key;
 entries[index].value = value;
 buckets[targetBucket] = index;

 Console.WriteLine();
PrintFullState($"\ud83d\udce5 Add: {key} - {value}. hashCode = {hashCode}, targetBucket = {targetBucket}");
}

现在看起来简单多了，对吧？现在让我们完成我们的类，实现 .GetValueOrDefault

获取价值

这一次的实现相当简单，甚至在原始代码中也是如此：GetValueOrDefault

internal TValue GetValueOrDefault(TKey key) {
int i = FindEntry(key);
if (i >= 0) {
return entries[i].value;
 }
return default(TValue);
}

在搜索本身（在方法中）：FindEntry

private int FindEntry(TKey key) {
if( key == ) {
 ThrowHelper.ThrowArgumentException(ExceptionArgument.key);
 }

if (buckets != ) {
int hashCode = comparer.GetHashCode(key) & 0x7FFFFFFF;
for (int i = buckets[hashCode % buckets.Length]; i >= 0; i = entries[i].next) {
if (entries[i].hashCode == hashCode && comparer.Equals(entries[i].key, key)) return i;
 }
 }
return -1;
}

这里没有太多需要清理的地方。但是我们将添加大量日志，因为这是我们最重要的逻辑

private int FindEntry(TKey key)
{
var hashCode = comparer.GetHashCode(key!) & 0x7FFFFFFF;
var initialBucketIndex = hashCode % buckets.Length;

 Console.WriteLine();
 Console.WriteLine($"\ud83d\udd0e Search. Key = {key}. Initial Bucket Index {initialBucketIndex}");

for (var i = buckets[initialBucketIndex]; i >= 0; i = entries[i].next)
 {
 Console.WriteLine();
 Console.WriteLine($"Comparing key from entries[{i}] ({entries[i]}) to {key}");

if (entries[i].hashCode == hashCode && comparer.Equals(entries[i].key, key))
 {
 Console.WriteLine($"Key is equal returning {i}");

return i;
 }
else {
 Console.WriteLine($"Key is not equal, moving to the next linked index ({entries[i].next})");
 }
 }

 Console.WriteLine();
 Console.WriteLine("Search exit condition met (i >= 0). Returning -1 (as not found)");
return -1;
}

这样就完成了我们的 .现在让我们使用它并查看我们准备的详细日志！EducationalDictionary

执行测试

我们将添加 4

C# Dictionary 的实际工作原理_c#dictionary.add

获取价值

执行测试

相关文章