Complexity analysis
- Big O notation
Common ones are O(1), O(n), O(log n), O(n log n)
In addition to Big O notation, time complexity can be described for the following cases:
- Best-case time complexity
- Worst-case time complexity
- Average-case time complexity
- Amortized time complexity
Code Execution Efficiency Analysis
In most cases, code execution efficiency can be analyzed with time complexity, but Big O notation deliberately drops constant factors.
In industrial implementations, however, real execution efficiency is usually evaluated from several angles:
- Time complexity
- Space complexity
- Cache friendliness
- Number of instructions executed
- Performance degradation (e.g., the cost of resolving hash collisions)
- etc.
All of the above is theoretical analysis. For front-end development, if you are building a base library that will be widely reused, theoretical analysis alone is not enough.
You can measure your module with a benchmarking library such as benchmark.js, as in the sketch below.
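Here is a minimal sketch of what such a benchmark might look like; the Array-vs-Set lookup comparison and the data size are illustrative assumptions, not something from this article.

```javascript
// Illustrative benchmark.js usage (npm install benchmark); the compared
// operations and sizes are assumptions for the sake of the example.
const Benchmark = require('benchmark');

const data = Array.from({ length: 10000 }, (_, i) => i);
const set = new Set(data);

new Benchmark.Suite()
  .add('Array#includes', () => {
    data.includes(9999);
  })
  .add('Set#has', () => {
    set.has(9999);
  })
  .on('cycle', (event) => {
    // Prints ops/sec and the margin of error for each case.
    console.log(String(event.target));
  })
  .on('complete', function () {
    console.log('Fastest is ' + this.filter('fastest').map('name'));
  })
  .run({ async: true });
```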
Data structures
1. Arrays
- Features: contiguous memory, index (subscript) access, cache friendly; random access is O(1)
- In-depth: from the JS prototype chain we know that Array in JS inherits from Object, and in some cases an array degenerates into a hash-table-like structure (dictionary mode)
- Think: how will the browser store the array created by the following code?

```javascript
let myArray = [10000];
myArray[0] = 0;
myArray[9999] = 1;
```
2. Linked lists
- Features: non-contiguous memory; finding an element is O(n) in the worst case and O(1) in the best case.
- Singly linked list
- Doubly linked list
- In-depth: how do you determine whether a linked list contains a cycle?
- Use two pointers, one advancing one node per step and the other two nodes per step; if they meet before the traversal finishes, the list has a cycle (see the sketch below).
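A minimal sketch of this fast/slow pointer check; the { value, next } node shape is an assumption for illustration.

```javascript
// Floyd's fast/slow pointer cycle check on a singly linked list.
function hasCycle(head) {
  let slow = head; // advances one node per step
  let fast = head; // advances two nodes per step
  while (fast !== null && fast.next !== null) {
    slow = slow.next;
    fast = fast.next.next;
    if (slow === fast) return true; // the pointers met: there is a cycle
  }
  return false; // fast reached the end: no cycle
}

// Usage: a -> b -> c -> b (c points back to b)
const c = { value: 3, next: null };
const b = { value: 2, next: c };
const a = { value: 1, next: b };
c.next = b;
console.log(hasCycle(a)); // true
```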
3. Stack
- Features: LIFO (last in, first out); the underlying structure can be an array or a linked list.
- Scenario: when a function is called, the current execution context is pushed onto the call stack; when the call completes, it is popped and the caller's context is restored.
- In-depth: recursive calls keep pushing frames onto the stack; if the recursion is too deep, this causes a stack overflow.
For example, the dirty-checking ($digest) loop in AngularJS is capped at a depth of 10.
- Think: can the browser's forward and back navigation be implemented with stacks?
Yes: use a dual-stack structure, one stack for the back history and one for the forward history (see the sketch below).
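A minimal sketch of that dual-stack idea; the class and method names are illustrative assumptions.

```javascript
// Visiting a page pushes the current page onto the back stack and clears the
// forward stack; back/forward move pages between the two stacks.
class History {
  constructor(start) {
    this.current = start;
    this.backStack = [];
    this.forwardStack = [];
  }

  visit(page) {
    this.backStack.push(this.current);
    this.forwardStack = []; // a new navigation invalidates the forward history
    this.current = page;
  }

  back() {
    if (this.backStack.length === 0) return this.current;
    this.forwardStack.push(this.current);
    this.current = this.backStack.pop();
    return this.current;
  }

  forward() {
    if (this.forwardStack.length === 0) return this.current;
    this.backStack.push(this.current);
    this.current = this.forwardStack.pop();
    return this.current;
  }
}

const h = new History('a.html');
h.visit('b.html');
h.visit('c.html');
console.log(h.back());    // b.html
console.log(h.back());    // a.html
console.log(h.forward()); // b.html
```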
4. Queues
- Features: FIFO (first in, first out); the underlying structure can be an array or a linked list.
- Non-blocking queue
- Blocking queue (linked-list based)
- Circular queue (array based)
- In-depth: the macrotask queue and microtask queue in JS (see the sketch below)
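A small sketch (not from the article) showing the two queues interacting: Promise callbacks go to the microtask queue, which is drained before the next macrotask (here a setTimeout callback) runs.

```javascript
console.log('script start');

setTimeout(() => console.log('macrotask: setTimeout'), 0);

Promise.resolve()
  .then(() => console.log('microtask: promise 1'))
  .then(() => console.log('microtask: promise 2'));

console.log('script end');

// Output order:
// script start
// script end
// microtask: promise 1
// microtask: promise 2
// macrotask: setTimeout
```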
5. Skip lists
The underlying structure is a linked list. To speed up lookups, one or more index layers are built on top of the base list, with an index node every (2/3/4/.../n) nodes.
- Features: supports range queries, supports dynamic insertion and deletion, and needs extra space to store the index nodes
- Time complexity: lookup cost is determined by the height of the skip list, i.e., O(log n) on average
- Scenario: sorted sets in Redis, for example (a minimal sketch follows)
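Below is a minimal skip-list sketch, assuming the usual coin-flip level assignment; it is an illustrative implementation, not the one Redis uses.

```javascript
// Each node keeps an array of forward pointers, one per index level.
const MAX_LEVEL = 16;

class SkipListNode {
  constructor(value, level) {
    this.value = value;
    this.forward = new Array(level).fill(null); // forward[i] = next node at level i
  }
}

class SkipList {
  constructor() {
    this.head = new SkipListNode(-Infinity, MAX_LEVEL);
    this.level = 1; // highest level currently in use
  }

  // Flip coins to decide how many index levels a new node participates in.
  randomLevel() {
    let level = 1;
    while (Math.random() < 0.5 && level < MAX_LEVEL) level++;
    return level;
  }

  insert(value) {
    const update = new Array(MAX_LEVEL).fill(this.head);
    let node = this.head;
    // Walk from the top level down, recording the rightmost node < value per level.
    for (let i = this.level - 1; i >= 0; i--) {
      while (node.forward[i] && node.forward[i].value < value) node = node.forward[i];
      update[i] = node;
    }
    const newLevel = this.randomLevel();
    if (newLevel > this.level) this.level = newLevel;
    const newNode = new SkipListNode(value, newLevel);
    for (let i = 0; i < newLevel; i++) {
      newNode.forward[i] = update[i].forward[i];
      update[i].forward[i] = newNode;
    }
  }

  // Expected O(log n): drop down a level whenever the next node overshoots the target.
  contains(value) {
    let node = this.head;
    for (let i = this.level - 1; i >= 0; i--) {
      while (node.forward[i] && node.forward[i].value < value) node = node.forward[i];
    }
    node = node.forward[0];
    return node !== null && node.value === value;
  }
}

const list = new SkipList();
[3, 7, 1, 9, 4].forEach((v) => list.insert(v));
console.log(list.contains(7), list.contains(5)); // true false
```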
6. Hash tables
- The underlying structure is an array; a hash function computed over the key maps it to an index in that array.
- Hash function requirements:
- The hash value is a non-negative integer
- If key1 == key2, then hash(key1) == hash(key2)
- If key1 != key2, then hash(key1) != hash(key2) (ideally; in practice collisions cannot be avoided entirely)
- Common hash algorithms include MD5, SHA, CRC, etc.
- Hash collisions:
- Open addressing
Linear probing: on a collision, scan forward slot by slot until a free slot is found
Double hashing: use multiple hash functions; if the slot for the first hash value is occupied, try the second, and so on
- Chaining (linked-list method)
On a collision, entries are linked into a chain hanging off the bucket; the chain can be a singly or doubly linked list, or a tree
- Load factor: the number of occupied slots divided by the total number of slots in the array
- Dynamic resizing: when the load factor reaches a threshold (e.g. 0.75), the underlying array is grown
Note: in industrial implementations, resizing is not done in one go.
Rehashing everything at once can cause a performance spike. Typically, when the load factor reaches the threshold a new array is allocated, and each subsequent operation moves some data from the old array to the new one until the old array is empty (see the sketch below).
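A minimal sketch of that incremental-resizing idea, assuming a chaining hash map where each get/set also migrates a couple of old buckets; all names here are illustrative.

```javascript
class IncrementalHashMap {
  constructor(capacity = 8) {
    this.buckets = Array.from({ length: capacity }, () => []);
    this.oldBuckets = null; // non-null while an incremental migration is running
    this.migrateIndex = 0;  // next old bucket to move
    this.size = 0;
  }

  _hash(key, capacity) {
    let h = 0;
    for (let i = 0; i < key.length; i++) h = (h * 31 + key.charCodeAt(i)) >>> 0;
    return h % capacity;
  }

  // Move at most `step` old buckets per call instead of rehashing everything at once.
  _migrateStep(step = 2) {
    while (this.oldBuckets && step-- > 0) {
      for (const [k, v] of this.oldBuckets[this.migrateIndex]) {
        this.buckets[this._hash(k, this.buckets.length)].push([k, v]);
      }
      this.oldBuckets[this.migrateIndex] = []; // this bucket is fully moved
      this.migrateIndex += 1;
      if (this.migrateIndex >= this.oldBuckets.length) this.oldBuckets = null;
    }
  }

  _findIn(table, key) {
    if (!table) return null;
    const bucket = table[this._hash(key, table.length)];
    return bucket.find(([k]) => k === key) || null;
  }

  get(key) {
    this._migrateStep();
    const entry = this._findIn(this.buckets, key) || this._findIn(this.oldBuckets, key);
    return entry ? entry[1] : undefined;
  }

  set(key, value) {
    this._migrateStep();
    // Update in place if the key already exists in either array.
    const existing = this._findIn(this.buckets, key) || this._findIn(this.oldBuckets, key);
    if (existing) { existing[1] = value; return; }
    // Start a resize once the load factor passes 0.75 and no migration is running.
    if (!this.oldBuckets && (this.size + 1) / this.buckets.length > 0.75) {
      this.oldBuckets = this.buckets;
      this.buckets = Array.from({ length: this.buckets.length * 2 }, () => []);
      this.migrateIndex = 0;
    }
    this.buckets[this._hash(key, this.buckets.length)].push([key, value]);
    this.size++;
  }
}

const map = new IncrementalHashMap();
for (let i = 0; i < 100; i++) map.set('key' + i, i);
console.log(map.get('key42')); // 42
```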
JavaScript Object Data Structure
- In another article I covered the several modes V8 uses to store JS objects under the hood; here, let's take a closer look at the source code
```cpp
// https://github.com/v8/v8/blob/master/src/objects/js-objects.h
// Let's start with the definition of an object in JS; it inherits from JSReceiver
// Line 278
class JSObject : public JSReceiver {
  // Omitted...
};

// Line 26
// Next, JSReceiver inherits from HeapObject and carries several important properties
// JSReceiver includes types on which properties can be defined, i.e.,
// JSObject and JSProxy.
class JSReceiver : public HeapObject {
 public:
  NEVER_READ_ONLY_SPACE
  // Returns true if there is no slow (ie, dictionary) backing store.
  // i.e., is the object in fast-properties mode?
  inline bool HasFastProperties() const;

  // Returns the properties array backing store if it exists.
  // Otherwise, returns an empty_property_array when there's a
  // Smi (hash code) or an empty_fixed_array for a fast properties map.
  // The property array (fast mode)
  inline PropertyArray property_array() const;

  // Gets slow properties for non-global objects.
  // The property dictionary (dictionary mode)
  inline NameDictionary property_dictionary() const;

  // Sets the properties backing store and makes sure any existing hash is moved
  // to the new properties store. To clear out the properties store, pass in the
  // empty_fixed_array(), the hash will be maintained in this case as well.
  void SetProperties(HeapObject properties);

  // There are five possible values for the properties offset.
  // 1) EmptyFixedArray/EmptyPropertyDictionary - This is the standard
  //    placeholder.
  //
  // 2) Smi - This is the hash code of the object.
  //
  // 3) PropertyArray - This is similar to a FixedArray but stores
  //    the hash code of the object in its length field. This is a fast
  //    backing store.
  //
  // 4) NameDictionary - This is the dictionary-mode backing store.
  //
  // 5) GlobalDictionary - This is the backing store for the
  //    GlobalObject.
  // Initializes the properties field
  inline void initialize_properties();
```
- From the above we can see that an object has two property modes: fast properties and dictionary properties. Fast properties are stored in an array, while dictionary properties are stored in a hash table. A quick way to observe the switch between the two modes is sketched below.
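One assumed workflow for observing this yourself is to run Node with V8 natives enabled and query %HasFastProperties; the exact transition points depend on the V8 version.

```javascript
// Run with: node --allow-natives-syntax fast-props.js
// %HasFastProperties is a V8 runtime function exposed by that flag.
const obj = { a: 1, b: 2 };
console.log(%HasFastProperties(obj)); // true: properties stored in fast mode

// Deleting a property (other than the most recently added one) is a typical
// operation that can push the object into dictionary (NameDictionary) mode.
delete obj.a;
console.log(%HasFastProperties(obj)); // expected: false
```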
- We won't go deeper into fast properties here; instead, let's look at the underlying structure of NameDictionary
```cpp
// https://github.com/v8/v8/blob/master/src/objects/dictionary.h
// Let's first look at the inheritance chain
// Line 202
class V8_EXPORT_PRIVATE NameDictionary
    : public BaseNameDictionary<NameDictionary, NameDictionaryShape> {};

// Line 128
class EXPORT_TEMPLATE_DECLARE(V8_EXPORT_PRIVATE) BaseNameDictionary
    : public Dictionary<Derived, Shape> {};

// Line 26
class EXPORT_TEMPLATE_DECLARE(V8_EXPORT_PRIVATE) Dictionary
    : public HashTable<Derived, Shape> {};

// As we can see, everything ultimately inherits from HashTable, so let's look at its definition
// https://github.com/v8/v8/blob/master/src/objects/hash-table.h
// The comments at the top of that file are already quite detailed:
//
// HashTable is a subclass of FixedArray that implements a hash table that uses
// open addressing and quadratic probing.
// *Important: the hash table uses an array (FixedArray) as its backing store,
// with open addressing and quadratic probing implemented on top of it.
//
// In order for the quadratic probing to work, elements that have not yet been
// used and elements that have been deleted are distinguished. Probing continues
// when deleted elements are encountered and stops when unused elements are
// encountered.
// *For quadratic probing to work, unused and deleted slots are marked rather
// than removed outright:
// - Elements with key == undefined have not been used yet.
// - Elements with key == the_hole have been deleted.
//
// The following HashTable-derived class is the one we will look at
// Line 292
template <typename Derived, typename Shape>
class EXPORT_TEMPLATE_DECLARE(V8_EXPORT_PRIVATE) ObjectHashTableBase
    : public HashTable<Derived, Shape> {};

// Next, let's look at several important behaviors and parameters of V8's hash table implementation
// https://github.com/v8/v8/blob/master/src/objects/objects.cc

// 1. Expansion
// Line 7590
// The following is the logic that runs when a new element is put into the table
ObjectHashTableBase<Derived, Shape>::Put(Isolate* isolate, Handle<Derived> table,
                                         Handle<Object> key, Handle<Object> value,
                                         int32_t hash) {
  int entry = table->FindEntry(roots, key, hash);

  // Key is already in table, just overwrite value.
  if (entry != kNotFound) {
    table->set(Derived::EntryToValueIndex(entry), *value);
    return table;
  }

  // Rehash if more than 33% of the entries are deleted entries.
  // TODO(jochen): Consider to shrink the fixed array in place.
  if ((table->NumberOfDeletedElements() << 1) > table->NumberOfElements()) {
    table->Rehash(roots);
  }
  // If we're out of luck, we didn't get a GC recently, and so rehashing
  // isn't enough to avoid a crash.
  // If the estimated free capacity is still not enough, force a full GC and rehash
  if (!table->HasSufficientCapacityToAdd(1)) {
    int nof = table->NumberOfElements() + 1;
    int capacity = ObjectHashTable::ComputeCapacity(nof * 2);
    if (capacity > ObjectHashTable::kMaxCapacity) {
      for (size_t i = 0; i < 2; ++i) {
        isolate->heap()->CollectAllGarbage(
            Heap::kNoGCFlags, GarbageCollectionReason::kFullHashtable);
      }
      table->Rehash(roots);
    }
  }
```
```cpp
// Line 6583
// Below is the logic that ensures there is enough estimated free capacity
HashTable<Derived, Shape>::EnsureCapacity(Isolate* isolate, Handle<Derived> table,
                                          int n, AllocationType allocation) {
  if (table->HasSufficientCapacityToAdd(n)) return table;

  int capacity = table->Capacity();
  int new_nof = table->NumberOfElements() + n;

  const int kMinCapacityForPretenure = 256;
  bool should_pretenure =
      allocation == AllocationType::kOld ||
      ((capacity > kMinCapacityForPretenure) && !Heap::InYoungGeneration(*table));
  Handle<Derived> new_table = HashTable::New(
      isolate, new_nof,
      should_pretenure ? AllocationType::kOld : AllocationType::kYoung);

  table->Rehash(ReadOnlyRoots(isolate), *new_table);
  return new_table;
}

// 2. Shrink
// Line 6622
HashTable<Derived, Shape>::Shrink(Isolate* isolate, Handle<Derived> table,
                                  int additionalCapacity) {
  int capacity = table->Capacity();
  int nof = table->NumberOfElements();

  // Shrink to fit the number of elements if only a quarter of the capacity is
  // filled with elements.
  if (nof > (capacity >> 2)) return table;
  // Allocate a new dictionary with room for at least the current number of
  // elements + {additionalCapacity}. The allocation method will make sure that
  // there is extra room in the dictionary for additions. Don't go lower than
  // room for {kMinShrinkCapacity} elements.
  int at_least_room_for = nof + additionalCapacity;
  int new_capacity = ComputeCapacity(at_least_room_for);
  if (new_capacity < Derived::kMinShrinkCapacity) return table;
  if (new_capacity == capacity) return table;

  const int kMinCapacityForPretenure = 256;
  bool pretenure = (at_least_room_for > kMinCapacityForPretenure) &&
                   !Heap::InYoungGeneration(*table);
  Handle<Derived> new_table = HashTable::New(
      isolate, new_capacity,
      pretenure ? AllocationType::kOld : AllocationType::kYoung,
      USE_CUSTOM_MINIMUM_CAPACITY);

  table->Rehash(ReadOnlyRoots(isolate), *new_table);
  return new_table;
}
```