How to Implement a C++ Vector From Scratch

How to Implement a C++ Vector From Scratch

Introduction

std::vector is a sequence container in C++ that represents a dynamic array. It is part of the C++ Standard Template Library (STL) and provides the ability to store and manipulate a collection of elements with automatic memory management and flexible resizing.

std::vector is the most commonly used container in C++ development, whether you are passionate about competitive coding or interested in C++ programming, learning the low-level implementation of it helps you gain a more thorough understanding of C++ language and prepares you for better using it in the future.

Prerequisite Knowledge

Rule of Five

The Rule of Five in C++ is a guideline helping developers manage resource ownership in user-defined types.
To be specific, if your class manages a resource, and you define any one of the following five special member functions, you should probably define all five to ensure correct behavior:

  • Destructor
  • Copy Constructor
  • Copy Assignment Operator
  • Move Constructor
  • Move Assignment Operator

reinterpret_cast<T>(expression)

reinterpret_cast is a low-level type conversion operator in C++ used to reinterpret the binary representation of a value as a different type.

Two important thing about it is that it does not perform any type safety checks and it simply treats the bits of one type as if they were another.

::operator new and ::operator delete

::operator new is a built-in global function in C++ that allocates raw, uninitialized memory on the heap.

It is not the same as the new expression, though the new expression uses it internally. The :: scope resolution prefix ensures you’re calling the global version of operator new, avoiding any class-specific overloads.

::operator delete is the global deallocation function to free raw memory that was previously allocated using ::operator new.

Step by Step Implementation

Private Fields

To implement a std::vector, firstly we need to define the private fields of this class.

1
2
3
T* data_ = nullptr;     // data_ is a pointer pointing to a generic type T
size_t capacity_ = 0;
size_t size_ = 0;

Additionally, we need to define the initial size of a vector and the growth factor. The value of a growth factor is the growing rate of a vector once the capacity is full.

1
2
static constexpr size_t startSize_ = 10;
static constexpr double growthFactor_ = 1.6;    // if growthFactor = 2, the vector will be doubled when it's full

Constructors and Destructor

According to Rule of Five, we need to implement three types of constructors for out vector class, which are Default Constructor, Copy Constructor, and Move Constructor.
In Default Constructor, we need to allocate initial capacity memory for the underlying data and initialzie the member variables.

1
2
3
4
Vector() {
    data_ = reinterpret_cast<T*>(::operator new(startSize_ * sizeof(T)));
    capacity_ = startSize_;
}

In Copy Constructor, we need to allocate capacity for a new vector and keep the input object unchanged.

1
2
3
4
5
6
Vector(const Vector& other) : size_(other.size_), capacity_(other.capacity_) {
    data_ = reinterpret_cast<T*>(::operator new(capacity_ * sizeof(T)));
    for (size_t i = 0; i < size_; ++i) {
        new (data_ + i) T(other.data_[i]);
    }
}

In Move Constructor, we need to transfer the ownership of resources from one object to another instead of copying, which means the input object of this constructor should be set to void.

1
2
3
4
5
6
Vector(Vector&& other) noexcept 
    : data_(other.data_), capacity_(other.capacity_), size_(other.size_) {
    other.data_ = nullptr;
    other.capacity_ = 0;
    other.size_ = 0;
}

In the Destructor

1
2
3
4
5
6
~Vector() {
    for (size_t i = 0; i < size_; ++i) {
        data_[i].~T();
    }
    ::operator delete(data_);
}

Assignments

According to the Rule of Five, both the copy assignemnt and the move assignemnt must be implemented for this class.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
// Copy assignment
Vector& operator=(const Vector& other) {
    if (this != &other) {
        Vector temp(other);
        swap(temp);
    }
    return *this;
}

// Move assignment
Vector& operator=(Vector&& other) noexcept {
    if (this != &other) {
        Vector temp(std::move(other));
        swap(temp);
    }
    return *this;
}

Out of convenience, we can also define a specific swap function, which swaps another object with this.

1
2
3
4
5
void swap(Vector& other) noexcept {
    std::swap(data_, other.data_);
    std::swap(capacity_, other.capacity_);
    std::swap(size_, other.size_);
}

Element Insertion

In std::vector, the push_back method appends a new element to the end of the container and there are two versions should be implemented, which are the one handling the left value

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
void push_back(const T& elem) {
    if (size_ == capacity_) {
        reserve(capacity_ ? static_cast<size_t>(capacity_ * growthFactor_) : startSize_);
        assert(size_ < capacity_);
    }
        
    assert(data_);
    new (data_ + size_) T(elem);
    ++size_;
}

and the one handling the right value.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
void push_back(T&& elem) {
    if (size_ == capacity_) {
        reserve(capacity_ ? static_cast<size_t>(capacity_ * growthFactor_) : startSize_);
        assert(size_ < capacity_);
    }
        
    assert(data_);
    new (data_ + size_) T(std::move(elem));
    ++size_;
}

Element Access

To allow users to access elements by index, we need to overload two versions of operator[], which handle both const and non-const objects.

1
2
3
4
5
6
7
8
9
const T& operator[](size_t idx) const { 
    assert(idx < size_);
    return data_[idx]; 
}
    
T& operator[](size_t idx) { 
    assert(idx < size_);
    return data_[idx]; 
}

In addition, common operations like front, back, and size should also be added, which are

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
const T& front() const { 
    assert(size_ > 0);
    return operator[](0); 
}
    
const T& back() const { 
    assert(size_ > 0);
    return operator[](size_ - 1); 
}
    
size_t size() const { return size_; }
size_t capacity() const { return capacity_; }
bool empty() const { return size_ == 0; }

Element Removal & Container Modification

There are three operations to remove elemnt from a vector, which are pop_back, erase, and clear.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
// Remove element at specific index
void erase(size_t idx) {
    if (idx >= size_) return;
        
    // Destroy the element at idx
    data_[idx].~T();
        
    // Move elements to fill the gap
    for (size_t j = idx; j < size_ - 1; ++j) {
        new (data_ + j) T(std::move(data_[j + 1]));
        data_[j + 1].~T();
    }
    --size_;
}

// Remove all elements in the vector
void clear() {
    for (size_t i = 0; i < size_; ++i) {
        data_[i].~T();
    }
    size_ = 0;
}

// Remove an element at the back of this vector
void pop_back() {
    if (size_ > 0) {
        --size_;
        data_[size_].~T();
    }
}

Memory Management

The reserve operation is used to pre-allocate memory for a container, which is useful when you know how many elements will be added to a vector.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
void reserve(size_t newCapacity) {
    if (capacity_ >= newCapacity) return;
        
    auto* newData = reinterpret_cast<T*>(::operator new(newCapacity * sizeof(T)));
    for (size_t i = 0; i < size_; ++i) {
        new (newData + i) T(std::move(data_[i]));
        data_[i].~T();
    }
    ::operator delete(data_);
    data_ = newData;
    capacity_ = newCapacity;
}

Complete Code

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
#include <cstddef>
#include <cassert>
#include <cstdlib>
#include <utility>
#include <new>
#include <vector>

template <typename T>
class Vector {
public:
    // Default constructor
    Vector() {
        data_ = reinterpret_cast<T*>(::operator new(startSize_ * sizeof(T)));
        capacity_ = startSize_;
    }
    
    // Copy constructor
    Vector(const Vector& other) : size_(other.size_), capacity_(other.capacity_) {
        data_ = reinterpret_cast<T*>(::operator new(capacity_ * sizeof(T)));
        for (size_t i = 0; i < size_; ++i) {
            new (data_ + i) T(other.data_[i]);
        }
    }
    
    // Move constructor
    Vector(Vector&& other) noexcept 
        : data_(other.data_), capacity_(other.capacity_), size_(other.size_) {
        other.data_ = nullptr;
        other.capacity_ = 0;
        other.size_ = 0;
    }
    
    // Copy assignment
    Vector& operator=(const Vector& other) {
        if (this != &other) {
            Vector temp(other);
            swap(temp);
        }
        return *this;
    }
    
    // Move assignment
    Vector& operator=(Vector&& other) noexcept {
        if (this != &other) {
            Vector temp(std::move(other));
            swap(temp);
        }
        return *this;
    }
    
    // Destructor
    ~Vector() {
        for (size_t i = 0; i < size_; ++i) {
            data_[i].~T();
        }
        ::operator delete(data_);
    }

    // Left value reference
    void push_back(const T& elem) {
        if (size_ == capacity_) {
            reserve(capacity_ ? static_cast<size_t>(capacity_ * growthFactor_) : startSize_);
            assert(size_ < capacity_);
        }
        
        assert(data_);
        new (data_ + size_) T(elem);  // Fixed: construct at correct position with elem
        ++size_;
    }
    
    // Right value reference
    void push_back(T&& elem) {
        if (size_ == capacity_) {
            reserve(capacity_ ? static_cast<size_t>(capacity_ * growthFactor_) : startSize_);
            assert(size_ < capacity_);
        }
        
        assert(data_);
        new (data_ + size_) T(std::move(elem));
        ++size_;
    }

    // Const value overload
    const T& operator[](size_t idx) const { 
        assert(idx < size_);
        return data_[idx]; 
    }
    
    // Non-const value overload
    T& operator[](size_t idx) { 
        assert(idx < size_);
        return data_[idx]; 
    }
    
    const T& front() const { 
        assert(size_ > 0);
        return operator[](0); 
    }
    
    const T& back() const { 
        assert(size_ > 0);
        return operator[](size_ - 1); 
    }
    
    size_t size() const { return size_; }
    size_t capacity() const { return capacity_; }
    bool empty() const { return size_ == 0; }

    void reserve(size_t newCapacity) {
        if (capacity_ >= newCapacity) return;
        
        auto* newData = reinterpret_cast<T*>(::operator new(newCapacity * sizeof(T)));
        for (size_t i = 0; i < size_; ++i) {
            new (newData + i) T(std::move(data_[i]));
            data_[i].~T();
        }
        ::operator delete(data_);
        data_ = newData;
        capacity_ = newCapacity;
    }

    void erase(size_t idx) {
        if (idx >= size_) return;
        
        // Destroy the element at idx
        data_[idx].~T();
        
        // Move elements to fill the gap
        for (size_t j = idx; j < size_ - 1; ++j) {
            new (data_ + j) T(std::move(data_[j + 1]));
            data_[j + 1].~T();
        }
        --size_;
    }
    
    void clear() {
        for (size_t i = 0; i < size_; ++i) {
            data_[i].~T();
        }
        size_ = 0;
    }
    
    void pop_back() {
        if (size_ > 0) {
            --size_;
            data_[size_].~T();
        }
    }

private:
    void swap(Vector& other) noexcept {
        std::swap(data_, other.data_);
        std::swap(capacity_, other.capacity_);
        std::swap(size_, other.size_);
    }

    T* data_ = nullptr;
    size_t capacity_ = 0;
    size_t size_ = 0;
    
    static constexpr size_t startSize_ = 10;
    static constexpr double growthFactor_ = 1.6;
};

int main() {
    Vector<int> v;
    assert(v.size() == 0);
    
    for (int i = 0; i < 100; ++i) v.push_back(i);
    for (int i = 0; i < 100; ++i) assert(v[i] == i);
    assert(v.size() == 100);
    
    for (int i = 0; i < 50; ++i) {
        assert(v[0] == i);
        v.erase(0);
    }
    assert(v.size() == 50);
    
    for (int i = 50; i < 100; ++i) {
        assert(v[0] == i);
        v.erase(0);
    }
    assert(v.size() == 0);
    
    Vector<int> v2;
    v2.push_back(42);
    assert(v2.front() == 42);
    assert(v2.back() == 42);
    
    Vector<int> v3 = v2;
    assert(v3.size() == 1);
    assert(v3[0] == 42);
    
    return 0;
}