Git Product home page Git Product logo

301-a4's Introduction

Review Assignment Due Date [comment]: <> (Do not remove this!) Points badge

Note: It will take 2-5 minutes for the points to update after you push.

Original assignment by Varick Erickson. Copyright (c) 2021 Varick Erickson.

Deliverables

Deliverable Points
Add/Contains 5
Remove 5
Assignment/MakeEmpty 5
Iterator 5
Overload Add/Remove 5
Union 5
Difference 5
Intersection 5
Rehash 5
Questions 15
Commits 5
Commenting 5
Total 70

The same standards for committing and commenting apply for this assignment as previous assignments. You should have pre/post comments in the header files at minimum as well as general comments. You should be committing and pushing regularly.

Questions (15 points)

  1. (1 pt/entry) Complete the following tables of Big-Oh.

Assume the set has n items and n buckets.

Function Big-Oh (average)
Add O(1)
Remove O(n)
Contains O(n)
  1. (1 pt/entry) Complete the following tables of Big-Oh.

For the following table, assume set A has n items and set B has m items. You can also assume that A has n buckets and B has m buckets. You may also assume n > m. Your answers should be written in terms of m and n. (Note there is no difference between average, worst, and best for these functions.)

Function Big-Oh
Union O(n+m)
Intersection O(min(n,m))
Difference O(n)
Assignment O(n)
  1. Suppose you used the following function for GetHashIndex().
int GetHashIndex(const T& elem)
{
    return 0;
}

How would this affect the Big-oh for the following functions?

Function Big-Oh (average)
Add O(n)
Remove O(n)
Contains O(n)
  1. (1pts) Suppose you have n items and k*n buckets. What would be the load factor in this case?

    Load Factor is still numElems / numBuckets but with n items and kn buckets, the Load Factor would be = n / (kn) = 1/k

  2. (4pts) What would be the optimal number of buckets to have with respect to runtime if you have n items?

    The optimal # of buckets would be to have 1.5 or 2 times the # of buckets as a general rule of thumb. There is no fixed optimal number of buckets for a hash table with respect to runtime when you have n items. It depends on the specific requirements and characteristics of the application, and it may require experimentation and tuning to find the optimal number of buckets for a particular use case.

Recommended Implementation Order

This is the recommend implementation order for the SetType class.

  1. SetType()/SetType(int numBucks)
  2. Size()
  3. Add(T elem)
  4. Contains(T elem)
  5. Remove(T elem)
  6. MakeEmpty()
  7. SetType& operator=(const SetType& other)
  8. SetType& operator+(T elem)
  9. SetType& operator-(T elem)
  10. SetType& operator+(T SetType&)
  11. SetType& operator-(T SetType&)
  12. void ResetIterator()
  13. T GetNextItem()
  14. Rehash(int newNumBuckets)/LoadFactor()/SetMaxLoad(double max)

SetType

For this assignment, you will be implementing a set using an array of link lists and a hash function to figure out what "bucket" and element should go into. The set is a template and can store any type. The following is a diagram of the data structure (in this instance the T for the forward list happens to be a char):

img.png

The set stores the elements using an array of singly linked list. Rather than create our own singly linked list, we will use the STL (standard template library) forward_list. The next section describes how to use a forward_list and some examples that will help with your implementation.

The STL forward_list

In industry, you will rarely implement data structures completely from scratch. For common data structures, there are many implementations already readily available. When programming in C++, the most common libraries you will use are from the Standard Template Library (STL). This library contains implementations of many common data structures such as trees, sets, link lists, etc.

The STL forward_list implements a basic singly linked list data structure. This section describes several functions from the forward_list that you will need in order to implement your SetType data structure. The forward_list_examples.cpp file shows some usage of this data structure.

Pushing an element on a forward_list

To add en element to the front of a forward_list, you can use the push_front function.

forward_list<int> mylist;  // mylist is a singly linked list of integers

mylist.push_front(42);  // This adds 42 to the front of myList

Removing elements from a forward_list

To remove en element from a forward_list, you can use the remove function.

mylist.remove(42);  // This removes 42 from mylist

Iterating over a forward_list

Iterating over a forward_list is a bit different than what you are used to. The STL libraries like the forward_list make use of iterator objects. The iterator object points to specific element in a data structure. In previous projects, we created a "ResetIterator" function to have the iterator start at the beginning of the data structure. For the forward_list, we create an iterator object using the forward_list that starts at the beginning of the list:

forward_list<T>::iterator it; // Creates an iterator object

it = mylist.begin();          // Initialize the iterator object to the 
                              // beginning of mylist

If we want to examine the current element that the iterator points to, then we dereference the iterator object:

forward_list<T>::iterator it; // Creates an iterator object called "it"

it = mylist.begin();          // Initialize the iterator object to the 
                              // beginning of mylist

// Notice we dereference the iterator object to access the iterator value.
cout << "This is the first item: " << *it << endl;   

If we want to get the next item in the iterator, then we use the pre or post incrementer operator:

forward_list<T>::iterator it; // Creates an iterator object

it = mylist.begin();          // Initialize the iterator object to the 
                              // beginning of mylist
                              
cout << "This is the first item: " << *it << endl;
it++;   // Move to the next item in iterator

cout << "This is the second item: " << *it << endl;

A common construction for iterating through the entire forward_list would be to use a for loop:

for (it = mylist.begin(); it != mylist.end(); ++it )
    cout << ' ' << *it;
cout << '\n';

Notice, we don't use a size or length to control the loop. In fact, the forward_list class does not have size function as part of the class. We instead check to see if the iterator object is at the end of the data structure using the member function end() of the forward_list object.

Clearing a forward_list

To clear a forward_list, you simply call the function clear. This will remove all the elements from the list.

mylist.clear();     //  Removes all items from the forward_list

SetType Descriptions

Basic_SetType_Test

The Basic_SetType_Test can be used for development. It will not be used for grading. This program does several very basic tests of SetType. It is useful starting point for testing your SetType. Feel free to modify this file as part of your testing.

Note: This program is a CMake Application. It will not show up as a Catch Application.

HINT: It is a very very very very good idea to start with this before running the SetType_Tests. If Basic_SetType_Test doesn't work properly, then the SetType_Tests won't work properly.

SetType() Default Constructor

The default constructor should use the default number of buckets and set the number of elements to 0. This should also set the maxLoad to the default load factor.

SetType(SetType& other) Copy Constructor

This function should call copySet to do a deep copy into this.

SetType(numBucks)

This constructor should uses the numBucks to create the buckets and set the number of elements to 0. This should also set the maxLoad to the default load factor.

Size()

This function returns the total number of elements in the set.

Add(T elem)

This function should add elem to the set. If elem already exists in the list, then the set should not change. This function should use the GetHashIndex to figure out what bucket elem should be placed and the elem should be pushed to the front of the forward_list.

The following is some code that shows how to find a bucket index.

// How to figure out the bucket an element should go into
int bucket = GetHashIndex(elem);

If the load factor is above the maxLoad, then the number of buckets needs to be doubled and Rehash should be called.

HINT: I recommend waiting to add the Rehash call until later. Don't forget about book keeping.

Remove(T elem)

This function should remove elem to the set. If elem does not exist in the list, then the set should not change. This function should use the GetHashIndex to figure out what bucket elem exists in and remove elem from the forward list.

Contains(T elem)

This function should return true if elem is in the set. You should find the bucket that elem is in and the search for that item within the bucket.

HINT: Be sure to look at how to iterate through a forward_list in the provided examples.

void MakeEmpty()

The MakeEmpty function should remove all elements from the set. This is done by clearing each bucket.

Hint: Loop over the buckets.

SetType operator+(T elem)

This is an alternative way to add an element to a set. However, this overload does not change the original calling set.

SetType<char> setA;
SetType<char> setB;

// setA will contain {a, b, c, d}
for (char c = 'a'; c < 'e'; c++) {
   setA.Add(c);
}

setB = setA + 'e';  // setB will contain {a, b, c, d, e}
                    // setA will still contain {a, b, c, d}

HINT: You can use the Add function and the assignment operator to help implement this overload.

SetType operator-(T elem)

This is an alternative way to remove an element to a set. However, this overload does not change the original calling set.

SetType<char> setA;
SetType<char> setB;

// setA will contain {a, b, c, d}
for (char c = 'a'; c < 'e'; c++) {
   setA.Add(c);
}

setB = setA - 'a';  // setB will contain {b, c, d}
                    // setA will still contain {a, b, c, d}

HINT: You can use the Remove function and the assignment operator to help implement this overload.

SetType operator+(SetType& otherSet)

This function returns the union of this and otherSet. This overload does not change the original instance or otherSet.

SetType<char> setA;
SetType<char> setB;
SetType<char> unionAB;

// setA will contain {a, b, c, d}
for (char c = 'a'; c < 'e'; c++) {
   setA.Add(c);
}

// setB will contain {e, f, g, h}
for (char c = 'e'; c < 'i'; c++) {
    setB.Add(c);
}

unionAB = setA + setB;  // unionAB will contain {a, b, c, d, e, f, g, h}
                        // Note that setA and setB will be unchanged.

HINT: A very inefficient solution will use multiple nested for loops, which leads to a runtime of O(n2).

Your implementation should NOT use a strategy that requires you to loop through all the elements of BOTH sets:

// Not ok...
for each bucket of this {
   for each item of this {
      for each bucket of otherSet {
         for each item of otherSet {
            if "item from this set" == "item from other set"
                add item to result
         }
      }
   }
}

// Also not ok...
for each item in this {
  for each item in otherSet { 
      if "item from this set" == "item from other set"
          add item to result
  }
}

SetType operator-(SetType& otherSet)

This function returns the difference of this and otherSet. This overload does not change the original instance or otherSet.

Order matters for difference

Unlike union, order matters. If you have sets A and B, A - B will produce a different result than B - A.

Example:
Let A = {a, b, c, d} and B = {b, d, e}. Then

A – B = {a, c}

and

B – A = {e}.

Example: Let G = {a, b, c} and H = {a, b, c}. Then G – H = {}.

SetType<char> setA;
SetType<char> setB;
SetType<char> diffAB;
SetType<char> diffBA;

// setA will contain {a, b, c, d}
for (char c = 'a'; c < 'e'; c++) {
   setA.Add(c);
}

// setB will contain {b, d, e}
setB.Add('b');
setB.Add('d');
setB.Add('e');


diffAB = setA - setB;  // diffAB will contain {a, c}
diffBA = setB - setA;  // diffBA will contain {e}

// Note that setA and setB will be unchanged.

SetType operator*(SetType& otherSet)

This function returns the intersection of this and otherSet. This overload does not change the original instance or otherSet.

SetType<char> setA;
SetType<char> setB;
SetType<char> intersectAB;

// setA will contain {a, b, c, d}
for (char c = 'a'; c < 'e'; c++) {
   setA.Add(c);
}

// setB will contain {b, d, e}
setB.Add('b');
setB.Add('d');
setB.Add('e');

intersectAB = setA*setB;  // intersectAB will contain {b, d}

// Note that setA and setB will be unchanged.

SetType& operator=(SetType const& otherSet)

This function should call copySet to do a deep copy into this.

void copySet(SetType const& otherSet)

This function does a deep copy into this. The function should copy the contents of each bucket from otherSet into the buckets of this. The function should also copy over the appropriate private variables from otherSet to this. Since the otherSet is passed in as a const, you will not be able to use otherSet.ResetIterator() or otherSet.GetNextItem().
You will need to loop through each bucket and all the items within each bucket.

HINT: Be sure that the number of buckets match. You will need to delete buckets and reallocate the correct number of buckets similar how you allocated the buckets for the constructor. Be also sure to make this empty before copying the elements from otherSet.

double LoadFactor()

This function returns the load factor of the set. This is total number of elements in the set divided by the total number of buckets.

void Rehash(int newNumBuckets)

This function rehashes the elements stored in the set using the newNumBuckets.

A simple strategy is to start with an empty set with the appropriate number of buckets (let's call this rehashSet). After that, you can use the iterator to add all the elements from this to the rehashSet. Once all the elements are copied into the rehashSet, you can use the assignment operator with *this and rehashSet.

Variables used for the SetType iterator

Two variables are used to keep track of the current element that should be returned next; currBucket and bucketIter. The currBucket is the index of the bucket that is currently being iterated over. The bucketIter is the iterator of the bucket currently being iterated over. A third variable called iterCount is used to determine how many iterations have occurred.
This variable is used to throw an error if we try to get the next item when there is no more items to iterate over.

void ResetIterator()

This function resets the iterator for the set. It behaves similarly to the other data structures covered previously. This function should set bucketIter to the beginning of the first bucket. It should also set currBucket and iterCount appropriately.

T GetNextItem()

This function should return the element that the bucketIter currently points to and then increment the bucketIter. If there are no more elements in the bucketIter (it is at the end), then the currBucket should be incremented and bucketIter should be set to the beginning of the next bucket.
If GetNextItem() is called and there are no more elements to iterate over, then it should throw a IteratorOutOfBounds error.

Here is a rough algorithm:

If there are no more elements (use iterCount)
   throw a SetError
   
// Find the first bucket that actually has something in it   
while bucketIter is at end
   set bucketIter to the next bucket and increment currBucket
   
Increment bucketIter
Increment iterCount

return 

HINT: You should be using end() to check if you are at the end of a bucket.

301-a4's People

Contributors

marc026 avatar github-classroom[bot] avatar akshithcsueb avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.