HW2: Programming Assignment

GitHub Actions

GitHub Actions will work very similarly to the last homework, but it will now run our auto-grader and provide a score on the auto-graded portion.

Please read the entire assignment through once before you start to perform any tasks.

Background

In this project you will write code to model a very simplified version of an online retail system such as Amazon. In this project you will read a database of products from certain categories as well as users and their associated information. Your program will then allow a user to interactively search for products based on certain keywords returning the products that match the search. From those matches, your program will allow a user to add select items to their “cart”, view items in their cart, purchase the items in their cart, and save the updated database of product and user information.

Important: In practice, reading and understanding others’ code is just as important as writing your own code. So in this project you will need to read and understand a good bit of provided code. Spend time understanding what it does and its structure.

One common need when reading large code bases (in the grand scheme this is not that large of a code base) is to find where classes or functions are defined that you see being called or used. Most editors have the ability to do this via some Find in files or Goto definition related feature(s). In Sublime if you open a folder (e.g. your pa2-username folder) you can right click on a class name or function name and choose Goto Definition to jump to where that class or function is declared/defined.

Another simple command line tool is grep. At the command prompt you can type:

grep "search phrase" file(s)

to find all occurrences of search phrase in the listed files (often replaced with the wildcard, *). Thus:

grep "Product" *

would output the lines of text from any file in the current directory that contained Product. An additional option is -n to show line numbers.

grep -n "Product" *

would include the line numbers in each file where the search phrase occurs.

The Data and Its Format

Your online retail system will sell products of various categories. All products (no matter the type) will have a:

Currently your system will support 3 categories of products and each category will supply additional fields of data as indicated below:

Note: ISBN, Author, Size, Brand, Genre, and Rating should all be string type data.

The program will also support a set of known users with a:

This information will be stored and can be accessed from a text database file.

<products>
product_category
name
price
quantity
category-specific-info
product_category
name
price
quantity
category-specific-info
...
</products>
<users>
username credit_amount type
username credit_amount type
...
</users>

An example is shown here of a sample file we will provide database.txt.

<products>
book
Data Abstraction & Problem Solving with C++
79.99
20
978-013292372-9
Carrano and Henry
book
Great Men and Women of Troy
19.50
5
978-000000000-1
Tommy Trojan
clothing
Men's Fitted Shirt
39.99
25
Medium
J. Crew
movie
Hidden Figures DVD
17.99
1
Drama
PG
</products>
<users>
aturing 100.00 0
johnvn 50.00
adal 120.00 1
</users>

Provided Code

Here is a list of the files in the codebase that we are providing. Please do not alter any of the files marked Complete! below.

Storage
Parsing
Utility code

Requirements

  1. Keywords - Your system should build an index mapping keywords to the set of products that match that keyword. A product should match a keyword if it appears in the product name or one of the following attributes below (dependent on specific type of product):
    • Books: the words in the author’s name should be searchable keywords as well as the book’s ISBN number
    • Clothing: the words in the brand should be searchable keywords
    • Movie: the movie’s genre should be a searchable keyword
    • For the product name, book author, and clothing brand we define a keyword to be any string of 2 or more characters. If such a word has punctuation it should be split at each punctuation character and the resulting substrings (of 2 or more characters) should be used as keywords. Here are some examples:
    • Men's should yield just a keyword of Men
    • J. would not yield any keyword since the remaining substring J is only 1 character
    • I'll would yield just ll since that substring is 2 or more characters (this is obviously a weird keyword but I'll isn’t a good search term anyway.) - For other keywords (book ISBN and movie genre) no punctuation or size analysis is necessary and it should be used verbatim as a keyword. Here is an example:
      • The ISBN 978-000000000-1 should be used exactly as is for the keyword entry - It is suggested you store your keywords in a common case so that searching is easy and case-insensitive
  2. AND/OR Search - Your system should allow users to search for products based on entering one or more keywords at the program menu prompt. An AND search should return all the products that contain ALL the search terms entered. An OR search is defined as all the products that contain ANY of the search terms entered. At the prompt the user will need to write AND or OR as their first word/command followed by any number of search terms separated by spaces. Your search should treat those terms as case-insensitive when it comes to matching. Examples might be: AND Men would be the same as OR Men since there is only 1 term and would return all products that have the word men. (i.e. the book Great Men and Women of Troy and Men's Fitted Shirt). AND hidden Data would return nothing since no products have both those terms OR hidden Data would return both the Hidden Figures DVD and Data Abstraction & Problem Solving with C++ products You may choose any reasonable behavior if the search consists only of AND or OR (no keywords)

  3. Your search must be implemented “efficiently”. You should not have to iterate over ALL products to find the appropriate matches. Some kind of mapping between keywords and products that match that keyword should be implemented.
  4. Hits - Results must be displayed to the user via the displayProducts(vector<Product*>& hits); function provided in amazon.cpp. Failure to use this function will result in LARGE deductions since it will make our testing much harder.
  5. Adding to Carts - You should support the notion of a “cart” for each user that they can add products to. Using the ADD username hit_result_index command should cause the product with index hit_result_index from the previous search result to be added to username’s cart (case insensitive). You must maintain the cart in FIFO (first-in, first-out) order though that doesn’t mean you HAVE TO use the C++ queue class. You don’t need to worry about removing products from a cart. If a product is added to a cart twice, treat them as separate items and store them in your cart twice (i.e. don’t try to store it once with a “quantity” of 2). This implies that each command of ADD adds 1 product to the CART. If the username or hit_result_index is either not provided, or invalid, print Invalid request to the screen and do not process the command. Note: The results from the last search should be retained until a new search is performed. Thus, the hits from one search can be referenced by many ADD commands.
  6. Viewing to Carts - You should support the VIEWCART username command which should print the products in username’s cart (case insensitive) at the current time. The items should be printed with some ascending index number so it is easy to tell how many items are in the cart. If the username provided is invalid, print Invalid username to the screen and do not process the command.
  7. Buying the cart - You should support the BUYCART username command which should cause the program to iterate through the items in username’s cart (case insensitive). If the item is in stock AND the user has enough money it should be removed from the cart, the in stock quantity reduced by 1, and the product price should be debited from the user’s available credit. If an item is not in stock or the user does not have enough credit, simply leave it in the cart and go on to the next product. Note: Your cart implementation must iterate through the products in the order they were added. If the username provided is invalid, print Invalid username to the screen and do not process the command.
  8. Quitting - You should support the QUIT filename command which should cause a new version of the database using the format described above to be saved to a file whose name is filename. It should represent the updated state of the database (i.e. changing product quantities and user credit) to reflect purchases. Note: Within the various sections, users and products may be written in any order (not necessarily matching the order of the input database file).

Here is an example run of the program

$ ./amazon database.txt

Read 4 products
Read 3 users
=====================================
Menu:
  AND term term ...
  OR term term ...
  ADD username search_hit_number
  VIEWCART username
  BUYCART username
  QUIT new_db_filename
====================================

Enter command:
OR hidden DATA
Hit   1
Data Abstraction & Problem Solving with C++
Author: Carrano and Henry ISBN: 978-013292372-9
79.99 20 left.

Hit   2
Hidden Figures DVD
Genre: Drama Rating: PG
17.99  1 left.


Enter command:
ADD johnvn 2

Enter command:
VIEWCART johnvn
Item 1
Hidden Figures DVD
Genre: Drama Rating: PG
17.99  1 left.


Enter command:
BUYCART johnvn

Enter command:
QUIT database.new

Here is the resulting output database. Notice the reduction in quantity of the Hidden Figures DVD and the credit for user johnvn

<products>
book
Data Abstraction & Problem Solving with C++
79.99
20
978-013292372-9
Carrano and Henry
book
Great Men and Women of Troy
19.5
5
978-000000000-1
Tommy Trojan
clothing
Men's Fitted Shirt
39.99
25
Medium
J. Crew
movie
Hidden Figures DVD
17.99
0
Drama
PG
</products>
<users>
aturing 100.00 0
johnvn 32.01 1
adal 120.00 1
</users>

Code Organization

Our code in amazon.cpp and db_parser.cpp can make calls via the DataStore interface by using a base class pointer/reference (DataStore* or DataStore&). It is this class where you will likely want to store products and users, via some kind of container object(s).

The parser in DBParser is complete but allows for extensions by “registering” certain “section parsers” and “product parsers”. Section parsers handle everything between <section>...</section> in the database file. We can create section parsers out in main() and register them with the DBParser which will maintain a map of section name to the given parser. When the parser encounters a particular section it will invoke the appropriate section parser. We have written two sections parsers: ProductSectionParser and UserSectionParser. These are complete.

Our product parsers will parse all the aspects of the specific category of product into the data members of the class and then call makeProduct(). It is here that you need to instantiate an appropriate product object and return a pointer to it. It will then be added to the data store object. You only need to fill in the code in the makeProduct() functions for each specific product parser and do not need to change any other code in product_parser.cpp other than adding appropriate #include statements.

Your Task (80%)

  1. Complete the parseStringToWords() in util.cpp according to the specification given above for taking a string of many words and splitting them into individual keywords (split at punctuation, with at least 2 character words)
  2. Complete the setIntersection and setUnion functions in util.h. These will help you later on to perform searching. These functions should run in O(n*log(n)) and NOT Θ(n^2). Note that these are templated functions operating on any generic set<T>. As a hint, to declare an iterator for set<T> you must precede the type with the keyword typename as in typename set<T>::iterator it. Another very important note about using iterators with C++ containers (e.g. vector, set, map ): if you are iterating over a container with iterators, you should NOT modify the contents as you iterate. Consider the scenario where you have an iterator to the beginning item of a vector, and in your loop you erase that element. Behind the scenes, the vector shifts all the data elements up a spot, moving the 1st element into the 0th location. When you increment the iterator you will now move to the next location in the vector skipping the 1st element (now in the 0-th location). Similar or more serious issues can arise when you insert items, etc. as you iterate.
  3. Write derived product classes: Book, Clothing and Movie implementing the keywords() which returns the appropriate keywords to index the product, displayString() [to create a string that contains the product info] and dump() [that outputs the database format of the product info]. We recommend trying to compile (NOT test, just compile) each of these files as you write them to avoid solving the same compile issue 3 times for each derived class. Remember you can easily compile by using the -c flag (e.g. $ g++ -g -Wall -std=c++11 -c book.cpp ). Each class should be written in its own .h and .cpp files (i.e. book.h, book.cpp, clothing.h, etc.)
  4. Complete each of the specialized product parser implementations of makeProduct() in product_parser.cpp to return a new specific Product for each category. Again we recommend ensuring this file can be compiled after you complete it.
  5. Implement a derived DataStore class called MyDataStore in mydatastore.h and mydatastore.cpp. It is here that you will implement the core functionality of your program: searching, adding products and users, saving the database, etc. For search, you can use the setIntersection and setUnion functions in util.h. This class is likely where you should store products and users in some fashion. Again we recommend compiling this file separately as well after you write the core functionality. You may need to add to it or modify it later as you work through other design aspects but make sure it can compile now even just using empty “dummy” function implementations. This derived class may define non-virtual functions to do other specific commands that the menu supports. It might be a good idea to have one member function in this class that corresponds to and performs each command from the menu options. You should not modify datastore.h.
  6. Complete amazon.cpp. It has a pretty good skeleton laid out for you to implement the user interface (text-based menu and command entry) and you only need to modify a few lines in the top area and then add the remaining menu option checks at the bottom. More specifically, you will need to:
    • Change the DataStore object instantiation to your derived type
    • Add checks for other menu input options, read their “arguments” and implement the desired behaviors.
    • You should not need to modify the parser calls at the top.
  7. Update the Makefile as needed. Remember we never compile .h files…those just get #included into the .cpp files that we actually compile.
  8. Be sure you have no memory leaks.
  9. You may NOT use any additional (we use a few in the code provided) algorithms from the <algorithm> library nor may you use the auto keyword.

Displaying Products

When you display the products, displayString() will be used to generate the information string. You should follow the format:

<name>
Author: <author> ISBN: <isbn>
<price> <quantity> left.
<name>
Size: <size> Brand: <brand>
<price> <quantity> left.
<name>
Genre: <genre> Rating: <rating>
<price> <quantity> left.

Test your Program Manually

We strongly recommend writing separate test drivers programs (i.e. separate .cpp files with a main()) that perform basic unit tests by calling various functions or instantiating your classes and invoking the various member functions. In this way you can have some confidence that the individual pieces work before you try to put them all together.

At the point where you need a database file to parse and act upon, you may use the database.txt file. Feel free to add products and users to the database.txt or, better, create your own database text file. Run the program and be sure to test various sequences of command that exercise the requirements described above.

Test your Program with our Test Suite

We provided a test suite that you can run locally or via GitHub Actions. To set this up locally, run the following on docker:

cd pa2-username
tar xvf hw2_tests.tar.gz
cd hw2_tests
cmake .

Now every time you want to test your program, do the following in hw2_tests:

make
cd amazon_tests
./amazon_tests

Finishing Up

Completion Checklist

Use git status in the pa2-username directory to make sure that there are no modified source code files that need to be submitted. If there are, use git add and git commit to commit those changes. Then use git push to push those changes to Github.

Ensure you add/commit/push all your source code files to your pa2-username repository.

WAIT! You aren’t done yet. Complete the sections below to ensure you’ve committed all your code.

GitHub Actions Summary

Commit then Re-clone your Repository (Optional)

If you want extra peace of mind on your submission or GitHub Actions isn’t working for some reason, try doing the following:

  1. In your terminal, cd to the folder that has your resources and pa2-username (i.e. csci104)
  2. Create a verify-hw2 directory: $ mkdir verify-hw2
  3. Go into that directory: $ cd verify-hw2
  4. Clone your hw_username repo: $ git clone git@github.com:usc-csci104-fall2024/pa2-username.git
  5. Go into your hw2 folder $ cd pa2-username
  6. Switch over to a docker shell, navigate to the same verify-hw2/pa2-username folder.
  7. Recompile and rerun your programs and tests to ensure that what you submitted works.