[text] VY CGRR 01 normalize databases Hey guys Takina and are here to teach you the fundamentals of database normalization which is an important for practical purposes such as reducing space wasted by redundant data decreasing the risk for logical errors upon data modification and making querying more efficient. Here were gonna introduce the first three normal forms in terms of some sample data from our very own Cafe LycoReco. Alright so lets look at this table of orders first. You can probably see it suffers from several obvious major issues. Orders Customer Order Shinji Yoshimatsu 2x Takinas Chocolate Parfait 1x Latte Robert Stuff Downey Jr. Cappuccino Robert Topala 2x Strawberry Smoothie 2x Takinas Chocolate Parfait Fuki Harukawa Latte 3x Cappuccino Shinji Yoshimatsu 2x Takinas Chocolate Parfait 1x Latte First there isnt a primary key which is illustrated here by the fact that duplicate records such as Shinjis can exist. Here we assume that the records are ordered chronologically but allowing order of records to be significant is highly invalid. To fix this we can add a timestamp column so the information in the table is no longer dependent on their ordering. There are some other problems that need to be addressed in order to put the database in first normal form. While we echnically dont have any mixed data types all orders are strings which is a good sign the format of the order column is highly inconsistent and subject to variation depending on what the person orders which makes it practically impossible to query. The data must be atomic indivisible in which at its current state it is not. One potential approach might be to split up the Order column into many columns where each item on the Cafe LycoReco menu would have its own column for quantity. However this is also infeasible since it uses a lot of unnecessary space and also would be quite awkward to query. Hence what we can do here is to split up each separate food order into its own record. Now that we don have any nonatomic data nor columns conveying redundantrepeated information anymore our database is A in first normal form. The three bolded fields make up the 4 primary key for the table. . Customer Timestamp Item Quantity Shinji Yoshimatsu 121347 PMTakinas Chocolate Parfait 2 Shinji Yoshimatsu 121347 PM Latte 1 Robert Stuff Downey Jr. 121519 PM Cappuccino 1 Robert Topala 22222 PM Strawberry Smoothie 2 Robert Topala 22222 PM Takinas Chocolate Parfait 2 Fuki Harukawa 23123 PM Latte 1 Fuki Harukawa 23123 PM Cappuccino 3 Shinji Yoshimatsu 55139 PMTakinas Chocolate Parfait 2 Shinji Yoshimatsu 55139 PM Latte 1 Suppose now we wanted to add an extra field to the database for ltem Type. One way to do this is simply adding it as a column to the Orders table Customer Timestamp Item ftem Type Quantity Shinji Yoshimatsu 121347 PM Takinas Chocolate Parfait Dessert 2 Shinji Yoshimatsu 121347 PM Latte Coffee 1 Robert Stuff Downey Jr. 121519 PM Cappuccino Coffee 1 Robert Topala 22222 PM Strawberry Smoothie Smoothie 1 Robert Topala 22222 PM Takinas Chocolate Parfait Dessert 1 Fuki Harukawa 23123 PM Latte Coffee 1 Fuki Harukawa 23123 PM Cappuccino Coffee 3 . Shinji Yoshimatsu 55138 PM Takinas Chocolate Parfait Dessert 2 Shinji Yoshimatsu 55139 PM Latte Coffee 1 . However this design is naive and is prone to anomalies . that may occur upon modifying the data. For instance if Robert Topala canceled his Strawberry Smoothie order we would no longer have access to the information ltem Type about Strawberry Smoothies. Also since Takina is notoriously bad at cooking if we hypothetically wanted to edit the ltem Type of her Chocolate Parfait from Dessert to Literal Shit wed have to do it for every order of her Chocolate Parfait which not only is error prone but also is unnecessarily inefficient just kidding Takina. Basically the reason these inconsistencies can happen is because the ltem Type field depends only on the ltem field of the 3field primary key. In order to prevent these anomalies and convert the database to 2nd normal form every nonkey field must depend on the entire primary key. A way we can fix this database is to simply create another table for the ltems Orders Customer Timestamp Item Quantity Shinji Yoshimatsu 121347 PM Takinas Chocolate Parfait 2 Shinji Yoshimatsu 121347 PM Latte 1 Robert Stuff Downey . 121519 PM Cappuccin 1 g A Robert Topala 22222 PM Strawberry Smoothie o ltems Robert Topala 22222 PM Takinas Chocolate Parfait 2 ltem Rem Type Fo Horuleans oo Pl Latte Takinas Chocoite Parf Dessert Fuki Harukawa 23123 PM Cappuccino g Lot Cofiee Cappuccino Cofles Shinji Yoshimatsu 55130 PM Takinas Chocolate Parfait 2 oy Smodthie Smoothie Shinji Yoshimatsu 55130 PM Latte 1 Now say we wanted to conduct a study on how healthy the foods that our customers are ordering are in which we could add Calories per Serving and Calorie Rating which could be low medium high as fields in the database. Initially well just add them to the ltems table like this Item Item Type Calories per serving Calorie Rating . Takinas Chocolate Parfait Dessert 263 High Latte Coffee 179 Medium Cappuccino Coffee 167 Medium Strawberry Smoothie Smoothie 83 Low However due to the Calorie Rating field depending on a nonkey field Calories per serving this design introduces another potential problem. Suppose if new tests showed that Strawberry Smoothies actually had 150 calories per serving and we need to update that value. Then well also have to manually update its Calorie Rating from Low to Medium long with that in which that extra step could easily be overlooked creating an inconsistency. In order to prevent this potential issue from occurring we should convert our database to third normal form which posits that no field can depend on any field other than primary key fields. Here this can be done by creating a new table that details the calorie cutoff values for each rating. ltems Table item ftem Type Calories per serving Takinas Chocolate Parfait Dessert 263 Latte Coffee 179 Cappuccino Coffee 167 Strawberry Smoothie Smoothie 83 Calorie Ratings Table Calorie Rating Cutoff Low 0 Medium 100 High 250 4 Very High 400 Oh shit it looks like Majimas causing some trouble again so as Lycoris agents we are mandated to shut that down as soon as possible. Hence we have to leave now. hope our tutorial on 1NF 2NF and 3NF was helpful and note that the first three normal forms should suffice for the majority of practic HOLY SHIT WHAT WAS THAT THEY . BROKE INTO THE CAFE OH SHIT . AAAAAAAAAAAAAAAA