courses:cs101-2021-02:lab3.html

<html> <h1 id=“lab-3-tables”>Lab 3: Tables</h1> <p>In this lab you’re going to be working with tables! The goal of this lab is to give you practice with:</p> <ul> <li>Extracting rows and columns from a table</li> <li>Writing and testing helper functions</li> <li>Filtering data with

filter-with

</li> <li>Writing nested functions</li> <li>Adding a column to a table</li> <li>Summarizing columns</li> </ul> <h2 id=“setup”>Setup</h2> <p><a href=“https://en.wikipedia.org/wiki/FiveThirtyEight”>FiveThirtyEight</a> conducted a survey in which tens of thousands of people were asked to choose between two candies; from the survey’s responses, they computed the winning percentage of each candy. Then, they compiled a data-set with

Number

attributes like the winning percentage, relative price, and sugar percentage, and also

Boolean

attributes such as chocolate, fruity, caramel, and hard. This <a href=“https://github.com/fivethirtyeight/data/blob/master/candy-power-ranking/candy-data.csv”>data</a> has been analyzed in a FiveThirtyEight article, <a href=“https://fivethirtyeight.com/videos/the-ultimate-halloween-candy-power-ranking/”>“The Ultimate Halloween Candy Power Ranking,”</a> which is definitely worth a read after the lab.</p> <p>In this lab, you’ll be looking at the <strong>relationships between these columns</strong>.</p> <p>First, to get started include this code at the top of your program:</p> <pre>

include shared-gdrive("dcic-2021",
  "1wyQZj_L0qqV9Ekgr9au6RX2iqt2Ga8Ep")
include gdrive-sheets

ssid = "1fJp5pbmutjQ9HlyUA8b0Yc-BqNxSd0i9wrZgZxpSCKM"
data-sheet = <span class="hljs-keyword">load</span>-spreadsheet(ssid)
candy-<span class="hljs-keyword">data</span> = 
  <span class="hljs-keyword">load</span>-<span class="hljs-keyword">table</span>: <span class="hljs-keyword">name</span>, chocolate, fruity, caramel, nutty, nougat, crisped-rice, 
    hard, bar, pluribus, sugar-<span class="hljs-keyword">percent</span>, price-<span class="hljs-keyword">percent</span>, win-<span class="hljs-keyword">percent</span>
    <span class="hljs-keyword">source</span>: <span class="hljs-keyword">data</span>-sheet.sheet-<span class="hljs-keyword">by</span>-<span class="hljs-keyword">name</span>(<span class="hljs-string">"candy-data"</span>, <span class="hljs-literal">true</span>)
  <span class="hljs-keyword">end</span>

</pre><p>This code loads a Pyret table from a Google Sheet. Press the “Run” button in the top right corner to process the code in the definitions window. Then type

candy-data

in the interactions window to see the data as a table.</p> <p>For this lab, you’ll want to refer to the <a href=“https://hackmd.io/@cs111/table”>Pyret Tables Documentation</a> and not the built-in Pyret documentation.</p> <h2 id=“part-1-filtering”>Part 1: Filtering</h2> <p>Let’s use the power of filtering to learn more from our candy data.</p> <h3 id=“1-1-sugar-rush”>1.1: Sugar Rush</h3> <p>We want to know which candies have the most sugar. Write an expression in the definitions window that produces a table containing only the candies where

sugar-percent &gt; 0.75

. To do this, you'll first need to write a function that takes a table

Row

as input and returns a

Boolean

that answers the question is the

sugar-percent

of that row greater than 75%. We've started the function for you. For this function you are just trying to determine if the input,

, has a value in the

sugar-percent

column that is greater than 75%. Be sure to test your function with a

where

block. For example, if you pass in the row

candy-data.row-n(0)

to your function, it should return

false

, and the inputting row

candy-data.row-n(4)

should cause your function to return

true

. As a reminder, the first row in the table has the index of

.</p> <pre>

<span class="hljs-function"><span class="hljs-keyword">fun</span> over-75-percent-<span class="hljs-title">sugar</span><span class="hljs-params">(r :: <span class="hljs-type">Row</span>)</span></span> -&gt; <span class="hljs-built_in">Boolean</span>:
  doc: <span class="hljs-string">"Returns true if the sugar-percent column in a row is over 75%"</span>
  ... # Replace the <span class="hljs-string">"..."</span> and <span class="hljs-keyword">this</span> comment with your code
end

</pre><p>Once you have written that function, we'll write an <strong>expression</strong> (not a function) to filter our table. Take a look at the <a href=“https://hackmd.io/@cs111/table#Creating-and-Extracting-Tables”>filter-with</a> function. The first input to the

filter-with

function is the

Table

to be filtered and second input is the name of the filter function, which is the function you just wrote above. That's right, as you hopefully remember, functions can be passed as inputs to other functions! Call over a coach if you want help using this structure. If you've got everything right, your expression will evaluate to a table with 15 rows in it. Hint: <em>your expression should be a call to

filter-with

with the proper inputs.</em></p> <p>Suggestion: When you’re done writing the expression, copy and paste this line</p> <pre>

<span class="hljs-meta">#-----------------------------------</span>

</pre><p>into the definitions window directly below it to separate it from the next part.</p> <h3 id=“1-2-bougie”>1.2: Bougie</h3> <p>Now that we know how to satisfy our sweet tooth, write an expression in the definitions window that produces a table containing only the candies where

price-percent

is greater than 90%. Be sure to test your predicate function with a

where

block. If you got your filter correct, you should have a table with 8 rows in it.</p> <p>Suggestion: Once again, separate it from the next part by pasting this line</p> <pre>

<span class="hljs-meta">#-----------------------------------</span>

</pre><p>on the line below it.</p> <h3 id=“1-3-chocolate”>1.3: Chocolate</h3> <p>How many of the candies have chocolate?</p> <p>Write an expression in the definitions window that outputs this number in the interactions window. You can get the length of a table by writing

&lt;table&gt;.length()

where

&lt;table&gt;

is the name of (or an expression that evaluates to) a table. If you got your filter correct, you'll find a table with 37 chocolate candies in it.</p> <p>Suggestion: Once again, separate it from the next part by pasting this line</p> <pre>

<span class="hljs-meta">#-----------------------------------</span>

</pre><p>on the line below it.</p> <h3 id=“1-4-chocolate-and-caramel”>1.4: Chocolate and Caramel</h3> <p>Of the candies that have chocolate, what proportion also have caramel?</p> <p>Write an expression in the definitions window that outputs this proportion in the interactions window. If you got the right answer, you'll see that 10/37 or about 27% of the chocolate candies also have caramel.</p> <p>Suggestion: Once again, separate it from the next part by pasting this line</p> <pre>

<span class="hljs-meta">#-----------------------------------</span>

</pre><p>on the line below it.</p> <h3 id=“1-5-chocolate-and-anything-optional-”>1.5: Chocolate and Anything (optional)</h3> <p>Next, we'd like to know what attribute is paired most frequently with chocolate. We could write filter functions like we did for part 1.4 for all the other possible pairings, but that would be very tedious and redundant.</p> <p>So instead, to help answer this question, let's write a function that generalizes the expression in 1.4. This function will take in a

String

representing the name of the ingredient being paired with chocolate and a

Table

that is only candies that have the chocolate attribute. This function returns a number which is the proportion of candies that have the attribute specified to the total number of chocolate candies.</p> <p>The function declaration should look something like this:</p> <pre>

<span class="hljs-function"><span class="hljs-keyword">fun</span> candy-with-<span class="hljs-title">attr</span><span class="hljs-params">(t :: <span class="hljs-type">Table</span>, attr :: <span class="hljs-type">String</span>)</span></span> -&gt; Number:

</pre><p>Look for commonalities across the code you wrote for the predicate functions in (1.3) and in (1.4). To do this you will need to use nested functions (a function defined inside of another function). This will allow the filter function to access the

attr

string, even though it is not being passed into the function.</p> <p>Compare the results of your function on the inputs “fruity,” “nutty,” and “caramel”. Hopefully your answer to “caramel” matches the answer you got in part 1.4! You don’t have to write code for this comparison – just use your new function to compute all three proportions and compare them manually.</p> <hr> <p><strong>Checkpoint</strong>: Call over a coach once you reach this point and talk over your code with them.</p> <hr> <h2 id=“part-2-building-columns-and-analyzing-them”>Part 2: Building columns and analyzing them</h2> <h3 id=“2-1-new-column”>2.1: New column</h3> <p>Build a column of Boolean values that indicates whether a candy is fruity and hard, but not a pluribus (multiple candies in a packet like Skittles or M&Ms). This should produce a new table with an added column. Write an expression in the definitions window that uses this new column to compute how many candies meet this condition.</p> <p>Hint: Take a look at the

build-column

function. Call over a coach if you want help using this.</p> <h3 id=“2-2-maximum”>2.2: Maximum</h3> <p>Of the candies for which this

Boolean

is true (fruity and hard, but not a pluribus), which has the highest winning percentage?</p> <p>Hint: This requires the use of the

order-by

function, in addition to the

row-n

and

filter-with

functions.</p> <p>An example of the

row-n

function to take the fifth row of a table:</p> <pre>

table.row-<span class="hljs-built_in">n</span>(<span class="hljs-comment">4</span>)

</pre><h3 id=“2-3-mean”>2.3: Mean</h3> <p>Of the candies for which this

Boolean

is true, what’s the average winning percentage?</p> <p>Hint: This requires the use of the

mean

function.</p> <h2 id=“takeaways”>Takeaways</h2> <p>This lab has mostly been about getting you comfortable working with tabular data and practicing some common operators on tables. It also gets you thinking about our course’s focus on data: what patterns of manipulating data do we often use in computations? How does the organization of our data impact our ability to answer these questions?</p> <p>Here, we see that filtering, ordering, and summarizing data are some of the key operations. So far we’ve only looked at these operations on tabular data, but these same building blocks will arise many times through this course. When you have a computation to perform around data, you should start by thinking through what combinations of filtering, sorting and summarizing will help you compute your answer.</p> <h2 id=“submitting-the-lab”>Submitting the Lab</h2> <p>When you've complete the exercises, show your code to your instructor or one of the coaches.</p> <p>Then, upload your

lab03.arr

file to the Lab 3 assignment on <a href=“https://www.gradescope.com”>Gradescope</a>.</p> <h2 id=“acknowledgments”>Acknowledgments</h2> <p>This lab includes material adapted from Kathi Fisler and colleagues at Brown University.</p> </html>