
Why grouping fruit and vegies together in an interventional study is probably a bad idea. by pibara

![](https://cdn.steemitimages.com/DQmW7DZ4yip4NDNknYyocc279GKfAxNet1eAgbQu9o4EJyM/image.png)
In this blog post I want to look at nutrition groups. Specifically, I want to look, in an objective way, at the nutrition profile of fruit compared to other food groups. In nutrition studies, fruit is often grouped with vegetables, but is this actually a fair grouping? I want to use a public nutrition database and some basic Python pandas functionality to check whether it is justified.

# Getting the data

We start off by getting some nutrition info from https://fineli.fi/fineli/fi/avoin-data
The unpacked zip file contains a number of CSV files that we will load into pandas.



```python
%matplotlib inline
import math
import numpy as np
import pandas
import matplotlib.pyplot as plt

component_value = pandas.read_csv("component_value.csv", sep=';', decimal=',')
food =  pandas.read_csv("food.csv", sep=';', encoding='latin1')
foodname = pandas.read_csv("foodname_EN.csv", sep=';', encoding='latin1')
fuclass = pandas.read_csv("fuclass_EN.csv", sep=';')
component_value = component_value[component_value['EUFDNAME'].apply(lambda x: isinstance(x, (str)))]
eufdname = pandas.read_csv("eufdname_EN.csv", sep=';')
```

# Normalizing the data
The next step is to normalize the nutrient data, so that we can work with distances between normalized vectors from here on.
To do this, we take the mean and standard deviation of each nutrient across the nutrition database and use them to convert the nutrient values to z-values. We create a new data frame with foods as rows and normalized nutrients as columns. Nutrients that are missing for a food are filled in as 0, which is the mean on the z-scale.


```python
df = pandas.merge(left=food[["FOODID","FUCLASS"]], right=fuclass[["THSCODE", "DESCRIPT"]], \
                  how='left', left_on="FUCLASS", right_on="THSCODE")[["FOODID","DESCRIPT"]]
foodshort = foodname[["FOODID","FOODNAME"]]

df = pandas.merge(how='left', right=df, left=foodshort, left_on="FOODID", right_on="FOODID")

for comp in component_value["EUFDNAME"].unique():
    filtered = component_value[component_value["EUFDNAME"] == comp][["FOODID","BESTLOC"]]
    std = filtered.loc[:,"BESTLOC"].std(axis=0)
    mean = filtered.loc[:,"BESTLOC"].mean(axis=0)
    filtered[comp] = (filtered["BESTLOC"] - mean) / std
    filtered = filtered[["FOODID", comp]]
    df = pandas.merge(left=df,right=filtered, how='left', left_on='FOODID', right_on='FOODID')

df = df.fillna(0)

```
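
To make the normalization step a bit more concrete, here is a minimal sketch of the same z-score idea on a tiny made-up table. The food IDs, nutrient names and values are hypothetical and only mimic the layout of `component_value.csv`; it uses `groupby` instead of the explicit loop above, which is a stylistic choice on my part and should give the same kind of result.

```python
import pandas as pd

# Hypothetical long-format table mimicking component_value.csv:
# one row per (food, nutrient) pair. All values are made up.
toy = pd.DataFrame({
    "FOODID":   [1, 2, 3, 1, 2, 3],
    "EUFDNAME": ["SUGAR", "SUGAR", "SUGAR", "FAT", "FAT", "FAT"],
    "BESTLOC":  [10.0, 2.0, 0.0, 0.0, 1.0, 80.0],
})

# z-score per nutrient: subtract the nutrient's mean and divide by its
# sample standard deviation (the pandas default, ddof=1).
grouped = toy.groupby("EUFDNAME")["BESTLOC"]
toy["Z"] = (toy["BESTLOC"] - grouped.transform("mean")) / grouped.transform("std")

# Pivot to one row per food and one column per nutrient.
# Missing combinations would become NaN and are filled with 0 here.
wide = toy.pivot(index="FOODID", columns="EUFDNAME", values="Z").fillna(0)
print(wide)
```

After this step every nutrient column has mean 0 and standard deviation 1, so no single nutrient dominates the distance just because of its unit. Filling a missing value with 0 amounts to assuming the food contains an average amount of that nutrient.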

# Food groups
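Now that we have our normalized data, let's have a look at fruit as a group and see how that group compares to the other groups in our data set. For each group we take the mean nutrient vector and compute its Euclidean distance to the mean vector of the fruit group, expressed relative to the distance between fruit and vegetables.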


```python
fruit = df.loc[df['DESCRIPT'] == 'Fruits']
vegies = df.loc[df['DESCRIPT'] == 'Vegetables']

# Mean nutrient vector of the fruit group. numeric_only=True skips the text
# columns (pandas 2.x raises otherwise); [1:] drops the leading FOODID column.
fruit_mean = fruit.mean(numeric_only=True).values[1:]

# Distance between the vegetable and fruit means; used as the unit of distance.
reference_vectordistance = np.linalg.norm(vegies.mean(numeric_only=True).values[1:] - fruit_mean)

rowlist = []

for foodtype in df["DESCRIPT"].unique():
    if foodtype != 'Fruits':
        other = df.loc[df['DESCRIPT'] == foodtype]
        vectordistance = np.linalg.norm(other.mean(numeric_only=True).values[1:] - fruit_mean)
        # Keep only groups within roughly 10 reference distances of fruit.
        if vectordistance/reference_vectordistance < 10.0001:
            rowlist.append({"foodtype": foodtype,
                            "reldistance": vectordistance/reference_vectordistance})

peergroups = pandas.DataFrame(rowlist)

with pandas.option_context('display.max_rows', None, 'display.max_columns', None):
    print(peergroups.sort_values(by=['reldistance']))
```

                                        foodtype  reldistance
    106             Baby fruit and berry product     0.254584
    24                                    Juices     0.324371
    23                    Fruit and berry salads     0.419500
    88                          Vegetable salads     0.433896
    27                               Juice drink     0.499597
    89                     Fruit and berry soups     0.512050
    82    Fruit and berry dishes other than pies     0.572982
    67                              Other drinks     0.619280
    84                           Vegetable soups     0.652777
    107                   Baby vegetable product     0.655973
    65                     Soft drink with sugar     0.710715
    20                          Vegetable juices     0.716880
    95                               Pulse soups     0.729302
    83                             Potato dishes     0.742103
    86                         Cooked vegetables     0.744533
    90                          Vegetable sauces     0.753597
    87                          Vegetable dishes     0.764830
    109                           Baby fish dish     0.774588
    68                            Drinking water     0.778795
    97                                Meat soups     0.780833
    42                                   Yoghurt     0.805619
    93                              Pulse sauces     0.806330
    38                             Milks skimmed     0.809094
    108                           Baby meat dish     0.825446
    70                                  Porridge     0.829892
    116                              Sport drink     0.831736
    100                            Poultry soups     0.842000
    41                            Cultured milks     0.845855
    59                                    Coffee     0.846159
    60                                       Tea     0.850300
    15                           Cooked potatoes     0.889403
    85                              Mixed salads     0.897260
    94                              Pulse dishes     0.897811
    37                             Milks >2% fat     0.907533
    39                              Soured milks     0.912002
    81                             Milk desserts     0.920313
    44                             Milks <2% fat     0.928574
    69         Drinks with artificial sweeteners     0.935823
    112                             Seafood soup     0.936404
    79                               Milk sauces     0.999358
    17                                Vegetables     1.000000
    43                                     Quark     1.000320
    120                       Dietary supplement     1.013490
    5                             Savoury sauces     1.037593
    25                                   Berries     1.054466
    103                               Fish soups     1.064546
    16              Fried potatoes, French fries     1.106613
    91           Prepared salads with mayonnaise     1.109879
    78                                 Panncakes     1.132018
    101                           Poultry dishes     1.169911
    19                           Mushroom dishes     1.174364
    49                                 Ice cream     1.194230
    111  Seafood dishes, crustacean and molluscs     1.225429
    96                               Meat sauces     1.226042
    92                               Meat dishes     1.229997
    105                           Dessert sauces     1.237735
    40            Fermented milk products, other     1.259492
    102                           Poultry sauces     1.285769
    12                                      Rice     1.313129
    45                                     Cream     1.329619
    110                           Seafood sauces     1.334021
    118                                    Pizza     1.337806
    73                            Savoury bakery     1.343459
    10                                     Pasta     1.345270
    4                                 Condiments     1.378994
    75                    Sandwiches and burgers     1.403498
    74                              Sweet bakery     1.447861
    26                       Jams and marmalades     1.461037
    66                                    Ciders     1.513203
    104                              Fish sauces     1.523617
    77                                      Buns     1.609077
    80                                Egg dishes     1.611915
    98                               Fish dishes     1.682396
    76                               Wheat bread     1.689758
    99                                  Sausages     1.708877
    72                        Bread, mixed flour     1.730758
    21                                    Pulses     1.739442
    46           Cheese, unripened, fresh cheese     1.739644
    18                         Canned vegetables     1.781269
    117                      Cold cuts, sausages     1.807501
    57                  Crustaceans and molluscs     1.904892
    53                           Cold cuts, meat     1.998762
    71                                 Rye bread     2.126735
    114                         Savoury biscuits     2.146351
    11                            Sweet biscuits     2.227579
    3                  Miscellaneous ingredients     2.309356
    52                   Chicken and other birds     2.392425
    64                 Other alcoholic beverages     2.423097
    7                                Cereal bars     2.439470
    48                         Processed cheese      2.451552
    50                          Steaks and chops     2.459782
    13                         Breakfast cereals     2.524057
    8                                      Flour     2.544412
    22                            Pulse products     2.647546
    55                                      Fish     2.676576
    47                   Cheese, ripened cheese      2.809589
    14                            Savoury snacks     2.811795
    61                                     Beers     2.856793
    115                        Meal replacements     2.947267
    2                                  Chocolate     3.004876
    62                                     Wines     3.229641
    1                              Confectionery     3.240278
    34                    Blended spread  < 55 %     3.340992
    113           Infant formulas and human milk     3.342981
    36           Margarine and fat spread  < 55%     3.349092
    9               Nuts, seeds and dried fruits     3.505547
    119                               Sport food     3.511651
    51                             Meat products     3.523952
    58                                       Egg     3.555769
    56                             Fish products     3.588763
    33           Salad dressings and mayonnaises     3.628788
    6                                     Spices     4.116614
    29                    Blended spread >= 55 %     4.244458
    54                              Offal dishes     4.397559
    0                           Sugar and syrups     4.436681
    35           Margarine and fat spread >= 55%     4.517717
    28                          Butter, milk fat     5.055117
    30                Cooking and industrial fat     5.283595
    32                                Animal fat     5.471986
    31                                      Oils     7.149777
    63                                   Spirits     8.044268


Notice how sugar-sweetened beverages (SSBs, listed as "Soft drink with sugar") as a group are 29% closer, nutritionally, to fruit as a group than vegetables as a group are, at least according to our simple metric. Even drinking water and yoghurt are closer to fruit than vegetables are. This gives us little to justify grouping fruits with vegetables in nutrition studies.
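
For readers who want to see the metric in isolation, here is a small sketch of the relative group distance on made-up numbers. The group names are borrowed from the data set, but the two nutrient columns and all values are hypothetical, and the `groupby` formulation is my own shorthand for the loop used above.

```python
import numpy as np
import pandas as pd

# Hypothetical z-scored foods; the numbers are made up for illustration.
toy = pd.DataFrame({
    "DESCRIPT": ["Fruits", "Fruits", "Vegetables", "Vegetables", "Oils"],
    "SUGAR":    [1.0, 0.8, -0.5, -0.3, -0.6],
    "FAT":      [-0.4, -0.2, -0.3, -0.1, 5.0],
})

# Mean nutrient vector (centroid) of every food group.
centroids = toy.groupby("DESCRIPT").mean(numeric_only=True)

# Euclidean distance from each group centroid to the fruit centroid.
dist = np.sqrt(((centroids - centroids.loc["Fruits"]) ** 2).sum(axis=1))

# Express the distances relative to the fruit-vegetables distance,
# so that vegetables score exactly 1 by construction.
reldistance = (dist / dist["Vegetables"]).drop("Fruits").sort_values()
print(reldistance)
```

A value below 1 means the group's mean is closer to the fruit mean than the vegetable mean is; a value above 1 means it is farther away.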

And this is just the distance between the *means* of these food groups. Let's pick a single fruit and compare it to individual foods outside of the vegetables group.

# A Specific fruit

We looked at this for groups; now let's look at a specific fruit, one of my own favorites: a melon. And let's not look at SSBs, drinking water and yoghurt, but at foods that are generally thought of as unhealthy and that few people would think of comparing to a healthy piece of fruit. We take a look at McDonald's food and at chocolates and see how they compare to a melon.


```python
melon = df.loc[df['FOODNAME'] == 'HONEYDEW MELON, WITHOUT SKIN']
# Shift every nutrient column so that the melon sits at the origin.
# Note that this modifies df in place; the cells below rely on that.
for header in df.columns:
    if header not in ["FOODID","FOODNAME","DESCRIPT"]:
        df[header] = df[header] - melon[header].values[0]
rowlist = []
for index,row in df.iterrows():
    foodname = row.values[1]
    foodtype = row.values[2]
    vector = row.values[3:].astype(float)
    # Distance to the melon, in units of the fruit-vegetables group distance.
    distance = np.linalg.norm(vector)/reference_vectordistance
    if "MCDONALD" in foodname or foodtype == "Chocolate":
        rowlist.append({"distance": distance, "food": foodname})
peerfood = pandas.DataFrame(rowlist)
with pandas.option_context('display.max_rows', None, 'display.max_columns', None):
    print(peerfood.sort_values(by=['distance']))
```

        distance                                               food
    21  0.956963                     MILKSHAKE, VANILLA, MCDONALD'S
    23  1.515775                     HAMBURGER, MCFEAST, MCDONALD'S
    10  1.589669         HAMBURGER, BEEF AND WHEAT ROLL, MCDONALD'S
    11  1.647408               HAMBURGER, CHEESE BURGER, MCDONALD'S
    13  1.647526      HAMBURGER, DOUBLE BURGER, BIG MAC, MCDONALD'S
    12  1.728808              HAMBURGER, CHICKEN BURGER, MCDONALD'S
    22  1.986608        HAMBURGER, DOUBLE CHEESE BURGER, MCDONALD'S
    1   2.542666         CHOCOLATE CONFECTION FILLED WITH MARMALADE
    14  2.915553            CHOCOLATE BAR, CARAMEL AND COOKIE, TWIX
    24  2.915631         CHOCOLATE CONFECTION FILLED WITH CHOCOLATE
    18  2.978043  SUFFELI CHOCOLATE BAR, WAFFLE, TOFFEE FILLING ...
    7   3.174270                             CHOCOLATE BAR, LOW-FAT
    6   3.224222                CHOCOLATE BAR WITH FILLING, AVERAGE
    2   3.336511                             CHOCOLATE BAR, AVERAGE
    15  3.350807  SUFFELI PUFFI SNACKS,PUFFED CORN AND CHOCOLATE...
    3   3.354410                   CHOCOLATE, PLAIN, DARK CHOCOLATE
    8   3.409912                         CHOCOLATE, WHITE CHOCOLATE
    0   3.431798                                 CHOCOLATE, AVERAGE
    16  3.469495                               CHOCOLATE NUT SPREAD
    20  3.509146           CHOCOLATE, MILK CHOCOLATE WITH HAZELNUTS
    4   3.763362                          CHOCOLATE, MILK CHOCOLATE
    17  3.789385                               KINDER CHOCOLATE EGG
    9   3.863914                                     RICE CHOCOLATE
    19  4.175258              CHOCOLATE, PLAIN, DARK CHOCOLATE, 80%
    5   5.557062                  CHOCOLATE, ARTIFICIALLY SWEETENED


Notice that the milkshake, at a relative distance of 0.96, is closer to the melon than the vegetable group mean is to the fruit group mean. Now let us pick a few nice ones from this list: the milkshake, the double cheeseburger and the Twix candy bar, and see how individual vegetables compare to these:


```python
count1 = 0
count2 = 0
count3 = 0
tcount = 0
for index,row in df.iterrows():
    food = row.values[1]
    foodtype = row.values[2]
    vector = row.values[3:]
    distance = np.linalg.norm(vector)/reference_vectordistance
    if "Vegetables" == foodtype:
        tcount += 1
        if distance > 2.915553:
            count3 +=1
        if distance > 1.986608:
            count2 +=1
        if distance > 0.956963:
            count1 +=1
print("* A milkshake is nutritionally closer to a melon than", count1,"out of", tcount,"vegetables.")
print("* A double cheeseburger is nutritionally closer to a melon than", count2,"out of", tcount, "vegetables.")
print("* A Twix candy bar is nutritionally closer to a melon than", count3,"out of", tcount,"vegetables.")
```

    * A milkshake is nutritionally closer to a melon than 71 out of 103 vegetables.
    * A double cheeseburger is nutritionally closer to a melon than 26 out of 103 vegetables.
    * A Twix candy bar is nutritionally closer to a melon than 13 out of 103 vegetables.


Does it still make sense to you to run interventional studies that put vegetables and fruits in the same group? I would argue it doesn't.
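
The counting loop above uses distance thresholds copied by hand from the previous output. As a hedged alternative, here is a sketch that looks the thresholds up by food name instead. The table, the short food names, the group labels and the distances are all made up; only the counting logic mirrors what the loop does.

```python
import pandas as pd

# Hypothetical relative distances to the melon; all numbers are made up.
dist = pd.DataFrame({
    "FOODNAME": ["MILKSHAKE", "TWIX", "CARROT", "KALE", "CUCUMBER"],
    "DESCRIPT": ["Ice cream", "Chocolate", "Vegetables", "Vegetables", "Vegetables"],
    "distance": [0.9, 2.9, 0.7, 3.5, 1.2],
})

# Distances of the vegetables only, and a lookup of distance by food name.
veg = dist.loc[dist["DESCRIPT"] == "Vegetables", "distance"]
by_name = dist.set_index("FOODNAME")["distance"]

def vegetables_farther_than(name):
    """Count vegetables whose distance to the melon exceeds that of the named food."""
    return int((veg > by_name[name]).sum())

for name in ["MILKSHAKE", "TWIX"]:
    print("*", name, "is nutritionally closer to the melon than",
          vegetables_farther_than(name), "out of", len(veg), "vegetables.")
```

Looking the thresholds up this way keeps the counts in sync with the data if the database or the chosen fruit changes.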

But then, maybe you don't trust the normalized nutrition vector. Let's have a quick look at what the normalized nutrition values, taken relative to the melon, actually look like for broccoli, kale, a Twix and a McDonald's milkshake.


```python
compare = ["MILKSHAKE, VANILLA, MCDONALD'S", 'KALE', 'BROCCOLI', 'CHOCOLATE BAR, CARAMEL AND COOKIE, TWIX']
part = df.loc[df['FOODNAME'].isin(compare)]
part = part.set_index('FOODNAME').drop(['FOODID','DESCRIPT'], axis=1)
# One column per food, one row per nutrient code.
part = part.transpose().rename(columns={"CHOCOLATE BAR, CARAMEL AND COOKIE, TWIX": "TWIX",
                     "MILKSHAKE, VANILLA, MCDONALD'S": "MILKSHAKE"})
# Look up the readable nutrient descriptions by nutrient code.
names = eufdname.drop(['LANG'], axis=1).set_index("THSCODE")

pandas.merge(how='left', right=names, left=part, left_index=True, right_index=True).set_index("DESCRIPT")
```




<div>

<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>BROCCOLI</th>
      <th>KALE</th>
      <th>TWIX</th>
      <th>MILKSHAKE</th>
    </tr>
    <tr>
      <th>DESCRIPT</th>
      <th></th>
      <th></th>
      <th></th>
      <th></th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>energy,calculated</th>
      <td>-0.014435</td>
      <td>0.008692</td>
      <td>2.939874</td>
      <td>0.280767</td>
    </tr>
    <tr>
      <th>fat, total</th>
      <td>0.026593</td>
      <td>0.046368</td>
      <td>1.644287</td>
      <td>0.140263</td>
    </tr>
    <tr>
      <th>carbohydrate, available</th>
      <td>-0.296969</td>
      <td>-0.291666</td>
      <td>2.667469</td>
      <td>0.195103</td>
    </tr>
    <tr>
      <th>protein, total</th>
      <td>0.403420</td>
      <td>0.255376</td>
      <td>0.251675</td>
      <td>0.276349</td>
    </tr>
    <tr>
      <th>alcohol</th>
      <td>0.000000</td>
      <td>0.000000</td>
      <td>0.000000</td>
      <td>0.000000</td>
    </tr>
    <tr>
      <th>organic acids, total</th>
      <td>0.100330</td>
      <td>-0.328305</td>
      <td>-0.446863</td>
      <td>-0.122195</td>
    </tr>
    <tr>
      <th>sugar alcohols</th>
      <td>0.000000</td>
      <td>0.000000</td>
      <td>0.000000</td>
      <td>0.000000</td>
    </tr>
    <tr>
      <th>sugars, total</th>
      <td>-0.572030</td>
      <td>-0.562058</td>
      <td>4.221419</td>
      <td>0.366879</td>
    </tr>
    <tr>
      <th>fructose</th>
      <td>-0.299162</td>
      <td>-0.260447</td>
      <td>0.716580</td>
      <td>-0.457189</td>
    </tr>
    <tr>
      <th>galactose</th>
      <td>-0.131067</td>
      <td>-0.131067</td>
      <td>-0.156703</td>
      <td>2.268470</td>
    </tr>
    <tr>
      <th>glucose</th>
      <td>-0.277072</td>
      <td>-0.277072</td>
      <td>1.086415</td>
      <td>0.312071</td>
    </tr>
    <tr>
      <th>lactose</th>
      <td>0.000000</td>
      <td>0.000000</td>
      <td>0.495146</td>
      <td>0.566464</td>
    </tr>
    <tr>
      <th>maltose</th>
      <td>0.056397</td>
      <td>0.056397</td>
      <td>0.479377</td>
      <td>0.028199</td>
    </tr>
    <tr>
      <th>sucrose</th>
      <td>-0.539437</td>
      <td>-0.539437</td>
      <td>4.571218</td>
      <td>0.135884</td>
    </tr>
    <tr>
      <th>starch, total</th>
      <td>0.008880</td>
      <td>0.008880</td>
      <td>0.518603</td>
      <td>0.000000</td>
    </tr>
    <tr>
      <th>fibre, total</th>
      <td>0.590525</td>
      <td>1.290406</td>
      <td>0.171471</td>
      <td>-0.174970</td>
    </tr>
    <tr>
      <th>fibre, water-insoluble</th>
      <td>0.490057</td>
      <td>0.285866</td>
      <td>0.314453</td>
      <td>-0.163352</td>
    </tr>
    <tr>
      <th>polysaccharides, non-cellulosic, water-soluble</th>
      <td>0.572664</td>
      <td>0.572664</td>
      <td>0.155437</td>
      <td>-0.163618</td>
    </tr>
    <tr>
      <th>folate, total</th>
      <td>1.546457</td>
      <td>1.643241</td>
      <td>0.034257</td>
      <td>0.031612</td>
    </tr>
    <tr>
      <th>niacin equivalents, total</th>
      <td>0.319264</td>
      <td>0.188485</td>
      <td>0.116861</td>
      <td>0.136289</td>
    </tr>
    <tr>
      <th>niacin, preformed (nicotinic acid + nicotinamide)</th>
      <td>0.241677</td>
      <td>0.281957</td>
      <td>-0.039877</td>
      <td>-0.079753</td>
    </tr>
    <tr>
      <th>vitamers pyridoxine (hydrochloride)</th>
      <td>0.224716</td>
      <td>1.303355</td>
      <td>-0.134830</td>
      <td>-0.044943</td>
    </tr>
    <tr>
      <th>riboflavine</th>
      <td>0.657697</td>
      <td>1.176932</td>
      <td>0.169617</td>
      <td>0.636928</td>
    </tr>
    <tr>
      <th>thiamin (vitamin B1)</th>
      <td>0.369891</td>
      <td>0.475574</td>
      <td>-0.065524</td>
      <td>0.030120</td>
    </tr>
    <tr>
      <th>vitamin A  retinol activity equivalents</th>
      <td>0.080351</td>
      <td>0.734835</td>
      <td>0.025000</td>
      <td>0.015968</td>
    </tr>
    <tr>
      <th>carotenoids, total</th>
      <td>1.107892</td>
      <td>16.470065</td>
      <td>-0.005260</td>
      <td>-0.018312</td>
    </tr>
    <tr>
      <th>vitamin B-12 (cobalamin)</th>
      <td>0.000000</td>
      <td>0.000000</td>
      <td>0.011532</td>
      <td>0.087641</td>
    </tr>
    <tr>
      <th>vitamin C (ascorbic acid)</th>
      <td>2.999856</td>
      <td>3.037288</td>
      <td>-0.475378</td>
      <td>-0.419766</td>
    </tr>
    <tr>
      <th>vitamin D</th>
      <td>0.000000</td>
      <td>0.000000</td>
      <td>0.004403</td>
      <td>0.004403</td>
    </tr>
    <tr>
      <th>vitamin E  alphatocopherol</th>
      <td>0.244547</td>
      <td>1.993427</td>
      <td>0.420917</td>
      <td>0.007411</td>
    </tr>
    <tr>
      <th>vitamin K, total</th>
      <td>2.221502</td>
      <td>12.518243</td>
      <td>0.049457</td>
      <td>0.007459</td>
    </tr>
    <tr>
      <th>calcium</th>
      <td>0.095603</td>
      <td>1.051637</td>
      <td>0.113176</td>
      <td>0.487941</td>
    </tr>
    <tr>
      <th>iron, total</th>
      <td>0.207805</td>
      <td>0.221893</td>
      <td>0.202170</td>
      <td>-0.016906</td>
    </tr>
    <tr>
      <th>iodide (iodine)</th>
      <td>-0.018761</td>
      <td>-0.018761</td>
      <td>-0.017968</td>
      <td>-0.016768</td>
    </tr>
    <tr>
      <th>potassium</th>
      <td>0.469394</td>
      <td>0.678014</td>
      <td>-0.061256</td>
      <td>-0.120452</td>
    </tr>
    <tr>
      <th>magnesium</th>
      <td>0.197976</td>
      <td>0.395951</td>
      <td>0.327828</td>
      <td>0.044940</td>
    </tr>
    <tr>
      <th>salt</th>
      <td>-0.012421</td>
      <td>-0.018237</td>
      <td>0.108501</td>
      <td>0.012049</td>
    </tr>
    <tr>
      <th>phosphorus</th>
      <td>0.361836</td>
      <td>0.200401</td>
      <td>0.228235</td>
      <td>0.406369</td>
    </tr>
    <tr>
      <th>selenium, total</th>
      <td>0.177199</td>
      <td>0.177199</td>
      <td>0.027749</td>
      <td>0.046444</td>
    </tr>
    <tr>
      <th>zinc</th>
      <td>0.390879</td>
      <td>0.188201</td>
      <td>0.319218</td>
      <td>0.290988</td>
    </tr>
    <tr>
      <th>fatty acids, total</th>
      <td>0.013528</td>
      <td>0.036739</td>
      <td>1.618932</td>
      <td>0.141260</td>
    </tr>
    <tr>
      <th>fatty acids, total polyunsaturated</th>
      <td>0.036390</td>
      <td>0.107449</td>
      <td>0.432253</td>
      <td>-0.001229</td>
    </tr>
    <tr>
      <th>fatty acids, total monounsaturated cis</th>
      <td>0.002320</td>
      <td>0.003099</td>
      <td>1.345681</td>
      <td>0.080810</td>
    </tr>
    <tr>
      <th>fatty acids, total saturated</th>
      <td>0.004420</td>
      <td>0.009818</td>
      <td>2.052560</td>
      <td>0.233120</td>
    </tr>
    <tr>
      <th>fatty acids, total trans</th>
      <td>0.000000</td>
      <td>0.000000</td>
      <td>0.241665</td>
      <td>0.147224</td>
    </tr>
    <tr>
      <th>fatty acids, total n-3 polyunsaturated</th>
      <td>0.073295</td>
      <td>0.236547</td>
      <td>0.023455</td>
      <td>-0.017662</td>
    </tr>
    <tr>
      <th>fatty acids, total n-6 polyunsaturated</th>
      <td>0.009726</td>
      <td>0.033363</td>
      <td>0.535066</td>
      <td>0.002518</td>
    </tr>
    <tr>
      <th>fatty acid 18:2 cis,cis n-6 (linoleic acid)</th>
      <td>0.010242</td>
      <td>0.035108</td>
      <td>0.560131</td>
      <td>0.001131</td>
    </tr>
    <tr>
      <th>fatty acid 18:3 n-3 (alpha-linolenic acid)</th>
      <td>0.077831</td>
      <td>0.251260</td>
      <td>0.024947</td>
      <td>-0.018711</td>
    </tr>
    <tr>
      <th>fatty acid 20:5 n-3 (EPA)</th>
      <td>0.000000</td>
      <td>0.000000</td>
      <td>0.000000</td>
      <td>0.000000</td>
    </tr>
    <tr>
      <th>fatty acid 22:6 n-3 (DHA)</th>
      <td>0.000000</td>
      <td>0.000000</td>
      <td>0.000000</td>
      <td>0.000000</td>
    </tr>
    <tr>
      <th>cholesterol (GC)</th>
      <td>0.000000</td>
      <td>0.000000</td>
      <td>0.107896</td>
      <td>0.071424</td>
    </tr>
    <tr>
      <th>sterols, total</th>
      <td>0.197817</td>
      <td>0.039677</td>
      <td>0.176732</td>
      <td>-0.010203</td>
    </tr>
    <tr>
      <th>tryptophan</th>
      <td>0.256079</td>
      <td>0.843467</td>
      <td>0.256079</td>
      <td>0.342292</td>
    </tr>
  </tbody>
</table>
</div>



I hope the simple analysis above shows and justifies my stance that grouping fruits and vegetables together in an interventional study is a horrible idea. Whatever the outcome, it will say very little about either fruit or vegetables.

Note that in this analysis I didn't put any weight on any of the nutrients, other than what the data set did by grouping or not grouping nutrients together. Also, the comparison is made per unit of weight. The results per calorie differ in the details but lead to the same conclusion: a different set of groups turns up as closer to fruit than vegetables are, but fruits and vegetables still turn out to be very different, and more different than many other obviously unrelated food groups in this data set. As I didn't want to make this blog post longer than it already is, I omitted the per-kcal variant.
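
For anyone who wants to try the per-kcal variant themselves, the following is a rough sketch of how the rescaling could look. It is not the code I ran: the column names (`ENERC`, `SUGAR`, `FIBRE`), the three foods and all numbers are hypothetical placeholders, and it assumes energy is already expressed in kcal per 100 g.

```python
import pandas as pd

# Hypothetical per-100 g values; ENERC stands in for energy in kcal per 100 g.
per_100g = pd.DataFrame(
    {"ENERC": [50.0, 30.0, 500.0],
     "SUGAR": [10.0, 2.0, 50.0],
     "FIBRE": [2.0, 3.0, 1.0]},
    index=["fruit", "vegetable", "candy bar"],
)

# Divide every nutrient by the food's energy content to get amounts per kcal.
per_kcal = per_100g.drop(columns="ENERC").div(per_100g["ENERC"], axis=0)

# z-score each nutrient column, exactly as in the per-weight analysis.
z = (per_kcal - per_kcal.mean()) / per_kcal.std()
print(z)
```

Foods with zero energy, such as drinking water, would need to be excluded or handled separately before dividing, since the per-kcal amounts are undefined for them.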

As you might have noticed, I am more comfortable with data than I am with biochemistry, so there might be major issues with analyzing the distance between different foods in the way that I did above. I'm here to learn, so if there are fundamental flaws with this way of looking at the data, please drop me a comment, or let me know on [Twitter](https://twitter.com/EngineerDiet).
Posted by @pibara on 2019-10-15 08:47:00.
@joshman ·
Interesting stuff. Data science FTW!
@agmoore2 ·
Wow! That is an impressive collection of data.  I haven't had a chance to actually analyze your findings, but these are the kinds of lists that invite attention (mine, anyway).  Compliments on this hard work and for looking at an accepted idea and challenging it with science.
Respect!