Getting Started with {phyf}

library(phyf)
#> 
#> Attaching package: 'phyf'
#> The following object is masked from 'package:stats':
#> 
#>     pf
library(ape)
# Simulate a random tree with 100 tips
test_tree <- rtree(100)
# Convert the phylo object to a pf object
tree_pf <- pf_as_pf(test_tree)
tree_pf
#> # A tibble: 198 × 3
#>    label is_tip phlo                                                            
#>    <chr> <lgl>  <pfc>                                                           
#>  1 t8    TRUE   ◎── 0.74──→ Node2 ── 0.30──→ No… ── 0.10──→ Node4 ── 0.32──→ t8 
#>  2 t69   TRUE   ◎── 0.74──→ Node2 ── 0.30──→ No…── 0.10──→ Node4 ── 0.13──→ t69 
#>  3 t77   TRUE   ◎── 0.74──→ Node2 ── 0.30──→ Node3 ── 0.92──→ t77               
#>  4 t42   TRUE   ◎──0.735──→ Node2 ──0.566──→ No…─0.524──→ Node11 ──0.227──→ t42 
#>  5 t94   TRUE   ◎──0.735──→ Node2 ──0.566──→ No…─0.524──→ Node11 ──0.885──→ t94 
#>  6 t31   TRUE   ◎──0.735──→ Node2 ──0.566──→ No…─0.077──→ Node10 ──0.698──→ t31 
#>  7 t65   TRUE   ◎──0.735──→ Node2 ──0.566──→ No…─0.340──→ Node13 ──0.980──→ t65 
#>  8 t100  TRUE   ◎──0.735──→ Node2 ──0.566──→ No…0.340──→ Node13 ──0.485──→ t100 
#>  9 t95   TRUE   ◎──0.735──→ Node2 ──0.566──→ No…─0.718──→ Node15 ──0.692──→ t95 
#> 10 t75   TRUE   ◎──0.735──→ Node2 ──0.566──→ No…─0.784──→ Node16 ──0.085──→ t75 
#> # ℹ 188 more rows

This tibble makes it easy to join data using the node_names, which comprise the tip labels from the phylogeny plus ‘NodeXX’ for internal nodes, where XX starts at 1 and goes to the total number of internal nodes. You can also join by node number using node_nums, which follows the traditional node ordering that the ape package uses in phylo objects. Usually you will only have data on the tips, so when joining to the pf object (using e.g. dplyr::left_join()), the internal node rows will receive NA values. This is the desired behaviour: the missing values are easy to drop when fitting a model, but the rows are useful later when making predictions (that is, ancestral state estimates).
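
As a rough illustration of such a join, the sketch below attaches a made-up tip-level trait to a pf object. It assumes the join key is the `label` column shown in the printed tibble above, and that `dplyr::left_join()` works on a pf object as it does on an ordinary tibble. The `tip_data` table and its `trait` column are hypothetical and exist only for this example.

```r
library(phyf)
library(ape)
library(dplyr)

set.seed(1)

# Simulate a tree and convert it to a pf object, as above
tree_pf <- pf_as_pf(rtree(100))

# Hypothetical trait data, available for the tips only.
# ape::rtree() labels its tips "t1", "t2", ..., "t100".
tip_data <- tibble(
  label = paste0("t", 1:100),
  trait = rnorm(100)
)

# Join on the assumed key column `label`; internal node rows
# have no match in tip_data and so receive NA for `trait`
joined <- left_join(tree_pf, tip_data, by = "label")

# Count the missing values separately for internal nodes and tips
sum(is.na(joined$trait[!joined$is_tip]))
sum(is.na(joined$trait[joined$is_tip]))

# Drop the rows with missing values before fitting a model
model_data <- joined[!is.na(joined$trait), ]
```

The full `joined` table, including the NA rows, is worth keeping around, since the internal node rows are where ancestral state estimates would later be placed.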

Here, we will use one of the built-in pf datasets in {phyf} to show off some of its features.

data("avonet")
avonet
#> # A tibble: 13,338 × 39
#>    label is_tip phlo            Species3 Family3 Order3 Total.individuals Female
#>    <chr> <lgl>  <pfc>           <chr>    <chr>   <chr>              <dbl>  <dbl>
#>  1 Rhea… TRUE   ◎── 24.…ricana  Rhea am… Rheidae Rheif…                 5      2
#>  2 Rhea… TRUE   ◎── 24.…ennata  Rhea pe… Rheidae Rheif…                 6      3
#>  3 Apte… TRUE   ◎── 24.…tralis  Apteryx… Aptery… Apter…                 6      2
#>  4 Apte… TRUE   ◎── 24.…ntelli  Apteryx… Aptery… Apter…                 4      2
#>  5 Apte… TRUE   ◎── 24.…owenii  Apteryx… Aptery… Apter…                 5      2
#>  6 Apte… TRUE   ◎── 24.…aastii  Apteryx… Aptery… Apter…                 9      6
#>  7 Drom… TRUE   ◎── 24.…andiae  Dromaiu… Dromai… Casua…                 5      2
#>  8 Casu… TRUE   ◎── 24.…uarius  Casuari… Casuar… Casua…                 7      2
#>  9 Casu… TRUE   ◎── 24.…nnetti  Casuari… Casuar… Casua…                 4      1
#> 10 Stru… TRUE   ◎──   2…amelus  Struthi… Struth… Strut…                 8      1
#> # ℹ 13,328 more rows
#> # ℹ 31 more variables: Male <dbl>, Unknown <dbl>, Complete.measures <dbl>,
#> #   Beak.Length_Culmen <dbl>, Beak.Length_Nares <dbl>, Beak.Width <dbl>,
#> #   Beak.Depth <dbl>, Tarsus.Length <dbl>, Wing.Length <dbl>,
#> #   Kipps.Distance <dbl>, Secondary1 <dbl>, `Hand-Wing.Index` <dbl>,
#> #   Tail.Length <dbl>, Mass <dbl>, Mass.Source <chr>, Mass.Refs.Other <chr>,
#> #   Inference <chr>, Traits.inferred <chr>, Reference.species <chr>, …

First, let’s try plotting the object.