# Arrays

Arrays are ordered collections of things. Order is important, because later we'll talk about hashes and bags that are unordered. Arrays have lots of handy functions. Let's explore.

# Length, Order

How long is that array? Here's a list of the main dogs in my life from when I was a kid to now:

``````
> my @dogs = <Chaps Patton Bowzer Logan Lulu Patch>
[Chaps Patton Bowzer Logan Lulu Patch]
> @dogs.elems
6
> +@dogs
6
> @dogs.Int
6
> @dogs.Numeric
6
``````

The `elems` method returns the number of elements and is the most obvious way to find it. Putting a `+` plus sign in front coerces the array into a numerical context, which returns the length of the array, as does calling the `Int` and `Numeric` methods.
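Numeric context also applies implicitly when an array appears in arithmetic or a numeric comparison. Here is a minimal sketch of that; the `end` method (the index of the last element) is not used elsewhere in this chapter and is included only for illustration:

```perl6
my @dogs = <Chaps Patton Bowzer Logan Lulu Patch>;

# In numeric context an array behaves like its element count.
say @dogs + 1;      # 7
say @dogs > 5;      # True

# end gives the index of the last element, which is one less than elems.
say @dogs.end;      # 5
```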

If I want my dogs going from current to first, I can `reverse` them:

``````
> @dogs.reverse
[Patch Lulu Logan Bowzer Patton Chaps]
``````

I can sort them either by their name:

``````
> @dogs.sort
(Bowzer Chaps Logan Lulu Patch Patton)
``````

Or I can pass a "unary" operation (one that takes a single argument) and sort by its result, such as the number of characters in the name (i.e., the length of the string):

``````
> @dogs.sort(*.chars)
(Lulu Chaps Logan Patch Patton Bowzer)
> @dogs.sort(*.chars).reverse
(Bowzer Patton Patch Logan Chaps Lulu)
``````

I can find all possible pairs of dogs:

``````
> @dogs.combinations(2)
((Chaps Patton) (Chaps Bowzer) (Chaps Logan) (Chaps Lulu) (Chaps Patch) (Patton Bowzer) (Patton Logan) (Patton Lulu) (Patton Patch) (Bowzer Logan) (Bowzer Lulu) (Bowzer Patch) (Logan Lulu) (Logan Patch) (Lulu Patch))
``````

I can find just the length of each dog's name by using `map` to apply the `chars` function to each element:

``````
> @dogs.map(*.chars)
(5 6 6 5 4 5)
> @dogs.map(&chars)
(5 6 6 5 4 5)
``````

The first version is using the `chars` method called on each list object while the second version is applying the `chars` function to each element (https://docs.perl6.org/routine/chars). The leading ampersand `&` is passing the `chars` function as a reference. Note that you cannot call it like so:

``````
> @dogs.map(chars)
===SORRY!=== Error while compiling:
Calling chars() will never work with proto signature ($)
------> @dogs.map(⏏chars)
``````

The `map` function is something I'd really like you to understand, so let's break it down a bit. I can get the same answer (sort of) with a `for` loop:

``````
> for @dogs -> $dog { say $dog.chars }
5
6
6
5
4
5
``````

And I can capture the numbers by using `do`:

``````
> my @chars = do for @dogs -> $dog { $dog.chars }
[5 6 6 5 4 5]
``````

Or, more briefly:

``````
> my @chars = do for @dogs { .chars }
[5 6 6 5 4 5]
``````

So the `map` function just turns that around a bit:

``````
> my @chars = map { .chars }, @dogs
[5 6 6 5 4 5]
``````

And the `map` method (of an Array) turns that around again to look more like the `for` version:

``````
> my @chars = @dogs.map({ .chars })
[5 6 6 5 4 5]
> my @chars = @dogs.map(*.chars)
[5 6 6 5 4 5]
> my @chars = @dogs.map: *.chars
[5 6 6 5 4 5]
``````

You see there is more than one way to write a map. We'll break this down later.
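To check that these spellings really are interchangeable, here is a small sketch that builds the list of lengths with an explicit loop and with three forms of `map`, then compares the results. The variable names are mine, chosen for illustration:

```perl6
my @dogs = <Chaps Patton Bowzer Logan Lulu Patch>;

# Explicit loop: push each name's length onto a result array.
my @by-loop;
for @dogs -> $dog { @by-loop.push($dog.chars) }

# The same transformation written three ways with map.
my @by-sub    = map { .chars }, @dogs;    # map as a subroutine
my @by-method = @dogs.map({ .chars });    # map as a method taking a block
my @by-star   = @dogs.map(*.chars);       # map as a method taking a WhateverCode

say @by-loop;                     # [5 6 6 5 4 5]
say @by-loop eqv @by-sub;         # True
say @by-loop eqv @by-method;      # True
say @by-loop eqv @by-star;        # True
```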

The elements in an Array are not limited to scalars. Using `map`, I can create a list of lists that each combine a dog's name with its length using the `Z` "zip" operator (https://docs.perl6.org/routine/Z):

``````
> @dogs Z @dogs.map(*.chars)
((Chaps 5) (Patton 6) (Bowzer 6) (Logan 5) (Lulu 4) (Patch 5))
``````

Does that make sense? Zip takes two lists and combines them element-by-element, stopping on the shorter list:

``````
> 1..10 Z 'a'..'z'
((1 a) (2 b) (3 c) (4 d) (5 e) (6 f) (7 g) (8 h) (9 i) (10 j))
> 1..* Z 'Bowzer'.comb
((1 B) (2 o) (3 w) (4 z) (5 e) (6 r))
``````
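Because each zipped element is itself a two-item list, a loop can unpack it in the block signature. This is a minimal sketch; `@lengths`, `$name`, and `$len` are names I made up for the example:

```perl6
my @dogs    = <Chaps Patton Bowzer Logan Lulu Patch>;
my @lengths = @dogs.map(*.chars);

# Each zipped element is a (name, length) list; the parentheses in the
# signature unpack it into $name and $len.
for (@dogs Z @lengths) -> ($name, $len) {
    say "$name has $len letters";
}
```

The first line printed is `Chaps has 5 letters`, followed by one line for each remaining dog.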

Lastly, I'll show you how to use `sum` (https://docs.perl6.org/routine/sum) to total the number of characters in all the dog names:

``````
> @dogs.map(*.chars).sum
31
``````

# Iterating

One of the most common array operations is to iterate over the members while keeping track of the position. Here's a script that breaks a string (here maybe some DNA) into a list using the `comb` method and prints the position and the letter:

``````
$ cat iterate1.pl6
#!/usr/bin/env perl6

sub MAIN (Str $dna) {
    my $i = 0;
    for $dna.comb -> $letter {
        $i++;
        say "$i: $letter";
    }
}
$ ./iterate1.pl6 AACTAG
1: A
2: A
3: C
4: T
5: A
6: G
``````

This is so common that Arrays have shorter ways to do this:

``````
$ cat iterate2.pl6
#!/usr/bin/env perl6

sub MAIN (Str $dna) {
    for $dna.comb.kv -> $k, $v {
        say "{$k+1}: $v";
    }
}
$ ./iterate2.pl6 AACTAG
1: A
2: A
3: C
4: T
5: A
6: G
``````

Positions in Perl Arrays and Strings start at 0, so I have to add 1 to `$k`. Notice that I can run code inside a string by putting `{}` curly braces around it.

I don't have to give the pointy block signature (the `-> $k, $v` bit). I can use `$^k` and `$^v` (or `$^a` and `$^b` or whatever) to refer to the first and second arguments to the block; the placeholders are bound in sorted Unicode order of their names (https://docs.perl6.org/language/variables#index-entry-%24%5E):

``````
$ cat iterate3.pl6
#!/usr/bin/env perl6

sub MAIN (Str $dna) {
    for $dna.comb.kv { say join ": ", $^k + 1, $^v }
}
$ ./iterate3.pl6 AACTAG
1: A
2: A
3: C
4: T
5: A
6: G
``````
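The "sorted order" rule is easy to miss, so here is a small sketch of it. The `&diff` block is a made-up example, not part of the scripts above:

```perl6
# Placeholders are bound in sorted order of their names, not in the
# order they appear in the block: $^a is always the first argument.
my &diff = { $^b - $^a };
say diff(1, 10);    # 9

# The same rule is why $^k receives the index and $^v the value,
# even though $^v is mentioned first here.
for <A C G>.kv { say $^v ~ ' is at index ' ~ $^k }
```

The loop prints `A is at index 0`, `C is at index 1`, and `G is at index 2`.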

Here's a version using `pairs` to get a List of Pair types (https://docs.perl6.org/type/Pair) with the index (position) as the "key" and the letter as the "value":

``````
$ cat iterate4.pl6
#!/usr/bin/env perl6

sub MAIN (Str $dna) {
    for $dna.comb.pairs -> $pair {
        printf "%s: %s\n", $pair.key + 1, $pair.value;
    }
}
$ ./iterate4.pl6 AACTAG
1: A
2: A
3: C
4: T
5: A
6: G
``````

Again, I don't have to have the `-> $pair` bit if I use the `^` twigil. I can just refer to the one positional argument as `$^pair` and call the `key` and `value` methods on that:

``````
$ cat iterate5.pl6
#!/usr/bin/env perl6

sub MAIN (Str $dna) {
    for $dna.comb.pairs { say join ': ', $^pair.key + 1, $^pair.value }
}
$ ./iterate5.pl6 AACTAG
1: A
2: A
3: C
4: T
5: A
6: G
``````

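Or I can unpack each Pair directly in the block signature by declaring `:$key` and `:$value` as named parameters (the related `:` twigil, https://docs.perl6.org/language/variables#The_:_Twigil, declares named placeholder parameters):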

``````
$ cat iterate6.pl6
#!/usr/bin/env perl6

sub MAIN (Str $dna) {
    for $dna.comb.pairs -> (:$key, :$value) {
        say join ': ', $key + 1, $value;
    }
}
$ ./iterate6.pl6 AACTAG
1: A
2: A
3: C
4: T
5: A
6: G
``````

# Filtering

Often you want to choose or remove certain members of an array. Let's find only the Gs and Cs in a string (https://en.wikipedia.org/wiki/GC-content). Note that I uppercase (`uc`) the `$dna` first so that I only have to check for one case of letters:

``````
$ cat gc1.pl6
#!/usr/bin/env perl6

sub MAIN (Str $dna) {
    my @gc;
    for $dna.uc.comb -> $base {
        # keep only the G and C bases
        @gc.push($base) if $base eq 'G' || $base eq 'C';
    }
    say "$dna has {@gc.elems}";
}
$ ./gc1.pl6 AACTAG
AACTAG has 2
``````

But `grep` is a much shorter way to find all the elements matching a given condition. Like `map`, `grep` takes a block of code that will be executed for each member of the array. Any elements for which the block evaluates to "True-ish" are allowed through. The `$_` (topic, thing, "it") variable holds the current element, so the GC test becomes "if the thing is a 'G' or if the thing is a 'C'". One can use the `*` to represent "it" and eschew the curly brackets:

``````
> grep {$_ > 5}, 1..10
(6 7 8 9 10)
> grep * > 5, 1..10
(6 7 8 9 10)
``````

Here's the GC filter written with `grep`:

``````
$ cat gc2.pl6
#!/usr/bin/env perl6

sub MAIN (Str $dna) {
    my @gc = $dna.uc.comb.grep({$_ eq 'G' || $_ eq 'C'});
    say "$dna has {@gc.elems}";
}
``````

Here I'll use a Junction (https://docs.perl6.org/type/Junction) to compare to "G or C" in one go:

``````
$ cat gc3.pl6
#!/usr/bin/env perl6

sub MAIN (Str $dna) {
    my @gc = $dna.uc.comb.grep(* eq 'G' | 'C');
    say "$dna has {@gc.elems}";
}
``````

Another way to write the `|` Junction is with `any`. The `so` routine (https://docs.perl6.org/routine/so) collapses the various Booleans down to a single value.

``````
> 'G' eq 'G' | 'C'
any(True, False)
> so 'G' eq 'G' | 'C'
True
> 'G' eq any(<G C>)
any(True, False)
> so 'G' eq any(<G C>)
True
``````

It's extremely common to use regular expressions (https://docs.perl6.org/type/Regex) to filter lists. We'll cover these more later, but here I'm using a character class to represent either "G or C":

``````
$ cat gc4.pl6
#!/usr/bin/env perl6

sub MAIN (Str $dna) {
    my @gc = $dna.uc.comb.grep(/<[GC]>/);
    say "$dna has {@gc.elems}";
}
``````
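The scripts above report a raw count, while GC content is usually quoted as a fraction of the whole sequence. Here is one way that might look, reusing the Junction filter; `gc-fraction` is a hypothetical helper of mine and not one of the chapter's scripts:

```perl6
# Hypothetical helper: the fraction of bases in $dna that are G or C.
sub gc-fraction (Str $dna) {
    my @bases = $dna.uc.comb;
    return 0 unless @bases;                  # avoid dividing by zero on an empty string
    my @gc = @bases.grep(* eq 'G' | 'C');    # same Junction filter as gc3.pl6
    return @gc.elems / @bases.elems;         # dividing two Ints yields a Rat
}

say gc-fraction('AACTAG');    # 0.333333
say gc-fraction('ggcc');      # 1
```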

Here's how you can find the prime numbers between 1 and 10:

``````
> (1..10).grep(*.is-prime)
(2 3 5 7)
``````

# Classification

We can group elements based on predicates we supply. Here is how we can split up the numbers 1 through 10 based on whether or not they are evenly divisible by 2:

``````
> 2 %% 2
True
> 3 %% 2
False
> (1..10).classify(* %% 2)
{False => [1 3 5 7 9], True => [2 4 6 8 10]}
``````

Going back to our G-C counter, we can group each base into whether it is or isn't a "G" or a "C":

``````
$ cat gc5.pl6
#!/usr/bin/env perl6

sub MAIN (Str $dna) {
    my %hash = $dna.uc.comb.categorize({?/<[GC]>/});
    say "$dna has {%hash<True>.elems}";
}
``````

In my opinion, it's not intuitive to use "True" or "False," so let's provide our own String value for the name of the bucket we want:

``````
$ cat gc6.pl6
#!/usr/bin/env perl6

sub MAIN (Str $dna) {
    my %hash = $dna.uc.comb.categorize({ /<[GC]>/ ?? 'GC' !! 'Other' });
    say "$dna has {%hash<GC>.elems}";
}
``````

It might help to see that one in the REPL:

``````
> 'AACTAG'.uc.comb.classify({?/<[GC]>/})
{False => [A A T A], True => [C G]}
> 'AACTAG'.uc.comb.classify({/<[GC]>/ ?? 'GC' !! 'Other'})
{GC => [C G], Other => [A A T A]}
``````

`classify` takes a code block and uses the resulting value to decide which bucket each element goes into. (The scripts above call `categorize`, which gives the same result when the block returns a single value.) Here I've used the same regular expression `/<[GC]>/` to return the string "GC" if it's a match or "Other" if it's not. The combination of the `?? !!` is the "ternary" operator that we'll talk about more later. The resulting Hash has a key called "GC" and its value is a list containing the "G" and "C" found in the string.
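Since every value in the result is a list of the elements that landed in that bucket, counting is just a matter of calling `elems` on each value. This is a minimal sketch; `%groups` and `$bucket` are names I chose for the example:

```perl6
my %groups = 'AACTAG'.comb.classify({ /<[GC]>/ ?? 'GC' !! 'Other' });

# Walk the buckets in key order and report how many elements each holds.
for %groups.sort(*.key) -> $bucket {
    say $bucket.key, ': ', $bucket.value.elems;
}
# GC: 2
# Other: 4
```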

So you're seeing that lists can be inside of other Lists as well as inside of Hashes and other data structures.

I can classify my `@dogs` based on the length of their names using the same syntax variations we saw for `map`:

``````
> @dogs.classify({.chars})
{4 => [Lulu], 5 => [Chaps Logan Patch], 6 => [Patton Bowzer]}
> @dogs.classify(*.chars)
{4 => [Lulu], 5 => [Chaps Logan Patch], 6 => [Patton Bowzer]}
> @dogs.classify(&chars)
{4 => [Lulu], 5 => [Chaps Logan Patch], 6 => [Patton Bowzer]}
``````

Lists can also be composed of Pairs (https://docs.perl6.org/type/Pair). Here I'll redeclare my `@dogs` with their names as the "key" and their sex as the "value." Then I can `classify` them on their `value`:

``````
> my @dogs = Chaps => 'male', Patton => 'male', Bowzer => 'male', Logan => 'male', Lulu => 'female', Patch => 'male'
[Chaps => male Patton => male Bowzer => male Logan => male Lulu => female Patch => male]
> @dogs.classify(-> $dog {$dog.value})
{female => [Lulu => female], male => [Chaps => male Patton => male Bowzer => male Logan => male Patch => male]}
> @dogs.classify({$^dog.value})
{female => [Lulu => female], male => [Chaps => male Patton => male Bowzer => male Logan => male Patch => male]}
> @dogs.classify({.value})
{female => [Lulu => female], male => [Chaps => male Patton => male Bowzer => male Logan => male Patch => male]}
> @dogs.classify(*.value)
{female => [Lulu => female], male => [Chaps => male Patton => male Bowzer => male Logan => male Patch => male]}
``````

# Picking random elements

To finish off, let's write a Shakespearean insult generator that uses the `pick` method to randomly choose some pejoratives:

``````
$ cat insult.pl6
#!/usr/bin/env perl6

sub MAIN (Int :$n=1) {
    my @adjectives = qw{scurvy old filthy scurrilous lascivious
        foolish rascally gross rotten corrupt foul loathsome irksome
        heedless unmannered whoreson cullionly false filthsome
        toad-spotted caterwauling wall-eyed insatiate vile peevish
        infected sodden-witted lecherous ruinous indistinguishable
        dishonest thin-faced slanderous bankrupt base detestable
        rotten dishonest lubbery};
    my @nouns = qw{knave coward liar swine villain beggar
        slave scold jolthead whore barbermonger fishmonger carbuncle
        fiend traitor block ape braggart jack milksop boy harpy
        recreant degenerate Judas butt cur Satan ass coxcomb dandy
        gull minion ratcatcher maw fool rogue lunatic varlet worm};

    # The argument line is missing from the original listing; this is an
    # assumed reconstruction: pick three adjectives and one noun, $n times.
    printf "You %s, %s, %s %s!\n",
        @adjectives.pick(3), @nouns.pick for ^$n;
}
$ ./insult.pl6 -n=4
You foul, dishonest, old recreant!
You irksome, gross, false degenerate!
You ruinous, unmannered, foolish slave!
You scurvy, slanderous, peevish harpy!
``````
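Since `pick` does the real work in that script, a tiny sketch of how it behaves may help, shown next to its sibling `roll`. The two-element `@coin` list is made up for the example:

```perl6
my @coin = <heads tails>;

# pick samples without replacement, so it can never return more
# elements than the list holds.
say @coin.pick(5).elems;    # 2

# roll samples with replacement: every draw is independent.
say @coin.roll(5).elems;    # 5

# pick(*) returns every element once in random order, i.e. a shuffle.
say @coin.pick(*).sort;     # (heads tails)
```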

Within one call, `pick` never chooses the same element twice. You can also use `roll` so that each selection is made independently, which allows repeats. Above I'm using the `qw{}` "quote-word" operator (https://docs.perl6.org/language/quoting#index-entry-qw_word_quote) to create a list of words rather than writing:

``````
my @adjectives = "scurvy", "old", "filthy";
``````

You should spend a few minutes reading about all the different quoting options available as they will come in handy.