Perl List Processing Is for Hashes, Too

Perl’s list-processing prowess isn’t just for arrays. Hashes, or key-value pairs, can be sliced and diced through built-in and extended functions.

Mark Gardner

Feb. 10, 22 · Tutorial

This month I started a new job at Alert Logic, a cybersecurity provider with Perl (among many other things) at its beating heart. I’ve been learning a lot, and part of the process has been understanding the APIs in the codebase. To that end, I’ve been writing small test scripts to tease apart data structures, using Perl’s array-, list-, and hash-processing functions (a hash being Perl’s name for an associative array).

I’ve covered map, grep, and friends a couple of times before. Most recently, I described using List::Util’s any function to check if a condition is true for any item in a list. In the simplest case, you can use it to check to see if a given value is in the list at all:

Perl

    use feature 'say';
    use List::Util 'any';
    my @colors =
      qw(red orange yellow green blue indigo violet);
    say 'matched' if any { /^red$/ } @colors;

However, if you’re going to be doing this a lot with arbitrary strings, Perl FAQ section 4 advises turning the array into the keys of a hash and then checking for membership there. For example, here’s a simple script to check if the colors input (either from the keyboard or from files passed as arguments) are in the rainbow:

Perl

    #!/usr/bin/env perl

    use v5.22; # introduced <<>> for safe opening of arguments
    use warnings;

    my %in_colors = map {$_ => 1}
      qw(red orange yellow green blue indigo violet);

    while (<<>>) {
      chomp;
      say "$_ is in the rainbow" if $in_colors{$_};
    }
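
The same lookup table can also be built without map. The following is a minimal sketch of the hash-slice idiom, assuming you only care whether a key is present; the `$candidate` and `$verdict` names and the two test words are made up for illustration.

```perl
use strict;
use warnings;

my @colors = qw(red orange yellow green blue indigo violet);

# Assigning an empty list to a hash slice creates every key with an undef value.
my %in_colors;
@in_colors{@colors} = ();

# exists tests for the key itself, so the undef values don't matter.
for my $candidate (qw(green magenta)) {
    my $verdict = exists $in_colors{$candidate} ? 'is' : 'is not';
    print "$candidate $verdict in the rainbow\n";
}
```

Because the values are undef, a plain truth test such as `if $in_colors{$_}` would fail here; this variant only works with `exists`.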

List::Util has a bunch of functions for processing lists of pairs that I’ve found useful when pawing through hashes. pairgrep, for example, acts just like grep, but instead of setting $_ it sets $a and $b to the key and value of each pair passed in, and it returns the pairs for which the block is true. I’ve used it as a quick way to search for hash entries matching certain value conditions:

Perl

    use List::Util 'pairgrep';
    my %numbers = (zero => 0, one => 1, two => 2, three => 3);
    my %odds = pairgrep {$b % 2} %numbers;
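
pairgrep has siblings in List::Util that follow the same $a/$b convention. As a rough sketch of two of them on the same sample data: pairmap transforms each pair, and pairkeys keeps only the keys from a list of pairs. The `%doubled` and `@even_names` variables are illustrative names, and the example assumes a List::Util recent enough to provide the pair functions.

```perl
use strict;
use warnings;
use List::Util qw(pairgrep pairmap pairkeys);

my %numbers = (zero => 0, one => 1, two => 2, three => 3);

# pairmap: like map, but $a is the key and $b the value of each pair.
my %doubled = pairmap { ($a, $b * 2) } %numbers;

# pairkeys: keep only the keys of the pairs that pairgrep lets through.
my @even_names = pairkeys pairgrep { $b % 2 == 0 } %numbers;

print join(', ', sort @even_names), "\n";   # prints "two, zero"
```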

Sure, you could do this by invoking a mix of plain grep, keys, and a hash slice, but it’s noisier and more repetitive:

Perl

    use v5.20; # for key/value hash slice
    my %odds = %numbers{grep {$numbers{$_} % 2} keys %numbers};
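
On a Perl older than 5.20, where the key/value slice syntax isn’t available, one possible equivalent is to rebuild the pairs by hand with map. This is only a sketch; the intermediate `@odd_keys` and `@odd_values` variables are there to show the difference between the two kinds of slice.

```perl
use strict;
use warnings;

my %numbers = (zero => 0, one => 1, two => 2, three => 3);

# Pick the keys whose values are odd...
my @odd_keys = grep { $numbers{$_} % 2 } keys %numbers;

# ...then rebuild the key/value pairs by hand with map.
my %odds = map { $_ => $numbers{$_} } @odd_keys;

# A plain hash slice (@ sigil) returns only the values, in @odd_keys order.
my @odd_values = @numbers{@odd_keys};
```

The `%numbers{...}` form with the % sigil returns keys and values together, which is why it can be assigned straight to a new hash.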

pairgrep’s compiled C-based XS code can also be faster, as evidenced by this Benchmark script that works through a hash made of the Unix words file (479,828 entries on my machine):

Perl

    #!/usr/bin/env perl

    use v5.20;
    use warnings;
    use List::Util 'pairgrep';
    use Benchmark 'cmpthese';

    my (%words, $count);
    open my $fh, '<', '/usr/share/dict/words'
      or die "can't open words: $!";
    while (<$fh>) {
      chomp;
      $words{$_} = $count++;
    }
    close $fh;

    cmpthese(100, {
      grep => sub {
        my %odds = %words{grep {$words{$_} % 2} keys %words};
      },
      pairgrep => sub {
        my %odds = pairgrep {$b % 2} %words;
      },
    } );

Benchmark output:

               Rate     grep pairgrep
    grep     1.47/s       --     -20%
    pairgrep 1.84/s      25%       --
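
A speed comparison only means something if both subs compute the same thing. Here is a small sanity check along those lines, using a hypothetical four-word hash in place of the words file; the `%via_grep`, `%via_pairgrep`, and `$same` names are invented for this sketch.

```perl
use v5.20;
use warnings;
use List::Util 'pairgrep';

# Hypothetical stand-in for the hash built from /usr/share/dict/words.
my %words = (apple => 0, banana => 1, cherry => 2, damson => 3);

my %via_grep     = %words{ grep { $words{$_} % 2 } keys %words };
my %via_pairgrep = pairgrep { $b % 2 } %words;

# Serialize both hashes in sorted key order and compare the strings.
my $same = join(',', map { "$_=$via_grep{$_}" } sort keys %via_grep)
        eq join(',', map { "$_=$via_pairgrep{$_}" } sort keys %via_pairgrep);

say $same ? 'identical' : 'different';   # prints "identical"
```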

In general, I urge you to work through the Perl documentation's tutorials on references, lists of lists, the data structures cookbook, and the FAQs on array and hash manipulation. Then dip into the various list-processing modules (especially the included List::Util and CPAN’s List::SomeUtils) for ready-made functions for common operations. You’ll find a wealth of techniques for creating, managing, and processing the data structures that your programs need.

Published at DZone with permission of Mark Gardner, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
