Perl - find words and save each word and its context in an associative array

无人久伴 posted on 2019-12-01 11:15:35

I think I see what you're trying to do: index semantic links between words followed by lists of synonyms. Am I correct? :-)

Where a word appears in more than one synonym list, you create a hash entry with that word as the key and, as its values, the keywords for which it was originally listed as a synonym ... or something like that. Using a hash of arrays - as in the solution by @Lee Duhem - you get a list (array) of synonyms for each keyword. This is a common pattern. You do end up with a lot of hash entries, though.

I've been playing with a neat module by @miyagawa called Hash::MultiValue that takes a different approach to accessing a list of values associated with each hash key: a multi-value hash. A few nice features are that you can create a hash of array references on the fly from the multi-value hash, "flatten" the hash, and write callbacks to go with the ->each() method, among other neat things, so it's pretty flexible. I believe the module has no dependencies (other than for testing). Plus it's by @miyagawa (and other contributors), so using it and reading it is good for you :-)
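To make those features a bit more concrete, here is a minimal sketch of the methods I have in mind (get, get_all, as_hashref_multi, flatten and each). The sample words are just made up for illustration, and it assumes Hash::MultiValue is installed from CPAN:

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Hash::MultiValue;

# Build a small multi-value hash: the key 'chose' is given twice.
my $mv = Hash::MultiValue->new(
    chose => 'truc',
    chose => 'bidule',
    cause => 'chose',
);

# get() returns a single value (the last one added for that key);
# get_all() returns every value for the key.
my $last = $mv->get('chose');
my @all  = $mv->get_all('chose');

# as_hashref_multi() builds a plain hash of array references on the fly.
my $hoa = $mv->as_hashref_multi;

# flatten() returns the key/value pairs as one flat list.
my @flat = $mv->flatten;

# each() calls the callback once per key/value pair.
my @pairs;
$mv->each( sub { push @pairs, "$_[0]=$_[1]" } );

print "last: $last\n";
print "all: @all\n";
print "hash of arrays, chose: @{ $hoa->{chose} }\n";
print "flat: @flat\n";
print "pairs: @pairs\n";
```

The point is that the same data can be read either pair by pair (flatten, each) or grouped by key (get_all, as_hashref_multi), whichever suits the lookup you need.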

I'm no expert and I'm not sure it's appropriate for what you want - as a variation on Lee's approach you might have something like:

#!/usr/bin/env perl
use strict;
use warnings;
use Hash::MultiValue;

my $words_hash = Hash::MultiValue->new();

# set up the multi-value hash: the first word on each line is the key,
# the remaining words are its values
while ( my $words = <DATA> ) {
  chomp $words;
  my ($keyword, @synonyms) = split /,/, $words;
  $words_hash->add( $keyword => @synonyms );
}

# list each key with all of its values
for my $key ( keys %{ $words_hash } ) {
  print "$key --> ", join(", ", $words_hash->get_all($key)), "\n";
}

print "\n";

sub synonmize {
  my $bonmot = shift;
  my @bonmot_syns ;

  # check key "$bonmot" for word to search and show values
  push @bonmot_syns , $words_hash->get_all($bonmot);

  # match the whole word only; \Q...\E keeps regex metacharacters literal
  my $is_bonmot = qr/^\Q$bonmot\E$/;

  # now grab values but leave out synonym's synonyms
  foreach my $key (keys %{ $words_hash } ) {
    my @values = $words_hash->get_all($key);
    if ($key !~ $is_bonmot && grep { $_ =~ $is_bonmot } @values) {
      push @bonmot_syns, grep { $_ !~ $is_bonmot } @values;
    }
  }

  # show the keys with values containing target word
  $words_hash->each(
    sub { push @bonmot_syns, $_[0] if grep { $_ =~ $is_bonmot } @_[1..$#_]; }
  );

  chomp @bonmot_syns ;
  print "synonymes pour \"$bonmot\": @bonmot_syns \n" ;
}

# find synonyms 
synonmize("chose");
synonmize("truc");
synonmize("matière");

__DATA__
affaire,chose,question
cause,chose,matière
chose,truc,bidule
fille,demoiselle,femme,dame

Output:

fille --> demoiselle, femme, dame
affaire --> chose, question
cause --> chose, matière
chose --> truc, bidule

synonymes pour "chose": truc bidule question matière affaire cause 
synonymes pour "truc": bidule chose 
synonymes pour "matière": chose cause

Tie::Hash::MultiValue is another alternative. Kudos to @Lee for a quick clean solution :-)

For each element in @list, split it on commas, use each field as a key of %te, and push the other fields onto the array stored under that key:

#!/usr/bin/perl

use strict;
use warnings;

use Data::Dumper;

my @list = (
    "affaire,chose,question",
    "cause,chose,matière",
);

my %te;

foreach my $str (@list) {
    my @field = split /,/, $str;
    foreach my $key (@field) {
        my @other = grep { $_ ne $key } @field;
        push @{$te{$key}}, @other;
    }
}

print Dumper(\%te);

Output:

$ perl t.pl
$VAR1 = {
          'question' => [
                          'affaire',
                          'chose'
                        ],
          'affaire' => [
                         'chose',
                         'question'
                       ],
          'matière' => [
                          'cause',
                          'chose'
                        ],
          'cause' => [
                       'chose',
                       'matière'
                     ],
          'chose' => [
                       'affaire',
                       'question',
                       'cause',
                       'matière'
                     ]
        };
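
One thing to watch with the push approach: if two words share more than one line in @list, the same synonym gets pushed twice. A minimal sketch of one way around that, assuming you don't need to keep first-seen order, is to use a hash of hashes as a set and sort the keys on lookup. The helper names build_index and synonyms_for, and the second sample line, are made up for this example:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: map every word to the set of words it shares a line with.
sub build_index {
    my @lines = @_;
    my %index;
    for my $line (@lines) {
        my @fields = split /,/, $line;
        for my $key (@fields) {
            # Inner hash keys act as a set, so repeated synonyms collapse.
            $index{$key}{$_} = 1 for grep { $_ ne $key } @fields;
        }
    }
    return \%index;
}

# Hypothetical helper: sorted list of synonyms for one word (empty if unknown).
sub synonyms_for {
    my ($index, $word) = @_;
    my @found = sort keys %{ $index->{$word} || {} };
    return @found;
}

my $index = build_index(
    "affaire,chose,question",
    "cause,chose,affaire",    # 'affaire' and 'chose' co-occur a second time
);

print join(", ", synonyms_for($index, "chose")), "\n";
```

Here 'affaire' shows up only once for "chose" even though the two words appear together on both lines.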