Make a script in python that lists adjacent words through Unix?

Question

How can I write a Python script, using nested dictionaries, that takes a text file written as

white,black,green,purple,lavendar:1

red,black,white,silver:3

black,white,magenta,scarlet:4

and prints, for each entry before the : character, all the neighbors it appeared next to:

white: black silver magenta

black: white green red 

green: black purple

and so on

Edit: I didn't post what I have because it is rather insubstantial. I'll update it if I figure out anything else. I have been stuck for a while; all I have figured out so far is how to print each word on a separate line:

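from sys import argv

script, filename = argv
txt = open(filename)
for line in txt:
    if ':' not in line:
        continue  # skip blank lines and lines without a ':'
    line = line[0:line.index(':')]  # keep only the part before the ':'
    for word in line.split(","):
        print(word)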

I guess what I want is some kind of for loop that runs through each word: if the word is not already in a dictionary, add it, then search the file for the words that appear next to it.

Answer 1:

Input

a,c,f,g,hi,lw:1

f,g,j,ew,f,h,a,w:3

fd,s,f,g,s:4

Code

import pprint

neighbours = {}

for line in open('4-input.txt'):
    line = line.strip()
    if not line:
        continue    # skip empty input lines

    line = line[:line.index(':')]   # take everything left of ':'

    previous_token = ''
    for token in line.split(','):
        if previous_token:
            # record the adjacency in both directions
            neighbours.setdefault(previous_token, []).append(token)
            neighbours.setdefault(token, []).append(previous_token)
        previous_token = token

# print once, after the whole file has been read
pprint.pprint(neighbours)

Output

{'a': ['c', 'h', 'w'],
 'c': ['a', 'f'],
 'ew': ['j', 'f'],
 'f': ['c', 'g', 'g', 'ew', 'h', 's', 'g'],
 'fd': ['s'],
 'g': ['f', 'hi', 'f', 'j', 'f', 's'],
 'h': ['f', 'a'],
 'hi': ['g', 'lw'],
 'j': ['g', 'ew'],
 'lw': ['hi'],
 's': ['fd', 'f', 'g'],
 'w': ['a']}

Tidying up the pretty-printed dictionary is left as an exercise for the reader (in Python 2, dictionaries are not kept in any particular order, and removing the duplicates without changing the order of the lists is also annoying).

Easy solution:

for word, neighbour_list in neighbours.items():
    # set() removes duplicate neighbours
    print('%s : %s' % (word, ', '.join(set(neighbour_list))))

But that does change the ordering.
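If the order matters, one way to drop duplicates while keeping the first occurrence of each neighbour is to track the items already seen. This is only a sketch: `unique_in_order` is a hypothetical helper name, not part of the code above, and the sample dictionary is a hand-copied fragment of the output shown earlier.

```python
def unique_in_order(items):
    """Return items with duplicates removed, keeping first occurrences in order."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


# Hand-copied fragment of the neighbours dictionary from the output above
neighbours = {
    'f': ['c', 'g', 'g', 'ew', 'h', 's', 'g'],
    'a': ['c', 'h', 'w'],
}

# sorted() visits the words themselves in alphabetical order
for word in sorted(neighbours):
    print('%s: %s' % (word, ', '.join(unique_in_order(neighbours[word]))))
```

Sorting the keys makes the output order predictable, while the neighbour lists keep the order in which the words were first seen in the file.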

Answer 2:

Here you go:

from collections import defaultdict

char_map = defaultdict(set)
with open('input', 'r') as input_file:
    for line in input_file:
        line = line.strip()
        if not line:
            continue # Skip blank lines
        a_list, _ = line.split(':', 1) # Discard the stuff after the :
        chars = a_list.split(',') # Get the elements before : as a list
        for char, next_char in zip(chars, chars[1:]): # For every adjacent pair, record
                                                      # each element as a neighbour of
                                                      # the other
            char_map[char].add(next_char)
            char_map[next_char].add(char)

print(dict(char_map))

Answer 3:

def parse(input_file):
    char_neighbours = {}
    with open(input_file, 'r') as f:
        for line in f:
            line = line.strip().split(':')[0]
            if line != "":
                csv_list = line.split(',')
                for i in range(len(csv_list)):
                    # make sure every word has an entry, even with no neighbours
                    neighbours = char_neighbours.setdefault(csv_list[i], [])
                    # right-hand neighbour, if there is one
                    if i < len(csv_list) - 1:
                        if csv_list[i+1] not in neighbours:
                            neighbours.append(csv_list[i+1])
                    # left-hand neighbour, if there is one
                    if i > 0:
                        if csv_list[i-1] not in neighbours:
                            neighbours.append(csv_list[i-1])
    return char_neighbours

if __name__ == "__main__":
    dictionary = parse('test.txt')
    print(dictionary)

The parse function returns a dictionary that maps each string to a list of its neighbours.
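A dictionary of that shape can be turned into the "word: neighbour neighbour" lines asked for in the question. The following is a rough sketch, not part of the answer itself: `format_neighbours` is a hypothetical helper, and `sample` is a hand-written dictionary standing in for the return value of parse.

```python
def format_neighbours(char_neighbours):
    """Return one 'word: n1 n2 ...' string per key of the dictionary."""
    return ['%s: %s' % (word, ' '.join(neighbours))
            for word, neighbours in char_neighbours.items()]


# Hand-written stand-in for the dictionary that parse() would return
sample = {
    'white': ['black', 'silver', 'magenta'],
    'green': ['black', 'purple'],
}

for line in format_neighbours(sample):
    print(line)
```

On Python 3.7 and later the lines come out in the order the words were first inserted; on older versions the order of the keys is not guaranteed.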

Source: https://stackoverflow.com/questions/22262990/make-a-script-in-python-that-lists-adjacent-words-through-unix

Tags

python

line