Deleting previous token in a sentence if same as current token python

Question

I have two dictionaries of key-value pairs, like:

tokenIDs2number = {(6, 7): 1000000000.0, (22,): 700.0, (12,): 3000.0}

tokenIDs2location = {(27, 28): u'South Asia'}

The keys are tuples of the index locations of number and location slots in the sentence:

GDP in 2007 totaled about $ 1 billion , or about $ 3,000 per capita -LRB- exceeding the average of about $ 700 in the rest of South Asia -RRB- .

I want to loop through all the tuples for both the numbers and the locations, and drop any index that is immediately followed by the next index in the same tuple, so that only the last index of each adjacent run remains, e.g. make them:

tokenIDs2number = {(7,): 1000000000.0, (22,): 700.0, (12,): 3000.0}

tokenIDs2location = {(28,): u'South Asia'}

That way, I can later replace the matching tokens in the sentence with location and number slots, so the sentence becomes:

GDP in 2007 totaled about $ NUMBER_SLOT , or about $ NUMBER_SLOT per capita -LRB- exceeding the average of about $ NUMBER_SLOT in the rest of LOCATION_SLOT -RRB- .

Instead of:

GDP in 2007 totaled about $ NUMBER_SLOT NUMBER_SLOT , or about $ NUMBER_SLOT per capita -LRB- exceeding the average of about $ NUMBER_SLOT in the rest of LOCATION_SLOT LOCATION_SLOT -RRB- .

Current code:

for locationTokenIDs, location in tokenIDs2location.items():
  for numberTokenIDs, number in tokenIDs2number.items():
    prevNoID=numberTokenIDs[0]
    prevLocID=locationTokenIDs[0]
    for numberTokenID in numberTokenIDs:
        for locationTokenID in locationTokenIDs:
            if numberTokenID==prevNoID+1:
                numberTokenIDs.remove(numberTokenIDs[prevNoID])
                if numberTokenID>0 and numberTokenID<(len(sampleTokens)-1):
                    prevNoID = numberTokenID
            if locationTokenID==prevLocID+1:
                locationTokenIDs.remove(locationTokenIDs[prevLocID])
                if locationTokenID>0 and locationTokenID<(len(sampleTokens)-1):
                    prevLocID = locationTokenID

However, it seems I cannot just remove numbers from a tuple, so I am struggling to figure out how to do this.

Answer 1:

Since tuples are immutable (and a dict key cannot be modified in place), you cannot change the keys directly. However, you can use a dictionary comprehension to build a new dict with the keys you need in one line:

tokenIDs2number = {(6, 7): 1000000000.0, (22,): 700.0, (12,): 3000.0}
tokenIDs2number = {(k[-1],): v for k, v in tokenIDs2number.items()}

Using k[-1] to always access the last element lets you handle tuples of any length the same way, as long as each key is a single run of adjacent indices.
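If a key could ever contain more than one run of adjacent indices, taking only k[-1] would discard the earlier runs. Below is a minimal sketch of a more general variant, assuming Python 3. The helper names collapse_runs and fill_slots are hypothetical and not part of the question or the answer above; the slot names and the whitespace tokenization are taken from the example sentence in the question.

```python
def collapse_runs(token_ids):
    """Keep only the last index of each run of adjacent token IDs."""
    ids = sorted(token_ids)
    # An ID survives only when the next ID is not its direct successor.
    return tuple(i for i, nxt in zip(ids, ids[1:] + [None]) if nxt != i + 1)


def fill_slots(tokens, slot_maps):
    """Replace slot tokens with their slot name, one marker per adjacent run.

    slot_maps maps a slot name to a dict keyed by tuples of token indices.
    """
    slot_at = {}     # token index -> slot name to emit there
    dropped = set()  # token indices absorbed into a later slot marker
    for slot_name, mapping in slot_maps.items():
        for key in mapping:
            kept = set(collapse_runs(key))
            for i in key:
                if i in kept:
                    slot_at[i] = slot_name
                else:
                    dropped.add(i)
    return [slot_at.get(i, tok) for i, tok in enumerate(tokens) if i not in dropped]


tokenIDs2number = {(6, 7): 1000000000.0, (22,): 700.0, (12,): 3000.0}
tokenIDs2location = {(27, 28): 'South Asia'}

# Rebuild each dict with collapsed keys (dicts cannot be re-keyed in place).
collapsed_numbers = {collapse_runs(k): v for k, v in tokenIDs2number.items()}
collapsed_locations = {collapse_runs(k): v for k, v in tokenIDs2location.items()}

sentence = ("GDP in 2007 totaled about $ 1 billion , or about $ 3,000 per capita "
            "-LRB- exceeding the average of about $ 700 in the rest of "
            "South Asia -RRB- .")
filled = " ".join(fill_slots(sentence.split(), {
    "NUMBER_SLOT": tokenIDs2number,
    "LOCATION_SLOT": tokenIDs2location,
}))
print(filled)
```

Note that fill_slots works from the original, uncollapsed keys: it needs to know which indices were absorbed (6 and 27 here) so that those tokens are omitted rather than left in the sentence next to the slot marker.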

Source: https://stackoverflow.com/questions/38506857/deleting-previous-token-in-a-sentence-if-same-as-current-token-python

Tags

python

dictionary

tuples