问题
In Python only, and using data from a Pandas dataframe, how can I use PuLP to solve linear programming problems the same way I can in Excel? How much budget should be allocated to each Channel under the New Budget column so we maximize the total number of estimated successes? I'm really looking for a concrete example using data from a dataframe and not really high-level advice.
Problem Data Setup
Channel 30-day Cost Trials Success Cost Min Cost Max New Budget
0 Channel1 1765.21 9865 812 882.61 2647.82 0
1 Channel2 2700.00 15000 900 1350.00 4050.00 0
2 Channel3 2160.00 12000 333 1080.00 3240.00 0
This is a Maximization problem.
The objective function is:
objective_function = sum((df['New Budget']/(df['30-day Cost']/df['Trials']))*(df['Success']/df['Trials']))
The constraints are:
- The sum of
df['New Budget']must equal5000 - The
New Budgetfor a given channel can go no lower than theCost Min - The
New Budgetfor a given channel can go no higher than theCost Max
Any ideas how to translate this pandas dataframe solver linear problem using PuLP or any other solver approach? The end-result would be what you see in the image below.
回答1:
In general you create a dictionary of variables (x in this case) and a model variable (mod in this case). To create the objective you use sum over the variables times some scalars, adding that result to mod. You construct constraints by again computing linear combinations of variables, using >=, <=, or ==, and adding that constraint to mod. Finally you use mod.solve() to get the solutions.
import pulp
# Create variables and model
x = pulp.LpVariable.dicts("x", df.index, lowBound=0)
mod = pulp.LpProblem("Budget", pulp.LpMaximize)
# Objective function
objvals = {idx: (1.0/(df['30-day Cost'][idx]/df['Trials'][idx]))*(df['Success'][idx]/float(df['Trials'][idx])) for idx in df.index}
mod += sum([x[idx]*objvals[idx] for idx in df.index])
# Lower and upper bounds:
for idx in df.index:
mod += x[idx] >= df['Cost Min'][idx]
mod += x[idx] <= df['Cost Max'][idx]
# Budget sum
mod += sum([x[idx] for idx in df.index]) == 5000.0
# Solve model
mod.solve()
# Output solution
for idx in df.index:
print idx, x[idx].value()
# 0 2570.0
# 1 1350.0
# 2 1080.0
print 'Objective', pulp.value(mod.objective)
# Objective 1798.70495012
Data:
import numpy as np
import pandas as pd
idx = [0, 1, 2]
d = {'channel': pd.Series(['Channel1', 'Channel2', 'Channel3'], index=idx),
'30-day Cost': pd.Series([1765.21, 2700., 2160.], index=idx),
'Trials': pd.Series([9865, 1500, 1200], index=idx),
'Success': pd.Series([812, 900, 333], index=idx),
'Cost Min': pd.Series([882.61, 1350.00, 1080.00], index=idx),
'Cost Max': pd.Series([2647.82, 4050.00, 3240.00], index=idx)}
df = pd.DataFrame(d)
df
# 30-day Cost Cost Max Cost Min Success Trials channel
# 0 1765.21 2647.82 882.61 812 9865 Channel1
# 1 2700.00 4050.00 1350.00 900 1500 Channel2
# 2 2160.00 3240.00 1080.00 333 1200 Channel3
来源:https://stackoverflow.com/questions/33160262/linear-programming-simplex-lp-pulp