'Freezing' an expression

Question

I have a C++ expression that I wish to 'freeze'. By this, I mean I have syntax like the following:

take x*x with x in container ...

where the ... indicates further syntax that is not relevant to this problem. However, when I attempt to compile this, no matter which preprocessor translations I use to make 'take' an 'operator' (in inverted commas because it is technically not an operator; the translation phase turns it into a class with, say, operator* available to it), the compiler still tries to work out where the x in x*x comes from. Since x has not been declared yet (it is only declared later, at the 'in' stage), the compiler cannot find it and reports a compile error.

My current idea is to place the expression inside a lambda. Since we can deduce the type of the container, we can declare x with the right type, as in [](decltype(*begin(container)) x) { return x*x; }, so that the statement is valid when the compiler looks at it and no error is thrown. However, I'm running into errors actually achieving this.
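
For reference, a hand-written version of what the syntax would have to turn into might look like the sketch below. The function name squared is made up for illustration, and a typedef stands in for writing the decltype directly in the lambda's parameter list.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical helper: the hand-written form of
// "take x*x with x in container". x is introduced as the lambda
// parameter, so it is already declared when x*x is parsed.
std::vector<int> squared(std::vector<int> container) {
    // Element type deduced from the container (int& for std::vector<int>).
    typedef decltype(*std::begin(container)) element_ref;
    std::transform(container.begin(), container.end(), container.begin(),
                   [](element_ref x) { return x * x; });
    return container;
}
```

The difficulty is therefore not the lambda itself but getting the name x to appear in the parameter list before the expression that uses it.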

Thus, my question is: Is there a way / what's the best way to 'freeze' the x*x part of my expression?

EDIT: In an attempt to clarify my question, take the following. Assume that the operator- is defined in a sane way so that the following attempts to achieve what the above take ... syntax does:

MyTakeClass() - x*x - MyWithClass() - x - MyInClass() - container ...

When this statement is compiled, the compiler will throw an error; x is not declared so x*x makes no sense (nor does x - MyInClass(), etc, etc). What I'm trying to achieve is to find a way to make the above expression compile, using any voodoo magic available, without knowing the type of x (or, in fact, that it will be named x; it could viably be named 'somestupidvariablename') in advance.

Answer 1:

I made an answer very similar to my previous answer, but using actual expression templates, which should be much faster. Unfortunately, MSVC10 crashes when it attempts to compile this, but MSVC11, GCC 4.7.0 and Clang 3.2 all compile and run it just fine. (All other versions untested)

Here's the usage of the templates. Implementation code is here.

#define take 
#define with ,
#define in >>= 

//function call for containers 
template<class lhsexpr, class container>
lhsexpr operator>>=(lhsexpr lhs, container& rhs)
{
    for(auto it=rhs.begin(); it!=rhs.end(); ++it)
        *it = lhs(*it);
    return lhs;
}

int main() {
    std::vector<int> container0;
    container0.push_back(-4);
    container0.push_back(0);
    container0.push_back(3);
    take x*x with x in container0; //here's the magic line
    for(auto it=container0.begin(); it!=container0.end(); ++it)
        std::cout << *it << ' ';
    std::cout << '\n';

    auto a = x+x*x+'a'*x;
    auto b = a; //make sure copies work
    b in container0;
    b in container1;
    std::cout << sizeof(b);

    return 0;
}

As you can see, this is used exactly like my previous code, except that now all the functions are resolved at compile time, which means this should have the same speed as a lambda. In fact, C++11 lambdas were preceded by boost::lambda, which works on very similar concepts.

This is a separate answer, because the code is far different, and far more complicated/intimidating. That's also why the implementation is not in the answer itself.
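
Since the implementation is not reproduced here, the following is only a minimal sketch of the general expression-template idea, not the linked code. The names expr, placeholder, product, sum and squares are assumptions made for illustration, and the "with x" clause is not modelled at all.

```cpp
#include <cassert>
#include <vector>

// CRTP base: marks a type as an expression so the operators below
// only apply to expression types.
template <class Derived>
struct expr {
    const Derived& self() const { return static_cast<const Derived&>(*this); }
};

// The variable: evaluating it simply returns the argument.
struct placeholder : expr<placeholder> {
    template <class T>
    T operator()(const T& value) const { return value; }
};

// Binary nodes: the shape of the expression is encoded in the type.
template <class L, class R>
struct product : expr<product<L, R> > {
    L lhs;
    R rhs;
    product(const L& l, const R& r) : lhs(l), rhs(r) {}
    template <class T>
    T operator()(const T& value) const { return lhs(value) * rhs(value); }
};

template <class L, class R>
struct sum : expr<sum<L, R> > {
    L lhs;
    R rhs;
    sum(const L& l, const R& r) : lhs(l), rhs(r) {}
    template <class T>
    T operator()(const T& value) const { return lhs(value) + rhs(value); }
};

// Building an expression only constructs a small object; nothing is computed.
template <class L, class R>
product<L, R> operator*(const expr<L>& l, const expr<R>& r) {
    return product<L, R>(l.self(), r.self());
}

template <class L, class R>
sum<L, R> operator+(const expr<L>& l, const expr<R>& r) {
    return sum<L, R>(l.self(), r.self());
}

// Applies the frozen expression to every element of the container, in place.
template <class E, class Container>
void operator>>=(const expr<E>& e, Container& c) {
    for (auto it = c.begin(); it != c.end(); ++it)
        *it = e.self()(*it);
}

// Usage sketch: x * x has type product<placeholder, placeholder>.
std::vector<int> squares(std::vector<int> v) {
    placeholder x;
    x * x >>= v;
    return v;
}
```

Because the whole computation is part of the type of x * x, the compiler can inline the evaluation, which is where the lambda-like speed comes from.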

Answer 2:

I came up with an almost-solution, based on expression templates (note: these are not expression templates, they are merely based on them). Unfortunately, I could not come up with a way that does not require you to predeclare x, but I did come up with a way to delay the type, so you only have to declare x once, globally, and can use it for different types over and over in the same program/file/scope. Here is the expression type that works the magic; I designed it to be very flexible, so you should be able to easily add operations and uses at will. It is used exactly as you described, except for the predeclaration of x.

Downsides I'm aware of: it does require T*T, T+T, and T(long) be compilable.

expression x(0, true); //x will be the 0th parameter.  Sorry: required :(

int main() {
    std::vector<int> container;
    container.push_back(-3);
    container.push_back(0);
    container.push_back(7);
    take x*x with x in container; //here's the magic line
    for(unsigned i=0; i<container.size(); ++i)
        std::cout << container[i] << ' ';

    std::cout << '\n';
    std::vector<float> container2;
    container2.push_back(-2.3);
    container2.push_back(0);
    container2.push_back(7.1);
    take 1+x with x in container2; //here's the magic line
    for(unsigned i=0; i<container2.size(); ++i)
        std::cout << container2[i] << ' ';

    return 0;
}

and here's the class and the defines that make it all work:

class expression {
    //addition and constants are unused, and merely shown for extendibility
    enum exprtype{parameter_type, constant_type, multiplication_type, addition_type} type;
    long long value; //for value types, and parameter number
    std::unique_ptr<expression> left; //for unary and binary functions
    std::unique_ptr<expression> right; //for binary functions

public:
    //constructors
    expression(long long val, bool is_variable=false) 
    :type(is_variable?parameter_type:constant_type), value(val)
    {}
    expression(const expression& rhs) 
    : type(rhs.type)
    , value(rhs.value)
    , left(rhs.left.get() ? std::unique_ptr<expression>(new expression(*rhs.left)) : std::unique_ptr<expression>(NULL))
    , right(rhs.right.get() ? std::unique_ptr<expression>(new expression(*rhs.right)) : std::unique_ptr<expression>(NULL))
    {}
    expression(expression&& rhs) 
    :type(rhs.type), value(rhs.value), left(std::move(rhs.left)), right(std::move(rhs.right)) 
    {}
    //assignment operator
    expression& operator=(expression rhs) {
       type = rhs.type;
       value = rhs.value;
       left = std::move(rhs.left);
       right = std::move(rhs.right);
       return *this;
    } 

    //operators
    friend expression operator*(expression lhs, expression rhs) {
        expression ret(0);
        ret.type = multiplication_type;
        ret.left = std::unique_ptr<expression>(new expression(std::move(lhs)));
        ret.right = std::unique_ptr<expression>(new expression(std::move(rhs)));
        return ret;
    }
    friend expression operator+(expression lhs, expression rhs) {
        expression ret(0);
        ret.type = addition_type;
        ret.left = std::unique_ptr<expression>(new expression(std::move(lhs)));
        ret.right = std::unique_ptr<expression>(new expression(std::move(rhs)));
        return ret;
    }

    //skip the parameter list, don't care.  Ignore it entirely
    expression& operator<<(const expression&) {return *this;}
    expression& operator,(const expression&) {return *this;}

    template<class container>    
    void operator>>(container& rhs) {
        for(auto it=rhs.begin(); it!=rhs.end(); ++it)
            *it = execute(*it);
    }  
private: 
    //execution
    template<class T>
    T execute(const T& p0) {
       switch(type) {
       case parameter_type :
           switch(value) {
           case 0: return p0; //only one variable
           default: throw std::runtime_error("Invalid parameter ID");
           }
       case constant_type:
           return ((T)(value));
       case multiplication_type:
           return left->execute(p0) * right->execute(p0);
       case addition_type:
           return left->execute(p0) + right->execute(p0);
       default: 
           throw std::runtime_error("Invalid expression type");
       }
    }
    //This is also unused, and merely shown as extrapolation
    template<class T>
    T execute(const T& p0, const T& p1) {
       switch(type) {
       case parameter_type :
           switch(value) {
           case 0: return p0;
           case 1: return p1; //this version has two variables
           default: throw std::runtime_error("Invalid parameter ID");
           }
       case constant_type:
           return value;
       case multiplication_type:
           return left->execute(p0, p1) * right->execute(p0, p1);
       case addition_type:
           return left->execute(p0, p1) + right->execute(p0, p1);
       default: 
           throw std::runtime_error("Invalid expression type");
       }
    }
}; 
#define take 
#define with <<
#define in >>

Compiles and runs with correct output at http://ideone.com/Dnb50

You may notice that since x must be predeclared, the with section is ignored entirely. There's almost no macro magic here: the macros effectively turn the line into "x*x << x >> container", where the << x does absolutely nothing at all. So the expression is effectively "x*x >> container".

Also note that this method is slow, because it is an interpreter, with almost all the slowdown that implies. However, it has the bonus that it is serializable: you could save the function to a file, load it later, and execute it then.

R.MartinhoFernandes has observed that the definition of x can be simplified to merely be expression x;, and it can deduce the order of parameters from the with section, but it would require a lot of rethinking of the design and would be more complicated. I might come back and add that functionality later, but in the meantime, know that it is definitely possible.

If you can modify the expression to take(x*x with x in container), then that would remove the need to predeclare x, with something far, far simpler than expression templates.

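#define with ,
#define in ,
//the extra layer lets "with" and "in" expand to commas before the arguments are split
#define take(args) take_impl(args)
#define take_impl(expr, var, con) \
   std::transform(con.begin(), con.end(), con.begin(), \
   [](const typename std::decay<decltype(con)>::type::value_type& var) \
   -> typename std::decay<decltype(con)>::type::value_type \
   {return expr;});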

int main() {
    std::vector<int> container;
    container.push_back(-3);
    container.push_back(0);
    container.push_back(7);
    take(x*x with x in container); //here's the magic line
    for(unsigned i=0; i<container.size(); ++i)
        std::cout << container[i] << ' ';
}

Answer 3:

I don't think it is possible to get this "list comprehension" (not quite, but it does the same thing) à la Haskell using the preprocessor. The preprocessor just does simple search and replace, with the possibility of arguments, so it cannot perform arbitrary replacements. In particular, changing the order of the parts of an expression is not possible.

I cannot see a way to do this, without changing the order, since you always need x somehow to appear before x*x to define this variable. Using a lambda will not help, since you still need x in front of the x*x part, even if it is just as an argument. This makes this syntax not possible.

There are some ways around this:

Use a different preprocessor. There are preprocessors based on the ideas of Lisp-macros, which can be made syntax aware and hence can do arbitrary transformation of one syntax tree into another. One example is Camlp4/Camlp5 developed for the OCaml language. There are some very good tutorials on how to use this for arbitrary syntax transformation. I used to have an explanation on how to use Camlp4 to transform makefiles into C code, but I cannot find it anymore. There are some other tutorials on how to do such things.
Change the syntax slightly. Such a list comprehension is essentially just syntactic sugar for the usage of a Monad. With the arrival of C++11, Monads have become possible in C++; the syntactic sugar, however, may not be. If you decide to wrap the stuff you are trying to do in a Monad, many things will still be possible; you will just have to change the syntax slightly. Implementing Monads in C++ is anything but fun, though (although I first expected otherwise). Have a look here for some examples of how to get Monads in C++.
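
As a rough illustration of the second option, here is a minimal sketch of a monadic bind for std::vector (the "list monad"). The names mbind and squares_via_bind are made up for this example and do not come from any library; note that x has to be written as a lambda parameter, which is exactly the change in syntax mentioned above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: bind for std::vector. f maps one element to a
// vector of results, and all results are concatenated in order.
template <class T, class F>
auto mbind(const std::vector<T>& xs, F f) -> decltype(f(xs.front())) {
    decltype(f(xs.front())) out;
    for (const T& x : xs) {
        auto ys = f(x);
        out.insert(out.end(), ys.begin(), ys.end());
    }
    return out;
}

// "take x*x with x in container", written as a bind.
std::vector<int> squares_via_bind(const std::vector<int>& container) {
    return mbind(container, [](int x) { return std::vector<int>{x * x}; });
}
```

Returning an empty vector from the lambda drops an element, so the same helper also covers the filtering part of a list comprehension.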

Answer 4:

The best approach is to parse it using the preprocessor. I do believe the preprocessor can be a very powerful tool for building EDSLs (embedded domain-specific languages), but you must first understand the limitations of the preprocessor when parsing things. The preprocessor can only parse out predefined tokens. So the syntax must be changed slightly by placing parentheses around the expressions, and a FREEZE macro must also surround it (I just picked FREEZE; it could be called anything):

FREEZE(take(x*x) with(x, container))

Using this syntax you can convert it to a preprocessor sequence (using the Boost.Preprocessor library, of course). Once you have it as a preprocessor sequence, you can apply lots of algorithms to it to transform it however you like. A similar approach is used by the Linq library for C++, where you can write this:

LINQ(from(x, numbers) where(x > 2) select(x * x))

Now, to convert to a pp sequence first you need to define the keywords to be parsed, like this:

#define KEYWORD(x) BOOST_PP_CAT(KEYWORD_, x)
#define KEYWORD_take (take)
#define KEYWORD_with (with)

So the way this will work is when you call KEYWORD(take(x*x) with(x, container)) it will expand to (take)(x*x) with(x, container), which is the first step towards converting it to a pp sequence. Now to keep going we need to use a while construct from the Boost.Preprocessor library, but first we need to define some little macros to help us along the way:

// Detects if the first token is parenthesis
#define IS_PAREN(x) IS_PAREN_CHECK(IS_PAREN_PROBE x)
#define IS_PAREN_CHECK(...) IS_PAREN_CHECK_N(__VA_ARGS__,0)
#define IS_PAREN_PROBE(...) ~, 1,
#define IS_PAREN_CHECK_N(x, n, ...) n
// Detect if the parameter is empty, works even if parenthesis are given
#define IS_EMPTY(x) BOOST_PP_CAT(IS_EMPTY_, IS_PAREN(x))(x)
#define IS_EMPTY_0(x) BOOST_PP_IS_EMPTY(x)
#define IS_EMPTY_1(x) 0 

// Retrieves the first element of the sequence
// Example:
// HEAD((1)(2)(3)) // Expands to (1)
#define HEAD(x) PICK_HEAD(MARK x)
#define MARK(...) (__VA_ARGS__),
#define PICK_HEAD(...) PICK_HEAD_I(__VA_ARGS__,)
#define PICK_HEAD_I(x, ...) x

// Retrieves the tail of the sequence
// Example:
// TAIL((1)(2)(3)) // Expands to (2)(3)
#define TAIL(x) EAT x
#define EAT(...)
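
To see what these helpers do, the expansions can be turned into string literals and inspected. The sketch below repeats the macros so that it stands on its own, with three assumptions that are not part of the answer: CAT is a stand-in for BOOST_PP_CAT, STR is a stringifying helper, and IS_PAREN_CHECK passes one extra dummy argument so that the variadic part is never empty.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for BOOST_PP_CAT so the sketch does not depend on Boost.
#define CAT(a, b) CAT_I(a, b)
#define CAT_I(a, b) a##b

#define KEYWORD(x) CAT(KEYWORD_, x)
#define KEYWORD_take (take)
#define KEYWORD_with (with)

// As above, plus a dummy "~" so "..." always receives an argument.
#define IS_PAREN(x) IS_PAREN_CHECK(IS_PAREN_PROBE x)
#define IS_PAREN_CHECK(...) IS_PAREN_CHECK_N(__VA_ARGS__, 0, ~)
#define IS_PAREN_PROBE(...) ~, 1,
#define IS_PAREN_CHECK_N(x, n, ...) n

#define HEAD(x) PICK_HEAD(MARK x)
#define MARK(...) (__VA_ARGS__),
#define PICK_HEAD(...) PICK_HEAD_I(__VA_ARGS__,)
#define PICK_HEAD_I(x, ...) x

#define TAIL(x) EAT x
#define EAT(...)

// Hypothetical helper: expands its argument, then turns it into a string.
#define STR(...) STR_I(__VA_ARGS__)
#define STR_I(...) #__VA_ARGS__

// IS_PAREN yields 1 only when the first token is a parenthesized group.
static_assert(IS_PAREN((x*x) with(x, container)) == 1, "starts with parens");
static_assert(IS_PAREN(take(x*x)) == 0, "starts with a keyword");

// The keyword gets wrapped in parentheses; the rest is left untouched.
const char* const keyword_text = STR(KEYWORD(take(x*x) with(x, container)));
// HEAD keeps the first parenthesized element, TAIL drops it.
const char* const head_text = STR(HEAD((1)(2)(3)));  // "(1)"
const char* const tail_text = STR(TAIL((1)(2)(3)));  // "(2)(3)"
```

Printing keyword_text is a convenient way to check each step while developing macros like these.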

This provides somewhat better detection of parentheses and emptiness. It also provides HEAD and TAIL macros, which work slightly differently from BOOST_PP_SEQ_HEAD (Boost.Preprocessor can't handle sequences that have variadic parameters). Now here's how we can define a TO_SEQ macro which uses the while construct:

#define TO_SEQ(x) TO_SEQ_WHILE_M \
( \
BOOST_PP_WHILE(TO_SEQ_WHILE_P, TO_SEQ_WHILE_O, (,x)) \
)

#define TO_SEQ_WHILE_P(r, state) TO_SEQ_P state
#define TO_SEQ_WHILE_O(r, state) TO_SEQ_O state
#define TO_SEQ_WHILE_M(state) TO_SEQ_M state

#define TO_SEQ_P(prev, tail) BOOST_PP_NOT(IS_EMPTY(tail))
#define TO_SEQ_O(prev, tail) \
BOOST_PP_IF(IS_PAREN(tail), \
TO_SEQ_PAREN, \
TO_SEQ_KEYWORD \
)(prev, tail)
#define TO_SEQ_PAREN(prev, tail) \
(prev (HEAD(tail)), TAIL(tail))

#define TO_SEQ_KEYWORD(prev, tail) \
TO_SEQ_REPLACE(prev, KEYWORD(tail))

#define TO_SEQ_REPLACE(prev, tail) \
(prev HEAD(tail), TAIL(tail))

#define TO_SEQ_M(prev, tail) prev

Now when you call TO_SEQ(take(x*x) with(x, container)) you should get a sequence (take)((x*x))(with)((x, container)).

Now, this sequence is much easier to work with (because of the Boost.Preprocessor library). You can now reverse it, transform it, filter it, fold over it, etc. This is extremely powerful, and is much more flexible than having them defined as macros. For example, in the Linq library the query from(x, numbers) where(x > 2) select(x * x) gets transformed into these macros:

LINQ_WHERE(x, numbers)(x > 2) LINQ_SELECT(x, numbers)(x * x)

With these macros, it will then generate the lambda for the list comprehension, but they have much more to work with when generating the lambda. The same can be done in your library too: take(x*x) with(x, container) could be transformed into something like this:

FREEZE_TAKE(x, container, x*x)

Plus, you aren't defining macros like take which invade the global space.
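
What FREEZE_TAKE expands to is up to you. As one possibility, here is a sketch in which it generates a std::transform call with a lambda; this definition and the function frozen_squares are assumptions for illustration only, not part of any existing library.

```cpp
#include <algorithm>
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical definition of FREEZE_TAKE. var becomes the lambda
// parameter, so expr can refer to it without any earlier declaration.
#define FREEZE_TAKE(var, con, expr)                                     \
    std::transform((con).begin(), (con).end(), (con).begin(),           \
        [](const typename std::decay<decltype(con)>::type::value_type&  \
               var) { return expr; })

// Usage sketch: the name x exists only inside the generated lambda.
std::vector<int> frozen_squares(std::vector<int> container) {
    FREEZE_TAKE(x, container, x * x);
    return container;
}
```

Since the variable name arrives as its own macro argument, it can be anything, which covers the 'somestupidvariablename' case from the question.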

Note: These macros require a C99 preprocessor and thus won't work in MSVC. (There are workarounds, though.)

Source: https://stackoverflow.com/questions/11009233/freezing-an-expression

Tags

c++

templates

c++11

c-preprocessor