Runtime function branching on compile

Consider a compile-time function of the form:

template <unsigned int Value>
constexpr unsigned int function() 
{
    // Just for the example, but it could be very complicated here
    return Value*Value;
}

How to write the runtime equivalent that will call the right compile-time version using template metaprogramming, knowing that the value will always be in the [From, To[ interval:

template <unsigned int From, unsigned int To, /* Something here */>
constexpr unsigned int function(const unsigned int value)
{
    // Something here
}

The branching to the right version should be as fast as possible.

For example function<0, 32>(6) (runtime version) should call function<6>() (compile-time version).

EDIT: Explanation: Why do I want to do this? This function (real use case) need to be as fast as possible (supercomputing issues). By providing the parameter at compile-time, I can generate very efficient code. If I simply move the value from the template parameter to a function parameter, the code is between 10 and 100 times slower. But in fact, this parameter does not have a very wide range of possible values (like between 0 and 32 ): so it would be far more efficient to branch at runtime on the right compile-time version.


The easiest way is to set up a recursive cascading if / recurse chain.

#define RETURNS(X) -> decltype(X) { return (X); }

template<unsigned From, unsigned To, typename Target>
struct CallIf {
  constexpr auto operator()( unsigned N )
    RETURNS( (N==From)?Target::template func<From>():CallIf<From+1, To, Target>()( N ) );
};
template<unsigned From, typename Target>
struct CallIf<From, From+1, Target> {
  constexpr auto operator()( unsigned N )
    RETURNS( Target::template func<From>() );
};

struct Func {
  template<unsigned V>
  constexpr unsigned func() const {
    return function<V>();
  }
};

or something like that, and rely on the compiler collapsing that chain of if s down to one. (if you know the return type, you can do away with that annoying RETURNS macro, or if you have C++1y features you can do the same).

Now, you might want to compare this to a version that does a binary search on value in that range, using a similar recursive call case. Similarly, you could do it by checking and setting bits in a compile time value.

template<unsigned From, unsigned To, typename Target>
struct CallIf {
  enum { Mid = From + (To-From)/2 }; // avoid overflow risk
  constexpr auto operator()( unsigned N )
    RETURNS( (N>=Mid)?CallIf<Mid, To, Target>()(N):CallIf<From,Mid,Target>()(N) );
};

with the same specialization for the 1-width case.

Another approach would be to set up a static array of invocations of function<V> and then do an array dereference at run time:

template<unsigned...> struct indexes {};
template<unsigned Min, unsigned Max, unsigned... Is> struct make_indexes:make_indexes<Min, Max-1, Max-1, Is...> {};
template<unsigned Min, unsigned... Is> struct make_indexes<Min, Min, Is...>:indexes<Is...> {};

template<unsigned From, unsigned To, typename Target>
struct CallIf {
  template<unsigned... Is>
  unsigned Invoke( indexes<Is...>, unsigned N ) const {
    typedef unsigned(*target)();
    static target ts[] = { &(function<Is>)... };
    return ts[N-From]();
  };
  unsigned operator()( unsigned N ) const {
    return Invoke( make_indexes<From, To>(), N );
  }
};

although I'm not sure how to make the above constexpr easily in C++11 at least, and I skipped out on doing return-type deduction.

None of the above are tested or compiled, so most likely will require some repair. But the core concept will work. The specialization might need some work: doing <From, From+1 to terminate is something I haven't done in practice: if that causes issues, you can do <From, Width based helper, and specialize on Width=1 .

I personally call this technique (embodied in the CallIf type above) the "magic switch", where we take a run time valid and "magically" make it a compile time value. I only mention this because you can probably find me talking about it on stack overflow by googling Yakk and "magic switch" (and site:stackoverflow.com) for other variations, some of which have been compiled and have live examples attached.

Finally, while the last version (the manual jump table) may be fastest, if you are calling it so often that the speed of this invocation is key, you might want to think about wrapping not just the call site, but the algorithm surrounding it in a magic switch: do the dispatching earlier. If, however, you only get the index at the last moment, and you are ok with a non- constexpr call, it should work. Note that static arrays will be created for each Function and To and From used.


您可以创建一个( constexpr )结果数组,如下所示:

#if 1 // Not in C++11

template <std::size_t ...> struct index_sequence {};

template <std::size_t I, std::size_t ...Is>
struct make_index_sequence : make_index_sequence < I - 1, I - 1, Is... > {};

template <std::size_t ... Is>
struct make_index_sequence<0, Is...> : index_sequence<Is...> {};

#endif

template <unsigned int Value>
constexpr unsigned int function()
{
    // Just for the example, but it could be very complicated here
    return Value * Value;
}

namespace detail
{
    template <std::size_t From, std::size_t...Is>
    struct result_array
    {
        static constexpr std::array<unsigned int, sizeof...(Is)> values = {{::function<From + Is>()...}};
    };

    template <std::size_t From, std::size_t...Is>
    constexpr std::array<unsigned int, sizeof...(Is)> result_array<From, Is...>::values;

    template <std::size_t From, std::size_t...Is>
    constexpr unsigned int function(unsigned int value, index_sequence<Is...>)
    {
        return result_array<From, Is...>::values[value - From];
    }
} // namespace detail

template <unsigned int From, unsigned int To>
constexpr unsigned int function(const unsigned int value)
{
    static_assert(From < To, "Invalid template parameters");
    return detail::function<From>(value, make_index_sequence<std::size_t(To + 1 - From)>());
}
链接地址: http://www.djcxy.com/p/68364.html

上一篇: 编译时为什么不评估constexpr(MSVC 2015)?

下一篇: 运行时函数在编译时分支