c78c8190d5
Thanks to @asLody for optimizing this function. That work made it clear that the function should be optimized further. The current table assumes that the host GPU can invert for free, so only AND, OR, and XOR are counted in the performance metric. Performance results: Instructions: 0: 8, 1: 30, 2: 114, 3: 80, 4: 24. Latency: 0: 8, 1: 30, 2: 194, 3: 24. |
||
---|---|---|
.. | ||
backend | ||
frontend | ||
ir_opt | ||
CMakeLists.txt | ||
environment.h | ||
exception.h | ||
host_translate_info.h | ||
object_pool.h | ||
profile.h | ||
program_header.h | ||
runtime_info.h | ||
shader_info.h | ||
stage.h | ||
varying_state.h |
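
The commit message above describes a cost metric in which inversion is free and only AND, OR, and XOR are counted. The following is a minimal sketch of how such a metric could be computed over a small expression tree. The `Expr` type and the helpers `MakeInput`, `MakeNot`, `MakeBinary`, `InstructionCount`, and `Latency` are hypothetical names chosen for illustration; they are not taken from this repository, and the actual table may have been generated differently.

```cpp
#include <algorithm>
#include <memory>
#include <utility>

// Hypothetical expression node for illustration only.
enum class Op { Input, Not, And, Or, Xor };

struct Expr {
    Op op;
    std::shared_ptr<Expr> lhs;
    std::shared_ptr<Expr> rhs;
};

using ExprPtr = std::shared_ptr<Expr>;

ExprPtr MakeInput() {
    return std::make_shared<Expr>(Expr{Op::Input, nullptr, nullptr});
}

ExprPtr MakeNot(ExprPtr a) {
    return std::make_shared<Expr>(Expr{Op::Not, std::move(a), nullptr});
}

ExprPtr MakeBinary(Op op, ExprPtr a, ExprPtr b) {
    return std::make_shared<Expr>(Expr{op, std::move(a), std::move(b)});
}

// Instruction count: NOT is assumed to be free on the host GPU,
// so only AND, OR, and XOR nodes add to the total.
// Shared subexpressions are counted once per use in this simple version.
int InstructionCount(const ExprPtr& e) {
    switch (e->op) {
    case Op::Input:
        return 0;
    case Op::Not:
        return InstructionCount(e->lhs);
    default:
        return 1 + InstructionCount(e->lhs) + InstructionCount(e->rhs);
    }
}

// Latency: length of the longest dependency chain of counted operations.
int Latency(const ExprPtr& e) {
    switch (e->op) {
    case Op::Input:
        return 0;
    case Op::Not:
        return Latency(e->lhs);
    default:
        return 1 + std::max(Latency(e->lhs), Latency(e->rhs));
    }
}
```

Under these assumptions, an expression such as `(a AND b) XOR (NOT c)` costs two instructions with a latency of two, while `(a AND b) OR ((NOT a) XOR c)` costs three instructions but still has a latency of two, because its two inner operations are independent of each other.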