C++ Template Metaprogramming - C++ Programming Basics

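The minimum instantiation depth recommended by the standard is only 17 levels. Most compilers, however, can handle at least several dozen levels; some allow several hundred, and a few go on to several thousand, until resources are exhausted.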

What happens if we remove the partial specialization of the XY template?

// xy2.h

// Primary template
template <int Base, int Exponent>
class XY
{
public:
    enum { result_ = Base * XY<Base, Exponent - 1>::result_ };
};

The test program is unchanged:

// xytest2.cpp

#include <iostream>
#include "xy2.h"

int main()
{
    std::cout << "X^Y<5, 4>::result_ = " << XY<5, 4>::result_;
}

Run the following compile command:

C:\>g++ -c xytest2.cpp

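You will see the recursive instantiation continue until it hits the compiler's limit.

GNU C++ (MinGW Special) 3.2 has a default instantiation depth limit of 500 levels. You can also adjust the depth by hand: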

C:\>g++ -ftemplate-depth-3400 -c xytest2.cpp

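In fact, for this example the depth that g++ 3.2 accepts can be pushed a little further (in my tests, to no more than 3450 levels).

Therefore, when using template metaprogramming we must always provide a specialization of the primary template (a partial specialization, a full specialization, or both) to serve as the termination criterion for the recursive template instantiation.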
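As a minimal sketch of such a termination criterion, the XY template from xy2.h can be completed with a partial specialization for an exponent of 0. The file name xy3.h and the parameter name Exponent are assumptions made for illustration; the specialization simply encodes X^0 = 1.

```cpp
// xy3.h (hypothetical file name, for illustration only)

// Primary template: X^Y = X * X^(Y-1)
template <int Base, int Exponent>
class XY
{
public:
    enum { result_ = Base * XY<Base, Exponent - 1>::result_ };
};

// Partial specialization that ends the recursion: X^0 = 1
template <int Base>
class XY<Base, 0>
{
public:
    enum { result_ = 1 };
};
```

With this specialization in place, XY<5, 4>::result_ needs only five instantiations (exponents 4 down to 0) and is evaluated entirely at compile time.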

Unrolling loops with template metaprogramming

One of the earliest practical applications of template metaprogramming was loop unrolling in numerical computation. For example, a common way to sum the elements of an array is:

// sumarray.h

template <typename T>
inline T sum_array(int Dim, T* a)
{
    T result = T();
    for (int i = 0; i < Dim; ++i)
    {
        result += a[i];
    }
    return result;
}

This certainly works, but we can also use template metaprogramming to unroll the loop:

// sumarray2.h

// Primary template
template <int Dim, typename T>
class Sumarray
{
public:
    static T result(T* a)
    {
        return a[0] + Sumarray<Dim - 1, T>::result(a + 1);
    }
};

// Partial specialization serving as the termination criterion
template <typename T>
class Sumarray<1, T>
{
public:
    static T result(T* a)
    {
        return a[0];
    }
};

It is used like this:

// sumarraytest2.cpp

#include <iostream>
#include "sumarray2.h"

int main()
{
    int a[6] = {1, 2, 3, 4, 5, 6};
    std::cout << "Sumarray<6, int>::result(a) = " << Sumarray<6, int>::result(a);
}

When we evaluate Sumarray<6, int>::result(a), the instantiation proceeds as follows:

Sumarray<6, int>::result(a)
= a[0] + Sumarray<5, int>::result(a+1)
= a[0] + a[1] + Sumarray<4, int>::result(a+2)
= a[0] + a[1] + a[2] + Sumarray<3, int>::result(a+3)
= a[0] + a[1] + a[2] + a[3] + Sumarray<2, int>::result(a+4)
= a[0] + a[1] + a[2] + a[3] + a[4] + Sumarray<1, int>::result(a+5)
= a[0] + a[1] + a[2] + a[3] + a[4] + a[5]

As you can see, the loop is unrolled into a[0] + a[1] + a[2] + a[3] + a[4] + a[5]. Provided the compiler inlines the nested calls, this kind of straightforward expansion is almost always more efficient than the loop.
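One possible convenience, which is not part of the original code, is a small wrapper that deduces the dimension and element type from a reference to the array, so the caller does not have to spell out 6 and int by hand. The name sum_unrolled is hypothetical; Sumarray is repeated only so that the sketch is self-contained.

```cpp
// Sumarray as in sumarray2.h, repeated here so the sketch stands alone.
template <int Dim, typename T>
class Sumarray
{
public:
    static T result(T* a)
    {
        return a[0] + Sumarray<Dim - 1, T>::result(a + 1);
    }
};

// Partial specialization serving as the termination criterion
template <typename T>
class Sumarray<1, T>
{
public:
    static T result(T* a)
    {
        return a[0];
    }
};

// Hypothetical helper: Dim and T are deduced from a reference to a
// built-in array, then forwarded to the unrolled Sumarray.
template <typename T, int Dim>
inline T sum_unrolled(T (&a)[Dim])
{
    return Sumarray<Dim, T>::result(a);
}
```

Given int a[6] = {1, 2, 3, 4, 5, 6}, the call sum_unrolled(a) instantiates Sumarray<6, int> exactly as in the expansion above. The helper only accepts true arrays whose size is known at compile time; a plain pointer will not compile.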

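Demonstrating the advantage of loop unrolling on an array of 6 million elements might be more convincing. Such an array is easy to generate; if you are interested, try it and compare the results.

(Thanks to a friend for testing this. He reports: "In actual tests on Visual C++ 2003, the compiler appears to perform tail-recursion optimization, so it is not bound by the recursion depth limit described above. However, once the number of array elements reached 4796 the sum was no longer correct; the program printed an empty line, so something had already gone wrong." Added December 30, 2003.)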

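Template metaprogramming in numerical computing libraries

Blitz++ owes its "lightning-fast" speed (which is the literal meaning of blitz) in no small part to template metaprograms. Blitz++ uses metaprogramming techniques extensively; you can see how in the source code of these files: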
dot.h
matassign.h
matmat.h
matvec.h
metaprog.h
product.h
sum.h
vecassign.h
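To give a rough feel for what such metaprograms do, here is a deliberately simplified, standalone sketch of an unrolled dot product. It is not Blitz++ code: the names DotProduct and dot are hypothetical, it works only on built-in arrays of one element type, and it leaves out the type promotion and expression handling that the real library performs.

```cpp
// Primary template: multiplies the element pair at index I,
// then recurses to index I + 1.
template <int N, int I>
class DotProduct
{
public:
    template <typename T>
    static inline T f(const T* a, const T* b)
    {
        return a[I] * b[I] + DotProduct<N, I + 1>::f(a, b);
    }
};

// Partial specialization that ends the recursion once I reaches N.
template <int N>
class DotProduct<N, N>
{
public:
    template <typename T>
    static inline T f(const T*, const T*)
    {
        return T();  // value-initialized T, i.e. zero for arithmetic types
    }
};

// Hypothetical helper: deduces the length N from the array arguments.
template <typename T, int N>
inline T dot(const T (&a)[N], const T (&b)[N])
{
    return DotProduct<N, 0>::f(a, b);
}
```

For two arrays of length 3, dot(a, b) expands to a[0]*b[0] + a[1]*b[1] + a[2]*b[2] + T(), with no loop counter or branch left in the generated expression.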

Let us look at the template metaprogram in the dot.h file of the Blitz++ library:

template<int N, int I>
class _bz_meta_vectorDot {
public:
    enum { loopFlag = (I < N-1) ? 1 : 0 };

    template<class T_expr1, class T_expr2>
    static inline BZ_PROMOTE(_bz_typename T_expr1::T_numtype, _bz_typename T_expr2::T_numtype)
    f(const T_expr1& a, const T_expr2& b)
    {
        return a[I] * b[I] + _bz_meta_vectorDot<loopFlag * N, loopFlag * (I+1)>::f(a,b);
    }

    template<class T_expr1, class T_expr2>
    static inline BZ_PROMOTE(_bz_typename T_expr1::T_numtype, _bz_typename T_expr2::T_numtype)
    f_value_ref(T_expr1 a, const T_expr2& b)
    {
        return a[I] * b[I] + _bz_meta_vectorDot<loopFlag * N, loopFlag * (I+1)>::f(a,b);
    }

    template<class T_expr1, class T_expr2>
    static inline BZ_PROMOTE(_bz_typename T_expr1::T_numtype, _bz_typename T_expr2::T_numtype)
    f_ref_value(const T_expr1& a, T_expr2 b)
    {
        return a[I] * b[I] + _bz_meta_vectorDot<loopFlag * N, loopFlag * (I+1)>::f(a,b);
    }

    template<class T_expr1, class P_numtype2>
    static inline BZ_PROMOTE(_bz_typename T_expr1::T_numtype, P_numtype2)
    dotWithArgs(const T_expr1& a, P_numtype2 i1, P_numtype2 i2=0,
        P_numtype2 i3=0, P_numtype2 i4=0, P_numtype2 i5=0, P_numtype2 i6=0,
        P_nu
