Search This Blog

Monday, March 28, 2011

Extended GMM for Background Subtraction

http://code.google.com/p/cubgs/

Introduction

This is an implementation of the extended Gaussian mixture model (GMM) applied to background subtraction on the NVIDIA CUDA platform. The original CPU idea and implementation can be found in Refs. 1 and 2.

Usage

<T.B.D>
Please see the sample code in svn, read the article on CodeProject, read the paper, or consult the lengthy technical report.
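Until the usage notes are filled in, here is a minimal sketch of the general workflow, using OpenCV's built-in BackgroundSubtractorMOG2 (a CPU implementation of the same Zivkovic algorithm) rather than the cubgs API itself; the video file name below is made up:

    // Background subtraction with OpenCV's MOG2, which implements
    // Zivkovic's extended GMM (Refs. 1 and 2) on the CPU.
    // Illustrates the workflow only; this is not the cubgs API.
    #include <opencv2/opencv.hpp>

    int main() {
        cv::VideoCapture cap("test.avi");             // hypothetical test clip
        cv::Ptr<cv::BackgroundSubtractorMOG2> mog2 =
            cv::createBackgroundSubtractorMOG2(500,   // history length
                                               16.0,  // variance threshold
                                               true); // detect shadows
        cv::Mat frame, fgMask;
        while (cap.read(frame)) {
            mog2->apply(frame, fgMask);               // per-pixel GMM update + classify
            cv::imshow("foreground", fgMask);
            if (cv::waitKey(30) >= 0) break;
        }
        return 0;
    }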

License

The testing video clips and libraries (CUDA, OpenCV) belong to their respective authors.
The following conditions are derived from Zivkovic's original CPU implementation:
This work may not be copied or reproduced in whole or in part for any commercial purpose. Permission to copy in whole or in part without payment of fee is granted for nonprofit educational and research purposes provided that all such whole or partial copies include the following:
  • this notice,
  • an acknowledgment of the authors and individual contributions to the work;
Copying, reproduction, or republishing for any other purpose shall require a license. Please contact the author in such cases. All the code is provided without any guarantee.
If you use this project in academic publications, please cite:
Vu Pham, Phong Vo, Hung Vu Thanh, Bac Le Hoai, "GPU Implementation of Extended Gaussian Mixture Model for Background Subtraction", IEEE-RIVF 2010 International Conference on Computing and Telecommunication Technologies, Vietnam National University, Nov. 01-04, 2010. DOI: 10.1109/RIVF.2010.5634007.

References

  1. Z. Zivkovic, "Improved adaptive Gaussian mixture model for background subtraction", International Conference on Pattern Recognition, Vol. 2, pages 28-31, 2004.
  2. Z. Zivkovic, F. van der Heijden, "Efficient adaptive density estimation per image pixel for the task of background subtraction", Pattern Recognition Letters, Vol. 27, No. 7, pages 773-780, 2006.

Saturday, March 26, 2011

The EM Algorithm

http://liuxqsmile.blogbus.com/logs/17290002.html

Copyright notice: when reposting, please indicate the original source, the author, and this notice with a hyperlink:
http://liuxqsmile.blogbus.com/logs/17290002.html

The Expectation-Maximization Algorithm for Gaussian Mixture Models

Most of the content of this article is translated from the Expectation-maximization algorithm entry on Answers.com.
For an incomplete sample set, let y be the observed (known) data and z the missing data; z and y together constitute the complete data. z may consist of missing observations, or of latent variables that would simplify the solution if they were known. For example, in a mixture model, the computation of the likelihood function is greatly simplified if the component that produced each sample point is known.
Let p be the joint probability density function of the complete data, with parameter θ: $p(\mathbf{y}, \mathbf{z} \mid \theta)$. It can also be viewed as the likelihood function of the complete data (i.e. how well the sample set (y, z) fits the distribution with parameter θ), a function of θ. Further, by Bayes' rule and the law of total probability, the conditional distribution of the missing data given the observed data is:

$$p(\mathbf{z} \mid \mathbf{y}, \theta) = \frac{p(\mathbf{y}, \mathbf{z} \mid \theta)}{p(\mathbf{y} \mid \theta)} = \frac{p(\mathbf{y} \mid \mathbf{z}, \theta)\, p(\mathbf{z} \mid \theta)}{\int p(\mathbf{y} \mid \hat{\mathbf{z}}, \theta)\, p(\hat{\mathbf{z}} \mid \theta)\, d\hat{\mathbf{z}}}$$

This formula requires only the conditional probability of the observed data given the missing data (the likelihood) $p(\mathbf{y} \mid \mathbf{z}, \theta)$ and the probability distribution of the missing data $p(\mathbf{z} \mid \theta)$. (The denominator is needed only for normalization.)
The EM algorithm proceeds iteratively, constructing better and better estimates $\theta_1, \theta_2, \ldots$ that progressively improve the initial $\theta_0$. The recursive update formula for θ is:

$$\theta_{n+1} = \arg\max_{\theta} Q(\theta)$$

Q(θ) is the expectation of the log-likelihood $\ln p(\mathbf{y}, \mathbf{z} \mid \theta)$ over the missing data, $Q(\theta) = \mathrm{E}_{\mathbf{z} \mid \mathbf{y}, \theta_n}\!\left[\ln p(\mathbf{y}, \mathbf{z} \mid \theta)\right]$. In other words: we do not know the values of all the data, so we do not know the exact value of the likelihood function; but for the given observed data y we can obtain the probability distribution of the unknown z. For each value of the missing observations (each element of z) there is a likelihood value as a function of θ, so we can compute the mathematical expectation of the likelihood function. This expectation clearly depends on the value of θ.
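To make one EM pass concrete, below is a minimal sketch for a two-component one-dimensional Gaussian mixture (the data, starting values, and iteration count are illustrative, not from the original article):

    // Minimal EM sketch for a two-component 1-D Gaussian mixture.
    // E-step: compute responsibilities p(z | y, theta_n);
    // M-step: theta_{n+1} = argmax Q(theta), closed-form for Gaussians.
    #include <cmath>
    #include <cstdio>
    #include <vector>

    static double gauss(double y, double mu, double var) {
        const double PI = 3.14159265358979323846;
        return std::exp(-(y - mu) * (y - mu) / (2 * var)) / std::sqrt(2 * PI * var);
    }

    int main() {
        std::vector<double> y = {0.1, 0.3, -0.2, 4.8, 5.1, 5.3, 0.0, 4.9};
        double w = 0.5, mu[2] = {0.0, 1.0}, var[2] = {1.0, 1.0};   // theta_0
        for (int it = 0; it < 50; ++it) {
            double n1 = 0, s1 = 0, q1 = 0, n2 = 0, s2 = 0, q2 = 0;
            for (double yi : y) {                        // E-step
                double p1 = w * gauss(yi, mu[0], var[0]);
                double p2 = (1 - w) * gauss(yi, mu[1], var[1]);
                double r = p1 / (p1 + p2);               // responsibility of component 1
                n1 += r;     s1 += r * yi;       q1 += r * yi * yi;
                n2 += 1 - r; s2 += (1 - r) * yi; q2 += (1 - r) * yi * yi;
            }
            w = n1 / y.size();                           // M-step
            mu[0] = s1 / n1; var[0] = q1 / n1 - mu[0] * mu[0];
            mu[1] = s2 / n2; var[1] = q2 / n2 - mu[1] * mu[1];
        }
        std::printf("w=%.3f mu0=%.3f mu1=%.3f\n", w, mu[0], mu[1]);
        return 0;
    }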

Thursday, March 24, 2011

Gaussian Mixture

EM of GMMs with GPU Acceleration (CUDA)

EM converging on correct parameters
Sample Expectation Maximization; red Xs represent initial guessed means.

About:

This is a parallel implementation of the Expectation Maximization algorithm for Gaussian Mixture Models, designed to run on NVIDIA graphics cards supporting CUDA.  On my machine*, it provides up to a 170x performance increase versus a CPU reference version.

Expectation Maximization is a powerful method for converging to a local maximum of the likelihood.  K-means clustering is a special case of expectation maximization for Gaussian clusters.

It was created originally as a term project for my Computational Statistics with Application to Bioinformatics class.  See the report for more information.

E-mail me with any questions or comments!
*CUDA 2.1, GTX285, C2D E8400 @ 3GHZ, 4GB ram, Vista 64-bit


Implementation

The interesting code is all in gpugaumixmod.h and gpugaumixmod_kernel.h.
The reference CPU implementation is in gaumixmod.h, which is from Numerical Recipes 3.
It can be integrated into any C program on a CUDA-enabled system.  Additionally,
Matlab Mex integration is provided in gmm.cu.

Compiling

You may have trouble compiling as-is, as the config files are set up to
run on my Windows Vista 64-bit machine, but it's just a standard CUDA kernel
underneath, so it should be portable.  A precompiled Windows 64-bit version is
included.
See compile.m for the command I use to compile the CUDA/Mex files.
You'll need the CUDA drivers and toolkit from NVIDIA:
http://www.nvidia.com/object/cuda_get.html
You'll also want to install the MATLAB plug-in for CUDA:
http://developer.nvidia.com/object/matlab_cuda.html

Running

Once compiled, start off by running gmm_example in Matlab to see it in action.
See experiment1, experiment2, and experiment3 for ready-to-run experiments.


CUDA kernel performance

Updates:

Initial release May 6, 2009
Updated May 8, 2009:
-Fixed synchronization issues in kernel
-Cleaned up mex wrapper
-Cleaned up some potential memory issues
-Combined GpuGauMixmod and GauMixmod classes into single class
Updated May 10, 2009:
-Performance increases, doubling speed vs. previous version.
 Best case it performs at 170x the reference version's speed
 (16 dims, 16 clusters, 1000000 data points).
    *Unrolled some loop tails
    *Decomposed problem even more
    *Changed unnecessary double to float
    *Using fast exp and log functions now.
    *Memcopy to GPU is somewhat lazier.
-Updated plotting and experiment code
-Uncombined GpuGauMixmod and GauMixmod classes again.
Updated May 21, 2009:
-Now handles simultaneous random restarts.  Just make the
 matrix passed to it 3-dimensional, with the 3rd dimension containing different
 restarts.
  *The step function will return a list of log likelihoods for the different restarts.
  *The functions for querying response and mean/sigmas now take an optional second
   parameter for getting the data for the corresponding restart.

Wednesday, March 23, 2011

Write Makefiles with Me (Part 5)

6. Multiple Targets
A rule in a Makefile may have more than one target — multiple targets are supported. It may happen that several of our targets depend on the same file and are generated by largely similar commands, so we can merge them into a single rule. Of course, the merged rule executes the same command for all of its targets, which could cause us trouble; fortunately we can use the automatic variable "$@" (automatic variables are described later). This variable represents the current target out of the rule's set of targets — the rule is run once per target, with "$@" bound to each in turn. That may sound abstract, so let's look at an example.
    bigoutput littleoutput : text.g
            generate text.g -$(subst output,,$@) > $@

    The rule above is equivalent to:
    bigoutput : text.g
            generate text.g -big > bigoutput
    littleoutput : text.g
            generate text.g -little > littleoutput

    In "-$(subst output,,$@)", the "$" invokes a Makefile function; the function name is subst and what follows are its arguments. Functions will be covered later. Here subst performs a string substitution, deleting "output" from the target name. "$@" stands for the collection of targets; like indexing an array, "$@" takes each target in turn, and the command is executed once for each.

7. Static Pattern Rules
Static pattern rules make it much easier to define rules with multiple targets, and they make our rules far more elastic and flexible. Let's look at the syntax first:
    <targets ...>: <target-pattern>: <prereq-patterns ...>
            <commands>
            ...


    targets defines a series of target files and may contain wildcards. It is a set of targets.
    target-pattern specifies the pattern of the targets, i.e. the pattern of the target set.
    prereq-patterns is the prerequisite pattern of the targets; it defines the prerequisites once more, based on what target-pattern matched.
It may still not be clear what these three things are from a description alone, so here is an example. If <target-pattern> is defined as "%.o", it means our <target> set consists of files ending in ".o"; and if <prereq-patterns> is defined as "%.c", it means that for each target formed by <target-pattern> we take the part matched by "%" (that is, the name with the [.o] ending removed) and append [.c] to it, forming the new set of prerequisites.
Therefore, our "target pattern" and "prerequisite pattern" should each contain the "%" character. If a file name itself contains "%", you can escape it with a backslash "\" to denote the literal "%" character.
Here is an example:
    objects = foo.o bar.o
    all: $(objects)
    $(objects): %.o: %.c
            $(CC) -c $(CFLAGS) $< -o $@


In the example above, the targets are taken from $(objects). "%.o" means all targets ending in ".o", i.e. "foo.o bar.o" — the pattern of the $(objects) set. The prerequisite pattern "%.c" takes the part matched by "%" in "%.o", i.e. "foo bar", and appends the ".c" suffix, so our prerequisites are "foo.c bar.c". The "$<" and "$@" in the command are automatic variables: "$<" stands for the first prerequisite of each rule instance (here "foo.c" and "bar.c"), and "$@" for the target (here "foo.o" and "bar.o"). The rule above thus expands to the following rules:

    foo.o : foo.c
            $(CC) -c $(CFLAGS) foo.c -o foo.o
    bar.o : bar.c
            $(CC) -c $(CFLAGS) bar.c -o bar.o

Imagine having several hundred "%.o" targets: with this simple "static pattern rule" a whole pile of rules can be written at once — remarkably efficient. Static pattern rules are also quite flexible, and used well they are a powerful feature. Another example:

    files = foo.elc bar.o lose.o
    $(filter %.o,$(files)): %.o: %.c
            $(CC) -c $(CFLAGS) $< -o $@
    $(filter %.elc,$(files)): %.elc: %.el
            emacs -f batch-byte-compile $<


$(filter %.o,$(files)) calls the Makefile function filter, which filters the $(files) list, keeping only the words matching the pattern "%.o". The rest needs no further explanation. This example shows still more of Makefile's flexibility.


8. Automatically Generating Dependencies
In a Makefile, our dependencies may need to list a series of header files. For example, if main.c contains the line "#include "defs.h"", then the dependency should be:
    main.o : main.c defs.h
But for a larger project you would have to know exactly which C files include which header files, and carefully update the Makefile whenever a header is added or removed — unmaintainable, tedious, error-prone work. To avoid it, we can use a feature of C/C++ compilers: most of them support a "-M" option that automatically finds the header files a source file includes and generates the dependency relation. For example, if we run the command:
    cc -M main.c
its output is:
    main.o : main.c defs.h
So the compiler generates the dependency automatically, and you no longer need to write out the dependencies of each file by hand. One reminder: if you use the GNU C/C++ compiler you must use the "-MM" option, because "-M" also pulls in headers from the standard library.
    The output of gcc -M main.c is:
    main.o: main.c defs.h /usr/include/stdio.h /usr/include/features.h \
         /usr/include/sys/cdefs.h /usr/include/gnu/stubs.h \
         /usr/lib/gcc-lib/i486-suse-linux/2.95.3/include/stddef.h \
         /usr/include/bits/types.h /usr/include/bits/pthreadtypes.h \
         /usr/include/bits/sched.h /usr/include/libio.h \
         /usr/include/_G_config.h /usr/include/wchar.h \
         /usr/include/bits/wchar.h /usr/include/gconv.h \
         /usr/lib/gcc-lib/i486-suse-linux/2.95.3/include/stdarg.h \
         /usr/include/bits/stdio_lim.h


    The output of gcc -MM main.c, by contrast, is:
    main.o: main.c defs.h
So how does this compiler feature connect to our Makefile? Should the Makefile itself depend on the source files and be regenerated from them? That is not realistic, but we can achieve the effect indirectly. The GNU project recommends putting the compiler-generated dependencies of each source file into its own file: for every "name.c", generate a Makefile fragment "name.d" that holds the dependencies of the corresponding [.c] file.
We can then write a rule relating the [.c] and [.d] files, let make update or generate the [.d] files automatically, and include them in our main Makefile. In this way, the dependencies of every file are generated automatically.
Here is a pattern rule that produces the [.d] files:
    %.d: %.c
            @set -e; rm -f $@; \
             $(CC) -M $(CPPFLAGS) $< > $@.$$$$; \
             sed 's,\($*\)\.o[ :]*,\1.o $@ : ,g' < $@.$$$$ > $@; \
             rm -f $@.$$$$

This rule means that every [.d] file depends on its [.c] file. "rm -f $@" deletes the target, i.e. the [.d] file. The second line generates the dependencies for the prerequisite "$<", i.e. the [.c] file; "$@" stands for the "%.d" target — if the C file is name.c, then "%" is "name" — and "$$$$" expands to a random number (the shell's process ID), so the file produced by the second line might be "name.d.12345". The third line uses sed to perform a substitution (see the sed documentation for its usage), and the fourth line removes the temporary file.

All in all, what this pattern does is add the [.d] file itself to the compiler-generated dependencies, turning the dependency:
    main.o : main.c defs.h
into:
    main.o main.d : main.c defs.h
Our [.d] files will thus be generated and updated automatically. Of course, a [.d] file can hold more than dependencies — the build commands can go in as well, so that each [.d] file contains a complete rule. Once this is done, the next step is to pull these automatically generated rules into the main Makefile. We can use the Makefile "include" directive to include other Makefiles (described earlier), for example:
    sources = foo.c bar.c
    include $(sources:.c=.d)
In "$(sources:.c=.d)" above, ".c=.d" means a substitution: every [.c] substring in the value of $(sources) is replaced with [.d]. This "substitution" syntax is described in more detail later. Do mind the order, though: since include loads the files in sequence, a target in the first [.d] file loaded may become the default goal.

Write Makefiles with Me (Part 4)

Writing Rules
————

A rule consists of two parts: the dependency relationship and the method for generating the target.
In a Makefile, the order of rules matters, because a Makefile should have only one final goal, and all the other targets are pulled in by that goal — so you must let make know what your final goal is. Generally there may be many targets defined in a Makefile, but the target of the first rule is established as the final goal. If the first rule has several targets, the first of them becomes the final goal. That goal is what make sets out to accomplish.
All right, let's see how to write rules.

1. An Example Rule
    foo.o : foo.c defs.h       # the foo module
            cc -c -g foo.c

This example should look familiar by now. As said before, foo.o is our target, foo.c and defs.h are the source files the target depends on, and there is a single command, "cc -c -g foo.c" (beginning with a Tab). This rule tells us two things:
    1. The file dependencies: foo.o depends on the files foo.c and defs.h. If the dates of foo.c or defs.h are newer than that of foo.o, or foo.o does not exist, the dependency fires.
    2. How to generate (or update) the file foo.o — the cc command, which states how foo.o is produced. (foo.c of course includes the file defs.h.)


2. Rule Syntax
      targets : prerequisites
        command
        ...

      or like this:
      targets : prerequisites ; command
            command
            ...

targets are file names, separated by spaces; wildcards may be used. Generally our target is a single file, but it may also be several files.
command is the command line. If it is not on the same line as "target:prerequisites", it must begin with a Tab; if it is on the same line as the prerequisites, a semicolon can be used as the separator (see above).
prerequisites are the files (or targets) the target depends on. If any of them is newer than the target file, the target is considered "out of date" and in need of regeneration. This was discussed earlier.
If a command is too long, you can use a backslash ("\") as a line continuation. make places no limit on the number of characters on a line. A rule tells make two things: the file dependencies, and how to generate the target file.
Generally, make executes commands with the standard UNIX shell, /bin/sh.

3. Using Wildcards in Rules
If we want to define a series of similar files, it is natural to use wildcards. make supports three wildcards: "*", "?" and "[...]" — the same as the Unix Bourne shell.
The tilde ("~") also has a special use in file names. "~/test" means the test directory under the current user's $HOME directory, while "~hchen/test" means the test directory under the home directory of user hchen. (These are Unix basics, and make supports them.) On Windows or MS-DOS, users have no home directory, so the directory the tilde refers to is determined by the environment variable "HOME".
A wildcard stands for a series of files: "*.c", for example, means all files with the suffix c. One thing to note: if your file name actually contains a wildcard character such as "*", you can use the escape "\*" to denote the literal "*" character rather than an arbitrary string.
All right, let's look at a few examples:
    clean:
         rm -f *.o

    I need not say much about the example above: these are the wildcards the operating system's shell supports, used here inside a command.
    print: *.c
         lpr -p $?
         touch print

    The example above shows that wildcards can also appear in a rule: the target print depends on all the [.c] files. The "$?" is an automatic variable, which I will describe later.
    objects = *.o
    This example shows that wildcards may likewise be used in variables. It does not mean that [*.o] gets expanded — no! The value of objects is simply "*.o". Variables in a Makefile behave like C/C++ macros. If you want the wildcard expanded in the variable — that is, for objects to be the set of all [.o] file names — do it like this:
    objects := $(wildcard *.o)
This usage relies on the keyword "wildcard"; Makefile keywords are discussed later.

4. Searching for Files
In some large projects with a great many source files, the usual practice is to sort them into different directories. When make goes looking for a file's dependencies, you can prefix the file with a path, but the best way is to tell make a path and let it search automatically.
The special Makefile variable "VPATH" does exactly this. If it is not set, make looks for dependencies and target files only in the current directory. If it is set, make searches the specified directories for any file it cannot find in the current directory.
    VPATH = src:../headers
The definition above specifies two directories, "src" and "../headers", which make searches in that order. Directories are separated by colons. (The current directory always has the highest search priority.)
Another way to set the file search path is make's "vpath" keyword (note: all lowercase). This is not a variable but a make keyword, similar to the VPATH variable above but more flexible: it can assign different search directories to different classes of files — a very flexible feature. It can be used in three ways:
    1. vpath <pattern> <directories>
    Specify the search directories <directories> for files matching <pattern>.
    2. vpath <pattern>
    Clear the search directories for files matching <pattern>.
    3. vpath
    Clear all the file search directories that have been set.
The <pattern> in a vpath statement must contain the "%" character. "%" matches zero or more characters; for example, "%.h" means all files ending in ".h". <pattern> specifies the set of files to search for, and <directories> specifies where to search for them. For example:
    vpath %.h ../headers
This statement asks make to search the "../headers" directory for any file ending in ".h" (that was not found in the current directory).
We can use vpath statements consecutively to specify different search strategies. If consecutive vpath statements contain the same <pattern>, or a <pattern> is repeated, make carries out the searches in the order of the vpath statements. For example:
    vpath %.c foo
    vpath %   blish
    vpath %.c bar

means that for files ending in ".c", make searches the "foo" directory first, then "blish", and finally the "bar" directory.
    vpath %.c foo:bar
    vpath %   blish

whereas the statements above mean that for files ending in ".c", make searches "foo" first, then "bar", and only then "blish".

5. Phony Targets
In one of the earliest examples we mentioned a "clean" target. This is a "phony target":
    clean:
            rm *.o temp

Just like the "clean" in our earlier examples: since we generate many compiled files, we should also provide a "target" that removes them, so a complete recompile is possible. (It is used as "make clean".)
Because we never actually generate a file named "clean", a phony target is not a file but merely a label. Since it is not a file, make cannot derive its dependencies or decide whether it should be executed; it takes effect only when we name the "target" explicitly. Naturally, a phony target must not share its name with a file, or it loses its meaning as a phony target.
   Of course, to avoid such a clash with a file name we can use the special marker ".PHONY" to declare explicitly that a target is phony, telling make: whether or not such a file exists, this target is "phony".
    .PHONY : clean
With this declaration in place, whether or not a file called "clean" exists, the target is run only by typing "make clean". The whole thing can then be written as:
     .PHONY: clean
    clean:
            rm *.o temp

Phony targets generally have no prerequisite files. However, we can also give a phony target prerequisites. A phony target can likewise serve as the "default goal", provided it is placed first. For example, if your Makefile needs to produce several executables in one go, you want to type just a single make, and all the targets are written in one Makefile — then you can use this phony target feature:
    all : prog1 prog2 prog3
    .PHONY : all
    prog1 : prog1.o utils.o
            cc -o prog1 prog1.o utils.o
    prog2 : prog2.o
            cc -o prog2 prog2.o
    prog3 : prog3.o sort.o utils.o
            cc -o prog3 prog3.o sort.o utils.o
As we know, the first target in a Makefile is taken as its default goal. We declare a phony target "all" that depends on the other three targets. Since the nature of a phony target is to be executed always, its three prerequisites are always considered older than "all", so the rules of the three other targets are always resolved — achieving our aim of generating several targets in one go. ".PHONY : all" declares "all" as a phony target.
Incidentally, the example above shows that targets can appear as prerequisites too, and so can phony targets. See the following example:
    .PHONY: cleanall cleanobj cleandiff
    cleanall : cleanobj cleandiff
            rm program

    cleanobj :
            rm *.o

    cleandiff :
            rm *.diff

"make cleanall" removes everything that should be cleaned. The phony targets "cleanobj" and "cleandiff" act a bit like "subroutines": we can type "make cleanall", "make cleanobj" or "make cleandiff" to remove different classes of files.

Friday, March 18, 2011

Bit-Shift Operators in C

2010-07-02

Bit-Shift Operators in C

The bit-shift operators treat a value as a binary number and move its bits left or right by some number of positions. There are two shift operators, left shift and right shift, and both are binary operators: the first operand is the value to be shifted, the second is the number of binary positions to shift by.
(The operands, rules, results, and associativity of the shift operators were summarized in Table 2-16 of the original article, not reproduced here.)

When shifting, the bits shifted out are all discarded; what is filled into the vacated positions depends on whether the shift is to the left or to the right. For a left shift, the filled bits are always 0. For a right shift it further depends on whether the shifted value is signed: if unsigned, the filled bits are all 0; if signed, the filled bits all equal the original leftmost bit (the original sign bit). The specific shift rules are as follows.

The precedence of the shift operators:
·arithmetic operators take precedence over shift operators, which take precedence over relational operators;
·the shift operators are of the same precedence level, and associate left to right.
For example, let the unsigned short variable a be 0111 (an octal constant; binary 0000000001001001).
Then: a<<3 gives 01110 (octal; binary 0000001001001000), and a itself is unchanged;
a>>4 gives 04 (octal; binary 0000000000000100), and a is unchanged.
Again, let the short variable a be -4 (binary 1111111111111100).
Then: a<<3 gives -32 (binary 1111111111100000), and a is unchanged;
a>>4 gives -1 (binary 1111111111111111), and a is unchanged.

Left and Right Shifts in C
2006-09-30 13:52
First, the left shift. A left shift moves all the bits of a number to the left by some count; in C the operator is <<. For example:
int i = 1;
i = i << 2;    // shift the value in i left by 2 bits

That is, the binary form of 1 is 000...0001 (the number of zeros before the 1 depends on the width of int; on a 32-bit machine with gcc there are 31 of them). Shifted left by 2 it becomes 000...0100, i.e. decimal 4. So a left shift by 1 amounts to multiplying by 2, and a left shift by n to multiplying by 2 to the n-th power (this does not fully apply to signed numbers, because a left shift may change the sign; the reason is explained below).
One issue to note is the sign bit at the left end of an int and the bits that get shifted out. As we know, int is a signed integer type whose leftmost bit is the sign bit (0 positive, 1 negative), so overflow can appear when shifting, for example:
int i = 0x40000000; // hex 40000000, binary 01000000...0000
i = i << 1;

After a left shift by 1, i becomes 0x80000000, i.e. binary 100000...0000: the sign bit has been set to 1 and all other bits are 0. It is now the smallest value an int can represent, which for a 32-bit int is -2147483648 — overflow. What happens if we shift i left by 1 once more? C handles this by discarding the highest bit, and after the 1 is discarded, the value of i becomes 0.
A rather special case of left shifting is when the shift count exceeds the maximum bit width of the type. The compiler then takes the shift count modulo the type's maximum width and shifts by the remainder (strictly speaking, such shifts are undefined behavior in standard C; the modulo behavior is simply what common hardware does), for example:
int i = 1, j = 0x80000000; // assume int is 32 bits
i = i << 33;     // 33 % 32 = 1: left shift by 1 bit, i becomes 2
j = j << 33;     // 33 % 32 = 1: left shift by 1 bit, j becomes 0, the highest bit is discarded

When compiling this with gcc, the compiler emits a warning saying the shift count is >= the width of the type. So i and j are actually shifted by 1 bit, the remainder of 33 % 32. That is the rule under gcc; whether every other compiler behaves the same is not clear to me.
In short, a left shift discards the highest bit and fills the lowest bit with 0.
Now the right shift. Having understood the left shift, the right shift is easier to follow.
The concept of the right shift is the opposite of the left shift: the bits move to the right by some count, and the operator is >>.
The right shift treats the sign bit differently from the left shift. For signed integers, such as the int type, a right shift keeps the sign bit unchanged, for example:
int i = 0x80000000;
i = i >> 1;    // i does not become 0x40000000, but 0xc0000000

That is, after the sign bit moves right, a positive number is padded with 0 and a negative one with 1 — the arithmetic right shift of assembly language. Likewise, when the shift count exceeds the width of the type, the remainder is taken and the value is shifted by that many bits.
       The negative number 10100110 >> 5 (assuming an 8-bit word) gives 11111101.
In short, in C a left shift is a logical/arithmetic shift (the two are identical), and a right shift is an arithmetic shift that keeps the sign bit unchanged. In practice, you can use left/right shifts to do fast multiplication/division where appropriate, which is much more efficient than a loop.
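A small runnable illustration of shifts as fast multiplication and division (the values are arbitrary):

    #include <cstdio>

    int main() {
        unsigned x = 13;
        std::printf("%u\n", x << 3);  // 13 * 8 = 104: left shift by n multiplies by 2^n
        std::printf("%u\n", x >> 2);  // 13 / 4 = 3: right shift by n divides by 2^n (unsigned)
        int s = -13;
        std::printf("%d\n", s >> 2);  // arithmetic shift on typical machines: -4
                                      // (note: rounds toward -infinity, unlike -13/4 == -3)
        return 0;
    }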
 
  In many system programs, operations or processing at the bit level are required. C provides bit-manipulation facilities, which allows C, like assembly language, to be used for writing system programs.
━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Operator    Meaning
────────────────────────────
&           bitwise AND
|           bitwise OR
^           bitwise XOR
~           bitwise NOT
>>          right shift
<<          left shift
━━━━━━━━━━━━━━━━━━━━━━━━━━━━
         Bitwise operations test, set, or shift the actual bits in a byte or a word. They apply only to character and integer variables (and their variants), not to other data types.
         Take care to distinguish bitwise operations from logical operations.


         1. Bitwise AND. The bitwise AND operator "&" is a binary operator. It ANDs together the corresponding binary bits of its two operands: a result bit is 1 only when both corresponding bits are 1, and 0 otherwise. The operands take part in two's-complement form.
For example, 9&5 can be written out as: 00001001 (two's complement of 9) & 00000101 (two's complement of 5) = 00000001 (two's complement of 1), so 9&5=1.
         Bitwise AND is commonly used to clear certain bits to 0 or to preserve certain bits. For example, to clear the high eight bits of a to 0 and keep the low eight bits, compute a&255 (255 in binary is 0000000011111111).
#include <stdio.h>
int main(){
    int a=9,b=5,c;
    c=a&b;                  /* 00001001 & 00000101 = 00000001 */
    printf("a=%d\nb=%d\nc=%d\n",a,b,c);
    return 0;
}

2. Bitwise OR. The bitwise OR operator "|" is a binary operator. It ORs the corresponding binary bits of its two operands: a result bit is 1 as soon as either of the two corresponding bits is 1. Both operands take part in two's-complement form.
For example, 9|5 can be written out as: 00001001 | 00000101
= 00001101 (decimal 13), so 9|5=13.
#include <stdio.h>
int main(){
    int a=9,b=5,c;
    c=a|b;                  /* 00001001 | 00000101 = 00001101 */
    printf("a=%d\nb=%d\nc=%d\n",a,b,c);
    return 0;
}

3. Bitwise XOR. The bitwise XOR operator "^" is a binary operator. It XORs the corresponding binary bits of its two operands: a result bit is 1 when the two corresponding bits differ. The operands again take part in two's-complement form. For example, 9^5 can be written out as: 00001001 ^ 00000101 = 00001100 (decimal 12).
#include <stdio.h>
int main(){
    int a=9;
    a=a^15;                 /* 00001001 ^ 00001111 = 00000110 */
    printf("a=%d\n",a);
    return 0;
}

4. Bitwise NOT. The NOT operator ~ is a unary operator with right associativity. It inverts each binary bit of its operand. For example, ~9 is computed as ~(0000000000001001), with result 1111111111110110.
5. Left shift. The left shift operator "<<" is a binary operator. It shifts all the binary bits of the operand on its left side leftward by the number of positions given by the operand on its right side; the high bits are discarded and the low bits are filled with 0. For example, a<<4 shifts the bits of a left by 4 positions: if a=00000011 (decimal 3), after a left shift of 4 it is 00110000 (decimal 48).
6. Right shift. The right shift operator ">>" is a binary operator. It shifts all the binary bits of the operand on its left side rightward by the number of positions given by the operand on its right side. For example, with a=15, a>>2 shifts 00001111 right to give 00000011 (decimal 3). Note that for signed numbers the sign bit moves along in a right shift: for a positive number the highest bits are filled with 0, while for a negative number (sign bit 1) whether the highest bits are filled with 0 or 1 depends on the compilation system.
#include <stdio.h>
int main(){
    unsigned a,b;
    printf("input a number: ");
    scanf("%u",&a);
    b=a>>5;
    b=b&15;                 /* keep only the low four bits */
    printf("a=%u\tb=%u\n",a,b);
    return 0;
}
Here is one more example:
#include <stdio.h>
int main(){
    char a='a',b='b';
    int p,c,d;
    p=a;
    p=(p<<8)|b;             /* pack a and b into the low 16 bits of p */
    d=p&0xff;               /* low byte:  b */
    c=(p&0xff00)>>8;        /* high byte: a */
    printf("a=%d\nb=%d\nc=%d\nd=%d\n",a,b,c,d);
    return 0;
}

When doing bitwise AND/OR, it is best to use hexadecimal constants, written in a program like 0x01, which stands for 0000 0001.
So, forcing the highest bit of the char a to 1 can be done as: a=a|0x80. The others follow by analogy!
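The common masking idioms collected into one runnable example (the values are illustrative):

    #include <cstdio>

    int main() {
        unsigned char a = 'a';  /* 0x61 */
        a |= 0x80;              /* force the highest bit to 1 -> 0xe1 */
        a &= ~0x01;             /* clear the lowest bit       -> 0xe0 */
        a ^= 0x40;              /* toggle bit 6               -> 0xa0 */
        std::printf("a=0x%02x, bit 5 is %s\n", a, (a & 0x20) ? "set" : "clear");
        return 0;
    }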





Tuesday, March 15, 2011

Double-Word Integers

http://www.delorie.com/gnu/docs/gcc/gcc_39.html

Double-Word Integers

ISO C99 supports data types for integers that are at least 64 bits wide, and as an extension GCC supports them in C89 mode and in C++. Simply write long long int for a signed integer, or unsigned long long int for an unsigned integer. To make an integer constant of type long long int, add the suffix `LL' to the integer. To make an integer constant of type unsigned long long int, add the suffix `ULL' to the integer.
You can use these types in arithmetic like any other integer types. Addition, subtraction, and bitwise boolean operations on these types are open-coded on all types of machines. Multiplication is open-coded if the machine supports a fullword-to-doubleword widening multiply instruction. Division and shifts are open-coded only on machines that provide special support. The operations that are not open-coded use special library routines that come with GCC.
There may be pitfalls when you use long long types for function arguments, unless you declare function prototypes. If a function expects type int for its argument, and you pass a value of type long long int, confusion will result because the caller and the subroutine will disagree about the number of bytes for the argument. Likewise, if the function expects long long int and you pass int. The best way to avoid such problems is to use prototypes.
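A minimal example of the points above (long long has also been standard C++ since C++11; the sketch below is written in C++):

    #include <cstdio>

    long long square(long long x) { return x * x; }  // the prototype prevents
                                                     // caller/callee size mismatches
    int main() {
        long long big = 1234567890123LL;                    // `LL' suffix on the constant
        unsigned long long ubig = 18446744073709551615ULL;  // `ULL' suffix (2^64 - 1)
        std::printf("%lld %llu %lld\n", big, ubig, square(1000000LL));
        return 0;
    }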

long long type

http://en.wikipedia.org/wiki/Long_integer
In the later C99 version of the C programming language, a long long type is supported that doubles the capacity of the standard long to 64 bits. This type is not supported by compilers that require C code to be ISO C++ compliant, because the long long type did not exist in C++ at the time (it is defined in the new, as yet unfinished, C++ standard, C++0x). For an ANSI/ISO compliant compiler, the minimum requirements for the specified ranges must be fulfilled; however, extending the ranges is permitted.[1][2] This can be an issue when exchanging code and data between platforms, or doing direct hardware access. Thus, there are several sets of headers providing platform-independent exact-width types. The C standard library provides stdint.h, introduced in C99.
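A short sketch of the exact-width types (C's stdint.h appears as <cstdint> in C++, and the printf macros live in <cinttypes>):

    #include <cstdint>
    #include <cinttypes>
    #include <cstdio>

    int main() {
        int64_t a = INT64_C(9000000000);   // guaranteed 64 bits on every platform
        uint32_t b = UINT32_MAX;           // exactly 32 bits
        std::printf("%" PRId64 " %" PRIu32 "\n", a, b);
        return 0;
    }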

Hilbert space filling curve

http://people.csail.mit.edu/jaffer/Geometry/HSFC

space-filling curve playground equipment

Hilbert Space-Filling Curves

A space-filling curve is a parameterized, injective function which maps a unit line segment to a continuous curve in the unit square, cube, hypercube, etc., which gets arbitrarily close to any given point in the unit cube as the parameter increases. Space-filling curves serve as a counterexample to less-than-rigorous notions of dimension. In addition to their mathematical importance, space-filling curves have applications to dimension reduction, mathematical programming [Butz 1968], sparse multi-dimensional database indexing [Lawder 2000], and electronics [Zhu 2003].
In 2003 I found three algorithms for the n-dimensional Hilbert Space-Filling Curve (a fourth was found recently).
There is a surfeit of 2-dimensional space-filling curves, but generalizing them to higher rank is not necessarily practical or even possible. I had some issues with these sources:
  • All are cryptic. Butz introduces seven layers of subscripted variables to describe his algorithm in terms of boolean exclusive-or, obscuring the higher level operations (such as gray-code->integer) being done. The C code in the Utah Raster Toolkit which implements Butz's algorithm does its processing through the use of computed tables.
    The algorithm presented here is in terms of higher level constructs: Gray-code, lamination, rotation and reflection.
  • The conversion functions for all take an argument for the width (in bits) of the multidimensional indexes. Although the origins of the multidimensional spaces are 0, the coordinates returned for other scalars depend on the width argument. The algorithmic extension presented here is a bijection between N and Nn, imposing no limitations on integer size. Some datasets will no longer need to be scaled.
  • The scalar must have rank times the precision of each multidimensional coordinate. C's integer size limitations spoil the algorithm for high dimensional spaces. The algorithms presented here do not depend on, and are not bound by integer size limitations.

Self-Similarity

The Z-curve, one of the simplest space-filling functions, forms each vector coordinate from every n-th bit of the scalar. It is self-avoiding, but contains some long edges. The transforms we are examining are self-similar. For the better behaved members of this class the vector displacements from changing a digit in the scalar are less than or equal to the vector displacements from changing a higher order digit.
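A sketch of the Z-curve's construction for two dimensions (the 16-bit coordinate width is an arbitrary illustrative choice):

    // Z-curve (Morton) order: each coordinate supplies every n-th bit of the scalar.
    #include <cstdint>
    #include <cstdio>

    uint32_t morton2(uint16_t x, uint16_t y) {
        uint32_t z = 0;
        for (int b = 0; b < 16; ++b) {
            z |= (uint32_t)((x >> b) & 1) << (2 * b);      // even bits come from x
            z |= (uint32_t)((y >> b) & 1) << (2 * b + 1);  // odd bits come from y
        }
        return z;
    }

    int main() {
        std::printf("%u\n", morton2(3, 5));  // x=011, y=101 interleave to 100111 = 39
        return 0;
    }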

Gray code

The simplest optimal space-filling curve is one which uses two positions on each axis. A Gray code is an ordering of non-negative integers in which exactly one bit differs between each pair of successive elements. There are multiple Gray codings. An n-bit Gray code corresponds to a Hamiltonian cycle on an n-dimensional hypercube.
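For instance, the standard reflected binary Gray code and its inverse need only shifts and exclusive-ors:

    #include <cstdint>
    #include <cstdio>

    uint32_t to_gray(uint32_t n)   { return n ^ (n >> 1); }

    uint32_t from_gray(uint32_t g) {              // invert the xor-prefix
        for (uint32_t s = 1; s < 32; s <<= 1) g ^= g >> s;
        return g;
    }

    int main() {
        for (uint32_t i = 0; i < 8; ++i)          // successive codes differ in one bit
            std::printf("%u -> %u (back: %u)\n", i, to_gray(i), from_gray(to_gray(i)));
        return 0;
    }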

Combining the Gray codes with self-similarity, we want the top-level displacements to be a Gray-coded n-dimensional hypercube. To add a layer of low-order bits, replace each stroke of the lowest-order Gray-code with a Gray-coded hypercube.
That doesn't look right! The net displacement from each hypercube is one edge. We must rotate each small hypercube so that its net displacement matches the edge it is replacing. This series of rotations starts with the most significant hypercube.
Permutations in the n bits encoding one hypercube correspond to its rotations. When n is 2, there are only two possible rotations; but as n increases, the number of distinct rotations grows to n! (n factorial).
When there are choices in which rotation to use, the resulting curves may be different. Alber and Niedermeier [Alber 1998] reveal:
..., whereas there is basically one Hilbert curve in the 2D world, our analysis shows that there are 1536 structurally different 3D Hilbert curves.

It turns out that cyclic rotations (shifts which loop around) of the field of bits encoding a hypercube are sufficient for space-filling curves. We must also potentially reverse the net direction of the small hypercube to match the direction of the edge it is replacing. This is a simple matter of flipping the bit which encodes the net direction.
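A sketch of such a cyclic rotation restricted to an n-bit field (n and the example values are illustrative; this is not the code of the cited implementations):

    #include <cstdint>
    #include <cstdio>

    // Rotate the low n bits of x left by d positions, cyclically.
    uint32_t rot_left(uint32_t x, unsigned d, unsigned n) {
        uint32_t mask = (n < 32) ? ((1u << n) - 1) : ~0u;
        d %= n;
        if (d == 0) return x & mask;
        return ((x << d) | ((x & mask) >> (n - d))) & mask;
    }

    int main() {
        std::printf("%x\n", rot_left(0x3, 1, 4));  // 0011 rotated left by 1 -> 0110
        return 0;
    }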

The Butz Algorithm

So far this is the algorithm disclosed by [Butz 1971] and cited by Thomas and Moore. Teasing an understanding in terms of Gray-codes, lamination, rotations and reflections from the C code and Butz's papers has been quite a challenge.
The Butz algorithm and its derivatives take the number of bits per coordinate as a parameter. Although the origins of the multidimensional spaces are at (0, ..., 0), the coordinates returned for other indexes depend on the width argument.
These algorithms treat their numbers as base-2 fractions; the curves generated have the same alignment independent of the width arguments. The figure to the right superimposes the curves generated for widths 1 through 5 with proportional line thicknesses. The horizontal white line is the region not covered by any of the curves.
integer->hilbert-coordinates generates these curves when called with an optional width argument. Otherwise it treats its scalars and coordinates as integers.

Improvements

The orientation of the most significant hypercube is unconstrained; so I devised an orientation dependent on the size of the hypercube such that the initial sequence of coordinates generated is identical for all hypercube sizes. The bit-width argument is then superfluous; and integer->hilbert-coordinates is thus a bijection from N to Nn, imposing no limitations on integer size. Because the hypercube orientations are important only in relation to each other, my implementation rotates the system as a whole to an orientation such that the coordinates generated for the scalar 1 will differ from the origin in their first coordinate only. This property does not hold when calling the functions with width arguments.

Z

integer->hilbert-coordinates is a good solution when all the expected coordinates are nonnegative. Also useful would be a space-filling curve covering all integer coordinates, Zn. Working in two dimensions, I could not find a way to orient the different sized squares so that they matched at segments around a central origin. But the base-3 Peano space-filling curve can be defined to map Z to Zn with origin at (0, ..., 0).

Flonums

What about floating-point numbers? Each unit segment in n-space borders a unit cube filled by a scalar unit interval. The integer conversions start from the high order bits, so continuing into fractional bits is straightforward. Although one can combine flonum coordinates into a scalar, as the rank grows the scalar needs more and more precision.
Perhaps the most common application of the space-filling transform is sorting of multidimensional coordinates. D. Moore innovates by providing box and comparison functions for rank-n floating-point coordinates instead of floating-point Hilbert conversions [Moore 1999].

Praxis

slib/phil-spc.scm is a Scheme implementation of Hilbert conversions between non-negative scalars and non-negative coordinate vectors of arbitrary size.

Function: integer->hilbert-coordinates scalar rank
Returns a list of rank integer coordinates corresponding to exact non-negative integer scalar. The lists returned by integer->hilbert-coordinates for scalar arguments 0 and 1 will differ in the first element.
Function: integer->hilbert-coordinates scalar rank k
scalar must be a nonnegative integer of no more than rank*k bits. integer->hilbert-coordinates returns a list of rank k-bit nonnegative integer coordinates corresponding to the exact non-negative integer scalar. The curves generated by integer->hilbert-coordinates have the same alignment independent of k.

Function: hilbert-coordinates->integer coords
Function: hilbert-coordinates->integer coords k
Returns an exact non-negative integer corresponding to coords, a list of non-negative integer coordinates.

Further Work

  • Write comparison functions for Hilbert fixed-point coordinates.
  • Write comparison functions for Hilbert floating-point coordinates.
  • Use these functions to extend the WB B-trees to multidimensional indexes a la [Lawder 2000].

Copyright © 2003, 2005 Aubrey Jaffer

I am a guest and not a member of the MIT Computer Science and Artificial Intelligence Laboratory.  My actions and comments do not reflect in any way on MIT.



Monday, March 14, 2011

The Chinese Expeditionary Force

  Having watched all 45 episodes of the TV series "The Chinese Expeditionary Force" in one sitting, shaken by its epic story, I found myself pondering a question unrelated to the show: if the battlefield were a workplace, what should people today take away from the male lead, Han Shaogong?
1. Take demotion as calmly as promotion. As a mid-level officer, Han Shaogong enjoyed the favor of his superior Dai Anlan and the support of many brothers; he was riding high, in his element. Had he made a smooth, safe passage without slip-ups — say, avoided the equivalent of helping an old lady who fainted in the street — promotion would surely have followed. But Han would not be dissuaded: he acted bravely for what was right, cost the company its interests, violated the "unwritten rules", forfeited a splendid future, and was banished to the grassroots. Yet he uttered no complaint and led his brothers to achieve something else entirely. Many words omitted here; for the details, watch "The Chinese Expeditionary Force".
2. Be loyal to your superiors. "Obeying orders is a soldier's bounden duty" is acted out to the full in Han Shaogong: he is absolutely loyal to his superiors, as his risking death in the rearguard, purely to buy time for his superior's retreat, makes plain.
3. Care for your subordinates. A saying popular nowadays goes: what are friends for? For stabbing in the back. What are subordinates for? For use as stepping stones. Han Shaogong does precisely the opposite. When his subordinate Yao Erlin deserted out of fear of death, Han needed only to raise his pistol to the deserter's head and, with one "bang", sacrifice the pawn to save the game; instead, treating his brothers as flesh and blood, he released Yao Erlin and chose to face the court-martial and prison himself... To save his subordinate Yang Wen, he did not even hesitate to fall out with his American boss, Stilwell. Why would Brother Han act this way? Many words omitted here; see the series for details.
4. Don't build your own faction. Dai Anlan gave his life for the country, the company folded, and Han Shaogong faced finding a second job. At this point the "great savior" Xie Xiaozhang appeared, pulling Han along to join the returnee Sun Liren — in reality scheming to use Han's strength to drag Sun Liren down and set up a rival faction. Han flatly refused, because his heart was loyal and grateful: Boss Sun not only feeds you, he pays your wages and your bonuses; now that you have fattened up, you think of outsiders? We will not play the snake in the farmer's bosom!
5. Put brotherhood above rivalry. Han Shaogong and Xie Xiaozhang are brothers, rivals in love, and competitors in the workplace. Yet when Xie was down on his luck, Han did not look on with folded arms; he did everything in his power to pull Xie back up, and in the end made Xie's success possible. Watching that scene, my thoughts wandered to the people of today; many words omitted here...
6. Keep no mistress. Han Shaogong and He Yushu are husband and wife in name only, with no passionate romance between them, which gave Luo Xing, the modern girl from across the ocean, her opening. He Yushu's pragmatism stands in sharp contrast to Luo Xing's romanticism, and Luo Xing's pursuit of Han was fiercer than the artillery on the battlefield, as the love letters she wrote him from under fire attest. How should Han choose? In that era, with his standing and means, taking a concubine or keeping a mistress or two would have been no problem, and few would have blamed him. But we all guessed wrong: at the critical moment there was no scene out of "A Sigh". Han gritted his teeth and held firm, took the erring He Yushu into his arms and said: "If you go to prison, I will stay with you!" Watching this, my thoughts wandered again; many more words omitted...

Hilbert Curve Concepts & Implementation

Hilbert Curve
Concepts & Implementation





Introduction

The web notes by Andrew Cumming of Napier University, Edinburgh, provide a very good introduction to the Hilbert curve. His pseudocode, slightly modified, is shown in listing 1.

Listing 1 (from Andrew Cumming)


procedure hilbert(x, y, xi, xj, yi, yj, n)
/* x and y are the coordinates of the bottom left corner */
/* xi & xj are the i & j components of the unit x vector of the frame */
/* similarly yi and yj */
if (n <= 0) then
   LineTo(x + (xi + yi)/2, y + (xj + yj)/2);
else
   {
   hilbert(x,           y,           yi/2, yj/2,  xi/2,  xj/2, n-1);
   hilbert(x+xi/2,      y+xj/2 ,     xi/2, xj/2,  yi/2,  yj/2, n-1);
   hilbert(x+xi/2+yi/2, y+xj/2+yj/2, xi/2, xj/2,  yi/2,  yj/2, n-1);
   hilbert(x+xi/2+yi,   y+xj/2+yj,  -yi/2,-yj/2, -xi/2, -xj/2, n-1);
   }
end procedure;

Fully understanding how Andrew's recursive function operates is not easy, and in order to do so it is necessary to break down his Hilbert procedure so that the role each "part" plays in the construction of the curve becomes clear. We will begin by looking at the inputs of the Hilbert function.

Inputs

In addition to the counter (n) that tracks the level from which the function begins to recursively call itself, there are six other inputs to Andrew's procedure. The first two define the x and y coordinates of an input point that will be used in the calculation of a point on the Hilbert curve. The next four values define two vectors. Before we see how they are used, we must understand what is meant by the inputs labelled xi, xj, yi and yj. Hopefully, figure 1 will make them clearer.

The large black dot, labelled P, in figure 1 is defined by its x and y coordinates - shown as small dots on the two axes. We can see, from the two arrows, that each dot marks the head of a vector. We might call these the x-vector and the y-vector.


Figure 1

Although these vectors represent the x and y coordinates of P they, like any vector, have their own coordinates. For example, the coordinates of the x-vector might be 1.0, 0.0, while the coordinates of the y-vector might be 0.0, 0.5. To prevent the confusion of labelling the coordinates of these vectors as "x" and "y" we, instead, refer to them by the labels "i" and "j". For example,
    x-vector: xi = 1.0, xj = 0.0
    y-vector: yi = 0.0, yj = 0.5
We can make use of these vectors to move (ie. translate) a copy of point P. For example, in Andrew's procedure we note this statement.
LineTo(x + (xi + yi)/2, y + (xj + yj)/2);
Rewritten for point P (which, in figure 1, sits at (1, 0.5) with x-vector (1, 0) and y-vector (0, 0.5)) we have,
    Px = Px + (xi + yi)/2
       = 1 + (1 + 0)/2
       = 1.5

    Py = Py + (xj + yj)/2
       = 0.5 + (0 + 0.5)/2
       = 0.75

The effect of using the i's and j's on point P can be seen in figure 2.


Figure 2

Point P could also have been translated by directly changing its x and y values. However, using the "i" and "j" coordinates (x-vector and y-vector) moves the point in a structured and proportional fashion. In the context of their use in Andrew's procedure, the "i" and "j" components provide what are, in effect, additional "handles" to control the placement of points along the Hilbert curve.

Modifying the Inputs

Looking at the highlighted code shown in listing 2, we see that the Hilbert procedure calls itself four times. On each call it passes slightly modified versions of its original input values to itself.

Listing 2


procedure hilbert(x, y, xi, xj, yi, yj, n)
if (n <= 0) then
   LineTo(x + (xi + yi)/2, y + (xj + yj)/2);
else
   {
   hilbert(x,           y,           yi/2, yj/2,  xi/2,  xj/2, n-1);
   hilbert(x+xi/2,      y+xj/2 ,     xi/2, xj/2,  yi/2,  yj/2, n-1);
   hilbert(x+xi/2+yi/2, y+xj/2+yj/2, xi/2, xj/2,  yi/2,  yj/2, n-1);
   hilbert(x+xi/2+yi,   y+xj/2+yj,  -yi/2,-yj/2, -xi/2, -xj/2, n-1);

   }
end procedure;

//      x  y  xi  xj  yi  yj  n
    hilbert(0, 0, 1,  0,  0,  1,  1);
Tracking the results of the recursions will show what happens to the original x,y coordinates - shown above in blue. The i and j vectors will be ignored. The four recursive calls will be labelled A, B, C and D ie.
A    hilbert(...);
    B    hilbert(...);
    C    hilbert(...);
    D    hilbert(...);
Performing the arithmetic on the x and y values for the first level of recursion for A, B, C and D generates the following x,y values,
         x    y
    A    hilbert(0.0, 0.0, ...);
    B    hilbert(0.5, 0.0, ...);
    C    hilbert(0.5, 0.5, ...);
    D    hilbert(0.5, 1.0, ...);
The values of x and y can be plotted using LineTo - listing 3.

Listing 3


procedure hilbert(x, y, xi, xj, yi, yj, n)
if (n <= 0) then
   LineTo(x + (xi + yi)/2, y + (xj + yj)/2);
else
   {
   hilbert(x,           y,           yi/2, yj/2,  xi/2,  xj/2, n-1);
   hilbert(x+xi/2,      y+xj/2 ,     xi/2, xj/2,  yi/2,  yj/2, n-1);
   hilbert(x+xi/2+yi/2, y+xj/2+yj/2, xi/2, xj/2,  yi/2,  yj/2, n-1);
   hilbert(x+xi/2+yi,   y+xj/2+yj,  -yi/2,-yj/2, -xi/2, -xj/2, n-1);
   }
end procedure;

Figures 3 to 6 show the development of a Hilbert curve from 1 to 4 iterations. The colored dots correspond to the four recursions ie.
A    hilbert(...); red dot
    B    hilbert(...); green dot
    C    hilbert(...); blue dot
    D    hilbert(...); purple dot
Figure 3 shows that the basic building block of the Hilbert curve is an open square, and that increasingly complex patterns, for example figure 4, are constructed from connected repetitions of this shape. In figure 4 the connections that join the four repetitions (2 iterations) are shown as dotted lines.

Figure 3
1 iteration

Figure 4
2 iterations


Figure 5
3 iterations

Figure 6
4 iterations

Basic Code

Listings 4 and 5 provide code for so-called RenderMan procedural primitives, one written in Python and the other in Tcl. For information about procedural primitives, or helper apps as they are often called, refer to the tutorial "RenderMan Procedural Primitives: Implementations in Python, Tcl and 'C'". A sample rib file to run the implementations is given in listing 6.



Listing 4 (hilbert.py)


import sys, math
  
def hilbert(x0, y0, xi, xj, yi, yj, n):
    if n <= 0:
        X = x0 + (xi + yi)/2
        Y = y0 + (xj + yj)/2
        
        # Output the coordinates of the cv
        print '%s %s 0' % (X, Y)
    else:
        hilbert(x0,               y0,               yi/2, yj/2, xi/2, xj/2, n - 1)
        hilbert(x0 + xi/2,        y0 + xj/2,        xi/2, xj/2, yi/2, yj/2, n - 1)
        hilbert(x0 + xi/2 + yi/2, y0 + xj/2 + yj/2, xi/2, xj/2, yi/2, yj/2, n - 1)
        hilbert(x0 + xi/2 + yi,   y0 + xj/2 + yj,  -yi/2,-yj/2,-xi/2,-xj/2, n - 1)
        
def main():
    args = sys.stdin.readline()
    # Remain in the loop until the renderer releases the helper...
    while args:
        arg = args.split()
        # Get the inputs
        pixels = float(arg[0])
        ctype = arg[1]
        reps = int(arg[2])
        width = float(arg[3])
        
        # Calculate the number of curve cv's
        cvs = int(math.pow(4, reps))
            
        # Begin the RenderMan curve statement
        print 'Basis \"b-spline\" 1 \"b-spline\" 1'
        print 'Curves \"%s\" [%s] \"nonperiodic\" \"P\" [' % (ctype, cvs)
    
        # Create the curve
        hilbert(0.0, 0.0, 1.0, 0.0, 0.0, 1.0, reps)
    
        # End the curve statement
        print '] \"constantwidth\" [%s]' % width
      
        # Tell the renderer we have finished   
        sys.stdout.write('\377')
        sys.stdout.flush()
        
        # read the next set of inputs
        args = sys.stdin.readline()
if __name__ == "__main__":
    main()


Listing 5 (hilbert.tcl)


fconfigure stdout -translation binary
  
# This is where the control vertices of the Hilbert curve are generated    
proc hilbert { x0 y0 xi xj yi yj n } {
    if { $n <= 0 } {
        set X [expr $x0 + ($xi + $yi)/2]
        set Y [expr $y0 + ($xj + $yj)/2]
        
        # Output the coordinates of the cv
        puts "$X $Y 0 "
    } else {
        set XI [expr $xi/2]
        set XJ [expr $xj/2]
        set YI [expr $yi/2]
        set YJ [expr $yj/2]
        
        # Begin recursion
        hilbert $x0                 $y0                $YI  $YJ  $XI  $XJ [expr $n - 1]
        hilbert [expr $x0+$XI]     [expr $y0+$XJ]      $XI  $XJ  $YI  $YJ [expr $n - 1]
        hilbert [expr $x0+$XI+$YI] [expr $y0+$XJ+$YJ]  $XI  $XJ  $YI  $YJ [expr $n - 1]
        hilbert [expr $x0+$XI+$yi] [expr $y0+$XJ+$yj] -$YI -$YJ -$XI -$XJ [expr $n - 1]
        }
    }
  
# Remain in the loop until the renderer releases the helper...
while { [gets stdin args] != -1 } {
    # Get the inputs
    set pixels [lindex $args 0]
    set ctype  [lindex $args 1]
    set reps   [lindex $args 2]
    set width  [lindex $args 3]
  
    # Calculate the number of curve cv's
    set cvs [expr int(pow(4, $reps))]
        
    # Begin the RenderMan curve statement
    puts "Basis \"b-spline\" 1 \"b-spline\" 1"
    puts "Curves \"$ctype\" \[$cvs\] \"nonperiodic\" \"P\" \["
    
    # Create the curve
    hilbert 0.0  0.0  1.0  0  0  1.0 $reps
    
    # End the curve statement
    puts "\] \"constantwidth\" \[ $width \]"
  
    # Tell the renderer we have finished   
    puts "\377" 
    flush stdout
    }

The inputs (arguments) to the helper apps are,
    curve type - either linear or cubic
    number of iterations
    width of the curve


Figure 7
cubic - 4 iterations

Listing 6 (hilbert.rib)


Display "iterations" "framebuffer" "rgba"
Format 300 300 1
Projection "perspective" "fov" 12
ShadingRate 1
  
Translate  -0.5 -0.5 5
Rotate 0 1 0 0
Rotate 0   0 1 0
Scale 1 1 -1
WorldBegin
    TransformBegin
        Surface "constant"
        Color 1 1 1
        # For Linux and MacOS X the paths to tclsh and
        # python must be fully specified ie,
        # /usr/bin/tclsh
        # /usr/bin/python
        # Args: curve type, iterations and curve width
        Procedural "RunProgram" ["tclsh  FULL_PATH/hilbert_helper.tcl" "cubic 4 0.005"] 
                    [-1 1 -1 1 -1 1]
        #Procedural "RunProgram" ["python  FULL_PATH/hilbert_helper.py" "linear 4 0.005"] 
        #           [-1 1 -1 1 -1 1]
    TransformEnd
WorldEnd


Mistakes can be Interesting

When entering the code for the Python and Tcl scripts I made a couple of mistakes that produced some interesting results. The first error occurred in the Python script.
hilbert(x0,               y0,               yi/2, yj/2, xi/2, xj/2, n - 1)
hilbert(x0 + xi/2,        y0 + xj/2,        xi/2, xj/2, yi/2, xj/2, n - 1)
hilbert(x0 + xi/2 + yi/2, y0 + xj/2 + yj/2, xi/2, xj/2, yi/2, xj/2, n - 1)
hilbert(x0 + xi/2 + yi,   y0 + xj/2 + yj,  -yi/2,-yj/2,-xi/2,-xj/2, n - 1)


Figure 8

The second error affected the Tcl script.
hilbert $x0                    $y0                     $YI  $YJ  $XI  $XJ [expr $n - 1]
hilbert [expr $x0 + $XI]       [expr $y0 + $XJ]        $XI  $XJ  $YI  $YJ [expr $n - 1]
hilbert [expr $x0 + $XI + $YI] [expr $y0 + $XJ + $YJ]  $XI  $XJ  $YI  $YJ [expr $n - 1]
hilbert [expr $x0 + $XI + $YI] [expr $y0 + $XJ + $yj] -$YI -$YJ -$XI -$XJ [expr $n - 1]


Figure 9



© 2002- Malcolm Kesson. All rights reserved.