
Phenomenon Engine / News: Recent posts

The beginning of the world...

News, news, news! I added a full vertex transformation pipeline and homogeneous clipping of triangles, based on the Direct3D documentation (orientation of the model and camera in world space, perspective matrix calculation, and so on). I also added backface culling of triangles against the camera in model space, so culled triangles don't go down the pipeline and only the vertices that are really needed (those of visible triangles) get transformed.

I created a small demo where you can move the camera around and see a big cube with a 2048 x 2048 texture.

About the rasterizer: I reimplemented Nick's rasterizer with fixed-point math because of its numerical stability near the edges of the drawing rectangle. Sometimes, after clipping and homogeneous division, the positions of the triangle's points ended up outside the screen, which caused an error in the triangle rasterizer.
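The model-space backface test mentioned above can be sketched roughly as follows. This is not the engine's code: `Vec3`, `isBackFacing` and the counter-clockwise front-face convention are assumptions made for illustration. The point is that the camera position is moved into model space once per object, so culled triangles never need their vertices transformed.

```cpp
// Hypothetical minimal vector type for this sketch.
struct Vec3 { float x, y, z; };

static Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Returns true when the triangle faces away from the camera.
// eyeModel is the camera position already transformed into model space.
// Assumption: counter-clockwise winding means front-facing.
bool isBackFacing(Vec3 v0, Vec3 v1, Vec3 v2, Vec3 eyeModel) {
    Vec3 normal = cross(sub(v1, v0), sub(v2, v0));
    // The triangle is visible only if the eye lies on the normal's side of its plane.
    return dot(normal, sub(eyeModel, v0)) <= 0.0f;
}
```

A triangle for which this returns true can be dropped before any of its vertices are transformed.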
About the demo:
q,e - move in the y direction
a,d - move in the x direction
w,s - move in the z direction
1,2,3 - filtering method
9,0 - vsync on/off...

Posted by Jozef Tulec 2010-09-18

UT99 feeling...

Ok guys. What's new?
The triangle input coordinates are now in NDC (normalized device coordinates), so the x and y positions need to be in the [-1, +1] interval. Why? Because this is what graphics cards use, and it helped me solve the problem of what happens when you change the size of the window. Now the size of the triangles changes too and is proportional to the rendering window.
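A minimal sketch of the NDC-to-window mapping that makes triangles scale with the window. The names `Pixel` and `ndcToScreen` are hypothetical, and the y flip assumes +y points up in NDC while pixel row 0 is at the top of the window.

```cpp
struct Pixel { float x, y; };

// Maps NDC in [-1, +1] to pixel coordinates of a width x height window.
// Assumption: +y is up in NDC, row 0 is the top of the window.
Pixel ndcToScreen(float ndcX, float ndcY, int width, int height) {
    return { (ndcX + 1.0f) * 0.5f * static_cast<float>(width),
             (1.0f - ndcY) * 0.5f * static_cast<float>(height) };
}
```

Because only `width` and `height` appear in the mapping, resizing the window rescales every triangle without touching the input coordinates.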

Aaand I added a third texture filtering method for low-end PCs. It's almost as fast as nearest filtering (because it needs only 1 texture fetch) but looks almost like bilinear. Yes, yes, you saw this method in Unreal. I found a description of the technique in the old flipcode archive on the net (http://www.flipcode.com/archives/Texturing_As_In_Unreal.shtml)...
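The idea behind this filter is to nudge the texture coordinates by a small offset that depends on the screen pixel's position in a 2x2 grid, then do a single nearest fetch. Here is a rough sketch, not the engine's code: the `Texture` type and `sampleDithered` are hypothetical, and the offset values are an assumption reconstructed from memory of the flipcode article, so check them against the article before relying on them.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical texture: square, power-of-two size, repeat addressing.
struct Texture {
    int size;                            // width == height, power of two
    std::vector<std::uint32_t> texels;   // row-major
};

// 2x2 ordered-dither offsets in texel units, indexed by [screenY & 1][screenX & 1].
// Assumption: values as recalled from the flipcode article.
static const float kDitherU[2][2] = { {0.25f, 0.50f}, {0.75f, 0.00f} };
static const float kDitherV[2][2] = { {0.00f, 0.75f}, {0.50f, 0.25f} };

// u and v are given in texel units. One fetch per pixel, like nearest filtering.
std::uint32_t sampleDithered(const Texture& tex, float u, float v,
                             int screenX, int screenY) {
    const float du = kDitherU[screenY & 1][screenX & 1];
    const float dv = kDitherV[screenY & 1][screenX & 1];
    const int mask = tex.size - 1;
    const int tx = static_cast<int>(std::floor(u + du)) & mask;
    const int ty = static_cast<int>(std::floor(v + dv)) & mask;
    return tex.texels[static_cast<std::size_t>(ty) * tex.size + tx];
}
```

Neighbouring screen pixels with the same texture coordinate can land on different texels, and the eye blends the resulting pattern into something close to a bilinear look.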

Posted by Jozef Tulec 2010-08-29

Hierarchical z-buffer re-implemented...

News, news, news, guys. I implemented the hierarchical z-buffer with 3 basic functions: fast tile skip, standard per-pixel z comparison, and fast z writing without comparing against the old values in the z-buffer.

I uploaded 2 demos: one with colored debug info and one without the coloring, to see how it normally works.
*black tiles - skipped tiles of the hidden small quad
*green tiles - tiles drawn with the fast write function (no z comparison) and not compared against the triangle edges
*cyan tiles - tiles drawn with the fast write function (no z comparison) but compared against the triangle edges
*gray tiles - tiles drawn with the function that compares the z-values against the z-buffer, and compared against the triangle edges...
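The choice between the three functions can be sketched as a comparison of depth ranges per tile. This is only an illustration under stated assumptions: `TileDepth`, `TileAction` and `classifyTile` are hypothetical names, each tile is assumed to store the minimum and maximum depth already written to it, and smaller z is assumed to be closer to the camera.

```cpp
enum class TileAction { Skip, FastWrite, PerPixelTest };

// Hypothetical per-tile record: depth range of everything already in the tile.
struct TileDepth { float zMin, zMax; };

// Picks a path for a triangle whose depth range inside the tile is
// [triZMin, triZMax]. Assumption: smaller z is closer to the camera.
TileAction classifyTile(const TileDepth& tile, float triZMin, float triZMax) {
    if (triZMin > tile.zMax) return TileAction::Skip;       // fully behind stored depth
    if (triZMax < tile.zMin) return TileAction::FastWrite;  // fully in front, no compare
    return TileAction::PerPixelTest;                        // ranges overlap
}
```

In terms of the debug colors, `Skip` would correspond to the black tiles, `FastWrite` to the green and cyan tiles, and `PerPixelTest` to the gray tiles.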

Posted by Jozef Tulec 2010-08-13

Per-pixel Texture LOD

New update: I added per-pixel mip-mapping. To show how it works I created 2 demos: one where the mip-map levels are visible, the second with normal drawing. There is a noisy pattern at the mip-map level boundaries. The reason is that I use the "RCPPS" SSE instruction, which is not as precise as "DIVPS". With "DIVPS" I get sharp edges at the mip-map boundaries, but that instruction is slower than "RCPPS". When the mip-map levels are not colored and the texture is bilinear-filtered, the noisy pattern is not visible. See the demo without mip-map coloring. ;-) ...
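For readers who want to see what a per-pixel level choice looks like, here is a common textbook formulation. It is a sketch under assumptions, not necessarily the formula the engine uses: `mipLevel` is a hypothetical name and the inputs are the changes in texel coordinates per one-pixel step on screen.

```cpp
#include <algorithm>
#include <cmath>

// Picks a mip level from screen-space derivatives of the texel coordinates.
// dudx, dvdx: change of (u, v) in texels per pixel step in x; dudy, dvdy: same in y.
int mipLevel(float dudx, float dvdx, float dudy, float dvdy, int maxLevel) {
    const float lenX2 = dudx * dudx + dvdx * dvdx;
    const float lenY2 = dudy * dudy + dvdy * dvdy;
    const float rho2 = std::max(lenX2, lenY2);   // squared texel footprint
    if (rho2 <= 1.0f) return 0;                  // magnification: use the base level
    // log2(sqrt(rho2)) == 0.5 * log2(rho2); truncation picks the finer level.
    const int level = static_cast<int>(0.5f * std::log2(rho2));
    return std::min(level, maxLevel);
}
```

The level flips wherever the logarithm crosses an integer, so a small error in any input near such a crossing is enough to move single pixels to the neighbouring level, which shows up as noise along the boundary.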

Posted by Jozef Tulec 2010-07-22

Speed ups...

OK guys. I removed the SSE4 instructions from the procedure "texturesampler_bilinear" because they caused small problems on AMD processors (I have Intel) and replaced them with a faster lookup table, which calculates the addresses within a tile for bilinear sample fetching. The frame rate jumped from 26 to 32 fps. With point sampling it is about 36-37 fps, so it's a nice speedup.
I compared the "tiled bilinear sampler" against the "linear bilinear sampler" (linear being the standard in-memory image representation on PC) and the speed stayed almost the same. The linear representation was a bit slower because it is not cache-friendly. A tiled texture is good for big textures: at high resolution the speed doesn't drop as fast as with the linear (standard) representation. Of course, calculating the sample address from texture coordinates is much simpler in the linear case, but the cache pollution is much bigger and causes much bigger slowdowns.
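To make the tiled versus linear comparison concrete, here is a small sketch of both address calculations. The 4x4 tile size, the row-major order inside a tile, and the function names are all assumptions chosen for illustration; the engine's real layout and its lookup table are not described in the post.

```cpp
#include <cstddef>

// Assumed layout: 4x4-texel tiles stored row by row, texels row-major inside a tile.
constexpr int kTileShift = 2;                 // log2 of the tile edge
constexpr int kTileSize  = 1 << kTileShift;
constexpr int kTileMask  = kTileSize - 1;

// Offset of texel (x, y) in the tiled layout. Width must be a multiple of the tile size.
std::size_t tiledOffset(int x, int y, int widthInTexels) {
    const int tilesPerRow = widthInTexels >> kTileShift;
    const int tileIndex = (y >> kTileShift) * tilesPerRow + (x >> kTileShift);
    const int inTile = ((y & kTileMask) << kTileShift) | (x & kTileMask);
    return static_cast<std::size_t>(tileIndex) * (kTileSize * kTileSize) + inTile;
}

// Offset of texel (x, y) in the standard linear layout.
std::size_t linearOffset(int x, int y, int widthInTexels) {
    return static_cast<std::size_t>(y) * widthInTexels + x;
}
```

In the linear layout two vertically adjacent texels are a full row apart in memory, which for a 2048-wide texture means thousands of texels. In the tiled layout they are only one tile row apart as long as they share a tile, so the four texels of a bilinear fetch usually sit close together.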
Oh, and I forgot: I added padding for bilinear sampling on the right and bottom edges. If the texture is 128x128, I create a bigger 129x129 texture and copy the samples for the right and bottom edges from a location that depends on whether the texture is repeated or clamped along the "s" or "t" axis.
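The padding step can be sketched like this, assuming a plain row-major source image. `Wrap` and `padForBilinear` are hypothetical names, not taken from the engine.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

enum class Wrap { Repeat, Clamp };

// Returns a (w+1) x (h+1) copy of a w x h row-major texture. The extra right
// column and bottom row hold the texel a bilinear fetch would wrap or clamp to.
std::vector<std::uint32_t> padForBilinear(const std::vector<std::uint32_t>& src,
                                          int w, int h, Wrap wrapS, Wrap wrapT) {
    const int pw = w + 1;
    const int ph = h + 1;
    std::vector<std::uint32_t> dst(static_cast<std::size_t>(pw) * ph);
    for (int y = 0; y < ph; ++y) {
        // Extra row: first row when repeating, last row when clamping.
        const int sy = (y < h) ? y : (wrapT == Wrap::Repeat ? 0 : h - 1);
        for (int x = 0; x < pw; ++x) {
            // Extra column: first column when repeating, last column when clamping.
            const int sx = (x < w) ? x : (wrapS == Wrap::Repeat ? 0 : w - 1);
            dst[static_cast<std::size_t>(y) * pw + x] =
                src[static_cast<std::size_t>(sy) * w + sx];
        }
    }
    return dst;
}
```

With the extra column and row in place, the sampler can always read the texel at x+1 and y+1 without a wrap or clamp branch in the inner loop.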

Posted by Jozef Tulec 2010-07-03

New development path...

Hey girls and boys. I uploaded a new version of the engine. In this version I totally removed the n-level z-buffer, because the data in the z-pyramid is huge, which caused slowdowns (cache pollution). So now I know why the developers of graphics cards don't use the full hierarchical z-buffer pyramid, but just 1 or 2 levels of it. I'm taking the same path. There is no z-buffer yet, but there will be one, with just 1 level, so every 8x8 pixel tile will have one zmin and zmax.

Removing the pyramid meant I needed to change the drawing strategy, the rasterization of triangles. I first used the basic rectangle traversal algorithm of Nicolas Capens, but it was slow, and so was the check whether a tile is inside or outside the triangle. So I coded fast reject and accept functions based on the idea in Intel's document about Larrabee rasterization. Yep, it sped things up. I changed the triangle traversal algorithm to my own, which draws the triangle on a per-block scanline basis, with optimized comparison logic for whether a block is outside, partially inside, or fully inside. Combined with the trivial reject/accept tile code it eats almost no CPU time, and it's not even assembler-optimized. So if a triangle is accepted through the transformation pipeline and all the tiles that cover it are rejected by the hierarchical z-buffer, it will hardly slow down the whole process. A function similar to the occlusion queries in OpenGL or DX will help too.

Oh, and I almost forgot: bilinear filtering and repeat/clamp texture addressing are implemented too... aaaand you need SSE4 to run the demo.
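The trivial reject and accept idea mentioned above can be sketched with integer edge functions. This is an illustration of the general technique, not the engine's code: `Edge`, `Coverage` and `classifyCoverage` are hypothetical names, and the convention that a point is inside when every edge function is non-negative is an assumption.

```cpp
#include <array>
#include <cstdint>

// E(x, y) = a*x + b*y + c. Assumption: a point is inside an edge when E >= 0.
struct Edge { std::int32_t a, b, c; };

enum class Coverage { Outside, Partial, Inside };

static std::int32_t eval(const Edge& e, std::int32_t x, std::int32_t y) {
    return e.a * x + e.b * y + e.c;
}

// Classifies the size x size pixel tile whose top-left pixel is (x0, y0).
Coverage classifyCoverage(const std::array<Edge, 3>& edges, int x0, int y0, int size) {
    const int x1 = x0 + size - 1;   // last pixel column of the tile
    const int y1 = y0 + size - 1;   // last pixel row of the tile
    bool allInside = true;
    for (const Edge& e : edges) {
        // The signs of a and b tell which corner maximizes or minimizes E.
        const int maxX = e.a >= 0 ? x1 : x0;
        const int maxY = e.b >= 0 ? y1 : y0;
        const int minX = e.a >= 0 ? x0 : x1;
        const int minY = e.b >= 0 ? y0 : y1;
        if (eval(e, maxX, maxY) < 0) return Coverage::Outside;  // trivial reject
        if (eval(e, minX, minY) < 0) allInside = false;         // edge crosses the tile
    }
    return allInside ? Coverage::Inside : Coverage::Partial;    // trivial accept if all pass
}
```

Only two corner evaluations per edge are needed, so a tile is sorted into outside, partially inside, or fully inside with at most six multiply-adds and no per-pixel work.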

Posted by Jozef Tulec 2010-06-26

THNX TO ALL

It is motivating to continue when I see how many people are downloading my project. :-)

Posted by Jozef Tulec 2010-05-20