Original site: http://www.bluesnews.com/abrash/contents.shtml

Inside Quake: Visible-Surface Determination

by Michael Abrash

Years ago, I was working at Video Seven, a now-vanished video adapter manufacturer, helping to develop a VGA clone. The fellow who was designing Video Seven’s VGA chip, Tom Wilson, had worked around the clock for months to make his VGA run as fast as possible, and was confident he had pretty much maxed out its performance. As Tom was putting the finishing touches on his chip design, however, news came fourth-hand that a competitor, Paradise, had juiced up the performance of the clone they were developing, by putting in a FIFO.

That was it; there was no information about what sort of FIFO, or how much it helped, or anything else. Nonetheless, Tom, normally an affable, laid-back sort, took on the wide-awake, haunted look of a man with too much caffeine in him and no answers to show for it, as he tried to figure out, from hopelessly thin information, what Paradise had done. Finally, he concluded that Paradise must have put a write FIFO between the system bus and the VGA, so that when the CPU wrote to video memory, the write immediately went into the FIFO, allowing the CPU to keep on processing instead of stalling each time it wrote to display memory.

Tom couldn’t spare the gates or the time to do a full FIFO, but he could implement a one-deep FIFO, allowing the CPU to get one write ahead of the VGA. He wasn’t sure how well it would work, but it was all he could do, so he put it in and taped out the chip.

The one-deep FIFO turned out to work astonishingly well; for a time, Video Seven’s VGAs were the fastest around, a testament to Tom’s ingenuity and creativity under pressure. However, the truly remarkable part of this story is that Paradise’s FIFO design turned out to bear not the slightest resemblance to Tom’s, and didn’t work as well. Paradise had stuck a read FIFO between display memory and the video output stage of the VGA, allowing the video output to read ahead, so that when the CPU wanted to access display memory, pixels could come from the FIFO while the CPU was serviced immediately. That did indeed help performance--but not as much as Tom’s write FIFO.

What we have here is as neat a parable about the nature of creative design as one could hope to find. The scrap of news about Paradise’s chip contained almost no actual information, but it forced Tom to push past the limits he had unconsciously set in coming up with his original design. And, in the end, I think that the single most important element of great design, whether it be hardware or software or any creative endeavor, is precisely what the Paradise news triggered in Tom: The ability to detect the limits you have built into the way you think about your design, and transcend those limits.

The problem, of course, is how to go about transcending limits you don’t even know you’ve imposed. There’s no formula for success, but two principles can stand you in good stead: simplify, and keep on trying new things.

Generally, if you find your code getting more complex, you’re fine-tuning a frozen design, and it’s likely you can get more of a speed-up, with less code, by rethinking the design. A really good design should bring with it a moment of immense satisfaction in which everything falls into place, and you’re amazed at how little code is needed and how all the boundary cases just work properly.

As for how to rethink the design, do it by pursuing whatever ideas occur to you, no matter how off-the-wall they seem. Many of the truly brilliant design ideas I’ve heard over the years sounded like nonsense at first, because they didn’t fit my preconceived view of the world. Often, such ideas are in fact off-the-wall, but just as the news about Paradise’s chip sparked Tom’s imagination, aggressively pursuing seemingly-outlandish ideas can open up new design possibilities for you.

Case in point: The evolution of Quake’s 3-D graphics engine.

The toughest 3-D challenge of all

I’ve spent most of my waking hours for the last seven months working on Quake, id Software’s successor to DOOM, and after spending the next three months in much the same way, I expect Quake will be out as shareware around the time you read this.

In terms of graphics, Quake is to DOOM as DOOM was to its predecessor, Wolfenstein 3D. Quake adds true, arbitrary 3-D (you can look up and down, lean, and even fall on your side), detailed lighting and shadows, and 3-D monsters and players in place of DOOM’s sprites. Sometime soon, I’ll talk about how all that works, but this month I want to talk about what is, in my book, the toughest 3-D problem of all, visible surface determination (drawing the proper surface at each pixel), and its close relative, culling (discarding non-visible polygons as quickly as possible, a way of accelerating visible surface determination). In the interests of brevity, I’ll use the abbreviation VSD to mean both visible surface determination and culling from now on.

Why do I think VSD is the toughest 3-D challenge? Although rasterization issues such as texture mapping are fascinating and important, they are tasks of relatively finite scope, and are being moved into hardware as 3-D accelerators appear; also, they only scale with increases in screen resolution, which are relatively modest.

In contrast, VSD is an open-ended problem, and there are dozens of approaches currently in use. Even more significantly, the performance of VSD, done in an unsophisticated fashion, scales directly with scene complexity, which tends to increase as a square or cube function, so this very rapidly becomes the limiting factor in doing realistic worlds. I expect VSD increasingly to be the dominant issue in realtime PC 3-D over the next few years, as 3-D worlds become increasingly detailed. Already, a good-sized Quake level contains on the order of 10,000 polygons, about three times as many polygons as a comparable DOOM level.

The structure of Quake levels

Before diving into VSD, let me note that each Quake level is stored as a single huge 3-D BSP tree. This BSP tree, like any BSP, subdivides space, in this case along the planes of the polygons. However, unlike the BSP tree I presented last time, Quake’s BSP tree does not store polygons in the tree nodes, as part of the splitting planes, but rather in the empty (non-solid) leaves, as shown in overhead view in Figure 1.

Figure 1: In Quake, polygons are stored in the empty leaves. Shaded areas are solid leaves (solid volumes, such as the insides of walls).

Correct drawing order can be obtained by drawing the leaves in front-to-back or back-to-front BSP order, again as discussed last time. Also, because BSP leaves are always convex and the polygons are on the boundaries of the BSP leaves, facing inward, the polygons in a given leaf can never obscure one another and can be drawn in any order. (This is a general property of convex polyhedra.)
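
To make that ordering concrete, here is a minimal Python sketch of a front-to-back leaf walk. The `Leaf` and `Node` classes, their field names, and the 2-D overhead-view planes are assumptions made for illustration; they are not Quake's actual data structures. Swapping the two recursive calls gives back-to-front order instead.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union


@dataclass
class Leaf:
    name: str
    solid: bool = False
    polygons: List[str] = field(default_factory=list)  # only empty leaves hold polygons


@dataclass
class Node:
    normal: Tuple[float, float]  # splitting-plane normal (2-D overhead view)
    dist: float                  # points p on the plane satisfy dot(normal, p) == dist
    front: Union["Node", Leaf]
    back: Union["Node", Leaf]


def walk_front_to_back(tree, viewpoint, out):
    """Append the names of empty (non-solid) leaves to `out`, nearest first."""
    if isinstance(tree, Leaf):
        if not tree.solid:
            out.append(tree.name)
        return
    side = tree.normal[0] * viewpoint[0] + tree.normal[1] * viewpoint[1] - tree.dist
    # The child on the viewer's side of the plane is nearer, so visit it first.
    near, far = (tree.front, tree.back) if side >= 0 else (tree.back, tree.front)
    walk_front_to_back(near, viewpoint, out)
    walk_front_to_back(far, viewpoint, out)


# Hypothetical three-leaf world: split along x = 0, then the front half along y = 0.
tree = Node((1.0, 0.0), 0.0,
            front=Node((0.0, 1.0), 0.0, front=Leaf("A"), back=Leaf("B")),
            back=Leaf("C"))
order = []
walk_front_to_back(tree, (2.0, -1.0), order)
```

With the viewpoint at (2, -1), which lies in leaf B, the walk visits B, then A, then C.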

Culling and visible surface determination

The process of VSD would ideally work as follows. First, you would cull all polygons that are completely outside the view frustum (view pyramid), and would clip away the irrelevant portions of any polygons that are partially outside. Then you would draw only those pixels of each polygon that are actually visible from the current viewpoint, as shown in overhead view in Figure 2, wasting no time overdrawing pixels multiple times; note how little of the polygon set in Figure 2 actually needs to be drawn. Finally, in a perfect world, the tests to figure out what parts of which polygons are visible would be free, and the processing time would be the same for all possible viewpoints, giving the game a smooth visual flow.

Figure 2: An ideal VSD architecture would draw only visible parts of visible polygons.


As it happens, it is easy to determine which polygons are outside the frustum or partially clipped, and it’s quite possible to figure out precisely which pixels need to be drawn. Alas, the world is far from perfect, and those tests are far from free, so the real trick is how to accelerate or skip various tests and still produce the desired result.

As I discussed at length last time, given a BSP, it’s easy and inexpensive to walk the world in front-to-back or back-to-front order. The simplest VSD solution, which I in fact demonstrated last time, is to simply walk the tree back-to-front, clip each polygon to the frustum, and draw it if it’s facing forward and not entirely clipped (the painter’s algorithm). Is that an adequate solution?

For relatively simple worlds, it is perfectly acceptable. It doesn’t scale very well, though. One problem is that as you add more polygons in the world, more transformations and tests have to be performed to cull polygons that aren’t visible; at some point, that will bog performance down considerably.

Happily, there’s a good workaround for this particular problem. As discussed earlier, each leaf of a BSP tree represents a convex subspace, with the nodes that bound the leaf delimiting the space. Perhaps less obvious is that each node in a BSP tree also describes a subspace--the subspace composed of all the node’s children, as shown in Figure 3. Another way of thinking of this is that each node splits into two pieces the subspace created by the nodes above it in the tree, and the node’s children then further carve that subspace into all the leaves that descend from the node.

Figure 3: Node E describes the shaded subspace, which contains leaves 5, 6, and 7, and node F.


Since a node’s subspace is bounded and convex, it is possible to test whether it is entirely outside the frustum. If it is, all of the node’s children are certain to be fully clipped, and can be rejected without any additional processing. Since most of the world is typically outside the frustum, many of the polygons in the world can be culled almost for free, in huge, node-subspace chunks. It’s relatively expensive to perform a perfect test for subspace clipping, so instead bounding spheres or boxes are often maintained for each node, specifically for culling tests.
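
The bulk rejection described above might look roughly like this in Python. It is a sketch under stated assumptions: the `Plane` and `BspNode` classes, the inward-pointing unit normals, and the three-plane stand-in for a frustum are all hypothetical, chosen only to show how one sphere test can discard a whole subtree.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


@dataclass
class Plane:
    normal: Vec3  # unit length, pointing into the frustum
    dist: float   # points p inside satisfy dot(normal, p) - dist >= 0


@dataclass
class BspNode:
    center: Vec3              # bounding sphere of everything below this node
    radius: float
    children: List["BspNode"]
    polygons: List[str]       # filled in only for leaves


def sphere_outside(center: Vec3, radius: float, planes: List[Plane]) -> bool:
    """True if the sphere lies entirely behind at least one frustum plane."""
    return any(dot(p.normal, center) - p.dist < -radius for p in planes)


def collect_candidates(node: BspNode, planes: List[Plane], out: List[str]) -> None:
    if sphere_outside(node.center, node.radius, planes):
        return  # the whole subtree is rejected with this one test
    out.extend(node.polygons)
    for child in node.children:
        collect_candidates(child, planes, out)


s = math.sqrt(0.5)
frustum = [Plane((0.0, 0.0, 1.0), 1.0),   # near plane at z = 1
           Plane((s, 0.0, s), 0.0),        # left side plane
           Plane((-s, 0.0, s), 0.0)]       # right side plane

ahead = BspNode((0.0, 0.0, 10.0), 1.0, [], ["wall"])
off_to_the_left = BspNode((-20.0, 0.0, 5.0), 3.0, [
    BspNode((-21.0, 0.0, 5.0), 1.0, [], ["crate"]),
    BspNode((-19.0, 0.0, 5.0), 1.0, [], ["door"]),
], [])
world = BspNode((0.0, 0.0, 0.0), 100.0, [ahead, off_to_the_left], [])

candidates: List[str] = []
collect_candidates(world, frustum, candidates)
```

Here the two leaves under `off_to_the_left` are never tested individually; their parent's sphere fails the left-plane test, so only "wall" survives. Because the sphere test is conservative, a surviving polygon may still be clipped away later.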

So culling to the frustum isn’t a problem, and the BSP can be used to draw back to front. What’s the problem?


The problem John Carmack, the driving technical force behind DOOM and Quake, faced when he designed Quake was that in a complex world, many scenes have an awful lot of polygons in the frustum. Most of those polygons are partially or entirely obscured by other polygons, but the painter’s algorithm described above requires that every pixel of every polygon in the frustum be drawn, often only to be overdrawn. In a 10,000-polygon Quake level, it would be easy to get a worst-case overdraw level of 10 times or more; that is, in some frames each pixel could be drawn 10 times or more, on average. No rasterizer is fast enough to compensate for an order of magnitude more work than is actually necessary to show a scene; worse still, the painter’s algorithm will cause a vast difference between best-case and worst-case performance, so the frame rate can vary wildly as the viewer moves around.

So the problem John faced was how to keep overdraw down to a manageable level, preferably drawing each pixel exactly once, but certainly no more than two or three times in the worst case. As with frustum culling, it would be ideal if he could eliminate all invisible polygons in the frustum with virtually no work. It would also be a plus if he could manage to draw only the visible parts of partially-visible polygons, but that was a balancing act, in that it had to be a lower-cost operation than the overdraw that would otherwise result.

When I arrived at id at the beginning of March, John already had an engine prototyped and a plan in mind, and I assumed that our work was a simple matter of finishing and optimizing that engine. If I had been aware of id’s history, however, I would have known better. John had done not only DOOM, but also the engines for Wolf 3D and several earlier games, and had actually done several different versions of each engine in the course of development (once doing four engines in four weeks), for a total of perhaps 20 distinct engines over a four-year period. John’s tireless pursuit of new and better designs for Quake’s engine, from every angle he could think of, would end only when we shipped.

By three months after I arrived, only one element of the original VSD design was anywhere in sight, and John had taken “try new things” farther than I’d ever seen it taken.

The beam tree

John’s original Quake design was to draw front to back, using a second BSP tree to keep track of what parts of the screen were already drawn and which were still empty and therefore drawable by the remaining polygons. Logically, you can think of this BSP tree as being a 2-D region describing solid and empty areas of the screen, as shown in Figure 4, but in fact it is a 3-D tree, of the sort known as a beam tree. A beam tree is a collection of 3-D wedges (beams), bounded by planes, projecting out from some center point, in this case the viewpoint, as shown in Figure 5.

Figure 4: Quake's beam tree effectively partitioned the screen into 2-D regions.

Figure 5: Quake's beam tree was composed of 3-D wedges, or beams, projecting out from the viewpoint to polygon edges.


In John’s design, the beam tree started out consisting of a single beam describing the frustum; everything outside that beam was marked solid (so nothing would draw there), and the inside of the beam was marked empty. As each new polygon was reached while walking the world BSP tree front to back, that polygon was converted to a beam by running planes from its edges through the viewpoint, and any part of the beam that intersected empty beams in the beam tree was considered drawable and added to the beam tree as a solid beam. This continued until either there were no more polygons or the beam tree became entirely solid. Once the beam tree was completed, the visible portions of the polygons that had contributed to the beam tree were drawn.

The advantage to working with a 3-D beam tree, rather than a 2-D region, is that determining which side of a beam plane a polygon vertex is on involves only checking the sign of the dot product of the ray to the vertex and the plane normal, because all beam planes run through the origin (the viewpoint). Also, because a beam plane is completely described by a single normal, generating a beam from a polygon edge requires only a cross-product of the edge and a ray from the edge to the viewpoint. Finally, bounding spheres of BSP nodes can be used to do the aforementioned bulk culling to the frustum.
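
A small Python sketch of those two operations follows. The function names, the example polygon, and the sign convention (a positive dot product means "inside" for this particular vertex winding) are assumptions for illustration; with the opposite winding the signs flip.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def beam_planes(polygon, viewpoint):
    """One plane normal per polygon edge; every plane passes through the viewpoint."""
    normals = []
    for i, v0 in enumerate(polygon):
        v1 = polygon[(i + 1) % len(polygon)]
        edge = sub(v1, v0)
        to_view = sub(viewpoint, v0)  # ray from the edge to the viewpoint
        normals.append(cross(edge, to_view))
    return normals


def inside_beam(point, viewpoint, normals):
    """Side test: only the sign of dot(ray to the point, plane normal) matters."""
    ray = sub(point, viewpoint)
    return all(dot(n, ray) >= 0 for n in normals)


eye = (10, 0, 0)
square = [(9, -1, 5), (11, -1, 5), (11, 1, 5), (9, 1, 5)]  # hypothetical polygon
normals = beam_planes(square, eye)
```

No plane distance is stored or computed, because each beam plane contains the viewpoint. A point straight behind the square, such as (10, 0, 50), passes all four sign tests, while (13, 0, 5) fails one and lies outside the beam.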

The early-out feature of the beam tree--stopping when the beam tree becomes solid--seems appealing, because it appears to cap worst-case performance. Unfortunately, there are still scenes where it’s possible to see all the way to the sky or the back wall of the world, so in the worst case, all polygons in the frustum will still have to be tested against the beam tree. Similar problems can arise from tiny cracks due to numeric precision limitations. Beam tree clipping is fairly time-consuming, and in scenes with long view distances, such as views across the top of a level, the total cost of beam processing slowed Quake’s frame rate to a crawl. So, in the end, the beam-tree approach proved to suffer from much the same malady as the painter’s algorithm: The worst case was much worse than the average case, and it didn’t scale well with increasing level complexity.

3-D engine du jour

Once the beam tree was working, John relentlessly worked at speeding up the 3-D engine, always trying to improve the design, rather than tweaking the implementation. At least once a week, and often every day, he would walk into my office and say “Last night I couldn’t get to sleep, so I was thinking...” and I’d know that I was about to get my mind stretched yet again. John tried many ways to improve the beam tree, with some success, but more interesting was the profusion of wildly different approaches that he generated, some of which were merely discussed, others of which were implemented in overnight or weekend-long bursts of coding, in both cases ultimately discarded or further evolved when they turned out not to meet the design criteria well enough. Here are some of those approaches, presented in minimal detail in the hopes that, like Tom Wilson with the Paradise FIFO, your imagination will be sparked.

Subdividing raycast: Rays are cast in an 8x8 screen-pixel grid; this is a highly efficient operation because the first intersection with a surface can be found by simply clipping the ray into the BSP tree, starting at the viewpoint, until a solid leaf is reached. If adjacent rays don’t hit the same surface, then a ray is cast halfway between, and so on until all adjacent rays either hit the same surface or are on adjacent pixels; then the block around each ray is drawn from the polygon that was hit. This scales very well, being limited by the number of pixels, with no overdraw. The problem is dropouts; it’s quite possible for small polygons to fall between rays and vanish.
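
The subdivision rule, and the dropout problem, can be seen in a one-dimensional Python sketch. Everything here is an assumption for illustration: a single scanline stands in for the 8x8 grid, and the hypothetical `hit(x)` function stands in for clipping a ray into the BSP tree.

```python
def subdivide(hit, x0, x1, samples):
    """Cast extra rays between x0 and x1 until neighbors agree or are adjacent."""
    if samples[x0] == samples[x1] or x1 - x0 <= 1:
        return
    mid = (x0 + x1) // 2
    samples[mid] = hit(mid)
    subdivide(hit, x0, mid, samples)
    subdivide(hit, mid, x1, samples)


def raycast_scanline(hit, width, step=8):
    """Return {x: surface} for every ray that was actually cast."""
    xs = list(range(0, width, step))
    if xs[-1] != width - 1:
        xs.append(width - 1)
    samples = {x: hit(x) for x in xs}
    for a, b in zip(xs, xs[1:]):
        subdivide(hit, a, b, samples)
    return samples


def hit(x):
    # Hypothetical scene: a wall, then floor, with a thin post two pixels wide.
    if x in (18, 19):
        return "post"
    return "wall" if x < 13 else "floor"


samples = raycast_scanline(hit, 32)
```

Only 8 rays are cast for 32 pixels, and the wall/floor boundary is located exactly between pixels 12 and 13. The post at pixels 18 and 19, however, sits between two rays that both hit the floor, so it is never sampled and vanishes.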

Vertex-free surfaces: The world is represented by a set of surface planes. The polygons are implicit in the plane intersections, and are extracted from the planes as a final step before drawing. This makes for fast clipping and a very small data set (planes are far more compact than polygons), but it’s time-consuming to extract polygons from planes.

Draw-buffer: Like a z-buffer, but with 1 bit per pixel, indicating whether the pixel has been drawn yet. This eliminates overdraw, but at the cost of an inner-loop buffer test, extra writes and cache misses, and, worst of all, considerable complexity. Variations are testing the draw-buffer a byte at a time and completely skipping fully-occluded bytes, or branching off each draw-buffer byte to one of 256 unrolled inner loops for drawing 0-8 pixels, in the process possibly taking advantage of the ability of the x86 to do the perspective floating-point divide in parallel while 8 pixels are processed.
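
As a rough Python sketch of the draw-buffer test, including the byte-at-a-time skip: the function name and the one-scanline framebuffer are assumptions for illustration, and a real inner loop would be written very differently for speed.

```python
def draw_span(draw_bits, framebuffer, x_start, x_end, color):
    """Draw pixels [x_start, x_end) in front-to-back order.

    draw_bits holds 1 bit per pixel; a set bit means "already drawn".
    Returns the number of pixels actually written.
    """
    written = 0
    x = x_start
    while x < x_end:
        byte_index, bit = divmod(x, 8)
        # Fast path: a byte-aligned run of 8 pixels that are all occluded.
        if bit == 0 and x + 8 <= x_end and draw_bits[byte_index] == 0xFF:
            x += 8
            continue
        mask = 1 << bit
        if not draw_bits[byte_index] & mask:
            draw_bits[byte_index] |= mask
            framebuffer[x] = color
            written += 1
        x += 1
    return written


draw_bits = bytearray(2)      # one 16-pixel scanline
framebuffer = [0] * 16
nearest = draw_span(draw_bits, framebuffer, 0, 8, 1)
middle = draw_span(draw_bits, framebuffer, 4, 12, 2)
farthest = draw_span(draw_bits, framebuffer, 0, 16, 3)
```

The nearest span writes 8 pixels, the middle one only the 4 it does not share with the first, and the farthest only the last 4, skipping the first byte in a single test. No pixel is written twice, but every pixel still pays for the buffer test.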

Span-based drawing: Polygons are rasterized into spans, which are added to a global span list and clipped against that list so that only the nearest span at each pixel remains. Little sorting is needed with front-to-back walking, because if there’s any overlap, the span already in the list is nearer. This eliminates overdraw, but at the cost of a lot of span arithmetic; also, every polygon still has to be turned into spans.
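
Here is a minimal Python sketch of clipping a new span against the spans already in the list for one scanline. The tuple layout and the `add_span` name are assumptions; the sketch relies on front-to-back submission, so whatever is already in the list always wins.

```python
def add_span(spans, start, end, surface):
    """Clip [start, end) against `spans` and insert whatever remains visible.

    `spans` is a sorted list of non-overlapping (start, end, surface) tuples
    for one scanline. Returns the visible pieces of the new span.
    """
    pieces = []
    cursor = start
    for s, e, _ in spans:
        if e <= cursor:
            continue          # existing span lies entirely to the left
        if s >= end:
            break             # existing span lies entirely to the right
        if s > cursor:
            pieces.append((cursor, s, surface))  # gap before the existing span
        cursor = max(cursor, e)
        if cursor >= end:
            break
    if cursor < end:
        pieces.append((cursor, end, surface))
    spans.extend(pieces)
    spans.sort()
    return pieces


scanline = []
first = add_span(scanline, 10, 20, "near")
second = add_span(scanline, 5, 25, "mid")
third = add_span(scanline, 12, 18, "hidden")
```

The "mid" span is split into the two pieces that stick out on either side of "near", and the "hidden" span contributes nothing. Each polygon still had to be rasterized into spans before this clipping could happen, which is the cost noted above.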

Portals: The holes where polygons are missing on surfaces are tracked, because it’s only through such portals that line-of-sight can extend. Drawing goes front-to-back, and when a portal is encountered, polygons and portals behind it are clipped to its limits, until no polygons or portals remain visible. Applied recursively, this allows drawing only the visible portions of visible polygons, but at the cost of a considerable amount of portal clipping.


In the end, John decided that the beam tree was a sort of second-order structure, reflecting information already implicitly contained in the world BSP tree, so he tackled the problem of extracting visibility information directly from the world BSP tree. He spent a week on this, as a byproduct devising a perfect DOOM (2-D) visibility architecture, whereby a single, linear walk of a DOOM BSP tree produces zero-overdraw 2-D visibility. Doing the same in 3-D turned out to be a much more complex problem, though, and by the end of the week John was frustrated by the increasing complexity and persistent glitches in the visibility code. Although the direct-BSP approach was getting closer to working, it was taking more and more tweaking, and a simple, clean design didn’t seem to be falling out. When I left work one Friday, John was preparing to try to get the direct-BSP approach working properly over the weekend.

When I came in on Monday, John had the look of a man who had broken through to the other side--and also the look of a man who hadn’t had much sleep. He had worked all weekend on the direct-BSP approach, and had gotten it working reasonably well, with insights into how to finish it off. At 3:30 AM Monday morning, as he lay in bed, thinking about portals, he thought of precalculating and storing in each leaf a list of all leaves visible from that leaf, and then at runtime just drawing the visible leaves back-to-front for whatever leaf the viewpoint happens to be in, ignoring all other leaves entirely.
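
At runtime, that idea reduces to a table lookup plus a filtered walk. The Python sketch below assumes a hypothetical five-leaf level, a `pvs` dictionary built offline, and a back-to-front leaf order supplied by the BSP walk; none of these names come from Quake itself.

```python
def leaves_to_draw(pvs, view_leaf, back_to_front_order):
    """Keep only the leaves precalculated as visible from the viewer's leaf."""
    visible = pvs[view_leaf]
    return [leaf for leaf in back_to_front_order if leaf in visible]


# Hypothetical precalculated table: leaf -> set of leaves visible from it.
pvs = {
    1: {1, 2},
    2: {1, 2, 3},
    3: {2, 3, 4},
    4: {3, 4, 5},
    5: {4, 5},
}

back_to_front = [5, 4, 3, 2, 1]   # as produced by a BSP walk for this viewpoint
draw_list = leaves_to_draw(pvs, 2, back_to_front)
```

With the viewpoint in leaf 2, leaves 5 and 4 are skipped without any per-frame visibility test, and leaves 3, 2, and 1 are drawn in that order.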

Size was a concern; initially, a raw, uncompressed potentially visible set (PVS) was several megabytes in size. However, the PVS could be stored as a bit vector, with 1 bit per leaf, a structure that shrank a great deal with simple zero-byte compression. Those steps, along with changing the BSP heuristic to generate fewer leaves (contrary to what I said a few months back, choosing as the next splitter the polygon that splits the fewest other polygons is clearly the best heuristic, based on the latest data) and sealing the outside of the levels so the BSPer can remove the outside surfaces, which can never be seen, eventually brought the PVS down to about 20 Kb for a good-size level.
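
To see why a bit vector compresses so well, consider the following Python sketch. The encoding shown (a zero byte followed by a run count) is one plausible form of zero-byte compression and is an assumption here, as are the function names and the 800-leaf example.

```python
def leaves_to_bits(visible, num_leaves):
    """Pack a set of visible leaf numbers into a bit vector, 1 bit per leaf."""
    bits = bytearray((num_leaves + 7) // 8)
    for leaf in visible:
        bits[leaf // 8] |= 1 << (leaf % 8)
    return bytes(bits)


def compress(bits):
    """Copy nonzero bytes as-is; replace each run of zero bytes with (0, count)."""
    out = bytearray()
    i = 0
    while i < len(bits):
        if bits[i] != 0:
            out.append(bits[i])
            i += 1
            continue
        run = 1
        while i + run < len(bits) and bits[i + run] == 0 and run < 255:
            run += 1
        out += bytes((0, run))
        i += run
    return bytes(out)


def decompress(data):
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] != 0:
            out.append(data[i])
            i += 1
        else:
            out.extend(b"\x00" * data[i + 1])  # expand the run of zero bytes
            i += 2
    return bytes(out)


raw = leaves_to_bits({3, 4, 790}, 800)   # 100 bytes, almost all zero
packed = compress(raw)
```

From any one leaf, most of the level is invisible, so most bytes are zero. In this example the 100-byte vector packs into 6 bytes and expands back to the original.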

In exchange for that 20 Kb, culling leaves outside the frustum is speeded up (because only leaves in the PVS are considered), and culling inside the frustum costs nothing more than a little overdraw (the PVS for a leaf includes all leaves visible from anywhere in the leaf, so some overdraw, typically on the order of 50% but ranging up to 150%, generally occurs). Better yet, precalculating the PVS results in a leveling of performance; worst case is no longer much worse than best case, because there’s no longer extra VSD processing--just more polygons and perhaps some extra overdraw--associated with complex scenes. The first time John showed me his working prototype, I went to the most complex scene I knew of, a place where the frame rate used to grind down into the single digits, and spun around smoothly, with no perceptible slowdown.

John says precalculating the PVS was a logical evolution of the approaches he had been considering, that there was no moment when he said “Eureka!”. Nonetheless, it was clearly a breakthrough to a brand-new, superior design, a design that, together with a still-in-development sorted-edge rasterizer that completely eliminates overdraw, comes remarkably close to meeting the “perfect-world” specifications we laid out at the start.

Simplify, and keep on trying new things

What does it all mean? Exactly what I said up front: Simplify, and keep trying new things. The precalculated PVS is simpler than any of the other schemes that had been considered (although precalculating the PVS is an interesting task that I’ll discuss another time). In fact, at runtime the precalculated PVS is just a constrained version of the painter’s algorithm. Does that mean it’s not particularly profound?

Not at all. All really great designs seem simple and even obvious--once they’ve been designed. But the process of getting there requires incredible persistence, and a willingness to try lots of different ideas until the right one falls into place, as happened here.

My friend Chris Hecker has a theory that all approaches work out to the same thing in the end, since they all reflect the same underlying state and functionality. In terms of underlying theory, I’ve found that to be true; whether you do perspective texture mapping with a divide or with incremental hyperbolic calculations, the numbers do exactly the same thing. When it comes to implementation, however, my experience is that simply time-shifting an approach, or matching hardware capabilities better, or caching can make an astonishing difference. My friend Terje Mathisen likes to say that “almost all programming can be viewed as an exercise in caching,” and that’s exactly what John did. No matter how fast he made his VSD calculations, they could never be as fast as precalculating and looking up the visibility, and his most inspired move was to yank himself out of the “faster code” mindset and realize that it was in fact possible to precalculate (in effect, cache) and look up the PVS.

The hardest thing in the world is to step outside a familiar, pretty good solution to a difficult problem and look for a different, better solution. The best ways I know to do that are to keep trying new, wacky things, and always, always, always try to simplify. One of John’s goals is to have fewer lines of code in each 3-D game than in the previous game, on the assumption that as he learns more, he should be able to do things better with less code.

So far, it seems to have worked out pretty well for him.

Learn now, pay forward

There’s one other thing I’d like to mention before I close up shop for this month. As far back as I can remember, DDJ has epitomized the attitude that sharing programming information is A Good Thing. I know a lot of programmers who were able to leap ahead in their development because of Hendrix’s Tiny C, or Stevens’ D-Flat, or simply by browsing through DDJ’s annual collections. (Me, for one.) Most companies understandably view sharing information in a very different way, as potential profit lost--but that’s what makes DDJ so valuable to the programming community.

It is in that spirit that id Software is allowing me to describe in these pages how Quake works, even before Quake has shipped. That’s also why id has placed the full source code for Wolfenstein 3D on ftp.idsoftware.com/idstuff/source; you can’t just recompile the code and sell it, but you can learn how a full-blown, successful game works; check wolfsrc.txt in the above-mentioned directory for details on how the code may be used.

So remember, when it’s legally possible, sharing information benefits us all in the long run. You can pay forward the debt for the information you gain here and elsewhere by sharing what you know whenever you can, by writing an article or book or posting on the Net. None of us learns in a vacuum; we all stand on the shoulders of giants such as Wirth and Knuth and thousands of others. Lend your shoulders to building the future!


Foley, James D., et al., Computer Graphics: Principles and Practice, Addison Wesley, 1990, ISBN 0-201-12110-7 (beams, BSP trees, VSD).

Teller, Seth, Visibility Computations in Densely Occluded Polyhedral Environments (dissertation), available online, along with several other papers relevant to visibility determination.

Teller, Seth, Visibility Preprocessing for Interactive Walkthroughs, SIGGRAPH 91 proceedings, pp. 61-69.

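** 3-D Engines **
copyrightⓒ 김성완(찐빵귀신) [December 21, 1999]

1. Commercial 3-D engines

In Korea, most developers rely on Direct3D to build 3-D games, and few studios have a 3-D engine in the full sense of the term.

Whether you buy a commercial 3-D engine or use a freely available one, you will have a hard time putting it to good use unless you have enough knowledge and skill to develop a basic 3-D engine yourself.

In other words, unless you are capable of developing a 3-D engine yourself and are buying a commercial one only to save time, you will flounder without knowing how to use what you bought, and end up spending about as much time as it would have taken to develop an engine. Even with a grounding in basic 3-D programming, it takes a good six months for the whole development team to become familiar enough with a commercial 3-D engine to use it to its full potential.
A commercial 3-D engine does not build a 3-D game for you automatically, after all...

There is the sad story of the Rainbow Six team, who licensed a 3-D engine, developed with it, and in the end wound up writing their own.

Unless you are given an ample development period of two to three years, trying to make do with someone else's engine means there is hardly enough time to analyze it properly, and every feature has to be applied and tested one by one with far too little time. Programs inevitably have bugs, and it is hard enough to track down bugs in code you wrote yourself, let alone in code someone else wrote...

And when you license a commercial 3-D engine, you must license the source along with it.
Licensing the source, though, usually sends the license fee soaring.
For reference, Monolith's engine runs about $250,000...
The Quake and Unreal engines are rumored to cost about $1 million,
but my guess is that the figure is closer to $500,000.
At the very low end there is 'Power Render', which costs $10,000 with a source license.
But you get what you pay for.

Even if you use someone else's engine for now, taking the long view you will need to invest at least a year or so in developing your own 3-D engine.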

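** 3-D Engines **
copyrightⓒ 김성완(찐빵귀신) [December 28, 1998]

2. D3D and OpenGL are not 3-D engines.

The two main APIs that provide an interface to 3-D hardware acceleration are D3D and OpenGL.

Beginners, however, tend to assume that using these APIs means doing everything through them.
From a practical standpoint, though, it is enough to use only the polygon rasterization part.

Do not forget that Direct3D and OpenGL are not 3-D engines in the general sense, but merely APIs that expose hardware acceleration through a uniform interface.
Running them without a 3-D accelerator is strictly for learning purposes; doing so in a real application is a clueless thing to do.

So if you need a software 3-D engine, you have to license the Quake or Unreal engine or develop one yourself.
Direct3D and OpenGL are standard APIs for 3-D acceleration, not substitutes for a general software 3-D engine.
And for a game to be a proper, complete product, it first needs a software engine.

A software engine does not merely rasterize polygons in software; it also handles the more important job of HSR (Hidden Surface Removal).
Needless to say, hardware accelerators offer no HSR support at all apart from Z-buffering.
In other words, BSP trees, portals, and the like are ultimately the software's job.

The recently released Tomb Raider 3 even does texture filtering, a quintessential hardware task, in software (using MMX).
As a result, Tomb Raider 3 running on the software engine looks better than Tomb Raider 2 running on a Voodoo.
Ideally, the software engine should deliver the best possible performance on its own, without relying on a hardware accelerator.

In the end, if you cannot afford to license the Quake or Unreal engine, you have no choice but to develop your own.
Only when the software engine is solidly built will you get outstanding performance on hardware accelerators as well.

In conclusion, Direct3D and OpenGL are interfaces for using hardware acceleration in a standard way, and are by no means general-purpose 3-D engines.
Of course, they could serve as a software engine if you decided to support only a low-resolution 320*200 mode..
But in this day and age, users are hardly going to like such a low resolution.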

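** 3-D Engines **
copyrightⓒ 김성완(찐빵귀신) [January 3, 1998]

3. Does no one in Korea build 3D engines?

Commercial 3D engines generally have to be general-purpose, so they are slower than in-house engines optimized for a specific game...
Speed and generality are two rabbits you cannot catch at the same time..

3D engines built for commercial sale to outside parties have no choice but to aim for generality, and speed is sacrificed as a result.
But when the games are of nearly the same type, as with id's Quake engine, an engine can be provided to other developers without sacrificing speed.

In Korea, too, it would seem better for developers to provide technology to one another through cross-licensing than to rely on general-purpose commercial engines.
Rather than each company duplicating the investment by developing its own engine, why not license and share the engine of whichever company has developed the best one?

At a time when every dollar counts, wouldn't Korean developers helping one another be the patriotic thing to do...
If I owned a development studio and had developed a good 3D engine, that is what I would do..
It would not be a giveaway: charge a reasonable license fee, and coordinate release dates or genres so as to avoid direct competition...

Actually, even without coordinating release dates or genres, it all comes down to how the contract is written.
If a game from a developer that licensed the 3D engine sells well, the licensor can collect a set royalty beyond a certain number of units (say, 10,000)..
That way it is not only the licensor that benefits; the licensee is spared the struggle of developing its own 3D engine, and spends less than it would developing one in-house.

From my long experience in the Korean game industry, I know full well how unlikely this is to happen here, but one can at least dream.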
