【GDC2013】IVB Atmospheric Light Scattering (Part 2)

5. The algorithm

Our algorithm combines epipolar sampling by Engelhardt and Dachsbacher [ED10] with the 1D min/max mipmap by Chen et al. [CBDJ11]. It goes through the following steps:

  1. Render the scene from the camera and from the light source (the scene is rendered twice: a normal pass and a shadow map pass)

  2. Reconstruct linear camera space Z from the depth buffer

  3. Build a 1D min/max mipmap (one per epipolar slice, built from the shadow map)

  4. Render the coordinate texture (sample placement)

    • Samples are placed along the lines connecting the light source's projected position with a user-defined number of points placed equidistantly along the border of the screen
    • If the light source is outside the screen, the line is intersected with the screen border and samples are placed between the intersection points
    • Samples that fall outside the screen are excluded from further processing

  5. Detect depth discontinuities and refine the initially placed sample locations

    • Initial samples are placed at a fixed step to capture light variation in the absence of occluders
    • Additional samples are placed at locations where the camera space z difference between neighboring pixels exceeds a threshold (i.e. at depth edges)

  6. Perform ray marching for the selected samples (each sample corresponds to a screen pixel; ray marching integrates in-scattering along the ray from the camera through that pixel's world-space position)

  7. Interpolate inscattering radiance in epipolar coordinates for the rest of the samples from the ray marching samples

  8. Transform inscattering from epipolar coordinates to a downscaled rectangular buffer

    • Determine the two closest epipolar lines
    • Project the sample onto the lines and perform bilateral bilinear filtering, taking the z difference into account
    • Mark pixels for which there is no appropriate source sample

  9. Correct inscattering for those samples which could not be correctly interpolated from epipolar coordinates, by performing ray marching

  10. Upscale the inscattering image to the original resolution and combine it with the attenuated background

    • Perform bilateral filtering
    • Mark pixels for which there is no appropriate source sample
    • Correct inscattering

While the most important concepts of the original epipolar sampling algorithm [ED10] are preserved, there are a number of improvements:

* We use an additional down-sampling step to provide extra control over the speed/quality tradeoff. Usually, downscaling by a factor of 2x does not significantly degrade visual quality, while making the rays look smoother and improving performance.
* Additional inscattering correction steps are introduced:
* In the original approach, the authors used bilateral filtering along and between epipolar lines when transforming from epipolar geometry back to the original rectangular coordinates. Up to five samples along each epipolar line were used, and if there was no appropriate sample to filter from, the authors proposed going to the next outer epipolar line. We found this approach still produces rendering artifacts and cannot be efficiently implemented on the GPU because it requires branching.
* We implemented another method: we mark samples that cannot be correctly interpolated in the stencil and perform an additional ray marching pass for these pixels using fewer steps. Note that this correction is done both when epipolar inscattering is transformed into rectangular geometry and when upscaling is performed.

6. Implementation details

The algorithm implementation follows the steps described in section 5. A number of textures are used to store intermediate data required during processing; they are summarized in table 1.

(The table did not survive copying, so it is omitted here; interested readers should refer to the original article.)

Table 1: Textures used by the algorithm implementation

The algorithm workflow is summarized in fig. 8 and 9, while the rest of this section provides various details.

Fig. 8 covers the preliminary steps of the algorithm:

1. Reconstruct linear camera space Z from the scene depth buffer

2. Transform the scene into epipolar space. Three textures are used, storing each sample's epipolar coordinates, its camera space Z (used for continuity checks), and the epipolar depth-stencil (N_samples x N_slices; N_samples is the number of samples along a ray cast from the sun's projected position. Note that for some rays not all samples lie on the screen, which is why off-screen samples were culled from further computation earlier)

3. Using the coordinate texture and the epipolar camera space Z texture, select the samples that will participate in ray marching, producing the initial inscattering results used as interpolation sources

4. From the coordinate texture, compute each slice's UV direction in the shadow map, stored in a 1D texture

5. Sample the shadow map along each slice's UV direction to build the 1D min/max shadow map mipmap texture (the mipmap is what gives us a binary tree over each slice)

Fig.8: Preliminary steps of the algorithm

Fig. 9 covers the inscattering computation:

6. Using the interpolation source texture (which stores, for each sample, the indices of its two interpolation source samples), mark the source samples (those whose inscattering will be computed by ray marching) in the epipolar depth-stencil texture

7. For the samples marked for ray marching, compute inscattering from the required inputs

8. Interpolate inscattering for the remaining samples from the ray-marched results

9. Transform the results from epipolar space to rectangular (Cartesian) space

10. Fix the artifacts in the downscaled result

11. Upscale the fixed texture and blend it with the color buffer

12. Apply a final fix pass to the upscaled result

Fig.9: Final steps of the algorithm

Note that the inscattering correction at steps 10 and 12 also uses the camera space z, shadow map and slice UV direction textures, which are not shown in fig. 9 for clarity.

The remaining part of this section details all the stages of the algorithm.

6.1. Rendering coordinates texture

Coordinate texture generation is done by rendering a screen-size quad with the texture set up as a render target. The pixel shader GenerateCoordinateTexturePS() performs all the required processing. The depth stencil state is configured to increment the stencil value, so all valid samples are marked by 1 in the stencil while all invalid ones keep the initial value of 0. Samples that fall outside the screen are thus skipped in all further processing.

We assume that the epipolar slice coordinate ranges from 0 to 1. Screen borders are traversed in counter-clockwise order starting from the left top corner (fig. 10): values from 0 to 0.25 define locations on the left border, and values in the ranges from 0.25 to 0.5, from 0.5 to 0.75 and from 0.75 to 1.0 define locations on the bottom, right and top borders correspondingly. Values 0, 0.25, 0.5, 0.75 and 1 define locations exactly at the screen corners.

Fig.10: Screen borders traversal order

The stage consists of the following steps:

  • Computing the epipolar line exit point

  • Computing the epipolar line entry point given its exit point

  • Rescaling the epipolar line length to provide an even texel to screen pixel correspondence

  • Computing camera space z for the location

Without rescaling, epipolar lines whose exit points are close to the light's projected position would receive far more samples than necessary, which is wasteful. After rescaling by the on-screen length, sample spacing stays even, and the sample distribution pattern changes from rectangular to circular, which is better suited to computing light shafts.

Computing the epipolar line exit point is relatively simple as it lies on one of the four screen boundaries. The following code effectively computes this location using arithmetic instructions only:

uint uiBoundary = clamp(floor( fEpipolarSlice * 4 ), 0, 3); // select the boundary index
float fPosOnBoundary = frac( fEpipolarSlice * 4 );          // fractional position on that boundary
float fBoundaryCoord = -1 + 2*fPosOnBoundary;
//                                Left         Bottom         Right        Top
float4 f4BoundaryXCoord = float4(  -1,    fBoundaryCoord,       1,    -fBoundaryCoord);
float4 f4BoundaryYCoord = float4(-fBoundaryCoord,  -1,    fBoundaryCoord,      1     );
bool4 b4BoundaryFlags = bool4(uiBoundary.xxxx == uint4(0,1,2,3));
// Select the right coordinates for the boundary
float2 f2ExitPoint = float2(dot(f4BoundaryXCoord, b4BoundaryFlags), dot(f4BoundaryYCoord, b4BoundaryFlags));

For example, fEpipolarSlice = 0.125 selects the left boundary (uiBoundary = 0) with fPosOnBoundary = 0.5, which yields the exit point (-1, 0) in the middle of the left edge, consistent with fig. 10.

The next step is a bit more complex: we need to compute the epipolar line entry point given its exit point and the position of the light source on the screen. This is accomplished by the GetEpipolarLineEntryPoint() function. There are two possible cases: the light is located either inside or outside the screen (fig. 11). The first case is simple: the entry point is simply the position of the light on the screen. In the second case we have to find the intersection of the epipolar line with the appropriate boundary.

Fig.11: Entry and exit points of an epipolar line when light source is on the screen and outside it

Our task is to find the first intersection of the ray connecting the light's projected position and the exit point, before the exit point (fig. 12).

Fig.12: Computing entry and exit points of an epipolar line when light source is outside the screen

For this, we compute signed distances to the left, bottom, right and top boundaries along the ray and find the maximum distance which is less than the distance to the exit point (examine fig. 12). We also take care of near-horizontal and near-vertical ray orientations, where distances to the left and right or top and bottom boundaries cannot be properly computed; a special flag vector is used to skip computations with such incorrect boundaries. The following code snippet accomplishes this task using mathematical instructions only and avoiding branches:

// Compute direction from the light source to the ray exit point:
float2 f2RayDir = f2ExitPoint.xy - g_LightAttribs.f4LightScreenPos.xy;
float fDistToExitBoundary = length(f2RayDir);
f2RayDir /= fDistToExitBoundary;

// Compute signed distances along the ray from the light position to all four boundaries
bool4 b4IsCorrectIntersectionFlag = abs(f2RayDir.xyxy) > 1e-5;
float4 f4DistToBoundaries = (float4(-1,-1,1,1) - g_LightAttribs.f4LightScreenPos.xyxy) /
                            (f2RayDir.xyxy + !b4IsCorrectIntersectionFlag);
// The addition of !b4IsCorrectIntersectionFlag is required to prevent division by zero;
// such incorrect lanes will be masked out below

// We now need to find the first intersection BEFORE the intersection with the exit boundary,
// i.e. the maximum intersection distance which is less than fDistToExitBoundary.
// All boundaries whose distance is greater than the distance to the exit boundary must be
// skipped; using -FLT_MAX as the distance to these boundaries results in skipping them:
b4IsCorrectIntersectionFlag = b4IsCorrectIntersectionFlag &&
                              ( f4DistToBoundaries < (fDistToExitBoundary - 1e-4) );
f4DistToBoundaries = b4IsCorrectIntersectionFlag * f4DistToBoundaries +
                     !b4IsCorrectIntersectionFlag * float4(-FLT_MAX, -FLT_MAX, -FLT_MAX, -FLT_MAX);

float fFirstIntersecDist = 0;
fFirstIntersecDist = max(fFirstIntersecDist, f4DistToBoundaries.x);
fFirstIntersecDist = max(fFirstIntersecDist, f4DistToBoundaries.y);
fFirstIntersecDist = max(fFirstIntersecDist, f4DistToBoundaries.z);
fFirstIntersecDist = max(fFirstIntersecDist, f4DistToBoundaries.w);

// Now we can compute the entry point:
f2EntryPoint = g_LightAttribs.f4LightScreenPos.xy + f2RayDir * fFirstIntersecDist;

Note that if the light source is located outside the screen, there are cases when the whole slice is invisible (fig. 13). In such cases the coordinates of the entry point will also be outside the screen.

Fig.13: Entry point of a completely invisible epipolar line

Such pixels are easily detected and discarded so that they are skipped in all further processing:

if( any(abs(f2EntryPoint) > 1+1e-4) ) // entry point is outside the [-1,1] screen bounds
    discard;

If the light source is located close to the screen boundary, the screen-space length of the epipolar lines can vary significantly. This results in overly dense sampling of short lines, redundant calculations, and possible aliasing artifacts. To solve this issue, we rescale the epipolar lines by advancing the exit point (fig. 14), striving for a 1:1 correspondence between samples on the epipolar line and screen pixels. The following code updates the epipolar line exit point:

float fEpipolarSliceScreenLen = length( (f2ExitPoint - f2EntryPoint) * g_PPAttribs.m_f2ScreenResolution.xy / 2 ); 

f2ExitPoint = f2EntryPoint + (f2ExitPoint - f2EntryPoint) * max(g_PPAttribs.m_f2CoordinateTexDim.x / fEpipolarSliceScreenLen, 1);

Fig.14: Advancing epipolar line exit point to provide even sample to screen pixel correspondence

This step not only reduces the amount of computation, but also yields a more natural circular distribution of samples, as opposed to the rectangular distribution of the original algorithm. (For example, with a 1024-sample coordinate texture, a line whose on-screen length is 512 pixels has its exit point advanced by a factor of 2.)

Finally, we compute the interpolated location of the sample between the entry and exit points and discard the samples that fall outside the screen:

f2XY = lerp(f2EntryPoint, f2ExitPoint, fSamplePosOnEpipolarLine);
if( any(abs(f2XY) > 1+1e-4) )
    discard;

The shader also outputs the camera space Z coordinate for the epipolar sample, which is used by subsequent steps of the algorithm.

Note that the coordinate texture as well as the camera space z texture are initially cleared with invalid coordinates outside the allowable ranges. This is important since these values help skip such pixels in further processing.

The rendered coordinate texture is shown above: each row corresponds to one epipolar line, and each texel in a row stores the screen-space coordinates of one sample on that line.

6.2. Refining sample locations

The next step of the algorithm is refining the initially placed ray marching sample locations by finding depth discontinuities. This stage generates the interpolation source texture, which for each sample contains the indices of two samples from the same slice from which the current sample will be interpolated. The algorithm searching for the interpolation samples (indexed by left and right) in a depth array depth1d of size N, for the current texel at location x, as presented in [ED10], is the following:

left = right = x;
while( left > 0 )
{
    if( abs( depth1d[left-1] - depth1d[left] ) > threshold )
        break;
    left--;
}
while( right < N-1 )
{
    if( abs( depth1d[right] - depth1d[right+1] ) > threshold )
        break;
    right++;
}

Algorithm 3: Searching for depth discontinuities.

If there are no depth discontinuities on the ray section, the interpolation sources are the end points of this section. If a depth discontinuity is detected, the sample is interpolated from the sample placed directly before the break.

We tried several strategies when implementing the search for depth discontinuities. Our first approach was a straightforward implementation of Algorithm 3 in a pixel shader as suggested by [ED10]. We found this implementation too slow, especially when a large initial sampling step (32 or greater) is used.

We then implemented an optimized search algorithm using a compute shader. Its performance is up to 6x higher than that of the original pixel-shader based approach and is almost independent of the initial sampling step.

The compute shader organizes its work into groups of threads. The number of threads in each group must be at least 32 and not less than the initial sample step, for reasons which will become clear shortly. Each group of threads processes one or several ray sections in one epipolar slice: if the initial sample step Si is less than the total number of threads Nt in a group, the group processes Nt/Si ray sections (for example, 32 threads with a step of 16 process two sections); otherwise (Nt = Si) each group processes one section, so each thread always handles exactly one sample. The location of one sample group within the interpolation source texture is illustrated in fig. 15.

Fig.15: Location of the thread group within coordinate texture

The location of an individual sample processed by one thread in the group is shown in fig. 16.

Fig.16: Location of individual sample processed by one thread in the group
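To make the indexing concrete, here is a minimal sketch of how group and thread IDs could map to slice samples; the entry point name and the exact mapping are assumptions based on the variables used in the snippets below (uiSliceInd, uiSampleInd, uiGlobalSampleInd), not the sample's literal code:

#define THREAD_GROUP_SIZE 128

[numthreads(THREAD_GROUP_SIZE, 1, 1)]
void RefineSampleLocationsCS( uint3 Gid  : SV_GroupID,
                              uint3 GTid : SV_GroupThreadID )
{
    // One row of groups per epipolar slice; groups along x cover the samples of the slice
    uint uiSliceInd        = Gid.y;
    uint uiSampleInd       = GTid.x;                                  // index within the group
    uint uiGlobalSampleInd = Gid.x * THREAD_GROUP_SIZE + uiSampleInd; // index within the slice
    // With initial sample step Si <= THREAD_GROUP_SIZE, this group covers
    // THREAD_GROUP_SIZE / Si consecutive ray sections of its slice
}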

The number of threads in the group must be large enough to fully utilize the SIMD units, so we set it to at least 128. Our implementation exploits the group shared memory exposed by the compute shader to efficiently implement the discontinuity search. The idea is to first check whether there is a depth break next to each sample in the group and to store this information in a compact shared array. Since 1 bit suffices to indicate a depth break, we can pack 32 flags into one UINT variable. We use the following shared array:

static const uint g_uiNumPackedFlags = THREAD_GROUP_SIZE/32;
groupshared uint g_uiPackedCamSpaceDiffFlags[ g_uiNumPackedFlags ];

It is now clear why the group size must be at least 32. The array is then used to efficiently search for depth discontinuities.

At the very beginning the shader checks if the sample it processes is valid. Recall that the coordinate texture is initially cleared with invalid coordinates, which indicate samples outside the screen:

bool bIsValidThread = all( abs(f2SampleLocationPS) < 1+1e-4 );

The shader then loads camera space z for the current sample location and for its neighboring sample from the camera z texture generated at the previous stage, computes the difference, compares it with the threshold and sets the appropriate flag. Notice the use of the InterlockedOr() function:

// Load camera space Z for this sample and for its right neighbor (remember to use the global sample index)
bool bFlag;
float fCamSpaceZ = g_tex2DEpipolarCamSpaceZ.Load( uint3(uiGlobalSampleInd, uiSliceInd, 0) );
float fRightNeighbCamSpaceZ = g_tex2DEpipolarCamSpaceZ.Load( uint3(uiGlobalSampleInd+1, uiSliceInd, 0) );

// Compare the difference with the threshold
bFlag = abs(fCamSpaceZ - fRightNeighbCamSpaceZ) < g_PPAttribs.m_fRefinementThreshold;

// Set the appropriate flag using InterlockedOr():
InterlockedOr( g_uiPackedCamSpaceDiffFlags[uiSampleInd/32], bFlag << (uiSampleInd%32) );

An important aspect here is that the camera space z texture contains large negative values for all samples that fall outside the screen. As a result, the difference between the last on-screen sample and the first off-screen sample always exceeds the threshold, so a depth discontinuity is detected there.

After all the flags are set, the threads must be synchronized, which is done by a call to the GroupMemoryBarrierWithGroupSync() function.

In the next step, the shader computes the initial indices uiInitialSample0Ind and uiInitialSample1Ind of the source ray marching samples at the ends of the current ray section (fig. 17). The initial sample step is adjusted here to provide higher sampling density near the epipole, to account for the high variation of scattering intensity there.

Fig.17: Initially placed ray marching samples at the ends of the ray section

If the sample index equals one of the initial sample indices, the sample is a ray marching sample and is interpolated from itself. Otherwise it is necessary to search for the two interpolation sources. But before performing the search itself, it is easy to check whether there is at least one depth break on the current ray section:

bool bNoDepthBreaks = true;
#if INITIAL_SAMPLE_STEP < 32
{
    // Check if all uiInitialSampleStep flags starting from
    // position uiInitialSample0Ind are set:
    int iFlagPackOrder = uiInitialSample0Ind / 32;
    int iFlagOrderInPack = uiInitialSample0Ind % 32;
    uint uiFlagPack = g_uiPackedCamSpaceDiffFlags[iFlagPackOrder];
    uint uiAllFlagsMask = ((1 << uiInitialSampleStep) - 1);
    if( ((uiFlagPack >> iFlagOrderInPack) & uiAllFlagsMask) != uiAllFlagsMask )
        bNoDepthBreaks = false;
}
#else
{
    for(uint i=0; i < g_uiNumPackedFlags; ++i)
        if( g_uiPackedCamSpaceDiffFlags[i] != 0xFFFFFFFFU )
            // If at least one flag is not set, there is a depth break on this section
            bNoDepthBreaks = false;
}
#endif

If there are no depth breaks, no further processing is required. If there is at least one depth break on the current ray section, we search for the two interpolation sources. Thanks to the compact packing scheme, the search can be implemented efficiently using bit manipulation and two intrinsic functions, firstbitlow() and firstbithigh(), which return the position of the first non-zero bit starting from the lowest-order and the highest-order bit respectively. If the initial sample step is not greater than 32, this implementation avoids loops entirely; if it is greater than 32, the loop executes at most Si/32 iterations.
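As a minimal sketch of that search, assume the initial sample step is at most 32 and the section lies within a single flag pack; bit i of the pack is 1 when samples i and i+1 are continuous (see the InterlockedOr() snippet above). All names except g_uiPackedCamSpaceDiffFlags, uiSampleInd, uiInitialSample0Ind and uiInitialSample1Ind are hypothetical:

uint uiFlagPack = g_uiPackedCamSpaceDiffFlags[uiSampleInd / 32];
uint uiBreaks   = ~uiFlagPack; // 1 where a depth break separates samples i and i+1

// Closest break to the left: the highest set bit among positions [0 .. uiSampleInd-1].
// A break between j and j+1 makes j+1 the left interpolation source:
uint uiLeftBreaks = uiBreaks & ((1u << (uiSampleInd % 32)) - 1u);
uint uiLeftSrc = uiLeftBreaks != 0 ? firstbithigh(uiLeftBreaks) + 1 : uiInitialSample0Ind;
uiLeftSrc = max(uiLeftSrc, uiInitialSample0Ind);   // never step out of the current section

// Closest break to the right: the lowest set bit among positions [uiSampleInd .. 31].
// The sample directly before that break is the right interpolation source:
uint uiRightBreaks = uiBreaks & (0xFFFFFFFFu << (uiSampleInd % 32));
uint uiRightSrc = uiRightBreaks != 0 ? firstbitlow(uiRightBreaks) : uiInitialSample1Ind;
uiRightSrc = min(uiRightSrc, uiInitialSample1Ind); // never step out of the current section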

6.3. Building min/max mipmap

At this stage, a min/max mipmap is constructed. For this, a 1D auxiliary texture is first rendered which contains the UV direction of each slice in the shadow map. The shader loads the coordinates of the first sample in the slice (note that the sample with index 0 is the light source position, so the first sample is the one at index 1), reprojects this location from camera projection space to light space, and computes the direction from the camera's position in shadow map UV space to the resulting location. The direction is normalized in such a way that one step along it always covers one shadow map texel. Note also that complementary depth buffering is used. The 1D min/max shadow mipmap is then built along these directions.

float2 RenderSliceUVDirInShadowMapTexturePS(SScreenSizeQuadVSOutput In) : SV_Target
{
    // Load the location of the first sample in the slice (after the light source, which is the 0-th sample)
    uint uiSliceInd = In.m_f4Pos.x;
    float2 f2FirstSampleInSliceLocationPS = g_tex2DCoordinates.Load( uint3(1, uiSliceInd, 0) );
    if( any( abs(f2FirstSampleInSliceLocationPS) > 1 + 1e-4 ) )
        return float2(-10000, -10000);

    // Reproject the sample location from camera projection space to light space
    float4 f4FirstSampleInSlicePosInLightProjSpace =
        mul( float4(f2FirstSampleInSliceLocationPS, 0, 1), g_LightAttribs.mCameraProjToLightProjSpace );
    f4FirstSampleInSlicePosInLightProjSpace /= f4FirstSampleInSlicePosInLightProjSpace.w;
    float2 f2FirstSampleInSliceUVInShadowMap = ProjToUV( f4FirstSampleInSlicePosInLightProjSpace.xy );

    // Compute direction from the camera position in light space to the sample position
    float2 f2SliceDir = f2FirstSampleInSliceUVInShadowMap - g_LightAttribs.f4CameraUVAndDepthInShadowMap.xy;
    f2SliceDir /= max(abs(f2SliceDir.x), abs(f2SliceDir.y));

    return f2SliceDir;
}

The 1D min/max mipmap is S x N_slices in size, where S is the resolution of the original shadow map. Each row of the min/max mipmap contains a 1D min/max binary tree for the corresponding epipolar slice. The layout of the texture is shown in the figure below:

Fig.18: Layout of min/max mipmap texture

Note that the 1D min/max mipmap does not contain the original shadow map, so its first level is two times downsampled and has S/2 samples (the full chain thus occupies S/2 + S/4 + ... + 1 = S - 1 texels, which is why a width of S suffices). Note also that f2SliceDir is computed in such a way that at most S steps are required to cross the shadow map texture.

Creating the min/max shadow map is performed using the well-known flip/flop approach, in which two textures are alternately used as source and destination. The original shadow map is used to initialize the first level of the min/max binary trees. 1D min/max tree construction starts from the projected light source position. Note that the Gather() instruction is used at this stage to load the four source values needed for bilinear interpolation. In practice we do not construct the full binary tree, because it is very unlikely that the coarsest levels will ever be reached; besides, rendering to very low-resolution textures is inefficient on modern GPUs.

float2 InitializeMinMaxShadowMapPS(SScreenSizeQuadVSOutput In) : SV_Target
{
    uint uiSliceInd = In.m_f4Pos.y;

    // Load slice direction in the shadow map
    float2 f2SliceDir = g_tex2DSliceUVDirInShadowMap.Load( uint3(uiSliceInd, 0, 0) );

    // Calculate the current sample position on the ray
    float2 f2CurrUV = g_LightAttribs.f4CameraUVAndDepthInShadowMap.xy +
                      f2SliceDir * floor(In.m_f4Pos.x) * 2.f * g_PPAttribs.m_f2ShadowMapTexelSize;

    // Gather 8 depths which will be used for PCF filtering for this sample
    // and its immediate neighbor along the epipolar slice
    float4 f4Depths = g_tex2DLightSpaceDepthMap.Gather(samLinearBorder0, f2CurrUV);

    // Shift UV to the next sample along the epipolar slice:
    f2CurrUV += f2SliceDir * g_PPAttribs.m_f2ShadowMapTexelSize;
    float4 f4NeighbDepths = g_tex2DLightSpaceDepthMap.Gather(samLinearBorder0, f2CurrUV);

    float4 f4MinDepth = min(f4Depths, f4NeighbDepths);
    f4MinDepth.xy = min(f4MinDepth.xy, f4MinDepth.zw);
    f4MinDepth.x = min(f4MinDepth.x, f4MinDepth.y);

    float4 f4MaxDepth = max(f4Depths, f4NeighbDepths);
    f4MaxDepth.xy = max(f4MaxDepth.xy, f4MaxDepth.zw);
    f4MaxDepth.x = max(f4MaxDepth.x, f4MaxDepth.y);

    return float2(f4MinDepth.x, f4MaxDepth.x);
}

After that, the next levels of the binary trees are constructed by loading two min/max values from the next finer level and computing their min/max, as sketched below.
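A minimal sketch of one such flip/flop pass follows; the offset variables that locate the source and destination levels within the min/max texture, as well as the function name, are assumptions for illustration:

// Builds one texel of the coarser level from two texels of the finer level.
// g_uiSrcMinMaxLevelXOffset / g_uiDstMinMaxLevelXOffset (assumed) are the x offsets
// of the finer and coarser levels inside the min/max texture layout of fig. 18;
// the source texture is whichever of the two flip/flop textures was written last.
float2 ComputeMinMaxShadowMapLevelPS(SScreenSizeQuadVSOutput In) : SV_Target
{
    uint2 ui2DstSampleInd = uint2(In.m_f4Pos.xy);
    uint uiSrcSampleInd = (ui2DstSampleInd.x - g_uiDstMinMaxLevelXOffset) * 2 + g_uiSrcMinMaxLevelXOffset;
    float2 f2MinMax0 = g_tex2DMinMaxLightSpaceDepth.Load( uint3(uiSrcSampleInd,     ui2DstSampleInd.y, 0) );
    float2 f2MinMax1 = g_tex2DMinMaxLightSpaceDepth.Load( uint3(uiSrcSampleInd + 1, ui2DstSampleInd.y, 0) );
    return float2( min(f2MinMax0.x, f2MinMax1.x), max(f2MinMax0.y, f2MinMax1.y) );
}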

6.4. Ray marching

After the sample refinement stage, all ray marching samples are marked as being interpolated from themselves. Before performing ray marching, a screen-size quad is rendered with a simple pixel shader that discards all pixels which are not interpolated from themselves:

uint2 ui2InterpolationSources = g_tex2DInterpolationSource.Load( uint3(In.m_f4Pos.xy, 0) );
// Ray marching samples are interpolated from themselves, so it is easy to detect them:
if( ui2InterpolationSources.x != ui2InterpolationSources.y )
    discard;

The depth stencil state is configured to increment the stencil value, so all ray marching samples end up marked by 2 in the stencil.

After that, another screen-size quad is rendered with the depth stencil configured to pass only the pixels whose stencil value equals 2 and to discard all others. The pixel shader performs ray marching at the selected locations as described in Algorithm 2 of section 4.

Implementing colored light shafts is straightforward. We only need a stained glass color texture as seen from the light source, and on each iteration of Algorithm 2 we fetch a color from this texture along with the shadow depth; this color is then used to modulate the sun color, as sketched below. Note that in this case we cannot skip long lit sections of the ray, which makes the 1D min/max optimization less efficient.
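A hedged illustration of what one march iteration might add; the stained glass texture, sampler and accumulation variables are assumed names, not the sample's actual interface:

// Inside the ray marching loop of Algorithm 2:
float fLitAmount = g_tex2DLightSpaceDepthMap.SampleCmpLevelZero( samComparison,
                       f2ShadowMapUV, fCurrDepthInLightSpace );
float3 f3GlassColor = g_tex2DStainedGlassColor.SampleLevel( samLinearClamp,
                       f2ShadowMapUV, 0 ).rgb;
// Modulate the sun color by the stained glass color for the lit part of this step:
f3Inscattering += fLitAmount * f3GlassColor * g_LightAttribs.f4SunColor.rgb * fStepContribution;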

6.5. Interpolating in-scattering

The next stage is interpolation of the in-scattered radiance for the rest of the samples from the ray marching ones. For this, the indices of the interpolation sources are loaded from the interpolation source texture and the appropriate samples are loaded from the initial in-scattering texture. This work is done by the following shader:

float3 InterpolateIrradiancePS(SScreenSizeQuadVSOutput In) : SV_Target
{
    uint uiSampleInd = In.m_f4Pos.x;
    uint uiSliceInd = In.m_f4Pos.y;

    // Get interpolation sources
    uint2 ui2InterpolationSources = g_tex2DInterpolationSource.Load( uint3(uiSampleInd, uiSliceInd, 0) );
    float fInterpolationPos = float(uiSampleInd - ui2InterpolationSources.x) /
                              float( max(ui2InterpolationSources.y - ui2InterpolationSources.x, 1) );

    float3 f3Src0 = g_tex2DInitialInsctrIrradiance.Load( uint3(ui2InterpolationSources.x, uiSliceInd, 0) );
    float3 f3Src1 = g_tex2DInitialInsctrIrradiance.Load( uint3(ui2InterpolationSources.y, uiSliceInd, 0) );

    // Ray marching samples are interpolated from themselves
    return lerp(f3Src0, f3Src1, fInterpolationPos);
}

6.6. Transforming epipolar in-scattering to rectangular coordinates

At this stage we have to transform the inscattering image from epipolar coordinates back to rectangular space. For this purpose, we first need to determine the two closest epipolar lines, project the pixel onto these and perform bilateral interpolation. Rendering is done with the depth stencil state configured to increment the stencil value. Note that a different depth stencil buffer is used at this stage. The following steps are performed:

  • The epipolar line which contains the current pixel is identified

  • The current sample is projected onto the line

  • Bilateral interpolation is performed using the interpolated in-scattering texture and the epipolar camera space z

In the first step, we have to determine which epipolar line the current pixel lies on. What we need to do is connect the light source's projected position with one of the four boundaries and determine the slice entry and exit points. This is not as simple as it may seem at first glance, because the light source could be outside the screen: the obvious solution of finding the closest boundary intersection along the ray connecting the light source and the current pixel does not work in that case. The universal solution we implemented is based on determining which of 4 sectors the pixel belongs to (fig. 19). Knowing the sector, we can immediately connect the light source with the appropriate border and get the exit point. After that we can compute the entry point using the GetEpipolarLineEntryPoint() function described in section 6.1.

Fig.19: Four sectors of the screen formed by connecting light source with screen corners

To determine the sector, we first determine which of the four half-spaces, formed by connecting each of the four screen corners with the light source, the current pixel belongs to (fig. 20).

Fig.20: Four half-spaces formed by connecting each of the four screen corners with the light source

It is then easy to see that for any pixel P, each sector is the intersection of two adjacent half-spaces: P belongs to a sector exactly when it lies inside the half-space of one of the sector's two bounding corner rays and outside the half-space of the other (this is precisely what b4SectorFlags encodes below).

The following code snippet efficiently computes the sector mask vector, which has 1 in the component corresponding to the required sector:

float2 f2RayDir = normalize( In.m_f2PosPS - g_LightAttribs.f4LightScreenPos.xy );

float4 f4HalfSpaceEquationTerms = (In.m_f2PosPS.xxyy - float4(-1,1,-1,1)) * f2RayDir.yyxx;

bool4 b4HalfSpaceFlags = f4HalfSpaceEquationTerms.xyyx < f4HalfSpaceEquationTerms.zzww;

bool4 b4SectorFlags = b4HalfSpaceFlags.wxyz && !b4HalfSpaceFlags.xyzw;

Having this mask, we can compute the epipolar line exit point by computing the distances to all four boundaries and selecting the required one using b4SectorFlags. After that, the entry point can be computed with GetEpipolarLineEntryPoint():

float4 f4DistToBoundaries = ( float4(-1,-1,1,1) - g_LightAttribs.f4LightScreenPos.xyxy ) /
                            ( f2RayDir.xyxy + float4( abs(f2RayDir.xyxy) < 1e-6 ) );

// Select distance to the exit boundary:
float fDistToExitBoundary = dot( b4SectorFlags, f4DistToBoundaries );

// Compute exit point on the boundary:
float2 f2ExitPoint = g_LightAttribs.f4LightScreenPos.xy + f2RayDir * fDistToExitBoundary;

float2 f2EntryPoint = GetEpipolarLineEntryPoint(f2ExitPoint);

Note that the method above works for any light source location, whether inside or outside the screen.

The epipolar slice number, corresponding to the ordering presented in section 6.1, can be computed as follows:

float4 f4EpipolarSlice = float4(0, 0.25, 0.5, 0.75) + (0.5 + float4(-0.5, +0.5, +0.5, -0.5)*f2ExitPoint.yxyx)/4.0;

float fEpipolarSlice = dot(b4SectorFlags, f4EpipolarSlice);

The sample location on the epipolar slice is computed by simply projecting the pixel onto the line connecting the entry and exit points, as sketched below. (As a sanity check: for the left boundary, the expression above reduces to (0.5 - 0.5*y)/4, giving 0 at the top-left corner and 0.25 at the bottom-left one, which matches the ordering in fig. 10.) This gives us the coordinate f2ScatteredColorUV in the interpolated inscattering texture to filter from.
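A minimal sketch of that projection, using names from the snippets above; the u/v layout matches the coordinate texture (u is the position along the line, v is the slice):

// Project the pixel onto the entry->exit segment:
float2 f2EpipolarDir = f2ExitPoint - f2EntryPoint;
float fLineLenSqr = max( dot(f2EpipolarDir, f2EpipolarDir), 1e-8 );
float fSamplePosOnLine = dot( In.m_f2PosPS - f2EntryPoint, f2EpipolarDir ) / fLineLenSqr;
float2 f2ScatteredColorUV = float2( fSamplePosOnLine, fEpipolarSlice );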

To perform bilateral interpolation we need to compute the bilinear weights and the locations of the source samples. The following code snippet computes the bilinear filter weights as well as the texture coordinates f2ScatteredColorIJ of the center of the left-bottom source texel:

float2 f2ScatteredColorUVScaled = f2ScatteredColorUV.xy * f2ScatteredColorTexDim.xy - float2(0.5, 0.5);
float2 f2ScatteredColorIJ = floor(f2ScatteredColorUVScaled);
// Get bilinear filtering weights
float2 f2BilinearWeights = f2ScatteredColorUVScaled - f2ScatteredColorIJ;
// Get texture coordinates of the left bottom source texel. Again, offset by 0.5 is essential
// to align with texel center
f2ScatteredColorIJ = (f2ScatteredColorIJ + float2(0.5, 0.5)) / f2ScatteredColorTexDim.xy;

Camera space z values for the 4 source samples can be obtained with a single Gather() intrinsic call. An important aspect here is offsetting the coordinates to be at the same distance from all four source texels, which eliminates rounding artifacts:

float4 f4SrcLocationsCamSpaceZ = g_tex2DEpipolarCamSpaceZ.Gather(samLinearClamp,
    f2ScatteredColorIJ + float2(0.5, 0.5) / f2ScatteredColorTexDim.xy);

After that, the bilateral weights f4BilateralWeights are computed. To filter the inscattering texture using two fetches instead of four, we use the following trick: a weighted sum of two samples can be obtained with hardware-supported bilinear filtering; we only need to compute the appropriate offsets:

float fRow0UOffset = f4BilateralWeights.z / max(f4BilateralWeights.z + f4BilateralWeights.w, 0.001);
fRow0UOffset /= f2SrcTexDim.x;
float3 f3Row0WeightedCol = (f4BilateralWeights.z + f4BilateralWeights.w) *
    tex2DSrcTexture.SampleLevel(Sampler, f2LeftBottomSrcTexelUV + float2(fRow0UOffset, 0), 0, int2(0,0));

float fRow1UOffset = f4BilateralWeights.y / max(f4BilateralWeights.x + f4BilateralWeights.y, 0.001);
fRow1UOffset /= f2SrcTexDim.x;
float3 f3Row1WeightedCol = (f4BilateralWeights.x + f4BilateralWeights.y) *
    tex2DSrcTexture.SampleLevel(Sampler, f2LeftBottomSrcTexelUV + float2(fRow1UOffset, 0), 0, int2(0,1));

f3ScatteredLight = f3Row0WeightedCol + f3Row1WeightedCol;

Note that the presented implementation performs no branching (except for discarding invalid pixels) and filters the inscattering texture bilaterally using just one gather and two bilinear fetches.

If the total bilateral weight is close to zero, there are no appropriate samples from which inscattering can be calculated for this pixel. In this case we discard the pixel, keeping 0 in the stencil, as sketched below.
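A minimal sketch of that rejection test, assuming f4BilateralWeights from above (the epsilon and the renormalization step are assumptions):

float fTotalWeight = dot( f4BilateralWeights, float4(1,1,1,1) );
if( fTotalWeight < 1e-2 )
    discard;                       // stencil keeps 0; fixed later by the correction pass
f3ScatteredLight /= fTotalWeight;  // renormalize the two-fetch result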

6.7. Correcting in-scattering

For each pixel which cannot be correctly interpolated from the inscattering texture, we perform an additional ray marching pass. Since these pixels are marked in the stencil with 0, all we need to do to identify them is configure the depth stencil state to pass only pixels whose stencil value equals 0 and discard all others. Note that at this stage we cannot use the 1D min/max mipmap, because it would have to be constructed for each such sample individually. We also use a lower number of steps at this stage, with no visual impact.

6.8. Up-scaling in-scattering to original resolution

The final step of the algorithm is up-scaling the downscaled inscattering texture to the original resolution and applying it to the attenuated background. This step is very similar to transforming the inscattering image from epipolar to rectangular coordinates, the main difference being that the original depth buffer is used to load camera space z and the downscaled inscattering texture is used to load inscattering values. In the same manner, pixels which cannot be correctly interpolated are marked in the stencil and finally corrected in a later pass. Note that we also apply the phase function here, because it exhibits high variation near the epipole.

7. Sample structure
The sample project consists of the following files:

* LightScattering.h, LightScattering.cpp – Responsible for application startup/shutdown, rendering the scene and the shadow map, and handling user input
* LightSctrPostProcess.h, LightSctrPostProcess.cpp – Responsible for performing all the post-processing steps that create the light scattering effect
* RenderTechnique.h, RenderTechnique.cpp – Provide auxiliary functionality for creating effect techniques
* Common.fxh – Contains common shader definitions
* LightScattering.fx – Contains all the shaders performing the post-processing steps
* RefineSampleLocations.fx – Contains the compute shader performing sample refinement
* Structures.fxh – Contains definitions of the structures used by the shaders

8. How to integrate the technique into your engine

Since the technique is entirely post-processing, integrating it into an existing engine is relatively simple. The technique is fully implemented in LightSctrPostProcess.h, LightSctrPostProcess.cpp and the shader files Common.fxh, LightScattering.fx, RefineSampleLocations.fx and Structures.fxh. To compile the files you only need the DirectX SDK.
All the work is done by the PerformPostProcessing() method of the CLightSctrPostProcess class, which takes two structs as arguments:

void PerformPostProcessing(SFrameAttribs &FrameAttribs, SPostProcessingAttribs &PPAttribs)

The first structure defines attributes for the frame to be processed and has the following declaration:

struct SFrameAttribs
{
    ID3D11Device *pd3dDevice;
    ID3D11DeviceContext *pd3dDeviceContext;
    SLightAttribs LightAttribs;
    SCameraAttribs CameraAttribs;
    ID3D11ShaderResourceView *ptex2DSrcColorBufferSRV;
    ID3D11ShaderResourceView *ptex2DDepthBufferSRV;
    ID3D11ShaderResourceView *ptex2DShadowMapSRV;
    ID3D11ShaderResourceView *ptex2DStainedGlassSRV;
    ID3D11RenderTargetView *pDstRTV;
    ID3D11DepthStencilView *pDstDSV;
};

The structure has two nested structures: one defines the light source parameters (such as light direction and color, light source position on the screen, etc.), the other defines the camera attributes (position; world, view and projection matrices). The destination render target to which the post-processed scene is written is specified by the pDstRTV member of the structure.

The second argument, PPAttribs, of PerformPostProcessing() defines the method parameters, such as the number of epipolar slices, the number of samples, the initial sample step, etc.

9. Performance

Performance results are given for rendering the scene shown in fig. 21.

Fig.21: The scene used for performance test

The effect configuration is:

  • Number of epipolar slices: 512

  • Total number of samples in each slice: 256

  • Initial sample step: 16

  • Downscale factor: 4

  • Shadow map resolution: 1024 x 1024

  • Screen resolution 1024x768

Hardware configuration: 3rd Gen Intel® Core™ processor (code-named Ivy Bridge) with Intel® HD Graphics, 4 GB RAM.

The total time required to render the scene is 18.7 ms, of which 9.906 ms (52.9%) is spent on post-processing.

The timings for the individual steps of the algorithm are as follows:

  • Reconstructing camera space z coordinate: 0.390 ms (3.94 %)

  • Rendering coordinates texture: 0.595 ms (6.01 %)

  • Refining sample locations: 0.557 ms (5.62 %)

  • Constructing min/max shadow map: 0.736 ms (7.43 %)

  • Ray marching: 2.638 ms (26.63 %)

  • Interpolation: 0.141 ms (1.42 %)

  • Transformation to rectangular coords: 0.306 ms (3.09 %)

  • Fixing depth discontinuities: 1.015 ms (10.25 %)

  • Upscaling: 2.498 ms (25.22 %)

  • Fixing depth discontinuities: 1.030 ms (10.40 %)

  • Total: 9.906 ms (100 %)

Without the 1D min/max mipmap optimization, the ray marching step alone takes 5.651 ms, which is 1.67x slower than the 2.638 + 0.736 = 3.374 ms of the optimized version (ray marching plus min/max map construction). The performance improvement is considerably higher for higher shadow map resolutions or a higher number of epipolar slices/samples per slice.

Note that according to the timings presented in [ED10] for an ATI Radeon HD4850 GPU, epipolar line generation took 1.2 ms while the discontinuity search took 6.0 ms, which is 5 times slower. In our implementation the discontinuity search takes even less time than coordinate texture generation, thanks to the compute shader optimization.

Note also that Chen et al. report 55 fps for rendering their test scene on a high-end NVidia GTX480 GPU [CBDJ11]. With similar quality settings (4096x4096 shadow map, 1024 epipolar slices with 1024 samples, 1x downscale factor, colored light shafts, 1280x960 screen resolution) and a scene of similar complexity (fig. 22), we were able to get more than 100 fps on the same graphics card and a similar CPU.

Fig.22: Colored light shafts in high-quality settings

10. Conclusion

The volumetric lighting method presented in this sample efficiently combines epipolar sampling with 1D min/max mipmap construction and a number of optimization tricks to achieve high rendering speed and visual quality on Intel integrated graphics. Being a fully post-processing technique, it can easily be integrated into game engines or other applications.

The source code for this sample can be found on the page linked in the original post; note, however, that the algorithm given here only handles light shafts from directional light sources, and implementations for other light source types must be downloaded from their corresponding page. The slides of the GDC talk accompanying this article were attached in the original post.

11. References

[PSS99] Preetham A.J., Shirley P., Smits B.E. (1999) A practical analytic model for daylight. In: Computer Graphics Proceedings, Annual Conference Series (Proc.SIGGRAPH ’99), pp 91–100.

[NSTN93] Nishita T., Sirai T., Tadamura K., Nakamae E.: Display of the Earth taking into account atmospheric scattering. In SIGGRAPH 93 (1993), ACM, pp. 175–182.

[HP02] Hoffman N., Preetham A. J.: Rendering outdoor light scattering in real time. Proceedings of Game Developer Conference (2002).

[ED10] Engelhardt, T., and Dachsbacher, C. 2010. Epipolar sampling for shadows and crepuscular rays in participating media with single scattering. In Proc. 2010 ACM SIGGRAPH symposium on Interactive 3D Graphics and Games, ACM, 119–125.

[BN08] Bruneton, E., and Neyret, F.: Precomputed atmospheric scattering. Computer Graphics Forum (Proceedings of the 19th Eurographics Symposium on Rendering 2008) 27:4 (2008), 1079–1086. Special issue.

[CBDJ11] Chen, J., Baran, I., Durand, F., and Jarosz, W. 2011. Real-time volumetric shadows using 1d min-max mipmaps. In Proceedings of the Symposium on Interactive 3D Graphics and Games, 39–46.

[GMF09] Gautron, P., Marvie, J.-E., and Francois, G., 2009. Volumetric shadow mapping. ACM SIGGRAPH 2009 Sketches.

[TIS08] Tevs, A., Ihrke, I., and Seidel, H.-P. 2008. Maximum mipmaps for fast, accurate, and scalable dynamic height field rendering. In Symposium on Interactive 3D Graphics and Games (i3D’08), 183–190.
