kch3782 / torcwa Goto Github PK

View Code? Open in Web Editor NEW

76.0 5.0 12.0 4.26 MB

GPU-accelerated RCWA with automatic differentiation

License: Other

Jupyter Notebook 94.69% Python 5.31%

electromagnetic-simulation optics-simulation rigorous-coupled-wave pytorch

torcwa's People

Contributors

Stargazers

Watchers

Forkers

nunofernandes-plight cescchen1990 zhixwang mahmutruzi eminsight markma1990 zjuyinge satadrudas wenhan-zhang-327 sylee871103 sprbull cxcg

torcwa's Issues

Extract the electric field and angle by orders

Hi authors, can the tool extract the electric field by each order? For example, the xy-field or xz-field of order (0,1), (1,1),...

I also want to extract the angle of each order within a particular layer. There is a function in rcwa.py called diffraction_angle(), however I didn't find any introduction or explanation for this function in the manual.

I want to do some follow-up computation by orders, using both the field and the angle of it.

Thanks!

Reference Manual?

Dear authors,

Hi, many thanks for the library. Is there a reference manual describing the full commands, other than the attached user_guide_v1.0.pdf in the repo?

Thanks!

Multi-Objective FoM Parallelization

Has there been any consideration for parallelization when working with a multi-objective FoM? I've been able to successfully implement a basic for-loop for the multi-objective FoM (such as optimizing for different input angles or wavelengths). But this process is serial, so the time to run scales linearly with number of objectives.

Dispersion curves and mode analysis

Hello,
Congratulations on developing TORCWA!

1- Would you please advise on how to calculate mode dispersion curves of an individual layer, and mode evolution with geometric or materials parameters? Though I am new to the RCWA method, but I understand that computing eigenmodes of individual layer is the first step in the implemented algorithm. So, would it be possible to plot eigenvalues maps (such as Dirac cones) without the need to perform reflection/transmission computations?

2- A follow up on the first question, what is the best approach to track modes coupling evolutions across multiple layers with a geometric tuning in one of them?

My apologies if my questions are not actually issues with the package, but to my understanding providing hints or examples to tackle these questions could be of great help to a broad audience.

Thank you,
Hadi

Definition of "order"

Hi, I wonder about the definition of the order/truncations.

For example, if I set "order = 5", does it mean the orders considered in the Fourier computations are "-5, -4, ..., +4, +5" or "-2, -1, 0, +1, +2" (5 orders in total)?

I assume it's the former one, but I just to make sure if that is a correct understanding.

Thanks!

PEC/Metallic Structures

I was wondering if/how to implement TORCWA with a PEC/metallic back-reflector? For a metallic back-reflector, is it as simple as changing the substrate_eps to be a metallic dielectric function or is it more complicated than that? In a similar vein, what is the appropriate method to define a PEC? Would it be setting substrate_eps = inf?

_LinAlgError for a specific lattice size (px=3000)

Dear Authors,

Hi, thanks a lot for your library. I recently encountered a very strange error using this library. My code is as follows, which is only slightly modified from your Example0.ipynb:

# Import
import numpy as np
import torch
from matplotlib import pyplot as plt

import torcwa

# Hardware
# If GPU support TF32 tensor core, the matmul operation is faster than FP32 but with less precision.
# If you need accurate operation, you have to disable the flag below.
torch.backends.cuda.matmul.allow_tf32 = False
sim_dtype = torch.complex64
geo_dtype = torch.float32
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Simulation environment
# light
lamb0 = 375.                # nm
azi_ang = 0.*(np.pi/180)    # radian

# material
substrate_eps =  1.4782**2

# geometry
px = 3000
L = [px, px*np.sqrt(3)]            # nm / nm
torcwa.rcwa_geo.device = device
torcwa.rcwa_geo.dtype = geo_dtype
torcwa.rcwa_geo.Lx = L[0]
torcwa.rcwa_geo.Ly = L[1]
res = 30
torcwa.rcwa_geo.nx = int(L[0]/res)
torcwa.rcwa_geo.ny = int(L[1]/res)
torcwa.rcwa_geo.grid()
# torcwa.rcwa_geo.edge_sharpness = 1000.

x_axis = torcwa.rcwa_geo.x.cpu()
y_axis = torcwa.rcwa_geo.y.cpu()

# Generate and perform simulation
order_N = 12
order = [order_N,order_N]
sim = torcwa.rcwa(freq=1/lamb0,order=order,L=L,dtype=sim_dtype,device=device)
sim.add_input_layer(eps=substrate_eps)
sim.add_output_layer(eps=1.**2)
sim.set_incident_angle(inc_ang=0.,azi_ang=0.)

sim.add_layer(thickness=300, eps=1.6**2)
sim.add_layer(thickness=80e3, eps=1.**2)

When I run this snippet (in a Jupyter notebook), I got the following error:

_LinAlgError                              Traceback (most recent call last)
[Example 0 - Fresnel equation_modify.ipynb Cell 1 in 5
     [50](line=49) sim.add_output_layer(eps=1.**2)
     [51](line=50) sim.set_incident_angle(inc_ang=0.,azi_ang=0.)
---> [53](line=52) sim.add_layer(thickness=300, eps=1.6**2)
     [54](line=53) sim.add_layer(thickness=80e3, eps=1.**2)

File [c:\...\anaconda3\envs\torch\lib\site-packages\torcwa\rcwa.py:170](file:///C:/.../anaconda3/envs/torch/lib/site-packages/torcwa/rcwa.py:170), in rcwa.add_layer(self, thickness, eps, mu)
    167 else:
    168     self._eigen_decomposition()
--> 170 self._solve_layer_smatrix()

File [c:\...\anaconda3\envs\torch\lib\site-packages\torcwa\rcwa.py:1271](file:///C:/.../anaconda3/envs/torch/lib/site-packages/torcwa/rcwa.py:1271), in rcwa._solve_layer_smatrix(self)
   1268 Ctmp = torch.hstack((Ctmp1,Ctmp2))
   1270 # Mode coupling coefficients
-> 1271 self.Cf.append(torch.matmul( torch.linalg.inv(Ctmp), torch.vstack((2*torch.eye(2*self.order_N,dtype=self._dtype,device=self._device),
   1272     torch.zeros([2*self.order_N,2*self.order_N],dtype=self._dtype,device=self._device))) ))
   1273 self.Cb.append(torch.matmul( torch.linalg.inv(Ctmp), torch.vstack((torch.zeros([2*self.order_N,2*self.order_N],dtype=self._dtype,device=self._device),
   1274     2*torch.eye(2*self.order_N,dtype=self._dtype,device=self._device))) ))
   1276 self.layer_S11.append(torch.matmul(torch.matmul(self.E_eigvec[-1],phase), self.Cf[-1][:2*self.order_N,:]) + torch.matmul(self.E_eigvec[-1],self.Cf[-1][2*self.order_N:,:]))

_LinAlgError: torch.linalg.inv: The diagonal element 2 is zero, the inversion could not be completed because the input matrix is singular.

I find this error very strange because it only happens when px=3000. If I change the value of px from 3000 to either 2999 or 3001, the error is no longer there.

Do you have any idea why it happens? Is there any wrong way of using your library here?

Many thanks!

Best wishes,

"Materials" module

What is your source for the "Materials" module you use in the code? I do not think it is the same as nschloe/materials, nor any other publicly available "materials" python module.

CUDA out of memory in optimization problems

Hi authors, first I want to thank you again for the great work.

One issue I quite often have, is that when solving optimization problems, CUDA easily gets out of memory.

In my case, my optimization merit takes the 3D field as input. This means that I need to compute the 3D field, and retain the corresponding gradients as well. Because the highest memory in consumer GPUs is 24GB, and multi-GPU is not supported yet, when the meshing gets higher or diffraction orders get large, CUDA gets out of memory.

Is there any suggestion on your side to deal with this issue?

Many thanks!

What is the library for "import Materials"?

Hi, which is the library required for "import Materials" in the example files?

Thanks!

Increased GPU Memory Usage compared to reference paper

I am running some tests on memory usage versus number of truncated Fourier orders and my memory usage seems much higher than I expected based on the TORCWA text. Specifically, I am running the structure in example 2 with the order number set to 41. My memory usage is about 21 GB of 25 GB available on an Nvidia RTX A-5000. But based on the text, the 2060 Super was able to run the 41x41 TFO, which has much less memory than 21 GB it appears that the simulation requires.

Are there any possible reasons why the memory usage could be increased or ways to reduce it? I am running a slightly modified version of example 2, starting from a fresh kernel. Only modification was to remove the sweep of orders to just a single simulation of order 41,41.

Correction, the simulation did not finish and ran into an out of memory issue. Given that the A5000 is roughly equivalent to the 3090, this seems to be a large inconsistency.

field_xyz ?

Hi, I noticed that the program supports to extract field_xy, field_yz and field_xz.

Is it possible to extract field_xyz, so that I can do more complicated computations afterwards?

Thanks!

Batch simulation with varying layer thickness

Dear authors,

hi, thanks a lot for the nice RCWA tool. I have a question about improving the simulation efficiency:

In my model, there are several layers behind the grating. Among these layers, one of them is air (say, Layer 1). My goal is to extract the electric field pattern within the target layer (let's say Layer N). Furthermore, I need to collect the results for varying air gaps, i.e. with different thicknesses of Layer 1.

In principle, I can set it as several independent simulation jobs. In each job, I create a sim object, add the information of each layer and solve this object, in a repeated way. However, because the grating layer remains the same in all these sub-jobs, there are a lot of duplication computations done in this approach, and it results in a long running time.

I wonder in this case, where the variation of the input parameter is only the thickness of one uniform layer, would it be possible to get the simulations done more efficiently?

I hope I have explained myself clearly. Thank you very much in advance!

convergence of structured geom to homogenous layer when lattice size equals to the geom size

Hello dear Changhyun Kim,
Thanks for making the code available.
I tried to run a simulation where the size of a rectangular strucrue slows increases to the size of the cell lattice size, which is then essentially a homogenous thin layer. However, the calculated result is different from analytical formulas. I noticed in the example that the setup for homogenous thin film is different.
Can you comment why the code is designed that way?
It will be great if the setup is the same where one can do simulations on structures that smoothly transition from patterned layers to homogenous layer.

Absorption layer-by-layer and incoherent layers

Dear authors, I'm writing to you to ask your help to include, if possible, in the code the absorption layer by layer calculation for example in a multi-layered structures such as a metal- insulator- metal (MIM) one.

I wrote the following code that compute R,T and total absorption for all layer.

import numpy as np
import torch
import scipy.io
from matplotlib import pyplot as plt
import time
from tqdm import tqdm
import os
from pathlib import Path

import torcwa
import Materials
torch.backends.cuda.matmul.allow_tf32 = False
sim_dtype = torch.complex64
geo_dtype = torch.float32
device = torch.device('cuda')

inc_ang = 0.*(np.pi/180)    # radian
azi_ang = 0.*(np.pi/180)    # radian
# geometry
L = [1024.,1024.]            # nm / nm
torcwa.rcwa_geo.dtype = geo_dtype
torcwa.rcwa_geo.device = device
torcwa.rcwa_geo.Lx = L[0]
torcwa.rcwa_geo.Ly = L[1]
torcwa.rcwa_geo.nx = 1024
torcwa.rcwa_geo.ny = 1024
torcwa.rcwa_geo.grid()
torcwa.rcwa_geo.edge_sharpness = 1000.

x_axis = torcwa.rcwa_geo.x.cpu()
y_axis = torcwa.rcwa_geo.y.cpu()
z_axis = z.cpu()


#layers
layer_Ag=20
layer_ZnO = 90

order_NN = 5 
order = [order_NN,order_NN]
o = np.arange(-order_NN,order_NN+1)
all_orders=np.stack(np.meshgrid(o,o),-1).reshape(-1,2) # Here we consider all the diffracted orders


lam_min = 350.
lam_max = 900.
lam_step = 3.
DL=int((lam_max-lam_min)/lam_step)
lamb0 = torch.linspace(lam_min,lam_max,DL,dtype=geo_dtype,device=device)

# T1_sum = []
T_sum = torch.zeros(len(lamb0))
R_sum = torch.zeros(len(lamb0))
# T_out = []
for lamb0_ind in tqdm(range(len(lamb0))):
   lamb0_now = lamb0[lamb0_ind]
   sim = torcwa.rcwa(freq=1/lamb0_now,order=order,L=L,dtype=sim_dtype,device=device)
   
   Ag_eps = Materials.Ag.apply(lamb0_now/1000)**2
   ZnO_eps = Materials.ZnO.apply(lamb0_now)**2 
   substrate_esp=Materials.SodaLime.apply(lamb0_now/1000)**2

   sim.add_input_layer(eps=1)
   sim.add_output_layer(eps=substrate_esp)
   sim.set_incident_angle(inc_ang=inc_ang,azi_ang=azi_ang)
   
   sim.add_layer(thickness=layer_Ag, eps=Ag_eps)
   sim.add_layer(thickness=layer_ZnO, eps=ZnO_eps)
   sim.add_layer(thickness=layer_Ag, eps=Ag_eps)
   
   sim.solve_global_smatrix()
   
   rpp = sim.S_parameters(orders=all_orders, direction='f', port='r', polarization='pp')
   tpp = sim.S_parameters(orders=all_orders, direction='f', port='t', polarization='pp')
   rss = sim.S_parameters(orders=all_orders, direction='f', port='r', polarization='ss')
   tss = sim.S_parameters(orders=all_orders, direction='f', port='t', polarization='ss')
   rps = sim.S_parameters(orders=all_orders, direction='f', port='r', polarization='ps')
   tps = sim.S_parameters(orders=all_orders, direction='f', port='t', polarization='ps')
   rsp = sim.S_parameters(orders=all_orders, direction='f', port='r', polarization='sp')
   tsp = sim.S_parameters(orders=all_orders, direction='f', port='t', polarization='sp')

   T_sum[lamb0_ind] = ((torch.sum(torch.abs(tpp)**2) + torch.sum(torch.abs(tsp)**2)) + (torch.sum(torch.abs(tss)**2) + torch.sum(torch.abs(tps)**2)))/2
   R_sum[lamb0_ind] = ((torch.sum(torch.abs(rpp)**2) + torch.sum(torch.abs(rsp)**2)) + (torch.sum(torch.abs(rss)**2) + torch.sum(torch.abs(rps)**2)))/2

Lam=lamb0.cpu().numpy()
R=R_sum.cpu().numpy()
T = T_sum.cpu().numpy()
A_sum=1-R-T
Output=[Lam,R,T,A_sum]
Output=np.transpose(Output)
plt.plot(lamb0.cpu(),R)
plt.plot(lamb0.cpu(), A_sum)
plt.plot(lamb0.cpu(), T)
plt.title('Spectrum (order: '+str(order_NN)+')')
plt.xlabel('Wavelength (nm)')
plt.ylabel('Reflectance/ Absorbance (a.u.)')
plt.grid()

Did you have any suggestion to calculate the absorption layer by layer?

Another question that I have is related to the following task. Is it possible to include one or more than one incoherent layer? For example if I add on or below this MIM structure a thick polymer layer of tens micrometer?

Thanks in advance.

NaN occurs when lattice constant equals to wavelength at normal incidence

Congrats on developing the TORCWA. I recently ran into the issue that when you have the incident wavelength equals to the lattice constant the field output will all become NaN when using the package. I found out this is related to the line 1144 in rcwa.py, where we will have 0 value in the variable Kz_norm_dn. In line 1145-1146, the term self.Kx_norm_dn/Kz_norm_dn yields NaN values. Is there a way to fix this issue? Thanks!

1D simulation?

Hi, I wonder how shall the library be used for the 1D grating simulation properly & efficiently.

Thanks!

Materials module not installed

Thank you for making it publicly available.
I installed it. But it seems the Materials module is not available upon installation. Hope it could be fixed.
But I imported some refractive indices data from https://refractiveindex.info and run some simulations. It worked.

Besides, it would be great if a few more materials (for example, silicon) could be added to materials. Or even better, a module to import refractive indices from the the database.

Another improvement could be providing some estimation of how much time it takes to complete each example.

Anyway, thanks again !

Fourier order do't converge for metals

Hello
Thanks for the code and quick response to the comments & issues raised. Have you tried any convergence study for metallic structures ?
I tried it. Please see the notebook: https://github.com/MahmutRuzi/photonics-sim-examples/blob/main/torcwa_AuNanobar2.ipynb

I tried the Fourier truncation from Nx=Ny=9 up to Nx=Ny=22 for the mentioned example. The calculated reflectance is very erratic with respect to NxNy. I could try increasing the terms further, but not sure even that lead to convergence within reasonable time.
The Matlab code RETICOLO converges for NxNy=15 for the sample problem. It seems they implemented the enhanced transmittance method for fast convergence.
Can you please comment if the code as it stands now suitable for such simulations? If not, wow it can be modified?
An approach seems to be using adaptive spatial resolution method (https://doi.org/10.1364/JOSAA.28.000238) to sample fine around the dielectric/metal interface.

Another question not so related question: is it possible to import planar structures (GDSII for example) for the code instead of the predefined rectangles, ellipses, etc?

Thank you

multiple patterns within a single layer

Hello,
Thanks for sharing the code.
Are there any way to include multiple patterns within the same layer ? For example, two rectangles (or ellipses) of the same height within a layer separated by a given distance d . The idea is study the effect of coupling between these structures. One such example is shown in Figure 1 of this article https://doi.org/10.1002/adom.202200812.

GPU memory consumption

Hi authors, I have a question regarding memory consumption.

I find that the GPU memory usage of the following code increases with the iteration number (length of z). I cannot understand why, but I suspect it has something to do with the 'sim.field_xy'. requires_grad is True.

I_total = 0.
for (i, z_prop) in enumerate(z):
    [Ex, Ey, Ez], [_, _, _] = sim.field_xy(2, torcwa.rcwa_geo.x, torcwa.rcwa_geo.y, z_prop)
    I_i = torch.abs(Ex)**2 + torch.abs(Ey)**2 + torch.abs(Ez)**2
    I_total += I_i
    del Ex, Ey, Ez, I_i

Do you have any idea of the possible reasons?

Many thanks!

RuntimeError: Function 'EigBackward' returned nan values in its 0th output. Unstable P/Q Matrices?

I am attempting to run an optimization based on example 6 and I am running into an issue where the backwards propagation of the gradients will fail after a finite number of iterations. By looking into the traceback, the issue seems to arise from the eigen decomposition function, specifically the P and Q matrices. In the paper, it seems the P/Q can become singular matrices and lead to instabilities, but the problem persists if I switch to double-precision-point-floating numbers, though it occurs at a later step. For completeness, code is included. The main changes are changing the material to Tungsten (defined via index 2.5+22.5j), wavelength to 5 microns, design region is 3 um x 3 um with 5 nm mesh, the structure is illuminated from the backwards direction with a Tungsten back-reflector, and I have a two part FoM.

`import os
os.environ["KMP_DUPLICATE_LIB_OK"]="TRUE"

Import

import numpy as np
import torch
import scipy.io
from matplotlib import pyplot as plt
import time

import torcwa
import Materials

Hardware

If GPU support TF32 tensor core, the matmul operation is faster than FP32 but with less precision.

If you need accurate operation, you have to disable the flag below.

torch.backends.cuda.matmul.allow_tf32 = False
#sim_dtype = torch.complex128
#geo_dtype = torch.float64
sim_dtype = torch.complex64
geo_dtype = torch.float32

device = torch.device('cuda')

Simulation environment

light

lamb0 = torch.tensor(5000.,dtype=geo_dtype,device=device) # nm
inc_ang = 0.(np.pi/180) # radian
azi_ang = 0.(np.pi/180) # radian
theta = torch.linspace(0., 60.,2,dtype=geo_dtype,device=device) # nm

material

substrate_eps = 1.46**2
silicon_eps = Materials.aSiH.apply(lamb0)**2

Changing material to be Tungsten (W) for testing purposes. Let's just see if it runs at all.

#W = 1.33+5.24j #At 532 nm
W = 2.9 +22.5j # At 5 microns
silicon_eps = W**2
print(silicon_eps)
print(W)

geometry

L = [3000., 3000.] # nm / nm
torcwa.rcwa_geo.dtype = geo_dtype
torcwa.rcwa_geo.device = device
torcwa.rcwa_geo.Lx = L[0]
torcwa.rcwa_geo.Ly = L[1]
torcwa.rcwa_geo.nx = 600
torcwa.rcwa_geo.ny = 600
torcwa.rcwa_geo.grid()
torcwa.rcwa_geo.edge_sharpness = 1000.

x_axis = torcwa.rcwa_geo.x.cpu()
y_axis = torcwa.rcwa_geo.y.cpu()

layers

layer0_thickness = 300.`

`# Objective function
def objective_function(rho):
order = [12,12]
A_sum = []
for theta_inc in range(len(theta)):
inc_ang = theta[theta_inc](np.pi/180)
sim = torcwa.rcwa(freq=1/lamb0,order=order,L=L,dtype=sim_dtype,device=device)
# Linear in Index, non-linear in eps.
#plt.imshow(rho.detach().cpu().numpy())
#plt.colorbar()
substrate_eps_W = W**2
sim.add_input_layer(eps=substrate_eps_W)
sim.set_incident_angle(inc_ang=inc_ang,azi_ang=azi_ang,angle_layer='output')
# Linear in Index, non-linear in eps.
layer0_eps = substrate_eps
layer1_index = rhoW + (1.-rho)
layer1_eps = layer1_index**2
layer1_thickness = layer0_thickness
#layer0_eps = rho*silicon_eps + (1.-rho)
sim.add_layer(thickness=layer0_thickness,eps=layer0_eps)
sim.add_layer(thickness=layer1_thickness,eps=layer1_eps)
sim.solve_global_smatrix()
#t1xx = sim.S_parameters(orders=[1,0],direction='forward',port='transmission',polarization='xx',ref_order=[0,0])
#t1yy = sim.S_parameters(orders=[1,0],direction='forward',port='transmission',polarization='yy',ref_order=[0,0])
#t1xy = sim.S_parameters(orders=[1,0],direction='forward',port='transmission',polarization='xy',ref_order=[0,0])
#t1yx = sim.S_parameters(orders=[1,0],direction='forward',port='transmission',polarization='yx',ref_order=[0,0])

    if theta_inc == 1: # extra orders for 60 deg.
        r0xx2 = sim.S_parameters(orders=[0,0],direction='backward',port='reflection',polarization='xx',ref_order=[0,0])
        #t0xx2 = sim.S_parameters(orders=[0,0],direction='forward',port='transmission',polarization='xx',ref_order=[0,0])
        rn1xx = sim.S_parameters(orders=[-1,0],direction='backward',port='reflection',polarization='xx',ref_order=[0,0])
       # tn1xx = sim.S_parameters(orders=[-1,0],direction='forward',port='transmission',polarization='xx',ref_order=[0,0])
        Abs = 1-torch.abs(r0xx2)**2-torch.abs(rn1xx)**2
     #   print(Abs)
        A_sum.append(Abs)
    else:
        r0xx = sim.S_parameters(orders=[0,0],direction='backward',port='reflection',polarization='xx',ref_order=[0,0])
        #t0xx = sim.S_parameters(orders=[0,0],direction='forward',port='transmission',polarization='xx',ref_order=[0,0])
       
        Abs =torch.abs(r0xx)**2
      #  print(Abs)
        A_sum.append(Abs)       
return sum(A_sum)/len(theta) #Normalizing by number of angles    `

`torch.autograd.set_detect_anomaly(True)

Perform optimization

optimizer parameters for ADAM optimizer

gar_initial = 0.02
beta1 = 0.9
beta2 = 0.999
epsilon = 1.e-8
iter_max = 200
beta = np.exp(np.arange(start=0,stop=iter_max)np.log(1000)/iter_max)
gar = gar_initial * 0.5(1+np.cos(np.arange(start=0,stop=iter_max)*np.pi/iter_max))

blur kernel

blur_radius = 300.
dx, dy = L[0]/torcwa.rcwa_geo.nx, L[1]/torcwa.rcwa_geo.ny
x_kernel_axis = (torch.arange(torcwa.rcwa_geo.nx,dtype=geo_dtype,device=device)-(torcwa.rcwa_geo.nx-1)/2)*dx
y_kernel_axis = (torch.arange(torcwa.rcwa_geo.ny,dtype=geo_dtype,device=device)-(torcwa.rcwa_geo.ny-1)/2)*dy
x_kernel_grid, y_kernel_grid = torch.meshgrid(x_kernel_axis,y_kernel_axis,indexing='ij')
g = torch.exp(-(x_kernel_grid2+y_kernel_grid2)/blur_radius**2)
g = g/torch.sum(g)
g_fft = torch.fft.fftshift(torch.fft.fft2(torch.fft.ifftshift(g)))

torch.manual_seed(333)
rho = torch.rand((torcwa.rcwa_geo.nx,torcwa.rcwa_geo.ny),dtype=geo_dtype,device=device)
rho = (rho + torch.fliplr(rho))/2
rho_fft = torch.fft.fftshift(torch.fft.fft2(torch.fft.ifftshift(rho)))
rho = torch.real(torch.fft.fftshift(torch.fft.ifft2(torch.fft.ifftshift(rho_fft*g_fft))))
momentum = torch.zeros_like(rho)
velocity = torch.zeros_like(rho)

rho_history = []
FoM_history = []

start_time = time.time()
for it in range(0,iter_max):
rho.requires_grad_(True)
rho_fft = torch.fft.fftshift(torch.fft.fft2(torch.fft.ifftshift(rho)))
rho_bar = torch.real(torch.fft.fftshift(torch.fft.ifft2(torch.fft.ifftshift(rho_fftg_fft))))
rho_tilda = 1/2 + torch.tanh(2beta[it]rho_bar-beta[it])/(2np.math.tanh(beta[it]))
if it == 0:
test = rho_tilda
# Plot
# plt.imshow(test.detach().cpu().numpy())
# plt.colorbar()
FoM = objective_function(rho_tilda)
FoM.backward()

with torch.no_grad():
    rho_gradient = rho.grad
    #print(rho_gradient)
    rho.grad = None

    rho_history.append(rho_tilda.detach().cpu().numpy())
    FoM = float(FoM.detach().cpu().numpy())
    FoM_history.append(FoM)

    momentum = (beta1*momentum + (1-beta1)*rho_gradient)
    velocity = (beta2*velocity + (1-beta2)*(rho_gradient**2))
    rho += gar[it]*(momentum / (1-beta1**(it+1))) / torch.sqrt((velocity / (1-beta2**(it+1))) + epsilon)
    rho[rho>1] = 1
    rho[rho<0] = 0
    rho = (rho + torch.fliplr(rho))/2
    #plt.imshow(rho_gradient.detach().cpu().numpy())
    #plt.colorbar()
    end_time = time.time()
    elapsed_time = end_time - start_time
    print('Iteration:',it,'/ FoM:',int(FoM*10000)/10000,'/ Elapsed time:',str(int(elapsed_time))+' s')`

Focusing of field

Hello,
I want to ask if this RCWA can calculate the focused field instead of the diffracted field?

Normalization of intensity integrated on all orders

Dear authors, thanks for releasing this very nice software. I tried to launch a quick simulation to understand how the code handles normalization when integrating over all orders, but I'm getting extremely large results in transmission. I've tried increasing the number of harmonics, playing with ref_order and the evanescent parameter, or also increasing the precision to sim_dtype = torch.complex128 and geo_dtype = torch.float64, but none of these seem to help. Do you have any suggestions?
Also, when the simulations wavelength matches the structure period, I see that the torcwa returns zero diffraction efficiencies.
Please, see the script below.
Thank you for your help!

import numpy as np
import torch
import matplotlib.pyplot as plt
from tqdm import tqdm

import torcwa

torch.backends.cuda.matmul.allow_tf32 = False
sim_dtype = torch.complex64
geo_dtype = torch.float32
device = torch.device('cuda')

lam_min = 400.
lam_max = 600.
lam_step = 2.
DL = int((lam_max-lam_min)/lam_step)
lam_arr = torch.linspace(lam_min, lam_max, DL, dtype=geo_dtype, device=device)

nx, ny, edge_sharpness = 256, 256, 128
L = [400., 400.] # the period coincides by chance with the first wavelength ... 
geom = torcwa.geometry(Lx=L[0], Ly=L[1], nx=nx, ny=ny, edge_sharpness=edge_sharpness, dtype=geo_dtype, device=device)

# define layer
R = 120.
layer_geom = geom.circle(R=R, Cx=L[0]/2.,Cy=L[1]/2.)
layer_thickness = 400.

substrate_eps = 1.5**2

order_N = 10
order = [order_N, order_N]

# define array with all orders
o = np.arange(-order_N, order_N+1)
all_orders = np.stack(np.meshgrid(o,o),-1).reshape(-1,2)

T_sum = torch.zeros_like(lam_arr)
R_sum = torch.zeros_like(lam_arr)

for lam_idx in tqdm(range(len(lam_arr))):
  
   sim = torcwa.rcwa(freq=1.0/lam_arr[lam_idx], order=order, L=L, dtype=sim_dtype, device=device)
   
   sim.add_input_layer(eps=1.0)
   sim.add_output_layer(eps=substrate_eps)
   sim.set_incident_angle(inc_ang=0.0, azi_ang=0.0) # normal incidence

   layer_eps = layer_geom*substrate_eps + (1.-layer_geom)
   sim.add_layer(thickness=layer_thickness, eps=layer_eps)
   
   sim.solve_global_smatrix()

   rxx = sim.S_parameters(orders=all_orders, direction='f', port='r', polarization='xx')
   txx = sim.S_parameters(orders=all_orders, direction='f', port='t', polarization='xx')
   ryy = sim.S_parameters(orders=all_orders, direction='f', port='r', polarization='yy')
   tyy = sim.S_parameters(orders=all_orders, direction='f', port='t', polarization='yy')
   
   T_sum[lam_idx] = (torch.sum(torch.abs(txx)**2) + torch.sum(torch.abs(tyy)**2))/2
   R_sum[lam_idx] = (torch.sum(torch.abs(rxx)**2) + torch.sum(torch.abs(ryy)**2))/2

eps_out, _ = sim.return_layer(layer_num=0, nx=geom.nx, ny=geom.ny)

fig, axarr = plt.subplots(1,2, figsize=(16,4))
eps_im = axarr[0].imshow(eps_out.cpu().real, extent=[-L[0]/2, L[0]/2, -L[1]/2, L[1]/2])
fig.colorbar(eps_im, ax=axarr[0])
axarr[0].set(title=f'$\epsilon$ of layer #0 (FO: {order_N})',
             xlabel='x [nm]', ylabel='y [nm]')
axarr[1].plot(lam_arr.cpu(), R_sum.cpu(), label='R')  
axarr[1].plot(lam_arr.cpu(), T_sum.cpu(), label='T')
axarr[1].set(title=f'spectrum (FO: {order_N})',
             xlabel='$\lambda$ [nm]',
             ylabel='normalized(?) intensity')
axarr[1].legend()
plt.show()

Which gives

Topological optimization for unit cell smaller than source wavelength

Dear author,

I have run the code for Example 6 (topology optimization) in the depository and noticed that an error occurs when the length of the geometry (Lx) is set to be smaller than the wavelength of the incident light, e.g. Lx=400nm, lambda=532nm.

In particular, the optimization loop only runs for one iteration before terminating. The following messages appear indicating an infinite value for some matrix in the calculation of the E and H fields in layer0:

Cell In[9], line 39
36 rho_bar = torch.real(torch.fft.fftshift(torch.fft.ifft2(torch.fft.ifftshift(rho_fftg_fft))))
37 rho_tilda = 1/2 + torch.tanh(2beta[it]rho_bar-beta[it])/(2np.math.tanh(beta[it]))
---> 39 FoM = objective_function(rho_tilda)
40 FoM.backward()
42 with torch.no_grad():

Cell In[8], line 9, in objective_function(rho)
7 sim.set_incident_angle(inc_ang=inc_ang,azi_ang=azi_ang)
8 layer0_eps = rho*silicon_eps + (1.-rho)
----> 9 sim.add_layer(thickness=layer0_thickness,eps=layer0_eps)
10 sim.solve_global_smatrix()
11 t1xx = sim.S_parameters(orders=[1,0],direction='forward',port='transmission',polarization='xx',ref_order=[0,0])

File ~.conda\envs\torcwa\lib\site-packages\torcwa\rcwa.py:168, in rcwa.add_layer(self, thickness, eps, mu)
166 self._eigen_decomposition_homogenous(eps,mu)
167 else:
--> 168 self._eigen_decomposition()
170 self._solve_layer_smatrix()

File ~.conda\envs\torcwa\lib\site-packages\torcwa\rcwa.py:1236, in rcwa._eigen_decomposition(self)
1234 # Eigen-decomposition
1235 if self.stable_eig_grad is True:
-> 1236 kz_norm, E_eigvec = Eig.apply(torch.matmul(self.P[-1],self.Q[-1]))
1237 else:
1238 kz_norm, E_eigvec = torch.linalg.eig(torch.matmul(self.P[-1],self.Q[-1]))

File ~.conda\envs\torcwa\lib\site-packages\torch\autograd\function.py:506, in Function.apply(cls, *args, **kwargs)
503 if not torch._C._are_functorch_transforms_active():
504 # See NOTE: [functorch vjp and autograd interaction]
505 args = _functorch.utils.unwrap_dead_wrappers(args)
--> 506 return super().apply(*args, **kwargs) # type: ignore[misc]
508 if cls.setup_context == _SingleLevelFunction.setup_context:
509 raise RuntimeError(
510 'In order to use an autograd.Function with functorch transforms '
511 '(vmap, grad, jvp, jacrev, ...), it must override the setup_context '
512 'staticmethod. For more details, please see '
513 'https://pytorch.org/docs/master/notes/extending.func.html')

File ~.conda\envs\torcwa\lib\site-packages\torcwa\torch_eig.py:14, in Eig.forward(ctx, x)
11 @staticmethod
12 def forward(ctx,x):
13 ctx.input = x
---> 14 eigval, eigvec = torch.linalg.eig(x)
15 ctx.eigval = eigval.cpu()
16 ctx.eigvec = eigvec.cpu()

RuntimeError: torch.linalg.eig: input tensor should not contain infs or NaNs.

This error does not occur for geometry sizes smaller than the wavelength when there is no topology optimization process involved. I tried to inspect the relevant lines in the python files but could not find any obvious errors.

Do you know if there is any change I should make to the code? Thank you very much.

kch3782 / torcwa Goto Github PK

torcwa's People

Contributors

Stargazers

Watchers

Forkers

torcwa's Issues

Import

Hardware

If GPU support TF32 tensor core, the matmul operation is faster than FP32 but with less precision.

If you need accurate operation, you have to disable the flag below.

Simulation environment

light

material

Changing material to be Tungsten (W) for testing purposes. Let's just see if it runs at all.

geometry

layers

Perform optimization

optimizer parameters for ADAM optimizer

blur kernel

Recommend Projects

Recommend Topics

Recommend Org