<div dir="auto"><div>TSRK is an explicit solver. Unless you are changing the ts type from command line,  the explicit  jacobian should not be needed. On top of Barry's suggestion, I would suggest you to write the explicit RHS instead of assembly a throw away matrix every time that function needs to be sampled.<br><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Aug 9, 2023, 17:09 Niclas Götting <<a href="mailto:ngoetting@itp.uni-bremen.de">ngoetting@itp.uni-bremen.de</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi all,<br>

<br>

I'm currently trying to convert a quantum simulation from scipy to <br>

PETSc. The problem itself is extremely simple and of the form \dot{u}(t) <br>

= (A_const + f(t)*B_const)*u(t), where f(t) in this simple test case is <br>

a square function. The matrices A_const and B_const are extremely sparse <br>

and therefore I thought, the problem will be well suited for PETSc. <br>

Currently, I solve the ODE with the following procedure in scipy (I can <br>

provide the necessary data files, if needed, but they are just some <br>

trace-preserving, very sparse matrices):<br>

<br>

import numpy as np<br>

import scipy.sparse<br>

import scipy.integrate<br>

<br>

from tqdm import tqdm<br>

<br>

<br>

l = np.load("../liouvillian.npy")<br>

pump = np.load("../pump_operator.npy")<br>

state = np.load("../initial_state.npy")<br>

<br>

l = scipy.sparse.csr_array(l)<br>

pump = scipy.sparse.csr_array(pump)<br>

<br>

def f(t, y, *args):<br>

     return (l + 0.5 * (5 < t < 10) * pump) @ y<br>

     #return l @ y # Uncomment for f(t) = 0<br>

<br>

dt = 0.1<br>

NUM_STEPS = 200<br>

res = np.empty((NUM_STEPS, 4096), dtype=np.complex128)<br>

solver = <br>

scipy.integrate.ode(f).set_integrator("zvode").set_initial_value(state)<br>

times = []<br>

for i in tqdm(range(NUM_STEPS)):<br>

     res[i, :] = solver.integrate(solver.t + dt)<br>

     times.append(solver.t)<br>

<br>

Here, A_const = l, B_const = pump and f(t) = 5 < t < 10. tqdm reports <br>

about 330it/s on my machine. When converting the code to PETSc, I came <br>

to the following result (according to the chapter <br>

<a href="https://petsc.org/main/manual/ts/#special-cases" rel="noreferrer noreferrer" target="_blank">https://petsc.org/main/manual/ts/#special-cases</a>)<br>

<br>

import sys<br>

import petsc4py<br>

petsc4py.init(args=sys.argv)<br>

import numpy as np<br>

import scipy.sparse<br>

<br>

from tqdm import tqdm<br>

from petsc4py import PETSc<br>

<br>

comm = PETSc.COMM_WORLD<br>

<br>

<br>

def mat_to_real(arr):<br>

     return np.block([[arr.real, -arr.imag], [arr.imag, <br>

arr.real]]).astype(np.float64)<br>

<br>

def mat_to_petsc_aij(arr):<br>

     arr_sc_sp = scipy.sparse.csr_array(arr)<br>

     mat = PETSc.Mat().createAIJ(arr.shape[0], comm=comm)<br>

     rstart, rend = mat.getOwnershipRange()<br>

     print(rstart, rend)<br>

     print(arr.shape[0])<br>

     print(mat.sizes)<br>

     I = arr_sc_sp.indptr[rstart : rend + 1] - arr_sc_sp.indptr[rstart]<br>

     J = arr_sc_sp.indices[arr_sc_sp.indptr[rstart] : <br>

arr_sc_sp.indptr[rend]]<br>

     V = arr_sc_sp.data[arr_sc_sp.indptr[rstart] : arr_sc_sp.indptr[rend]]<br>

<br>

     print(I.shape, J.shape, V.shape)<br>

     mat.setValuesCSR(I, J, V)<br>

     mat.assemble()<br>

     return mat<br>

<br>

<br>

l = np.load("../liouvillian.npy")<br>

l = mat_to_real(l)<br>

pump = np.load("../pump_operator.npy")<br>

pump = mat_to_real(pump)<br>

state = np.load("../initial_state.npy")<br>

state = np.hstack([state.real, state.imag]).astype(np.float64)<br>

<br>

l = mat_to_petsc_aij(l)<br>

pump = mat_to_petsc_aij(pump)<br>

<br>

<br>

jac = l.duplicate()<br>

for i in range(8192):<br>

     jac.setValue(i, i, 0)<br>

jac.assemble()<br>

jac += l<br>

<br>

vec = l.createVecRight()<br>

vec.setValues(np.arange(state.shape[0], dtype=np.int32), state)<br>

vec.assemble()<br>

<br>

<br>

dt = 0.1<br>

<br>

ts = PETSc.TS().create(comm=comm)<br>

ts.setFromOptions()<br>

ts.setProblemType(ts.ProblemType.LINEAR)<br>

ts.setEquationType(ts.EquationType.ODE_EXPLICIT)<br>

ts.setType(ts.Type.RK)<br>

ts.setRKType(ts.RKType.RK3BS)<br>

ts.setTime(0)<br>

print("KSP:", ts.getKSP().getType())<br>

print("KSP PC:",ts.getKSP().getPC().getType())<br>

print("SNES :", ts.getSNES().getType())<br>

<br>

def jacobian(ts, t, u, Amat, Pmat):<br>

     Amat.zeroEntries()<br>

     Amat.aypx(1, l, structure=PETSc.Mat.Structure.SUBSET_NONZERO_PATTERN)<br>

     Amat.axpy(0.5 * (5 < t < 10), pump, <br>

structure=PETSc.Mat.Structure.SUBSET_NONZERO_PATTERN)<br>

<br>

ts.setRHSFunction(PETSc.TS.computeRHSFunctionLinear)<br>

#ts.setRHSJacobian(PETSc.TS.computeRHSJacobianConstant, l, l) # <br>

Uncomment for f(t) = 0<br>

ts.setRHSJacobian(jacobian, jac)<br>

<br>

NUM_STEPS = 200<br>

res = np.empty((NUM_STEPS, 8192), dtype=np.float64)<br>

times = []<br>

rstart, rend = vec.getOwnershipRange()<br>

for i in tqdm(range(NUM_STEPS)):<br>

     time = ts.getTime()<br>

     ts.setMaxTime(time + dt)<br>

     ts.solve(vec)<br>

     res[i, rstart:rend] = vec.getArray()[:]<br>

     times.append(time)<br>

<br>

I decomposed the complex ODE into a larger real ODE, so that I can <br>

easily switch maybe to GPU computation later on. Now, the solutions of <br>

both scripts are very much identical, but PETSc runs about 3 times <br>

slower at 120it/s on my machine. I don't use MPI for PETSc yet.<br>

<br>

I strongly suppose that the problem lies within the jacobian definition, <br>

as PETSc is about 3 times *faster* than scipy with f(t) = 0 and <br>

therefore a constant jacobian.<br>

<br>

Thank you in advance.<br>

<br>

All the best,<br>

Niclas<br>

<br>

<br>

</blockquote></div></div></div>