ChemTraYzer -- Reaction Models from Molecular Dynamics Simulations
ChemTraYzer (CTY) is an integrated simulation and analysis tool to study reactive molecular dynamics of a gas phase.
How to run the ChemTraYzer:
The main script is simulation.py which takes a configuration file as input:
python simulation.py < example/sim.confAvailable settings are desribed in the configuration file (example/sim.conf). CTY then builds the box with packmol and runs the dynamics with LAMMPS. Reactions and species are registered and sent to an quantum mechanics engine for optimization (Gaussian by default).
Once the high level calculations are done, CTY can build the reaction mechanism, and fit Arrhenius and NASA parameters:
python master.py . -reac default.reac -source QMwhere default.reac is the reaction list output from simulation.py and QM is the folder with the QM engine output. Final output is the reaction list default.reac, the observed rate constants reac.rate.tab, time-wise species and reactive event counts reac.spec.tab and reac.reac.tab, the mechanism in ChemKin format Default.therm and Default.chem, and the mechanism in a graph format Default.gml.
It is possible to run CTY without the dynamics part, if there is a LAMMPS-style bond table as in example/bonds.tatb:
python processing.py example/bonds.tatb -i 1:6 -i 2:1 -i 3:8 -norec 5where the repeated -i provides the atom/element mapping and -norec 5 enables the recrossing detection for an interval of 5 data points in the bond table. CTY then creates the reaction list output (bonds.tatb.reac), which in turn can be used to calculate rate constants from the observed events and concentrations:
python analyzing.py examples/bonds.tatb.reacSince in this case, CTY lacks the geometry information from the dynamics, a QM optimization and thermochemistry parameter fitting is not possible.
How to change the QM engine or the QM submission scripts:
The method submitQM() in simulation.py creates and submits a batch job containing the input for the QM engine (Gaussian09 in our case). It uses some static variables from the Simulation class to fill in e.g. the QM method. If you change the QM engine to something else than Gaussian, you also need to change the QM output read routine harvestGaussian() in harvesting.py. Keep in mind that the hindered rotor contribution to the RRHO partition functions is adapted to Gaussian output only.
Extending the supported elements of ChemTraYzer
CTY can handle the elements C, H O, N, F, S, Cl, He, Ne, and Ar by default. If you have other elements in your simulation, you need to add the maximum possible valency and the maximum possible coordination for those elements. Search for ".Val" and ".Cor" in processing.py and add the new numbers accordingly. CTY uses both values to distingush between long range interactions and actual bonds.
Python packages that are required to run ChemTraYzer:
Anharmonic correction to rate constants from RRHO-TST
For a short tutorial on how to correct your rate constants according to ACS Omega 2020, 5, 5, 2242-2253, please read the README.doc located in the folder small_tools/ACSOmega2020. You will need the python library TAMkin, which is not required to run ChemTraYzer.
Bug Reports:
2016-01-07: OpenBabel ignores the aromaticity of carbon atoms in SMILES in some cases
- OpenBabel adds too many implicit hydrogen atoms to carbon atoms which should be aromatic
-> analyzing.py returns a warning that the balance of elements failed, although the SMILES are balanced correctly
Bug Closed:
2015-07-30: Recrossing filter fails if final reaction of recrossing event is of type A -> A
- Reaction A -> A is removed before tracing reaction history
-> If Reaction A -> A is part of the recrossing process (only as one of the final steps), the filter tries to remove that reaction, but it cannot find it (due to previous point)
Version 1.1:
2015-07-15: (not directly related to the code) Installation of pygraphviz for windows failed
- a user reported problems with installing pygraphviz for windows 8
-> 'analyzing.py' cannot be used without pygraphviz
**SOLUTION:** pygraphviz is optional now -> 'analyzing.py' runs witout
Bug Fixes:
Version 1.1:
2015-07-08: Disturbance of atom balance in 'analyzing.py' if the initial chemical composition contains single-atom molecules.
- 'processing.py' does not recognize initializing reactions (atoms -> molecules) for single-atom molecules
- 'analyzing.py' computes initial chemical composition from initializing reactions
-> single-atom molecules are missed and concentration will become negative when these molecules are consumed. This in turn leads to wrong concentration integrals and therefore to wrong rate constants.
**SOLUTION:** Initial reactions changed so that single atoms are recognized
2015-08-31: When being initialized, Open Babel displays H and H2 molecules as empty strings
- Problems with displaying pure hydogen molecules
-> String representations of Reactions may have commas enclosing an empty string
**SOLUTION:** Identify H and H2 separately
2015-09-29: Systems without any initial bond will end with an error in the 'splitBonds' function
- Initialization of bond handler yields empty array
-> 'splitBonds' trys to access empty array and fails
**SOLUTION:** 'splitBonds' checks for empty bond array to avoid abort
2016-01-29: Recorssing filter period too long if first timestep in Bond-Order output non-Zero
- Filter period is the the product of the number of timesteps between two entries in the Bond-Order file and a user-defined number of Bond-Order steps
- the code assumes that the first timestep is zero and simply takes the second timestep as number of steps between two steps
-> the computed number of steps between two timesteps is way to large, thus the filter period is way too long and actual reactions may be filtered
**SOLUTION:** Initial timestep extraction method added
2016-01-29: Depending on the ReaxFF force field the Bond-Order between the two oxygens of a peroxide group (R-O-O.) is oscillating around 1.5
- O-O bond is oscillating between single and double bond, thus the 'open' valence of the central O is oscillating between one and zero. As a consequence, the R-O bond is periodically forming and breaking
-> the Recrossing filter is removing these oscillator events from the output until the R-O-O. undergoes further reactions (in which case R-O-O. would appear as an extremely short-lived intermediate).
**SOLUTION:** half-step bond orders allowed -> issue shifted to 0.25 BO before and after integer bond orders, which are presumably less important
We are happy to hear that you could tackle the previous problem.
Concerning your first question: This refers to the readme file in the example folder.
In the in.reac file you find the LAMMPS commands:
This lets LAMMPS dump every 10th time step. All of these will be processed by the processing.py if you give -n 0.
Concerning your second question: This refers to the recrossing filter.
Yes, it will effect the calculation results.
Due to molecules heavily vibrating within MD simulations with large thermal energies, bond order thresholds may be crossed a lot of times in the course of a reactive event and even in the course of a non-reactive collision. Registering all of these bond order changes as reactions will grossly overestimate rate constants and also add reactions to your list that actually didn't really happen.
There are different treatments known in literature to deal with this and CTY currently uses the robust solution of a fixed recrossing time. Probably you will obtain good results by keeping the parameter as 100, which, together with each 10th of the 0.1 fs timesteps written, corresponds to a recrossing time of 100 ps.
You may want to modify this number not only when you adjust your timestep or the number of timesteps being dumped, but also dependent on density and temperature of your system. If real subsequent reactions should happen faster than this recrossing time, you may reduce this number accordingly. That will probably not often be the case.
Concerning your third question:
This should print reactants and products of the reaction also as smiles.
k is the corresponding rate constant, followed by upper and lower statistical bounds for a standard chi of 0.9. Please take a look at our article "Assessing Statistical Uncertainties of Rare Events in Reactive Molecular Dynamics Simulations" at https://pubs.acs.org/doi/full/10.1021/acs.jctc.7b00524
N should count the actual number of reactive events for the corresponding reaction.
Best Regards
Wassja A. Kopp on behalf of the ChemTraYzer team
Last edit: Wassja A. Kopp 2023-03-21
Thank you for your guidance. I think I have figured out the 3 questions.Thank you again.
Best wish to you !
My comment to the first question seems lost:
We are happy to hear that you could tackle your previous request already.
Concerning your first question: You are referring to the example that comes with CTY 2.0.
Therein, each 10th frame is dumped, cf. in.reac for LAMMPS simulation:
All of these timesteps (ie. each 10th of the simulation) will be processed by processing.py when you specify -n 0
Dear Felix
I am trying to solve the reaction path problem using this excellent software ChemTraYzer 2.1. By creating a python2 environment in conda and installing openbabel 2.4.0, I solved the error of not being able to read the input file, but immediately after running python processing.py example/bonds. tatb -i 1:6 -i 2:1 -i 3:8 -norec 5 and the output is
WORK example/bonds.tatb <temperature> <volume> <timestep>
0::
0::
630::
1000::
1000::
Is this another error, I didn't find a similar error in the comments section, can you give me any help? Thanks!</timestep></volume></temperature>
Last edit: wxg 2023-04-12
Hi wxg,
apparently, CTY had problems creating the smiles codes. Did you get any other errors from openbabel during the CTY run?
Best, Felix
Hi Felix,
Why do I get a "Segmentation fault (core dumped)" error when I run the processing.py file after extending the nickel element? How to solve this problem?
Hi Jingyi Yu,
What kind of system are you running, and how large are your biggest structures? Openbabel has an upper atom limit when creating SMILES (around 250 I think). Can you provide more information on the input? From the segmentation fault error alone it's hard to tell what went wrong.
Thanks,
Best, Felix
Felix,
Thanks a lot for your attention! My system is an organic catalytic pyrolysis system containing C, H, O, and Ni atoms. The largest structure may have 500+ atoms, which may indeed exceed the limit. What else can I do in this situation?
Hi Jingyi Yu,
as of now CTY can't handle large structures, because it was designed for gas phase reactions. However, we are currently developing the next version with those systems in mind.
If you still want to use the data you produced, this workaround might work for you: identify the (reactive) surface atoms of your catalyst, then delete all other catalyst atoms from the bond file (lines that start with those ids). That way, the large structures will look smaller for the CTY analysis.
Best, Felix
Hi Felix,
Thank you very much for your reply and suggestions! The error of segmentation fault has been solved by reinstallation of CTY. However, I still have no idea about how to deal with the output file of more complicated system, like C/H/O/Ni catalytic pyrolysis system that I mentioned before. Because the reaxff method has a very good prospect in this field, it is hoped that there will be some tools that can help researchers analyze data of complex systems.
At present, I have no problem using CTY in a simple C/H or C/H/O system, but when I use it in a C/H/S system (with C-S-C bond), an "Open Babel Error in TetStereoToWedgeHash" error will appear , and I have updated the openbabel installation package. Most of the problems of these complex systems point to the inability of openbabel to recognize certain structures. How to solve this problem?
Thank you very much for developing CTY, which helped me a lot in my research. At the same time, I also hope that you can continue to develop software that can be applied to more systems.
Best, Jingyi Yu
Hi Jingyi Yu,
Thanks for your appreciation. The openbabel error comes up in a function that tries to prepare a 2D molecule representation for plotting.
https://openbabel.org/dev-api/group__stereo.shtml#ga182a53150e1793ac1ec14958afb05f17
According to the openbabel documentation, the error is a rare case. So far I am not sure how this is caused or how to avoid it. You can try and switch off molecule pictures by leaving out the '-dp' option.
Best, Felix
Dear Felix,
Thanks for your effort in this amazing software. I have two problems with the software and would like to ask you for advice. I was compiling lammps with python that never worked, so I used lammps to generate the required input files
Then I had my first problem. When I use the command
python master.py . -reac 2500k_dump2.reacto analyze the reac file, it prompts me with this:
Traceback (most recent call last):
File "master.py", line 209, in <module>
tshift = max(reaction)
ValueError: max() arg is an empty sequence
! closing: chemtrayzer.sqlite </module>
When I analyzed the bonds file, I ran into a second problem. I used
to analyse, it cause warning:
Open Babel Warning in CreateCisTrans
Error in cis/trans stereochemistry specified for the double bond
stoichiometric formulas generated and atom number computed
WARNING:
number of atoms vary between 35 and 2204
balance of elements evaluated
pecies done
conversion of reaction time history to reaction concentrations...
WARNING:
Balance of elements failed for
C1=C[C]2C[C@]32C(=C3)[CH]1:C=c1cccc2c1=C2
WARNING:
Balance of elements failed for
C=c1cccc2c1=C2:C=C1[C@H]2[C@]31[CH]C3=C[CH]2***
Only spec.tab, rate.tab, and reac.tab were generated. Will this error lead to errors in my analysis? How do I fix this? Thanks.
Last edit: xiance zhang 2023-08-01
Hi Xiance Zhang,
Thanks, hopefully the code will be helpful for your work!
The script master.py reads from a reaction list generated by one or many outputs of simulation.py or processing.py, not from the LAMMPS dump/bond files directly.
Counting the carbons and hydrogens in your SMILES, the atom balance seems fine. I guess that openbabel messes up the implicit hydrogens in one of the molecule SMILES (related to the cis/trans error?). In the final rate computation, this is no issue, only the molecular formula (CxHy) might be wrong for one of the species.
Best, Felix
Last edit: Cheng Wu 2023-09-04
Dear Felix,
I am interested in using your CTY software to find the reaction pathways and the product yields of some important compounds. However, I have encountered some difficulties and I hope you can help me.
I wonder if your software can only perform gas-phase reaction analysis, because I cannot find some major products, such as furans, 5-hydroxymethylfurfural, furfural, etc., using CTY. Do you have any suggestions on how to use your software to simulate these reactions and obtain the desired results?
Hi gakki!
Thanks for your interest in ChemTraYzer!
Unfortunately, CTY cannot handle liquid or solid systems. As reaction recognition is based on ReaxFF bond orders, it will be confused when analyzing non-gaseous systems. As a suggestion, you could write a script to remove all solvent or surface atoms from the bond output file. The analysis would then only apply to the solute/reagent molecules. However, solvation effects can't be taken into account in the optimizations.
Best, Felix
Hi Felix,
Thank you for the useful software.
Is there anyway to get average life span of species detected using CTY?
Also in the output file of analyzing.py ".spec.tab", I found some species count is negative. Can you please explain what does the negative count means?
Thanks in advance!
Last edit: Bhuvan 2024-01-31
Hi Bhuvan!
Thank you for using our software!
A negative species count is surely an error, apologies for that. The explanation why this can happen is a bit lengthy, but involves a known shortcoming of CTY2: when a resonating molecule changes its structure without a bond breaking/forming, then the composition changes, but no reaction is registered by CTY. In the end, the species concentration profile is recalculated from the sequence of reactions, and a missing reaction can lead to negative numbers. To overcome that, you can try and find the molecule's resonance structures in ".spec.tab" and sum up their numbers for each timestep as if they are one species.
There is no direct function to get the average lifetime of molecules, but shouldn't it be possible to obtain that value from the concentration profile and the number of molecules per species (all in "spec.tab")?
Best, Felix
First of all, thank you very much for providing such a great software, it helps me a lot. But now I have some small problems and I hope you can help me solve them if you can.
Is there any solution? scipy does not support decimal
Hi machen,
Thanks! We appretiate it!
It seems the error you have is caused by a too large "tmpA" value (which gets bigger the more reactants contribute to one reaction step). Maybe you chose the recrossing value ("-norec") too high and all reactions are lumped into one, with loads of reactants? What happens if you only analyze half of the trajectory, with e.g. "-end 50000"?
If your system is actually that big, you could also try casting into numpy.ulonglong instead of float64, because all of "tmpA"/"intA"/"dt"/"pos" are natural numbers.
Best, Felix
Hi Felix, thank you very much for developing such excellent software.
When I run the command: python analyzing.py bonds.reaxc.reac -main reac -dp 0.1 -gen id, after running for a while, it prompts: Error for Segmentation fault, as shown in the image. How can I solve it, my simulation system is 2190 atoms, and some of the previous reaction trajectories can be output normally, and they have been stuck here. Looking forward to your reply, thank you!
!!@felixschmalz
Hi Jiliang, thanks for using ChemTraYzer!
Because ChemTraYzer is written in Python, the Segmentation Fault is probably caused by C code like openbabel (which is used to convert the SMILES into PNGs). Openbabel's SMILES implementation is limited to molecules with less than ~250 atoms. One other users had this error before (here) and apparently solved it by reinstalling CTY. Honestly, I don't have a good solution for openbabels limitations, you can try switching off the picture output (leave out "-dp").
Best, Felix
Dear Felix,
Thanks for your excellent post-processing program! I simulated the gas-phase combustion by fix deposit command to inject a few oxygen atoms every several timesteps in the simulation box. So the bond file obtained will increase by several atoms every several timesteps. I process the bond file through the processing.py with no error message but it will just exit the program. The details are as follows:
When I process a bond file with a fixed number of atoms, processing.py outputs correctly. So I would like to ask if the current CTA program can process bond information with varying atomic numbers? Example bond information is as follows:
Best regards,
Jie Xiao
Last edit: Jie Xiao 2024-07-25
Hi all,
I don't even get the example running. Has anybody seen this error before? No idea why max(reactions) should be empty. It's the example and expected to get some output.
Regards
Wolfgang
wolfgang@L18001:~/Downloads/ChemTraYzer_2.1$ python analyzing.py -gen -dp -step 1000 -main test example/bonds.tatb
Fri Nov 22 20:36:52 2024 / UTC
!-------------- ChemTraYzer 2 - ReaxFF Data Analyzing ---------------! [-main
<main>] [-gen <label-style>]</label-style> [-dp <size>]</size> [-t !! !
! Version 2018-11-29 !
! Author Malte Doentgen, LTT RWTH Aachen University !
! Email chemtrayzer@ltt.rwth-aachen.de !
! !
!
! <threshold>]</threshold> [-r <atomic number="">:<min>-<max>]</max></min></atomic> [-g <atomic number="">]</atomic> !
! [-p <names>]</names> [-step <timestep>]</timestep> [-start <start>]</start> [-end <end>]</end> [-X !
! <uncertainty limit="">]</uncertainty> [-extra <list of="" comma-separated="" smiles="">]</list> !
! !
! options: !
! !
! -main: name for output like <main>.gml, the graph file or !
! <main>.pic/, the 2D depiction folder (default="Default") !
! !
! -gen: activates mechanism generation and stores reactions in LaTeX !
! format. <label-style> "id" prints only reaction IDs on connections, !
! "flux" adds the total net flux and "branching" adds the ratio of !
! ractions computed from net fluxes !
! !
! -dp: create 2D depiction for molecules and store them in the !
! <main>.pic/ folder (lack of 2D pictures may cause failure). The !
! <size> parameter is determining the size of the resulting pictures !
! and needs to be adjusted according to your "ImageMagick" / "convert" !
! version. !
! !
! -t: threshold for reaction flux which limits the reactions !
! considered for mechanism generation to those occuring <threshold> !
! times at least, default=1 !
! !
! -r: restrictions on element counts for species in the form <atomic !="" number="">:<min count="">-<max count=""> (e.g. 6:1-2 for species with carbon !
! atom counts between 1 and 2), default is no restrictions !
! !
! -g: define atomic number of element which should be used to group !
! the species, default=0 i.e. no grouping !
! !
! -p: define SMILES or indices species and/or reactions for plotting !
! them as follows <smile> or S<idx>. Reactions can be requested by !
! supplying their SMILES: ,:<c>,<d> with </d></c>,,... being !
! SMILES of the reacting molecules; or their indices: R<idx>. !
! Default=[], i.e. no plotting !
! !
! -step: give the timestep of the simulation in [fs] (default=0.1) !
! !
! -start: skip timesteps smaller than <start> for rate computation !
! (default=0) !
! !
! -end: end-flag for rate computation (default=-1) !
! !
! -X: percentage limit for rate constant uncertainty estimation !
! !
! -extra: list of species for which unimolecluar and bimolecular !
! reaction probabilities will be computed. Separte SMILES with comma !
! (,) !
!----------------------------------------------------------------------!
Reading files ...
!----------------------------------------------------------------------!
! The following list of work files will be merged into a single list !
! of reactions. All virtual reactions required for initial molecule !
! creation are stored at time=0 ! !
!----------------------------------------------------------------------!
example/bonds.tatb
Traceback (most recent call last):
</start></idx></idx></smile></max></min></atomic></threshold></size></main></label-style></main></main></main>File "analyzing.py", line 1012, in <module>
tshift = max(reaction)
ValueError: max() arg is an empty sequence
wolfgang@L18001:~/Downloads/ChemTraYzer_2.1$ </module>