Team:
Zahraa Sabra, Ali Alawieh, Fadi A Zaraket, AbdulRahman Bizri
“Superbugs” are a major worldwide concern owing to the increasing rates of bacterial antimicrobial resistance. Most interventions to date have done little to check this trend, and international agencies are acting to prevent an “Antimicrobial Armageddon”. Lately, the use of computational tools to handle large biological datasets has gained considerable attention, yet with few applications in bacteriology. Our work presents and validates a novel hybrid method for understanding and predicting the progression of bacterial resistance at the population level using structural and probabilistic computational models. The method takes advantage of advances in computational modeling and data visualization techniques to study antimicrobial resistance. It allows investigators to understand patterns of increasing resistance, predict near-term progressions, and orient infection control and antimicrobial stewardship programs. This paves the way for a new area of epidemiological research in microbiology using computational modeling.
This work includes several files that need pre-configuration and a software installation to work:
The application was written in MATLAB, and the open source code is available below.
The data collected from the papers are recorded in Excel sheets, in a format that exposes the features relevant to studying the bacteria-antimicrobial relation.
The Excel file that contains the collected data is named “InitialData.xlsx”.
The required fields in each entry of the Excel file are:
To optimize the execution of the code in MATLAB, each column containing string values is mapped to numerical values, such that identical strings receive the same numerical value and different strings receive different values. This speeds up comparisons, since string comparison is slower than numerical comparison.
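The string-to-number mapping described above can be sketched as follows. This is a minimal Python illustration, not the authors' MATLAB code: the function names, the sample column, and the choice to ignore case and surrounding whitespace are all assumptions made for the example.

```python
def build_codes(values):
    """Assign one integer code per distinct string, in order of first appearance."""
    codes = {}
    for value in values:
        # Assumption: case and surrounding whitespace are not significant.
        key = value.strip().lower()
        if key not in codes:
            codes[key] = len(codes) + 1
    return codes


def encode_column(values, codes):
    """Replace every string in a column by its integer code."""
    return [codes[value.strip().lower()] for value in values]


# Hypothetical column of bacteria names, not taken from InitialData.xlsx.
bacteria_column = ["Escherichia coli", "Klebsiella pneumoniae", "escherichia coli"]
bacteria_codes = build_codes(bacteria_column)
encoded = encode_column(bacteria_column, bacteria_codes)
print(encoded)
```

Once a column is encoded this way, matching two entries reduces to comparing two integers.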
In addition, we built two matrices representing the antimicrobial features and the bacteria features respectively, all in numerical values.
The bacteria features are the bacteria name, the genus, the species, the group, and the preset category.
The antimicrobial features are the antimicrobial name, the group, the subgroup, the variant, and the preset category.
The user can choose to study the antimicrobial resistance (AMR) data based on a set of antimicrobial and bacteria features. The selected features are applied as a filter on the matrices to extract the matching records.
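As a rough illustration of this filtering step, the Python sketch below keeps the rows of a coded feature matrix that match every selected feature. It is not the authors' MATLAB code: the column order follows the antimicrobial features listed above, while the function name, the numeric codes, and the sample rows are hypothetical.

```python
# Column order follows the antimicrobial features listed in the text.
ANTIMICROBIAL_COLUMNS = ["name", "group", "subgroup", "variant", "category"]


def filter_rows(matrix, columns, selection):
    """Return the rows whose coded values match every selected feature."""
    indices = {column: columns.index(column) for column in selection}
    return [
        row
        for row in matrix
        if all(row[indices[column]] == wanted for column, wanted in selection.items())
    ]


# Hypothetical coded rows, one per antimicrobial.
antimicrobial_matrix = [
    [1, 10, 100, 1, 2],
    [2, 10, 101, 1, 2],
    [3, 11, 102, 2, 1],
]

matches = filter_rows(antimicrobial_matrix, ANTIMICROBIAL_COLUMNS, {"group": 10, "category": 2})
matching_names = [row[0] for row in matches]  # coded names that satisfy the selection
print(matching_names)
```

The same function would apply unchanged to the bacteria matrix, given its own column list.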
The MATLAB file “TableData.m” generates the matrices of the data, the bacteria, and the antimicrobial using the excel files “InitialData.xlsx”, “bacteriaList.xlsx”, and “antimicrobialList.xlsx”. The generated matrices are saved in the file “DatabaseTable.mat”.
We implemented two models: the structural model and the behavioral model. The output of the preprocessing step is the input to both models.
1-Structural model
When a user selects antimicrobial and bacteria features in the structural GUI, the name fields are populated with all antimicrobial and bacteria names that match the selected features. This is done by collecting the matching names from the antimicrobial and bacteria matrices.
Then the user selects the sites to study. From the matrix representing the whole database, we collect the entries that match the antimicrobial names, bacteria names, and studied sites. This is done using the MATLAB file “SelectFromR.m” as part of the structural GUI file “ABresistance.fig”.
The newly generated matrix, which satisfies the selected conditions on antimicrobial, bacteria, and site, is then ready for processing:
Steps 1 through 5 are executed in the MATLAB file “LumpMatrix.m” as part of the “ABResistance.fig” GUI file.
After preparing the data, we can compute the resistance difference between two entries that are either consecutive in time or separated by up to five available dates following the date of the studied entry.
The GUI generates a graph that reflects the structure of the AMR over time. The graph is made up of nodes and of edges connecting them. A node represents the number of isolates studied over a year, along with the resistance value during the interval between its two dates. The dates are presented as a start month and year and an end month and year; the month is always January, since we assume that the unit of date is one year. An edge label indicates the difference in months between two nodes along with the difference in resistance between them. This difference may be negative if the resistance decreased from one node to the next. The difference is also not a straightforward subtraction of values between the nodes; the edge label is described below.
For a given node i, we calculate in months the midpoint between the start date and the end date of the node, denoted t(i). In our model, the graph can visualize up to five difference relationships. The nth time difference for the edge pointing to node i is calculated as follows:
Δ(n)t(i) = t(i) - [(t(i-1) + t(i-2) + ... + t(i-n-1)) / (n-1)]
And the nth resistance difference for the edge pointing to node i is calculated as follows:
Δ(n)R(i) = R(i) - [(R(i-1) + R(i-2) + ... + R(i-n-1)) / (n-1)]
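The Python sketch below illustrates one possible reading of these differences; it is not the authors' MATLAB code. The index range and divisor in the printed formulas are ambiguous for n = 1, so the sketch assumes that the nth difference is the value at node i minus the mean of the n preceding values, which reduces to a plain consecutive difference for n = 1. The function names and the sample data are hypothetical.

```python
def midpoint_in_months(start_year, end_year):
    """Midpoint t(i) of a node's interval in months, with both dates in January."""
    return (start_year * 12 + end_year * 12) / 2


def nth_difference(values, i, n):
    """values[i] minus the mean of the n entries before it (assumed reading)."""
    if n < 1 or i - n < 0:
        raise ValueError("need n >= 1 and at least n earlier entries")
    previous = values[i - n:i]
    return values[i] - sum(previous) / n


# Hypothetical yearly nodes: (start year, end year) and resistance in percent.
intervals = [(2000, 2001), (2001, 2002), (2002, 2003), (2003, 2004)]
resistance = [20.0, 30.0, 25.0, 45.0]
midpoints = [midpoint_in_months(start, end) for start, end in intervals]

for n in (1, 2, 3):
    print(n, nth_difference(midpoints, 3, n), nth_difference(resistance, 3, n))
```

Under this reading, the edge into the last node carries a first difference of 12 months and 20 percentage points, and a second difference of 18 months and 17.5 percentage points.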
The user can choose to visualize the first, second, third, fourth, and/or fifth AMR difference. These visualizations show how the resistance difference evolves over the years.
Users may choose to visualize the AMR for a single site or to lump the results over multiple sites. This makes it possible to see whether the AMR differs by site for a given antimicrobial-bacteria combination.
Graphviz is used to visualize the graph of the AMR trend over the years. After specifying the content of the nodes, the connected nodes, and the labels of the edges connecting them, the code that describes the graph is written to an output text file (named “output.txt” in our case). We then run the dotty application from MATLAB and pass it the output file in order to visualize the graph. The structural graph is displayed when the user presses the button in the structural GUI.
The generation of nodes, edges, labels, and other details related to the Graphviz visualization (such as the title and the colors of the edges) is done in the MATLAB file “GenGraphvizLumpNew1.m” as part of the GUI file “ABResistance.fig”. The generated graph starts with a Start node and ends with an End node, so that the result is a connected graph contained between these two nodes.
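To make the file-generation step concrete, here is a small Python sketch that assembles Graphviz DOT text for a graph bracketed by Start and End nodes and writes it to a file named “output.txt”. It is an illustration only and does not reproduce “GenGraphvizLumpNew1.m”: the function name, node identifiers, and label wording are assumptions, and the file is written to a temporary folder for the sake of the example.

```python
import tempfile
from pathlib import Path


def build_dot(nodes, edges, title="AMR trend"):
    """Return DOT text for a directed graph bracketed by Start and End nodes.

    Assumption: labels contain no double quotes, so no escaping is done.
    """
    lines = ["digraph AMR {", f'  label="{title}";', "  Start;", "  End;"]
    for node_id, label in nodes:
        lines.append(f'  {node_id} [label="{label}"];')
    for source, target, label in edges:
        if label is None:
            lines.append(f"  {source} -> {target};")
        else:
            lines.append(f'  {source} -> {target} [label="{label}"];')
    lines.append("}")
    return "\n".join(lines) + "\n"


# Hypothetical nodes and edges; the label layout is illustrative.
nodes = [
    ("n1", "Jan 2000 - Jan 2001, 120 isolates, R=20.0%"),
    ("n2", "Jan 2001 - Jan 2002, 95 isolates, R=30.0%"),
]
months, delta = 12, 10.0
edges = [
    ("Start", "n1", None),
    ("n1", "n2", f"{months} months, {delta:+.1f}"),
    ("n2", "End", None),
]

dot_text = build_dot(nodes, edges)
with tempfile.TemporaryDirectory() as folder:
    output_path = Path(folder) / "output.txt"
    output_path.write_text(dot_text, encoding="utf-8")
    written = output_path.read_text(encoding="utf-8")
print(dot_text)
```

The resulting file can then be handed to a Graphviz viewer or renderer.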
In the graph, for the selected differences, if the resistance difference on an edge is greater than 90% of the variance of the resistance differences over similar edge differences, the node that this edge points to is colored red to indicate that a jump happened and that the resistance rose unexpectedly faster than before and/or after.
Conversely, if the resistance difference is less than -90% of the variance of the resistance differences over similar edge differences, the node that this edge points to is colored green to indicate that a large reduction happened and that the resistance dropped unexpectedly faster than before and/or after.
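The red and green alarms can be sketched in Python as below. This is a hedged illustration rather than the authors' MATLAB code: it assumes that "similar edge differences" means all edges of the same difference order, and that the population variance is the one used. The function name and sample values are hypothetical.

```python
from statistics import pvariance


def classify_jumps(differences, fraction=0.9):
    """Color the target node of each edge: red, green, or None (no alarm)."""
    # Assumption: population variance over all edges of the same order.
    threshold = fraction * pvariance(differences)
    colors = []
    for difference in differences:
        if difference > threshold:
            colors.append("red")      # resistance rose unexpectedly fast
        elif difference < -threshold:
            colors.append("green")    # resistance dropped unexpectedly fast
        else:
            colors.append(None)
    return colors


# Hypothetical first-difference values on consecutive edges.
first_differences = [1.0, -1.0, 4.0, -4.0, 0.0, 0.0, 0.0, 0.0]
colors = classify_jumps(first_differences)
print(colors)
```

Here the variance is 4.25, so the threshold is 3.825 and only the edges carrying +4.0 and -4.0 raise an alarm.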
In both cases, the alarm encourages scientists to examine the period of abrupt change and analyze what could have dramatically influenced the resistance difference. This can lead to hypotheses about historical, medical, environmental, economic, or human interference factors, among others, that may have caused the major changes in AMR.
Beyond tracking abrupt changes in AMR, the graph is useful because it is not limited to the relationship between one specific antimicrobial and one specific bacterium from the table. The data are aggregated over a set of antimicrobial and bacteria features, which allows the results to be visualized over a wider range and from a holistic view.
Moreover, the user can view any of the differences, up to the fifth. A trend may be clear when the AMR is viewed over, for example, the third difference, yet remain ambiguous in a first-difference graph.
Apart from the GUI, but using the file “DatabaseTable.mat”, and based on the behavior of the AMR over the years, we can track relations among antimicrobial-bacteria combinations. Through pattern recognition, these relations could lead to the discovery of similar genetic backgrounds among bacteria that behave similarly against antimicrobials. To visualize the antimicrobial-bacteria combinations and their relations, we did the following:
Steps 1 through 3 can be executed from the MATLAB files “DistInfo.m” and “DistInfoNonUrine.m”.
2-Behavioral model
The Hidden Markov Model (HMM) was selected to predict the evolution of AMR over one year based on the given history.
The file “DatabaseTable.mat” is used because it contains the preprocessed data, which serve as the historical background to train the HMM, to predict the next year's resistance, and to validate our model. Since the HMM is already explained in the text, we only go into the technical steps that generate the HMM scores and produce the predicted resistance for the next year.
In the behavioral GUI, a user first selects the antimicrobial, bacteria, and site features to study. The user then chooses a threshold value above which the expected resistance is considered beyond the medically acceptable value. If the predicted resistance for the next year is above this threshold, the GUI colors it red to indicate to the physician that the selected antimicrobial set is not recommended against the selected set of bacteria.
The user can choose the statistical mode to use for the predicted scores: permissive, moderate, or restrictive. The user can also select whether to use the model to validate its performance against the last recorded resistance or to predict the expected resistance for the next year. The next year is the year following the last year entered in the Excel sheet. We will shortly explain the difference among these modes.
When the user presses the “Generate HMM Score” button, the following is done:
The MATLAB code for the 5th and 6th steps is in the file “ABResistanceHMM.fig”.
Note that, to validate our model on a given set of n entries, we train the HMM on n-1 entries. We then predict the HMM score for the next year and compare the predicted resistance to the actual one recorded in the Excel sheet.
Please also note that the HMM needs a data set covering more than five years to work correctly, so make sure that the selected set of antimicrobials, bacteria, and sites provides this minimum number of years.
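The validation scheme described above, together with the minimum-history check, can be sketched in Python as follows. This is not the authors' code and it does not implement an HMM: the predictor is a deliberately simple placeholder that stands in for the trained model, and all names and sample values are hypothetical. The sketch also assumes that "more than five years" means at least six yearly entries in the full series.

```python
MIN_YEARS = 6  # assumption: "more than five years" read as at least six entries


def last_change_predictor(history):
    """Placeholder, NOT an HMM: extend the series by its last yearly change."""
    return history[-1] + (history[-1] - history[-2])


def validate_last_year(resistances, predictor, min_years=MIN_YEARS):
    """Train on the first n-1 yearly values and compare with the held-out one."""
    if len(resistances) < min_years:
        raise ValueError(f"need at least {min_years} yearly entries")
    training, actual = resistances[:-1], resistances[-1]
    predicted = predictor(training)
    return predicted, actual, abs(predicted - actual)


# Hypothetical yearly resistance percentages for one selection.
yearly_resistance = [10.0, 12.0, 15.0, 15.0, 18.0, 20.0, 23.0]
predicted, actual, error = validate_last_year(yearly_resistance, last_change_predictor)
print(predicted, actual, error)
```

Any function that maps a history to a single predicted value could be passed in place of the placeholder.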
The data used in the MATLAB applications were taken from the following papers:
[1] P. Santanam, G. Morenzoni, and F. Kayser, “Prevalence of antimicrobial resistance in Haemophilus influenzae in Greece, Israel, Lebanon and Morocco,” European Journal of Clinical Microbiology and Infectious Diseases, vol. 9, pp. 818-820, 1990.
[2] G. F. Araj, M. M. Uwaydah, and S. Y. Alami, “Antimicrobial susceptibility patterns of bacterial isolates at the American University Medical Center in Lebanon,” Diagnostic microbiology and infectious disease, vol. 20, pp. 151-158, 1994.
[3] M. Uwaydah, M. Jradeh, and Z. Shihab, “Antimicrobial resistance of clinical isolates of Streptococcus pneumoniae in Lebanon,” Journal of Antimicrobial Chemotherapy, vol. 38, pp. 283-286, 1996.
[4] M. Hamze and D. Sarkis, “Etude bicentrique de la sensibilité des sérotypes de Pseudomonas aeruginosa aux antibiotiques au liban,” Médecine et maladies infectieuses, vol. 28, pp. 668-672, 1998.
[5] G. Araj, H. Bey, L. Itani, and S. Kanj, “Drug-resistant Streptococcus pneumoniae in the Lebanon: implications for presumptive therapy,” International journal of antimicrobial agents, vol. 12, pp. 349-354, 1999.
[6] M. Hamze and D. Izard, “Sensibilité des entérobactéries aux antibiotiques. Situation en 1997 au Nord du Liban,” Médecine et maladies infectieuses, vol. 29, pp. 527-531, 1999.
[7] W. Kalaajieh, “Epidemiology of human brucellosis in Lebanon in 1997,” Médecine et maladies infectieuses, vol. 30, pp. 43-46, 2000.
[8] T. Shaar and R. Al-Hajjar, “Antimicrobial susceptibility patterns of bacteria at the Makassed General Hospital in Lebanon,” International journal of antimicrobial agents, vol. 14, pp. 161-164, 2000.
[9] M. Zouain and G. Araj, “Antimicrobial resistance of enterococci in Lebanon,” International journal of antimicrobial agents, vol. 17, pp. 209-213, 2001.
[10] A. I. Sharara, M. Chedid, G. F. Araj, K. A. Barada, and F. H. Mourad, “Prevalence of Helicobacter pylori resistance to metronidazole, clarithromycin, amoxycillin and tetracycline in Lebanon,” International journal of antimicrobial agents, vol. 19, pp. 155-158, 2002.
[11] Z. Daoud and N. Hakime, “Prevalence and susceptibility patterns of extended-spectrum betalactamase-producing Escherichia coli and Klebsiella pneumoniae in a general university hospital in Beirut, Lebanon,” Rev Esp Quimioter, vol. 16, pp. 233-8, 2003.
[12] M. Hamze, F. Dabboussi, W. Daher, and D. Izard, “[Antibiotic resistance of Staphylococcus aureus at north Lebanon: place of the methicillin resistance and comparison of detection methods],” Pathologie-biologie, vol. 51, p. 21, 2003.
[13] M. Hamze, F. Dabboussi, and D. Izard, “[A 4-year study of Pseudomonas aeruginosa susceptibility to antibiotics (1998-2001) in northern Lebanon],” Médecine et maladies infectieuses, vol. 34, pp. 321-324, 2004.
[14] J. N. Samaha-Kfoury, S. S. Kanj, and G. F. Araj, “In vitro activity of antimicrobial agents against extended-spectrum β-lactamase-producing Escherichia coli and Klebsiella pneumoniae at a tertiary care center in Lebanon,” American journal of infection control, vol. 33, pp. 134-136, 2005.
[15] S. D. Karam, A. Hajj, and A. Adaimé, “[Evolution of the antibiotic resistance of Streptococcus pneumoniae from 1997 to 2004 at Hôtel-Dieu de France, a university hospital in Lebanon],” Pathologie-biologie, vol. 54, p. 591, 2006.
[16] M. Uwaydah, J. E. Mokhbat, D. Karam-Sarkis, R. Baroud-Nassif, and T. Rohban, “Penicillin-resistant Streptococcus pneumoniae in Lebanon: the first nationwide study,” International journal of antimicrobial agents, vol. 27, pp. 242-246, 2006.
[17] S. S. Kanj, O. El-Dbouni, Z. A. Kanafani, and G. F. Araj, “Antimicrobial susceptibility of respiratory pathogens at the American University of Beirut Medical Center,” International Journal of Infectious Diseases, vol. 11, pp. 554-556, 2007.
[18] M. Borg, V. De Sande‐Bruinsma, E. Scicluna, M. De Kraker, E. Tiemersma, J. Monen, and H. Grundmann, “Antimicrobial resistance in invasive strains of Escherichia coli from southern and eastern Mediterranean laboratories,” Clinical Microbiology and Infection, vol. 14, pp. 789-796, 2008.
[19] I. Saleh, O. Zouhairi, N. Alwan, A. Hawi, E. Barbour, and S. Harakeh, “Antimicrobial resistance and pathogenicity of Escherichia coli isolated from common dairy products in the Lebanon,” Annals of Tropical Medicine and Parasitology, vol. 103, pp. 39-52, 2009.
[20] G. Sawma-Aouad, F. Hashwa, and S. Tokajian, “Antimicrobial resistance in relation to virulence determinants and phylogenetic background among uropathogenic Escherichia coli in Lebanon,” Journal of Chemotherapy, vol. 21, pp. 153-158, 2009.
[21] N. El‐Najjar, M. Farah, F. Hashwa, and S. Tokajian, “Antibiotic resistance patterns and sequencing of class I integron from uropathogenic Escherichia coli in Lebanon,” Letters in applied microbiology, vol. 51, pp. 456-461, 2010.
[22] A. Hannoun, M. Shehab, M.-T. Khairallah, A. Sabra, R. Abi-Rached, T. Bazi, K. A. Yunis, G. F. Araj, and G. M. Matar, “Correlation between Group B Streptococcal genotypes, their antimicrobial resistance profiles, and virulence genes among pregnant women in Lebanon,” International journal of microbiology, vol. 2009, 2010.
[23] P. F. Abou Khalil, “Occurence of y-hemolysin and panton valentine leukocidin genes and antimicrobial susceptibility patterns of staphylococcus aureus isolated from clinical samples in Lebanon.(c2007),” 2011.
[24] W. Bahnan, F. Hashwa, G. Araj, and S. Tokajian, “emm typing, antibiotic resistance and PFGE analysis of Streptococcus pyogenes in Lebanon,” Journal of medical microbiology, vol. 60, pp. 98-101, 2011.
[25] Z. Daoud and C. Afif, “Escherichia coli isolated from urinary tract infections of lebanese patients between 2000 and 2009: Epidemiology and profiles of resistance,” Chemotherapy research and practice, vol. 2011, 2011.
[26] M. J. Farah, “Antibiotic resistance patterns and characterization of class I integron in uropathogenic escherichia coli in Lebanon.(c2008),” 2011.
[27] D. M. Haddad, “Antibiotic susceptibility patterns and detection of genes for enterotoxins and toxic Shock Syndrome Toxin-1 in Staphyloccoccus aureus involved in human infections in Lebanon.(c2007),” 2011.
[28] N. G. Issa, “Antimicrobial susceptibility testing and identification of exoS and exoU toxin genes of Pseudomonas aeruginosa isolated from clinical samples in lebanon.(c2008),” 2011.
[29] T. Sima, H. Dominik, A. Rana, H. Fuad, and A. George, “Toxins and Antibiotic Resistance in Staphylococcus aureus Isolated from a Major Hospital in Lebanon,” ISRN microbiology, vol. 2011, 2011.
[30] R. Hanna-Wakim, H. Chehab, I. Mahfouz, F. Nassar, M. Baroud, M. Shehab, G. Pimentel, M. Wasfy, B. House, G. Araj, G. Matar, and G. Dbaibo, “Epidemiologic characteristics, serotypes, and antimicrobial susceptibilities of invasive Streptococcus pneumoniae isolates in a nationwide surveillance study in Lebanon,” Vaccine, vol. 30 Suppl 6, pp. G11-7, Dec 31 2012.
[31] K. Imad, H. Monzer, and D. Fouad, “Molecular Characterization and resistance of H. influenzae isolated from Nasopharynx of Students in North Lebanon,” The International Arabic Journal of Antimicrobial Agents, vol. 2, 2012.